
This article could have been published 30 years ago. In professional Unix admin circles this was already well known back then. Although I could be misreading it, as the article is not very clear. I think these are the points it is trying to make:

1. Once upon a time you could rely on the passwd file and shell behavior as an effective means of authentication and access control.

2. It has been a very long time since that was an effective approach, for a variety of reasons, and you should not do this on modern production systems.



Even 30 years ago, the core argument would have been nonsensical.

1. They introduce their argument as if it is solely about shell access (the conclusion also only mentions "login access control"), but then the first example/statement they make is about non-shell access (Samba, IMAP, Apache).

2. The second argument conflates authentication and authorization, and concludes that to implement shell authorization properly, your only choice is to provide multiple authentication systems.

Zero effort is spent on explaining why existing/historic shell authorization systems (such as simple DAC groups or rbash) are inadequate, and it's not clear to me what threat model they are using to arrive at their conclusion.

edit: rethinking this, I think TFA is just lacking a clear problem statement. They seem to be talking specifically about non-shell services that (ab)use the user's shell field in /etc/passwd as authorization information, and then complaining that many services did not follow suit.


Few contractions foment confusion as much as “auth”. Don’t do it.


authn vs authz: Authentication vs Authorization

authn/authentication: user proves who they are, with username/password or otherwise

authz/authorization: based on who the user is, system determines what they are allowed to do, via group membership or otherwise
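
To make the split concrete, here is a minimal Python sketch. Everything in it is hypothetical (the `USERS` table, the user names, the group names); it only illustrates that authn answers "are you who you claim to be?" while authz answers "may this identity do that?".

```python
import hashlib
import hmac

# Hypothetical in-memory user store; names, salts and groups are made up.
USERS = {
    "alice": {"salt": b"s1", "groups": {"staff", "wheel"}},
    "bob": {"salt": b"s2", "groups": {"staff"}},
}


def _hash(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so the store never holds the password itself."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


USERS["alice"]["hash"] = _hash("correct horse", USERS["alice"]["salt"])
USERS["bob"]["hash"] = _hash("hunter2", USERS["bob"]["salt"])


def authenticate(username: str, password: str) -> bool:
    """authn: does the caller prove they are `username`?"""
    user = USERS.get(username)
    if user is None:
        return False
    return hmac.compare_digest(_hash(password, user["salt"]), user["hash"])


def authorize(username: str, required_group: str) -> bool:
    """authz: is this already-identified user allowed to do the thing?"""
    user = USERS.get(username)
    return user is not None and required_group in user["groups"]
```

In this sketch bob can authenticate fine and still be refused anything that requires the `wheel` group, which is exactly the distinction the two abbreviations are meant to keep apart.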


authz may be confusing to non USA English speakers. I wouldn't make the connection without it spelled out to me. Unfortunately I don't have a better suggestion because auths as short for authorisation is probably worse.


If you work with computers (rather than using them) and don't default to USA English when discussing and using them you are likely in for a bad time.


I think it is less confusing than just calling it auth. I have read many articles about basic auth vs oauth. But the auth there isn't the same.


You can't pronounce authn and authz very well, but to be perfectly honest I'm not sure if that falls under the 'pro' or 'con' column.


I think it's a pro. In saying auth-enn and auth-zee (zed), it's clear which of the two you're talking about.


To me they look like the kind of abbreviations I'd only do when writing. I just say authentication or authorisation when reading them (out loud or in my own mind)


TBH, we'd be better off without any of the contracted forms.


the only exception is if you mean both, but even that's confusing if the context isn't clear.

spell them out or use authn/authz.


You're not thinking like a thought leader.


This blog, which is generally gibberish, hits the HN front page with astounding frequency. IMHO, there are many interesting blogs on "system administration" topics that are submitted to HN every week and never reach the front page, while a handful of familiar, low-quality ones routinely appear on page one.


Much of tech is a theatre, a jobs program that keeps people employed on a middle-class salary so long as they diligently pretend to be engineers. This theatre serves as a prop for a higher-level theatre in our virtual economy for investors and their game of financialization.

It's expected that as tech grows in the number of workers clinging to that middle-class life raft, the baseline of knowledge discussed in tech spheres (like this site) will sink lower.


Is it September already?


it has been for 30 years


I think the point being made is that the fact that a user has rksh as their shell means nothing to samba, ftp, some features of ssh, httpd, cron, and etc. Fundamentally unix has pretty simple permissions, you're either root or you're not. The existence of a user account on a system is often enough to enable SMB and SSH access even if the only purpose of the account is to own files and an application process and is never intended to have interactive logins or to transfer data to and from the server.


> I think the point being made is that the fact that a user has rksh as their shell means nothing to samba, ftp, some features of ssh, httpd, cron, and etc

Which has been true for...... 30 years? If not longer?


Would it be possible to share those reasons?


Here are some:

* Doesn't scale. Having passwords in a plain text file is not a scalable solution for users directory. Can probably go up to a hundred users, but not much more.

* In computer clusters you want user identity to "stick" to the user when they use multiple machines, containers etc. That's why you have LDAP... but it doesn't help all that much because user id is encoded into the file system (huge mistake...) which makes it very difficult to contain users to things they should control. If your only mechanism was the /etc/passwd, it would mean you'd have to constantly synchronize this file across all those machines and containers you have.


It may be old and not particularly appealing, but LDAP has been serving that role effectively at many companies.[0]

Using a terminal remains standard practice for sysadmins and devops.[1]

I believe there's some confusion in both the article and the comment between authentication and authorization. LDAP is fully equipped to handle both tasks.

[0]https://access.redhat.com/documentation/en-us/red_hat_enterp...

[1]https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-co...


To anyone reading this and thinking "yeah dummy, of course it doesn't scale because you're not supposed to store passwords in plain text in the first place" I'll direct you to Chapter 7ish of The Linux Programming Interface.

If you look in your /etc/passwd right now, you'll almost certainly see a single "x" where the (EDIT: no, it was still encrypted!) password originally was - nowadays that single "x" is an instruction to go look in /etc/shadow instead, for the salted hash of the password you're trying to check.

I think this minimizes the number of users who need read permissions to /etc/shadow, and the amount of time they need it for.

This has been your seemingly useless bit of Linux trivia for today. :)
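
For anyone who wants to see the "x" for themselves, here is a small Python sketch that splits one passwd-style line into its seven colon-separated fields. The sample line and the `PasswdEntry`/`uses_shadow` names are made up for illustration; the only assumption about the format is the classic seven-field layout.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    name: str
    password: str  # usually "x" today, meaning "look in /etc/shadow"
    uid: int
    gid: int
    gecos: str
    home: str
    shell: str


def parse_passwd_line(line: str) -> PasswdEntry:
    """Split one passwd line into its seven colon-separated fields."""
    name, password, uid, gid, gecos, home, shell = line.rstrip("\n").split(":")
    return PasswdEntry(name, password, int(uid), int(gid), gecos, home, shell)


def uses_shadow(entry: PasswdEntry) -> bool:
    """True when the password field is the "x" placeholder."""
    return entry.password == "x"


# Hypothetical sample line, not taken from any real system.
sample = "alice:x:1000:1000:Alice Example:/home/alice:/bin/bash"
entry = parse_passwd_line(sample)
print(entry.name, entry.shell, uses_shadow(entry))
```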


/etc/shadow was born not because /etc/passwd had plain text passwords but because the hashes became crackable and /etc/passwd is a world-readable file. Linux has never stored plain text passwords there. Here's the man page indicating encrypted passwords in /etc/passwd for Unix v7, released in 1979: https://man.cat-v.org/unix_7th/5/passwd


Whoops! My bad, this is an even better bit of trivia.

My mistaken memory really sells the underlying point that everything old is new again.


I have a vague recollection of my 20-floppy set of Slackware already having /etc/shadow. That would have been fall of 92 or winter 93, based on where I was living at the time.


I have a vague recollection of being given the choice on a 90s vintage distribution with some warning about security and password length if I did not use shadow passwords. At some point in the early 2000s we started authenticating regular users against AD but the shadow file was still there for root.


We had a couple of labs of sparcstations that just went away a couple of times a year because something bad would happen with all of the NFS mounted partitions and they'd have to turn the cluster on one box at a time to prevent thundering herd issues with NFS.

I think they may have been mounting parts of /etc as well. People get the idea that managing accounts for a cluster of boxes should be centralized. It's all fun and games until the network mount disappears.


Shadow file was definitely from quite a while ago.

Can't remember exact date, but might have been around time of SVR4 intro.

I know because I remember going "ugh", but without investigating the reason why it (shadow) was introduced :) - which was of course wrong on my part.


That was not a "plaintext password," it was a DES hash (from 7th edition onwards).

This is the same format used by the classic htpasswd utility.

https://en.wikipedia.org/wiki/Crypt_(C)#Traditional_DES-base...


plaintext vs plain text

unencrypted vs unstructured

Of course, unstructured is also incorrect; the passwd and shadow files have structured records, one per line.


…and being structured, the passwd file content should be accessed with the getpwent family of functions.
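
As a rough illustration of going through the lookup functions instead of opening the file, Python's standard `pwd` module exposes the same user database. The helper names below (`lookup_shell`, `list_usernames`) are made up for this sketch, and it only runs on Unix-like systems.

```python
import pwd  # standard library, Unix only


def lookup_shell(username: str):
    """Return the login shell for `username`, or None if no such user."""
    try:
        return pwd.getpwnam(username).pw_shell
    except KeyError:
        # pwd raises KeyError when the entry cannot be found.
        return None


def list_usernames():
    """Enumerate every entry the user database is willing to list."""
    return [entry.pw_name for entry in pwd.getpwall()]


if __name__ == "__main__":
    for name in list_usernames():
        print(name, lookup_shell(name))
```

The point of doing it this way is that the caller never needs to know whether an entry came from the flat file or from some other backend.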


Which unfortunately are not thread safe.


"unencrypted" is normally written as "cleartext". "plaintext" means "(readable / intended to be read) without a special viewer". Your ~/.ssh/id_rsa is plaintext but not cleartext.


My school had 30k students, including grad and doctoral students. When they gave students shell accounts, they tried to put them all onto a single Sequent box. They got my incoming class, the following, and anyone previous who asked for one onto that box. I’m pretty sure it had an /etc/passwd file, and would have had about 8-10k people on it.

After that they gave up. Even with aggressive ulimits it was too hard, and each new class was apportioned to a separate Sparc. Which was a shame because we learned an awful lot about Unix administration from users pranking each other, figuring out what they did and how they did it, protecting yourself and retaliating (some people would prank others without first protecting themselves).

Made it a lot harder chatting with people from other classes as well. For electives and humanities you weren’t always in classes with people your exact age. They could be ahead of you or behind.


> Doesn't scale. Having passwords in a plain text file is not a scalable solution for users directory. Can probably go up to a hundred users, but not much more.

This simply cannot be true. A users directory that is frequently read will be in cache. The file is also fairly simple to parse. Even on old hardware you should be able to parse thousands of users.


It's more of a problem of the number of nodes your users will be accessing than the number of users. It's a PITA to make mechanisms for password changes, for adding and removing users, for unlocking accounts, etc. when you have a distinct file on each server that needs to be changed. As well as adding new nodes to the environment and bringing them up to date on the list of users and etc.


Fair point, networking always makes things more complicated. I'm not sure it's really that much more complicated though. Like any concurrency problem with shared state, a single-writer and multiple readers keeps things simple, eg. a single host is authoritative and any changes are made there, and any file changes trigger automated scripts that distribute the changed file over NFS, FTP, etc.

As long as new nodes start with the same base configuration/image this seems manageable. Simple to do with inotify, but even prior to that some simple C programs to broadcast that a change is available and listen for a change broadcast and pull the changed file is totally within the realm of a classic sysadmin's skillset.


You would have to lock the file, or guarantee consistency in some other way. Right now, I don't believe Linux does anything about consistency of reads / writes to that file... which is bad, but we pretend not to notice.

So... the system is kind of broken to begin with, and it's kind of pointless to try to assess its performance.

Also, it would obviously make a lot of difference whether you had a hundred users with only a handful being active, or a hundred active users. I meant active users. Running programs all the time.

NB. You might have heard about this language called Python. Upon starting the interpreter it reads /etc/passwd (because it needs to populate some "static" data in os module). Bet a bunch of similar tools do the same thing. If you have a bunch of users all running Python scripts while there are some changes to the user directory... things are going to get interesting.


> You would have to lock the file, or guarantee consistency in some other way.

I think the standard approach to atomicity is to copy, change the copy, then move that copy overwriting the original (edit: file moves are sorta atomic). Not perfect but generally works.

I agree that this approach is not good for a users directory, I'm just disagreeing that the reason it's not good is performance-related.


Moves are atomic. During the move, at no time is it possible to get the contents of file 1 and file 2 confused when reading from the file descriptors. (Confusion by the human operating things is eminently possible.)
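
A minimal Python sketch of the copy-then-rename pattern, assuming a POSIX filesystem and a temporary file created in the same directory as the target (a rename across filesystems is not atomic). The `atomic_write` name is made up here, and this is not a replacement for whatever locking a real tool does on top.

```python
import os
import tempfile


def atomic_write(path: str, data: str) -> None:
    """Replace `path` with `data` so readers see the old or the new file,
    never a half-written one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem).
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        # Caveat: mkstemp creates the file with mode 0600; a real passwd
        # replacement would also need to restore ownership and permissions.
        os.replace(tmp_path, path)  # atomic rename on POSIX
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A process that already has the old file open keeps reading the old contents through its descriptor; one that opens the path after the rename gets the new file.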


Most systems come with "vipw" which does the atomic-rename dance to avoid problems with /etc/passwd. In practice this works fine. Things get more complicated when you have alternate PAM arrangements.

A whole bunch of standard functions like getpwent() are defined to read /etc/passwd, so that can't be changed.


`getpwent()` is not defined to only read `/etc/passwd`. There is only a requirement that there is some abstract "user database" or "password database" (depending on whether you're reading the Linux man pages or the Single Unix Specification man pages).

In practice, `getpwent` on linux uses the nsswitch mechanism to provide return values from `getpwent`. One can probably disable using `/etc/passwd` entirely when using glibc if: all users do use `getpwent`, and you remove `files` from the `passwd` entry in `/etc/nsswitch.conf`.
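
To see which sources a given configuration would consult, here is a small Python sketch that pulls the `passwd` line out of nsswitch.conf-style text. The `passwd_sources` helper and the sample text are made up for illustration; it only handles the common one-line form and drops bracketed action items.

```python
import re


def passwd_sources(nsswitch_text: str):
    """Return the ordered list of sources on the `passwd:` line."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # strip comments
        if line.startswith("passwd:"):
            rest = line[len("passwd:"):]
            # Drop action items such as [NOTFOUND=return].
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


# Hypothetical sample configuration, not copied from a real system.
sample = """\
# /etc/nsswitch.conf
passwd: files sss [NOTFOUND=return] ldap
group:  files
"""
print(passwd_sources(sample))
```

If `files` is missing from the returned list, lookups through the libc functions would not consult `/etc/passwd` at all under that configuration.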


I had >30k users on a bunch of systems in 2001 (I inherited that approach, mind, I'm not -recommending- it).

We moved to LDAP a couple years later because it was a much nicer architecture to deal with overall, but performance and consistency weren't problems in practice.


Exactly, even large text-file-based DNS servers have the capability to "compile" the text file to a db file for faster access.


So what if they do?

This file is a public interface exposed by Linux to other programs. So what if Linux caches its contents when eg. Python interpreter on launch will read this file. And it's not coming from some "fake" filesystem like procfs or sysfs. It's an actual physical file most of the time.


the python interpreter reads /etc/passwd on launch? i guess it's looking for home directory


> user id is encoded into the file system

This is kind of unavoidable, but you do have 32 bits to play with. Windows did it slightly better with the SID: https://learn.microsoft.com/en-us/windows-server/identity/ad...

> which makes it very difficult to contain users to things they should control

It's not the file system that's the problem here, it's that "everything is a file" is not true for a whole bunch of important stuff that you might want to apply access control to on a UNIX system. Such as the right to initiate TCP connections. This sort of thing is why containers are so popular.

NIS and LDAP do let you have a large number of users. Heck, we managed a few thousand users in /etc/passwd back when I was running https://www.srcf.net/ .. in 2000.


> it's not the file system that's the problem here, it's that "everything is a file" is not true for a whole bunch of important stuff that you might want to apply access control to on a UNIX system

I wonder if there has ever been an attempt to really lean into, and push the limits of sticking with the "everything is a file" philosophy in this realm.

I.e. how far could you get with having special files for fine grained permissions like "right to initiate a TCP connection", and making access control management be, essentially, managing which groups a user belonged to?


Plan 9 probably took this the furthest. Sad it didn't take off. https://en.m.wikipedia.org/wiki/Plan_9_from_Bell_Labs


I think that was Plan 9.


I think Hurd and Plan 9 take the EIAF further.


Plan9 tried to "remedy this".

But in reality a file is not a good abstraction for an internet socket. The ACLs would in essence spell out firewall rules. Because the bigger question is where can it connect to than "user" that is connecting.

That's why this is done on the level of kernel networking, where kernel knows what process is trying to open a socket and can firewall it.


This sounds like a completely unrelated thing and you are not constrained by the plain text password/shadow file for scale. NIS existed for many decades. You can even use Active Directory (or samba) for authentication and user management.

But the article is not about this at all.


This shows you either didn't read what you replied to or don't understand the subject.

This file is the public interface of the Linux system to everyone who wants to get information about users on the system. It doesn't matter that alternative tools exist: they were already mentioned in the post you replied to. It's not the point...


If I didn't understand I'd be grateful if you could explain it.

As far as I know that file does not get referenced for the users in an external directory server. That's how the systems scale without needing to put the users in the file. Aren't we talking about a high number of users (and their authorization levels) when talking about scalability in this case?


> This file is the public interface of the Linux system to everyone who wants to get information about users on the system

No it isn't. PAM is. The password file is only one of the places where users might be defined.


I've wondered why we don't have a passwd.d folder, the way we do with other things in the UNIX filesystem, with individual user accounts represented by individual files. Could even retain the same line-oriented format, just stored separately.


I think a directory of 10000 files is worse than a file of 10000 lines.


Maybe, but account synchronization across machines might be a lot simpler and easier.


> * Doesn't scale. Having passwords in a plain text file is not a scalable solution for users directory. Can probably go up to a hundred users, but not much more.

Why not? A 100,000-line file will only take a moment to scan.
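
Easy enough to try. Here is a rough Python sketch that builds a synthetic 100,000-entry passwd-style blob in memory and does a worst-case linear scan for the last user. The user names and the `make_passwd`/`find_user` helpers are invented, and an in-memory scan says nothing about disk, NSS or locking overhead, so treat whatever timing it prints as a ballpark only.

```python
import io
import time


def make_passwd(n: int) -> str:
    """Generate n synthetic passwd-style lines (made-up users)."""
    return "".join(
        f"user{i}:x:{1000 + i}:{1000 + i}::/home/user{i}:/bin/sh\n"
        for i in range(n)
    )


def find_user(stream, name: str):
    """Linear scan: return the first line whose name field matches."""
    prefix = name + ":"
    for line in stream:
        if line.startswith(prefix):
            return line.rstrip("\n")
    return None


data = make_passwd(100_000)
start = time.perf_counter()
hit = find_user(io.StringIO(data), "user99999")  # last entry, worst case
elapsed = time.perf_counter() - start
print(f"found={hit is not None} elapsed={elapsed:.4f}s")
```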


That's the first understandable explanation I've ever heard of what exactly LDAP is and what it's for!


If you have services with even a few users you should set up LDAP (openldap is fine) and then experiment with it.

LDAP is pretty old and very well supported overall. At one point I configured ldap for postfix virtual hosting. I chose LDAP rather than a database backed solution because of widespread first class support everywhere. The same directory later enabled me to use a ton of other things including SSO. You're always finding new ways to use it once you have it.

It's a great skill to have and is nowhere near as complicated as people make it out to be, including live replication.


Several are outlined in TFA


Tangent, I can't find what "TFA" stands for. My gut tells me it's "The Fucking Article". Am I correct?


Yes.

TFA = The Fucking Article.

Much like:

RTFM = Read The Fucking Manual.


As I said to my mother-in-law, the 'F' is silent.


When Gwynne Shotwell was asked about the acronym BFR on international television, she replied "Big Falcon Rocket". Lovely response, quite dependent on context!


This is like how I use 'ofc' to abbreviate 'of course'. Once upon a time the 'f' may have stood for something, but I never use it that way. For me, 'the F is silent'.


I pretend it's The Forementioned Article


I like to read it as The Featured Article


Yes, but in polite company you can pretend it means "The Freaking Article"


In polite company you say "The article" or "read the manual" and patiently await your reward, which is their asking "what does the 'F' stand for?", to which you reply only with a condescending look and raised eyebrow(s).

That look of realisation is precious.

I've manufactured this experience once or twice, and it's wonderful.


Fine instead of Freaking is much nicer.


I think "Friendly" was/is a popular explanation where devs tried to translate their subculture into something generally digestible.


Digestible maybe, but swapping "friendly" in for "fucking" changes the tone and intent of someone's statement. At least "freaking" expresses similar, if muted, exclamation.

A recipient who is being not-so-subtly reproached with an F-bomb acronym might misunderstand what is being implied.


But when the person asking was management, and asking for the tenth time, there was a decided advantage to changing the tone.

All of that is now wrapped up in the initialism.


Or the fine article.


Yes


I really hate this acronym and this comment is no different from saying “read the article” which is against the guidelines here


> saying “read the article” which is against the guidelines here

The guidelines say that you should not accuse someone of not having read the article. However, as the guidelines say it is fine to point out that something is mentioned in the article.

There is a subtle difference.

From the guidelines:

> Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that".

Parent comment was in line with the guidelines IMO.

And as for “TFA” as an acronym I like to read it as meaning “The Featured Article”. Then it seems nice and friendly.


I'd assume any code in a non-memorysafe language that parses any freeform data entered by the user is a potentially exploitable security vulnerability, so an interactive shell is a huge surface area for attacks?


Yes, but irrelevant here. Basically any shell access means you've changed from preventing remote code execution to preventing privilege escalation, which is much harder.


Which is why sudo has its own editor.

If you fuck up the sudo file while saving it, you might no longer be able to log in to fix it. Before I knew about sudoedit I would open two shells as root, edit the file, then use a third window to make sure I could still sudo.

With two windows I could accidentally close the subshell in one without locking myself out. Think of it like linemen, who use two tethers for climbing structures. They are never detached from the safety lines.


  sudo visudo


Ask the chatbot:

1. system: in the context of setting up secure remote access to a Unix-like system, discuss whether relying on the passwd file and shell behavior as an effective means of authentication and access control is a good approach. What are some reasons this is not (or is) an effective approach, which should not (or should) be used on modern production systems. user: system administrator on a Unix-based network. assistant: technically, there are several reasons...

2. If you have a collection of Unix systems, can you reasonably do a certain amount of access control to your overall environment by forcing different logins to have specific administrative shells?


But when was it ever effective if 30 years ago it was already well known?


Last time it might've been effective was probably in old-school Unix time sharing with users connected via tty's rather than TCP/IP. Already early SQL databases, with the possible exception of Informix SE, had a client/server process model where the server process had full access to all data files and would at best authenticate sessions but not individual accesses against /etc/passwd such as via Oracle's pipe/bequeather connector but more commonly would assume fixed global roles and handle auth on the app side. As soon as IP and "services" were introduced, /etc/passwd stopped being effective, as pointed out by bluetomcat [1]. Actually, even gaining shell access is considered game over from a security PoV, due to multiple privilege escalations.

[1]: https://news.ycombinator.com/item?id=37462806


It was effective for a ftp server accessing public directories in the home of users. I can't remember the details but you would use the username and password of the user to exchange files with and get into that directory. All transmitted as cleartext, of course.

30+ years ago we already had services (daemons!) with their own user id, to keep them isolated from root and the human users. This post is about as much news as the invention of hot water.


> It was effective for a ftp server accessing public directories in the home of users. I can't remember the details ...

Most ftpd need a shell whitelisted in /etc/shells .

In macOS, /etc/shells begins with this comment:

  # List of acceptable shells for chpass(1).
  # Ftpd will not allow users to connect who are not using
  # one of these shells.
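
The check that comment describes is simple enough to sketch in Python. This is only an illustration of the idea, not how any particular ftpd implements it; the `allowed_shells` and `ftp_login_permitted` names and the sample file contents are made up.

```python
def allowed_shells(shells_text: str) -> set:
    """Parse /etc/shells-style text: one path per line, '#' starts a comment."""
    shells = set()
    for raw in shells_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if line:
            shells.add(line)
    return shells


def ftp_login_permitted(user_shell: str, shells_text: str) -> bool:
    """True when the user's passwd shell is listed as an acceptable shell."""
    return user_shell in allowed_shells(shells_text)


# Hypothetical file contents for illustration.
sample = """\
# List of acceptable shells
/bin/sh
/bin/bash
/bin/zsh
"""
print(ftp_login_permitted("/bin/bash", sample))
print(ftp_login_permitted("/usr/sbin/nologin", sample))
```

So giving an account a shell that is not in the list is one way the shell field ends up doing double duty as access control for a non-shell service.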



