This article could have been published 30 years ago; in professional Unix admin circles this was already well known back then. Although I could be misreading it, as the article is not very clear, I think these are the points it is trying to make:
1. Once upon a time you could rely on the passwd file and shell behavior as an effective means of authentication and access control.
2. It has been a very long time since that was an effective approach, for a variety of reasons, and you should not do this on modern production systems.
Even 30 years ago, the core argument would have been nonsensical.
1. They introduce their argument as if it is solely about shell access (the conclusion also only mentions "login access control"), but then the first example/statement they make is about non-shell access (Samba, IMAP, Apache).
2. The second argument conflates authentication and authorization, and concludes that to implement shell authorization properly, your only choice is to provide multiple authentication systems.
Zero effort is spent on explaining why existing/historic shell authorization systems (such as simple DAC groups or rbash) are inadequate, and it's not clear to me what threat model they are using to arrive at their conclusion.
edit: rethinking this, I think TFA is just lacking a clear problem statement. They seem to be talking specifically about non-shell services that (ab)use the user's shell field in /etc/passwd as authorization information, and then complaining that many services did not follow suit.
authz may be confusing to non-US English speakers. I wouldn't make the connection without it being spelled out to me. Unfortunately I don't have a better suggestion, because "auths" as short for authorisation is probably worse.
To me they look like the kind of abbreviations I'd only do when writing. I just say authentication or authorisation when reading them (out loud or in my own mind)
This blog of general gibberish hits the HN front page with astounding frequency. IMHO, there are many interesting blogs on "system administration" topics that are submitted to HN every week that never reach the front page, while a handful of familiar, low-quality ones routinely appear on page one.
Much of tech is a theatre, a jobs program that keeps people employed in a middle class salary so long as they diligently pretend to be engineers. This theatre serves as a prop for a higher level theatre in our virtual economy for investors and their game of financialization.
It's expected that as the number of workers in tech clinging to that middle-class life raft grows, the baseline of knowledge discussed in tech spheres (like this site) will sink lower.
I think the point being made is that the fact that a user has rksh as their shell means nothing to samba, ftp, some features of ssh, httpd, cron, etc. Fundamentally unix has pretty simple permissions: you're either root or you're not. The existence of a user account on a system is often enough to enable SMB and SSH access, even if the only purpose of the account is to own files and an application process and it is never intended to have interactive logins or to transfer data to and from the server.
> I think the point being made is that the fact that a user has rksh as their shell means nothing to samba, ftp, some features of ssh, httpd, cron, etc
Which has been true for... 30 years? If not longer?
* Doesn't scale. Having passwords in a plain text file is not a scalable solution for a user directory. Can probably go up to a hundred users, but not much more.
* In computer clusters you want user identity to "stick" to the user when they use multiple machines, containers, etc. That's why you have LDAP... but it doesn't help all that much, because the user ID is encoded into the file system (huge mistake...), which makes it very difficult to contain users to the things they should control. If your only mechanism was /etc/passwd, it would mean you'd have to constantly synchronize this file across all those machines and containers you have.
It may be old and not particularly appealing, but LDAP has been serving that role effectively at many companies.[0]
Using a terminal remains standard practice for sysadmins and devops.[1]
I believe there's some confusion in both the article and the comment between authentication and authorization. LDAP is fully equipped to handle both tasks.
To anyone reading this and thinking "yeah dummy, of course it doesn't scale because you're not supposed to store passwords in plain text in the first place" I'll direct you to Chapter 7ish of The Linux Programming Interface.
If you look in your /etc/passwd right now, you'll almost certainly see a single "x" where the (EDIT: no, it was still encrypted!) password originally was - nowadays that single "x" is an instruction to go look in /etc/shadow instead, for the salted hash of the password you're trying to check.
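The seven-field colon-separated format can be sketched in a few lines. This is a minimal illustration using a made-up sample line, not a real account:

```python
# Parse a single /etc/passwd-style line (7 colon-separated fields).
# The sample line is invented for illustration.
line = "alice:x:1001:1001:Alice Example:/home/alice:/bin/bash"
name, passwd, uid, gid, gecos, home, shell = line.split(":")

# The "x" in the second field means: the real (salted, hashed) password
# lives in /etc/shadow, which is readable only by root.
print(passwd)  # → x
print(shell)   # → /bin/bash
```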
I think this minimizes the number of users who need read permissions to /etc/shadow, and the amount of time they need it for.
This has been your seemingly useless bit of Linux trivia for today. :)
/etc/shadow was born not because /etc/passwd had plain-text passwords, but because the hashes became crackable and /etc/passwd is a world-readable file. Linux never had plain-text passwords there. Here's the man page indicating encrypted passwords for the Unix v7 /etc/passwd, released in 1979: https://man.cat-v.org/unix_7th/5/passwd
I have a vague recollection of my 20-floppy Slackware install already having /etc/shadow. That would have been fall of '92 or winter '93, based on where I was living at the time.
I have a vague recollection of being given the choice on a 90s vintage distribution with some warning about security and password length if I did not use shadow passwords. At some point in the early 2000s we started authenticating regular users against AD but the shadow file was still there for root.
We had a couple of labs of sparcstations that just went away a couple of times a year because something bad would happen with all of the NFS mounted partitions and they'd have to turn the cluster on one box at a time to prevent thundering herd issues with NFS.
I think they may have been mounting parts of /etc as well. People get the idea that managing accounts for a cluster of boxes should be centralized. It's all fun and games until the network mount disappears.
"unencrypted" is normally written as "cleartext". "plaintext" means "(readable / intended to be read) without a special viewer". Your ~/.ssh/id_rsa is plaintext but not cleartext.
My school had 30k students, including grad and doctoral students. When they gave students shell accounts, they tried to put them all onto a single Sequent box. They got my incoming class, the following, and anyone previous who asked for one onto that box. I’m pretty sure it had an /etc/passwd file, and would have had about 8-10k people on it.
After that they gave up. Even with aggressive ulimits it was too hard, and each new class was apportioned to a separate Sparc. Which was a shame because we learned an awful lot about Unix administration from users pranking each other, figuring out what they did and how they did it, protecting yourself and retaliating (some people would prank others without first protecting themselves).
Made it a lot harder chatting with people from other classes as well. For electives and humanities you weren’t always in classes with people your exact age. They could be ahead of you or behind.
> Doesn't scale. Having passwords in a plain text file is not a scalable solution for a user directory. Can probably go up to a hundred users, but not much more.
This simply cannot be true. A users directory that is frequently read will be in cache. The file is also fairly simple to parse. Even on old hardware you should be able to parse thousands of users.
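To put a rough number on that claim, here's a toy benchmark: synthesize 100,000 passwd-style lines and time a worst-case linear scan for the last user. The usernames and timing are illustrative only:

```python
import time

# Synthesize 100,000 /etc/passwd-style lines (invented users) and time
# a full linear scan for the last entry. Real lookups would also
# benefit from the OS page cache.
lines = [
    f"user{i}:x:{1000 + i}:{1000 + i}:User {i}:/home/user{i}:/bin/bash"
    for i in range(100_000)
]

start = time.perf_counter()
target = None
for line in lines:
    fields = line.split(":")
    if fields[0] == "user99999":
        target = fields
        break
elapsed = time.perf_counter() - start

# Even this naive scan finishes in a few tens of milliseconds on
# modern hardware.
print(f"found uid={target[2]} in {elapsed * 1000:.1f} ms")
```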
It's more of a problem of the number of nodes your users will be accessing than the number of users. It's a PITA to make mechanisms for password changes, for adding and removing users, for unlocking accounts, etc. when you have a distinct file on each server that needs to be changed. As well as adding new nodes to the environment and bringing them up to date on the list of users, etc.
Fair point, networking always makes things more complicated. I'm not sure it's really that much more complicated though. Like any concurrency problem with shared state, a single writer and multiple readers keeps things simple: e.g. a single host is authoritative and any changes are made there, and any file changes trigger automated scripts that distribute the changed file over NFS, FTP, etc.
As long as new nodes start with the same base configuration/image this seems manageable. Simple to do with inotify, but even prior to that some simple C programs to broadcast that a change is available and listen for a change broadcast and pull the changed file is totally within the realm of a classic sysadmin's skillset.
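The single-writer fan-out described above can be sketched in a few lines. This is a toy model, with the "nodes" as local directories standing in for real rsync/scp targets; all names and paths are invented:

```python
import os
import shutil

def push_to_nodes(master_path, node_dirs):
    """Push one authoritative passwd file to every node.

    Copy to a temp name first, then rename into place, so readers on
    each node see either the old file or the new one, never a partial
    write. (Real nodes would be remote rsync/scp targets, not local
    directories.)
    """
    for node in node_dirs:
        tmp = os.path.join(node, ".passwd.tmp")
        shutil.copyfile(master_path, tmp)
        os.replace(tmp, os.path.join(node, "passwd"))  # atomic per node
```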
You would have to lock the file, or guarantee consistency in some other way. Right now, I don't believe Linux does anything about consistency of reads / writes to that file... which is bad, but we pretend not to notice.
So... the system is kind of broken to begin with, and it's kind of pointless to try to assess its performance.
Also, it would obviously make a lot of difference whether you had a hundred users with only a handful active, or a hundred active users. I meant active users. Running programs all the time.
NB. You might have heard about this language called Python. Upon starting, the interpreter reads /etc/passwd (because it needs to populate some "static" data in the os module). I bet a bunch of similar tools do the same thing. If you have a bunch of users all running Python scripts while changes are being made to the user directory... things are going to get interesting.
> You would have to lock the file, or guarantee consistency in some other way.
I think the standard approach to atomicity is to copy, change the copy, then move that copy overwriting the original (edit: file moves are sorta atomic). Not perfect but generally works.
I agree that this approach is not good for a users directory, I'm just disagreeing that the reason it's not good is performance-related.
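The copy-change-rename dance mentioned above can be sketched as a small helper. This is a minimal sketch, assuming a single local file and POSIX rename semantics (rename within one filesystem is atomic):

```python
import os
import tempfile

def atomic_update(path, transform):
    """Copy, modify the copy, then rename over the original.

    Readers always see either the complete old file or the complete
    new file, never a half-written one.
    """
    with open(path) as f:
        data = f.read()
    # Create the temp file in the same directory so the final rename
    # stays within one filesystem (a cross-fs rename is not atomic).
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(transform(data))
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # the atomic step
    except BaseException:
        os.unlink(tmp)
        raise
```

As noted, this is not perfect (concurrent writers can still lose each other's edits, which is why vipw also takes a lock), but it rules out torn reads.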
Moves are atomic. During the move, at no time is it possible to get the contents of file 1 and file 2 confused when reading from the file descriptors. (Confusion by the human operating things is eminently possible.)
Most systems come with "vipw", which does the atomic-rename dance to avoid problems with /etc/passwd. In practice this works fine. Things get more complicated when you have alternate PAM arrangements.
A whole bunch of standard functions like getpwent() are defined to read /etc/passwd, so that can't be changed.
`getpwent()` is not defined to only read `/etc/passwd`. There is only a requirement that there be some abstract "user database" or "password database" (depending on whether you're reading the Linux man pages or the Single Unix Specification man pages).
In practice, `getpwent` on Linux uses the nsswitch mechanism to provide its return values. One can probably disable using `/etc/passwd` entirely with glibc, if all lookups go through `getpwent` and you remove `files` from the `passwd` entry in `/etc/nsswitch.conf`.
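You can see this abstraction from Python, whose `pwd` module wraps the libc getpwent() interface. On glibc the entries come back through nsswitch, so they may originate from the flat file, LDAP, or anything else configured in `/etc/nsswitch.conf`:

```python
import pwd

# Enumerate the user database via the libc getpwent() interface.
# Depending on nsswitch configuration, entries may come from
# /etc/passwd, LDAP, NIS, etc. -- the caller can't tell.
for entry in pwd.getpwall():
    if entry.pw_uid == 0:
        # On virtually every system, uid 0 is "root".
        print(entry.pw_name, entry.pw_shell)
        break
```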
I had >30k users on a bunch of systems in 2001 (I inherited that approach, mind, I'm not -recommending- it).
We moved to LDAP a couple years later because it was a much nicer architecture to deal with overall, but performance and consistency weren't problems in practice.
This file is a public interface exposed by Linux to other programs. It doesn't matter whether Linux caches its contents when, e.g., the Python interpreter reads the file on launch. And it's not coming from some "fake" filesystem like procfs or sysfs; it's an actual physical file most of the time.
> which makes it very difficult to contain users to things they should control
It's not the file system that's the problem here, it's that "everything is a file" is not true for a whole bunch of important stuff that you might want to apply access control to on a UNIX system. Such as the right to initiate TCP connections. This sort of thing is why containers are so popular.
NIS and LDAP do let you have a large number of users. Heck, we managed a few thousand users in /etc/passwd back when I was running https://www.srcf.net/ .. in 2000.
> it's not the file system that's the problem here, it's that "everything is a file" is not true for a whole bunch of important stuff that you might want to apply access control to on a UNIX system
I wonder if there has ever been an attempt to really lean into, and push the limits of sticking with the "everything is a file" philosophy in this realm.
I.e. how far could you get with having special files for fine grained permissions like "right to initiate a TCP connection", and making access control management be, essentially, managing which groups a user belonged to?
But in reality a file is not a good abstraction for an internet socket. The ACLs would in essence spell out firewall rules, because the bigger question is where it can connect to, rather than which user is connecting.
That's why this is done on the level of kernel networking, where kernel knows what process is trying to open a socket and can firewall it.
This sounds like a completely unrelated thing, and you are not constrained by the plain-text passwd/shadow file for scale. NIS has existed for decades. You can even use Active Directory (or Samba) for authentication and user management.
This shows you either didn't read what you replied to or don't understand the subject.
This file is the public interface of the Linux system to everyone who wants to get information about users on the system. It doesn't matter that alternative tools exist: they were already mentioned in the post you replied to. It's not the point...
If I didn't understand I'd be grateful if you could explain it.
As far as I know, that file does not get referenced for users in an external directory server. That's how systems scale without needing to put the users in the file. Aren't we talking about a high number of users (and their authorization levels) when we talk about scalability in this case?
I've wondered why we don't have a passwd.d folder, the way we do with other things in the UNIX filesystem, with individual user accounts represented by individual files. Could even retain the same line-oriented format, just stored separately.
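A hypothetical merge step for such a layout is easy to sketch. Everything here is invented (the `passwd.d` directory name, the one-file-per-account convention); it just shows that the line-oriented format composes trivially:

```python
import os

def merge_passwd_d(dirpath):
    """Merge a hypothetical passwd.d/ directory into passwd format.

    Each file holds one account in the usual 7-field colon-separated
    line format; files are merged in sorted name order.
    """
    entries = []
    for fname in sorted(os.listdir(dirpath)):
        with open(os.path.join(dirpath, fname)) as f:
            line = f.read().strip()
        if line and line.count(":") == 6:  # basic sanity check: 7 fields
            entries.append(line)
    return "\n".join(entries) + "\n"
```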
> * Doesn't scale. Having passwords in a plain text file is not a scalable solution for a user directory. Can probably go up to a hundred users, but not much more.
Why not? A 100,000-line file will only take a moment to scan.
If you have services with even a few users you should set up LDAP (openldap is fine) and then experiment with it.
LDAP is pretty old and very well supported overall. At one point I configured ldap for postfix virtual hosting. I chose LDAP rather than a database backed solution because of widespread first class support everywhere. The same directory later enabled me to use a ton of other things including SSO. You're always finding new ways to use it once you have it.
It's a great skill to have and is nowhere near as complicated as people make it out to be, including live replication.
When Gwen Shotwell was asked about the acronym BFR on international television, she replied "Big Falcon Rocket". Lovely response, quite dependent on context!
This is like how I use 'ofc' to abbreviate 'of course'. Once upon a time the 'f' may have stood for something, but I never use it that way. For me, 'the F is silent'.
In polite company you say "The article" or "read the manual" and patiently await your reward, which is their asking "what does the 'F' stand for?", to which you reply only with a condescending look and raised eyebrow(s).
That look of realisation is precious.
I've manufactured this experience once or twice, and it's wonderful.
Digestible maybe, but swapping "friendly" in for "fucking" changes the tone and intent of someone's statement. At least "freaking" expresses similar, if muted, exclamation.
A recipient who is being not-so-subtly reproached with an F-bomb acronym might misunderstand what is being implied.
> saying “read the article” which is against the guidelines here
The guidelines say that you should not accuse someone of not having read the article. However, as the guidelines say it is fine to point out that something is mentioned in the article.
There is a subtle difference.
From the guidelines:
> Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that".
Parent comment was in line with the guidelines IMO.
And as for “TFA” as an acronym I like to read it as meaning “The Featured Article”. Then it seems nice and friendly.
I'd assume any code in a non-memory-safe language that parses freeform data entered by the user is a potentially exploitable security vulnerability, so an interactive shell is a huge attack surface?
Yes, but irrelevant here. Basically any shell access means you've changed from preventing remote code execution to preventing privilege escalation, which is much harder.
If you fuck up the sudoers file while saving it, you might no longer be able to log in to fix it. Before I knew about sudoedit I would open two shells as root, edit the file, then use a third window to make sure I could still sudo.
With two windows I could accidentally close the subshell in one without locking myself out. Think of it like linemen, who use two tethers for climbing structures: they are never detached from the safety lines.
The last time it might've been effective was probably in old-school Unix time-sharing, with users connected via ttys rather than TCP/IP. Even early SQL databases (with the possible exception of Informix SE) had a client/server process model in which the server process had full access to all data files. At best it would authenticate sessions against /etc/passwd, such as via Oracle's pipe/bequeath connector, but not individual accesses; more commonly it would assume fixed global roles and handle auth on the app side. As soon as IP and "services" were introduced, /etc/passwd stopped being effective, as pointed out by bluetomcat [1]. Actually, even gaining shell access is considered game over from a security PoV, due to multiple privilege escalations.
It was effective for an FTP server accessing public directories in users' home directories. I can't remember the details, but you would use the username and password of the user you wanted to exchange files with, and get into that directory. All transmitted as cleartext, of course.
30+ years ago we already had services (daemons!) with their own user IDs, to keep them isolated from root and the human users. This post is about as newsworthy as the invention of hot water.