> Using their initial foothold of OAuth user tokens for GitHub.com, the actor was able to exfiltrate a set of private npm repositories, some of which included secrets such as AWS access keys.
> Using one of these AWS access keys, the actor was able to gain access to npm’s AWS infrastructure.
How many individual best practices were not followed to result in this nightmare? Sigh.
> Using their initial foothold of OAuth user tokens for GitHub.com, the actor was able to exfiltrate a set of private npm repositories, some of which included secrets such as AWS access keys.
Keep those keys out of source control, folks. There are a lot of options for secrets management these days, and making it harder for attackers to totally own you if they only manage to crack one piece of your infrastructure is key to limiting damage from this sort of attack.
I agree fully with you, but I don't think secret management is as easy/cheap as some people pretend. On AWS, for example, each secret you store in Secrets Manager is $0.40/month plus $0.05 per 10,000 API calls, and that can add up if you only have one API key/password/etc. per "Secret" (for an individual at least, I hate bleeding off $10+/mo to store tiny bits of strings). Then, once you have the secret stored, you need to set up roles/policies to be able to retrieve them.
I have this set up pretty well in my code now, but getting there wasn't simple or easy from my perspective, and keeping the list of secrets your IAM user can access up to date can be a pain as well.
I'm working in a Lambda environment so my options might be more limited, but I'm interested to see how other people are solving this issue (maybe specifically for small/side projects). As it stands, my Lambdas all get a role applied to them that gives them access to the secrets, but something not AWS-specific would need a "bootstrap secret" to be injected before the code could call out to the third party to get the other secrets. For Lambdas I suppose I could inject that "bootstrap secret" via environment variables, but now I've got a new issue to deal with. Injecting at build time via something like GitHub Actions Secrets is an option, I guess.
All that to say, while I agree secrets should never be in source, in practice it's not super easy (I'd love to be proven wrong, maybe I'm not doing it right).
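For concreteness, the Lambda side of this can be fairly small with the AWS SDK v3. A sketch, where the secret name and field are placeholders and the execution role is assumed to allow `secretsmanager:GetSecretValue`:

```javascript
// Fetch and cache a secret from AWS Secrets Manager inside a Lambda handler.
const {
  SecretsManagerClient,
  GetSecretValueCommand,
} = require("@aws-sdk/client-secrets-manager");

const client = new SecretsManagerClient({});
let cached; // reused across warm invocations, which also keeps API-call costs down

async function getSecret() {
  if (!cached) {
    const res = await client.send(
      new GetSecretValueCommand({ SecretId: "my-app/prod" }) // placeholder name
    );
    cached = JSON.parse(res.SecretString);
  }
  return cached;
}

exports.handler = async () => {
  const { apiKey } = await getSecret(); // placeholder field
  // ...use apiKey here; never log it or bake it into the deployment package
};
```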
> I agree fully with you, but I don't think secret management is as easy/cheap as some people pretend. On AWS, for example, each secret you store in Secrets Manager is $0.40/month plus $0.05 per 10,000 API calls, and that can add up if you only have one API key/password/etc. per "Secret" (for an individual at least, I hate bleeding off $10+/mo to store tiny bits of strings). Then, once you have the secret stored, you need to set up roles/policies to be able to retrieve them.
There is a middle step between "let's have API tokens committed in SCM" and "let's deploy a full authentication system/use this costly solution", and that is using environment variables. In your code, do `process.env.MY_SECRET_KEY` instead of `myGitHubPersonalToken`, and then when you run the program, run it with ` MY_SECRET_KEY=myGitHubPersonalToken npm start`. Magically, you can commit your code without exposing any secrets, and share the secret where you need it out-of-band.
Zero cost, actually easier to configure your software when you need it, and as a bonus, people won't get access to your infrastructure if someone gets hold of your source code.
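For illustration, the application side is about as minimal as it gets (the variable name is just an example):

```javascript
// Read the secret from the environment and fail fast if it's missing,
// so a misconfigured deploy doesn't silently run without credentials.
const token = process.env.MY_SECRET_KEY;
if (!token) {
  throw new Error("MY_SECRET_KEY is not set");
}
// ...pass `token` to whatever client needs it; never hard-code or log it
```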
That npm Inc either isn't aware of environment variables for secrets, or failed to uphold that standard in its code, is embarrassing.
> when you run the program, run it with ` MY_SECRET_KEY=myGitHubPersonalToken npm start`
But where does this live? Or do you literally mean that Jane The Sys Admin is supposed to type this into her terminal every time the service restarts in the middle of the night?
What if I need to replace a node? Or scale a service? How do these secrets get there?
> But where does this live? Or do you literally mean that Jane The Sys Admin is supposed to type this into her terminal every time the service restarts in the middle of the night?
Depends on how the service is deployed. If you're just running it on a Digital Ocean instance by manually SSHing into the instance and running systemd services, define it in the .service file (it supports defining environment variables).
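Something like this in the unit file, for example (paths and names are made up):

```ini
# /etc/systemd/system/myapp.service (illustrative)
[Service]
ExecStart=/usr/bin/node /srv/myapp/index.js
# either inline:
Environment=MY_SECRET_KEY=myGitHubPersonalToken
# or, better, from a root-only file that lives outside the repo:
# EnvironmentFile=/etc/myapp/secrets.env
```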
If you're doing instances via automation (like Terraform), most tools (including Terraform) support loading things from environment variables. So you run `TF_VAR_my_secret_key=myGitHubPersonalToken terraform apply` when you create the instance, and reference `var.my_secret_key` in your HCL definitions.
We use Doppler and some other custom built tooling, and it works really well. The Doppler pricing is really fair IMO, and their tooling adds a lot of value for us so I'm OK with paying for it.
Software security is rarely free (even with an OSS tool you've got infra and management costs), but the cost is almost always cheaper than a major breach that could stem from something like this incident, which fortunately was pretty contained.
Our goal is to simplify this as much as possible. We use client-side end-to-end encryption so you don't have to trust a third party, and pricing on our paid plans is based on number of users, not number of secrets.
Having helped build the production secret management bits at Airbnb [0], I would encourage most people to buy, not build. It was a large task and I'm 95% sure we didn't add more value than the cost of all the people working on it for as long as it took, compared to buying a solution.
Just like the rest of Airbnb started before VPCs were GA and thus required a large engineering investment to move everything to VPCs, we started on the secret management stuff before there were a lot of good other options available (though arguably Hashicorp Vault was around and mature enough at the time, and would have been the best alternative). I haven't looked at envkey for production use but I've definitely considered it for home use since it's just so deliciously simple.
I was looking at this with great interest, until I looked at the pricing and found out the lowest hosted pricing tier starts at $150/month.
I know there's a free "community" hosted version, but I'm not sure what the differences are outside of the limits and support, and I'd prefer to see the pricing scale up a bit more gently than 0 -> $150 as soon as I reach the limits of the free offering.
What would need to change for you to feel that's not the case?
The open source version is fully functional and can definitely scale beyond toy projects. You don't get high availability, multi-instance clustering, auto-scaling, multi-region failover etc. built in, but if you put it on a beefy host it can easily handle a large number of users and a very high request rate.
The way I think of it is we give you the fully functional server, but charge for advanced infrastructure and a few advanced features (SSO/Teams).
It's comparable to the open source version of a tool like Vault where you get the server, but need to implement advanced stuff like HA, auto-scaling, networking, etc. yourself, or else use a paid version.
Even though I would personally choose to use the hosted version, I still consider the availability of an open-source solution to be a great hedge against vendor lock-in/failure.
This is only true if I can be confident that I can replicate the hosted setup with the open-source version if I invested the necessary resources. Otherwise, the existence of an open-source option adds little value. In fact it can turn me off from a product since it'd seem like they're using open source as a marketing hook with no real intention of empowering users to be able to actually move off their hosted platforms.
We could potentially enable clustering for the Open Source version. One of the reasons it isn't already is that our clustering implementation is currently AWS-specific, since it relies on the AWS metadata endpoint to look up a host's internal IP, as well as networking rules that allow hosts to talk to each other.
This is why Vault requires another piece like Consul (plus a whole lot of tricky infra/networking work) to achieve HA.
That said, we could allow users of the Open Source version to specify a URL via an env var to look up a host's internal IP so that clustering would work.
Auto-scaling is provider-specific though, so I don't see how that could be baked in. Same with secure networking.
I'll also just say that while we do want the open source version to be fully functional (if a bit more DIY), another motivation for us that I see as equally important for a security product is transparency.
While it's inarguably crucial for any clients implementing end-to-end encryption to be open source, I think there's a lot of value in open sourcing the server as well (regardless of how practical it is to actually run) so that users can know what's happening on the server-side, see that the code is high quality and tested, and so on.
Perhaps you'll see this if you get notified of replies: per your suggestion, we have now introduced a lower tier in between the free tier and the ~$150/month tier.
Thanks for the feedback. We're considering some changes here: specifically, reducing the limits of the free tier a bit and then adding another tier in the $50/mo range between the free tier and the $150/mo tier. Would that be a better fit for you?
Security software that puts SSO support behind a high pricing tier (and as far as I can tell has no other way to get e.g. any kind of 2FA) is a very bad look.
> EnvKey Business Self-Hosted runs in an AWS account you control. You can use it with any host or cloud provider.
Is a bit confusing; that section should be clarified. Does it mean that you can run the systems using it somewhere else? Does it mean other variants of EnvKey can be run anywhere? ...
On price-gating SSO, I'm sympathetic to your argument. I'll consider adding it to the lower-priced tier.
2FA is already effectively built-in to EnvKey through device-based authorization. A user can only sign in to EnvKey from an authorized device, so an email account compromise won't be enough for an attacker to gain access--they would also need access to an authorized device.
A passphrase can optionally be supplied on top of this for an additional factor (though it's unnecessary if you're already using OS-level disk encryption).
It's basically the same model as SSH. And imo it's superior to SMS or app-based 2FA (perhaps not token-based, which I'm open to adding). It handles the main threat models (phishing/email account compromise) with far better UX and convenience.
"Is a bit confusing, that section should be clarified a bit. Does it mean that you can run the systems using it somewhere else? Does it mean other variants of EnvKey can be run everywhere? ..."
I agree this could be less confusing.
The EnvKey host server runs in your AWS account, but that doesn't mean that apps you integrate EnvKey with are in any way limited to AWS. You could have your apps running in Heroku, GCP, Azure, or whatever, and integrate with your self-hosted EnvKey installation for configuration and secrets management with no problem.
What if the key is in a git-crypted file? I get what you are saying about an open file, but surely best practice is to use encrypted files to store secrets that are needed, e.g. for deployment.
An encrypted file stored along with the git repo (without the decryption key) has a different attack surface. My original comment was more targeted towards users storing their keys in plaintext.
This is probably a good time to remind people to check their authorized OAuth applications on Github[1] and make sure that any unused apps have their access revoked.
I've never had this problem, but I thought of a partial solution. Say you have your unit tests and they are using the same auth and logging mechanisms as prod. Create a user with a password like "ThisStringIsAPassword1234" and run the unit tests, having them output logs to disk. Then see if the logs contain that value.
Anybody ever do something like that? How effective it would be probably depends on unit test coverage.
You could also probably just do the same thing in prod with a dummy user.
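A rough sketch of what that check could look like (Jest-style, assuming the suite has already run and written its logs under ./logs):

```javascript
// Scan every log file produced by the test run for the sentinel password.
const fs = require("fs");
const path = require("path");

const SENTINEL = "ThisStringIsAPassword1234"; // the dummy user's password

test("logs never contain the sentinel password", () => {
  for (const file of fs.readdirSync("./logs")) {
    const contents = fs.readFileSync(path.join("./logs", file), "utf8");
    expect(contents.includes(SENTINEL)).toBe(false);
  }
});
```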
That is one thing that RFC8959 is intended to solve. If you see "secret-token:" in any logs after running tests you flag that as a problem and fail the test.
I agree it should be this simple, but I'd bet they have tests like this and unfortunately it's never quite this simple in a production system.
I've always wanted to apply strong type systems to this problem – wrapping sensitive data in types that do not have the ability to be printed to logs would theoretically allow you to know after type-checking that passwords can't be output to logs. However again I think this is wishful thinking as a password needs to be sent somewhere at some point, and that creates places where issues can occur.
That's kind of how I've always solved it at the smaller companies I've worked at. I'm on Hacker News all day and know all about these kinds of footguns so I write all the auth code myself and then provide a couple of helper functions/methods for the team to use that are unable to do anything too silly. Creating an object that throws an exception when you try to get it as a string or print it though is something I'll have to try next time.
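Something like this is what I have in mind (a sketch; the escape-hatch method name is arbitrary):

```javascript
// A wrapper that refuses to be stringified or JSON-serialized, so a stray
// console.log or JSON.stringify can't leak the value by accident.
class Secret {
  #value;
  constructor(value) { this.#value = value; }
  reveal() { return this.#value; } // the one deliberate escape hatch
  toString() { throw new Error("Refusing to stringify a Secret"); }
  toJSON() { throw new Error("Refusing to serialize a Secret"); }
  [Symbol.for("nodejs.util.inspect.custom")]() { return "[Secret]"; } // what console.log shows
}

const password = new Secret("hunter2");
console.log(password);                 // prints: [Secret]
// `${password}` or JSON.stringify(password) throws instead of leaking
```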
I've worked on a service that handled credentials where we added tests like this to try to catch if a log statement gets added containing the username/password. We used a few end-to-end tests rather than attempting to include something like this in the unit tests for every function.
Our tests would set up the app's full context, get a hook into the logging framework to watch for log statements, then make requests to the service containing a set of dummy credentials, like { username: "foo", password: "bar" }. If a log statement containing "foo" or "bar" was detected the test failed.
It's not going to catch every type of issue, but at least some potential footguns can be prevented this way.
That gives me an idea. Create a decorator or otherwise wrap the logging function as you build the app's test context, and feed it a list of sensitive strings you want to detect. Then, each time the logger is called, have it assert that none of those strings appear in the log message.
This way it would blow up on the test that is leaking the credential so you could track it right down and it would transparently apply to all current and future unit tests without any more effort.
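Roughly like this (the logger shape and the strings are just illustrative):

```javascript
// Wrap the app's logger during tests so any log line containing a known
// sensitive string fails loudly at the exact call site that leaked it.
function guardLogger(logger, sensitiveStrings) {
  const wrap = (fn) => (...args) => {
    const line = args.map(String).join(" ");
    for (const s of sensitiveStrings) {
      if (line.includes(s)) {
        throw new Error(`Sensitive value leaked in log message (starts with "${s.slice(0, 3)}...")`);
      }
    }
    return fn.apply(logger, args);
  };
  return { ...logger, info: wrap(logger.info), warn: wrap(logger.warn), error: wrap(logger.error) };
}

// In the test setup, e.g.:
// app.logger = guardLogger(app.logger, ["foo", "bar", dummyApiKey]);
```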
I saw this at a.. scary-large.. company I was at in 2014ish. If a client changed their password, it would get logged into a log file, plaintext. I asked a coworker why they did this, and he said it was to tell the client their password. Hm.
They did actually patch it before I got there though.. but they didn't get rid of the years-old log files with the passwords. Found them while trying to find the root password (unsuccessfully) for a host that we couldn't reboot. The ones I tested still worked.
I wouldn't be surprised if something similar happened here. Old log files in backups and such.
You say "attack campaign", I say bad habits catching up with secrets scanners *and* someone noticing it. Black hats might have been exploiting this already in the past.
> Using their initial foothold of OAuth user tokens for GitHub.com, the actor was able to exfiltrate a set of private npm repositories, some of which included secrets such as AWS access keys.
So NPM was storing AWS secrets in their (private) git repos. IMHO that was an accident waiting to happen.
> Unrelated to the OAuth token attack, we also recently internally discovered the storage of plaintext credentials in GitHub’s internal logging system for npm services
This isn't the first time GitHub has found logging of plaintext credentials [0]; it's not a good look, for a company with the resources that GitHub has, to have to disclose it again almost exactly 4 years later.
100% agree, but for most companies, disclosing it once makes it a high-priority issue to ensure it never happens again; 0.1%-1% of the engineering org gets tasked with building a logging framework, making sure there's no way to log outside the framework, and chasing down stragglers who can't/won't use it to get their management chain to formally accept the risk of it recurring in their little corner of production.
I've seen it quite a bit, even at fintech companies I've worked at. I mean, I've seen passwords show up in SQL slow query logs before (and other SQL errors, e.g. on conflicts).
It happens. You fix the issue, purge the logs, and learn from it.
If users log in through some web form, the credentials can end up in a log because the web server logs the form parameters (or the URL, and the parameters are in the URL). Not good, but it happens.
I think inserting credentials in a URL in the first place is the problem. Always send it in the request body. Never configure a webserver to log request bodies, they'll eventually contain sensitive information.
You're assuming they were in the URL and that only the URL was logged.
APIs are extremely fraught, because users like to build integrations that jam credentials into the wrong places until they get a 200. You haven't lived until you've added a regexp for your API keys appearing in the wrong HTTP headers.
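Something like this, run over the headers before they hit the access log (the key format here is a made-up example, not any real provider's):

```javascript
// Scrub anything that looks like one of our API keys out of request headers
// before logging them. "sk_live_" + 24 chars is an illustrative pattern only.
const API_KEY_PATTERN = /sk_live_[A-Za-z0-9]{24}/g;

function scrubHeaders(headers) {
  const clean = {};
  for (const [name, value] of Object.entries(headers)) {
    clean[name] = String(value).replace(API_KEY_PATTERN, "[REDACTED]");
  }
  return clean;
}

// e.g. logger.info({ headers: scrubHeaders(req.headers) }, "incoming request");
```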
LOL, there are literally entire products designed to prevent and report on these issues. Just because you haven't seen it doesn't mean it isn't happening daily, unfortunately.
Say someone implemented OAuth or some other fancy auth mechanism in dockerized software and hit the problem that the environment variables containing the secrets weren't passed through the layers to the application server, so the developer (who didn't have SSH access to the machine or the container) was forced to build in a lot of logging to discover where along the path the secrets got lost. Then the developer forgot to remove all the logging incantations.
This isn't uncommon. For example, when MIT/Harvard took over edX from the original researcher who built it, it turned out the new team didn't know how to build software, and they introduced this same issue: passwords in log files.
It was fixed much later. You can look over the git logs.
I'm not sure this is so much a "how could this happen" as a "thank you for being transparent." Most organizations cover this sort of thing up, like MIT/Harvard (and now 2U).
Top 10 maintainer here, got a few emails this morning about it.
Meh. Shit happens. If we all Pikachu face every time an exploit happens we are lying to ourselves. We'll never reach perfect security. It's a pipe dream.
What matters more is the disclosure and response. I'm not a huge advocate of npm personally, but I respect their response to this so far. From what I gather (the email was a bit long-winded), nothing vastly detrimental occurred, they automatically invalidated passwords, and publishing again next time will require a couple of minutes of extra work at most. I'll take it.
Let's all stop acting like products need to be perfectly and eternally secure. That's not how threat modelling works, any security professional knows that's impossible, and it's unfair to expect that from anyone, including big corporations.
Npm has done a lot of relevant and good work toward their security efforts over the years, in some cases going a bit far even in my own opinion. The comments I've seen so far have been a bit unfair.
If this was a clever hack, I'd agree with you, shit happens.
But for god's sake, having secrets hardcoded in VCS??
You seem to understand threat modelling. What's the threat towards one of the biggest and most-used package registries?
It's not like npm Inc just started running the registry. They have been doing this for years. To let such a beginner mistake risk the supply chain of basically the entire JS ecosystem is not only sloppy, it's completely unprofessional.
What can we do in the short term? I'm not sure, but I hope smarter people than me come up with solutions ASAP, before a compromise like this starts actually impacting developers using npm.
It is long past time for channel-binding of bearer credentials. It's unconscionable we as engineering professionals allow the total sum of security controls of a system to constantly be reduced to a string humans pass around.
OIDC-based authorization between trusted parties seems to be picking up steam[0], with the main issue being that setting it up isn't as easy as "paste shared secret".
I wrote a blog post about connecting Github and AWS over OIDC. In my example I focus on using it to publish images to ECR, but it's applicable to any AWS permission needs.
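The gist is a workflow that assumes an IAM role via OIDC instead of storing long-lived AWS keys as repo secrets. A sketch (the role ARN, region, and action versions are placeholders, not taken from the post):

```yaml
name: deploy
on: push

permissions:
  id-token: write   # lets the job request an OIDC token from GitHub
  contents: read

jobs:
  push-image:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-deploy  # placeholder
          aws-region: us-east-1
      - uses: aws-actions/amazon-ecr-login@v2
      # ...docker build & push to ECR as usual
```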
I agree, but even so, you can get a long way from "having plaintext credentials committed to source control" without changing your entire infrastructure, which a move like that would require. Simply use environment variables and a .env file.
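A minimal sketch of that approach using the `dotenv` package (the variable name is just an example); the crucial part is that `.env` is listed in `.gitignore` and never committed:

```javascript
// .env (NOT committed -- add it to .gitignore):
//   MY_SECRET_KEY=myGitHubPersonalToken

// index.js -- load .env into process.env at startup
require("dotenv").config();

const token = process.env.MY_SECRET_KEY;
if (!token) throw new Error("MY_SECRET_KEY is not set");
// ...use `token`; the repo itself never contains the value
```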
This is why we should use packages from well-known Linux distributions instead of npm/pip/cargo etc.
I know the available libraries are a fraction of the ecosystem, but very often it's a good enough fraction if you are willing to be flexible in your choices.
Serious question: what’s the difference? Do Linux distros actually audit packages very much? Supply chain attacks in Linux distros are pretty scary, as you can basically expect them to run install scripts as root.
A bit, but I assume it's mostly just the extra friction to get things in that acts as the main filter. Malware has made it into Debian before. See the xscreensaver time bomb.
The extra friction also makes Linux package managers useless. Just about 0% of the things people are installing with npm exist in distro repos. Distro repos are also extremely poor at keeping multiple versions around at once.
Surprisingly people don’t like having their react upgrade tied to the Debian major version update.
A fraction, and based on my old Ruby gems memories, fairly out of date.
I remember that somebody posted a reply to a comment of mine here on HN years ago saying that they were rebuilding every single gem as a deb package before deploying it in production, and that it was the only sensible way to do it. I don't think it adds much to security unless they also read all the code, but it's a lot of work that none of my customers are going to pay me for. I also probably don't want to start a profession of deb builder for Ruby gems.
> A fraction and based on my old Ruby gems memories, fairly out of date
That's almost certainly true, but in the case of having to choose, I'd almost always select slightly older but trusted code over less trustworthy bleeding-edge newness.
I guess that's why I'm content with dozens of instances running Debian.
Older dependencies, of course, might contain security vulnerabilities that have been patched in more recent versions. It's a frustrating choice: stay on old versions and risk exposure to old vulns, or stay up to date and risk supply chain attacks.
That package is wildly out of date. I’d estimate the vast majority of react apps are running on a much newer version. And no one wants to have their react upgrade tied to a Debian upgrade. Forcing these things to happen at the same time would be a nightmare.
Stop using thousands of packages? Start vetting packages, as if security was important?
There are already dozens of us saying it is possible to not have too many dependencies, and vet packages before installing. But every time we open our mouths we are treated as if we just escaped some sort of insane asylum.
At a place I worked in the past, we had a 40-line microservice using plain Node without any dependencies. That was by design. One junior dev took it upon themselves, in their spare time, to convert the whole thing to use some JS MVC framework, complete with a full-blown build process, transpilers, and the whole nine yards. There was a big discussion in the PR, and a lot of juniors complained that we should migrate because they "didn't learn plain Node.js in college".
Ensuring that the core infrastructure of their software systems doesn't have the same security standards as a teenager's WordPress site would be an awesome start.
That's really unfortunate. I think education is the best place to teach these things bottom up. Especially considering that you understand a framework way better after creating one yourself. There are not that many essential parts in a web framework anyways, it's the plumbing, the tooling and the polish around them that makes them productive.
Exactly. IMO you're a 100x better developer when you've at least tried to create something you use from scratch.
I find, however, that this will become harder and harder, with newbies constantly being bombarded online with messages that creating anything from scratch is a futile exercise, together with companies influencing college curriculums.
It's unrealistic to vet the entire chain of dependencies especially considering how they can become vulnerable at any time. Developers can't be expected to be responsible for that.
What we need is a set of policies around dependencies that we agree on as an industry and tools to help make securing our systems easier.
Nope. It is completely realistic as long as you keep the chain of dependencies manageable.
To vet dependencies properly you have to stop using thousands of them willy-nilly.
To stop using thousands of them, you mostly have to change only your development dependencies. Runtime dependencies, even in large JavaScript projects, are often well behaved and are rarely a big issue, except maybe for a few large backend frameworks.
No, that won't destroy your company.
Anything that helps vetting will be more than welcome, though.
For once, I don't think this highlights an issue exclusively specific to JS... this could've happened to any package system that Github owned once the attacker was able to pivot after accessing the private repos.
I'm of a bit of an opposite mind on the many, and usually very public, NPM security issues, because in my experience the JS ecosystem and its woes teach a lot of people to never trust the part of their operation that comes from someone else. Not everyone, obviously, but in my anecdotal experience it's far more common to see good package control and review processes for JS than for any other language, except maybe for Python when the Python is done by software engineers and not "data scientists".
Supply chain security is immensely important, and I encourage you not to learn about it the hard way like I did. Which somewhat ironically happened in the .Net ecosystem when one of our trusted Nuget packages got hacked many years ago. Now, I could be mistaken and I hope I am, but I suspect that if you ask a Java, a JS and a C# developer if they trust their ecosystem, then only one of them is likely to say yes.
So no, there won’t be some great revelation in the JS community. The best you can hope for with stories like these is that fewer developers feel like imposters when they realise that GitHub stores plaintext security assets in their logs.
I hope this is self-deprecating humor, but for anyone that takes it at face value:
The implication of a successful attack on NPM, with huge unvetted dependency graphs currently in fashion, would be that any of the thousands of dependencies of a modern small JavaScript app could suddenly include malicious code that runs on your dev machine or your production systems.
(That's why the key part of the announcement is "GitHub is currently confident that the actor did not modify any published packages in the registry or publish any new versions to existing packages".)
Did I miss the previous GitHub announcements about this amidst all the noise about how badly Heroku handled their part of this problem? Or have GitHub been sitting on the specific facts (that a database with emails and hashed passwords was leaked) for over a month now?
I take it you’ve never been involved in a breach investigation. Figuring out what an attacker had access to and whether they exploited that access isn’t trivial, especially for a heavily used service like npm. To say they were “sitting on” information while probably tens if not hundreds of engineers assisted in making sure the investigation was complete and accurate is uncharitable.
How presumptuous. I’ve worked with all of the teams involved in this specific incident over the past 12 years. I’ve helped lead incident response for much larger breaches. I know what’s involved. And the moment you’re aware customers have had their passwords exposed (even if hashed) you come clean on it and empower them to take mitigating action so that they’re not unduly exposed longer than they need to be. You probably even consider forcing password resets for everyone.
As Heroku did, in its own hamfisted and still too slow way.
And you know the exact moment GitHub became aware that passwords were exposed and that they "sat on" that information? If you actually did work with those people, that's even more reason to assume good faith. Saying "I could have done it better" isn't helpful.
Yeah, that's the only one I recalled seeing. And nothing at the time about a database leak with emails and hashed passwords, nor about a backup that included access to private repos.
Shameless plug: I created a Golang utility to scrub passwords from a deeply nested struct, before logging, at Nutanix some time back [0][1]. We also run an automated test to try out all operations with a known password, and then ensuring that it is not present in any of the log files.
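The utility itself is Go, but the idea translates to other languages; a rough JS sketch of the same pattern (the key list is illustrative):

```javascript
// Recursively scrub sensitive fields from a nested object before logging it.
const SENSITIVE_KEYS = /password|secret|token|api[_-]?key/i; // illustrative list

function scrub(value) {
  if (Array.isArray(value)) return value.map(scrub);
  if (value && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) =>
        SENSITIVE_KEYS.test(k) ? [k, "[REDACTED]"] : [k, scrub(v)]
      )
    );
  }
  return value;
}

// logger.info(scrub(requestPayload));
```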
On one hand it's good to have an organization with lots of engineering resources behind these, but I have to wonder whether that much stuff in one place (Microsoft in this case) won't bite us down the line.
If you do client-side hashing, then the hash becomes the password; it only helps with password reuse on other services. From your service's perspective, the security issue doesn't change much.
"On April 12, GitHub Security began an investigation that uncovered evidence that an attacker abused stolen OAuth user tokens issued to two third-party OAuth integrators..."
Too vague to be useful. Stolen from whom? Simultaneously from two different third parties? The whole thing began April 12? Based on what signal and from whom?
Here's what this vagueness tells me: GitHub OAuth token management for integrators is badly operated and poorly monitored by the entity issuing tokens.
My conclusion is that Microsoft (GitHub), SalesForce (Heroku), and Idera (TA Associate, Travis CI) have not learned the SolarWinds lessons of Supply Chain Security yet. Perhaps this will be the incident that helps.
Some of the posts in this thread are a little too c'est la vie for my tastes.
> > "On April 12, GitHub Security began an investigation that uncovered evidence that an attacker abused stolen OAuth user tokens issued to two third-party OAuth integrators..."
> Too vague to be useful. Stolen from whom?
Heroku and Travis CI. That information is literally at the end of the sentence you cut off:
"On April 15, we published a blog detailing an attack campaign utilizing stolen OAuth user tokens issued to two third-party GitHub.com integrators, Heroku and Travis CI."
It could also be read as a total of four third-party integrators. The comma between "integrators" and "Heroku" could indicate part of a list. It's not as clearly written as it could be, that's for sure.
It would really only make sense for someone to write it that way if Heroku and Travis CI were not "third-party OAuth integrators" (which they are). At some point we're just playing "Yes but if we assume the author did not speak English natively and made this particular kind of odd error, then you could read it as meaning...". Occam's Razor applies – the simplest possible reading is almost certainly the correct one.
On a separate note:
Why registries like npm take pre-built packages uploaded by users, instead of taking the raw repo and building it in some kind of reproducible environment (like NixOS, or even a standardized Docker image), is beyond me. I think F-Droid does something like that.
I think it would help a lot against various supply chain attacks.
(Not this one, though.)
Native extensions (C++) would be a pain, but with the advent of WASM, maybe native extensions aren't such a great idea anymore, huh ¯\_(ツ)_/¯
> The password hashes in this archived data were generated using PBKDF2 or salted SHA1 algorithms previously used by the npm registry. These weak hashing algorithms have not been used to store npm user passwords since the npm registry began using bcrypt in 2017.
Which is so frustrating. When you upgrade your hashing algorithm, always, always, _always_ immediately remediate the weak-hash mess by hashing your weak hashes with your new, stronger hash, and turn the login check into something like the sketch below (assuming the legacy scheme was salted SHA1):
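(A sketch using the Node `bcrypt` package; function names are made up.)

```javascript
const crypto = require("crypto");
const bcrypt = require("bcrypt");

// One-time migration: bcrypt the *stored* weak hashes so no raw SHA1 remains on disk.
async function wrapLegacyHash(legacySha1Hex) {
  return bcrypt.hash(legacySha1Hex, 12);
}

// Login check for migrated users: recompute the legacy hash from the submitted
// password, then verify it against the bcrypt-wrapped value.
async function checkWrappedPassword(password, salt, wrappedHash) {
  const legacy = crypto.createHash("sha1").update(salt + password).digest("hex");
  return bcrypt.compare(legacy, wrappedHash);
}
```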
You can then upgrade them to a straight bcrypt hash if the check succeeds, but keeping the weak hashes on disk indefinitely until the user logs in (if ever!) is such a risk.
It's an interesting spin: we learned to store hashed passwords, but there is virtually zero advice to store hashed tokens. I can't remember any corresponding point in the OWASP guides. It's funny how the old username:password pair is more secure than modern tech.
“The password hashes in this archived data were generated using PBKDF2 or salted SHA1 algorithms previously used by the npm registry. These weak hashing algorithms have not been used to store npm user passwords since the npm registry began using bcrypt in 2017. ”
What does it mean for a company to "know better" than to do something?
It means some combination of the following: people don't make the mistake in the first place, the mistake is caught in code review, the mistake is caught in audits, the mistake is caught by automated tooling...
> Salesforce-owned Heroku noted that some of its private repos were accessed on April 9 before slamming the brakes on GitHub integration. That integration was restored earlier this week, according to the company's status page.
This is important for this forum, imo. A lot of shade was cast at Heroku during this.
I'll quote my comment from the duplicate thread, because I think it's important people audit their authorized applications on Github:
> This is probably a good time to remind people to check their authorized OAuth applications on Github[1] and make sure that any unused apps have their access revoked.
>
> [1]: https://github.com/settings/applications
Something I cannot emphasize enough for startups is to not use your own auth no matter how convenient you might think it is. Github was built before the era of many "Authentication as a service" providers, but this is another example of why you don't roll your own.
... AWS offers a managed authentication service (Cognito); for many websites your data is literally already in AWS. They have zero reason to hold your usernames and emails hostage...
The same can be said for GCP (Identity Platform) and Azure.
Rolling your own has some benefits, though. Such as not being as big a target as a provider that services many sites, and being more in control of the auth data, and not being reliant on a service that's beyond your control to not suddenly stop working or shut down.
For sure some auth providers will be compromised at some point in the future, but there are literally thousands (tens of thousands?) of examples of self managed identity management being compromised. How many Auth0 or AWS Cognito service compromises have there been?
To your point about availability: AWS, GCP, and Azure all have managed authentication services that are fundamental to their platforms. I highly doubt they are going anywhere. I have yet to see a Cognito outage in three years.