"Don't put secret keys in your repository" is also the wrong lesson.
The right lesson is: Know where your secret keys are and take the appropriate steps to secure them. Whether that's in the codebase, a properties/ini/conf/whatever file, environment variables, whatever - know where they are and make sure you understand possible threats against them.
This story could just as easily have been written about how easy it is to download ALL_THE_SECRETS.txt. Don't feel smugly secure just because you don't store passwords in git.
Putting keys in a text file doesn't fit the narrative of a generally-careful user forgetting about side effects and metadata.
It's important to know where your keys are, but it's also important to not store your keys in certain ways that are easily overlooked.
A lesson of "don't put secret keys inside the web root" is also useful.
But a lesson of "know where your keys are and secure them" is a bit too short-sighted. You don't just want them to be secure right now, you want the mechanisms keeping them secure to be mistake-resistant.
Don't put them in the code, even if you promise to be super careful.
My approach to security, when discussing things with our engineers:
1. Make a list of everything that absolutely positively cannot live without this data/access/permissions/etc.
2. Put the data somewhere where absolutely nothing whatsoever can ever read it (except root).
3. Figure out what one single change will resolve #2 so that the things in #1 can happen without any other things gaining access.
If you don't do #1, you don't understand your requirements/applications. If you don't do #2, then your data is probably vulnerable through some other mechanism. If you can't do #3 then you probably need to change something else (e.g. stop running all processes as the same user, stop running all services on the same box, stop trusting users, set up more granular sudoers rules, etc).
What I find is that when you come up with an idea for #3, and then come up with a list of side effects, you can actually find a lot of the kinds of issues I mentioned above, for example where the public website CMS (as 'daemon') and the accounting backend (as 'daemon') both have access to the same resources, and thus someone gaining access to the CMS can get the accounting DB user/pass and get access to your transaction records, user database, etc.
No, there are an infinity of things that shouldn't have access to your secret keys. So you have to take a default deny approach, and ensure that only things that positively should have access to your secret keys do.
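That default-deny stance can be made concrete at the filesystem level. A minimal Python sketch (the path and helper names are made up for illustration) that writes a secret file readable only by its owner, and a check that fails if anything else has been granted access:

```python
import os
import stat

def write_secret(path, data):
    # Create the file with owner-only permissions from the start;
    # O_EXCL refuses to clobber an existing file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)

def check_secret_perms(path):
    # Default deny: fail if group or other have any access bits set.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0
```

Access for the one thing that positively needs it is then granted deliberately (ownership, group membership, sudoers), rather than by loosening the mode for everyone.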
This system is truly beautiful; I think it's one of the best suggestions in the thread (definitely the best self-rolled solution not using other tools).
I have a question about redundancy, or "What happens if your gatekeeper EC2 instance goes down?" If you have multiple gatekeepers, could they be set up this way:
- let's say you have five different web apps using a gatekeeper to hold their secrets
- let's say you have n gatekeepers (let's say 3) and each of the apps knows the address of all three gatekeepers.
- If the primary gatekeeper is unreachable, all five apps would try to contact the secondary gatekeeper, but the secondary would only ever respond if it, too, had found the primary unreachable.
It's like a sleeper cell - at any given moment you have multiple replacement gatekeepers ready and waiting to serve, but each of them is unable to respond unless the one above it in the list stops responding. In this way you could lose gatekeepers (even permanently) and build a little bit of resilience into the apps depending on it while you're able to sort out what happened and restore normal behaviour.
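The client side of that priority-ordered failover could be sketched roughly like this (the endpoint addresses and the fetch interface are made up for illustration; the server-side "only answer if the one above me is down" check is left to each standby gatekeeper):

```python
import urllib.request

# Ordered list of gatekeeper endpoints; earlier entries take priority.
GATEKEEPERS = [
    "https://gk1.internal/secrets",   # hypothetical addresses
    "https://gk2.internal/secrets",
    "https://gk3.internal/secrets",
]

def fetch_secrets(fetch=None, gatekeepers=GATEKEEPERS):
    """Try each gatekeeper in priority order, returning the first answer.

    A standby gatekeeper is expected to refuse to answer (error/timeout)
    unless it has itself observed the one above it to be unreachable.
    """
    if fetch is None:
        fetch = lambda url: urllib.request.urlopen(url, timeout=2).read()
    last_err = None
    for url in gatekeepers:
        try:
            return fetch(url)
        except Exception as err:
            last_err = err
    raise RuntimeError("no gatekeeper responded") from last_err
```

Since every app walks the same ordered list, they all converge on the same surviving gatekeeper without any extra coordination.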
I'm confused by your reference to "gatekeeper EC2 instances." In the scenario I described, the secrets are housed on S3, not a separate EC2 instance. So, theoretically, as long as the underlying instance running the application code can access S3, and S3 doesn't go down (very unlikely), there shouldn't be any issues.
We (Shopify) use https://github.com/Shopify/ejson -- we store encrypted secrets in the repository, relying on the production server to have the decryption key.
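For reference, an ejson file is ordinary JSON with a `_public_key` entry, and `ejson encrypt` rewrites the secret values in place into an `EJ[...]` envelope; a sketch from memory of the project's README (verify details against the repo; the values below are illustrative placeholders):

```json
{
  "_public_key": "<64-char hex key from `ejson keygen`>",
  "database_password": "EJ[1:...]"
}
```

The matching private key lives on the production host (by default under `/opt/ejson/keys`), so the encrypted file is safe to commit.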
It's relatively common to provision secrets with configuration management software like Chef/Puppet/Ansible/etc., using, e.g., Chef's encrypted data bags.
Another slightly heavier-weight solution with some nice properties is to use a credential broker such as Vault: https://www.vaultproject.io/
Environment variables are the best and easiest way that I know of. You can supply those any way you want to, and any programming language can easily get their values.
Glad I don't work at your shop then. Environment variables are a terrible way to give your app secure information. There are well over a dozen reasons why you shouldn't do this in your apps, but one super obvious one is that way too many frameworks expose environment variables in their debug output if not properly configured. Think you'll never misconfigure a server? Guess again: pretty much every major site (Google, FB, Twitter, Yahoo, eBay, Microsoft, etc.) has done it at some point.
Fair point, I potentially should've left off the first sentence. I stand behind the rest of the post, but the first sentence is a bit on the edge and I apologize.
Alright, well, I've never seen an application/framework spit out environment variables when it was misconfigured. But then again, I barely work with web-related stuff so maybe I just don't use the kind of software that does this. Could you provide some examples?
The "dump environment" problem is an issue for novice developers, but mature shops should have security-conscious frameworks for secrets handling that do things like clear the variable from the environment at initialization time.
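A minimal sketch of that read-and-clear pattern (the variable name is made up; note the honest caveat in the comment about what this does and doesn't protect against):

```python
import os

def take_secret(name):
    """Read a secret from the environment and remove it immediately,
    so later debug dumps of os.environ won't include it.

    Note: this does NOT scrub /proc/<pid>/environ, which reflects the
    environment as it was at exec time.
    """
    value = os.environ.pop(name, None)
    if value is None:
        raise RuntimeError(f"missing required secret: {name}")
    return value
```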
If an attacker can get the process that's running webapp.py to exec some arbitrary bash command, that process has the ability to read its own /proc/$PID/environ . In general, you can read /proc/$PID/environ on processes that you own. At least I can do that on my Debian system:
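For example, a minimal Python version of that read (Linux-only):

```python
import os

def read_proc_environ(pid="self"):
    # /proc/<pid>/environ is NUL-separated KEY=VALUE pairs, frozen at
    # exec time; readable for processes you own (absent hardening).
    with open(f"/proc/{pid}/environ", "rb") as f:
        raw = f.read()
    return [entry for entry in raw.split(b"\0") if entry]
```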
(I actually gave the wrong example in my previous comment. While it is true that giving the ENV on cmdline will show up in ps eaux, the more appropriate example is what I just explained in this comment.)
If you can get it to exec some arbitrary bash command (or otherwise access the environ of a process) you can also have it cat any file on the server, and even the memory of the running processes that belong to the same user as the exploited process, and also execute network requests. So if you get that far, pretty much nothing will protect you.
Sure, but there are some shops that do their security from a point-of-view of "Attacker can run commands on your server as the user that started whatever-public-service/webapp/api", and go from there. I happen to think that's the best way to think about it.
Now, if an attacker manages to get root access then it's game over[1]. That just shouldn't happen. But nobody should be running their webserver as root. So whatever that user is should be low-powered, with only enough privileges to start the webserver & bind port 8080 (and use iptables or whatever to reroute connections from port 80 to 8080), and the whole setup should be designed so that this account can't escalate things further if someone got a bash shell on it.
______
1. You should at least have some way of detecting that it happened and consider all data & files compromised and just wipe the whole machine & start over. Or take that machine offline for investigation into what happened and put a fresh new one in its place.
If an attacker can run an arbitrary command on your server, it's already time to rotate all the credentials in your system and let any data subjects whose data you hold know that you fucked up, big time. That's just the Linux model.
I agree - I was just explaining the issue the above commenter raised. It just means you should use a saner way of initializing your environment with sensitive values.
My preferred solution currently is to use encrypted strings in config files that are not stored in VCS. The host machine encrypts and decrypts using host-specific keys, so if the file is copied off-server it is not fully compromised immediately. This is usually via a Python script which rewrites the file. (BTW, pretty easy to do on Windows boxes with the MS API.) I've considered using encrypted folders on Windows in addition, but not sure if that really makes a difference.
Usually the base config is in VCS but without user/password/db strings. We then manually configure the file with the encrypted strings on the server (usually with the machine name in the filename, so that we can use the hostname in code to find it, and it makes clear the file is machine-specific). Not all tools make this easy though, and it only works if you can add your own code in between. I also prefer files to environment variables, as the files can be locked down more easily in my opinion, and it's more obvious what is going on.
I like some of the other solutions that are using encrypted strings but with a keystore server and may consider for the future if they support both windows and linux.
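A rough sketch of that rewrite-the-file workflow, with the actual cipher left pluggable. The base64 stand-in below is NOT encryption; it only keeps the sketch runnable anywhere. Swap in a real host-keyed cipher (e.g. the `cryptography` library's Fernet, or DPAPI on Windows). Filenames and the `enc:` prefix are made up for illustration:

```python
import base64
import configparser

# Stand-in transforms; replace with real host-keyed encryption.
def encrypt(plaintext):
    return base64.b64encode(plaintext.encode()).decode()

def decrypt(ciphertext):
    return base64.b64decode(ciphertext.encode()).decode()

SECRET_KEYS = {"password"}  # which config keys get encrypted in place

def rewrite_encrypted(path):
    """Rewrite an .ini-style config, encrypting any plaintext secrets."""
    cfg = configparser.ConfigParser()
    cfg.read(path)
    for section in cfg.sections():
        for key in cfg[section]:
            value = cfg[section][key]
            if key in SECRET_KEYS and not value.startswith("enc:"):
                cfg[section][key] = "enc:" + encrypt(value)
    with open(path, "w") as f:
        cfg.write(f)

def read_secret(path, section, key):
    cfg = configparser.ConfigParser()
    cfg.read(path)
    value = cfg[section][key]
    return decrypt(value[4:]) if value.startswith("enc:") else value
```

Because the rewrite skips already-encrypted values, the script is idempotent and can be re-run safely after the file is edited by hand on the server.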
Anything stored in /private/ is not publicly accessible by the web server process, but can be read or written by anything running under the user's username.
It's specifically for storing things like configuration files.
I only just recently had to figure that out. I opted for setting up a .kdb KeePass file in a private git repo and giving everyone ("everyone" = myself + one other) access to that. I'm pretty sure that's not a very good solution.
Config files that are not version controlled, or environment variables. I prefer config files because it's easier for me to communicate to other team members what needs to be present in their local development environment.
I typically handle this by versioning a `config.example` file, which includes all the necessary config keys an application expects. The example file defaults these attrs to various strings meant to show they are examples only. I include instructions to copy the `config.example` to a `config.yml` (or some other appropriate extension) and replace the values as necessary. The `config.yml` file is specifically excluded in the `.gitignore` file. The application will only load the `config.yml` file when started, so I also make sure to raise a descriptive error informing team members when they are missing a local `config.yml`.
This allows the `config.example` to also serve as a self-documenting config for the application, as comments can be included that identify and explain each of the config keys and their purposes.
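A minimal loader in that spirit (sketch only; it uses JSON instead of YAML so it needs nothing outside the standard library, and the filenames are illustrative):

```python
import json
import os

CONFIG_FILE = "config.json"       # gitignored, real values
EXAMPLE_FILE = "config.example"   # versioned, placeholder values

def load_config(directory="."):
    path = os.path.join(directory, CONFIG_FILE)
    if not os.path.exists(path):
        # Descriptive error so a new team member knows what to do.
        raise RuntimeError(
            f"{CONFIG_FILE} not found. Copy {EXAMPLE_FILE} to "
            f"{CONFIG_FILE} and fill in real values."
        )
    with open(path) as f:
        return json.load(f)
```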
I store dummy values in VC, then edit the real data on the production server. (And I obviously never check anything in from production; it helps if you can set the production VC user to read-only.) This has a nice side effect that if I edit the configuration file, the new stuff gets merged in without causing a mess.
Another way is a second file that overrides settings as needed. Although I have found that to be less maintainable if the configuration file changes. That file should be somewhere entirely out of the VC tree.
Either way, the file must be placed in a directory that is not served by the web server.
/include and /public are traditional. Only /public is exposed by the web server.
For me, I keep connection credentials for a configuration database in environment variables. The config library then connects to the configuration server with those credentials and gets everything that application needs to connect to other services. I'd considered using etcd for this, but it was unstable for me at that time. I keep settings cached for 5 minutes, then the library re-fetches them in case they changed.
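That five-minute refresh could look roughly like this (a sketch; the fetch callable standing in for the config-service client is an assumption):

```python
import time

class CachedConfig:
    """Fetch settings via `fetch()` and cache them for `ttl` seconds."""

    def __init__(self, fetch, ttl=300):
        self._fetch = fetch
        self._ttl = ttl
        self._cached = None
        self._fetched_at = 0.0

    def get(self):
        now = time.monotonic()
        if self._cached is None or now - self._fetched_at >= self._ttl:
            self._cached = self._fetch()   # re-read from the config service
            self._fetched_at = now
        return self._cached
```

A bounded TTL like this trades a few minutes of staleness for not hammering the config service on every request.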
In a configuration file that is not version controlled, or even environment variables, so that your application starts with the right variables, but they are not in some config file.
As I detailed in my other response to your original question, use an example config file that is version controlled. It includes all the necessary config keys, but example-only values. All team members would then be able to easily create a local config file based on the example that works. You can even document the config with comments in the example file so devs know what is needed and what it's for.
I think at some point, if you have a shared password for a development DB, production DB, etc., then just keeping those in a pen-and-paper notebook is your best solution. Usually, for shared environments such as that (although I hope the team can set up their own DBs for development!), the number of shared "secrets" is relatively small. Some secrets are best not stored electronically, especially if they can give away user data.
Don't put secret keys in your repository.
Someone getting a copy of your code should be a big annoyance at worst.