What about using existing crypto libraries (such as OpenSSL) to add crypto to an...

crpatino · on April 14, 2015

You are probably going to get it wrong, not because there's anything wrong with you, but because standard crypto libraries are too fine-grained. The primitives are too low level, and you have to assemble sequences of calls in the right order and be on the watch for unexpected side effects.

I could tell you how I have gotten it wrong in the past, but there is no guarantee that I won't get it wrong again in a different way. So, the audit idea has its merits, but you really do not want to rely on Linus's Law of eyeballs. That means knowledgeable auditors who charge actual money for their time.

tptacek · on April 14, 2015

Same as "rolling your own crypto"; in fact, that's generally, except in extreme cases, exactly what we're referring to when we talk about people rolling their own.

derekp7 · on April 14, 2015

So what are my options then? I could put out an initial version that shows what I want to accomplish, then either have it reviewed heavily, or see if I can get someone who's more of an expert to give it a go based on those specifications. Here are the problems I have so far, where I know I'm out of my depth:

1) In order to keep the backup software's deduplication functionality (and to minimize the server trusting the client), I need the IV (initialization vector) to be a computed value based on the file's contents (so that multiple copies of the same file on the clients will encrypt to the same contents going to the backend server). I know that _This will leak some data_ (i.e., you may not want an attacker to know that a given set of files have the same contents). So I plan on making this optional -- either the user gets to take advantage of dedup, or gets better encryption.

2) To do number 1 above, I was planning on taking a hash of the file, encrypting that with the user's key, then using that as a predictable IV. You will still have a unique IV per unique file, but I don't know enough to see if this can leak any other data. It looks like RFC 5297 describes an approach similar to this, but I think it is for a different use case.

3) I need the backup server to know which version of an encryption key the client used. That is, if the client changes keys between backups, I need the server to instruct the client to do a full backup, not an incremental one. So I can either have the client provide a version number for the key (or if using a keyfile, use the datestamp of the file as a version string), or I can encrypt a given string using the key, and use a hash of the encrypted string as the key version number. (Note: in no case will I ever be storing a hash of the plaintext of the client's files on the backup server, as that too can leak information.)
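To make item 3 concrete, here is a minimal sketch of a key version identifier. It swaps the "encrypt a fixed string, then hash it" idea for an HMAC-SHA256 over a fixed label, which has the same shape (a value that depends only on the key) but is not the exact construction described above. The function name `key_version_id` and the label `KEY_ID_LABEL` are hypothetical, and this is an unreviewed illustration rather than a vetted design.

```python
import hashlib
import hmac

# Hypothetical fixed label; any constant, application-specific byte string works.
KEY_ID_LABEL = b"backup-key-id-v1"


def key_version_id(key: bytes, length: int = 16) -> str:
    """Return a short hex identifier that depends only on the key.

    The server can store and compare this value to notice a key change
    without ever seeing the key or any file plaintext.
    """
    tag = hmac.new(key, KEY_ID_LABEL, hashlib.sha256).digest()
    return tag[:length].hex()
```

If the identifier the client sends differs from the one recorded with the previous backup, the server would ask for a full backup instead of an incremental one.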

My apologies if the above makes any experts here cringe, but as I mentioned, my constraint is to have same-content files encrypted to the same target contents (for dedup purposes), although I will give the user the option to turn that off and use a random IV for better security (and give up dedup).
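The dedup-versus-random-IV choice could be sketched as follows. This only shows how the IV is picked, not the encryption itself. It uses a keyed hash (HMAC-SHA256, truncated to 16 bytes) in place of "hash the file, then encrypt the hash", and it assumes a separate `iv_key` that is distinct from the encryption key. The names `derive_iv` and `iv_key` are hypothetical, and none of this has been reviewed by a cryptographer.

```python
import hashlib
import hmac
import os

IV_LEN = 16  # bytes; matches the AES block size


def derive_iv(iv_key: bytes, data: bytes, dedup: bool = True) -> bytes:
    """Pick an IV for one file.

    dedup=True:  the IV is a keyed hash of the contents, so identical files
                 get identical IVs (and reveal that they are identical).
    dedup=False: the IV is random, so identical files look unrelated.
    """
    if not dedup:
        return os.urandom(IV_LEN)
    return hmac.new(iv_key, data, hashlib.sha256).digest()[:IV_LEN]
```

With `dedup=True`, two clients holding the same key and the same file derive the same IV, which is what lets the server deduplicate and also what leaks file equality. RFC 5297 (SIV) builds a complete, analyzed mode around a similar synthetic-IV idea, so an existing SIV implementation is probably a safer starting point than assembling this by hand.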