That's completely crazy: the backdoor is introduced through a very cryptic addition to the configure script. Just looking at the diff, it doesn't look malicious at all; it looks like build-script gibberish.
This is my main takeaway from this. We must stop using upstream configure and other "binary" scripts. Delete them all and run "autoreconf -fi" to recreate them. (Debian already does something like this, I think.)
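For concreteness, a rough sketch of that workflow, assuming autoconf/automake (and libtool, if the project uses it) are installed; the exact file list varies per project:

    rm -f configure aclocal.m4 Makefile.in   # throw away the shipped, generated files
    autoreconf -fi                           # regenerate them from configure.ac / Makefile.am
    ./configure && make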
> We must stop using upstream configure and other "binary" scripts. Delete them all and run "autoreconf -fi" to recreate them.
I would go further than that: all files which are in a distributed tarball, but not on the corresponding git repository, should be treated as suspect.
Distributing these generated autotools files is a relic of times when it could not be expected that the target machine would have all the necessary development environment pieces. Nowadays, we should be able to assume that whoever wants to compile the code can also run autoconf/automake/etc to generate the build scripts from their sources.
And other than the autotools output, and perhaps a couple of other tarball build artifacts (like cargo simplifying the Cargo.toml file), there should be no difference between what is distributed and what is on the repository. I recall reading about some project to find the corresponding commit for all Rust crates and compare it with the published crate, though I can't find it right now; I don't know whether there's something similar being done for other ecosystems.
One small problem with this is that autoconf is not backwards-compatible. There are projects out there that need an older autoconf than distributions ship with.
The test code generated by older autoconf is not going to work correctly with newer GCC releases due to the deprecation of implicit int and implicit function declarations (see https://fedoraproject.org/wiki/Changes/PortingToModernC), so these projects already have to be updated to work with newer autoconf.
Typing `./configure` won't work, but something like `./configure CFLAGS="-Wno-error=implicit-function-declaration"` (or whatever flag) might work (IIRC it is possible to pass flags to the compiler invocations used for checking the existence of features) without needing to recreate it.
Also, chances are you can shove that flag into some old `configure.in` and have it work with an old autoconf for years before having to update it :-P.
Yes, it sucks to add yet another wrapper, but that's what you get for choosing non-backwards-compatible tools in the first place, in combination with projects that don't keep up to date on supporting later versions.
> Why do we distribute tarballs at all? A git hash should be all thats needed...
A git hash means nothing without the repository it came from, so you'd need to distribute both. A tarball is a self-contained artifact. If I store a tarball on a CD-ROM and look at it twenty years later, it will still have the same complete code; if I store a git hash on a CD-ROM, without storing a copy of the repository together with it, twenty years later there's a good chance that the repository is no longer available.
We could distribute the git hash together with a shallow copy of the repository (we don't actually need the history as long as the commit with its trees and blobs is there), but that's just reinventing a tarball with more steps.
(Setting aside that currently git hashes use SHA-1, which is not considered strong enough.)
Except it isn't reinventing the tarball, because the git hash forces verification that every single file in the repo matches what's in the release.
And git even has support for "compressed git repo in a file", "shallow git repo in a file", or even "diffs from the last release, compressed in a file". They're called "git bundles".
They're literally perfect for software distribution and archiving.
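As a rough sketch of how that could look (the branch, tag, and file names here are hypothetical):

    # Producer: package everything reachable from the release branch and tag into one file.
    git bundle create project-1.2.3.bundle main v1.2.3
    git bundle verify project-1.2.3.bundle      # checks the bundle is self-contained and intact

    # Consumer: clone straight from the file and confirm the commit hash
    # matches the one announced for the release.
    git clone project-1.2.3.bundle project-1.2.3
    git -C project-1.2.3 rev-parse --verify 'v1.2.3^{commit}'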
People don't know how to use git hashes, and it's not been "done". Whereas downloading tarballs and verifying hashes of the tarball has been "good enough" because the real thing it's been detecting is communication faults, not supply chain attacks.
People also like version numbers like 2.5.1 but that's not a hash, and you can only indirectly make it a hash.
> I would go further than that: all files which are in a distributed tarball, but not on the corresponding git repository, should be treated as suspect.
This, plus an automated A/B diff check of the tarball against the repo that flags any mismatch.
I don't think it would help much. I work on machine learning frameworks. A lot of them (and math libraries) rely on just-in-time compilation. None of us has the time or expertise to inspect JIT-ed assembly code. Not to mention that much of the code deliberately reads/writes out of bounds, which is not an issue if you always add some extra bytes at the end of each buffer, but which can make most memory sanitizer tools useless. When you run their unit tests, you run the JIT code, and then a lot of things could happen. Maybe we should ask all packaging systems to split their builds into two stages, compile and test, to ensure that test code cannot affect the binaries that are going to be published.
I would rather read and analyze the generated code than the code that generates it.
Maybe it's time to dramatically simplify autoconf?
How long do we need to (pretend to) keep compatibility with pre-ANSI C compilers, broken shells on exotic retro-unixes, and running scripts that check how many bits are in a byte?
Not just autoconf. Build systems in general are a bad abstraction, which leads to lots and lots of code to try to make them do what you want. It's a sad reality of the mismatch between a procedural task (compile files X, Y, and Z into binary A) and what we want (compile some random subset of files X, Y, and Z, doing an arbitrary number of other tasks first, into binary B).
For fun, you can read the responses to my musing that maybe build systems aren't needed: https://news.ycombinator.com/item?id=35474996 (People can't imagine programming without a build system - it's sad)
Autoconf is m4 macros and Bourne shell. Most mainstream programming languages have a packaging system that lets you invoke a shell script. This attack is a reminder to keep your shell scripts clean. Don't treat them as an afterthought.
I'm wondering: is there really no way to add an automated flagging system that A/B `diff`-checks the tarball contents against the repo's files and warns if there's a mismatch? This would be on, e.g., GitHub's end, so that there'd be this sort of automated integrity test and subsequent warning. Just a thought, since tainted tarballs like these might altogether be (and become) a threat vector, regardless of the repo.
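Something like that is also easy to prototype outside the forge; a minimal sketch, where the project name, tag, and URLs are placeholders:

    VERSION=1.2.3
    curl -LO https://example.org/releases/project-$VERSION.tar.gz
    tar xf project-$VERSION.tar.gz
    git clone --depth 1 --branch v$VERSION https://example.org/project.git project-git

    # Export a clean copy of the tagged commit and diff it against the tarball;
    # anything that exists only in the tarball (a patched configure, say) shows up here.
    git -C project-git archive --prefix=project-export/ HEAD | tar x
    diff -r project-export project-$VERSION

For autotools projects the diff will of course always show the generated configure and Makefile.in files, which is exactly the grey area this attack hid in.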
It looks like an earlier commit with a binary blob "test data" contained the bulk of the backdoor, then the configure script enabled it, and then later commits patched up valgrind errors caused by the backdoor. See the commit links in the "Compromised Repository" section.
Also, it seems like the same user who made these changes was still submitting changes to various repositories as of a few days ago. Maybe these projects need to temporarily stop accepting commits until further review is done?
The use of "eval" stands out, or at least it should stand out – but there are two more instances of it in the same script, which presumably are not used maliciously.
A while back there was a discussion[0] of an arbitrary code execution vulnerability in exiftool which was also the result of "eval".
Avoiding casual use of this overpowered footgun might make it easier to spot malicious backdoors. In almost all cases where people feel the need to reach for "eval" there is a better way to do it, unless the feature you're implementing really is "take a piece of arbitrary code from the user and execute it".
Unfortunately, eval in a shell script changes the semantics but is not necessary for some kind of parsing of a variable's contents to happen, unlike in Python or Perl or JavaScript. A bare
$goo
line (without quotes) will already do word splitting, though it won't do another layer of variable expansion and unquoting; for that you do need eval.
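A small illustration of the difference (the variable name is arbitrary):

    goo='printf "%s\n" "$HOME"'
    $goo          # word splitting only: the quote characters are passed literally and $HOME is not expanded
    eval "$goo"   # the string is re-parsed as shell code: quotes are honoured and $HOME is expanded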
You can be certain it has happened, many times. Now think of all the software we mindlessly consume via docker, language package managers, and the like.
Remember, there is no such thing as computer security. Make your decisions accordingly :)
A big part of the problem is all the tooling around git (like the default github UI) which hides diffs for binary files like these pseudo-"test" files. Makes them an ideal place to hide exploit data since comparatively few people would bother opening a hex editor manually.
How many people read autoconf scripts, though? I think those filters are a symptom of the larger problem that many popular C/C++ codebases have these gigantic build files which even experts try to avoid dealing with. I know why we have them, but it does seem like something which might be worth reconsidering now that the toolchain is considerably more stable than it was in the 80s and 90s.
The alternatives are _better_ but still not great. build.rs is much easier to read and audit, for example, but it’s definitely still the case that people probably skim past it. I know that the Rust community has been working on things like build sandboxing and I’d expect efforts to be a lot easier there than in a mess of m4/sh where everyone is afraid to break 4 decades of prior usage.
build.rs is easier to read, but it's the tip of the iceberg when it comes to auditing.
If I were to sneak in some underhanded code, I'd do it through either a dependency that is used by build.rs (not unlike what was done for xz) or a crate purporting to implement a very useful procedural macro...
I mean, autoconf is basically a set of template programs for sniffing out whether a system has X symbol available to the linker. Any replacement for it would end up morphing into it over time.
We have much better tools now and much simpler support matrices, though. When this stuff was created, you had more processor architectures, compilers, operating systems, etc. and they were all much worse in terms of features and compatibility. Any C codebase in the 90s was half #ifdef blocks with comments like “DGUX lies about supporting X” or “SCO implemented Y but without option Z so we use Q instead”.
Even in binary you can see patterns. I'm not saying it's perfect to show binary diffs (but it is better than showing nothing), but I know even my slow mammalian brain can spot obvious human-readable characters in various binary encoding formats. If I see a few in a row that don't make sense, why wouldn't I poke at it?
This particular file was described as an archive file with corrupted data somewhere in the middle. Assuming you wanted to scroll that far through a hexdump of it, there could be pretty much any data in there without being suspicious.
You're right - the two exploit files are lzma-compressed and then deliberately corrupted using `tr`, so a hex dump wouldn't show anything immediately suspicious to a reviewer.
Is this lzma compressed? Hard to tell because of the lack of formatting, but this looks like amd64 shellcode to me.
But that's not really important to the point - I'm not looking at a diff of every committed favicon.ico or ttf font or a binary test file to make sure it doesn't contain a shellcode.
Test data should not be on the same machine where the build is done. Test data (and tests generally) aren't as well audited, and therefore shouldn't be allowed to leak into the finished product.
Sure, you want to test stuff, but that can be done with a special "test build" in its own VM.
In this case the backdoor was hidden in a nesting doll of compressed data manipulated with head/tail and tr, even replacing byte ranges in between. It would've been impossible to find if you were just looking at the test fixtures.
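A toy illustration of the general trick (these are not the actual commands from the xz payload): a simple byte substitution is its own inverse, so the shipped file looks like a corrupt archive until the build script reverses it.

    printf 'payload\n' | xz -c > payload.xz
    tr 'az' 'za' < payload.xz > corrupt_testfile.xz   # swap the bytes 'a' and 'z'; the xz magic is now broken
    tr 'az' 'za' < corrupt_testfile.xz | xz -dc       # the same substitution restores it; prints "payload"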
> "Given the activity over several weeks, the committer is either directly
involved or there was some quite severe compromise of their
system. Unfortunately the latter looks like the less likely explanation, given
they communicated on various lists about the "fixes" mentioned above."