
> gzip bomb (100kB size, unpacked around 20GB)

Not possible (unless you're talking double gzip). gzip's max compression ratio is 1032:1[1]. So 100kB can expand to at most ~103MB with single gzip.
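A rough way to check that ceiling is to compress a long run of zero bytes and measure the ratio. The sketch below uses Python's standard `gzip` module; the helper names and the 10 MiB payload size are illustrative assumptions, not anything from the thread. It also reproduces the ~103MB arithmetic.

```python
import gzip

# Upper bound on the deflate compression ratio, as cited in the linked
# Stack Overflow answer.
MAX_DEFLATE_RATIO = 1032


def gzip_ratio(uncompressed_size: int) -> float:
    """Compress a run of zero bytes once and return the achieved ratio."""
    payload = bytes(uncompressed_size)  # all zeros: highly compressible input
    compressed = gzip.compress(payload, compresslevel=9)
    return uncompressed_size / len(compressed)


def max_single_gzip_output(compressed_size: int) -> int:
    """Largest output a single gzip layer of this size could expand to."""
    return compressed_size * MAX_DEFLATE_RATIO


if __name__ == "__main__":
    ratio = gzip_ratio(10 * 1024 * 1024)
    print(f"measured ratio for 10 MiB of zeros: {ratio:.0f}:1")
    print(f"100 kB can expand to at most {max_single_gzip_output(100_000)} bytes")
```

The measured ratio should land a little under 1032:1, since headers and block overhead are included in the compressed size.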

Brotli allows much larger compression. Here's[2] a brotli bomb I created that's 81MB compressed and 100TB uncompressed. That's a 1.2M:1 compression ratio.

[1] https://stackoverflow.com/a/16794960

[2] https://github.com/google/google-ctf/blob/main/2019/finals/m...



You're probably right with regard to compression ratios, and I also think that brotli would be a much better candidate. Proxies probably won't support it as "Transfer-Encoding: br" though.

> Not possible (unless you're talking double gzip). gzip's max compression ratio is 1032:1[1]. So 100kB can expand to at most ~103MB with single gzip.

Not sure if I understand the rest of your argument though. If the critique is that it's not possible and that I'm lying(?) about the unpacked file size on the proxy side, then below you'll find the answer from my pentester's perspective. Note that the goal is not to have a valid gzip archive; the goal is to have a gzip archive that grows as large as possible _while unpacking_, before the integrity and alignment checks run.

To understand how gzip's longest match algorithm works, I recommend reading that specific method first (grep for "longest_match(IPos cur_match)" in case it changes in the future): https://git.savannah.gnu.org/cgit/gzip.git/tree/deflate.c#37...

The beauty of C code that "cleverly" uses pointers and scan windows with offsets is that there's always a technique to exploit it.

I'll leave it to the reader to work out how the scan condition for unaligned windows can be modified so that the "good_match" and "MAX_MATCH" conditions are avoided, which, in turn, leads to bigger chains than the 258 / 4096 bytes the linked StackOverflow answer was talking about :)


deflate.c appears to be doing compression. inflate.c[1] is what does decompression.

Are you saying you can modify your gzip compression code locally to generate a malformed gzip file? That wouldn't be exploiting deflate.c; that would be exploiting the receiver's decompression code, which might be inflate.c or some other implementation of gzip decompression, possibly written in some other language. The language used by the compression code doesn't seem relevant to me. Rather, it's the language used by the decompression code that might have vulnerabilities that can be exploited. If you have a compressed gzip file that expands by more than 1032:1, the file itself is a proof of concept of the vulnerability; it doesn't matter whether the file was generated by C, Rust, Python, or in a hex editor by hand.

If you've found something in gzip code that causes it to use significantly more memory or disk space than it should (either during compression or decompression), I think that's a denial-of-service vulnerability and should be reported to gzip.

[1] https://git.savannah.gnu.org/cgit/gzip.git/tree/inflate.c
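Whatever the compressor did, a receiver can protect itself by capping how much output it is willing to produce. Here is a minimal sketch of that idea using Python's standard `zlib` module rather than gzip's inflate.c; the function name `bounded_gunzip`, the exception class, and the chosen limits are hypothetical and only for illustration.

```python
import gzip
import zlib


class OutputLimitExceeded(Exception):
    """Raised when decompressed output would exceed the caller's limit."""


def bounded_gunzip(data: bytes, limit: int, chunk_size: int = 64 * 1024) -> bytes:
    """Decompress a gzip stream, refusing to produce more than `limit` bytes."""
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    d = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)
    out = bytearray()
    buf = data
    while buf and not d.eof:
        remaining = limit - len(out)
        # Ask for at most one byte past the limit so an overrun is detectable.
        out += d.decompress(buf, min(chunk_size, remaining + 1))
        if len(out) > limit:
            raise OutputLimitExceeded(f"output exceeds {limit} bytes")
        # Input that was not consumed because of the output cap.
        buf = d.unconsumed_tail
    if not d.eof:
        raise ValueError("truncated or incomplete gzip stream")
    return bytes(out)


if __name__ == "__main__":
    bomb = gzip.compress(bytes(50 * 1024 * 1024))  # 50 MiB of zeros
    print(f"compressed size: {len(bomb)} bytes")
    try:
        bounded_gunzip(bomb, limit=1024 * 1024)
    except OutputLimitExceeded as exc:
        print(f"rejected: {exc}")
```

Because the cap is enforced on the output side, it holds regardless of the ratio the input claims or how the file was produced.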


This comment would've definitely earned gold on Reddit. Here all you get is an upvote :)


I'm pretty unpopular on reddit because of my opinion about C/C++ as a language.

Most people in that ecosystem try to justify that they are somehow better because they can write pointer magic that nobody else can understand, and they feel personally attacked the moment you mention how much less complex and more maintainable the code would have been if they had used Go or Rust or another memory-safe language.

For me, because I work in cyber, Go is kind of the middle ground between intentionally crappy C code for offensive purposes and maintainable Go code for defensive purposes. Can't use Rust because you can't write exploits in Rust without nullifying the reason you used Rust in the first place :D

Go has a lot of conventions and paradigms I could have an opinion against. The point is that it has opinions at all, which makes it more maintainable, even when you don't like the enforced opinions.

I'm not saying that there is no reason to use C; there is. But every time you choose C as a language, you should be aware that you chose a maintenance burden that increases the risk of the codebase being abandoned in the future.


For your information, at the end of that same line of text he does explicitly mention double gzip.


Yeah, but it was unclear to me whether the beginning was about single or double gzip.


It's a pain in the arse to support brotli decoding in Ruby (not sure about other languages), so when I send HTTP requests, I just omit it from the request headers.



