I don't see a problem with them expressing the ratio as a decimal since it becomes a simple multiplier of the original file size 38GB x 0.3.
But it's downright misleading to show the vertical axis from something other than 0.0 to 1.0 when comparing ratios. They start it at 0.2. In reality, LZJB is saving 50% of the space whereas gzip saves 70%. But a naive glance at the graph implies gzip look roughly 3 times smaller/better than LZJB.
Classic "How to Lie with Statistics" stuff.* I would have expected better from an "analytics" database.
Author here. Believe it or not I originally had the compression ratio graph rotated 90 degrees, and had manually modified it to run from 0.00 to 1.00. Google docs for some god awful reason insists on starting at 0.2 by default. Anyway, when my colleagues reviewed a draft of this post they requested that I rotate the graph back, and in the process I forgot to reset the scale. Sorry for the confusion. It's fixed now. As for the definition of "compression ratio", I looked this up and went with the definition found here: http://en.wikipedia.org/wiki/Data_compression_ratio
If you read in any other article something like the following: "Taking Product X as having a baseline compression ratio of 1, Product Y had a compression ratio of 0.5 and Product Z had a compression ratio of 0.3", I'm pretty sure 99.9999% of the HN population would interpret that as Products Y and Z having worse compression than X, not better. That's my point.
This academic-looking paper (first hit I tried from Wikipedia) gives the standard definition of "compression ratio" as compressed/uncompressed size (section 4.2), consistent with the linked article.
I'm pretty sure you're impression of 99.9999% of the HN population is wrong.
Which includes this section on "Usage of the term": "There is some confusion about the term 'compression ratio', particularly outside academia and commerce. In particular, some authors use the term 'compression ratio' to mean 'space savings', even though the latter is not a ratio; and others use the term 'compression ratio' to mean its inverse, even though that equates higher compression ratio with lower compression."
So, my bad, however in my practical workplace experience the above (in italics) has been the case, hence the confusion.
But it's downright misleading to show the vertical axis from something other than 0.0 to 1.0 when comparing ratios. They start it at 0.2. In reality, LZJB is saving 50% of the space whereas gzip saves 70%. But a naive glance at the graph implies gzip look roughly 3 times smaller/better than LZJB.
Classic "How to Lie with Statistics" stuff.* I would have expected better from an "analytics" database.
* Not saying they intend to lie here but it's representative of the classic text https://en.wikipedia.org/wiki/How_to_Lie_with_Statistics