Reading through the author's float compression series, I can't unnotice that the plot axes got switched in this post, and it is a lot... easier? more elegant? to have speed on the vertical axis somehow
at least for me
That you can chain one compressor with a second reminds me of QOI (a PNG competitor), whose output is often competitive with PNG (which uses DEFLATE, the same algorithm as gzip) _before_ its output gets compressed with something as mundane as zstd or gzip.
My understanding is that chaining compressors is a classic technique for image compression. IIRC PNG and Basis are both implemented as initial transformation/pre-filtering/conditioning pass(es) designed to make the image data more compressible before feeding it to a codec like gzip or zstd.
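To make the pre-filtering idea concrete, here is a minimal sketch of a byte-wise delta filter (in the spirit of PNG's "Sub" filter, not the actual PNG implementation) followed by zlib. The names `sub_filter` and `sub_unfilter` and the synthetic ramp data are my own assumptions for illustration.

```python
import zlib

def sub_filter(data: bytes) -> bytes:
    """Replace each byte with its difference from the previous byte (mod 256)."""
    out = bytearray(len(data))
    prev = 0
    for i, b in enumerate(data):
        out[i] = (b - prev) & 0xFF
        prev = b
    return bytes(out)

def sub_unfilter(data: bytes) -> bytes:
    """Invert sub_filter by accumulating the differences (mod 256)."""
    out = bytearray(len(data))
    prev = 0
    for i, d in enumerate(data):
        prev = (prev + d) & 0xFF
        out[i] = prev
    return bytes(out)

# Synthetic "scanline": a smooth ramp, standing in for a gradient in an image.
raw = bytes((i * 3) % 256 for i in range(4096))
filtered = sub_filter(raw)

# The filter is lossless, so the original is always recoverable.
assert sub_unfilter(filtered) == raw

# Compare what the general-purpose compressor makes of each version.
print("raw      ->", len(zlib.compress(raw, 9)), "bytes")
print("filtered ->", len(zlib.compress(filtered, 9)), "bytes")
```

The filter itself saves nothing; it only turns a ramp into a run of identical small values, which is the kind of input the second-stage compressor is good at.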
This definitely works for things that aren't images too. I previously proved that you could improve the compression ratio for WebAssembly significantly by performing lossless transforms on the module before feeding it to gzip or brotli (though the gains are much smaller for brotli since it's so good to begin with): https://github.com/WebAssembly/design/issues/1180
Exe filters are cool. I think I first saw the split-stream thing in the kkrunchy writeup https://fgiesen.wordpress.com/2011/01/24/x86-code-compressio..., and it looks like it first appeared in PPMexe.
Vidvox HAP and Resolume DXV codecs also have a fast lossless compression stage
One classic transformation for executable code is to convert relative call and jump offsets to absolute addresses before compression. Absolute addresses are more compressible than relative ones.
Probably the single oldest trick in the code compression book.
Isn’t it the other way around? Absolute addresses are all different while relatives often repeat, leading to better compression.
In sane code, there are more function calls than there are functions. Imagine, now, that there's a function at 0x1337, and it's called from 69 different places in the code.
If we're using relative addresses, this would, of course, result in 69 different addresses to compress - each relative address being the difference between 0x1337 and the position of the code that calls it.
If we're using absolute addresses, we get the same exact address 0x1337 repeated 69 times - which is way more compressor friendly.
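The 0x1337 example can be tried out with a rough sketch like the one below. Everything here is an assumption for illustration: the fake "code" is just NOP padding plus x86-style `E8` near calls, the load address `BASE` is made up, and `convert_calls` is a naive scanner. Real exe filters add heuristics to avoid rewriting bytes that only look like calls.

```python
import struct
import zlib

CALL = 0xE8  # x86 near-call opcode, followed by a 32-bit relative displacement

def convert_calls(code: bytes, base: int, to_absolute: bool) -> bytes:
    """Rewrite the operand of every E8 call between relative and absolute form."""
    out = bytearray(code)
    i = 0
    while i + 5 <= len(out):
        if out[i] != CALL:
            i += 1
            continue
        next_ip = base + i + 5  # address of the instruction after the call
        (value,) = struct.unpack_from("<I", out, i + 1)
        if to_absolute:
            value = (value + next_ip) & 0xFFFFFFFF
        else:
            value = (value - next_ip) & 0xFFFFFFFF
        struct.pack_into("<I", out, i + 1, value)
        i += 5  # skip the operand so it is never parsed as an opcode
    return bytes(out)

def make_fake_code(target: int, base: int, n_calls: int) -> bytes:
    """Hypothetical code blob: n_calls relative calls to one target, NOPs between."""
    code = bytearray()
    for k in range(n_calls):
        code += b"\x90" * (3 + (k * 7) % 11)  # NOP padding of varying length
        next_ip = base + len(code) + 5
        code.append(CALL)
        code += struct.pack("<I", (target - next_ip) & 0xFFFFFFFF)
    return bytes(code)

BASE = 0x2000  # made-up load address
relative = make_fake_code(target=0x1337, base=BASE, n_calls=69)
absolute = convert_calls(relative, BASE, to_absolute=True)

# The transform is reversible, so the decompressor can restore the original code.
assert convert_calls(absolute, BASE, to_absolute=False) == relative

print("relative ->", len(zlib.compress(relative, 9)), "bytes")
print("absolute ->", len(zlib.compress(absolute, 9)), "bytes")
```

In the relative form every call site carries a different displacement, while in the absolute form the same four operand bytes show up 69 times, which an LZ-style compressor can encode as back-references.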
Mesh optimizer's performance here is a nice reminder: the state of the art in general purpose compression is hard to beat, but special purpose still has room for improvement.
And not only that, but you can use a special-purpose optimiser for a different domain and somehow get great results!