Tuesday, April 7, 2015

Sampling, or a faster LZ4

 Quite some time ago, I've been experimenting with some unusual sampling methods, in an attempt to find a better way to compress data with LZ4.

The main idea was as follows : LZ4 hash table is getting filled pretty quickly, due to its small size. It becomes the dominant limitation, both for compression ratio and speed. In many cases, a hash cell is overwritten many times before being actually useful (i.e. produce a match). So, could there be some better way to update the hash table, which would update it less often, but in the end, update it more efficiently (i.e. limit wastes from over-writing) ?

It turned out my expectation were too optimistic. Any time I tried to reduce the update rate, it would result in a correspondingly reduced compression ratio. With that experiment failed, I settled for an "optimal" sampling pattern, which became the core of LZ4.

Recently, I've revisited this method. After all, getting a lower compression ratio at a faster speed is not necessarily a bad outcome. It depends on user expectation. So maybe, should a user be allowed to select its own "optimal" speed/compression ratio, he may actually prefer another trade-off than the default one.

Enter LZ4_compress_fast(). It's a new function , available only in developer branch for the time being, which handles a single new parameter : int acceleration.

The concept is fairly simple : the higher the value of acceleration, the faster the compression. Correspondingly, compression ratio decreases too. It can be pretty fine-tuned, each acceleration level providing a little 3-4% speed boost, meaning one could select quite exactly its preferred speed range.

In order to get a taste of this new parameter, a few limited tests were run on the same corpus using different acceleration values. Here are some early results :

                    Compression   Decompression   Ratio
memcpy                4200 MB/s      4200 MB/s    1.000
LZ4 fast 50           1080 MB/s      2650 MB/s    1.375
LZ4 fast 17            680 MB/s      2220 MB/s    1.607
LZ4 fast 5             475 MB/s      1920 MB/s    1.886
LZ4 default            385 MB/s      1850 MB/s    2.101

Silesia Corpus in single-thread mode, Core i5-4300U @1.9GHz, compiled with GCC v4.8.2 on Linux Mint 64-bits v17.


It provides some hint of the relatively wide range of newly accessible speed/compression trade-offs.

The new function prototype is currently only accessible within the "dev" branch. It's still considered experimental, but may find its way into next release r129, depending on user feedback.

Having a parameter to accelerate, rather than strengthen, compression is an unusual concept, so it's not yet clear if it's a very good one. What do you think ? Is a faster and programmable version, trading compression ratio for more speed, a good idea to fit into LZ4 API ?

Edit : LZ4_compress_fast() is released as part of LZ4 r129.