reductor

command module
v0.0.0-...-efa42df Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jan 23, 2022 License: MIT Imports: 14 Imported by: 0

README

Reductor

A lossless compressor combining LZ and Huffman Coding.

Quickstart

> go build .
> ./reductor
Usage: ./reductor [OPTIONS] <filename>
  -compress
        run the program in compression mode (default true)
  -cpuprofile string
        write cpu profile to file
  -graphviz string
        write graphviz huffman tree representation to file
  -lz string
        write lz representation to file
  -max-match uint
        maximum match size for LZ algorithm (upper limit is 255) (default 255)
  -min-match uint
        minimum match size for LZ algorithm (default 4)
  -name string
        name for the file with compressed data
  -search-size uint
        size of the search window of LZ algorithm (upper limit is 65535) (default 4096)
  -verbose
        display log messages

Providing it with just a path will create <old-filename>.reduced compressed file. Lets compress the executable itself:

> ./reductor -verbose reductor
2022/01/19 00:09:35 Running ./reductor in verbose mode
2022/01/19 00:09:35 Compress: reductor
2022/01/19 00:09:35 Config: min-match=4, max-match=255, search-size=4096
2022/01/19 00:09:35 Input size(bytes): 2378496
2022/01/19 00:09:42 Time elapsed: 7.089495207s
2022/01/19 00:09:42 Compression ratio: 1.41
> ls
...
-rwxrwxr-x 1 antoni antoni 2378496 sty 19 00:02 reductor
-rw-rw-r-- 1 antoni antoni 1684775 sty 19 00:09 reductor.reduced
...

Now, lets decompress it:

> ./reductor -compress=false reductor.reduced
> ls
...
-rwxrwxr-x 1 antoni antoni 2378496 sty 19 00:02 reductor
-rw-rw-r-- 1 antoni antoni 1684775 sty 19 00:09 reductor.reduced
-rw-rw-r-- 1 antoni antoni 2378496 sty 19 00:10 reductor.unreduced
...

Visuals

The compressor allows to visualize what happens under the hood. It can dump LZ encoded representation to a file via --lz <filename>. In this representation LZ "pointers" are symbolicly representad as <distance,length>. Beginning of LZ representation for compressing this README contents is as follows:

> ./reductor -lz lz.txt README.md
> cat lz.txt
# Reductor

A lossless compressor<11,4>bining LZ and Huffman Coding.

```console
> go build .
> ./r<89,7> --help
Usage of<27,11>:
  -<108,8>
    <4,4>run the program in<144,9>ion mode (default true)<71,5>puprofile string<80,9>write cpu<82,4><33,5>to <41,4><126,4>graphviz<53,22><30,9>h<245,7>tree re<285,4>entat<145,4><78,11>lz77<127,22><26,5><57,26>max-match uint<267,9>maximum <27,6>size for LZ77 algorithm (upper limit is 255)<290,10><14,4><99,5>in<99,21>in<99,36><368,8>4<365,5>nam<359,17><20,5><158,4><444,4><392,5>with<587,9>ed data<498,4>search-<207,5><239,13><225,5>of<510,5><37,6> window<567,4><245,31>6553<247,12>4096<537,5>verbos<493,4><609,6>display log mes<662,4>s<708,4>

It can also dump Huffman Tree generated during compression (in DOT format) via --graphviz <filename>. For a tree generated compressing this file (with LZ coding turned off) it looks like below. Notice that bytes are represented by numbers, because we get outside of readable ASCII range in the tree.

Huffman Tree created for compressing this README.md file contents.

Performance

TODO: This is out of date, comp/decomp times are ~3x faster
File Size Comp Ratio - Gzip Comp/Decomp Time - Gzip Comp Ratio - reductor Comp/Decom time - reductor
enwik9 1Gb 3.09 45.64s/8.47s 2.11 1h105s/193s
log18MB.txt 18Mb 1027 101ms/100ms 290 63.71s/78ms
random1MB.txt 1Mb 0.99 37ms/12ms 0.89 9.5s/327ms
yellowsub.txt 1.5Kb 2.90 2ms/2ms 1.80 3.2ms/0.48ms
singlechar.txt 2bytes 0.05 1ms/1ms 0.33 0.07/0.06ms
reductor 2.5Mb 1.81 126ms/28ms 1.45 13.91/530m

Documentation

The Go Gopher

There is no documentation for this package.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL