iris
the art of seeing data.
compression that
sees the data+ salience active
iris profiles before it compresses. structure first, codec second.
iris compress archive.csv out.irisiris decompress out.iris recovered.csv
the numbers
| file type | size | zstd -19 | bzip2 -9 | iris |
|---|---|---|---|---|
| random binary | 5 MB | 1.00x | 1.00x | 1.00x |
| structured logs | 7.4 MB | 9.76x | 13.28x | 15.0x |
| CSV export | 3.5 MB | 4.36x | 4.98x | 6.6x |
| nearly-sorted u32s | 4 MB | 171x | 247x | 405x |
| repetitive binary | 2 MB | 3.29x | 3.23x | 3.30x |
all roundtrips verified lossless by MD5
fivesix stages
profile
a single read-only pass measures entropy, autocorrelation at 64 lags, and block fingerprints. nothing is read twice.
salience
detects tensor_mode. computes a strictly lossless per-block saliency map and infers layer chunks to influence routing without altering values.
column grammar
structured text is decomposed by column. enums become dict indices. the result approaches the information-theoretic minimum.
resonance
periodic structure is extracted via wrapping i8 delta at the dominant lag. data with a strong period compresses to near-zero residual.
prediction graph
similar blocks found via SimHash and LSH bands. no window limit. similar blocks stored as XOR deltas. lz77 can't do this.
route
final entropy decides the pure-rust codec. rANS, range-coding, or hierarchical block matching. zero external dependencies.
usage
compress an input file into the custom iris format
reconstruct the exact original bytes
inspect how the file was routed based on its profile