salience
noticing the structure before striking.
weaponizing statistics
without leaving a scratch
deep learning weights and scientific arrays are practically invisible to legacy compressors. salience teaches iris to see them.
what's in a name?
In visual perception, salience is the quality that makes an object stand out from its neighbors. When you enter a room or look at a photograph, your eyes do not scan linearly from left to right. They dart immediately to the most salient features—high contrasts, sharp edges, isolated anomalies.
Data compression is a perception problem. Salience (Stage 0.5) is the mechanism that allows iris to look deeply at a stream of raw bytes and instinctively "notice" what is statistically prominent before making a move.
Instead of compressing a multi-gigabyte tensor blindly, iris first draws a rich topographical map of where the heavy tails, outliers, and dense informational spikes live.
why we built it
Modern datasets are dominated by heavy numeric arrays: FP16 and BF16 model weights, FP32 scientific manifolds, and multidimensional arrays.
Traditional algorithms (like Deflate or Zstd) struggle profoundly here. Float entropy is practically indistinguishable from white noise without the numerical context. The industry responded either by punting and settling for minimal dictionary gains, or by resorting to aggressive lossy quantization that actively destroys inference accuracy.
This shouldn't be a compromise. Salience was introduced as Stage 0.5 to bridge the gap. We wanted the staggering compression ratios of a deep numeric analytics engine, while refusing to break the strict lossless oath of the iris container.
"salience maps the mountains so the router doesn't have to guess."
- → traditional: blindly encode bytes
- → lossy: drop decimal bits entirely
- → iris: map saliency & shift structural routes
how it works
the mechanics of stage 0.5
1. tensor detection
During the Stage 0 profiler's single read-only pass, it computes autocorrelation across wrap-around i8 byte differences at several lags. A strong resonance peak at exactly lag 2 flags FP16/BF16 data; a peak at lag 4 signals FP32. If no such peak is found, the data is not a tensor and Salience goes back to sleep entirely, ensuring zero overhead for standard CSVs or logs.
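The detection step can be sketched as follows. Everything here is an illustrative guess at the technique, not iris's actual profiler: the `stride_resonance` helper, the ±4 "near zero" window, and both thresholds are assumptions.

```python
import numpy as np

def stride_resonance(data: bytes, lag: int) -> float:
    """Fraction of wrap-around i8 byte differences at `lag` that land near zero.

    Float streams repeat their exponent-heavy bytes every 2 bytes (FP16/BF16)
    or 4 bytes (FP32), so same-phase differences cluster around zero while
    cross-phase differences look uniform.
    """
    b = np.frombuffer(data, dtype=np.uint8)
    # subtract in int16, then wrap back into int8 range (the "wrap-around i8 diff")
    d = (b[lag:].astype(np.int16) - b[:-lag].astype(np.int16)).astype(np.int8)
    return float(np.mean(np.abs(d.astype(np.int16)) <= 4))

def detect_float_stride(data: bytes):
    """Return 2 for FP16/BF16, 4 for FP32, or None (salience stays asleep)."""
    baseline = stride_resonance(data, 1)  # lag 1 mixes phases: a noise floor
    for lag in (2, 4):
        score = stride_resonance(data, lag)
        # assumed thresholds: an absolute floor plus a 2x margin over baseline,
        # which keeps text-like data (concentrated at every lag) from firing
        if score > 0.08 and score > 2 * baseline:
            return lag
    return None
```

The ratio guard against the lag-1 baseline matters: ASCII logs also produce concentrated byte differences, but they do so at every lag, so no single lag stands out.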
2. semantic block scoring
Once awake, Salience slices the data logically into 4KB blocks (e.g., 2,048 FP16 weights per block). For each block, it runs a fast single-pass Welford analysis measuring the mean, variance, kurtosis, and outlier ratio (the fraction of weights more than 3σ from the mean). It synthesizes these into a normalized Saliency Score between 0 and 1.
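A minimal sketch of the per-block pass, assuming the standard higher-moment extension of Welford's algorithm. The score normalization at the end (weighting outlier ratio against capped kurtosis) is an invented placeholder, not iris's actual formula:

```python
import numpy as np

def block_stats(block):
    """Single-pass Welford/Pebay moment tracking for one block of weights."""
    n = 0; mean = 0.0; M2 = 0.0; M3 = 0.0; M4 = 0.0
    for x in block:
        n += 1
        delta = x - mean
        delta_n = delta / n
        term1 = delta * delta_n * (n - 1)
        mean += delta_n
        # update higher moments first, using the previous M2/M3
        M4 += (term1 * delta_n * delta_n * (n * n - 3 * n + 3)
               + 6 * delta_n * delta_n * M2 - 4 * delta_n * M3)
        M3 += term1 * delta_n * (n - 2) - 3 * delta_n * M2
        M2 += term1
    variance = M2 / n
    excess_kurtosis = n * M4 / (M2 * M2) - 3.0 if M2 > 0 else 0.0
    sigma = variance ** 0.5
    # counting weights beyond 3-sigma needs the final sigma, so this sketch
    # uses a cheap second scan; a true single pass would approximate it
    outliers = float(np.mean(np.abs(np.asarray(block) - mean) > 3 * sigma)) if sigma > 0 else 0.0
    # placeholder normalization: saturate each signal, blend into [0, 1]
    score = min(1.0, 0.5 * min(outliers / 0.01, 1.0)
                     + 0.5 * min(abs(excess_kurtosis) / 10.0, 1.0))
    return {"mean": mean, "variance": variance, "kurtosis": excess_kurtosis,
            "outliers": outliers, "saliency": score}
```

A near-Gaussian block lands at a low score; a block with heavy tails or spikes pushes toward 1.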
3. layer-aware grouping
Model weights aren't random; they are structured sequentially in layers (Attention Q/K/V, Feed Forward, Norms). Salience measures the KL-divergence, or "distribution shift," between adjacent blocks. Sharp deviations act as natural boundaries, allowing it to mathematically infer layer transitions without any external schema files.
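One way to sketch the boundary detection, assuming histogram-based KL divergence with Laplace smoothing; the bin count and the 0.5-nat threshold are assumptions for illustration:

```python
import numpy as np

def kl_shift(a, b, bins=32):
    """Symmetrized, smoothed KL divergence between two blocks' value histograms."""
    lo = min(a.min(), b.min()); hi = max(a.max(), b.max())
    p, _ = np.histogram(a, bins=bins, range=(lo, hi))
    q, _ = np.histogram(b, bins=bins, range=(lo, hi))
    p = (p + 1) / (p.sum() + bins)   # Laplace smoothing avoids log(0)
    q = (q + 1) / (q.sum() + bins)
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def layer_boundaries(blocks, threshold=0.5):
    """Indices where the distribution shifts sharply between adjacent blocks."""
    return [i + 1 for i, (a, b) in enumerate(zip(blocks, blocks[1:]))
            if kl_shift(a, b) > threshold]
```

Blocks drawn from the same layer score near zero (only sampling noise), while a transition into a layer with a different variance profile produces a sharp spike.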
4. biased routing
This topographical metadata (the SaliencyMap) is serialized ahead of the block data. When the standard iris compression stages fire (Resonance, Stride Grammar, Prediction Graph), they read the map. A highly salient `attn_qkv` block gets routed to robust, match-protected entropy chains; a low-salience, near-Gaussian `ffn_down` block gets ferocious dictionary compression. The map tells the stages exactly where to push harder.
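If the SaliencyMap is read as a list of per-block records, the biased routing might look like the toy policy below. The cutoffs and route names are invented for illustration and are not iris's real configuration:

```python
def route(entry: dict) -> str:
    """Toy routing policy keyed on a per-block saliency score in [0, 1]."""
    if entry["saliency"] >= 0.6:
        return "match-protected"   # heavy-tailed block: robust entropy chain
    if entry["saliency"] <= 0.2:
        return "dictionary"        # near-Gaussian block: compress aggressively
    return "default"
```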
the strict lossless protocol
pure metadata
Salience never alters the value matrix. It does not quantize. It does not approximate. It just leaves notes for the router so it knows what's coming.
bit-for-bit parity
The final encoded container (IRS2/v3) is a byte stream that decodes to the exact, bit-identical floats that went in.
streaming access
The layer boundary flags give the container a streaming read API. You can extract `layer_12.attn_qkv` seamlessly without inflating the entire file.
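In spirit, streaming extraction amounts to a seek plus a bounded read. The index layout and the `extract_layer` name below are hypothetical, since the article does not show iris's real API:

```python
import io

def extract_layer(f, index: dict, name: str) -> bytes:
    """Seek straight to one layer's extent instead of inflating the whole file.

    `index` maps layer names to (byte_offset, byte_length) pairs, as recovered
    from the layer boundary flags. A sketch of the access pattern only.
    """
    start, length = index[name]
    f.seek(start)
    return f.read(length)
```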