DataCortex auto-infers your JSON schema, reorganizes data into columns, applies type-specific encoding, then picks the optimal entropy coder. No config, no schema files. Just better compression.
```sh
cargo install datacortex-cli
```
DataCortex Fast mode vs the best general-purpose compressors. Higher ratio = better; the last column shows DataCortex's gain over the strongest baseline. Lossless, byte-exact roundtrip guaranteed.
| File | Size | DataCortex | zstd -19 | brotli -11 | vs Best |
|---|---|---|---|---|---|
| k8s structured logs 100K rows | 9.9 MB | ~40x | 18.9x | -- | +113% |
| nginx access logs 100K rows | 9.5 MB | ~28x | 17.3x | -- | +62% |
| NDJSON analytics 10K rows | 3.3 MB | 27.8x | 16.0x | 16.4x | +70% |
| NDJSON events 200 rows | 107 KB | 22.0x | 15.6x | 16.6x | +32% |
| Twitter API nested JSON | 617 KB | 19.7x | 16.7x | 18.9x | +4% |
| Event tickets repetitive | 1.7 MB | 221.7x | 176.0x | 190.0x | +17% |
Four stages, fully automatic: schema inference, columnar reorganization, type-specific encoding, entropy-coder selection. DataCortex understands your data's structure and exploits it.
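The columnar stage is the easiest to demonstrate. The sketch below is a toy illustration of the general technique, not DataCortex's actual pipeline: it transposes NDJSON records into per-key columns before handing them to a generic coder (zlib here as a stand-in), so constant columns collapse into pure repetition instead of being interrupted by noisy values on every row.

```python
import json
import random
import zlib

random.seed(0)
records = [
    {"level": "info", "service": "api", "latency_ms": random.randint(50, 5000)}
    for _ in range(1000)
]

# Row-oriented: serialize records as NDJSON, keys repeated on every line.
rows = "\n".join(json.dumps(r) for r in records).encode()

# Column-oriented: group values by key so similar bytes sit together;
# the constant "level" and "service" columns become pure repetition,
# and the noisy latency values no longer break up the matches.
columns = {k: [r[k] for r in records] for k in records[0]}
cols = json.dumps(columns).encode()

row_size = len(zlib.compress(rows, 9))
col_size = len(zlib.compress(cols, 9))
print(f"row-oriented: {row_size} B  column-oriented: {col_size} B")
```

Even with the same entropy coder on both sides, the column-oriented layout compresses noticeably smaller; type-specific encodings (delta coding for integers, dictionaries for enums) widen the gap further.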
Rust CLI, Python library, or build from source. Your choice.
Everything you need for real-world JSON compression pipelines.
Six compression paths race concurrently via Rayon, hitting 247% CPU utilization on multi-core machines.
stdin/stdout pipes, --chunk-rows for bounded memory on huge NDJSON files. Multi-frame .dcx format.
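The idea behind chunked processing is simple: read a bounded number of rows at a time, so memory scales with one chunk rather than the whole file, and each chunk can become its own frame. A minimal stdlib sketch of that pattern (not DataCortex's reader, and `chunked_rows` is a hypothetical helper name):

```python
import io
import json
from typing import Iterator, List, TextIO

def chunked_rows(stream: TextIO, chunk_rows: int) -> Iterator[List[dict]]:
    """Yield parsed NDJSON records in chunks of at most `chunk_rows`,
    so memory use is bounded by one chunk, never the whole file."""
    chunk: List[dict] = []
    for line in stream:
        if line.strip():
            chunk.append(json.loads(line))
        if len(chunk) == chunk_rows:
            yield chunk
            chunk = []
    if chunk:  # final partial chunk
        yield chunk

ndjson = "\n".join(json.dumps({"i": i}) for i in range(10))
chunks = list(chunked_rows(io.StringIO(ndjson), chunk_rows=4))
print([len(c) for c in chunks])  # → [4, 4, 2]
```

Compressing each chunk as an independent frame is what makes a multi-frame container format possible: frames can be decompressed one at a time with the same bounded memory.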
pip install datacortex. Six functions cover the API: compress, decompress, compress_file, decompress_file, info, detect_format.
Train compression dictionaries from sample data. Reuse across files with similar schemas for even better ratios.
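Why shared dictionaries help: small inputs give a compressor nothing to back-reference, but seeding it with bytes that resemble the data restores those matches. DataCortex's dictionary format isn't shown here; the same effect can be illustrated with zlib's preset dictionaries:

```python
import json
import zlib

# Hypothetical scenario: many small JSON messages share one schema.
sample = json.dumps({"service": "checkout", "level": "info", "code": 200}).encode()
msg = json.dumps({"service": "checkout", "level": "warn", "code": 503}).encode()

# Cold start: a lone small message compresses poorly.
plain = zlib.compress(msg, 9)

# Preset dictionary: back-references can point into the sample bytes,
# so the shared schema costs almost nothing per message.
comp = zlib.compressobj(level=9, zdict=sample)
with_dict = comp.compress(msg) + comp.flush()

# The decompressor must be given the same dictionary.
decomp = zlib.decompressobj(zdict=sample)
restored = decomp.decompress(with_dict)
print(len(plain), len(with_dict))
```

The dictionary is paid for once and reused across every file with a similar schema, which is why trained dictionaries matter most for many small, structurally similar inputs.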
CRC-32 verified. Decompress always produces identical bytes. 381 tests, 36+ E2E scenarios, zero failures.
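The verification idea is worth spelling out (this is a generic sketch of CRC-framing, not the actual .dcx layout): store a CRC-32 of the original bytes next to the compressed frame, and check it against the decompressed output so any corruption is caught rather than silently returned.

```python
import zlib

payload = b'{"event": "purchase", "amount": 42}' * 100

# Write a CRC-32 of the original bytes ahead of the compressed frame...
crc = zlib.crc32(payload)
frame = crc.to_bytes(4, "big") + zlib.compress(payload, 9)

# ...then verify on decompression that the roundtrip is byte-exact.
stored_crc = int.from_bytes(frame[:4], "big")
restored = zlib.decompress(frame[4:])
assert zlib.crc32(restored) == stored_crc
print("roundtrip OK")
```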
No schema files, no database, no setup. Auto-infers format, types, and optimal compression strategy.
One command. Better compression than general-purpose compressors on real-world JSON. Try it now.
```sh
pip install datacortex
```