https://bellard.org/nncp/

NNCP: Lossless Data Compression with Neural Networks

NNCP is an experiment to build a practical lossless data compressor with neural networks. The latest version uses a Transformer model. The papers nncp_v2.1.pdf and nncp.pdf describe the algorithms and results of previous releases of NNCP.

The current release of NNCP is implemented in C and uses LibNC to get better performance than PyTorch.

Compression ratio

Result for enwik8:

Program              Compr. size (bytes)   Ratio (bpb)
gzip                          36 445 248          2.92
xz                            24 865 244          1.99
NNCP (2021-02-06)             15 020 691          1.20
CMIX (v18)                    14 838 332          1.19

Result for enwik9:

Program              Compr. size (bytes)   Ratio (bpb)   Program size (zip, bytes)   Total (bytes)
gzip                         322 591 995          2.58                      38 801     322 630 796
xz                           197 331 816          1.58                      36 752     197 368 568
CMIX (v18)                   115 714 367         0.926                     208 961     115 923 328
NNCP (2021-04-24)            110 034 293         0.880                     197 491     110 231 784

* The results for the other programs are from the Large Text Compression Benchmark.

Download

* NNCP v3: Linux version (including CUDA support): nncp-2021-04-24.tar.gz (Changelog, readme.txt).
* NNCP v3: Precompiled Windows version (no CUDA support): nncp-2021-04-24-win64.zip.
* NNCP v2 (Python+PyTorch, GPU required): nncp_v2-2021-02-06-1.tar.gz

Related Links

* LibNC: C Library for Tensor Manipulation.
* gpt2tc: Text Completion and Compression using GPT-2.
* NNCP thread on the encode.su forum.
* CMIX lossless data compression program.
* lstm-compress: lossless data compression with LSTM.
* Large Text Compression Benchmark.

---------------------------------------------------------------------
Fabrice Bellard - https://bellard.org/
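
Note on the "Ratio (bpb)" and "Total" columns: the figures in the result tables can be rechecked with a few lines of arithmetic. The sketch below is not part of NNCP. It assumes that bpb means compressed bits per input byte, that enwik8 and enwik9 are 10^8 and 10^9 bytes long respectively, and that "Total" is the compressed size plus the zipped program size. The function names are hypothetical and chosen only for illustration.

```python
# Assumed input sizes in bytes (enwik8 = 10**8, enwik9 = 10**9).
ENWIK8_BYTES = 10**8
ENWIK9_BYTES = 10**9


def bits_per_byte(compressed_bytes: int, original_bytes: int) -> float:
    """Return the number of compressed bits spent per input byte."""
    return 8 * compressed_bytes / original_bytes


def total_size(compressed_bytes: int, program_zip_bytes: int) -> int:
    """Return compressed size plus zipped program size, in bytes."""
    return compressed_bytes + program_zip_bytes


if __name__ == "__main__":
    # Values copied from the enwik9 row for NNCP (2021-04-24).
    nncp_compressed = 110_034_293
    nncp_program = 197_491
    print(f"bpb:   {bits_per_byte(nncp_compressed, ENWIK9_BYTES):.3f}")
    print(f"total: {total_size(nncp_compressed, nncp_program)}")
```

Applying the same two functions to the other rows should give back the ratio and total columns after rounding.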