Commit Graph

4 Commits

Author SHA1 Message Date
Mintsuki
c0a0bed8f8 tools/limlzpack: Use int32_t for suffix array indices 2026-04-20 18:02:58 +02:00
Mintsuki
b1c28f00e4 tools/limlzpack: Write CRC header as little-endian on any host 2026-04-20 18:02:58 +02:00
Mintsuki
ff63204fc9 stage1/decompressor: Ensure limlz streams always end on a literal token 2026-04-20 18:02:57 +02:00
Kamila Szewczyk
9c3ead9386 decompressor: gzip/tinf -> limlz
removes external dependency on tinf by replacing the compression algorithm with a simpler, faster, smaller and more auditable fixed-width LZ77 encoding purpose-tailored to x86 code mixed with data.

before: decompressor.bin is 2,492 bytes (tinf dependency), with .text at 0x875 bytes and .rodata at 0x13c bytes.
after: decompressor.bin consists only of .text, a 0xe6-byte (230-byte) decompressor; 90.8% reduction in decompressor size.

the build-time dependency on gzip is replaced by host/limlzpack.c, a Lempel-Ziv encoder in 275 SLoC that uses a suffix array matchfinder (prefix-doubling in O(n log^2 n)) and a Storer-Szymanski backwards parse. the fixed-width format lays out packets as [F][LLLL][MMM], favouring a literal-skewed distribution, with F switching between one-byte and two-byte offsets (favouring recent statistics).
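
for illustration only, a minimal sketch of prefix-doubling suffix array construction in O(n log^2 n), the matchfinder technique named above. this is not the code in limlzpack.c: the names build_suffix_array, cmp_suffix and rank_at are hypothetical, and the use of qsort with file-scope comparator state is an assumption made to keep the example short.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Comparator state: standard qsort has no context argument. */
static const int32_t *g_rank;
static int32_t g_n, g_k;

/* Rank of the suffix starting k positions after i, or -1 past the end. */
static int32_t rank_at(int32_t i, int32_t k)
{
    return k < g_n - i ? g_rank[i + k] : -1;
}

/* Orders suffixes by (rank of first k chars, rank of next k chars). */
static int cmp_suffix(const void *pa, const void *pb)
{
    int32_t a = *(const int32_t *)pa, b = *(const int32_t *)pb;
    if (g_rank[a] != g_rank[b])
        return g_rank[a] < g_rank[b] ? -1 : 1;
    int32_t ra = rank_at(a, g_k), rb = rank_at(b, g_k);
    return (ra > rb) - (ra < rb);
}

/* Fills sa[0..n) with suffix start offsets in lexicographic order.
   Returns 0 on success, -1 on allocation failure. */
int build_suffix_array(const uint8_t *s, int32_t n, int32_t *sa)
{
    if (n <= 0)
        return 0;
    int32_t *rank = malloc((size_t)n * sizeof *rank);
    int32_t *tmp = malloc((size_t)n * sizeof *tmp);
    if (!rank || !tmp) {
        free(rank);
        free(tmp);
        return -1;
    }
    for (int32_t i = 0; i < n; i++) {
        sa[i] = i;
        rank[i] = s[i];          /* round 0: rank by first byte */
    }
    g_n = n;
    for (int32_t k = 1;; k *= 2) {
        g_rank = rank;
        g_k = k;
        /* O(n log n) sort per round, O(log n) rounds. */
        qsort(sa, (size_t)n, sizeof *sa, cmp_suffix);
        /* Re-rank: equal keys share a rank, distinct keys increment it. */
        tmp[sa[0]] = 0;
        for (int32_t i = 1; i < n; i++)
            tmp[sa[i]] = tmp[sa[i - 1]]
                       + (cmp_suffix(&sa[i - 1], &sa[i]) < 0);
        memcpy(rank, tmp, (size_t)n * sizeof *rank);
        /* All ranks distinct: the order is final. */
        if (rank[sa[n - 1]] == n - 1)
            break;
    }
    free(rank);
    free(tmp);
    return 0;
}
```

each round sorts suffixes by their first 2k bytes using the ranks from the previous round, so the loop ends after at most about log2(n) rounds. a matchfinder would then look for long common prefixes between neighbouring entries of sa.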

integrity checking is done via crc32 with the polynomial 0xEDB88320, reflected.
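
as a sketch, a bitwise crc32 over the reflected polynomial 0xEDB88320 can look like the following. the function name crc32_reflected is hypothetical, and the 0xFFFFFFFF initial value and final XOR are assumptions (they are the conventional zlib/PNG parameters); the commit itself only states the polynomial and that it is reflected.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32, reflected (LSB-first) form, polynomial 0xEDB88320. */
uint32_t crc32_reflected(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;          /* assumed initial value */
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; fold in the polynomial when a 1 falls out. */
            uint32_t poly = (crc & 1u) ? 0xEDB88320u : 0u;
            crc = (crc >> 1) ^ poly;
        }
    }
    return crc ^ 0xFFFFFFFFu;            /* assumed final XOR */
}
```

with these parameters, the nine ASCII bytes "123456789" hash to 0xCBF43926, the standard check value for this CRC. a table-free loop like this trades speed for code size, which suits a decompressor stub measured in bytes.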

switching to a tremendously simpler algorithm with far fewer edge cases costs well below 1KB in compressed output, factoring in the stub sizes.

also adds new machinery for host cc detection per review.
2026-04-19 00:29:09 +02:00