The multiple file compression benchmark tests aim to simulate real-world performance scenarios for lossless data compression programs. The test set comprises various file types, chosen based on their common usage in compression tasks. The selection process considers the prevalence of file types typically encountered by regular compression software users. For instance, text files may outnumber .ocx files in the set, reflecting typical user behavior rather than any strict criteria. With hundreds of files totaling over 300 MB, the large collection helps filter out anomalies, ensuring a more representative evaluation of compressors.
Certain programs like CCM and BZIP2 operate on single files exclusively. To accommodate these, all files are bundled into a single TAR archive, sorted alphabetically by suffix and name. Results for these compressors are denoted with a ‘Y’ in the tarred column.
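The bundling step described above can be sketched roughly as follows. This is a minimal illustration, not the benchmark's actual tooling: the function names `suffix_then_name` and `build_sorted_tar` are hypothetical, and the case-insensitive comparison is an assumption, since the text only says the files are sorted alphabetically by suffix and name.

```python
import os
import tarfile


def suffix_then_name(path):
    """Sort key: file extension first, then base name.

    Lower-casing both parts is an assumption made for this sketch.
    """
    stem, ext = os.path.splitext(os.path.basename(path))
    return (ext.lower(), stem.lower())


def build_sorted_tar(paths, tar_path):
    """Write the given files into one uncompressed TAR archive.

    Files are added in suffix-then-name order, so files of the same
    type end up next to each other in the archive.
    """
    ordered = sorted(paths, key=suffix_then_name)
    with tarfile.open(tar_path, "w") as tar:  # "w" = plain TAR, no compression
        for path in ordered:
            tar.add(path)
    return ordered
```

Grouping files by suffix places similar data side by side, which tends to help single-file compressors that work on the archive as one continuous stream.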
Emphasizing realism, the test doesn’t explore optimal compression configurations achievable via advanced command-line or GUI switches. Instead, it focuses on a limited set of configurations resembling those chosen by typical users. For instance, 7-Zip’s Ultra compression mode via GUI is selected, despite being potentially inferior to customized command-line configurations. WinRAR undergoes testing with maximum dictionary size and solid archiving, among other standard settings. Programs are restricted to a maximum memory usage of 800 MB and must complete compression within 24 hours. To qualify for listing on MFC (Multiple File Compression), the compressed size must be at most 50% of the original size.
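The three limits stated above (800 MB of memory, 24 hours, and at most 50% of the original size) can be expressed as a simple check. This is only a sketch: the function name `qualifies` and its parameters are hypothetical, and how memory usage and time are actually measured in the benchmark is not specified here.

```python
MAX_MEMORY_MB = 800            # memory limit stated in the text
MAX_SECONDS = 24 * 60 * 60     # 24-hour compression time limit


def qualifies(original_bytes, compressed_bytes, seconds, peak_memory_mb):
    """Return True if a run meets all three limits described in the text."""
    within_memory = peak_memory_mb <= MAX_MEMORY_MB
    within_time = seconds <= MAX_SECONDS
    # "At most 50% of the original size", kept in integer arithmetic.
    small_enough = compressed_bytes * 2 <= original_bytes
    return within_memory and within_time and small_enough
```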
Responding to requests, compression times are now measured alongside other metrics for single-file tests. Furthermore, the test set is kept non-public to prevent developers from tailoring their programs specifically to this benchmark, ensuring fairer real-life performance assessments.
Scoring is based on compressed size: the program that produces the smallest archive is considered the best. Efficiency is evaluated by multiplying the compression time (in seconds) by the ratio of the archive size to the smallest measured size; lower scores are better. Because the score combines both factors, a slow compressor rates as efficient only if its archive is correspondingly smaller, while a fast compressor is penalized in proportion to how much larger its archive is than the smallest one.
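The efficiency score described above can be written out as a short calculation. This is a sketch under stated assumptions: the function names and the input format (a mapping from program name to a time and size pair) are hypothetical, and the formula is simply time multiplied by the size ratio, as the text describes.

```python
def efficiency_scores(results):
    """Compute the efficiency score for each program (lower is better).

    `results` maps a program name to (compression_seconds, archive_size).
    score = compression_seconds * (archive_size / smallest_archive_size)
    """
    if not results:
        raise ValueError("no results to score")
    smallest = min(size for _, size in results.values())
    return {
        name: seconds * (size / smallest)
        for name, (seconds, size) in results.items()
    }


def rank_by_efficiency(results):
    """Return program names ordered from most to least efficient."""
    scores = efficiency_scores(results)
    return sorted(scores, key=scores.get)
```

With this formula, the program that produces the smallest archive has a size ratio of exactly 1, so its score equals its compression time in seconds; every other program's time is scaled up by how much larger its archive is.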
Observations and conclusions from the test include:
- PAQ8 and WinRK (PWCM) excel in compression performance, reducing the 300+ MB test set to under 62 MB, but with compression times exceeding 8.5 hours. PAQ8O8 and PAQAR likewise achieve exceptional compression at the cost of very long run times, with PAQAR taking over 17 hours for the test. All of these top-performing programs use PAQ-like compression engines. PAQ8 is particularly adept at recognizing and separately compressing images embedded in files such as Word DOC, which significantly improves its compression ratio.
- WinRAR, WinRK (Fastest ROLZ3), FreeARC, SBC 0.970, and CCM exhibit commendable efficiency, delivering good compression at reasonable speed. FreeARC stands out for compressing the test set to 82 MB in just 1 minute and 40 seconds, followed closely by WinRAR and SBC. Among the high-power compressors, CCM and FreeARC are the fastest, compressing the test set to 79 MB in a little over 5 minutes, with FreeARC slightly faster.
- THOR, QuickLZ, SLUG, and LZOP emerge as speed champions, compressing the entire 300+ MB test set in under 3.4 seconds. However, their speed comes at the expense of lower compression ratios. Other fast compressors like Pkzip, Arj, and Gzip offer slightly better compression ratios while maintaining impressive speeds.
- Decompression is generally faster than compression, except for compressors that use PPM or PAQ engines. Decompression times of 5-7 seconds appear to be the limit of the test system; decompressing to a RAM disk or /dev/null might be faster still.
- WinRAR appears to outperform 7-Zip in this test, possibly because of WinRAR’s multimedia filters, which handle embedded multimedia data such as WAV and BMP and which 7-Zip lacks. Multimedia files were included in the test set to simulate scenarios such as compressed game distributions, which often contain many multimedia files.
- The presence of pre-compressed files, such as CHM, GIF, and JPEG, reduces the performance gap between top and other compressors. A future test set may focus on non-multimedia data to explore differences more distinctly.