1/7/2024

Tar xz compression level

We compress all files with a csv extension in the current directory into the compressed archive. The J option enables compression with xz. Because of the v option, tar shows which files are added to the archive. Unlike xz and gzip, tar doesn't delete the input files after it creates the archive.

Which xz compression level does tar pick? It depends on our version of tar, but it is probably the default compression level 6.

tar allows setting the compression program through the --use-compress-program option. We use this option to set the compression level, too. Here, we specify the minimum compression level 1:

tar cvf --use-compress-program='xz -1' *.csv

Please note that we removed the J option here because --use-compress-program already sets the compression program.

Decompressing a tar archive with xz is also a single step and identical to gzip (except for the different file extension):

tar xvf

We decompress the file and extract its content into the current directory. We don't have to tell tar to decompress with xz: tar does this automatically by inspecting the file and detecting the xz compression. Because of the v option, tar shows which files are extracted from the archive. Unlike xz, tar doesn't delete the archive file after the extraction is complete.

Previously, we stated that xz creates smaller archives than gzip. To test this claim, we used the same 818 MB CSV file, and the same computer with six CPU cores and hyperthreading, as we used to test gzip in Linux. We compared xz to pigz, a gzip implementation that uses multithreading for faster compression and decompression.

Both archiving tools saturated the CPU in our tests: pigz does this by default, xz because of the -T0 option.

At compression level 7 out of 9, pigz compressed the 818 MB CSV file down to 95 MB in 4 seconds. Higher compression levels didn't produce meaningfully smaller archives.

In the same 4 seconds, xz compressed the file to just 48 MB, which was 49% smaller than pigz. xz used compression level 1 out of 9 for this.

With compression level 5, xz produced the smallest archive at 29 MB, which was 69% smaller than pigz. But at 70 seconds, xz also took nearly 18 times as long! Compression levels six and beyond hugely increased the compression time for a negligible 1% reduction in archive size.

So we've demonstrated that xz does indeed create much smaller archives than gzip.

I'm compressing ~1.3 GB folders, each filled with 1440 JSON files, and find that there's a 15-fold difference between using the tar command and Python's built-in tarfile library on macOS or Raspbian 10 (Buster).

Minimal working example

This script compares both methods:

#!/usr/bin/env python3
fullpath = Path("/Users/user/Desktop/temp/tar/")
with tarfile.open(py_out, "w:xz") as tar:

After compression, I've extracted both archives and compared the resulting folders with: diff -r py-archive-expanded zsh-archive-expanded.

tar on Raspbian 10: xz (XZ Utils) 5.2.4, liblzma 5.2.4

If I compare the two tar archives directly, they seem different:

➜ diff
Binary files and differ

If I inspect the archives with Quicklook (and the Betterzip plugin) I see that the files in the archive are ordered in a different way: the zsh archive uses an unknown order, and the Python archive orders the files by modification date.
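To make the effect of the xz compression level concrete from the Python side, here is a minimal sketch that writes the same folder into .tar.xz archives at two presets and reports the sizes. It relies on the preset keyword that tarfile.open accepts for the "w:xz" mode. Everything else is an assumption made for illustration: the helper names (make_sample_dir, compress_dir, member_names, compare_presets), the sample file names, and the synthetic JSON-like data are hypothetical and not taken from the script or benchmark discussed here. Files are added in sorted name order so that the member order inside the archive is reproducible.

```python
import tarfile
import tempfile
from pathlib import Path


def make_sample_dir(root, n_files=10, rows=500):
    """Create a hypothetical folder of small, repetitive JSON-like files."""
    src = Path(root) / "sample"
    src.mkdir()
    for i in range(n_files):
        body = "".join(
            f'{{"file": {i}, "row": {j}, "value": {j % 7}}}\n' for j in range(rows)
        )
        (src / f"data_{i:04d}.json").write_text(body)
    return src


def compress_dir(src, out, preset):
    """Write every file in src to an xz-compressed tar archive; return its size in bytes."""
    # For the "w:xz" mode, tarfile.open forwards `preset` (0 to 9) to the lzma module.
    with tarfile.open(out, "w:xz", preset=preset) as tar:
        # Sorted order keeps the archive layout independent of directory listing order.
        for path in sorted(Path(src).iterdir()):
            tar.add(path, arcname=path.name)
    return Path(out).stat().st_size


def member_names(archive):
    """Return the member names in the order they are stored in the archive."""
    with tarfile.open(archive, "r:xz") as tar:
        return tar.getnames()


def compare_presets(presets=(1, 6)):
    """Compress the same sample folder once per preset and map preset -> archive size."""
    with tempfile.TemporaryDirectory() as tmp:
        src = make_sample_dir(tmp)
        return {
            p: compress_dir(src, Path(tmp) / f"out_{p}.tar.xz", p) for p in presets
        }


if __name__ == "__main__":
    for preset, size in compare_presets().items():
        print(f"preset {preset}: {size} bytes")
```

On real data, the size gap between presets depends heavily on the content, so the numbers printed for this synthetic folder should not be read as a benchmark.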