Working with Compressed Files
One of the most common problems that I encounter outside the Windows world is how to work with the huge number of different compressed files that exist in addition to those that have leaked out of the Mac and Windows worlds.
In this HOWTO, I have included all of the file formats that I am currently aware of. That said, it's still very possible that there are some more formats that are not covered here. So if you know of any, please let me know at the martian engineer at gmail dot com, and I will add them here, crediting you.
A word about the Windows column in the following table: I added it based on some googling, and have not tested any of the options, as I do not have a Windows machine to test with, nor have I even seen one in many years.
Extension | Linux | OSX | Windows | Example | Notes |
---|---|---|---|---|---|
.bz2 | ✅ | ✅ | ✅ | view | https://sourceware.org/bzip2/ Windows Download: here |
.Z | ✅ | ✅ | view | https://github.com/vapier/ncompress/releases | |
.z | ✅ | ✅ | ✅ | view | https://www.gnu.org/software/gzip/ Windows Download: here |
.gz | |||||
.gzip | |||||
.tar | ✅ | ✅ | ✅ | view | https://www.gnu.org/software/tar/ Windows Download: here |
.tar.Z | view | ||||
.tar.z | view | ||||
.tar.gz | view | ||||
.tgz | view | ||||
.tar.bz2 | view | ||||
.zip | ✅ | ✅ | ✅ | view | |
.arc | ❎ | view | unarc - https://github.com/xredor/unarc | ||
.ark | ✅ | view | |||
.arj | ✅ | view | |||
.lzh | ✅ | view | |||
.lha | ✅ | view | |||
.zoo | ✅ | view | |||
.rar | ✅ | ✅ | ✅ | view | |
.hqx | ✅ | view | |||
.sit | ✅ | view | |||
.sea | ✅ | view | |||
.dmg | ✅ | view | |||
.uue | ✅ | view | |||
.uu | ✅ | view | |||
.cab | ✅ | view | |||
.ace | ✅ | view | |||
.rpm | ✅ | view |
The .bz2 format is one of the newer formats available, and my format of choice, owing to its superior compression ratio. It's based on the Burrows-Wheeler Transform, and achieves one of the best lossless compression ratios available so far.
The bzip2 tool, which is used to compress and decompress files in this format, is included with OSX and with every Linux distribution I have ever seen, and should also be included with other operating systems such as Solaris and BSD.
In this first example, the specified file is compressed and replaced with the compressed file; the .bz2 extension is appended to the filename.
bzip2 nameoffile
Decompress Example 1
In this first decompression example, the specified file is decompressed and replaces the compressed version; the .bz2 extension is removed. The -d command switch specifies that the file is to be decompressed.
bzip2 -d nameoffile.bz2
Decompress Example 2
In this second decompression example, the specified file is decompressed, but this time the compressed file is left in place and the output is written to outputfile. The -c command switch, used in addition to the -d from the last example, writes the decompressed data to stdout (the console), which is then redirected into the output file.
bzip2 -dc nameoffile.bz2 > outputfile
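As a quick way to see the behaviour described above, here is a minimal sketch of a full round trip in a scratch directory. The file names (sample.txt, restored.txt) are made up for the demo, and the check at the top that skips the demo when bzip2 is missing is my own addition.

```shell
#!/bin/sh
# Round trip with bzip2 in a scratch directory (all names are placeholders).
command -v bzip2 >/dev/null 2>&1 || { echo "bzip2 is not installed"; exit 0; }

workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'hello bzip2\n' > sample.txt

# Compress in place: sample.txt is replaced by sample.txt.bz2.
bzip2 sample.txt

# Decompress to stdout and redirect; the .bz2 file is left untouched.
bzip2 -dc sample.txt.bz2 > restored.txt

ls
cat restored.txt
```

After this runs, the directory should hold sample.txt.bz2 and restored.txt, with the original sample.txt gone.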
That's it for this type of file when used on its own; I talk more about it in the tar section. For more information, have a look at the man page, which is accessed like this: man bzip2. Below are links to the home page for bzip2 and also a technical article on the technology behind it.
The .Z format is one of the original compression formats, and has been replaced by gzip, which will decompress these files. That said, the original tools to compress and decompress in this format are still around, so they are worth covering separately. The tools in question are compress and uncompress, which are part of OSX, should already be part of your Linux distribution, and should also be available on other operating systems such as Solaris and BSD.
In this first example, the specified file is replaced with the compressed file; the extension .Z is appended to the original filename.
compress nameoffile
Compress Example 2
In this next example the specified file is compressed, but the original file is not replaced; instead the compressed data is redirected to the specified output file. One thing to note if you use compress like this is that the original filename and extension are not automatically retained. The -c command switch tells compress to write the compressed data to stdout.
compress -c nameoffile > outputfile
Decompress Example 1
In this first example, the compressed file is decompressed and replaced with the decompressed one; the .Z extension is removed.
uncompress nameoffile.Z
Decompress Example 2
In this next example, the compress tool is used to decompress. Decompression is specified with the -d command switch. Other than that, the behavior is the same as the uncompress tool.
compress -d nameoffile.Z
As I mentioned above, this type of compressed file has now been replaced with gzip, which will decompress these files without any problems, so as you would expect, the decompression examples in the gzip section are also valid for these files.
The gzip format is the most common type of compression used in the non-Windows world, and has several known file extensions associated with it; the most common are .z and .gz, with .gzip being unusual but still out in the world. The tools you use to work with these files are gzip and gunzip. Like most of the other tools, these are part of just about every flavor of UNIX, such as Linux, OSX and Solaris.
Compress Example 1
In this first example the specified file is compressed and replaced with the compressed version; the default extension of .gz is appended.
gzip nameoffile
Compress Example 2
In this next example the specified file is compressed, but the original file is not replaced; instead the compressed data is redirected to the specified output file. One thing to note if you use gzip like this is that the original filename and extension are not automatically retained. The -c command switch tells gzip to write the compressed data to stdout.
gzip -c nameoffile > outputfile
Decompress Example 1
In this first example, the compressed file is decompressed and replaced with the decompressed one; the .gz, .z or .gzip extension is removed.
gunzip nameoffile.gz
or
gunzip nameoffile.z
or
gunzip nameoffile.gzip
Decompress Example 2
In this next example, the gzip tool is used to decompress. Decompression is specified with the -d command switch. Other than that, the behavior is the same as the gunzip tool.
gzip -d nameoffile.gz
or
gzip -d nameoffile.z
or
gzip -d nameoffile.gzip
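Yes, you would be right in thinking all of the above are effectively the same, the only difference being the extension; I have intentionally shown one example for each of the common gzip extensions. One caveat: GNU gunzip only strips suffixes it recognizes, and may report .gzip as an unknown suffix; if that happens, name the suffix explicitly with the -S switch, as in gunzip -S .gzip nameoffile.gzip.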
NOTE: gzip/gunzip are not the same as zip and will NOT work with zip files.
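To make the difference between in-place compression and the -c form concrete, here is a small sketch that keeps the original file, writes the compressed copy under a name of our choosing, and then checks the round trip. The names notes.txt and notes-copy.gz are invented for the demo.

```shell
#!/bin/sh
# gzip round trip in a scratch directory (all names are placeholders).
set -e
workdir=$(mktemp -d)
cd "$workdir"

printf 'hello gzip\n' > notes.txt

# -c writes to stdout, so notes.txt is kept and we pick the output name.
gzip -c notes.txt > notes-copy.gz

# gunzip replaces notes-copy.gz with notes-copy (the .gz suffix is stripped).
gunzip notes-copy.gz

# Compare the original with the decompressed copy.
cmp notes.txt notes-copy && echo "round trip OK"
```

Because the output name was chosen by hand, the decompressed file comes back as notes-copy and not notes.txt, which is the "filename is not retained" point made above.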
tar is different from all of the others, in that the tool used with these extensions does not compress anything by itself; it was originally intended for use with tape backup media. Compression and decompression are only possible when tar is used in conjunction with one of the other tools. As this is the cause of most of the confusion, I have included compress and decompress examples for each of the most common combinations. No need to worry about finding this tool: it has been part of all Linux distributions since the beginning of time, and is included with OSX, Solaris and BSD.
NOTE: Warning for OSX users: although this tool is included as part of the OS, under 10.2.8 or earlier it does not support the -j or -y switches, so you will have to use bzip2 -dc nameoffile.tar.bz2 | tar -xvf -, which is the third decompression example in the .tar.bz2 section below. If you are using 10.3 or later, this problem has been fixed and the -j switch is fully supported.
.tar
These files are not compressed at all; the files are just "concatenated" into a single archive file. You manipulate these files like this:
Creation Example
To copy all the files in a specified directory into the named .tar file, you use the tar command like this.
tar cvf nameoffile.tar nameofdirectory
The command switches used are cvf, which stand for "create", "verbose" and "file". I do suggest that you look at the man page for tar, as there are masses of possible options with this powerful tool.
Once you have your .tar file, you retrieve the files with the following command.
tar xvf nameoffile.tar
The command switches used this time are xvf, which stand for "extract", "verbose" and "file". As before, the man page for tar covers the many other options.
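Here is a minimal sketch of the create and extract steps run end to end in a scratch directory. The directory and file names (project, a.txt, b.txt, unpacked) are placeholders, and the t switch used to list the archive is an extra that the text above does not cover.

```shell
#!/bin/sh
# Create, list and extract a plain .tar archive (all names are placeholders).
set -e
workdir=$(mktemp -d)
cd "$workdir"

mkdir project
printf 'one\n' > project/a.txt
printf 'two\n' > project/b.txt

# c = create, v = verbose, f = archive file name follows.
tar cvf project.tar project

# t = list the archive contents without extracting anything.
tar tf project.tar

# Extract into a separate directory so the originals are not overwritten.
mkdir unpacked
( cd unpacked && tar xvf ../project.tar )

cat unpacked/project/a.txt
```

Extracting in a subshell keeps the working directory unchanged, and shows that tar recreates the project directory underneath wherever you run it.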
.tar.Z
As I talked about above, tar does not compress anything on its own; the compression comes from external tools, which in this case is compress.
This example will take all files in the specified directory and copy them in compressed format into the specified file. The files in the specified directory are not removed. The command switches cvfZ are create, verbose, file, and filter through compress.
tar cvfZ nameoffile.tar.Z directoryname/
Decompress Example 1
This example will decompress the specified file, extracting all files, leaving the compressed file unchanged. The command switches xvfZ are extract, verbose, file, and filter through compress.
tar xvfZ nameoffile.tar.Z
Decompress Example 2
This next example does exactly the same as the last one, with the exception that it will work with versions of tar that do not support the Z switch (if such a beast still exists). It works by first running compress with the d and c command switches, which decompress the specified file and write the decompressed data to stdout. It's then piped into tar, which extracts the files. The original compressed file is unchanged.
compress -dc nameoffile.tar.Z | tar -xvf -
As I talked about above, tar does not compress anything on its own; for .tar.z, .tar.gz and .tgz files the compression comes from an external tool, which in this case is gzip.
This example will take all the files in the specified directory, compress them, then write them into the single file specified. The command switches cvfz are create, verbose, file, and filter through gzip.
tar cvfz nameoffile.tar.z directoryname/
or
tar cvfz nameoffile.tar.gz directoryname/
or
tar cvfz nameoffile.tgz directoryname/
Decompress Example 1
This example will decompress the specified file, extracting all files and any directories into the current directory, leaving the compressed file unchanged. The command switches xvfz are extract, verbose, file, and filter through gzip.
tar xvfz nameoffile.tar.z
or
tar xvfz nameoffile.tar.gz
or
tar xvfz nameoffile.tgz
Decompress Example 2
This next example does exactly the same as the last one, with the exception that it will work with versions of tar that do not support the z switch (if such a beast still exists). It works by first running gzip with the d and c command switches, which decompress the specified file and write the decompressed data to stdout. It's then piped into tar, which extracts the files. The original compressed file is unchanged.
gzip -dc nameoffile.tar.z | tar -xvf -
or
gzip -dc nameoffile.tar.gz | tar -xvf -
or
gzip -dc nameoffile.tgz | tar -xvf -
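The pipe form works in both directions. Here is a small sketch that builds a .tar.gz through a pipe and unpacks it the same way, without relying on the z switch. The names docs, readme.txt and restore are made up for the demo.

```shell
#!/bin/sh
# Build and unpack a .tar.gz using pipes only (all names are placeholders).
set -e
workdir=$(mktemp -d)
cd "$workdir"

mkdir docs
printf 'alpha\n' > docs/readme.txt

# tar writes the archive to stdout ("-"), gzip compresses what it reads on stdin.
tar cf - docs | gzip -c > docs.tar.gz

# gzip decompresses to stdout, tar reads the archive from stdin ("-").
mkdir restore
( cd restore && gzip -dc ../docs.tar.gz | tar -xvf - )

cat restore/docs/readme.txt
```

The lone dash after f tells tar to use stdin or stdout in place of a named archive file, which is what lets the two tools be chained.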
.tar.bz2
As I talked about above, tar does not compress anything on its own; the compression comes from external tools, which in this case is bzip2. With this format there is a gotcha, in that one of the command switches you need was changed between versions of tar. This is annoying, so I have included additional examples to help regardless of the version of tar you have.
Version of tar you have
This is not normally an issue, but as I said above, some genius decided to change the command switch that causes tar to filter through bzip2. The version of tar you are using is identified like this.
tar --version
The two versions that I know of are 1.13 and, on the RH box, 1.13.19. The interesting thing is that the latest version on the GNU web site is 1.13, so I am not sure where the later 1.13.19 came from.
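If reading version numbers feels fragile, another option is simply to try each switch on a throwaway file and see which one works. The following is only a sketch of that idea; the probe file names and the bz_mode variable are hypothetical, and it falls back to "pipe" when neither switch is accepted (or when bzip2 itself is missing).

```shell
#!/bin/sh
# Sketch: find out which bzip2 switch, if any, the local tar understands.
# All file names here are throwaway placeholders.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'probe\n' > probe.txt

# Show the version banner (first line only), if this tar has one.
tar --version 2>/dev/null | head -n 1

if tar cjf probe-j.tar.bz2 probe.txt 2>/dev/null; then
    bz_mode="-j"
elif tar cyf probe-y.tar.bz2 probe.txt 2>/dev/null; then
    bz_mode="-y"
else
    bz_mode="pipe"
fi

echo "bzip2 support in this tar: $bz_mode"
```

A result of "pipe" means the bzip2 and tar commands have to be chained by hand.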
Compress Example 1. (tar version 1.13)
This example will take all the files in the specified directory, compress them using bzip2, then write them into the single file specified. The command switches cvfy are create, verbose, file, and filter through bzip2. As with all the tar examples, the original files are unaltered.
tar cvfy nameoffile.tar.bz2 directoryname/
Compress Example 2. (tar version 1.13.19)
This example will take all the files in the specified directory, compress them using bzip2, then write them into the single file specified. The command switches cvfj are create, verbose, file, and filter through bzip2. As with all the tar examples, the original files are unaltered.
tar cvfj nameoffile.tar.bz2 directoryname/
TIP: OSX 10.3 or later users can use the above compression example, while those with older versions of OSX need to use the following.
Compress Example 3. (All versions)
In this final compression example, the functionality of both the previous ones is replicated, but in place of the -y or -j switches we are going to use UNIX pipes and the bzip2 tool to compress.
tar cvf - nameoffile | bzip2 -c > compressedfile.tar.bz2
Now this may look like lots of work, but actually it's quite simple. What it does is create a new archive containing nameoffile, then write the archive to stdout, which is then piped (the |) into the bzip2 tool, which compresses it, again to stdout, where it's redirected into compressedfile.tar.bz2.
TIP: The nameoffile can be a single file or a directory.
Decompress Example 1. (tar version 1.13)
This example will decompress the specified file, extracting all files and any directories into the current directory, leaving the compressed file unchanged. The command switches xvfy are extract, verbose, file, and filter through bzip2.
tar xvfy nameoffile.tar.bz2
Decompress Example 2. (tar version 1.13.19)
This example will decompress the specified file, extracting all files and any directories into the current directory, leaving the compressed file unchanged. The command switches xvfj are extract, verbose, file, and filter through bzip2.
tar xvfj nameoffile.tar.bz2
TIP: OSX 10.3 or later users can use the above example, while those with older versions need to use the next one.
Decompress Example 3. (All versions)
This next example does exactly the same as the last two, with the exception that it will work with both versions of tar, and should also work with those that do not support filtering through bzip2. It works by first running bzip2 with the d and c command switches, which decompress the specified file and write the decompressed data to stdout. It's then piped into tar, which extracts the files and any directories into the current directory, leaving the original compressed file unchanged.
bzip2 -dc nameoffile.tar.bz2 | tar -xvf -
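Putting the two "all versions" examples together, here is a sketch of a complete round trip that never uses the -y or -j switches. The names data, file.txt and restore are placeholders, and the check that skips the demo when bzip2 is not installed is my own addition.

```shell
#!/bin/sh
# .tar.bz2 round trip using pipes only (all names are placeholders).
command -v bzip2 >/dev/null 2>&1 || { echo "bzip2 is not installed"; exit 0; }

set -e
workdir=$(mktemp -d)
cd "$workdir"

mkdir data
printf 'beta\n' > data/file.txt

# Archive to stdout, compress, and redirect into the final file.
tar cvf - data | bzip2 -c > data.tar.bz2

# List the contents without extracting (t is tar's "list" switch).
bzip2 -dc data.tar.bz2 | tar -tf -

# Extract into a separate directory and compare with the original.
mkdir restore
( cd restore && bzip2 -dc ../data.tar.bz2 | tar -xvf - )
cmp data/file.txt restore/data/file.txt && echo "contents match"
```

Since this form only depends on tar reading and writing a plain archive stream, it should behave the same whichever switch your tar happens to support.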
Yes, this is the same as the .zip files you use in the Windows world. These files are handled either by pkzip for UNIX or, more commonly, by zip and unzip, which are included in every Linux I have seen, and may also be part of other operating systems such as Solaris or BSD. Just in case you do not have these tools, or just want to learn more, here are links for downloads etc.
For those of you who miss the old DOS pkzip tool, all is not lost: there is a port to Linux, although it's binary only and shareware. Even with those problems, it's included in the links.
Compress Example
This example will compress the specified file, writing the results to the specified zip file. The original file is retained.
zip zipfilename.zip filetocompress
Decompress Example
This example will decompress the specified zip file; its contents, including any directories, are written to the current directory. The compressed zip file is unchanged.
unzip nameoffile.zip
The .arc format (also seen as .ark) is one of the very first compression formats, available back in the early days of DOS. These days it's extremely rare to see these files in the Windoze world, let alone under Linux/UNIX. All is not lost: there is a free tool called nomarch that will decompress these things; the download link is below. Things are changing, though: it seems there is a tool that will allow these files to be created. It's called SEA arc, and is available from here
This example will decompress the specified file; the contents are written to the current directory.
nomarch nameoffile.arc
or
nomarch nameoffile.ark
The .arj format is one of the formats used in the Windows world, and as far as I can tell there is no way to create these files under anything else. Then again, why would you, as there are so many UNIX formats available? You may not be able to create these files, but you can decompress them with the unarj tool, which so far has uncompressed every .arj file I have tried, including things like multiple file backups. So far unarj has not been included with any distribution of Linux or any OS other than Windoze/DOS.
This example will decompress the specified .arj file, extracting all files and directories into the current directory. The command switch x means extract with pathnames. The original .arj file is unchanged.
unarj x nameoffile.arj
UPDATE: So far I have not been able to make the Open Source arj compile on any of my UNIX/Linux boxes, so I cannot really include any usage examples. When I get the time I will have another go; until then, read the /docs/compile.txt file for some help.
The .lzh/.lha format is quite rare these days, even in the Windoze world, but as there are still some files in this format out there, it's still worth covering. There is not much chance that the lha tool used to support this format is part of your distribution.
I had a problem with this tool, in that I never made it work: the download is binary only, and when I tried it on my Linux box it dived into the dirt big time. I have since emailed the author for the source code, but the request was ignored.
In the process of researching this HOWTO, I came across another tool that claimed to support lha compressed files, but this tool also crashed whenever I tried to run it. So until I get a working tool there can be no examples. Sorry, folks.
The .zoo format is another rare one; in fact I have not seen any .zoo files in many years. The tool that handles these files is zoo, which is very unlikely to be part of any OS these days. All is not lost: you can get a source RPM of the Linux port of this DOS app.
This example will decompress the specified file, writing the contents to the current directory. The e command switch means extract.
zoo e nameoffile.zoo
The .rar format is one of the more popular compression formats in use in the Windows world, and it seems to be becoming more popular on other platforms. There used to be only one unofficial tool for decompressing these files under Linux, but now there is an official Linux port available that allows both compression and decompression, with full support for encrypted and multipart file sets.
I have used this tool extensively and moved files between platforms with no problems whatsoever.
Compress Example 1
This first example will create a compressed file that contains all .txt files in the current directory. All the target files are retained, as are all filenames.
rar a nameoffile.rar *.txt
Compress Example 2
This example will create a compressed file that contains a single file. Just as with the last example, the original file is retained, as is the filename.
rar a nameoffile.rar singlefile
Decompress Example
This example will decompress the specified file; the contents are written to the current directory. The x switch means extract, and you are prompted before any existing files are overwritten.
rar x nameoffile.rar
The .hqx and .sit formats are used mainly in the Mac world, but are well supported under Linux and, so I am told, Windoze, Macs and Solaris, by the Stuffit expander, which despite its name will both compress and decompress.
This tool supports more than just the .hqx and .sit files; it can compress into the following formats:
zip, compress, uu, hqx, bin, sitseg, sitsegN, pf, sit5
and decompress the following :-
sit, cpt, zip, arc, arj, lha, rar, gz, compress, uu, hqx, toa, mime, tar, bin, sit, seg, sit, seg, pf, sit5, bz2, apple
WARNING: If you are using Stuffit Deluxe or Standard 8.0 you need to update to 8.01 to avoid a permissions bug.
Now the bad news: this tool is unlike all of the others detailed so far on this page, in that it's not open source, nor is it free; registration costs USD$29.95. If you need to read Mac compression formats, that is a small price to pay.
There is an Open Source solution for handling these files, although I have not been able to find a working download for the source code. This tool is called macutils and is a suite of tools for handling Mac files under Linux.
UPDATE: Lars Mueller submitted this link for mactools source download. Thanks, Lars.
The .sea files are Mac-specific Self Extracting Archive files, and are not supported at all on anything else. Should this change I will update this page.
The .dmg files are Mac-specific disk images, and are usually mounted by double-clicking from within Finder, or by using the open command from a terminal.
The .uue and .uu files are uuencoded, which is an old format usually used for things like email file attachments or Usenet files. There are a number of tools that handle these files. Some examples are:
TIP: I would recommend the use of the GNU Sharutils, its free, open sourced, and very well documented.
Yes, it is what it sounds like: support for decompressing files from Microsoft .cab files. Not that I have ever found a use for this, but it's there for you if you need it.
Decompress Example
This example will extract all the files contained in the cab file into the current working directory.
cabextract nameoffile.cab
NEW: It seems it's actually possible to create these things under Linux. Hans Fredrik Nordhaug submitted the following link for the tool. Thanks, Hans.
lcab home (.cab creator)
So far I have not had time to play with this new tool, but I have confirmed that it builds under Linux (gcc 2.95-3) and OSX 10.3 (gcc 3.3) without problems. When time allows I will do some tests and add usage examples here.
I have never seen an .ace file, but I have received several emails asking for help decompressing them under Linux. The format is a Windoze thing, but the author has released a Linux port, which can decompress but not compress.
I have no ace files, so until I get hold of something to test with, I am not able to include any examples.
Yes, I know .rpm has always been what amounts to a Linux-specific format, but that is no longer the case, as these files can now be worked with under OSX using a tool called RPMinator, that's available from here
It seems this is not the only rpm tool available for OSX, with a command line version available from here
I have no idea why anyone would want to use RPMs under OSX, but I thought it worth including it here after a number of people actually requested it.