Working with Compressed Files

One of the most common problems I encounter outside the Windows world is how to work with the huge number of compressed file formats in use, both the native UNIX ones and those that have leaked out of the Mac and Windows worlds.

In this HOWTO, I have included all of the file formats that I am currently aware of. That said, it's still very possible that there are some more formats that are not covered here. So if you know of any, please let me know at the martian engineer at gmail dot com, and I will add them here, crediting you.

A word about the Windows column in the following table: I added it based on some googling and have not tested any of the options, as I do not have a Windows machine to test with, nor have I even seen one in many years.

Extension Linux OSX Windows Example Notes
.bz2 view https://sourceware.org/bzip2/
Windows Download: here
.Z view https://github.com/vapier/ncompress/releases
.z view https://www.gnu.org/software/gzip/
Windows Download: here
.gz
.gzip
.tar view https://www.gnu.org/software/tar/
Windows Download: here
.tar.Z view
.tar.z view
.tar.gz view
.tgz view
.tar.bz2 view
.zip view
.arc view unarc - https://github.com/xredor/unarc
.ark view
.arj view
.lzh view
.lha view
.zoo view
.rar view
.hqx view
.sit view
.sea view
.dmg view
.uue view
.uu view
.cab view
.ace view
.rpm view
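As a rough sketch of how the table can be used, the following shell function maps a filename to the command this HOWTO would suggest for it. The function name suggest_command is my own invention for illustration, it only prints a command rather than running it, and it covers only the formats that have worked examples on this page.

```shell
# suggest_command is a hypothetical helper: it prints the command this
# HOWTO would suggest for a file, based purely on the file extension.
suggest_command() {
    case "$1" in
        # The combined tar extensions must be tested before the plain ones.
        *.tar.bz2)               echo "bzip2 -dc $1 | tar -xvf -" ;;
        *.tar.gz|*.tar.z|*.tgz)  echo "gzip -dc $1 | tar -xvf -" ;;
        *.tar.Z)                 echo "compress -dc $1 | tar -xvf -" ;;
        *.tar)                   echo "tar xvf $1" ;;
        *.bz2)                   echo "bzip2 -d $1" ;;
        *.Z)                     echo "uncompress $1" ;;
        *.gz|*.z|*.gzip)         echo "gunzip $1" ;;
        *.zip)                   echo "unzip $1" ;;
        *.arj)                   echo "unarj x $1" ;;
        *.zoo)                   echo "zoo e $1" ;;
        *.rar)                   echo "rar x $1" ;;
        *.cab)                   echo "cabextract $1" ;;
        *)                       echo "no suggestion for $1" ;;
    esac
}

# Example calls with made-up filenames.
suggest_command backup.tar.bz2
suggest_command notes.Z
```

The order of the patterns matters: the shell uses the first match, so .tar.bz2 has to come before .bz2, and so on.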

bzip2 [.bz2]

This is one of the newer formats available, and my format of choice owing to its superior compression ratio. It is based on the "Burrows-Wheeler Transform" and achieves one of the best lossless compression ratios available so far.

The bzip2 tool is used to compress and decompress files in this format. It is included with OSX and every Linux distribution I have ever seen, and should also be included with other OSes such as Solaris and BSD.

Compress Example

In this first example, the specified file is compressed and replaced with the compressed file; the .bz2 extension is appended to the filename.

bzip2 nameoffile

Decompress Example 1

In this first decompression example, the specified file is decompressed and replaces the compressed version, the .bz2 extension is removed. The -d command switch is used to specify the file is to be decompressed.

bzip2 -d nameoffile.bz2

Decompress Example 2

In this second decompression example, the specified file is decompressed, but this time the compressed file is left in place and the decompressed data is written to outputfile. The -c switch is added to the previous example; it writes the decompressed data to stdout (the console), which is then redirected into outputfile.

bzip2 -dc nameoffile.bz2 > outputfile

That's it for this type of file when used on its own; I talk more about it in the tar section. For more information have a look at the man page, which is accessed with man bzip2. Below are links to the home page for bzip2 and also a technical article on the technology behind it.
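Two other bzip2 switches are worth a quick sketch: -k keeps the original file when compressing, and -t tests a compressed file without writing anything. The throwaway directory, the sample.txt name and the shell variables below are only assumptions for the demonstration.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'hello bzip2\n' > sample.txt

# -k keeps sample.txt next to the new sample.txt.bz2.
bzip2 -k sample.txt

# -t tests the integrity of the compressed file; nothing is written.
bzip2 -t sample.txt.bz2 && bz2_status=ok

# -dc decompresses to stdout, leaving sample.txt.bz2 in place.
bzip2 -dc sample.txt.bz2 > restored.txt
cmp sample.txt restored.txt && bz2_roundtrip=same

echo "integrity: $bz2_status, round trip: $bz2_roundtrip"
```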

compress/uncompress [.Z]

This is one of the original compression formats, and has largely been replaced by gzip, which can also decompress these files. That said, the original tools for this format are still around, so they are worth covering separately. The tools in question are compress and uncompress, which are part of OSX, should already be part of your Linux distribution, and should also be available on other OSes like Solaris and BSD.

Compress Example 1

In this first example, the specified file is replaced with the compressed file, the extension .Z is appended to the original filename.

compress nameoffile

Compress Example 2

In this next example the specified file is compressed, but the original file is not replaced; instead the compressed data is redirected to the specified output file. Note that when you use compress like this, the original filename and extension are not automatically retained. The -c command switch tells compress to write the compressed data to stdout.

compress -c nameoffile > outputfile

Decompress Example 1

In this first example, the compressed file is decompressed, the compressed file is replaced with the decompressed one, the .Z extension is removed.

uncompress nameoffile.Z

Decompress Example 2

With this next example, the compress tool is used to decompress. Decompression is specified with the -d command switch. Other than that, the behavior is the same as the uncompress tool.

compress -d nameoffile.Z

As I mentioned above, this type of compressed file has now been replaced with gzip, which will decompress these files without any problems, so as you can expect the examples included in the gzip section are valid.
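To illustrate the point about gzip reading .Z files, here is a small sketch that compresses a file with compress and then decompresses it with gzip. It assumes compress may not be installed (many recent systems leave it out), so the whole thing is skipped in that case; the temporary directory and the z_result variable are just for the demonstration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a sample file with plenty of repetition so it compresses well.
i=0
while [ "$i" -lt 200 ]; do echo "the same line again"; i=$((i + 1)); done > nameoffile

if command -v compress >/dev/null 2>&1; then
    # -c keeps the original and lets us choose the output name.
    compress -c nameoffile > nameoffile.Z
    # gzip -dc decompresses the .Z data to stdout; cmp checks it
    # against the original file.
    if gzip -dc nameoffile.Z | cmp -s - nameoffile; then
        z_result=same
    else
        z_result=different
    fi
else
    z_result=skipped
    echo "compress is not installed, nothing to demonstrate"
fi
echo "result: $z_result"
```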

gzip/gunzip [.z .gz .gzip]

This is the most common type of compression used in the non-Windows world, and has many known file extensions associated with it. The most common are .z and .gz, with .gzip being unusual but still out in the world. The tools you use to work with these files are gzip and gunzip. Like most of the other tools, these are part of just about every flavor of UNIX known, such as Linux, OSX and Solaris.

Compress Example 1

In this first example the specified filename is compressed and replaced with the compressed version, the default extension of .gz is appended.

gzip nameoffile

Compress Example 2

In this next example the specified file is compressed, but the original file is not replaced; instead the compressed data is redirected to the specified output file. Note that when you use gzip like this, the original filename and extension are not automatically retained. The -c command switch tells gzip to write the compressed data to stdout.

gzip -c nameoffile > outputfile

Decompress Example 1

In this first example, the compressed file is decompressed, the compressed file is replaced with the decompressed one, the .gz, .z or .gzip extension is removed.

gunzip nameoffile.gz

or

gunzip nameoffile.z

or

gunzip nameoffile.gzip

Decompress Example 2

With this next example, the gzip tool is used to decompress. Decompression is specified with the -d command switch. Other than that, the behavior is the same as the gunzip tool.

gzip -d nameoffile.gz

or

gzip -d nameoffile.z

or

gzip -d nameoffile.gzip

Yes you would be right in thinking all of the above are effectively the same, with the only difference being the extension, where I have intentionally shown one example for each of the common gzip extensions.

NOTE: gzip/gunzip are not the same as zip and will NOT work with zip files.
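gzip can also check and describe a compressed file without unpacking it. The sketch below uses -t to test integrity and -l to list the compressed and uncompressed sizes; the temporary directory, filenames and variables are assumptions made for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a sample file with plenty of repetition so it compresses well.
i=0
while [ "$i" -lt 200 ]; do echo "the same line again"; i=$((i + 1)); done > sample.txt

# -c writes to stdout, so sample.txt is kept and we pick the output name.
gzip -c sample.txt > sample.txt.gz

# -t tests integrity; -l lists compressed and uncompressed sizes.
gzip -t sample.txt.gz && gz_status=ok
gzip -l sample.txt.gz

# Decompress to stdout and compare against the original.
gzip -dc sample.txt.gz | cmp - sample.txt && gz_roundtrip=same
```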

tar [.tar .tar.Z .tar.z .tar.gz .tgz .tar.bz2]

This one is different from all of the others, in that tar, the tool that's used with these extensions, does not compress anything by itself; it was actually intended for use with tape backup media. Compression and decompression are only possible when tar is used in conjunction with one of the other tools. As this is the cause of the majority of the confusion, I have included compress and decompress examples for each of the most common combinations. No need to worry about availability: tar has been part of all Linux distributions since the beginning of time, and is included with OSX, Solaris and BSD.

NOTE: Warning for OSX users: although this tool is included as part of the OS, under 10.2.8 or earlier it does not support the -j or -y switches, so you will have to use bzip2 -dc nameoffile.tar.bz2 | tar -xvf -, which is the third decompress example in the .tar.bz2 section below. If you are using 10.3 or later, this problem has been fixed and the -j switch is fully supported.

.tar

These files are not compressed at all; the individual files are simply "concatenated" into a single archive file. You manipulate these files like this:

Creation Example

To copy all the files in a specified directory into the named .tar file you use the tar command like this.

tar cvf nameoffile.tar nameofdirectory

The command switches used are cvf: "create", "verbose" and "file". I do suggest that you look at the man page for tar, as there are masses of possible options with this powerful tool.

Extraction Example

Once you have your .tar file, you retrieve the files with the following command.

tar xvf nameoffile.tar

The command switches used this time are xvf: "extract", "verbose" and "file". I do suggest that you look at the man page for tar, as there are masses of possible options with this powerful tool.
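Two of those many options are handy enough to sketch here: t lists the contents of an archive without extracting anything, and -C extracts into a different directory. The directory and file names below are made up for the demonstration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir nameofdirectory
echo one > nameofdirectory/a.txt
echo two > nameofdirectory/b.txt

# Create the archive (no v this time, to keep the output quiet).
tar cf nameoffile.tar nameofdirectory

# t lists what is inside the archive without extracting it.
listing=$(tar tf nameoffile.tar)
echo "$listing"

# -C changes to the given directory before extracting.
mkdir unpacked
tar xf nameoffile.tar -C unpacked
ls unpacked/nameofdirectory
```

Listing first is a cheap way to check whether an archive will unpack into its own directory or scatter files all over the current one.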

.tar.Z

As discussed above, tar does not compress anything on its own; the compression comes from an external tool, which in this case is compress.

Compress Example

This example will take all files in the specified directory and copy them in compressed format into the specified file. The files in the specified directory are not removed. The command switches cvfZ are create, verbose, file, and filter through compress.

tar cvfZ nameoffile.tar.Z directoryname/

Decompress Example 1

This example will decompress the specified file, extracting all files, leaving the compressed file unchanged. The command switches xvfZ are extract, verbose, file, and filter through compress.

tar xvfZ nameoffile.tar.Z
Decompress Example 2

This next example does exactly the same as the last one, with the exception that it will work with versions of tar that do not support the Z switch (if such a beast still exists). It works by first running compress with the d and c command switches, which decompress the specified file and write the decompressed data to stdout. It's then piped into tar, which extracts the files. The original compressed file is unchanged.

compress -dc nameoffile.tar.Z | tar -xvf -

.tar.z .tar.gz .tgz

As discussed above, tar does not compress anything on its own; the compression comes from an external tool, which in this case is gzip.

Compress Example

This example will take all the files in the specified directory, compress them, then write them into the single file specified. The command switches cvfz are create, verbose, file, and filter through gzip.

tar cvfz nameoffile.tar.z directoryname/

or

tar cvfz nameoffile.tar.gz directoryname/

or

tar cvfz nameoffile.tgz directoryname/

Decompress Example 1

This example will decompress the specified file, extracting all files and any directories into the current directory, leaving the compressed file unchanged. The command switches xvfz are extract, verbose, file, and filter through gzip.

tar xvfz nameoffile.tar.z

or

tar xvfz nameoffile.tar.gz

or

tar xvfz nameoffile.tgz

Decompress Example 2

This next example does exactly the same as the last one, with the exception that it will work with versions of tar that do not support the z switch (if such a beast still exists). It works by first running gzip with the d and c command switches, which decompress the specified file and write the decompressed data to stdout. It's then piped into tar, which extracts the files. The original compressed file is unchanged.

gzip -dc nameoffile.tar.z | tar -xvf -

or

gzip -dc nameoffile.tar.gz | tar -xvf -

or

gzip -dc nameoffile.tgz | tar -xvf -
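The same pipe trick works in the other direction, and for listing. This is only a sketch with made-up names: tar writes the archive to stdout (the - after f), gzip compresses it, and later the archive is listed by decompressing to stdout and having tar read from stdin.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir directoryname
echo one > directoryname/a.txt

# Create: tar writes the archive to stdout, gzip compresses it on the way
# into the output file.
tar -cf - directoryname | gzip -c > nameoffile.tar.gz

# List without extracting: gzip decompresses to stdout, tar reads stdin.
piped_listing=$(gzip -dc nameoffile.tar.gz | tar -tf -)
echo "$piped_listing"
```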

.tar.bz2

As discussed above, tar does not compress anything on its own; the compression comes from an external tool, which in this case is bzip2. With this format there is a gotcha, in that one of the command switches you need was changed between versions of tar. This is annoying, so I have included additional examples to help regardless of the version of tar you have.

How to find the version of tar you have

This is not normally an issue, but as I said above, some genius decided to change the command switch that causes tar to filter through bzip2. The version of tar you are using is identified like this.

tar --version

The two versions that I know of are 1.13 and on the RH box, 1.13.19. The interesting thing is that the latest version on the gnu web site is 1.13, so I am not sure where the later 1.13.19 came from.
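If the version number does not settle it, another approach is simply to try the switch on a scratch file and see whether it works. This is my own sketch rather than anything from the tar documentation; the probe filenames and the tar_bzip2 variable are invented for the test.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo probe > probe.txt

# Try to create a small archive with the j switch, then ask bzip2 to
# verify it. If either step fails, fall back to the pipe method.
if tar cjf probe.tar.bz2 probe.txt 2>/dev/null &&
   bzip2 -t probe.tar.bz2 2>/dev/null; then
    tar_bzip2=j
else
    tar_bzip2=pipe
fi
echo "bzip2 support in this tar: $tar_bzip2"
```

A result of pipe means the "All versions" examples in this section are the ones to use.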

Compress Example 1. (tar version 1.13)

This example will take all the files in the specified directory, compress them using bzip2, then write them into the single file specified. The command switches cvfy are create, verbose, file, and filter through bzip2. As with all the tar examples, the original files are unaltered.

tar cvfy nameoffile.tar.bz2 directoryname/

Compress Example 2. (tar version 1.13.19)

This example will take all the files in the specified directory, compress them using bzip2, then write them into the single file specified. The command switches cvfj are create, verbose, file, and filter through bzip2. As with all the tar examples, the original files are unaltered.

tar cvfj nameoffile.tar.bz2 directoryname/

TIP: OSX 10.3 or later users can use the above compression example, while those with older versions of OSX need to use the following.

Compress Example 3. (All versions)

In this final compression example, the functionality of both the previous ones is replicated, but in place of the -y or -j switches, we are going to use UNIX pipes and the bzip2 tool to compress.

tar cvf - nameoffile | bzip2 -c > compressedfile.tar.bz2

This may look like lots of work, but it's actually quite simple. It creates a new archive containing nameoffile and writes the archive to stdout, which is then piped (the |) into the bzip2 tool, which compresses it, again to stdout, where it's redirected into compressedfile.tar.bz2.

TIP: The nameoffile can be a single file or a directory.

Decompress Example 1. (tar version 1.13)

This example will decompress the specified file, extracting all files and any directories into the current directory, leaving the compressed file unchanged. The command switches xvfy are extract, verbose, file, and filter through bzip2.

tar xvfy nameoffile.tar.bz2

Decompress Example 2. (tar version 1.13.19)

This example will decompress the specified file, extracting all files and any directories into the current directory, leaving the compressed file unchanged. The command switches xvfj are extract, verbose, file, and filter through bzip2.

tar xvfj nameoffile.tar.bz2

TIP: OSX 10.3 or later users can use the above example, while those with older versions need to use the next one.

Decompress Example 3. (All versions)

This next example does exactly the same as the last two, with the exception that it will work with both versions of tar and should also work with those that do not support filtering through bzip2. It works by first running bzip2 with the d and c command switches, which decompress the specified file and write the decompressed data to stdout. It's then piped into tar, which extracts the files and any directories into the current directory, leaving the original compressed file unchanged.

bzip2 -dc nameoffile.tar.bz2 | tar -xvf -

zip/unzip [.zip]

Yes, this is the same as the .zip files you use in the Windows world. These files are handled either by pkzip for UNIX or, more commonly, by zip and unzip, which are included in every Linux I have seen, and may also be part of other OSes such as Solaris or BSD. Just in case you do not have these tools, or just want to learn more, here are links for downloads etc.

For those of you who miss the old DOS pkzip tool, all is not lost: there is a port to Linux, although it's binary only and shareware. Even with those problems, it's included in the links.

Compress Example

This example will compress the specified file, writing the results to the specified zip file. The original file is retained.

zip zipfilename.zip filetocompress

Decompress Example

This example will decompress the specified zip file, the contents of this compressed file including any directories are written to the current directory. The compressed zip file is unchanged.

unzip nameoffile.zip
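A few more switches come up often enough to sketch: zip -r to include a whole directory, unzip -l to list the contents without extracting, and unzip -d to extract somewhere other than the current directory. The sketch assumes zip and unzip may not both be installed and skips itself if so; the names and the zip_result variable are made up.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir directoryname
echo one > directoryname/a.txt

if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    # -r recurses into the directory; without it only the directory
    # entry itself would be stored.
    zip -r zipfilename.zip directoryname

    # -l lists the contents without extracting anything.
    unzip -l zipfilename.zip

    # -d names the directory to extract into.
    unzip zipfilename.zip -d unpacked
    zip_result=done
else
    zip_result=skipped
    echo "zip or unzip is not installed, nothing to demonstrate"
fi
```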

unarc [.arc .ark]

This is one of the very first compression formats, available back in the early days of DOS. These days it's extremely rare to see these files in the Windoze world, let alone under Linux/UNIX. All is not lost: there is a free tool called nomarc that will decompress these things; the download link is below. Things are changing, though: it seems there is a tool that allows these files to be created. It's called SEA arc, and is available from here

Decompress Example

This example will decompress the specified file, the contents are written to the current directory.

nomarc nameoffile.arc

or

nomarc nameoffile.ark

unarj [.arj]

This is one of the formats used in the Windows world, and as far as I can tell there is no way to create these files under anything else. Then again, why would you, when there are so many UNIX formats available? You may not be able to create these files, but you can decompress them with the unarj tool, which so far has decompressed every .arj file I have tried, including things like multiple-file backups. So far unarj has not been included with any distribution of Linux, or with any OS other than Windoze/DOS.

Decompress Example

This example will decompress the specified .arj file, extracting all files and directories into the current directory. The command switch x means extract with pathnames. The original .arj file is unchanged.

unarj x nameoffile.arj

UPDATE: So far I have not been able to make the Open Source arj compile on any of my UNIX/Linux boxes, so cannot really include any usage examples, when I get the time I will have another go, until then read the /docs/compile.txt file for some help.

lha [.lzh .lha]

This format is quite rare these days, even in the Windoze world, but as there are still some files in this format out there, it's still worth covering. There is not much chance that the lha tool that's used to support this format is part of your distribution.

I had a problem with this tool, in that I never made it work. The download is binary only, and when I tried it on my Linux box it dived into the dirt big time. Since then I emailed the author for the source code, and the request was ignored.

In the process of researching this HOWTO, I came across another tool that claimed to support lha compressed files, but this tool also crashed whenever I tried to run it. So until I get a working tool there can be no examples. Sorry, folks.

zoo [.zoo]

This format is another rare one; in fact I have not seen any .zoo files in many years. The tool that handles these files is zoo, and it is very unlikely to be part of any OS these days. All is not lost: you can get a source RPM of the Linux port of this DOS app.

Decompress Example

This example will decompress the specified file, writing the contents to the current directory. The e command switch means extract.

zoo e nameoffile.zoo

rar [.rar]

This is one of the more popular compression formats in use in the Windows world, and it seems to be becoming more popular on other platforms. There used to be only one unofficial tool for decompressing these files under Linux, but now there is an official Linux port available that allows for both compression and decompression, with full support for encrypted and multipart file sets.

I have used this tool extensively and moved files between platforms with no problems whatsoever.

Compress Example 1

This first example will create a compressed file that contains all .txt files in the current directory. All the target files are retained, as are all filenames.

rar a nameoffile.rar *.txt

Compress Example 2

This example will create a compressed file that contains a single file. Just as with the last example, the original file is retained, as is the filename.

rar a nameoffile.rar singlefile

Decompress Example

This example will decompress the specified file; the contents are written to the current directory. The x switch means extract, and you are prompted before any existing files are overwritten.

rar x nameoffile.rar

Stuffit expander [.hqx .sit]

These formats are used mainly in the Mac world, but are well supported under Linux and, so I am told, Windoze, Macs and Solaris, by the Stuffit expander, which despite its name will both compress and decompress.

Now this tool supports more than just the .hqx and .sit files, it can compress into the following formats

zip, compress, uu, hqx, bin, sitseg, sitsegN, pf, sit5

and decompress the following :-

sit, cpt, zip, arc, arj, lha, rar, gz, compress, uu, hqx, toa,
mime, tar, bin, sit, seg, sit, seg, pf, sit5, bz2, apple

WARNING: If you are using Stuffit Deluxe or Standard 8.0 you need to update to 8.01 to avoid a permissions bug.

Now the bad news: this tool is unlike all of the others detailed so far on this page, in that it's neither open source nor free; registration costs USD$29.95. If you need to read Mac compression formats, that is a small price to pay.

There is an Open Source solution for handling these files, although I have not been able to find a working download for the source code. The tool is called macutils and is a suite of tools for handling Mac files under Linux.

UPDATE: Lars Mueller submitted this link for mactools source download. Thanks, Lars.

Mac Self Extractors [.sea]

These files are Mac-specific Self Extracting Archive files, and are not supported at all on anything else. Should this change I will update this page.

Diskimages [.dmg]

These are Mac-specific disk images, and are usually mounted by double clicking from within Finder, or by using the open command from a terminal.

uuencoded [.uue]

These files are uuencoded, which is an old format usually used for things like email file attachments or Usenet files. There are a number of tools that handle these files. Some examples are:

TIP: I would recommend the use of the GNU Sharutils; it's free, open source, and very well documented.
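As a sketch of what working with these files looks like, the uuencode and uudecode tools from Sharutils can be used as follows. The sketch assumes the tools may not be installed and skips itself if so; the filenames and the uu_result variable are invented for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'hello uuencode\n' > nameoffile

if command -v uuencode >/dev/null 2>&1 && command -v uudecode >/dev/null 2>&1; then
    # The first argument is the file to encode, the second is the name
    # recorded inside the encoded data; the result goes to stdout.
    uuencode nameoffile nameoffile > nameoffile.uue

    # uudecode recreates the file under the recorded name, so decode in
    # a separate directory to avoid overwriting the original.
    mkdir decoded
    (cd decoded && uudecode ../nameoffile.uue)

    if cmp -s nameoffile decoded/nameoffile; then
        uu_result=same
    else
        uu_result=different
    fi
else
    uu_result=skipped
    echo "uuencode or uudecode is not installed, nothing to demonstrate"
fi
echo "result: $uu_result"
```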

cabextract [.cab]

Yes, it is what it sounds like: support for decompressing files from Microsoft .cab files. Not that I have ever found a use for this, but it's there for you if you need it.

Decompress Example

This example will extract all the files contained in the cab file into the current working directory.

cabextract nameoffile.cab

NEW: It seems it's actually possible to create these things under Linux. Hans Fredrik Nordhaug submitted the following link for the tool. Thanks, Hans. lcab home (.cab creator)

So far I have not had time to play with this new tool, but have confirmed that it does build under Linux (gcc 2.95-3) and OSX 10.3 (gcc 3.3) without problems. When time allows I will do some tests and add usage examples here.

ace [.ace]

I have never seen one of these files, but I have received several emails asking for help decompressing them under Linux. The format is a Windoze thing, but the author has released a Linux port, which can decompress but not compress.

I have no ace files, so until I get hold of something to test with, I am not able to include any examples.

rpm [.rpm]

Yes, I know this has always been what amounts to a Linux-specific format, but that is no longer the case, as these files can now be worked with under OSX using a tool called RPMinator, that's available from here

It seems this is not the only rpm tool available for OSX, with a command line version available from here

I have no idea why anyone would want to use RPMs under OSX, but I thought it worth including it here after a number of people actually requested it.
