Changes to this HOWTO, including the change of licence, were allowed by the author (http://lists.tldp.org/go.to?list=discuss&cmd=showmsg&msgnum=11737).

Updates to 2.0

This is a summary of my updates and my reasons for them:

  • Changed title from Bzip mini-HOWTO to Bzip HOWTO since there really isn't any competition for this HOWTO or a more complete version out there.

  • Turned David Fetter's email address from a heading into an 'authors' section
  • Fixed some of the headings hierarchy and nesting, but I have some work to go
  • Removed linktexts since they only have any use in hyperlinked versions of the HOWTO - these links will be useless in pure text releases, for example.
  • Changes to the Introduction
    • Removed mention of the bzip2 algorithm being 'new' since it dates the HOWTO - it was new in 1999 but not now.
    • Added an Assumptions section so the reader knows whether s/he is qualified enough to understand and use the HOWTO. I think all of the HOWTOs should have one of these.
    • Moved the 'future editions' comments into its own section since it doesn't really belong in an introduction. Besides, I doubt that any of these 'future editions' will be implemented anyway.
    • Updated e-mail addresses and links where Google gave me new ones. Ones that I couldn't update have been left blank. The whole section on 'in your language' should be taken out since TLDP should be hosting these copies!
  • I provided a more verbose explanation on compiling from make since I don't think this is common knowledge
  • Changed 'using bzip2 by itself' to a more useful copy-and-paste of the program's basic features. Someone using this HOWTO has probably already consulted the manpage and needs more hand holding.
  • Updated the tar commands. tar now provides a -j switch to use bzip2, and --use-compress-program seems to have been removed (at least from my version of tar).

BordenRhodes

Bzip2 HOWTO

Licence

      Permission is granted to copy, distribute and/or modify this
      document under the terms of the GNU Free Documentation License,
      Version 1.2 or any later version published by the Free Software
      Foundation; with no Invariant Sections, no Front-Cover Texts and
      no Back-Cover Texts.  A copy of the license is included in the
      section entitled "GNU Free Documentation License".

GNU Free Documentation License

Author

David Fetter, david@fetter.org

v2.00, 22 August 1999


This document explains how to use the bzip2 compression program.

1. Introduction

Bzip2 is a groovy algorithm for compressing data. It generally makes files that are 60-70% of the size of their gzip'd counterparts.

This document will demonstrate how to install bzip2 and common uses of the program.
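
The ratio quoted above depends heavily on the input, so it is worth measuring on your own data. The following is a minimal sketch, assuming gzip, bzip2 and awk are all on your PATH. The sample file, its contents and the variable names are invented for illustration only.

```shell
# Compare gzip and bzip2 output sizes for one sample file.
workdir=$(mktemp -d)

# Generate some repetitive sample text (substitute a real file of your own).
awk 'BEGIN { for (i = 1; i <= 20000; i++) print "line", i, "of sample text" }' \
    > "$workdir/sample.txt"

# -c writes to standard output, so the original file is left untouched.
gzip  -9 -c "$workdir/sample.txt" > "$workdir/sample.txt.gz"
bzip2 -9 -c "$workdir/sample.txt" > "$workdir/sample.txt.bz2"

gz_size=$(wc -c < "$workdir/sample.txt.gz")
bz_size=$(wc -c < "$workdir/sample.txt.bz2")

echo "gzip:  $gz_size bytes"
echo "bzip2: $bz_size bytes"
echo "bzip2 output is $(( 100 * $bz_size / $gz_size ))% of the gzip output"
```

Expect different percentages for different kinds of data; already-compressed input such as JPEG images will show little or no gain.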

1.1 Assumptions

This HOWTO assumes that you have a working operating system and you know how to run commands from a terminal or command line. Commands are presented as they would appear on a *nix-based operating system, such as BSD, Linux or Mac OS X. Windows and DOS users should be able to use this HOWTO as well. However, these users may need to modify some commands.

1.2 Revision History

v2.00

Changed the Using bzip2 with less section so .tar.bz2 files can actually be read. Thanks to Nicola Fabiano for the correction.

Updated buzzit utility.

Updated tar information.

v1.92

Updated the Getting bzip2 binaries section, including adding S.u.S.E.'s.

v1.91

Corrected a typo and clarified some shell idioms in the section on using bzip2 with tar. Thanks to Alessandro Rubini for these.

Updated the buzzit tool not to stomp on the original bzip2 archive.

Added bgrep, a zgrep-like tool.

v1.9

Clarified the gcc 2.7.* problem. Thanks to Ulrik Dickow for pointing this out.

Added Leonard Jean-Marc's elegant way to work with tar.

Added Linus Åkerlund's Swedish translation.

Fixed the wu-ftpd section per Arnaud Launay's suggestion.

Moved translations to their own section.

v1.8

Put buzzit and tar.diff in the sgml where they belong. Fixed punctuation and formatting. Thanks to Arnaud Launay for his help correcting my copy. :-)

Dropped xv project for now due to lack of popular interest.

Added teasers for future versions of the document.

v1.7

Added buzzit utility. Fixed the patch against gnu tar.

v1.6

Added TenThumbs' Netscape enabler.

Also changed lesspipe.sh per his suggestion. It should work better now.

v1.5

Added Arnaud Launay's French translation, and his wu-ftpd file.

v1.4

Added Tetsu Isaji's Japanese translation.

v1.3

Added Ulrik Dickow's .emacs for 19.30 and higher.

(Also corrected jka-compr.el patch for emacs per his suggestion. Oops! Bzip2 doesn't yet(?) have an "append" flag.)

v1.2

Changed patch for emacs so it automagically recognizes .bz2 files.

v1.1

Added patch for emacs.

v1.0

Round 1.

1.3 Future versions

Future versions of the document will have applications of libbzip2, the bzip2 C library which bzip2's author, Julian Seward, has kindly written. The bzip2 manual, which includes low-level information about the library, can be found at http://bzip.org/docs.html.

Future versions of the document may also include a summary of the discussion over whether (and how) bzip2 should be used in the Linux kernel.

2. Getting Bzip2

Bzip2's home page is at http://www.bzip.org/.

2.1 Bzip2-HOWTO in your language

French speakers may wish to refer to Arnaud Launay's French documents. The web version is at http://launay.org/HOWTO/Bzip2.fr.html. Arnaud can be contacted by electronic mail at asl@launay.org.

Japanese speakers may wish to refer to Tetsu Isaji's Japanese documents. Isaji can be reached at his home page or by electronic mail.

Swedish speakers may wish to refer to Linus Åkerlund's Swedish documents. Linus can be reached by electronic mail.

2.2 Getting bzip2 precompiled binaries

Almost every distribution of Linux and Mac OS should come with bzip2 ready to use. If yours somehow does not, search your distribution's repositories for "bzip2."

A binary executable for Windows is available from the GNUWin32 project at http://gnuwin32.sourceforge.net/packages/bzip2.htm.

2.3 Getting bzip2 sources

The program's full source code and documentation are available from http://bzip.org/downloads.html.

2.4 Compiling bzip2 for your machine

Unpack the source archive into a directory of your choice. Tarball handling is beyond the scope of this HOWTO, but on *nix platforms, tar xzvf <path-to-bzip2-file>.tar.gz will extract the archive into a subdirectory of the current directory. Once the archive is unpacked, the README file in the bzip2 source code directory has directions on how to compile and/or install bzip2 on your system.

For the impatient, from a console or terminal run the following commands:

cd <path-to-decompressed-bzip2-file>
make

The program will create two new binary executables, bzip2 and bzip2recover within that directory. If these commands do not work for you, consult the README.
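
The unpack-and-build steps above can be wrapped in a small shell function. This is only a sketch: build_bzip2 is a hypothetical name, and it assumes the tarball is gzip-compressed and unpacks into a single top-level directory. It builds in a temporary directory and installs nothing.

```shell
# Hypothetical helper: unpack a bzip2 source tarball and build it with make.
# Prints the path of the build directory on success.
build_bzip2() {
    tarball=$1
    if [ ! -f "$tarball" ]; then
        echo "build_bzip2: no such file: $tarball" >&2
        return 1
    fi

    builddir=$(mktemp -d) || return 1
    # x = extract, z = gunzip first, f = read from the named file
    tar xzf "$tarball" -C "$builddir" || return 1

    # Assumption: the tarball unpacks into one top-level directory.
    srcdir=$(find "$builddir" -mindepth 1 -maxdepth 1 -type d | head -n 1)
    [ -n "$srcdir" ] || return 1

    # Send make's chatter to stderr so stdout carries only the path.
    (cd "$srcdir" && make >&2) || return 1

    # Both executables should now exist in the source directory.
    [ -x "$srcdir/bzip2" ] && [ -x "$srcdir/bzip2recover" ] || return 1
    echo "$srcdir"
}
```

Call it as build_bzip2 <path-to-bzip2-file>.tar.gz. If it fails, the README remains the authoritative guide.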

3. Using bzip2 by itself

Bzip2's man page is the definitive source for the command line arguments and options. In the simplest case, bzip2 -z foo or bzip2 --compress foo will replace foo with a compressed version of itself, foo.bz2. Conversely, bzip2 -d foo.bz2 or bzip2 --decompress foo.bz2 will replace foo.bz2 with the decompressed version of itself, foo.
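
A short round trip may make these options easier to remember. The sketch below works in a throwaway temporary directory; the file name foo matches the text above, while workdir is just an illustrative variable. It also uses three other standard bzip2 options: -t (test), -c (write to standard output) and -k (keep the input file).

```shell
workdir=$(mktemp -d)
printf 'hello bzip2\n' > "$workdir/foo"

bzip2 -z "$workdir/foo"          # foo is replaced by foo.bz2
bzip2 -t "$workdir/foo.bz2"      # test the archive's integrity; silent on success
bzip2 -dc "$workdir/foo.bz2"     # decompress to standard output; foo.bz2 is kept
bzip2 -dk "$workdir/foo.bz2"     # decompress to foo, keeping foo.bz2 as well

ls "$workdir"                    # both foo and foo.bz2 are now present
```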

4. Using bzip2 with tar

Listed below are two ways to use bzip2 with tar:

4.1 Easiest to set up

This method requires no setup at all. To un-tar the bzip2'd tar archive foo.tar.bz2 in the current directory, run either

 /path/to/bzip2 -cd foo.tar.bz2 | tar xf -

or

 tar xjf foo.tar.bz2

Both work, but can be a nuisance to type often.
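
For completeness, here is a sketch of the whole cycle: creating, listing and extracting a bzip2'd archive. It assumes a tar that understands -j (GNU tar and bsdtar both do) and uses made-up directory names under a temporary directory.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/project" "$workdir/out1" "$workdir/out2"
printf 'some notes\n' > "$workdir/project/notes.txt"

# Create: c = create, j = filter through bzip2, f = archive file name.
tar cjf "$workdir/foo.tar.bz2" -C "$workdir" project

# List the contents without extracting anything.
tar tjf "$workdir/foo.tar.bz2"

# Extract with tar's built-in bzip2 support...
tar xjf "$workdir/foo.tar.bz2" -C "$workdir/out1"

# ...or with an explicit pipe, which also works with a tar that lacks -j.
bzip2 -cd "$workdir/foo.tar.bz2" | tar xf - -C "$workdir/out2"
```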

4.2 Easy to set up, fairly easy to use, no need for root privileges

Thanks to Leonard Jean-Marc for the tip. Thanks also to Alessandro Rubini for differentiating bash from the csh's.

In your ~/.bashrc, you can put in a line like this:

 alias btar='tar -j ' 

In your ~/.tcshrc, or ~/.cshrc, the analogous line looks like this:

 alias btar 'tar -j ' 
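
Two caveats apply when using such an alias. First, tar only recognises the dash-less option style (as in tar xjf) in its first argument, so the options that follow the alias need a leading dash, for example btar -xf foo.tar.bz2. Second, bash does not normally expand aliases in non-interactive scripts. A shell function is one possible alternative that also works in scripts. The sketch below reuses the name btar purely for illustration and runs in a temporary directory.

```shell
# Function equivalent of the btar alias, for sh-compatible shells.
btar() {
    tar -j "$@"
}

workdir=$(mktemp -d)
mkdir "$workdir/docs" "$workdir/unpacked"
printf 'read me\n' > "$workdir/docs/README"

btar -cf "$workdir/docs.tar.bz2" -C "$workdir" docs       # create
btar -tf "$workdir/docs.tar.bz2"                          # list
btar -xf "$workdir/docs.tar.bz2" -C "$workdir/unpacked"   # extract
```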

5. Using bzip2 with less

To uncompress bzip2'd files on the fly, i.e. to be able to use "less" on them without first bunzip2'ing them, you can make a lesspipe.sh (man less) like this:

#!/bin/sh
# This is a preprocessor for 'less'.  It is used when this environment
# variable is set:   LESSOPEN="|lesspipe.sh %s"

  case "$1" in
  *.tar) tar tvvf "$1" 2>/dev/null ;; # View contents of various tar'd files
  *.tgz) tar tzvvf "$1" 2>/dev/null ;;
# This one works for the unmodified version of tar:
  *.tar.bz2) bzip2 -cd "$1" 2>/dev/null | tar tvvf - ;;
# This one works with the patched version of tar:
# *.tar.bz2) tar tyvvf "$1" 2>/dev/null ;;
  *.tar.gz) tar tzvvf "$1" 2>/dev/null ;;
  *.tar.Z) tar tzvvf "$1" 2>/dev/null ;;
  *.tar.z) tar tzvvf "$1" 2>/dev/null ;;
  *.bz2) bzip2 -dc "$1" 2>/dev/null ;; # View compressed files correctly
  *.Z) gzip -dc "$1" 2>/dev/null ;;
  *.z) gzip -dc "$1" 2>/dev/null ;;
  *.gz) gzip -dc "$1" 2>/dev/null ;;
  *.zip) unzip -l "$1" 2>/dev/null ;;
  *.1|*.2|*.3|*.4|*.5|*.6|*.7|*.8|*.9|*.n|*.man) FILE=`file -L "$1"` ; # groff src
    FILE=`echo $FILE | cut -d ' ' -f 2`
    if [ "$FILE" = "troff" ]; then
      groff -s -p -t -e -Tascii -mandoc "$1"
    fi ;;
  *) cat "$1" 2>/dev/null ;;
#  *) FILE=`file -L $1` ; # Check to see if binary, if so -- view with 'strings'
#    FILE1=`echo $FILE | cut -d ' ' -f 2`
#    FILE2=`echo $FILE | cut -d ' ' -f 3`
#    if [ "$FILE1" = "Linux/i386" -o "$FILE2" = "Linux/i386" \
#         -o "$FILE1" = "ELF" -o "$FILE2" = "ELF" ]; then
#      strings $1
#    fi ;;
  esac
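
Before pointing less at a preprocessor, it can help to run the script by hand and see what it prints. The sketch below writes a deliberately cut-down preprocessor (only the two bzip2 cases) into a temporary directory, tries it on a sample file, and then sets LESSOPEN. The paths and file names are invented for illustration.

```shell
workdir=$(mktemp -d)

# A trimmed-down preprocessor containing just the bzip2 cases.
cat > "$workdir/lesspipe.sh" <<'EOF'
#!/bin/sh
case "$1" in
  *.tar.bz2) bzip2 -cd "$1" 2>/dev/null | tar tvf - ;;
  *.bz2)     bzip2 -cd "$1" 2>/dev/null ;;
esac
EOF
chmod +x "$workdir/lesspipe.sh"

printf 'line one\nline two\n' > "$workdir/notes.txt"
bzip2 "$workdir/notes.txt"

# Run the preprocessor directly to see what less would display.
sh "$workdir/lesspipe.sh" "$workdir/notes.txt.bz2"

# Tell less to use it; %s is replaced by the name of the file being viewed.
LESSOPEN="|$workdir/lesspipe.sh %s"
export LESSOPEN
```

After this, less "$workdir/notes.txt.bz2" should show the decompressed text. To make the setting permanent, the export would go in a shell startup file, with the script stored somewhere more durable than a temporary directory.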

6. Using bzip2 with emacs

6.1 Changing emacs for everyone:

I've written the following patch to jka-compr.el which adds bzip2 to auto-compression-mode.

Disclaimer: I have only tested this with emacs-20.2, but have no reason to believe that a similar approach won't work with other versions.

To use it,

  1. Go to the emacs-20.2/lisp source directory (wherever you untarred it)
  2. Put the patch below in a file called jka-compr.el.diff (it should be alone in that file ;).
  3. Do  patch < jka-compr.el.diff
  4. Start emacs, and do  M-x byte-compile-file jka-compr.el
  5. Leave emacs.
  6. Move your original jka-compr.elc to a safe place in case of bugs.
  7. Replace it with the new jka-compr.elc.
  8. Have fun!

--- jka-compr.el        Sat Jul 26 17:02:39 1997
+++ jka-compr.el.new    Thu Feb  5 17:44:35 1998
@@ -44,7 +44,7 @@
 ;; The variable, jka-compr-compression-info-list can be used to
 ;; customize jka-compr to work with other compression programs.
 ;; The default value of this variable allows jka-compr to work with
-;; Unix compress and gzip.
+;; Unix compress and gzip.  David Fetter added bzip2 support :)
 ;;
 ;; If you are concerned about the stderr output of gzip and other
 ;; compression/decompression programs showing up in your buffers, you
@@ -121,7 +121,9 @@

 ;;; I have this defined so that .Z files are assumed to be in unix
-;;; compress format; and .gz files, in gzip format.
+;;; compress format; and .gz files, in gzip format, and .bz2 files,
+;;; in the snappy new bzip2 format from http://www.muraroa.demon.co.uk.
+;;; Keep up the good work, people!
 (defcustom jka-compr-compression-info-list
   ;;[regexp
   ;; compr-message  compr-prog  compr-args
@@ -131,6 +133,10 @@
      "compressing"    "compress"     ("-c")
      "uncompressing"  "uncompress"   ("-c")
      nil t]
+    ["\\.bz2\\'"
+     "bzip2ing"        "bzip2"         ("")
+     "bunzip2ing"      "bzip2"         ("-d")
+     nil t]
     ["\\.tgz\\'"
      "zipping"        "gzip"         ("-c" "-q")
      "unzipping"      "gzip"         ("-c" "-q" "-d")

6.2 Changing emacs for one person:

Thanks for this one go to Ulrik Dickow, ukd@kampsax.dk, Systems Programmer at Kampsax Technology:

To make it so you can use bzip2 automatically when you aren't the sysadmin, just add the following to your .emacs file.

;; Automatic (un)compression on loading/saving files (gzip(1) and similar)
;; We start it in the off state, so that bzip2(1) support can be added.
;; Code thrown together by Ulrik Dickow for ~/.emacs with Emacs 19.34.
;; Should work with many older and newer Emacsen too.  No warranty though.
;;
(if (fboundp 'auto-compression-mode) ; Emacs 19.30+
    (auto-compression-mode 0)
  (require 'jka-compr)
  (toggle-auto-compression 0))
;; Now add bzip2 support and turn auto compression back on.
(add-to-list 'jka-compr-compression-info-list
             ["\\.bz2\\(~\\|\\.~[0-9]+~\\)?\\'"
              "zipping"        "bzip2"         ()
              "unzipping"      "bzip2"         ("-d")
              nil t])
(toggle-auto-compression 1 t)

7. Using bzip2 with wu-ftpd

Thanks to Arnaud Launay for this bandwidth saver. The following should go in /etc/ftpconversions to do on-the-fly compressions and decompressions with bzip2. Make sure that the paths (like /bin/compress) are right.

 :.Z:  :  :/bin/compress -d -c %s:T_REG|T_ASCII:O_UNCOMPRESS:UNCOMPRESS
 :   : :.Z:/bin/compress -c %s:T_REG:O_COMPRESS:COMPRESS
 :.gz: :  :/bin/gzip -cd %s:T_REG|T_ASCII:O_UNCOMPRESS:GUNZIP
 :   : :.gz:/bin/gzip -9 -c %s:T_REG:O_COMPRESS:GZIP
 :.bz2: :  :/bin/bzip2 -cd %s:T_REG|T_ASCII:O_UNCOMPRESS:BUNZIP2
 :   : :.bz2:/bin/bzip2 -9 -c %s:T_REG:O_COMPRESS:BZIP2
 :   : :.tar:/bin/tar -c -f - %s:T_REG|T_DIR:O_TAR:TAR
 :   : :.tar.Z:/bin/tar -c -Z -f - %s:T_REG|T_DIR:O_COMPRESS|O_TAR:TAR+COMPRESS
 :   : :.tar.gz:/bin/tar -c -z -f - %s:T_REG|T_DIR:O_COMPRESS|O_TAR:TAR+GZIP
 :   : :.tar.bz2:/bin/tar -c -j -f - %s:T_REG|T_DIR:O_COMPRESS|O_TAR:TAR+BZIP2
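
One way to check that the paths are right is a quick loop over the programs the file refers to. This is a sketch only: check_paths is a hypothetical helper, and the four paths are simply the ones used in the listing above.

```shell
# Hypothetical helper: report which of the given programs exist and are
# executable. Returns non-zero if any of them is missing.
check_paths() {
    missing=0
    for prog in "$@"; do
        if [ -x "$prog" ]; then
            echo "ok       $prog"
        else
            echo "MISSING  $prog"
            missing=1
        fi
    done
    return "$missing"
}

# The four programs named in the ftpconversions entries above.
check_paths /bin/compress /bin/gzip /bin/bzip2 /bin/tar ||
    echo "Adjust the missing paths in /etc/ftpconversions."
```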

8. Using bzip2 with grep

The following utility, which I call bgrep, is a slight modification of the zgrep which comes with Linux. You can use it to grep through files without bunzip2'ing them first.

#!/bin/sh
# bgrep -- a wrapper around a grep program that decompresses files as needed
PATH="/usr/bin:$PATH"; export PATH

prog=`echo $0 | sed 's|.*/||'`
case "$prog" in
        *egrep) grep=${EGREP-egrep}     ;;
        *fgrep) grep=${FGREP-fgrep}     ;;
        *)      grep=${GREP-grep}       ;;
esac
pat=""
while test $# -ne 0; do
  case "$1" in
  -e | -f) opt="$opt $1"; shift; pat="$1"
           if test "$grep" = grep; then  # grep is buggy with -e on SVR4
             grep=egrep
           fi;;
  -*)      opt="$opt $1";;
   *)      if test -z "$pat"; then
             pat="$1"
           else
             break;
           fi;;
  esac
  shift
done

if test -z "$pat"; then
  echo "grep through bzip2 files"
  echo "usage: $prog [grep_options] pattern [files]"
  exit 1
fi

list=0
silent=0
op=`echo "$opt" | sed -e 's/ //g' -e 's/-//g'`
case "$op" in
  *l*) list=1
esac
case "$op" in
  *h*) silent=1
esac

if test $# -eq 0; then
  bzip2 -cd | $grep $opt "$pat"
  exit $?
fi

res=0
for i do
  if test $list -eq 1; then
    bzip2 -cdfq "$i" | $grep $opt "$pat" > /dev/null && echo $i
    r=$?
  elif test $# -eq 1 -o $silent -eq 1; then
    bzip2 -cd "$i" | $grep $opt "$pat"
    r=$?
  else
    bzip2 -cd "$i" | $grep $opt "$pat" | sed "s|^|${i}:|"
    r=$?
  fi
  test "$r" -ne 0 && res="$r"
done
exit $res
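
Stripped of its option handling, the core of bgrep is a decompress-and-pipe loop. The sketch below shows just that idea. bz_search is a hypothetical name, and the function assumes file names that contain neither | nor & (both are special in the sed expression).

```shell
# Hypothetical helper: grep for a pattern in bzip2'd files, prefixing each
# matching line with the name of the file it came from.
bz_search() {
    pattern=$1
    shift
    for f in "$@"; do
        bzip2 -cd "$f" | grep -e "$pattern" | sed "s|^|$f:|"
    done
}

workdir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$workdir/a.txt"
printf 'delta\nbeta max\n'    > "$workdir/b.txt"
bzip2 "$workdir/a.txt" "$workdir/b.txt"

(cd "$workdir" && bz_search beta a.txt.bz2 b.txt.bz2)
```

Unlike bgrep, this sketch does not pass grep options through and does not report grep's exit status.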

9. Using bzip2 with Netscape under X

tenthumbs@cybernex.net says:

 I also found a way to get Linux Netscape to use bzip2 for Content-Encoding just as it uses gzip. Add this to $HOME/.Xdefaults or $HOME/.Xresources. I use the -s option because I would rather trade some decompression speed for lower RAM usage. You can leave the option out if you want to.

Netscape*encodingFilters:      \
        x-compress :  : .Z     : uncompress -c  \n\
        compress   :  : .Z     : uncompress -c  \n\
        x-gzip     :  : .z,.gz : gzip -cdq      \n\
        gzip       :  : .z,.gz : gzip -cdq      \n\
        x-bzip2    :  : .bz2   : bzip2 -ds \n

10. Using bzip2 to recompress other compression formats

The following Perl program takes files compressed in other formats (.tar.gz, .tgz, .tar.Z, and .Z for this iteration) and repacks them for better compression. The Perl source has all kinds of neat documentation on what it does and how it does what it does. This latest version takes files as input on the command line. Without command line arguments, it tries to repack every file in the current working directory.

#######################################################
#                                                     #
# This program takes compressed and gzipped programs  #
# in the current directory and turns them into bzip2  #
# format.  It handles the .tgz extension in a         #
# reasonable way, producing a .tar.bz2 file.          #
#                                                     #
#######################################################
$counter = 0;
$saved_bytes = 0;
$totals_file = '/tmp/machine_bzip2_total';
$machine_bzip2_total = 0;

# Use the command-line arguments if there are any, else every file here.
@raw = @ARGV ? @ARGV : <*>;

foreach(@raw) {
    next if /^bzip/;
    next unless /\.(tgz|gz|Z)$/;
    push @files, $_;
}
$total = scalar(@files);

foreach (@files) {
    if (/tgz$/) {
        ($new=$_) =~ s/tgz$/tar.bz2/;
    } else {
        ($new=$_) =~ s/\.g?z$/.bz2/i;
    }
    $orig_size = (stat $_)[7];
    ++$counter;
    print "Repacking $_ ($counter/$total)...\n";
    if ((system "gzip -cd $_ |bzip2 >$new") == 0) {
        $new_size = (stat $new)[7];
        $factor = int(100*$new_size/$orig_size+.5);
        $saved_bytes += $orig_size-$new_size;
        print "$new is about $factor% of the size of $_. :",($factor<100)?')':'(',"\n";
        unlink $_;
    } else {
        print "Arrgghh!  Something happened to $_: $!\n";
    }
}
print "You've "
    , ($saved_bytes>=0)?"saved ":"lost "
    , abs($saved_bytes)
    , " bytes of storage space :"
    , ($saved_bytes>=0)?")":"("
    , "\n"
    ;

unless (-e '/tmp/machine_bzip2_total') {
    system ('echo "0" >/tmp/machine_bzip2_total');
    system ('chmod', '0666', '/tmp/machine_bzip2_total');
}

chomp($machine_bzip2_total = `cat $totals_file`);
open TOTAL, ">$totals_file"
     or die "Can't open system-wide total: $!";
$machine_bzip2_total += $saved_bytes;
print TOTAL $machine_bzip2_total;
close TOTAL;

print "That's a machine-wide total of ",`cat $totals_file`," bytes saved.\n";
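
For a single file, the repacking step at the heart of the program can also be sketched in plain shell. The function below is an illustration, not a replacement for the Perl program: repack_gz is a hypothetical name, it handles only .gz and .tgz, it overwrites any existing target file, and it keeps no running totals. It does check both archives with the programs' own -t options before deleting the original.

```shell
# Hypothetical helper: repack one gzip file as bzip2, removing the original
# only after the new file has passed bzip2's integrity test.
repack_gz() {
    src=$1
    case "$src" in
        *.tgz) dst=${src%.tgz}.tar.bz2 ;;
        *.gz)  dst=${src%.gz}.bz2 ;;
        *)     echo "repack_gz: skipping $src" >&2; return 1 ;;
    esac

    gzip -t "$src" || return 1                    # refuse damaged input
    gzip -cd "$src" | bzip2 > "$dst" || return 1  # recompress
    bzip2 -t "$dst" || return 1                   # verify before deleting
    rm -f "$src"
    echo "$src -> $dst"
}

workdir=$(mktemp -d)
printf 'payload\n' > "$workdir/data.txt"
gzip "$workdir/data.txt"
repack_gz "$workdir/data.txt.gz"
```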

Bzip2 (last edited 2011-01-09 01:47:52 by BordenRhodes)