CS 1723 Section 1T, Summer 1998 Assignment 7: Huffman Coding

Write a program to decode a file encoded with the program huffman.c.

huffman.c reads a text file named on the command line and counts all occurences of each of the 255 possible byte combinations in the file in an array called freqs[256]. For example, if there are 14 letter 'A's in the file, then freqs['A'] == 14. The program also counts the number of characters in the file, calling this quantity n. It then builds a Huffman tree from the frequency data. An output file is opened with extension .huf. Before coding the file, the program writes the freq array of 256 integers (1024 bytes on the Sun) and the integer n (four bytes on the Sun) to the output file, using C statements equivalent to:

	fwrite (freqs, 256, sizeof (int), outputfile);
	fwrite (&n, 1, sizeof (int), outputfile);
Then the input file is read again and Huffman coded, and the resulting bits are written to the output file.

Your program should read a Huffman coded input file named on the command line (just like huffman.c), then decode the file, writing it to standard output (i.e. not another file opened with fopen()). You may borrow any code you wish from huffman.c.

Your program should do the following:

Test your program by compressing files with huffman.c and decompressing them with your program.

To turn in your program:

Note: huffman.c and your program work with data in binary format. This means that you may have to do something system dependent to get them to work on a non-Unix operating system (e.g., use fopen (..., "rb") instead of "r"). Also, note that the SPARC is a "big-endian" system; integers are stored with the most significant byte in the first position. You can still use huffman.c on a "little-endian" system, where integers are stored least significant byte first, but compressed files will not be portable to big-endian systems. So you should do your final testing and decompression of the test files on a big-endian machine like runner. The Motorola 68000 series and PowerPC processors (i.e., Macintosh computers) and MIPS-based Silicon Graphics computers are big-endian and thus compatible with runner. Intel processors (e.g. Pentium) are little-endian and thus not compatible.

This program is due by midnight, Monday, August 3, 1998.