home *** CD-ROM | disk | FTP | other *** search
-
-
- HUFF
- A dynamic huffman cruncher/decruncher
- Version 0.62
- Copyright 1992 by M.Zimmermann
-
-
-
- License/Disclaimer
- ------------------
-
- This library may be freely distributed with the XPK compression package,
- as long as it is kept in its original, complete, and unmodified form. It
- may not be distributed by itself or in a commercial package of any kind
- without my permission.
-
- This program is distributed in the hope that it will be useful, but
- WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
- or FITNESS FOR A PARTICULAR PURPOSE.
-
-
-
- About Huffmans
- --------------
-
- The idea of a huffman crunch is as follows: often used bytes (ie 8 bit
- codes) get a shorter code than 8 bits, fi 5 bits. So everytime one of these
- bytes occurs in the source file I save (in this example) 3 bits in the dest
- file. To find out the optimum codes for each byte there is a simple method:
- A tree is to be constructed so that the path from the root to each leaf
- reflects the optimum (ie huffman) code. Unfortunately most computers (the
- Amiga, too) are byte-oriented, which means a rather complex handling of
- codes that are not a multiple of 8 bits. This results in pretty slow
- compression & decompression. So this means that the xpkHUFF.library
- probably won't be used for normal circumstances, but, as Dominik stated, it
- may serve well as an example library.
-
- There are three different huffman crunch algorithms known:
-
- · static compression/decompression
- · dynamic compression/decompression
- · adaptive compression/decompression
-
-
- What are the differences?
-
- The static huffman uses a fix table to represent each byte of the source.
- This, of course, makes sense only, if the structure of the data to be
- crunched is known. In this case (for instance crunching an english text) a
- fix table of codes is embedded in the code. Crunching other data than what
- the table was generated for will probably result in worse compression or
- even expansion.
-
- This is what a dynamic huffman is avoiding: it first creates a
- statistics table reflecting the frequency every byte occurs with and
- generates a special tree/table for this case, so the dynamic huffman does a
- good compression for this special data.
-
- But there is something that can be improved, anyway: imagine, there is a
- data block which contains many 'a's in it's first part and many 'z's in the
- last part.... The dynamic huffman would represent the 'a's and 'z's with
- short codes, of course. But it probably would be even better if the
- crunch/decrunch tree would reflect the *current* data beeing processed
- instead of the whole block, thus in resulting shorter codes for 'a' and 'z',
- depending of the position in the data block. This is what an adaptive
- huffman deals with: it creates the tree while crunching, without doing any
- statistics or tree creation before crunching. It permanently updates it's
- internal huffman tree. Therefore it doesn't have to include the information
- about the decrunch tree in the crunched data.
-
-
-
- Final words about huffmans ...
- ------------------------------
-
- A stand-alone huffman will never achieve crunch results as fine as those
- reached with most other crunchers, for these will not only regard the number
- of occurances for each byte (as huffman does), but sequels of byte, too.
- This means: If you create all permutations of a datablock, the huffman
- crunched will always have the same length. Others won't, as they are
- depending on the order of the crunched data, too.
-
-
-
- Description
- -----------
-
- The library 'xpkHUFF.library' implements a dynamic huffman crunch
- algorithm, even though the adaptive might result in slightly better crunch
- results. However, this is more complex to implement and I'm using a maximum
- buffer size of 64K, so this is a little bit like an adaptive huffman for
- large files.
-
- If I should have lots of spare time I will probably implement an adaptive
- huffman crunch algorithm. This new library will be called xpkHUFF, too, and
- new xpkHUFF.libraries will always handle output generated by earlier
- versions.
-
- The xpkHUFF.library supports a totally unsafe (but a little bit better
- than simple eor :-) encryption. Please note that crunch/decrunch speeds
- decrease when encryption is used.
-
-
-
- The source ...
- --------------
-
- The source is provided, so you might have a look at different decrunch
- codes. Please note: The only code I tested intensively is the byte/cache
- decrunch code, which was used to create the library included in this
- package. For it's the fasted code, you probably won't want to use another
- one. However, this might help you to understand what huffmans are about.
-
- If you have never written a huffman, have a look at it, a huffman code is
- a pretty good idea. If you *have* written a huffman code and it's *faster*
- than mine, please let me know, I will then update xpkHUFF.library.
-
-
-
- Implementation
- --------------
-
- If you should see an errormessage saying output buffer too small while
- crunching *and* encrypting, this means you tried to crunch and encrypt a
- file that would crunched and encrypted be larger than the original file.
- This should occur only with very small files (for I have a minimum file size
- due to tables) or with files that have been crunched already and therefore
- would expand during crunch.
-
- A technical note: this could also happen, if the last chunk of a file to
- be crunched/encrypted would be dimensioned too small by xpkmaster.library.
-
- However, in this case you cannot encrypt the file. I know this could be
- annoying and will think about a solution for this problem, but remember:
- this encryption would not be safe, better if you used FEAL or IDEA for
- secure encryption.
-
-
-
- Statistics
- ----------
-
- Following is a table briefly listing some comparative statistics for HUFF
- without encryption. These were generated by xBench on the standard XPK
- benchmark system (A3000/25 with SCRAM, using the AmigaVision executable as
- data). Note that memory needs don't include xpkmaster.library's buffers.
-
- Method Packing Unpacking Packing Unpacking Compression
- Memory Memory Speed Speed Ratio
- ------ ------- --------- ------- --------- -----------
- HUFF 30K 71K 88 K/s 138 K/s 24.1%
-
-
- Where unpack speed varies depending on decrunch code (refer to source for
- that).
-
-
- Last words ...
- --------------
-
- I tried hard to debug this library with range checking while writing
- bytes on crunching, and so on, but as in every code larger than, say 10
- lines :-), there will be bugs. I don't know any bugs in this version, but
- if you should meet one, please let me know via email (refer to end of this
- document for my email adr). As usually, reproducable bugs are preferred.
- Please add your configuration, programs running (best if you try without
- startup-sequence!), and, most important of all, add the file you tried to
- crunch! Thank you.
-
-
-
- Version History
- ---------------
-
- ; V 0.1 - 12-Jul-1992 : first version
- ; V x.yy - 18-Jul-1992 : first OK version
- ; V x.yy - 19-Jul-1992 : sped up decrunching
- ; V x.yy - 21-Jul-1992 : bug fixed in word/long decrunching: min pack
- ; chunk size now 3/5
- ; V x.yy - 21-Jul-1992 : replaced many subq/bxx with dbf (ie sped up
- ; crunching a little bit), bug fixed: there was
- ; a dbf counter wrong (one of my favorite 0/1
- ; problem bugs)
- ; V 0.50 - 29-Jul-1992 : added 68030+ cache optimized decrunch code
- ; V 0.51 - 01-Aug-1992 : byte decrunch improved, first code added,
- ; indicator byte for crunchmethod used added,
- ; 68030+ chache optimized code does not make
- ; sense any more, since byte decrunch fits to
- ; cache completely, now
- ; V 0.52 - 01-Aug-1992 : unsafe encryption supported
- ; V 0.53 - 03-Aug-1992 : slight improvements made to crunch code
- ; (+ 6K/s)
- ; V 0.54 - 03-Aug-1992 : inconsistence in expansion handling fixed
- ; V 0.55 - 03-Aug-1992 : bug fixed: expansion handling now considers
- ; table creation, too
- ; V 0.56 - 03-Aug-1992 : bug fixed: HUFF now can crunch files
- ; consisting of always the same byte (shame
- ; on me, termination criterium was wrong)
- ; V 0.57 - 03-Aug-1992 : Tree creation code partially rewritten
- ; V 0.58 - 05-Aug-1992 : bug fixed: wrong termination criterium for
- ; expansion check (my favorite 0/1 problem)
- ; V 0.59 - 06-Aug-1992 : now decrypting in a special buffer, not using
- ; InBuf (this is read only, I was told) any more
- ; V 0.60 - 07-Aug-1992 : added extra memory required during
- ; packing/unpacking
- ; V 0.61 - 08-Aug-1992 : expansion check changed, renamed from
- ; xpkDHUF.library to xpkHUFF.library thus
- ; corresponding to the possibility of handling
- ; adaptive huffman codes later without having
- ; an inconsistence in the name
- ; V 0.62 - 10-Aug-1992 : Flag XPKIF_MODES removed (I don't have modes
- ; yet (but I have a mapping code :-=))
-
-
-
- Contact Address
- ---------------
-
- zimmerma@helios.ibr.cs.tu-bs.de (M. Zimmermann)
-
-