In <1992Nov13.120505.29654@spectrum.xerox.com> richard@garfield.noname (richard_landells.sbd-e@rx.xerox.com) writes:
>I have an application that generates binary output. The output is relatively random, but there are approximately twice as many off bits as on bits. My objective is to compress this as much as possible.
>I have tried several 'standard' compressors, arj 2.2, lharc, pkzip 1.1, and have only managed to achieve very minimal compression in the order of 4% at best (on a 40K file). Now I know that a truly random binary datastream cannot be compressed, but I was kind of hoping for better than 4%. Am I missing something fundamental, or is this really the best that can be achieved?
>If there is a technique to compress this type of data, I would appreciate some pointers to some source code that implements it.
If the data is random apart from having twice as many 0's as 1's, use arithmetic
coding. That will get you better compression than the 4% you mentioned.
If there is some "logic" in the data (repeating patterns etc.) you might
consider e.g. a higher-order arithmetic coder.
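As a rough sanity check (my own back-of-the-envelope figure, not something
from the original post): with P(0) = 2/3 and P(1) = 1/3 the per-bit entropy
is about 0.918 bit, so an order-0 model can save at most roughly 8%. A few
lines of C confirm the numbers:

#include <math.h>
#include <stdio.h>

/* Order-0 entropy of a binary source with twice as many 0's as 1's.
 * This bounds what any memoryless coder (e.g. an order-0 arithmetic
 * coder driven by these probabilities) can achieve. */
int main(void)
{
    double p0 = 2.0 / 3.0, p1 = 1.0 / 3.0;
    double h  = -(p0 * log(p0) + p1 * log(p1)) / log(2.0); /* bits per bit */

    printf("entropy      : %.3f bits per input bit\n", h);
    printf("best savings : %.1f%%\n", (1.0 - h) * 100.0);
    return 0;
}

So the 4% you get now is not that far off the limit; to do much better than
about 8% you need the higher-order modelling mentioned above.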
Something which might also work is converting bits to bytes. This makes the
file 8 times larger, but it lets ARJ and PKZIP do their job, since both are
byte oriented. The resulting compressed file might well end up smaller than
the compressed original.
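For the bit-to-byte conversion, an untested sketch in ANSI C (the program
name bit2byte and the file arguments are just my choice):

#include <stdio.h>

/* Expand every bit of the input file into one byte (value 0 or 1) in
 * the output file, so that byte-oriented packers like ARJ or PKZIP
 * can see the repeating patterns.  Usage: bit2byte infile outfile */
int main(int argc, char *argv[])
{
    FILE *in, *out;
    int   c, i;

    if (argc != 3) {
        fprintf(stderr, "usage: bit2byte infile outfile\n");
        return 1;
    }
    in  = fopen(argv[1], "rb");
    out = fopen(argv[2], "wb");
    if (in == NULL || out == NULL) {
        fprintf(stderr, "bit2byte: cannot open file\n");
        return 1;
    }
    while ((c = getc(in)) != EOF)
        for (i = 7; i >= 0; i--)
            putc((c >> i) & 1, out);
    fclose(in);
    fclose(out);
    return 0;
}

Remember to compare the packed size of the expanded file against the packed
size of the original file, and that you need a reverse program to get your
data back.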
>Richard Landells (landells.sbd-e@rx.xerox.com)
>Rank Xerox System Centre
Nico E. de Vries (nevries@cc.ruu.nl) |------------------* AA III PPP
_ This text is supplied AS IS, no warranties of any kind | A A I P P
| apply. No rights can be derived from this text. This | AAAA I PPP
| text is likely to contain spelling and grammar errors. | A A I P
*---------------------------( Donate to GreenPeace! )----* A A III P
"The IBM PC is still waiting for a version of the CP/M OS.", G.M. Vose, 1982.