Internet Message Format | 1994-10-15 | 18.5 KB

Xref: micromuse comp.compression:1336 alt.graphics.pixutils:1404 alt.binaries.pictures.d:719 alt.binaries.pictures.erotica.d:3 alt.binaries.pictures.misc:1 alt.sex.pictures.d:623
Path: micromuse!axion!uknet!ukc!mcsun!uunet!zaphod.mps.ohio-state.edu!mips!dimacs.rutgers.edu!rutgers!ub!galileo.cc.rochester.edu!rochester!pt.cs.cmu.edu!tgl
From: tgl+@cs.cmu.edu (Tom Lane)
Newsgroups: comp.compression,alt.graphics.pixutils,alt.binaries.pictures.d,alt.binaries.pictures.erotica.d,alt.binaries.pictures.misc,alt.sex.pictures.d
Subject: All about JPEG
Summary: trying to clear up some of the confusion
Keywords: JPEG, image compression, FAQ
Message-ID: <1991Oct22.223451.253852@cs.cmu.edu>
Date: 22 Oct 91 22:34:51 GMT
Followup-To: alt.binaries.pictures.d
Organization: School of Computer Science, Carnegie Mellon
Lines: 345
Nntp-Posting-Host: g.gp.cs.cmu.edu
Originator: tgl@G.GP.CS.CMU.EDU

Recent posts have made it clear that some folks are still in the dark about what JPEG is, while others think they know what it is but are harboring misconceptions. Herewith is some authoritative (I hope) information about what JPEG can and can't do, where you can get software for it, etc. etc.

It may be worth turning this into a FAQ file. Suggestions for additions and clarifications would be welcome.


1. What is JPEG?

JPEG is a standardized image compression mechanism. JPEG stands for Joint Photographic Experts Group (the original name of the committee that wrote the standard). JPEG is designed for compressing either full-color or gray-scale digital images of "natural" (real-world) scenes. JPEG does not handle black-and-white (1-bit-per-pixel) images, nor does it handle motion picture compression. (There are related committees, JBIG and MPEG respectively, working on standards for compressing those types of images.)

JPEG is "lossy", meaning that the image you get out of decompression isn't quite identical to what you put in.
The algorithm achieves much of its compression by exploiting known limitations of the human eye; notably, the fact that small color details aren't perceived as well as small details of light-and-dark. Thus, JPEG is intended for compressing images that will be looked at by humans. If you plan to machine-analyze your images, the small errors introduced by JPEG may be a problem for you, even if they are invisible to the eye.

A useful property of JPEG is that the degree of lossiness can be varied by adjusting compression parameters. This means that the image maker can trade off file size against output image quality. You can make *extremely* small files if you don't mind poor quality; this is useful for indexing image archives, making thumbnail views or icons, etc. etc. Conversely, if you aren't happy with the output quality at the default compression setting, you can jack up the quality until you are happy, and accept lesser compression.


2. Why use JPEG?

Basically, to make your image files smaller. This is a big win for transmitting files across networks and for archiving libraries of images. Being able to compress a 2 Mbyte full-color file down to 100 Kbytes or so makes a big difference in disk space or transmission time! (If you are comparing GIF and JPEG, the size ratio is more like four to one. More details below.)

Unless your viewing software supports JPEG directly, you'll have to convert JPEG to some other format for viewing or manipulating images. Thus, using JPEG is essentially a time/space tradeoff: you give up some time in order to store or transmit an image more cheaply.

It's worth noting that when network or phone transmission is involved, the time savings from transferring a shorter file can be much greater than the extra time to decompress the file. I'll let you do the arithmetic yourself.


3. How well does it work?

Pretty darn well. Here are some sample file sizes for an image I have handy, a 727x525 full-color image of a ship in a harbor.
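
As a rough check on what "full-color" costs, here is a minimal sketch of the uncompressed size of such an image at 24 bits (3 bytes) per pixel. The function name `raw_ppm_size` is made up for illustration, and the exact header layout (`P6`, dimensions, and a maxval of 255, each followed by a newline) is an assumption about how the file was written, not something stated in this article.

```python
def raw_ppm_size(width, height, maxval=255):
    """Size in bytes of a binary (P6) PPM file holding 3 bytes per pixel.

    Assumes a minimal header: magic number, dimensions and maxval,
    each terminated by a single newline.
    """
    header = f"P6\n{width} {height}\n{maxval}\n".encode("ascii")
    return len(header) + width * height * 3


size = raw_ppm_size(727, 525)
print(size)                      # 1145040
print(size / (727 * 525))        # just over 3 bytes stored per pixel
```

Under those assumptions the result agrees with the size quoted for ship.ppm, which is what you would expect from a format that applies no compression at all.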
The first three files are for comparison purposes; the rest were created with the free JPEG software described at the end of this file.

File           Size in bytes   Comments

ship.ppm             1145040   Original file in PPM format (no compression)

ship.ppm.Z            963829   PPM file passed thru Unix compress
                               compress doesn't accomplish a lot, you'll note.

ship.gif              240438   Converted to GIF with ppmquant -fs 256 | ppmtogif
                               Most of the savings is the result of losing color
                               info: GIF saves 8 bits/pixel, not 24. (See sec. 5.)

ship.jpg100           315600   cjpeg -Q 100 (highest quality setting)
                               This is indistinguishable from the 24-bit original,
                               at least to my nonprofessional eyeballs.

ship.jpg75             57995   cjpeg -Q 75 (default setting)
                               You have to look mighty darn close to distinguish
                               this from the original, even with both on-screen
                               at once.

ship.jpg50             38399   cjpeg -Q 50
                               This has slight defects; if you know what to look
                               for, you could tell it's been JPEGged without
                               seeing the original. Still at or above the quality
                               of typical recent postings in Usenet pictures
                               groups.

ship.jpg25             25186   cjpeg -Q 25
                               Visible blockiness (djpeg -b helps some). Much
                               higher quality than a GIF of comparable size,
                               though.

ship.jpg5o              6597   cjpeg -Q 5 -o
                               Blocky, but perfectly satisfactory for preview or
                               indexing purposes.

In this case JPEG can make a file that's a factor of four or five smaller than a GIF of comparable quality. This seems to be a typical ratio for real-world scenes.

GIF does significantly better on images with only a few distinct colors, such as cartoons or line art. JPEG can't squeeze these files as much as GIF does without introducing highly visible defects. This sort of image is best left in GIF form.


4. What about lossless JPEG?

There's a great deal of confusion on this subject. The JPEG committee did define a truly lossless compression algorithm, i.e., one that guarantees the final output is bit-for-bit identical to the original input.
However, this lossless mode has almost nothing in common with the regular, lossy JPEG algorithm. As far as I know, the lossless JPEG mode is not implemented in any software available to the public.

Saying "-Q 100" to the free JPEG software DOES NOT get you a lossless image. What it does get rid of is deliberate information loss in the coefficient quantization step. There is still a good deal of information loss in the color subsampling step. (There should be a command line switch to disable subsampling, but as of today, there isn't one.)

Even with both quantization and subsampling turned off, the standard JPEG algorithm is not truly lossless, because it is subject to roundoff errors in various calculations. The maximum error is a few counts in any one pixel value; it's highly unlikely that this could be perceived by the human eye, but it might be a concern if you are doing machine processing of an image.

At this minimum-loss setting, standard JPEG produces files that are perhaps half the size of an uncompressed 24-bit-per-pixel image. JPEG's true lossless mode is reputed to provide roughly the same amount of compression. Those in the know do not regard this as state-of-the-art performance for lossless image compression; if you need lossless compression, you may be well advised to wait for the upcoming JBIG standard.


5. What's all this hoopla about color quantization?

Most people don't have full-color (24 bit per pixel) display hardware. Typical display hardware stores 8 or fewer bits per pixel, so it can display 256 or fewer distinct colors at a time. To display a full-color image, the computer must map the image into an appropriate set of representative colors. This process is called "color quantization" (not to be confused with the coefficient quantization done internally by JPEG). Color quantization is obviously a lossy process.
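
To make the loss concrete, here is a minimal sketch of about the crudest color quantizer there is: a fixed 256-color palette that keeps 3 bits of red, 3 of green and 2 of blue. This is only an illustration with made-up function names; it is not how ppmquant or any real viewer picks its colors, since good quantizers choose a palette adapted to each image.

```python
def quantize_332(r, g, b):
    """Map a 24-bit RGB pixel to an 8-bit index (3 bits R, 3 bits G, 2 bits B)."""
    return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6)


def expand_332(index):
    """Return the representative 24-bit color (bucket center) for an index."""
    r = ((index >> 5) & 0x7) * 32 + 16
    g = ((index >> 2) & 0x7) * 32 + 16
    b = (index & 0x3) * 64 + 32
    return r, g, b


pixel = (200, 100, 50)
index = quantize_332(*pixel)
shown = expand_332(index)
print(index, shown)   # 204 (208, 112, 32)
```

The displayed color is close to the original but not equal to it, and every 24-bit color that falls in the same bucket comes out identical on screen. That information cannot be recovered afterwards.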
It turns out that for most images, the details of the color quantization algorithm have MUCH more impact on the final image quality than do any errors introduced by JPEG (except at the lowest JPEG quality settings).

Since JPEG is inherently a full-color format, converting a JPEG image for display on 8-bit-or-less hardware requires color quantization. A GIF image, by definition, has already been quantized to 256 or fewer colors.

For purposes of Usenet picture distribution, GIF has the advantage that the sender precomputes the color quantization and recipients don't have to. This is also the *disadvantage* of GIF: you're stuck with the sender's quantization. If the sender quantized to a different number of colors than what you can display, you have to re-quantize, resulting in much poorer image quality than if you had quantized once from a full-color image. Furthermore, if the sender didn't use a high-quality color quantization algorithm, you're out of luck.

For this reason, JPEG offers the promise of *significantly better* image quality for all users whose machines don't match the sender's display hardware. JPEG's full color image can be quantized to precisely match the user's display hardware. Furthermore, you will be able to take advantage of future improvements in quantization algorithms (there is a lot of active research in this area), or purchase better display hardware, to get a better view of JPEG images you already have. With GIF, you're stuck forevermore with what was sent.

It's also worth mentioning that many GIF-viewing programs include rather shoddy quantization routines. If you view a 256-color GIF on a 16-color EGA display, for example, you are probably getting a much worse image than you need to. This is partly an inevitable consequence of doing two color quantizations (one to create the GIF, one to display it), but often it's also due to sloppiness.
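
The cost of quantizing twice can be seen even in a toy case. The sketch below is an illustration under simplifying assumptions, not a model of any real viewer: it uses a gray ramp instead of a color image, evenly spaced palettes, and hypothetical palette sizes of 6 levels for the sender and 4 for the display.

```python
def palette(levels):
    """Evenly spaced gray levels from 0 to 255."""
    return [round(i * 255 / (levels - 1)) for i in range(levels)]


def nearest(value, pal):
    """Quantize one gray value to the closest palette entry."""
    return min(pal, key=lambda p: abs(p - value))


def total_error(original, shown):
    """Sum of absolute differences between original and displayed values."""
    return sum(abs(a - b) for a, b in zip(original, shown))


ramp = list(range(256))      # stand-in for a full-range original image
sender = palette(6)          # hypothetical palette chosen by the sender
display = palette(4)         # hypothetical palette of the viewer's hardware

once = [nearest(v, display) for v in ramp]
twice = [nearest(nearest(v, sender), display) for v in ramp]

err_once = total_error(ramp, once)
err_twice = total_error(ramp, twice)
print(err_once, err_twice)   # the second figure is the larger one
```

Quantizing directly to the display palette always picks the closest available level, so a detour through the sender's palette can only match it or do worse. For example, the value 40 goes straight to 0, but via the sender's level 51 it ends up at 85.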
JPEG conversion programs will be forced to use high quality quantizers in order to get acceptable results at all, and in normal use they will quantize directly to the number of colors to be displayed. Thus, JPEG is likely to provide better results than the average GIF program for low-color-resolution displays as well as high-resolution ones!

The same considerations apply to gray-scale images, although quantization of gray scale is a much simpler problem.


6. When should I use JPEG, and when should I stick with GIF?

For the reasons discussed above, JPEG is superior to GIF for storing and distributing full-color and gray-scale images of "realistic" scenes. JPEG is superior even if you don't have 24-bit display hardware, and it is a LOT superior if you do.

GIF remains the superior format for cartoons, line drawings, and some other types of "non-realistic" images. JPEG is not designed for good performance on this kind of image.

If you have an existing library of GIF images, you may wonder whether you should convert it to JPEG. You will lose some image quality if you do so, but the disk space savings may justify converting anyway. (The preceding section, which argued that JPEG image quality is superior to GIF, only applies if both formats start from a full-color original. If you start from a GIF, you've already irretrievably lost a great deal of color information.)

Experience to date suggests that large, high-quality GIFs are the best candidates for conversion to JPEG. They chew up the most storage so offer the most savings, and they convert to JPEG with minimum visible degradation. (Generally, JPEG won't compress low-quality input images as well as high-quality ones.) Don't waste your time converting any GIF much under 100 Kbytes. Also, don't expect JPEG files converted from GIFs to be as small as those created directly from full-color originals.
For comparable quality you may have to let the converted files be as much as twice as big as straight-through JPEG files would be (i.e., shoot for 1/2 or 1/3rd the size of the GIF file, not 1/4th as shown in the earlier comparisons).


7. How does JPEG work?

The buzz-words to know are chrominance subsampling, discrete cosine transforms, coefficient quantization, and Huffman or arithmetic entropy coding. This article's long enough already, so I'm not going to say more than that. For technical details, see Wallace's article in the April 1991 Communications of the ACM.


8. Why all the argument about file formats?

Strictly speaking, JPEG refers only to a family of compression algorithms; it does *not* refer to a specific image file format. The JPEG committee was prevented from defining a file format by turf wars within the international standards organizations.

Since we can't actually exchange images with anyone else unless we agree on a common file format, this leaves us with a problem. In the absence of official standards, a lot of JPEG program writers have just gone off to "do their own thing", and as a result their programs aren't compatible with anybody else's.

The closest thing we have to a de-facto standard JPEG format is some work that's been coordinated by people at C-Cube Microsystems. They have defined two JPEG-based file formats:

* JFIF (JPEG File Interchange Format), a "low-end" format that transports pixels and not much else.

* TIFF/JPEG, an extension of the Aldus TIFF format. TIFF is a "high-end" format that will let you record just about everything you ever wanted to know about an image, and a lot more besides :-). TIFF is a lot more complex than JFIF, and may well prove less transportable, because different vendors have historically implemented slightly different and incompatible subsets of TIFF. It's not likely that adding JPEG to the mix will do anything to improve this situation.
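
For a sense of how little JFIF adds on top of the compressed data, here is a minimal sketch of telling a JFIF file apart from some other JPEG wrapper. It relies on details from the JFIF specification rather than from this article: the file opens with the JPEG start-of-image marker (FF D8), followed at once by an APP0 segment (FF E0) whose payload begins with the bytes "JFIF" and a zero byte. The function name `looks_like_jfif` is made up for illustration.

```python
import struct


def looks_like_jfif(data: bytes) -> bool:
    """Return True if the buffer starts with SOI, APP0 and the JFIF identifier."""
    if len(data) < 11:
        return False
    # Three big-endian 16-bit fields: SOI marker, APP0 marker, segment length.
    soi, app0, _length = struct.unpack(">HHH", data[:6])
    return soi == 0xFFD8 and app0 == 0xFFE0 and data[6:11] == b"JFIF\x00"


# First 20 bytes of a typical JFIF file (version 1.01, no thumbnail).
header = bytes.fromhex("ffd8ffe000104a46494600010100000100010000")
print(looks_like_jfif(header))                      # True
print(looks_like_jfif(b"GIF89a" + bytes(10)))       # False
```

A check like this only inspects the wrapper. It says nothing about whether the compressed data that follows uses options a given decoder supports.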

Both of these formats were developed with input from all the major vendors of JPEG-related products; it's reasonably likely that future commercial products will adhere to one or both standards. (However, as of right now, October 1991, it's too early for many such products to have appeared.)

A particular case that people may be interested in is Apple's QuickTime software for the Macintosh. QuickTime uses a JFIF-compatible format wrapped inside the Mac-specific PICT structure. Conversion between JFIF and PICT/JPEG should be pretty straightforward; in fact Apple may release a utility for the purpose.

I believe that Usenet should adopt JFIF as the replacement for GIF in picture postings. JFIF is simpler than TIFF and is available now; the TIFF/JPEG spec is still being hammered out. Even when TIFF/JPEG is available, the JFIF format is likely to be a widely supported "lowest common denominator"; TIFF/JPEG files may never be as transportable.


9. And what's all this about arithmetic coding?

The JPEG spec defines two different "back end" modules for the final output of compressed data: either Huffman coding or arithmetic coding is allowed. The choice has no impact on image quality, but arithmetic coding usually produces a smaller compressed file. On typical images, arithmetic coding produces a file 5 or 10 percent smaller than Huffman coding. (The numbers previously cited are all for Huffman coding.)

Unfortunately, the particular variant of arithmetic coding specified by the JPEG standard is subject to patents owned by IBM, AT&T, and Mitsubishi. Thus *you cannot legally use arithmetic coding* unless you obtain licenses from these companies. (The "fair use" doctrine allows people to implement and test the algorithm, but actually storing any images with it is dubious at best.)

At least in the short run, I recommend that people not worry about arithmetic coding; the space savings isn't great enough to justify the potential legal hassles.
In particular, arithmetic coding *should not* be used for any images to be exchanged on Usenet.

There is some small chance that the legal situation may change in the future. Stay tuned for further details.


10. Where can I get JPEG software?

Free, portable C code for JPEG compression is available from the Independent JPEG Group, which I lead. A package containing our C source code, documentation, and some small test files is available from several places. (We are not currently distributing pre-built binary files; we assume you have some experience in installing portable C programs. Pre-built binaries will probably be made available in the future.)

The "official" archive site for this software is uunet.uu.net, under directory /graphics/jpeg; the current release is in jpegsrc.v1.tar.Z. (This is a compressed TAR file.) You can retrieve this file by FTP or UUCP. Folks in Europe may find it easier to FTP from nic.funet.fi (see directory pub/graphics/programs/jpeg). I believe our code is also available on CompuServe, in the GRAPHSUPPORT forum (GO PICS), library 14, as jpegv1.zip.

This software has been tested on numerous Unix machines, PCs, Macs, and Amigas; we believe it can be ported to almost any machine that has a (reasonable) C compiler.

We consider this to be a preliminary release. The current software only handles conversion between JPEG and PBMPLUS image formats, so it must be used in conjunction with Jef Poskanzer's free PBMPLUS software. (Well, actually it can read and write GIF files too, but writing GIF files doesn't work very well yet.) Some operations will run out of memory on PCs and other non-virtual-memory machines. These and other shortcomings will be fixed in future releases.

We have released this software for both noncommercial and commercial use. Companies are welcome to use it as the basis for JPEG-related products.
We do not ask a royalty, although we do ask for an acknowledgement in product literature (see the README file in the distribution for details). We hope to make this software industrial-quality --- although, as with anything that's free, we offer no warranty and accept no liability.

The Independent JPEG Group is a volunteer organization; if you'd like to contribute to improving our software, you are welcome to join.

In addition to the free JPEG software, I am aware of two shareware programs from Handmade Software (contact hsi@netcom.com for details). Their software runs on PCs and on a limited number of Unix machines. As of today, their software is faster and does better color quantization than the free JPEG software; but they had better have their running shoes on if they don't want to be surpassed soon. (And of course, you're morally obligated to pay if you use their software.)

There are numerous commercial JPEG offerings, with more popping up every day. I recommend that you not waste your money unless you find the free software vastly too slow. In that case, purchase a hardware-assisted product. Ask hard questions about whether the product complies with the final JPEG standard and about whether it can handle the JFIF file format; an awful lot of the earliest commercial releases are not and never will be compatible with anyone else's files.

--
tom lane
organizer, Independent JPEG Group
Internet: tgl@cs.cmu.edu    BITNET: tgl%cs.cmu.edu@cmuccvma