Artifacts, or undesirable errors, invariably creep into video. If you are working with analog video or with video compression, some artifacting is inevitable. Knowing the different types of artifacts and their causes can help you determine which are due to inherent limitations of hardware and software, which to deficiencies in your video process, and which to outright software (or hardware!) bugs.
A few artifacts and their causes are listed below. This is not meant to be an exhaustive list of artifacts you may encounter, but it should cover the common ones.
Quilting results from high levels of compression, which cause warbling and edge artifacts to appear in the output. The term "quilting" refers to the fact that video compression is usually done in blocks of 8x8 or 16x16 pixels; if the compression factor is set so high that the blocks no longer match well at their borders, the result looks like a quilt. The warbles around sharp edges are a result of discarding detail from the image to achieve higher compression. In other words, it's not that warbles were added, but that the detail which would sharpen the edge and flatten the area around it has been dropped.
Advanced video compression algorithms, particularly those based on the MPEG-4 video standard, include post-processing filters designed to reduce these artifacts. The filtering can give the video a smeared, cartoony look, but one that is generally less objectionable than blocking.
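The detail-discarding mechanism can be illustrated in miniature. The sketch below (plain Python, with a quantizer step chosen purely for illustration) applies a one-dimensional DCT, the transform that block codecs apply in two dimensions, to a single 8-pixel row containing a sharp edge, quantizes the coefficients coarsely, and reconstructs the row. The formerly flat areas come back wavy: the warbles were not added, the detail that kept them flat was dropped.

```python
import math

def dct(block):
    # 1-D orthonormal DCT-II, the transform used (in 2-D) by block codecs
    N = len(block)
    out = []
    for k in range(N):
        s = sum(x * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
                for n, x in enumerate(block))
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        out.append(scale * s)
    return out

def idct(coeffs):
    # Inverse (DCT-III), reconstructing the pixel row from coefficients
    N = len(coeffs)
    out = []
    for n in range(N):
        s = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
            s += scale * c * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
        out.append(s)
    return out

row = [16, 16, 16, 16, 235, 235, 235, 235]      # a sharp edge in one 8-pixel block
q = 80                                          # deliberately coarse quantizer step
quantized = [round(c / q) * q for c in dct(row)]  # most detail coefficients become 0
rebuilt = idct(quantized)
print([round(v) for v in rebuilt])  # flat regions now ripple around the edge
```

With a small quantizer step the reconstruction would be nearly exact; it is the aggressive quantization, not the transform itself, that produces the ripple.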
To reduce quilting artifacts, lower the compression ratio (or raise the target bitrate) so that less detail has to be discarded, reduce the frame size or frame rate so more bits are available per block, or apply noise reduction before compressing, since encoding noise wastes bits that could otherwise preserve detail.
Analog video is delivered using a mechanism known as interlacing to reduce flicker. Instead of sending entire frames at 29.97 frames per second (25 for PAL/SECAM), the video is sent in halves called fields, at 59.94 fields per second (50 for PAL/SECAM). These fields are interlaced together such that a frame is composed of alternating lines from each field; the even lines make up the even field and the odd lines make up the odd field. The result is high resolution in static scenes and smoother motion in fast-moving scenes, with less flicker and without requiring more bandwidth.
The catch with interlacing is that you can't have both higher resolution and smoother motion at the same time, so artifacts appear whenever you try to extract both. The process of removing the interlacing and displaying the result non-interlaced is known as deinterlacing. Displaying each field by itself gives smoother motion at the cost of resolution and is known as bob deinterlacing. Pairing fields up and displaying them together as frames gives higher resolution in exchange for smeared motion and combing artifacts and is called weave deinterlacing. Both produce frames at field rate (59.94 or 50 fps). Trying to switch between the two on a per-frame or even per-area basis depending on the amount of local motion is adaptive deinterlacing and can produce an even higher quality image.
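The two basic strategies can be sketched with fields represented simply as lists of scanlines; the line labels below are invented for illustration:

```python
# Fields as lists of scanlines: the even field holds lines 0, 2, 4, ...
# and the odd field holds lines 1, 3, 5, ... of the full frame.
even = ["E0", "E2", "E4", "E6"]
odd  = ["O1", "O3", "O5", "O7"]

def weave(even_field, odd_field):
    """Interleave two fields into one full-resolution frame."""
    frame = []
    for e, o in zip(even_field, odd_field):
        frame.extend([e, o])
    return frame

def bob(field):
    """Line-double a single field: full temporal rate, half vertical detail."""
    frame = []
    for line in field:
        frame.extend([line, line])  # naive duplication; real bobs interpolate
    return frame

print(weave(even, odd))  # full detail, but combs if the fields differ in time
print(bob(even))         # smooth motion, but built from only one field
```

An adaptive deinterlacer would compare the two fields region by region and choose weave where the image is static and bob (or interpolation) where it is moving.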
Most video capture devices do not attempt to deinterlace and simply pair fields together, which is similar to weave deinterlacing except that it gives half the field rate (29.97 or 25). If the video was produced originally from full frames at that rate, this has a 50/50 chance of producing whole frames instead of a combed mess, depending on which fields get paired. There is no requirement that the source be frame-based, though, and since the fields of interlaced video are evenly staggered in time, it usually isn't. In that case there is no "correct" way to deinterlace the video, and different deinterlacing techniques must be tried to produce the best quality non-interlaced video.
Material derived from 24 fps film is a special case in that the video is sourced from whole frames and split into fields in a specific pattern; this is called telecine. In NTSC, this is done by slowing the video down very slightly and combining fields in a 3:2 pattern; VirtualDub's inverse telecine feature, accessible in the video frame rate control dialog, can sometimes undo this pattern. In the case of PAL, the video is usually just sped up by 4% from 24 fps to 25 fps, so the most that is usually required is a single-field delay to pair up the fields correctly.
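As a rough sketch of the NTSC cadence (ignoring field parity and the 0.1% slowdown), 3:2 pulldown can be modeled as alternately emitting three and two fields per film frame, so four film frames become ten fields, i.e. five interlaced frames:

```python
def telecine_32(frames):
    """Split 24 fps film frames into NTSC fields with a 3:2 pulldown cadence.

    Frames alternately contribute 3 fields and 2 fields, so every 4 film
    frames become 10 fields (5 interlaced frames) -- 24 fps becomes ~30 fps.
    This sketch ignores even/odd field parity for simplicity.
    """
    fields = []
    for i, frame in enumerate(frames):
        copies = 3 if i % 2 == 0 else 2
        fields.extend([frame] * copies)
    return fields

film = ["A", "B", "C", "D"]
print(telecine_32(film))  # ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'D', 'D']
```

Inverse telecine is the reverse: detect this repeating cadence in the field sequence and drop the duplicated fields to recover the original 24 fps frames.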
Banding occurs when the bit depth used to represent pixels in an image is too low; the result is stairstepping in the image as pixels are forced to hop between colors that are far enough apart to distinguish visually. This is most visible in large, shallow gradients, which become bands of solid color.
When banding is observed, you should first check the display settings for your system to ensure that the display is set to 24-bit color or 32-bit color, which will produce the least banding. In particular, selecting a 15-bit or 16-bit display mode will introduce banding on screen that may not be present in the actual video.
Selecting 15-bit RGB or 16-bit RGB as a processing format will introduce noticeable banding into an image, so these formats should generally be avoided. It is still possible to see banding with YCbCr formats or 24/32-bit RGB if the gradient is very shallow and over a large area; using higher-precision formats or dithering can alleviate this. Note that VirtualDub does not currently support a format that has greater than 8 bits of precision per channel (256 levels).
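A small sketch of why 15-bit formats band, and how dithering helps: truncating an 8-bit channel to 5 bits leaves only 32 levels, so a 256-step ramp collapses into bands eight codes wide. The helper names here are invented for illustration:

```python
import random

def quantize_to_5bit(value):
    """Drop an 8-bit channel value to 5 bits (as in 15-bit RGB), then expand
    back to 8 bits by replicating the top bits so that 255 maps to 255."""
    q = value >> 3            # keep the top 5 bits (0..31)
    return (q << 3) | (q >> 2)

gradient = list(range(256))   # a smooth 8-bit ramp
banded = [quantize_to_5bit(v) for v in gradient]
print(len(set(banded)))       # only 32 distinct levels survive

# Dithering trades banding for fine noise: perturb each value by roughly one
# quantization step before truncating, so band edges break up spatially.
random.seed(1)
dithered = [quantize_to_5bit(min(255, max(0, v + random.randint(-4, 3))))
            for v in gradient]
```

The dithered ramp still has only 32 levels per pixel, but the transitions between them are scattered rather than forming solid bands, which the eye averages out.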
Rough and slightly blocky edges may be indicative of a poor scaling algorithm being used to resize video. In particular, use of a nearest neighbor or point sampling algorithm can give a blocky look to video due to the lack of interpolation ("smooth" stretching) during the resize, which means that rows and columns are just duplicated or deleted instead of being blended to modify the video's size. The result is that thin creases appear in the image.
A bad resize operation is difficult to undo after the fact if you no longer have the source, but if you can redo the operation, try moving any scaling steps later in the process so they can be done using VirtualDub's high-quality resize filter with bilinear or bicubic filtering. For instance, if you are attempting to capture analog video at a 480x360 resolution, try capturing at 640x480 or 640x576 instead (something closer to the native resolution) and then scaling down in post-processing. This will often take more CPU power and storage, however.
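The difference between point sampling and interpolated scaling is easy to see in one dimension; the functions below are simplified sketches, not VirtualDub's actual resampler:

```python
def resize_nearest(row, new_len):
    """Point sampling: each output pixel copies the nearest input pixel,
    so samples are duplicated or dropped outright -- the source of creases."""
    n = len(row)
    return [row[min(n - 1, int(i * n / new_len))] for i in range(new_len)]

def resize_linear(row, new_len):
    """Linear interpolation: each output pixel blends its two neighbors."""
    n = len(row)
    out = []
    for i in range(new_len):
        pos = i * (n - 1) / (new_len - 1)   # map output index into input space
        lo = int(pos)
        hi = min(n - 1, lo + 1)
        frac = pos - lo
        out.append(row[lo] * (1 - frac) + row[hi] * frac)
    return out

row = [0, 100, 200, 300]
print(resize_nearest(row, 6))   # duplicated samples: 0 and 200 appear twice
print(resize_linear(row, 6))    # blended samples form a smooth ramp
```

In two dimensions the duplicated rows and columns from point sampling are what read as thin creases; bilinear and bicubic filters blend neighbors in both axes instead.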
A regular slant to a decoded image, with rainbow-like stripes across scanlines, usually indicates a buggy video codec that does not compute pitch or stride correctly. The technical reason is that padding at the end of each horizontal row is not being accounted for, so each successive row is decoded progressively further off. Knowing the technical cause isn't necessary to diagnose the problem, though.
The key to identifying and working around the problem is that image widths that are multiples of four pixels usually avoid the bug, because no row-end padding is required in that case. If compressing with a video codec gives striped results at widths like 321 or 639, but widths such as 316, 320, 324, 636, 640, and 644 work, you are experiencing this issue.
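The arithmetic behind the multiple-of-four rule comes from the 4-byte row alignment used by Windows DIBs: for 24-bit RGB, a row occupies width x 3 bytes, rounded up to a multiple of 4. A quick sketch:

```python
def stride_24bpp(width):
    """Bytes per row of a 24-bit RGB DIB: width * 3, padded to a multiple of 4."""
    return (width * 3 + 3) & ~3

# A codec that copies exactly width * 3 bytes per row, forgetting the padding,
# drifts further off with every scanline -- the characteristic slant.
for width in (316, 320, 321, 324, 639, 640):
    padding = stride_24bpp(width) - width * 3
    print(width, "padding:", padding, "OK" if padding == 0 else "drifts")
```

For widths that are multiples of four, width x 3 is already a multiple of four, the padding is zero, and the broken codec happens to produce correct output.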