AZID Readme v1.9 build 922 (2003-12-18)
Copyright (C) 1997-2003 Midas <midas@egon.gyaloglo.hu>
-------------------------------------------------------------------------------

Introduction
============

This is the documentation for the AC3 decoder application, Azid. It was
written by Midas <midas@egon.gyaloglo.hu>, copyright (C) 1997-2003 Midas.

The most recent update of this program can be found on the excellent pages
of http://www.doom9.org/

Usage and legal conditions:
---------------------------

This is a test implementation of standard A/52 from ATSC (Digital Audio
Compression Standard), and it may contain algorithms covered by pending
patents. This application may solely be used for proving that bitstreams are
compliant to this standard for test and demonstration purposes only. Any
other use may be prohibited by law in your country. The author has no
liability regarding this application whatsoever.

This application may be distributed freely unless prohibited by law.

Overview
--------

This document assumes that you know what AC3 is. I won't give any
introduction to what AC3 is and how it is used. The internet is full of pages
describing what AC3 is all about; the most authoritative is probably the one
from the authors of AC3, Dolby: http://www.dolby.com

Now for the technical stuff:

AC3 is a digital compression algorithm which may compress up to 5 full
bandwidth channels and one low-frequency effects channel (with a limited
bandwidth of 120Hz) into a bitstream. The size of this compressed bitstream
is typically reduced by a factor of 13 compared with the raw data rate.

The specification for the AC3 decoder can be found in the ATSC specification
A/52 at http://www.dolby.com or http://www.atsc.org.

AC3 encoded audio is divided into frames. One frame gives 1536 samples of
audio, or 32ms of audio at a 48kHz sampling rate.
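
The frame timing above is plain arithmetic and can be checked with a few
lines of Python. This is only an illustrative sketch; the function names are
made up for this example and are not part of Azid.

```python
# Illustrative sketch only: these helpers are hypothetical, not part of Azid.
SAMPLES_PER_FRAME = 1536  # samples of audio per AC3 frame, as stated above


def frame_duration_ms(sample_rate_hz):
    """Duration of one AC3 frame in milliseconds at the given sampling rate."""
    return 1000.0 * SAMPLES_PER_FRAME / sample_rate_hz


def frames_per_second(sample_rate_hz):
    """Number of AC3 frames needed for one second of audio."""
    return sample_rate_hz / SAMPLES_PER_FRAME


if __name__ == "__main__":
    # 48 kHz is the rate used in the text; the other two rates are shown
    # for comparison only.
    for rate in (48000, 44100, 32000):
        print(rate, "Hz:", round(frame_duration_ms(rate), 2), "ms per frame")
```

At 48kHz this gives 32ms per frame, i.e. 31.25 frames per second.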

A single frame is divided into several sub-sections:
  - synchronization
  - bit stream information (BSI)
  - 6 audio blocks
  - auxiliary data (and CRC)

The BSI section contains information regarding the bitstream and the current
frame. It contains information like samplerate, number of encoded channels,
downmix levels, dynamic compression types, program contents, etc.

The audio block contains the actual encoded audio. One block gives 256
samples (approx. 5.3ms at 48kHz). One audio block is atomic; the audio
decoding operation is repeated for each of these six blocks. The specific
details of this operation can be found in the A/52 spec.

Decoder operation
-----------------

The decoder decodes an audio block into elementary channels of audio. These
elementary channels represent the same channels that were fed into the ac3
coder at the studio, like center, left, right, etc. If the number of actual
output channels is smaller than the number of encoded audio channels, the
decoder must downmix these channels into the correct number of channels. In
this documentation these channels are called "input channels" and they are
named: left (l), right (r), center (c), surround left (sl or s), surround
right (sr) and low-frequency effects channel (lfe).

The downmix operation reduces the number of input channels to the requested
number of speakers. This operation is controlled by the option -d. It
selects how many front and rear speakers the decoder should decode to. If,
for example, -d2/0 (2 front speakers, no rear speakers) is selected,
everything is mixed into these two speakers/channels.

Then the audio is fed to the output selector. This controls which of these
channels to route to the actual speaker outputs. The first speaker output is
named output speaker 0, the next output speaker 1, etc. The decoder supports
a maximum of 6 output speakers. The -o option controls what channel(s) to
output.
If -d2/0 is selected, the left and the right channel may be output with the
-ol,r option. In this case, the channels other than l and r do not contain
any audio, because -d2/0 generates audio only in the l and r channels. Other
sequences of channel output may also be chosen, or the same channel may be
output several times. For example, -or,l and -oc,c are both legal options.

The -l and -L options control how the LFE channel is downmixed. The -l option
selects the downmix level of the LFE channel into the LFE (output) channel,
while -L controls the amount of LFE audio mixed into the left and right
channels.

TIP: If -d3/2 decoding is chosen, no downmix is performed and it's possible
     to listen to individual channels, selectable with the -o option. E.g.
     use -ol,r to listen to the left and right channels, or use -osl,sr to
     listen to the surround channels.

A special command option named --ch may be used to individually change the
attributes of each channel. This option may be used both on input channels
(named l, c, r, etc.) and on output speakers (numbered 0-5). The attributes
may contain a gain (see -g for syntax) and/or a dynamic compression value
(see -c).

NEWS
====

Thanks to everyone who helped me find those bugs and suggested new features.
A special thanks to DSPGuru for making a superb frontend to Azid, and for
giving me lots and lots of feedback.

This is the list of new features added to Azid:

v1.9
----
- Added support for Dolby Pro Logic II downmix output
- Fixed popping bug when playing back ac3 files with low volumes.
- Increased disk-caching memory to decrease frantic disk activity by
  buffering more data.
- Added support for 32-bit integer wavs
- Default LFE into LR downmix level is changed from 0dB to -3dB, since there
  are two front speakers.
- Added PIII and P4 optimized versions of the decoder

v1.8
----
- Fixed DRC bug that caused occasional clicks and pops when decoding using
  normal dynamic compression.
- Fixed WAV-header bug.
  Azid now generates WAV-files with proper headers. This bug occurred on
  wav32 output, which would generate an integer header with float data.
- Added new file output types and renamed them to give them simpler and more
  informative names. See the -F option for more information.
- Fixed bug in azid soundcard playback that caused azid to hang on ac3-files
  shorter than 0.5 seconds.
- Fixed the downmix overflow logic. The decoder will now print the largest
  overflow within a block, not the first, making it more reliable for
  finding the maximum level of the file.
- Added a warning output enable/disable function (-w).
- Added statistics output printing the total maximum level of the channels,
  and how many samples overflowed.
- Added support for stdout/stdin streaming. This can be done by using the
  '-' character as input and/or output filename.
- Proper Ctrl+C handling
- Fixed return value from azid.exe
- Added a two-pass maximize function. This can be used to maximize the
  volume of the decoded audio.
- Added sectional decoding with -B and -E. With this function a section of
  the input file can be decoded, not only the entire file
- Added support for command-file scripts for easier reuse of options
- Fixed dynamic compression for channel 1 on 1+1 ac3 files

DESCRIPTION
===========

This section describes each setting of the decoder and how it affects the
decoding process.

The command-line syntax of Azid is:

  azid.exe [options] input.ac3 [output.wav]

There are three versions of Azid available: one generic version (azid.exe),
one optimized for Intel Pentium III CPUs (azid_P3.exe) and one for Intel
Pentium 4 (azid_P4.exe). Please use the version that fits your CPU; the
higher the CPU version you can use, the faster the application works.

The output.wav file is optional. If omitted, Azid sends the audio to your
soundcard. If the -N option is used, no output will be produced (neither to
a file nor to the soundcard).
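
For scripted use, the command-line syntax above can be assembled
programmatically. The sketch below is an assumption-laden illustration: the
helper build_azid_command and the file names are hypothetical, and only the
argument order "azid.exe [options] input.ac3 [output.wav]" is taken from
this readme. It builds the argument list without running anything.

```python
# Illustrative sketch only: build_azid_command is a hypothetical helper.
# It follows the documented syntax: azid.exe [options] input.ac3 [output.wav]


def build_azid_command(input_path, output_path=None, options=(), exe="azid.exe"):
    """Return the argument list for one Azid invocation.

    If output_path is None the output file is left out, in which case Azid
    (according to this readme) plays the audio on the soundcard instead.
    """
    command = [exe]
    command.extend(options)          # e.g. "-d2/0", "-ol,r"
    command.append(input_path)
    if output_path is not None:
        command.append(output_path)
    return command


if __name__ == "__main__":
    # File names are placeholders for this example.
    print(build_azid_command("movie.ac3", "movie.wav", ["-d2/0", "-ol,r"]))
    print(build_azid_command("movie.ac3", options=["-N"]))
```

The resulting list can be handed to a process launcher of your choice; the
generic executable name can be swapped for azid_P3.exe or azid_P4.exe via
the exe parameter.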

The number of entries in the -o option controls the number of output
channels that Azid will produce. The default option -ol,r will produce a
stereo wav or play stereo sound. If, however, -oc is used, the wav-file will
be mono and the playback will be mono. More than 2 items in the -o option
will create multi-channel output. Depending on your soundcard (driver),
multichannel output might not be possible.

Streaming
---------

If the input file is given as '-', Azid will take its input from the stdin
stream instead of a file, and if the output file is given as '-', its output
will be sent to the stdout stream.

This has two major side effects. First, if the input file is stdin, you
cannot use options that try to seek in the file, like the two-pass maximize.
The same applies to the output: you cannot use the wav output types, because
the same seeking mechanism is used there to write the proper wav-header.

The second side effect is that the streaming ability of Windows is broken.

Decoding overflow
-----------------

When decoding 1 on 1 (2ch AC3 to 2 channel output, 6ch AC3 to 6 channel
output, etc.), small overflows can be observed from time to time when not
using any dynamic compression. This is normal because of the way AC3 works.
Quote from the AC3 specification (p.93 1.paragraph):

  "... Since the output signal consists of the original signal plus coding
  error, it is possible for the output signal to exceed 100% level even
  though the original input signal was less than or equal to 100% level."

When a downmix overflow is encountered, the output signal will be saturated
to 0dB FS to prevent overflow (wrap around).

COMMAND OPTIONS
===============

-a, --maximize
--------------
Default: omitted

This option enables the two-pass maximize function of azid. In the first
pass, Azid scans the entire file to find the maximum level. In the second
pass the audio is properly decoded, with the gain raised so that the maximum
level reaches 0dB FS.
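
The two-pass idea can be sketched in a few lines of Python. This is not
Azid's actual code; it is a minimal illustration that assumes the decoded
audio is available as a list of floating-point samples in which 1.0
corresponds to 0dB FS. The function name maximize is hypothetical.

```python
# Illustrative sketch only, not Azid's implementation.
# Assumption: samples are floats where an absolute value of 1.0 is 0dB FS.


def maximize(samples):
    """Scale samples so that the largest peak lands exactly on full scale."""
    # Pass 1: scan everything to find the maximum absolute level.
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: nothing to gain up

    # Pass 2: apply one constant gain to every sample.
    gain = 1.0 / peak
    return [s * gain for s in samples]


if __name__ == "__main__":
    print(maximize([0.25, -0.5, 0.125]))
```

Because the whole input has to be scanned before the gain is known, a
scheme like this needs to read the input twice, which is consistent with
the restriction on stdin input described in the Streaming section.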

NOTE: Sometimes when you use this function, the output will still create
      downmix overflow warnings. This is normal. It happens because the
      signal has touched 0dB FS, or because some random value within the
      signal has caused it to overload slightly.

-b BOOL, --bsi-log=BOOL
-----------------------
Default: true

The AC3 bitstream contains a BSI (Bit Stream Information) section. This
section contains information about the bitstream, like sampling rate, number
of channels, and other descriptive fields. This command option
enables/disables such print-outs.

A typical BSI print-out looks like this:

  +------ BSI -----
  | Bitrate: 448 kbit (48 kHz)
  | Mode: Complete Main (CM)
  | Audio mode: 2/2 L,R,SL,SR
  | Surround mix level: -3.0dB
  | Dialogue level: -27dB
  | Language: English
  | Mixlevel: 105dB SPL
  | Roomtype: Small root, flat monitor
  | Stream: Copyright protected, Original stream
  +----------------

-B TIME, --begin=TIME
---------------------
Default: #0

This option enables you to control when or where in the file the decoding
should start. The decoder skips frames until the specified time or frame
has been reached and starts producing output from there. Please note that
azid doesn't simply jump to the indicated time, but parses through the
stream up to the indicated time. This is done to be able to find the correct
point to start decoding.

The argument can be given as a frame number (#num) or as a time
([[HH:]MM:]SS[.mss]).

Examples:
  -B#0     Start decoding at frame 0 (inclusive)
  -B#100   Start decoding at frame 100
  -B23     Start decoding at second 23
  -B1:00   Start decoding at one minute

-c COMPR, --dcompr=COMPR
------------------------
Default: none

This option sets the overall dynamic compression in the decoder. This value
is applied to every output speaker.

The bitstream contains information on how much to amplify or attenuate the
sound to decrease the overall dynamic variations (loudness) in the program
contents.

Different options exist to choose the wanted dynamic reduction:

  o none     No dynamic compression. The program contents are unchanged.
  o normal   Normal dynamic compression. Normal in-store decoders use this
             as a hardcoded default.
  o light    Light dynamic compression. This is 50% (-6db) of the
             reduction/gain that normal dynamic compression would give.
  o heavy    Heavy dynamic compression. Intended for poor listening
             environments with much background noise.
  o inverse  Dynamic expansion. This is the inverse value of the light
             dynamic compression, i.e. it makes strong sounds stronger and
             weak sounds weaker.

-C LEVEL, --clevel=LEVEL
------------------------
Default: BSI

This command option controls the center downmix level into the LR channels.
Normally, the BSI section contains a field which tells the decoder how to
downmix the center channel into the LR channels. With this option, the user
may override the BSI center downmix level and specify a custom value.

Note that this option is only active when the output decode mode (-d) is
2/x.

Allowable values are gain values (either in db's or a positive numerical
value) or BSI. When BSI is selected, the center downmix level gets its value
from the BSI section.

--ch#=ATTRIB0[,ATTRIB1[,...]]
-----------------------------

This option sets one or more attributes for the given channel. There are two
major types of channels available:

  o The input channels. These are the channels coming directly from the
    decoder prior to downmixing. Each of these channels represents the same
    channel that was put into the ac3 coder at the studio. Allowed channel
    names are: l, c, r, sl, sr, s or lfe.

  o The output channel or speaker. It refers to an output channel after
    downmixing and output selecting (-o). It refers directly to the index
    in the -o option. E.g. '-ol,r' implies that output channel 0 is the
    left channel, and output channel 1 is the right channel.
    If '-oc,c --ch0=12db' is used, both outputs will contain the center
    channel, but only the first channel will have 12db gain. Allowed output
    channel names are: 0, 1, 2, 3, 4 and 5.

The attributes may be:

  o Channel gain. This specifies how much the signal on the given channel
    should be amplified/attenuated. Legal values are positive numbers or a
    logarithmic value written with the postfix 'db'.
    Examples: --chl=12 --chc=0 --ch0=+3db --ch1=-3db

  o Channel dynamic compression. This specifies the dynamic compression to
    use for that channel. Allowable values are: none, light, normal, heavy
    and inverse.
    Examples: --chc=normal

Several attributes may be separated by commas, like this:
  --chc=normal,3db or --chlfe=light,0.5 or --ch0=none,-3db

-d FRONT/REAR, --decode=FRONT/REAR
----------------------------------
Default: 2/0

This option selects how many front and rear speakers the decoder should
downmix for. The argument is given as front speakers/rear speakers. Note
that this option only sets the downmix type, not the actual output. The -o
option controls which channels to output.

This option does not control the LFE channel (see -l and -L).

Possible values are: 1/0, 2/0, 3/0, 1/1, 2/1, 3/1, 1/2, 2/2, 3/2

-e ERROR_ACTION, --erraction=ERROR_ACTION
-----------------------------------------
Default: zero

This option controls the decoder action in case of bitstream errors.
Possible values are:

  o quit. This causes Azid to quit the entire application if it encounters
    an error in the bitstream.
  o zero. The decoder will skip the current frame of ac3-data, pad the
    output with silence and continue with the next frame of data.

-E TIME, --end=TIME
-------------------

This option sets when to stop decoding (see -B). The argument can be given
as a frame number (as #nn) or as a time (as [[HH:]MM:]SS[.mss]). The
argument is inclusive, i.e. if #100 is given, it will decode frame 100 and
then stop.
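
The TIME syntax shared by -B and -E can be illustrated with a small parser.
This is a sketch written for this readme, not code taken from Azid; the
function name parse_time_arg and its return format are assumptions made for
the example.

```python
# Illustrative sketch only: parse_time_arg is hypothetical, not Azid code.
# Accepted forms, as documented for -B and -E:
#   #num                 a frame number
#   [[HH:]MM:]SS[.mss]   a time


def parse_time_arg(text):
    """Return ("frame", n) for '#n', or ("seconds", s) for a time value."""
    if text.startswith("#"):
        return ("frame", int(text[1:]))

    parts = text.split(":")
    if not 1 <= len(parts) <= 3:
        raise ValueError("expected [[HH:]MM:]SS[.mss], got %r" % text)

    # The last field is always seconds, optionally with a fraction.
    seconds = float(parts[-1])
    # Walk the remaining fields right to left: minutes first, then hours.
    for scale, field in zip((60, 3600), reversed(parts[:-1])):
        seconds += scale * int(field)
    return ("seconds", seconds)


if __name__ == "__main__":
    for example in ("#0", "#100", "23", "1:00", "1:02:03.5"):
        print(example, "->", parse_time_arg(example))
```

With the examples from the -B section, "#100" parses as frame 100, "23" as
23 seconds and "1:00" as 60 seconds.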

-f BOOL, --rear-filter=BOOL
---------------------------
Default: off

This option controls rear-channel filtering in 2/0 output mode. The filter
is a 2nd order Butterworth filter with its -3 dB point at 7 kHz. There are
two major applications for this feature:

  o To provide a proper Pro Logic downmix of the rear channels
  o To reduce phasing problems in the downmix (washy sound) caused by the
    rear channel downmix into the L R channels.

Usually the rear channels are phase-shifted 90 deg with respect to the front
channels, prior to or inside the ac3 encoder. This is done to avoid phasing
problems when downmixing the program contents to two channels. Some sources
do not provide this shifting, and thus this feature is added. The filter
provides a phase shift that increases with frequency. It is 90 deg at 7kHz.

NOTE: This option is only effective when 2/0 output mode is selected
      (-d 2/0).

-F FILE_TYPE, --filetype=FILE_TYPE
----------------------------------
Default: wav

Selects the file type to generate. Possible values are:

  o wav. Generate "normal" 16-bit wav.
  o wav24. Generate 24-bit integer wav.
  o wav32. Generate 32-bit integer wav.
  o wav_float. Generate 32-bit floating-point (IEEE) wav.
  o pcm. Generate 16-bit PCM (equal to wav, only without the wav-header).
  o pcm32. Generate 32-bit integer PCM.
  o pcm_float. Generate 32-bit floating-point (IEEE) PCM output.

-g GAIN, --gain=GAIN
--------------------
Default: 1.0 (or 0db)

This option controls the main (output speaker) gain. The value can be given
in db's (by specifying "db" after the argument) or as a positive numerical
value.

Examples: -g-3db, -g5.3, -g6db

-i FILE, --script=FILE
----------------------

This option enables you to set all azid options from a command script file.
This file uses the following syntax:

  o All lines beginning with '#' or ';' are regarded as comments
  o Blank lines will be ignored
  o An option is given as command[=argument]. The argument can be omitted
    if the command does not require an argument.
  o The commands are equal to the long names of the command-line options
    (the -- options).

Example:
  gain = 9dB
  filetype = pcm32
  no-output

-l LFE_LEVEL, --lfe=LFE_LEVEL
-----------------------------
Default: 0.0

This controls the downmix level of the LFE channel into the LFE output
speaker. I.e. if this option is set to a non-zero value, the LFE channel
output may be listened to with the -olfe option.

-L LRLFE_LEVEL, --lrlfe=LRLFE_LEVEL
-----------------------------------
Default: 1.0 (or 0db)

This controls the downmix level of the LFE channel into the LR channels.

-m MONO_MODE, --mono=MONO_MODE
------------------------------
Default: stereo

This option controls what type of 1+1 decoding should be used. A special
channel configuration exists where the stream contains two mono audio
channels (called 1+1). Selectable options:

  o ch1. Route channel 1 into center.
  o ch2. Route channel 2 into center.
  o mono. Route channel 1 + channel 2 into center.
  o stereo. Route channel 1 into left and channel 2 into right.

-M BOOL, --matrix-log=BOOL
--------------------------
Default: off

This option makes the decoder print the downmix matrix with its individual
coefficients. A typical print would look something like this:

  +------ DOWNNMIX MATRIX -----
  | IN0 IN1 IN2 IN3 IN4 IN5
  | L : +0.2426 +0.1716 +0.0000 -0.1716 -0.1716 +0.2426
  | C : +0.0000 +0.0000 +0.0000 +0.0000 +0.0000
  | R : +0.0000 +0.1716 +0.2426 +0.1716 +0.1716 +0.2426
  | SL : +0.0000 +0.0000 +0.0000
  | SR : +0.0000 +0.0000 +0.0000
  | LFE: +0.0000 +0.0000 +0.0000 +0.0000 +0.0000
  +----------------------------

The channels on the top (INx) are the input channels. Which channel each of
these inputs represents can be read from the audio mode section in the BSI
printout:

  | Audio mode: 2/2 L,R,SL,SR

Here L is IN0, R is IN1, etc.

Note that the input channel gain does not affect the downmix matrix
coefficients, while -C and -S do.
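
A downmix matrix of this kind is applied as a weighted sum: each output
channel is the sum of every input channel multiplied by its coefficient in
that output's row. The sketch below shows this with made-up coefficients
and a made-up function name (apply_downmix); it is an illustration of the
general technique, not Azid's implementation and not the matrix printed
above.

```python
# Illustrative sketch only: apply_downmix and the coefficients below are
# hypothetical. Each row holds one coefficient per input channel (IN0..INn).


def apply_downmix(matrix, inputs):
    """Mix one sample per input channel into one sample per output row."""
    outputs = {}
    for name, row in matrix.items():
        if len(row) != len(inputs):
            raise ValueError("row %r needs %d coefficients" % (name, len(inputs)))
        # Weighted sum of all input channels for this output channel.
        outputs[name] = sum(coef * sample for coef, sample in zip(row, inputs))
    return outputs


if __name__ == "__main__":
    # Hypothetical 3-input (left, center, right) to 2-output (L, R) downmix.
    example_matrix = {
        "L": [1.0, 0.5, 0.0],
        "R": [0.0, 0.5, 1.0],
    }
    print(apply_downmix(example_matrix, [0.25, 0.5, 0.125]))
```

In this toy matrix the center input contributes half its level to both L
and R, which is the kind of value the -C option overrides in Azid.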

-n BOOL, --norm=BOOL
--------------------
Default: false

This selects whether the decoder should use dialog normalization reduction.

The normal dialogue level in a program is defined as a reference of
loudness, 0db. The BSI info variable "dialogue level" tells how far this
dialogue level is below 0db full-scale (FS) - or how much headroom there is
above the dialogue level before clipping. One of Dolby's intentions with
this variable is to ensure that all dialogue levels are played back at the
same volume, regardless of the program's amount of headroom. It is good to
have when the movie you're watching is interrupted by a commercial break,
where the headroom varies enormously. (It prevents blowing your ears off
when the break comes.)

This feature is implemented by attenuating everything such that all
programs have 31 db headroom, regardless of their original headroom. For a
typical -27db headroom program, this will cause a -4db gain.

-N, --no-output
---------------

This option causes the decoder not to produce any output, neither to a
wav-file nor to the soundcard. This is ideal for running through the file
to check its validity. It requires no arguments.

-o SEQUENCE, --output=SEQUENCE
------------------------------
Default: l,r

This option controls the channels and the sequence of the output channels.
The selectable channels are all input channels (l, c, r, s, sl, sr and lfe)
and a special zero-data channel (0). Up to 6 channels may be listed with
this command.

The -d option controls what kind of decoding target to use. This -o option
controls which of these channels to output (and their sequence).

Let's say for example that you have a 4 channel soundcard. You would like to
have left and right on one of the outputs and surround left and surround
right on the other. To do this you must specify -ol,r,sl,sr.

-p PRESET, --preset=PRESET
--------------------------
Default: 2ch

Azid has some pre-defined settings. The default is 2/0, from which all other
settings are derived.
The default command-prompt is:

  -ezero -b1 -z1 -M0 -mstereo -ssurround -d2/0 -ol,r -L1 -l0 -Cbsi -Sbsi
  -cnone -n0 -g1

(which is the same as using -p2ch, or not using the -p option at all).

The pre-defined options are:

  o 2ch. This is the configuration for 2/0 channel decoding. This option
    is really redundant, since this is the default preset.
  o 4ch. This setting will produce a 4 channel output, 2/2. The command
    prompt equivalent is: -d2/2 -ol,r,sl,sr
  o 6ch. This setting will produce a 6 channel output, 3/2+lfe. The command
    prompt equivalent is: -d3/2 -L0 -l1 -ol,r,sl,sr,c,lfe

-q, --no-logging
----------------

This option disables the output logging. No BSI info, no settings and no
bitstream errors will be shown. This option overrides the -b, -z and -M
options. It requires no argument.

-Q, --no-progress
-----------------

This option disables the decoding progress indicator. It does not require
any arguments.

-s STEREO_MODE, --stereo=STEREO_MODE
------------------------------------
Default: surround

When 2/0 decoding is selected, this option controls what kind of stereo
downmix should be applied. Possible values are:

  o mono. Used for mono downmixes
  o stereo. Standard stereo downmix
  o dpl (default). Dolby Pro Logic downmix is selected
  o dplii. The new Dolby Pro Logic II downmix is selected

-S LEVEL, --slevel=LEVEL
------------------------
Default: BSI

This command option controls the surround downmix level into the LR
channels. Normally, the BSI section contains a field which tells the decoder
how to downmix the surround channels into the LR channels. With this option,
the user may override the BSI surround downmix level and specify a custom
value.

Note that this option is only active when the output decode mode (-d) is
2/x and the input stream is either x/1 or x/2.

Allowable values are gain values (either in db's or a positive numerical
value) or BSI. When BSI is selected, the surround downmix level gets its
value from the BSI section.
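
The gain syntax used by -S (and likewise by -C, -g and --ch) accepts either
a plain numerical factor or a value with the 'db' postfix. The sketch below
shows one way such a value could be turned into a linear factor. It is not
Azid code: the function name parse_gain is hypothetical, and the 20*log10
amplitude convention is an assumption, chosen because it matches the
"1.0 (or 0db)" default and the "50% (-6db)" remark elsewhere in this readme.

```python
# Illustrative sketch only: parse_gain is hypothetical, not Azid code.
# Assumption: 'db' values are amplitude decibels, gain = 10 ** (dB / 20).


def parse_gain(text):
    """Return a linear gain factor, or None when the BSI value is requested."""
    value = text.strip().lower()
    if value == "bsi":
        return None  # caller should take the level from the BSI section
    if value.endswith("db"):
        return 10.0 ** (float(value[:-2]) / 20.0)
    gain = float(value)
    if gain < 0:
        raise ValueError("a numerical gain must not be negative: %r" % text)
    return gain


if __name__ == "__main__":
    for example in ("0db", "-3db", "-6db", "+3db", "5.3", "BSI"):
        print(example, "->", parse_gain(example))
```

Under this convention 0db maps to a factor of 1.0 and -6db to roughly 0.5.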

-w BOOL, --warn=BOOL
--------------------
Default: on

This option selects whether warning output should be printed. Warnings are
messages like "Downmix overflow" etc.

-z BOOL, --set-log=BOOL
-----------------------
Default: on

This option selects whether the current settings should be printed in an
easily readable form, like this:

  +------ SETTINGS -----
  | Input channel configuration:
  | Left : None compression, +0dB gain
  | Center : None compression, +0dB gain
  | Right : None compression, +0dB gain
  | Sur Left : None compression, +0dB gain
  | Sur Right: None compression, +0dB gain
  | LFE : None compression, +0dB gain
  | Output configuration: 2/0
  | Ch0 [Left ]: None compression, +0dB gain
  | Ch1 [Right ]: None compression, +0dB gain
  | Output Dual mono mode: Stereo
  | Output Stereo mode: Dolby surround compatible
  | LFE levels: To LR +0dB, To LFE -INF
  | Center mix level: +40.0dB
  | Surround mix level: BSI
  | Dialog normalization: No
  +---------------------