AZID Readme v1.9 build 922 (2003-12-18)
Copyright (C) 1997-2003 Midas <midas@egon.gyaloglo.hu>
-------------------------------------------------------------------------------
Introduction
============
This is the documentation for the AC3 decoder application, Azid. It is
written by Midas <midas@egon.gyaloglo.hu>, copyright (C) 1997-2003
Midas.
The most recent update of this program can be found on the excellent
pages of http://www.doom9.org/
Usage and legal conditions:
---------------------------
This is a test implementation of standard A/52 from ATSC (Digital Audio
Compression Standard), and it may contain algorithms covered by pending
patents. This application may solely be used for proving that bitstreams
are compliant to this standard for test and demonstration purposes only.
Any other use may be prohibited by law in your country. The author has
no liability regarding this application whatsoever. This application
may be distributed freely unless prohibited by law.
Overview
--------
This document assumes that you know what AC3 is. I won't give any
introduction to what AC3 is and how it is used. The internet is full
of pages describing what AC3 is all about; the most authoritative is
probably that of the authors of AC3, Dolby: http://www.dolby.com
Now for the technical stuff: AC3 is a digital compression algorithm
which may compress up to 5 full bandwidth channels and one
low-frequency effects channel (with limited bandwidth of 120Hz) into a
bitstream. The size of this compressed bitstream is typically reduced
by a factor of 13 compared with the raw data rate.
The specification for the AC3 decoder can be found in the ATSC
specification A/52 at http://www.dolby.com or http://www.atsc.org.
AC3 encoded audio is divided into frames. One frame gives 1536 samples
of audio or 32ms of audio at 48kHz sampling rate. A single frame is
divided into several sub-sections:
- synchronization
- bit stream information (BSI)
- 6 audio blocks
- auxiliary data (and CRC)
The BSI section contains information regarding the bitstream and the
current frame. It contains information like samplerate, number of
encoded channels, downmix-levels, dynamic compression types, program
contents, etc.
The audio block contains the actual encoded audio. One block gives 256
samples (approx. 5.3ms at 48kHz). One audio block is atomic; the audio
decoding operation is repeated for each of these six blocks. The
specific details of this operation can be found in the A/52 spec.
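The frame and block durations quoted above follow directly from the
sample counts. A small Python sketch of the arithmetic (the names are
illustrative only and are not part of Azid):

```python
SAMPLES_PER_BLOCK = 256   # one audio block
BLOCKS_PER_FRAME = 6      # six audio blocks per frame


def duration_ms(samples, sample_rate_hz):
    """Return the playing time of `samples` samples in milliseconds."""
    return 1000.0 * samples / sample_rate_hz


samples_per_frame = SAMPLES_PER_BLOCK * BLOCKS_PER_FRAME   # 1536
frame_ms = duration_ms(samples_per_frame, 48000)          # 32 ms
block_ms = duration_ms(SAMPLES_PER_BLOCK, 48000)          # roughly 5.3 ms
```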
Decoder operation
-----------------
The decoder decodes an audio block into elementary channels of
audio. These elementary channels represent the same channels that
were fed into the ac3 coder at the studio; like center, left, right,
etc. If the number of actual output channels is fewer than the encoded
number of audio channels, the decoder must downmix these channels into
the correct number of channels.
In this documentation these channels are called "input channels" and
the channels are named: left (l), right (r), center (c), surround left
(sl or s), surround right (sr) and low-frequency effect channel (lfe).
The downmix operation reduces the number of input channels to the
requested number of speakers. This operation is controlled by the
option -d. It selects how many front and rear speakers the decoder
should decode to. If for example -d2/0 (2 front speakers, no rear
speakers) is selected, everything is mixed into these two
speakers/channels.
Then the audio is fed to the output selector. This controls which of
these channels to route to the actual speaker output. The first
speaker output is named output speaker 0, the next output speaker 1,
etc. The decoder supports up to max. 6 output speakers.
The -o option controls what channel(s) to output. If -d2/0 is
selected, the left and the right channel may be output with the -ol,r
option. In this case, all channels other than l and r contain no
audio, because -d2/0 generates audio only in the l and r
channels. Other sequences of channel output may also be chosen, or the
same channel may be output several times. For example: -or,l and -oc,c
are both legal options.
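The routing performed by -o can be pictured as a simple lookup. The
sketch below is not Azid's code; select_outputs and the dictionary
layout are assumptions made purely for illustration:

```python
def select_outputs(channels, sequence):
    """Pick downmixed channels in the order given by an -o style sequence.

    channels: dict mapping a channel name ('l', 'r', 'c', ...) to a list
              of samples (hypothetical layout, not Azid's internal one)
    sequence: comma-separated channel names, e.g. 'l,r' or 'c,c'
    """
    names = sequence.split(",")
    if len(names) > 6:
        raise ValueError("at most 6 output speakers are supported")
    length = len(next(iter(channels.values())))
    silence = [0.0] * length
    # Channels without audio (e.g. 'c' after a 2/0 downmix) come out silent.
    return [channels.get(name, silence) for name in names]


downmixed = {"l": [0.1, 0.2], "r": [0.3, 0.4]}   # stands in for a -d2/0 result
swapped = select_outputs(downmixed, "r,l")       # speaker 0 = right, 1 = left
```

The same channel name may appear more than once in the sequence, which
matches the -oc,c case described above.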
The -l and -L options control how the LFE channel is downmixed. The -l
selects the downmix level of the LFE channel into the LFE (output)
channel, while -L controls the amount of LFE audio into the left and
right channels.
TIP: If -d3/2 decoding is chosen, no downmix is performed and it's
possible to listen to individual channels selectable with the -o
option. E.g. use -ol,r to listen to the left and right channels, or
use -osl,sr to listen to the surround channels.
A special command option named --ch may be used to individually change
the attributes for each channel. This option may be used both on input
channels (named l,c,r,etc.) or on output speakers (numbers 0-5). The
attributes may contain gain (see -g for syntax) and/or a dynamic
compression value (see -c).
NEWS
====
Thanks to everyone helping me out finding those bugs and suggesting
new features. A special thanks to DSPGuru for making a superb
frontend to Azid, and giving me lots and lots of feedback.
This is the list of new features added to Azid:
v1.9
----
- Added support for Dolby Pro Logic II downmix output
- Fixed popping bug when playing back ac3 files with low volumes.
- Increased disk-caching memory to decrease frantic disk activities by
buffering more data.
- Added support for 32-bits integer wav's
- Default LFE into LR downmix level is changed from 0dB to -3dB, since
there are two front speakers.
- Added PIII and P4 optimized version of the decoder
v1.8
----
- Fixed DRC bug that caused occasional clicks and pops when decoding
using normal dynamic compression.
- Fixed WAV-header bug. Azid now generates WAV-files with proper
headers. This bug occurred on wav32 output, which would generate an
integer header with float data.
- Added new file output types and renamed them to give them simpler
and more informative names. See -F option for more information.
- Fixed bug in azid soundcard playback that caused azid to hang on
ac3-files shorter than 0.5 seconds.
- Fixed the downmix overflow logic. The decoder will now print the
largest overflow within a block, not the first, making it more
reliable for finding the maximum level of the file.
- Added a warning output enable/disable function (-w).
- Added statistics output printing the total maximum level of the
channels, and how many samples that overflowed.
- Added support of stdout/stdin streaming. This can be done by using
the '-' character as input and/or output filename.
- Proper Ctrl+C handling
- Fixed proper returnvalue from azid.exe
- Added a two-pass maximize function. This can be used to maximize the
volume of the decoded audio.
- Added sectional decoding with -B and -E. With this function a
section of the input file can be decoded, not only the
entire file.
- Added support for commandfile-scripts for easier reuse of options
- Fixed proper dynamic compression for channel 1 on 1+1 ac3 files
DESCRIPTION
===========
This section describes each setting of the decoder and how it affects the
decoding process.
The command-line syntax of Azid is:
azid.exe [options] input.ac3 [output.wav]
There are three versions of Azid available. One generic version
(azid.exe), one optimized for Intel Pentium III CPUs (azid_P3.exe)
and one for Intel Pentium 4 (azid_P4.exe). Please use the version that
fits your CPU. The higher the CPU version you can use, the faster the
application works.
The output.wav file is optional. If omitted, Azid sends the audio to
your soundcard. If the -N option is used, no output will be produced
(neither to a file nor to the soundcard).
The number of entries in the -o option controls the number of output
channels that Azid will produce. The default option -ol,r will produce
a stereo wav or play stereo sound. If, however, -oc is used, the
wav-file will be mono and the playback will be mono. More than 2 items
in the -o option will create multi-channel output. Depending on your
soundcard (driver), multichannel output might not be possible.
Streaming
---------
If the input file is given as '-', Azid takes its input from the stdin
stream instead of a file, and if the output file is given as '-', its
output is sent to the stdout stream. This has two major side effects.
First, if the input is stdin, you cannot use options that need to
seek in the file, like the two-pass maximize. The same applies to
the output: you cannot use the wav output types, because the same
seeking mechanism is used there to write the proper wav-header.
The second side effect is that the streaming ability of Windows is
broken.
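The seeking restrictions can be summarized as a small pre-flight
check. This is only a sketch of the rules stated above; the function
check_streaming is hypothetical and is not something Azid provides:

```python
def check_streaming(input_name, output_name, filetype="wav", maximize=False):
    """Return a list of problems for a planned azid invocation.

    input_name / output_name: file names, '-' meaning stdin / stdout
    filetype: value intended for -F (wav, wav24, ..., pcm, pcm32, ...)
    maximize: True if -a (two-pass maximize) is requested
    """
    problems = []
    if input_name == "-" and maximize:
        problems.append("two-pass maximize (-a) needs a seekable input file")
    if output_name == "-" and filetype.startswith("wav"):
        problems.append("wav output needs a seekable file; use a pcm type")
    return problems
```

For example, check_streaming("-", "-", filetype="pcm") reports no
problems, while the default wav file type on stdout is flagged.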
Decoding overflow
-----------------
When decoding 1 on 1 (2ch AC3 to 2 channel output and 6ch AC3 to 6
channel output, etc.), small overflows can be observed from time to
time when not using any dynamic compression. This is normal because
of the way AC3 works. Quote from the AC3 specification (p. 93,
first paragraph):
"... Since the output signal consists of the original signal plus
coding error, it is possible for the output signal to exceed 100%
level even though the original input signal was less than or equal to
100% level."
When a downmix overflow is encountered, the output signal will be
saturated to 0dB FS to prevent overflow (wrap around).
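The difference between saturation and wrap around is easiest to see
on a single sample. The sketch below assumes 16-bit integer output
for illustration; Azid's internal sample format is not described here:

```python
FULL_SCALE = 32767   # largest positive 16-bit sample (assumed output format)


def saturate(sample):
    """Clamp a sample to the 16-bit range, i.e. limit it to full scale."""
    return max(-FULL_SCALE - 1, min(FULL_SCALE, sample))


def wrap(sample):
    """What would happen without saturation: two's-complement wrap around."""
    return (sample + 32768) % 65536 - 32768


too_loud = 40000                  # a value above full scale
clipped = saturate(too_loud)      # stays at positive full scale
wrapped = wrap(too_loud)          # flips to a large negative value
```

Saturation distorts the peak slightly, whereas wrap around turns it
into a loud click, which is why the output is saturated.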
COMMAND OPTIONS
===============
-a, --maximize
--------------
Default: omitted
This option will enable a two-pass maximize function of azid. Azid
will in the first pass scan the entire file to find the maximum
level. In the second pass the audio will be properly decoded, gaining
it up to 0dB FS.
NOTE: Sometimes when you use this function, the output will still
create downmix overflow warnings. This is normal. It happens because
the signal has touched the 0dB FS, or because of some random value
within the signal has caused it to slightly overload.
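The two-pass idea can be sketched on a list of already decoded samples.
This is an illustration only: Azid decodes the file twice rather than
holding the samples in memory, and the function name is hypothetical.

```python
def maximize(samples):
    """Scale samples (floats, full scale = 1.0) so the peak reaches 0dB FS."""
    # Pass 1: find the maximum absolute level.
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)      # silence: nothing to scale
    # Pass 2: apply one constant gain to every sample.
    gain = 1.0 / peak
    return [s * gain for s in samples]


louder = maximize([0.25, -0.5, 0.1])   # peak 0.5, so the gain is 2.0
```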
-b BOOL, --bsi-log=BOOL
-----------------------
Default: true
The AC3 bitstream contains a BSI (Bit Stream Information)
section. This section contains information about the bitstream, like
sampling rate, number of channels, and other descriptive information.
This command option enables/disables such print-outs. A typical BSI
print-out looks like this:
+------ BSI -----
| Bitrate: 448 kbit (48 kHz)
| Mode: Complete Main (CM)
| Audio mode: 2/2 L,R,SL,SR
| Surround mix level: -3.0dB
| Dialogue level: -27dB
| Language: English
| Mixlevel: 105dB SPL
| Roomtype: Small root, flat monitor
| Stream: Copyright protected, Original stream
+----------------
-B TIME, --begin=TIME
---------------------
Default: #0
This option enables you to control when or where in the file the
decoding should start. The decoder skips the frames until the
specified time or frame has been reached and starts from there to
produce output.
Please note that azid doesn't simply skip to the indicated time, but
will parse through the stream to the indicated time. This is done to
be able to find the correct point to start decoding.
The argument can be given as a frame number (#num) or as a time
([[HH:]MM:]SS[.mss]). Examples:
-B#0 Start decoding at frame 0 (inclusive)
-B#100 Start decoding at frame 100
-B23 Start decoding at second 23
-B1:00 Start decoding at one minute
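The two argument forms can be told apart mechanically. The following
is a sketch of one way to parse them; parse_position is a made-up
helper, and Azid's own parser may accept or reject other inputs:

```python
def parse_position(arg):
    """Parse a -B/-E style argument.

    Returns ('frame', n) for '#n' or ('seconds', t) for [[HH:]MM:]SS[.mss].
    """
    if arg.startswith("#"):
        return ("frame", int(arg[1:]))
    parts = arg.split(":")
    if not 1 <= len(parts) <= 3:
        raise ValueError("expected [[HH:]MM:]SS[.mss]: " + arg)
    seconds = 0.0
    for part in parts:            # each field to the left is worth 60x more
        seconds = seconds * 60 + float(part)
    return ("seconds", seconds)
```

With this helper, "#100" yields frame 100, "23" yields 23 seconds and
"1:00" yields 60 seconds, matching the examples above.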
-c COMPR, --dcompr=COMPR
------------------------
Default: none
This option sets the overall dynamic compression in the decoder. This
value is applied to every output speaker.
The bitstream contains information on how much to amplify or attenuate
the sound to decrease the overall dynamic variations (loudness) in the
program contents. Several options exist to choose the wanted
dynamic reduction:
o none No dynamic compression. The program contents is unchanged.
o normal Normal dynamic compression. Normal in-store decoders use
this as a hardcoded default.
o light Light dynamic compression. This is 50% (-6db) of the
reduction/gain that normal dynamic compression would give.
o heavy Heavy dynamic compression. Intended for poor listening
environment with much background noise.
o inverse Dynamic expansion. This is the inverse value of the light
dynamic compression, i.e. it makes strong sounds stronger
and weaker sounds weaker.
-C LEVEL, --clevel=LEVEL
------------------------
Default: BSI
This command option controls the center downmix level into the LR
channels. Normally, the BSI section contains a field which tells the
decoder how to downmix the center channel into the LR channels.
With this option, the user may override the BSI center downmix level
and specify a custom value. Note that this option is only active when
the output decode mode (-d) is 2/x.
Allowable values are gain values (either in db or as a positive
numerical value) or BSI. When BSI is selected, the center downmix
level gets its value from the BSI section.
--ch#=ATTRIB0[,ATTRIB1[,...]]
-----------------------------
This option sets one or more attributes for the given channel. There
are two major types of channels available:
o The input channels. These are the channels coming directly from
the decoder prior to downmixing. Each of these channels represents
the same channel that was put into the ac3 coder at the studio.
Allowed channel names are: l,c,r,sl,sr,s or lfe.
o The output channel or speaker. It refers to an output channel
after downmixing and output selecting (-o). It refers directly
to the index of the -o option. E.g. '-ol,r' implies that output
channel 0 is the left channel, and output channel 1 is the
right channel. If '-oc,c --ch0=12db' is used, both outputs will
contain the center channel, but only the first channel will have
12db gain. Allowed output channel names are: 0,1,2,3,4 and 5.
The attributes may be:
o Channel gain. This specifies how much the signal on the given
channel should be amplified/attenuated. Legal values are
positive numbers or a logarithmic value written with the postfix
'db'. Examples: --chl=12 --chc=0 --ch0=+3db --ch1=-3db
o Channel dynamic compression. This specifies the dynamic
compression to use for that channel. Allowable values are:
none,light,normal, heavy and inverse. Examples: --chc=normal
Several attributes may be separated by commas. Like this:
--chc=normal,3db or --chlfe=light,0.5 or --ch0=none,-3db
-d FRONT/REAR, --decode=FRONT/REAR
----------------------------------
Default: 2/0
This option selects how many front and rear speakers the decoder
should downmix for. The argument is given as front speakers/rear
speakers.
Note that this option only sets the downmix type, not the actual
output. The -o option controls which channels to output. This option
does not control the LFE channel (see -l and -L).
Possible values are: 1/0, 2/0, 3/0, 1/1, 2/1, 3/1, 1/2, 2/2, 3/2
-e ERROR_ACTION, --erraction=ERROR_ACTION
-----------------------------------------
Default: zero
This option controls the decoder action in case of bitstream
errors. Possible values are:
o quit. This causes Azid to quit the entire application if it
encounters an error in the bitstream.
o zero. The decoder will skip the current frame of ac3-data and
pad the output with silence and continue with the next frame of
data.
-E TIME, --end=TIME
-------------------
This option sets when to stop decoding. (See -B). The argument can be
given as a frame number (as #nn) or as a time (as [[HH:]MM:]SS[.mss]).
The argument is inclusive, i.e. if #100 is given, it will decode frame
100 and then stop.
-f BOOL, --rear-filter=BOOL
---------------------------
Default: off
This option controls rear-channel filtering in 2/0 output mode. The
filter is a 2nd order Butterworth filter with its -3 dB point at 7
kHz. There are two major applications for this feature:
o To provide proper Pro Logic downmix of the rear channels
o To reduce phasing problems in the downmix (washy sound) caused by
the rear channel downmix into the L R channels.
Usually the rear channels are phase-shifted 90 deg with respect to
the front channels before or inside the ac3 encoder. This is done to avoid
phasing problems when downmixing the program contents to two
channels. Some sources do not provide this shifting, and thus this
feature is added.
The filter provides an increasing phase shift according to
frequency. It is 90 deg at 7kHz.
NOTE: This option is only effective when 2/0 output mode is selected
(-d 2/0).
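The -3 dB and 90 deg figures can be checked numerically. The sketch
below assumes an analog 2nd order Butterworth low-pass prototype; that
the filter is a low-pass, and how Azid realizes it digitally, are
assumptions not confirmed by this document:

```python
import cmath
import math


def butterworth2_lowpass(f_hz, cutoff_hz=7000.0):
    """Frequency response of an analog 2nd order Butterworth low-pass."""
    s = 1j * f_hz / cutoff_hz                    # normalized frequency
    return 1.0 / (s * s + math.sqrt(2.0) * s + 1.0)


h = butterworth2_lowpass(7000.0)
level_db = 20.0 * math.log10(abs(h))             # close to -3 dB
phase_deg = math.degrees(cmath.phase(h))         # close to -90 (a 90 deg lag)
```

Evaluating the same function at lower frequencies shows the phase lag
growing smoothly from 0 deg towards 90 deg at the cutoff.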
-F FILE_TYPE, --filetype=FILE_TYPE
----------------------------------
Default: wav
Selects the file type to generate. Possible values are:
o wav. Generates "normal" 16-bit wav.
o wav24. Generates 24-bit integer wav.
o wav32. Generates 32-bit integer wav.
o wav_float. Generates 32-bit floating-point (IEEE) wav.
o pcm. Generates 16-bit PCM (equal to wav, only without
the wav-header).
o pcm32. Generates 32-bit integer PCM.
o pcm_float. Generates 32-bit floating-point (IEEE) PCM
output.
-g GAIN, --gain=GAIN
--------------------
Default: 1.0 (or 0db)
This option controls the main (output speaker) gain. The value can be
given in db's (by specifying "db" after the argument) or a positive
numerical value. Examples: -g-3db, -g5.3, -g6db
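A db value and a plain numerical value describe the same thing on
different scales. The sketch below converts either form to a linear
factor, assuming the usual amplitude convention of 20*log10 (which is
consistent with "50% (-6db)" in the -c section); parse_gain is a
hypothetical helper, not Azid's parser:

```python
def parse_gain(text):
    """Convert a -g style argument to a linear amplitude factor."""
    text = text.strip().lower()
    if text.endswith("db"):
        return 10.0 ** (float(text[:-2]) / 20.0)   # amplitude decibels
    value = float(text)
    if value < 0:
        raise ValueError("a plain gain value must not be negative")
    return value


half_power = parse_gain("-3db")    # roughly 0.708
doubled = parse_gain("6db")        # roughly 2.0
plain = parse_gain("5.3")          # taken as-is
```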
-i FILE, --script=FILE
----------------------
This option enables you to set all azid options from a command script
file. This file uses the following syntax:
o All lines beginning with '#' or ';' are regarded as comments
o Blank lines will be ignored
o An option is given as command[=argument]. The argument can be
omitted if the command does not require an argument.
o The commands are equal to the long names of the commandline options
(the -- options).
Example:
gain = 9dB
filetype = pcm32
no-output
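The syntax rules above are simple enough to express in a few lines.
The following sketch reads script text into (command, argument) pairs;
it illustrates the rules only and is not Azid's own parser:

```python
def parse_script(text):
    """Parse command-script text into a list of (command, argument) pairs."""
    options = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":
            continue                              # blank line or comment
        command, sep, argument = line.partition("=")
        # Commands without '=' take no argument (stored as None).
        options.append((command.strip(), argument.strip() if sep else None))
    return options


example = """
# decode to raw 32-bit PCM
gain = 9dB
filetype = pcm32
no-output
"""
parsed = parse_script(example)
```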
-l LFE_LEVEL, --lfe=LFE_LEVEL
-----------------------------
Default: 0.0
This controls the downmix-level of the LFE channel into the LFE output
speaker. I.e. if this option is set to a non-zero value, the LFE
channel output may be listened to with the -olfe option.
-L LRLFE_LEVEL, --lrlfe=LRLFE_LEVEL
-----------------------------------
Default: 1.0 (or 0db)
This controls the downmix-level of the LFE channel into the LR
channels.
-m MONO_MODE, --mono=MONO_MODE
------------------------------
Default: stereo
This option controls what type of 1+1 decoding should be used. A
special channel configuration exists, where the stream contains two
mono audio channels (called 1+1). Selectable options:
o ch1. Route channel 1 into center.
o ch2. Route channel 2 into center.
o mono. Route channel 1 + channel 2 into center.
o stereo. Route channel 1 into left and channel 2 into right.
-M BOOL, --matrix-log=BOOL
--------------------------
Default: off
This option makes the decoder print the downmix matrix with its
individual coefficients. A typical print would look something like
this:
+------ DOWNNMIX MATRIX -----
| IN0 IN1 IN2 IN3 IN4 IN5
| L : +0.2426 +0.1716 +0.0000 -0.1716 -0.1716 +0.2426
| C : +0.0000 +0.0000 +0.0000 +0.0000 +0.0000
| R : +0.0000 +0.1716 +0.2426 +0.1716 +0.1716 +0.2426
| SL : +0.0000 +0.0000 +0.0000
| SR : +0.0000 +0.0000 +0.0000
| LFE: +0.0000 +0.0000 +0.0000 +0.0000 +0.0000
+----------------------------
The channels on the top (INx) are the input channels. Which channel
each of these inputs represent can be read from the audio mode section
in the BSI printout:
| Audio mode: 2/2 L,R,SL,SR
Here L is IN0, R is IN1, etc. Note that the input channel gain does
not affect the downmix matrix coefficients, while -C and -S do.
-n BOOL, --norm=BOOL
--------------------
Default: false
This selects whether the decoder should apply dialog normalization
reduction. The normal dialogue level in a program is defined as a
reference of loudness, 0db. The BSI info variable "dialogue level"
tells how far this dialogue level is below 0db full-scale (FS), or
how much headroom there is above the dialogue level before clipping.
One of Dolby's intentions with this variable is to ensure that all
dialogue levels are played back with the same volume, regardless of
the program's amount of headroom. It is good to have when the movie
you're looking at is interrupted by a commercial break, where the
headroom varies enormously. (It prevents blowing your ears off when
the break comes.)
This feature is implemented by attenuating everything such that all
programs have 31 db headroom, regardless of their original headroom. For
a typical -27db headroom program, this will cause a -4db gain.
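The arithmetic behind that figure is a single subtraction. A minimal
sketch, with a hypothetical function name:

```python
TARGET_HEADROOM_DB = 31.0   # every program is brought to 31 db headroom


def dialnorm_gain_db(dialogue_level_db):
    """Gain in db that moves the dialogue level to -31 db FS.

    dialogue_level_db is the BSI value, e.g. -27.0 for "Dialogue level: -27dB".
    """
    return -TARGET_HEADROOM_DB - dialogue_level_db


typical = dialnorm_gain_db(-27.0)   # the -27db example from the text
```

A program that already has 31 db headroom gets a gain of 0 db, i.e.
it is left unchanged.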
-N, --no-output
---------------
This option causes the decoder not to produce any output, neither to
a wav file nor to the soundcard. This is ideal for running through
the file to check its validity. It requires no arguments.
-o SEQUENCE, --output=SEQUENCE
------------------------------
Default: l,r
This option controls which channels are output and in what
sequence. The selectable channels are all input channels
(l,c,r,s,sl,sr and lfe) and a special zero-data channel (0). Up to 6
channels may be listed with this command.
The -d option controls what kind of decoding target to use. This -o
option controls which of these channels to output (and their
sequence). Let's say for example that you have a 4 channel
soundcard. You would like to have left and right on one of the outputs
and surround left and surround right on the other. To do this you must
specify -ol,r,sl,sr.
-p PRESET, --preset=PRESET
--------------------------
Default: 2ch
Azid has some pre-defined settings. The default is 2/0 which all other
settings are derived from. The default command-prompt is:
-ezero -b1 -z1 -M0 -mstereo -ssurround -d2/0 -ol,r -L1 -l0
-Cbsi -Sbsi -cnone -n0 -g1
(which is the same as using -p2ch and not using the -p option at
all). The pre-defined options are:
o 2ch. This is the configuration for 2/0 channel decoding. This
option is really redundant, since this is the default preset.
o 4ch. This setting will produce a 4 channel output, 2/2. The
command prompt equivalent is: -d2/2 -ol,r,sl,sr
o 6ch. This setting will produce a 6 channel output, 3/2+lfe. The
command prompt equivalent is: -d3/2 -L0 -l1 -ol,r,sl,sr,c,lfe
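Since every preset is derived from the default, a preset can be
thought of as a set of overrides. The sketch below models only the
four options that the presets touch; the dictionaries and the
expand_preset helper are illustrative and not part of Azid:

```python
# Subset of the default settings listed above (-d2/0 -ol,r -L1 -l0).
DEFAULTS = {"d": "2/0", "o": "l,r", "L": "1", "l": "0"}

# Overrides taken from the command prompt equivalents of each preset.
PRESETS = {
    "2ch": {},
    "4ch": {"d": "2/2", "o": "l,r,sl,sr"},
    "6ch": {"d": "3/2", "L": "0", "l": "1", "o": "l,r,sl,sr,c,lfe"},
}


def expand_preset(name):
    """Return the option values that result from applying a preset."""
    settings = dict(DEFAULTS)
    settings.update(PRESETS[name])
    return settings


six_channel = expand_preset("6ch")
```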
-q, --no-logging
----------------
This option will disable the output logging. No BSI info, no settings,
and no bitstream errors will be shown. This option overrides the -b, -z
and -M options. It requires no argument.
-Q, --no-progress
-----------------
This option will disable the decoding progress indicator. It does not
require any arguments.
-s STEREO_MODE, --stereo=STEREO_MODE
------------------------------------
Default: surround
When 2/0 decoding is selected, this option controls what kind of
stereo downmix should be applied. Possible values are:
o mono. Used for mono downmixes
o stereo. Standard stereo downmix
o dpl (default). Dolby Pro Logic downmix is selected
o dplii. The new Dolby Pro Logic II downmix is selected
-S LEVEL, --slevel=LEVEL
------------------------
Default: BSI
This command option controls the surround downmix level into the LR
channels. Normally, the BSI section contains a field which tells the
decoder how to downmix the surround channels into the LR channels.
With this option, the user may override the BSI surround downmix level
and specify a custom value. Note that this option is only active when
the output decode mode (-d) is 2/x and the input stream is either x/1
or x/2.
Allowable values are gain values (either in db or as a positive
numerical value) or BSI. When BSI is selected, the surround downmix
level gets its value from the BSI section.
-w BOOL, --warn=BOOL
--------------------
Default: on
This option selects whether warning output should be printed. Warnings
are messages like "Downmix overflow" etc.
-z BOOL, --set-log=BOOL
-----------------------
Default: on
This option selects whether the current settings should be printed in
an easily readable form. Like this:
+------ SETTINGS -----
| Input channel configuration:
| Left : None compression, +0dB gain
| Center : None compression, +0dB gain
| Right : None compression, +0dB gain
| Sur Left : None compression, +0dB gain
| Sur Right: None compression, +0dB gain
| LFE : None compression, +0dB gain
| Output configuration: 2/0
| Ch0 [Left ]: None compression, +0dB gain
| Ch1 [Right ]: None compression, +0dB gain
| Output Dual mono mode: Stereo
| Output Stereo mode: Dolby surround compatible
| LFE levels: To LR +0dB, To LFE -INF
| Center mix level: +40.0dB
| Surround mix level: BSI
| Dialog normalization: No
+---------------------