- Archive-name: audio-fmts/part1
- Submitted-by: Guido van Rossum <guido@cwi.nl>
- Version: 3.08
- Last-modified: 22-Feb-1994
-
- FAQ: Audio File Formats
- =======================
-
- Table of contents
- -----------------
-
- Introduction
- Device characteristics
- Popular sampling rates
- Compression schemes
- Current hardware
- File formats
- File conversions
- Playing audio files on UNIX
- Playing audio files on micros
- The Sound Site Newsletter
- Posting sounds
-
- Appendices (in part 2):
-
- FTP access for non-internet sites
- AIFF Format (Audio IFF)
- The NeXT/Sun audio file format
- IFF/8SVX Format
- Playing sound on a PC
- The EA-IFF-85 documentation
- US Federal Standard 1016 availability
- Creative Voice (VOC) file format
- RIFF WAVE (.WAV) file format
- U-LAW and A-LAW definitions
- AVR File Format
- The Amiga MOD Format
-
-
- Introduction
- ------------
-
- This is version 3 of this FAQ, which I started in November 1991 under
- the name "The audio formats guide". I bumped the major version number
- again at the occasion of the split in two parts: part one is the main
- text and part two consists of the collection of appendices.
-
- I am posting this about once a fortnight, either unchanged (just to
- inform new readers), or updated (if I learn more or when new hardware
- or software becomes popular). I post to alt.binaries.sounds.{misc,d}
- and to comp.dsp, for maximal coverage of people interested in audio,
- and to {news,comp}.answers, for easy reference.
-
- The entire FAQ is also available by anonymous ftp from ftp.cwi.nl
- [192.16.184.180], directory pub/audio, files AudioFormats.{part1,part2}.
-
- BTW: All FAQs, including this one, are available for anonymous ftp on
- the archive site rtfm.mit.edu in directory /pub/usenet/news.answers/.
- The name under which a FAQ is archived appears in the "Archive-Name:"
- line at the top of the article. This FAQ is archived as
- audio-fmts/part[12].
-
- A companion posting with subject "Changes to: ..." is occasionally
- posted listing the diffs between a new version and the last. This is
- not reposted, and it is suppressed when the diffs are bigger than the
- new version.
-
- Send updates, comments and questions to <guido@cwi.nl>. I'd like to
- thank everyone who sent updates in the past.
-
- --Guido van Rossum, CWI, Amsterdam <guido@cwi.nl>
-
-
- Device characteristics
- ----------------------
-
- In this text, I will only use the term "sample" to refer to a single
- output value from an A/D converter, i.e., a small integer number
- (usually 8 or 16 bits).
-
- Audio data is characterized by the following parameters, which
- correspond to settings of the A/D converter when the data was
- recorded. Naturally, the same settings must be used to play the data.
-
- - sampling rate (in samples per second), e.g. 8000 or 44100
-
- - number of bits per sample, e.g. 8 or 16
-
- - number of channels (1 for mono, 2 for stereo, etc.)
-
- Approximate sampling rates are often quoted in Hz or kHz ([kilo-]
- Hertz); however, the politically correct term is samples per second
- (samples/sec). Sampling rates are always measured per channel, so for
- stereo data recorded at 8000 samples/sec, there are actually 16000
- samples in a second. I will sometimes write 8 k as a shorthand for
- 8000 samples/sec.
-
- Multi-channel samples are generally interleaved on a frame-by-frame
- basis: if there are N channels, the data is a sequence of frames,
- where each frame contains N samples, one from each channel. (Thus,
- the sampling rate is really the number of *frames* per second.) For
- stereo, the left channel usually comes first.
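-
- As a rough illustration of the frame layout described above, here is
- a small Python sketch (not taken from any package mentioned in this
- FAQ). The function names are made up for the example; samples are
- assumed to be already decoded into a flat list of integers.
-
```python
def deinterleave(samples, nchannels):
    """Split a flat sequence of interleaved samples into one list per channel."""
    if nchannels < 1:
        raise ValueError("need at least one channel")
    if len(samples) % nchannels != 0:
        raise ValueError("sample count is not a whole number of frames")
    # Channel c owns every nchannels-th sample, starting at offset c.
    return [list(samples[c::nchannels]) for c in range(nchannels)]


def bytes_per_second(rate, bits, nchannels):
    """Raw data rate of uncompressed samples; rate is frames per second."""
    return rate * nchannels * bits // 8


# Three stereo frames, left channel first: L0 R0 L1 R1 L2 R2
left, right = deinterleave([10, -10, 20, -20, 30, -30], 2)
print(left, right)                      # [10, 20, 30] [-10, -20, -30]
print(bytes_per_second(44100, 16, 2))   # 176400
```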
-
- The specification of the number of bits for U-LAW (pronounced mu-law
- -- the u really stands for the Greek letter mu) samples is somewhat
- problematic. These samples are logarithmically encoded in 8 bits,
- like a tiny floating point number; however, their dynamic range is
- that of 12 bit linear data. Source for converting to/from U-LAW
- (written by Jef Poskanzer) is distributed as part of the SOX package
- mentioned below; it can easily be ripped apart to serve in other
- applications. The official definition is the CCITT standard G.711.
-
- There exists another encoding similar to U-LAW, called A-LAW, which
- is used as a European telephony standard. There is less support for
- it in UNIX workstations.
-
- (See the Appendix for some formulae describing U-LAW and A-LAW.)
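-
- To give a feeling for the logarithmic encoding, the Python sketch
- below implements the continuous mu-law companding curve with mu = 255
- on samples normalized to the range -1.0..1.0. This is only the
- idealized curve: it does not reproduce the piecewise-linear 8-bit
- code words of G.711, and it is not the Poskanzer source distributed
- with SOX. The function names are invented for the example.
-
```python
import math

MU = 255.0  # companding constant of the mu-law curve


def ulaw_compress(x):
    """Map a linear sample in [-1.0, 1.0] to a companded value in [-1.0, 1.0]."""
    x = max(-1.0, min(1.0, x))
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def ulaw_expand(y):
    """Inverse of ulaw_compress."""
    y = max(-1.0, min(1.0, y))
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


# Quiet samples get a disproportionately large share of the output range.
for x in (0.001, 0.01, 0.1, 1.0):
    print(x, round(ulaw_compress(x), 4))
```
-
- Quantizing the companded value to 8 bits is what gives small
- amplitudes finer resolution than an 8-bit linear encoding would.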
-
-
- Popular sampling rates
- ----------------------
-
- Some sampling rates are more popular than others, for various reasons.
- Some recording hardware is restricted to (approximations of) some of
- these rates, some playback hardware has direct support for some. The
- popularity of divisors of common rates can be explained by the
- simplicity of clock frequency dividing circuits :-).
-
- Samples/sec  Description
-
- 5500         One fourth of the Mac sampling rate (rarely seen).
-
- 7333         One third of the Mac sampling rate (rarely seen).
-
- 8000         Exactly 8000 samples/sec is a telephony standard that
-              goes together with U-LAW (and also A-LAW) encoding.
-              Some systems use a slightly different rate; in
-              particular, the NeXT workstation uses 8012.8210513,
-              apparently the rate used by Telco CODECs.
-
- 11 k         Either 11025, a quarter of the CD sampling rate,
-              or half the Mac sampling rate (perhaps the most
-              popular rate on the Mac).
-
- 16000        Used by, e.g. the G.722 compression standard.
-
- 18.9 k       CD-ROM/XA standard.
-
- 22 k         Either 22050, half the CD sampling rate, or the Mac
-              rate; the latter is precisely 22254.545454545454 but
-              usually misquoted as 22000. (Historical note:
-              22254.5454... was the horizontal scan rate of the
-              original 128k Mac.)
-
- 32000        Used in digital radio, NICAM (Nearly-Instantaneous
-              Companded Audio Multiplex [IBA/BREMA/BBC]) and other
-              TV work, at least in the UK; also long play DAT and
-              Japanese HDTV.
-
- 37.8 k       CD-ROM/XA standard for higher quality.
-
- 44056        This weird rate is used by professional audio
-              equipment to fit an integral number of samples in a
-              video frame.
-
- 44100        The CD sampling rate. (DAT players recording
-              digitally from CD also use this rate.)
-
- 48000        The DAT (Digital Audio Tape) sampling rate for
-              domestic use.
-
- Files sampled on SoundBlaster hardware have sampling rates that are
- divisors of 1000000.
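-
- A small Python sketch of what this means in practice, assuming that
- "divisors of 1000000" means rates of the form 1000000/n for a whole
- number n. The function name is made up, and no attempt is made to
- model which values of n the hardware actually accepts.
-
```python
def nearest_divisor_rate(target, clock=1000000):
    """Return (n, clock / n) such that clock / n is as close as possible to target."""
    if target <= 0:
        raise ValueError("target rate must be positive")
    n = max(1, round(clock / target))
    # Rounding clock/target does not always minimize the error in the
    # rate itself, so compare the immediate neighbours as well.
    candidates = [k for k in (n - 1, n, n + 1) if k >= 1]
    best = min(candidates, key=lambda k: abs(clock / k - target))
    return best, clock / best


for wanted in (8000, 11025, 22050):
    n, rate = nearest_divisor_rate(wanted)
    print(wanted, "->", n, round(rate, 2))
```
-
- Under this assumption 8000 is hit exactly (n = 125), while 11025 and
- 22050 can only be approximated, which is one way files end up with
- slightly odd rates.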
-
- While professional musicians disagree, most people don't have a problem
- if recorded sound is played at a slightly different rate, say, 1-2%.
- On the other hand, if recorded data is being fed into a playback
- device in real time (say, over a network), even the smallest
- difference in sampling rate can frustrate the buffering scheme used...
-
- There may be an emerging tendency to standardize on only a few
- sampling rates and encoding styles, even if the file formats may
- differ. The suggested rates and styles are:
-
- rate (samp/sec)  style                  mono/stereo
-
- 8000             8-bit U-LAW            mono
- 22050            8-bit linear unsigned  mono and stereo
- 44100            16-bit linear signed   mono and stereo
-
-
- Compression schemes
- -------------------
-
- Strange though it seems, audio data is remarkably hard to compress
- effectively. For 8-bit data, a Huffman encoding of the deltas between
- successive samples is relatively successful. For 16-bit data,
- companies like Sony and Philips have spent millions to develop
- proprietary schemes. Information about PASC (Philips' scheme) can be
- found in Advanced Digital Audio by Ken C. Pohlmann.
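-
- The Python sketch below hints at why coding the deltas helps for
- 8-bit data. It does not build a Huffman code; it only compares the
- Shannon entropy (bits per symbol) of the raw samples with that of
- their deltas, and a Huffman code comes within one bit per symbol of
- that figure. The test signal (a slow sine wave) and all names are
- assumptions made for the example.
-
```python
import math
from collections import Counter


def deltas(samples):
    """Differences between successive 8-bit samples, modulo 256."""
    out, prev = [], 0
    for s in samples:
        out.append((s - prev) & 0xFF)
        prev = s
    return out


def undo_deltas(ds):
    """Rebuild the original 8-bit samples from their deltas."""
    out, prev = [], 0
    for d in ds:
        prev = (prev + d) & 0xFF
        out.append(prev)
    return out


def entropy_bits(values):
    """Shannon entropy of the value distribution, in bits per symbol."""
    counts = Counter(values)
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


# A slowly varying unsigned 8-bit test tone.
tone = [128 + round(100 * math.sin(2 * math.pi * i / 200)) for i in range(2000)]
print("raw   :", round(entropy_bits(tone), 2), "bits/sample")
print("deltas:", round(entropy_bits(deltas(tone)), 2), "bits/sample")
```
-
- Real recordings are noisier than this tone, so the gain is smaller
- in practice.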
-
- Public standards for voice compression are slowly gaining popularity,
- e.g. CCITT G.721 (ADPCM at 32 kbits/sec) and G.723 (ADPCM at 24 and 40
- kbits/sec). (ADPCM == Adaptive Delta Pulse Code Modulation.) Sun
- Microsystems has placed the source code of a portable implementation of
- these algorithms (as well as G.711, which defines A-LAW and U-LAW) in
- the public domain (needless to say, their proprietary implementation
- distributed in binary form with Solaris is better :-). One place to
- ftp this source code from is ftp.cwi.nl:/pub/audio/ccitt-adpcm.tar.Z.
- Source for another 32 kbits/sec ADPCM implementation, assumed to be
- compatible with Intel's DVI audio format, can be ftp'ed from
- ftp.cwi.nl:/pub/audio/adpcm.shar. (** NOTE: if you are using v1.0,
- you should get v1.1, released 17-Dec-1992, which fixes a serious bug
- -- the quality of v1.1 is claimed to be better than U-LAW **)
-
- GSM 06.10 is a speech encoding in use in Europe that compresses 160
- 13-bit samples into 260 bits (or 33 bytes), i.e. 1650 bytes/sec (at
- 8000 samples/sec). A free implementation can be ftp'ed from
- tub.cs.tu-berlin.de, file /pub/tubmik/gsm-1.0.tar.Z.
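-
- The arithmetic behind these GSM figures, spelled out as a few lines
- of Python. Only the numbers quoted above are used; the comparison
- with 8-bit U-LAW at the same sampling rate is added for scale.
-
```python
SAMPLES_PER_FRAME = 160
BITS_PER_FRAME = 260
RATE = 8000  # samples/sec

bytes_per_frame = -(-BITS_PER_FRAME // 8)        # round up to whole bytes
frames_per_sec = RATE // SAMPLES_PER_FRAME
gsm_bytes_per_sec = bytes_per_frame * frames_per_sec
ulaw_bytes_per_sec = RATE                        # one byte per sample

print(bytes_per_frame)      # 33
print(frames_per_sec)       # 50
print(gsm_bytes_per_sec)    # 1650
print(round(ulaw_bytes_per_sec / gsm_bytes_per_sec, 2))  # 4.85
```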
-
- There are also two US federal standards, 1016 (Code excited linear
- prediction (CELP), 4800 bits/s) and 1015 (LPC-10E, 2400 bits/s). See
- also the appendix for 1016.
-
- Tony Robinson <ajr@eng.cam.ac.uk> has written a good, FAST lossless
- compression program for lots of different audio formats (particularly
- good for WAV and MOD files). The software is available by anonymous
- ftp from svr-ftp.eng.cam.ac.uk [129.169.24.20], directory misc, file
- shorten-1.08.tar.Z.
-
- (Note that U-LAW and silence detection can also be considered
- compression schemes.)
-
- Here's a note about audio codings by Van Jacobson <van@ee.lbl.gov>:
- Several people used the words "LPC" and "CELP" interchangeably. They
- are very different. An LPC (Linear Predictive Coding) coder fits
- speech to a simple, analytic model of the vocal tract, then throws
- away the speech & ships the parameters of the best-fit model. An LPC
- decoder uses those parameters to generate synthetic speech that is
- usually more-or-less similar to the original. The result is
- intelligible but sounds like a machine is talking. A CELP (Code
- Excited Linear Predictor) coder does the same LPC modeling but then
- computes the errors between the original speech & the synthetic model
- and transmits both model parameters and a very compressed
- representation of the errors (the compressed representation is an
- index into a 'code book' shared between coders & decoders -- this is
- why it's called "Code Excited"). A CELP coder does much more work
- than an LPC coder (usually about an order of magnitude more) but the
- result is much higher quality speech: The FIPS-1016 CELP we're working
- on is essentially the same quality as the 32Kb/s ADPCM coder but uses
- only 4.8Kb/s (the same as the LPC coder).
-
- The comp.compression FAQ has some text on the 6:1 audio compression
- scheme used by MPEG (a video compression standard-to-be). It's
- interesting to note that video compression reaches much higher ratios
- (like 26:1). This FAQ is ftp'able from rtfm.mit.edu [18.72.1.58] in
- directory /pub/usenet/news.answers/compression-faq, files part1 and
- part2.
-
- Comp.compression also carries a regular posting "How to uncompress
- anything" by David Lemson <lemson@uiuc.edu>, which (tersely) hints on
- which program you need to uncompress a file whose name ends in .<foo>
- for almost any conceivable <foo>. Ftp'able from ftp.cso.uiuc.edu
- (128.174.5.59) in the directory /doc/pcnet as the file compression.
-
- Documentation on a digital cellular telephone system by Qualcomm Inc.
- can be ftp'ed from ftp.qualcomm.com:/pub/cdma; the vocoder is in
- appendix A.
-
- Apple has an Audio Compression/Expansion scheme called ACE (on the GS)
- / MACE (on the Macintosh). It's a lossy scheme that attempts to
- predict where the wave will go on the next sample. There's very little
- quality change on 8:4 compression, somewhat more for 8:3. It does
- guarantee exactly 50% or 62.5% compression, though. I believe MACE
- uses larger ratios/more loss, but I'm unsure of the specific numbers.
- (Marc Sira)
-
-
- Current hardware
- ----------------
-
- I am aware of the following computer systems that can play back and
- (sometimes) record audio data, with their characteristics. Note that
- for most systems you can also buy "professional" sampling hardware,
- which supports much better quality, e.g. >= 44.1 k 16 bits stereo.
- The characteristics listed here are a rough estimate of the
- capabilities of the basic hardware only (and even here I am on thin
- ice, with systems becoming ever more powerful).
-
- machine                     bits        max sampling rate     #output channels
-
- Mac (all types)             8           22k                   1
- Mac (newer ones)            16          64k                   4(128)
- Apple IIgs                  8           32k / >70k            16(st)
- PC/soundblaster pro         8           ?/(22k st, 44.1k mo)  1(st)
- PC/soundblaster 16          16          44.1k                 1(st)
- PC/pas                      8           44.1k st, 88.2k mo    1(st)
- PC/pas-16                   16          44.1k st, 88.2k mo    1(st)
- PC/turtle beach multisound  16          44.1k                 1(st)
- PC/cards with aria chipset  16          44.1k                 1(st)
- PC/roland rap-10            16          44.1k                 1(st)
- PC/gravis ultrasound        8/16        44.1k                 14-32(st)
- Atari ST                    8           22k                   1
- Atari STE,TT                8           50k                   2
- Atari Falcon 030            16          50k                   8(st)
- Amiga                       8           varies above 29k      4(st)
- Sun Sparc                   U-LAW       8k                    1
- Sun Sparcst. 10             U-LAW,8,16  48k                   1(st)
- NeXT                        U-LAW,8,16  44.1k                 1(st)
- SGI Indigo                  8,16        48k                   4(st)
- SGI Indigo2,Indy            8,16        48k                   16(st,4-channel)
- Acorn Archimedes            ~U-LAW      ~180k                 8(st)
- Sony NWS-3xxx               U,A,8,16    8-37.8k               1(st)
- Sony NWS-5xxx               U,A,8,16    8-48k                 1(st)
- VAXstation 4000             U-LAW       8k                    1
- DEC 3000/300-500            U-LAW       8k                    1
- DEC 5000/20-25              U-LAW       8k                    1
- Tandy 1000/*L*              8           22k                   3
- Tandy 2500                  8           22k                   3
- HP9000/705,710,425e         U,A-LAW,16  8k                    1
- HP9000/715,725,735          U,A-LAW,16  48k                   1(st)
- HP9000/755 option:          U,A-LAW,16  48k                   1(st)
- NCD MCX terminal            U,A,8,16    52k                   1(st)
-
- 4(st) means "four voices, stereo"; sampling rates xx/yy are
- different recording/playback rates; *L* is any type with 'L' in it.
-
- All these machines can play back sound without additional hardware,
- although the needed software is not always standard; also, some
- machines need external hardware to record sound (or to record at
- higher quality, like the NeXT, whose built-in sampling hardware only
- does 8000 samples/sec in U-LAW). Please don't send me details on
- optional or 3rd party hardware; there is too much and it is really
- beyond the scope of this FAQ. In particular, there is a separate
- newsgroup devoted to PC sound cards: comp.sys.ibm.pc.soundcard, which
- has a FAQ of its own (also posted to comp.answers and news.answers).
-
- The new VAXstation 4000 (VLC and model 60) series lets you PLAY audio
- (.au) files, and the package DECsound will let you do the recording.
- In fact, DECsound is given away free with Motif 1.1 and supports the
- VAXstation, Sun SPARCstation, DECvoice, and DECaudio devices. Sun
- sound files work without change. The Alpha systems (DEC 3000 Model
- 300, 400, 500) also have DECsound bundled with Motif.
-
- Notes for the DECstation 5000/20-25: You need either XMedia tools from
- DEC ($$$$), or the AudioFile package (which works nicely) from
- crl.dec.com (see below). The audio device is "/dev/bba"; you cannot
- send ".au" files directly to the device. Instead, the XMedia/AF software
- provides an "audioserver", which must be running to play/record sounds.
-
- The SGI Personal IRIS 4D/30 and 4D/35 have the same capabilities as
- the Indigo. The audio board was optional on the 4D/30.
- The Indigo2 and Indy features are a superset of the Indigo features.
-
- The new Apple Macs have more powerful audio hardware; the latest
- models have built-in microphones.
-
- Software exists for the PC that can play sound on its 1-bit speaker
- using pulse width modulation (see appendix); the Soundblaster board
- records at rates up to 13 k and plays back up to 22 k (weird
- combination, but that's the way it is).
-
- Here's some info about the newest Atari machine, the Falcon030. This
- machine has stereo 16 bit CODECs and a 32 MHz Motorola 56001 that can
- handle 8 channels of 16 bit audio, up to 50 kHz/channel with
- simultaneous playback and record. The Falcon DMA sound engine is also
- compatible with the 8 bit stereo DMA used on the STe and TT. All of
- these systems use signed data.
-
- On the NeXT, the Motorola 56001 DSP chip is programmable and you can
- (in principle) do what you want. The SGI Indigo uses the same DSP chip but
- it can't be programmed by users -- SGI prefers to offer it as a shared
- system resource to multiple applications, thus enabling developers to
- program audio with their Audio Library and avoid code modifications
- for execution on future machines with different audio hardware, i.e. a
- different DSP. For example, the Indigo2 and Indy do not have a DSP chip.
-
- The Amiga also has a 6-bit volume, which can be used to produce
- something like a 14-bit output for each voice. The hardware can also
- use one of each voice-pair to modulate the other in FM (period) or AM
- (volume, 6-bits).
-
- The Acorn Archimedes uses a variation on U-LAW with the bit order
- reversed and the sign bit in bit 0. Being a 'minority' architecture,
- Arc owners are quite adept at converting sound/image formats from
- other machines, and it is unlikely that you'll ever encounter sound in
- one of the Arc's own formats (there are several).
-
- The NCD MCX terminal has audio integrated with its X server. The
- NCDAudio server is an extension of the X server, working together with
- it, with stress on the networking capability of sound transmission.
- The NCDAudio API provides format handling (ULAW8, Linear Unsig 8,
- Linear Sig 8, Linear Sig 16 MSB, Linear Unsig 16 MSB), flowing (to the
- server, from the server, to the i/o, from the i/o), wave form
- generators (Square, Sine, Saw, Constant) and the capability of area
- broadcast using UDP. Provision for manipulating data files
- (SND, WAV, VOC & AU) is also provided.
-
- CD-I machines form a special category. The following formats are used:
-
- - PCM 44.1 kHz standard CD format
- - ADPCM - Adaptive Delta PCM
-   - Level A 37.8 kHz 8-bit
-   - Level B 37.8 kHz 4-bit
-   - Level C 18.9 kHz 4-bit
-
-
- File formats
- ------------
-
- Historically, almost every type of machine used its own file format
- for audio data, but some file formats are more generally applicable,
- and in general it is possible to define conversions between almost any
- pair of file formats -- sometimes losing information, however.
-
- File formats are a separate issue from device characteristics. There
- are two types of file formats: self-describing formats, where the
- device parameters and encoding are made explicit in some form of
- header, and "raw" formats, where the device parameters and encoding
- are fixed.
-
- Self-describing file formats generally define a family of data
- encodings, where a header field indicates the particular encoding
- variant used. Headerless formats define a single encoding and usually
- allow no variation in device parameters (except sometimes sampling
- rate, which can be a pain to figure out other than by listening to the
- sample).
-
- The header of self-describing formats contains the parameters of the
- sampling device and sometimes other information (e.g. a
- human-readable description of the sound, or a copyright notice). Most
- headers begin with a simple "magic word". (Some formats do not simply
- define a header format, but may contain chunks of data intermingled
- with chunks of encoding info.) The data encoding defines how the
- actual samples are stored in the file, e.g. signed or unsigned, as
- bytes or short integers, in little-endian or big-endian byte order,
- etc. Strictly speaking, channel interleaving is also part of the
- encoding, although so far I have seen little variation in this area.
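-
- A short Python sketch of why the data encoding matters: the same
- four bytes decode to quite different sample values depending on the
- assumed width, signedness and byte order. The byte values are
- arbitrary, chosen only for the example.
-
```python
import struct

raw = bytes([0x12, 0x34, 0xFF, 0xFE])

big_16 = struct.unpack(">2h", raw)      # two signed 16-bit, big-endian
little_16 = struct.unpack("<2h", raw)   # two signed 16-bit, little-endian
unsigned_8 = struct.unpack("4B", raw)   # four unsigned bytes
signed_8 = struct.unpack("4b", raw)     # four signed bytes

print(big_16)       # (4660, -2)
print(little_16)    # (13330, -257)
print(unsigned_8)   # (18, 52, 255, 254)
print(signed_8)     # (18, 52, -1, -2)
```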
-
- Some file formats apply some kind of compression to the data, e.g.
- Huffman encoding, or simple silence deletion.
-
- Here's an overview of popular file formats.
-
- Self-describing file formats
- ----------------------------
-
- extension, name     origin                  variable parameters (fixed; comments)
-
- .au or .snd         NeXT, Sun               rate, #channels, encoding, info string
- .aif(f), AIFF       Apple, SGI              rate, #channels, sample width, lots of info
- .aif(f), AIFC       Apple, SGI              same (extension of AIFF with compression)
- .iff, IFF/8SVX      Amiga                   rate, #channels, instrument info (8 bits)
- .voc                Soundblaster            rate (8 bits/1 ch; can use silence deletion)
- .wav, WAVE          Microsoft               rate, #channels, sample width, lots of info
- .sf                 IRCAM                   rate, #channels, encoding, info
- none, HCOM          Mac                     rate (8 bits/1 ch; uses Huffman compression)
- none, MIME          Internet                (see below)
- none, NIST SPHERE   DARPA speech community  (see below)
- .mod or .nst        Amiga                   (see below)
-
- Note that the filename extension ".snd" is ambiguous: it can be either
- the self-describing NeXT format or the headerless Mac/PC format, or
- even a headerless Amiga format.
-
- I know nothing for sure about the origin of HCOM files, only that
- there are a lot of them floating around on our system and probably at
- FTP sites over the world. The filenames usually don't have a ".hcom"
- extension, but this is what SOX (see below) uses. The file format
- recognized by SOX includes a MacBinary header, where the file
- type field is "FSSD". The data fork begins with the magic word "HCOM"
- and contains Huffman compressed data; after decompression it is 8
- bits unsigned data.
-
- IFF/8SVX allows for amplitude contours for sounds (attack/decay/etc).
- Compression is optional (and extensible); volume is variable; author,
- notes and copyright properties; etc.
-
- AIFF, AIFC and WAVE are similar in spirit but allow more freedom in
- encoding style (other than 8 bit/sample), amongst others.
-
- There are other sound formats in use on Amiga by digitizers and music
- programs, such as IFF/SMUS.
-
- Appendices describe the NeXT and VOC formats; pointers to more info
- about AIFF, AIFC, 8SVX and WAVE (which are too complex to describe
- here) are also in appendices.
-
- DEC systems (e.g. DECstation 5000) use a variant of the NeXT format
- that uses little-endian encoding and has a different magic number
- (0x0064732E in little-endian encoding).
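-
- A hedged Python sketch of reading such a header and telling the two
- byte orders apart. It assumes the usual NeXT/Sun layout of six
- 32-bit words (magic, data offset, data size, encoding, sampling rate,
- channels) and the encoding codes 1, 2 and 3 for 8-bit U-LAW, 8-bit
- linear and 16-bit linear; none of that is spelled out in this
- section, so check it against the appendix. It also assumes that
- every field of the DEC variant is little-endian, and the function
- name is made up.
-
```python
import struct

SUN_MAGIC = 0x2E736E64   # ".snd" read as a big-endian 32-bit word
DEC_MAGIC = 0x0064732E   # value quoted above, read as little-endian

ENCODINGS = {1: "8-bit U-LAW", 2: "8-bit linear", 3: "16-bit linear"}


def parse_au_header(data):
    """Decode the six fixed header words of a NeXT/Sun style audio file."""
    if len(data) < 24:
        raise ValueError("too short to hold a header")
    if struct.unpack(">I", data[:4])[0] == SUN_MAGIC:
        order, name = ">", "big"
    elif struct.unpack("<I", data[:4])[0] == DEC_MAGIC:
        order, name = "<", "little"
    else:
        raise ValueError("magic number not recognized")
    _, offset, size, encoding, rate, channels = struct.unpack(order + "6I", data[:24])
    return {
        "byte_order": name,
        "data_offset": offset,
        "data_size": size,
        "encoding": ENCODINGS.get(encoding, "code %d" % encoding),
        "rate": rate,
        "channels": channels,
    }


example = struct.pack(">6I", SUN_MAGIC, 24, 16000, 1, 8000, 1)
print(parse_au_header(example))
```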
-
- Standard file formats used in the CD-I world are IFF but on the disc
- they're in realtime files.
-
- An interesting "interchange format" for audio data is described in the
- proposed Internet Standard "MIME", which describes a family of
- transport encodings and structuring devices for electronic mail. This
- is an extensible format, and initially standardizes a type of audio
- data dubbed "audio/basic", which is 8-bit U-LAW data sampled at 8000
- samples/sec.
-
- The "IRCAM" sound file system has now been superseded by the so-called
- "BICSF" (for Berkeley/IRCAM/CARL Sound File system) software release.
- More recently, there has been an effort at Princeton (Prof. Paul
- Lansky) and Stanford (Stephen Travis Pope) to standardize several
- extensions to BICSF. A description of BICSF and the
- Princeton/Stanford extensions is available by anonymous ftp from
- ftp.cwi.nl [192.16.184.180], in directory /pub/audio/BICSF-info. This
- file contains further ftp pointers to software.
-
- A sound file format popular in the DARPA speech community is the NIST
- SPHERE standard. The most recent version of the SPHERE package is
- available via anonymous ftp from jaguar.ncsl.nist.gov [129.6.48.157]
- in compressed tar form as "sphere-v.tar.Z" (where "v" is the version
- code). The NIST SPHERE header is an object-oriented, 1024-byte
- blocked, ASCII structure which is prepended to the waveform data. The
- header is composed of a fixed-format portion followed by an
- object-oriented variable portion. I have placed a short description
- of NIST SPHERE on ftp.cwi.nl:/pub/audio/NIST-SPHERE.
-
- Finally, a somewhat different but popular format is "MOD"; such files
- usually have extension ".mod" or ".nst" (they can also have a prefix
- of "mod."). This format originated on the Amiga, but players now exist for
- many platforms. MOD files are music files containing 2 parts: (1) a
- bank of digitized samples; (2) sequencing information describing how
- and when to play the samples. See the appendix "The Amiga MOD Format"
- for a description of this file format (and pointers to ftp'able
- players and example MOD files).
-
- Headerless file formats
- -----------------------
-
- extension     origin        parameters
- or name
-
- .snd, .fssd   Mac, PC       variable rate, 1 channel, 8 bits unsigned
- .ul           US telephony  8 k, 1 channel, 8 bit "U-LAW" encoding
- .snd?         Amiga         variable rate, 1 channel, 8 bits signed
-
- It is usually easy to distinguish 8-bit signed formats from unsigned
- by looking at the beginning of the data with 'od -b <file | head';
- since most sounds start with a little bit of silence containing small
- amounts of background noise, the signed formats will have an abundance
- of bytes with values 0376, 0377, 0, 1, 2, while the unsigned formats
- will have 0176, 0177, 0200, 0201, 0202 instead. (Using "od -c" will
- also show any headers that are tacked in front of the file.)
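-
- The same eyeball test can be sketched in Python. This is only a
- heuristic under the assumption stated above (the file starts with
- near-silence); the window size, the thresholds and the function name
- are arbitrary choices for the example.
-
```python
def guess_signedness(data, window=512):
    """Guess whether headerless 8-bit samples are signed or unsigned."""
    head = data[:window]
    if not head:
        raise ValueError("no data")
    # Signed silence sits around 0, i.e. bytes 0, 1, 2 and 0376, 0377.
    near_zero = sum(1 for b in head if b <= 4 or b >= 252)
    # Unsigned silence sits around 0200 (128).
    near_mid = sum(1 for b in head if 124 <= b <= 132)
    if near_zero > near_mid:
        return "signed"
    if near_mid > near_zero:
        return "unsigned"
    return "unknown"


print(guess_signedness(bytes([0o376, 0o377, 0, 1, 2] * 20)))          # signed
print(guess_signedness(bytes([0o176, 0o177, 0o200, 0o201, 0o202] * 20)))  # unsigned
```
-
- A header in front of the data, or a sound that starts loud, will
- fool it, so treat the answer as a hint only.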
-
- The Apple IIgs records raw data in the same format as the Mac, but
- uses a 0 byte as a terminator; samples with value 0 are replaced by 1.
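-
- In Python, the Apple IIgs convention described above amounts to the
- following sketch (function names invented; it assumes nothing
- follows the terminator in a IIgs file).
-
```python
def mac_raw_to_iigs(data):
    """Replace 0-valued samples by 1 and append the 0 terminator."""
    return bytes(1 if b == 0 else b for b in data) + b"\x00"


def iigs_to_mac_raw(data):
    """Keep everything before the first 0 byte (or all of it if none)."""
    end = data.find(b"\x00")
    return data if end < 0 else data[:end]


mac = bytes([128, 0, 255, 0, 127])
iigs = mac_raw_to_iigs(mac)
print(list(iigs))                    # [128, 1, 255, 1, 127, 0]
print(list(iigs_to_mac_raw(iigs)))   # [128, 1, 255, 1, 127]
```
-
- Note that the round trip is not exact: samples that were 0 come back
- as 1.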
-
- Sound formats and the Apple Macintosh
- -------------------------------------
-
- (Thanks to Bill Houle, <Bill.Houle@SanDiegoCA.NCR.COM>)
-
-                           SOX/DOS    MAC
- Sound Format              file ext   type  Mac program to convert to 'snd'
- ------------------------  ---------  ----  -------------------------------
- Mac snd                   .snd       sfil  [n/a]
- Amiga IFF/8SVX            .iff             AmigaSndConverter, BST
- Amiga SoundTracker        .mod       STrk  ModVoicer
- Audio IFF                 .aiff      AIFF  SoundExtractor, Sample Editor,
-                                            UUTool, BST, M5Mac
- DSP Designer                         DSPs  SoundHack
- IRCAM                     .sf        IRCM  SoundHack
- MacMix                               MSND  SoundHack
- RIFF WAVE                 .wav             SoundExtractor, BST, Balthazar
- SoundBlaster              .voc             SoundExtractor, BST
- SoundDesigner/AudioMedia             Sd2f  SoundHack
- Sound[Edit|Cap|Wave]      .hcom      FSSD  SoundExtractor, SoundEdit,
-                                            Wavicle, BST
- Sun uLaw/Next .snd        .au/.snd   NxTS  SoundExtractor, SoundHack,
-                                            au<->snd, UUTool, BST
-
-
- File conversions
- ----------------
-
- SOX (UNIX, PC, Amiga)
- ---------------------
-
- The most versatile tool for converting between various audio formats
- is SOX ("Sound Exchange"). It can read and write various types of
- audio files, and optionally applies some special effects (e.g. echo,
- channel averaging, or rate conversion).
-
- SOX recognizes all filename extensions listed above except ".snd",
- which would be ambiguous anyway, and ".wav" (but there's a patch, see
- below). Use type ".au" for NeXT ".snd" files. Mac and PC ".snd"
- files are completely described by these parameters:
-
- -t raw -b -u -r 11000
-
- (or -r 22000 or -r 7333 or -r 5500; 11000 seems to be the most common
- rate).
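-
- For comparison only (this is not SOX and does not use its options):
- the same three facts, raw data, unsigned bytes and a sampling rate,
- are all that is needed to wrap such a file in a WAVE header with the
- "wave" module from Python's standard library. 8-bit WAVE data is
- unsigned, so the samples can be copied unchanged. The function name
- and the default rate of 11000 are choices made for this sketch.
-
```python
import io
import wave


def raw_u8_to_wav(raw, rate=11000):
    """Wrap headerless mono 8-bit unsigned samples in a WAVE container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(1)      # one byte per sample
        w.setframerate(rate)   # must be guessed for headerless data
        w.writeframes(raw)
    return buf.getvalue()


wav_bytes = raw_u8_to_wav(bytes([128, 130, 126, 128]))
print(wav_bytes[:4], wav_bytes[8:12])   # b'RIFF' b'WAVE'
```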
-
- The source for SOX, version 6, patchlevel 8, was posted to
- alt.sources, and should be widely archived. (Patch 9 was posted later
- and incorporates some important .wav fixes.) To save you the trouble
- of hunting it down, it can be gotten by anonymous ftp from
- wuarchive.wustl.edu, in the directory usenet/alt.sources/articles,
- files 7288.Z through 7295.Z. (These files are compressed news
- articles containing shar files, if you hadn't guessed.) I am sure
- many sites have similar archives; I'm just listing one that I know of
- and which carries a lot of this kind of stuff. (Also see the appendix
- if you don't have Internet access.)
-
- A compressed tar file containing the same version of SOX is available
- by anonymous ftp from ftp.cwi.nl [192.16.184.180], as the file
- /pub/audio/sox7.tar.Z. You may be able to locate a nearer version
- using archie!
-
- Ports of SOX:
-
- - The source as posted should compile on any UNIX and PC system.
-
- - A PC version is available by ftp from ftp.cwi.nl (see above) as
- pub/audio/sox5dos.zip; also available from the garbo mail server.
-
- - The latest Amiga SOX is available via anonymous ftp to
- wuarchive.wustl.edu, files systems/amiga/audio/utils/amisox*. (See
- below for a non-SOX solution.)
- The final release of r6 will compile as distributed on the Amiga with
- SAS/C version 6. Binaries (since many Amiga users do not own
- compilers) will continue to be available for FTP.
-
- SOX usage hints:
-
- - Often, the filename extension of sound files posted on the net is
- wrong. Don't give up, try a few other possibilities using the
- "-t <type>" option. Remember that the most common file type is
- unsigned bytes, which can be indicated with "-t ub". You'll have to
- guess the proper sampling rate, but often it's 11k or 22k.
-
- - In particular, with SOX version 4 (or earlier), you have to
- specify "-t 8svx" for files with an .iff extension.
-
- - When converting linear samples to U-LAW using the .au type for the
- output file, you must specify "-U" for the output file, otherwise
- you will end up with a file containing a NeXT/Sun header but linear
- samples -- only the NeXT will play such files correctly. Also, you
- must explicitly specify an output sampling rate with "-r 8000".
- (This may seem fixed for most cases in version 5, but it is still
- occasionally necessary, so I'm keeping this warning in.)
-
- Sun Sparc
- ---------
-
- On Sun Sparcs, starting at SunOS 4.1, a program "raw2audio" is
- provided by Sun (in /usr/demo/SOUND -- see below) which takes a raw
- U-LAW file and turns it into a ".au" file by prefixing it with an
- appropriate header.
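-
- What "prefixing an appropriate header" involves can be sketched in
- Python as below. This is not Sun's raw2audio; it assumes the usual
- NeXT/Sun header of six big-endian 32-bit words (magic ".snd", data
- offset, data size, encoding, sampling rate, channels), encoding code
- 1 for 8-bit U-LAW, and a 4-byte zero-filled info field. Check these
- against the appendix before relying on them. The function name is
- made up.
-
```python
import struct


def add_au_header(ulaw, rate=8000, channels=1):
    """Return raw U-LAW bytes prefixed with a minimal NeXT/Sun style header."""
    info = b"\x00" * 4                     # empty info string (assumed minimum)
    offset = 24 + len(info)                # where the sample data starts
    header = struct.pack(">4s5I", b".snd", offset, len(ulaw), 1, rate, channels)
    return header + info + ulaw


au = add_au_header(b"\xff" * 8000)         # one second of one repeated byte
print(len(au))                             # 8028
print(au[:4])                              # b'.snd'
```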
-
- NeXT
- ----
-
- On NeXTs, you can usually rename .au files to .snd and it'll work like
- a charm, but some .au files lack header info that the NeXT needs.
- This can be fixed by using sndconvert:
-
- sndconvert -c 1 -f 1 -s 8012.8210513 -o nextfile.snd sunfile.au
-
- SGI Indigo, Indigo2, Indy and Personal IRIS
- -------------------------------------------
-
- SGI supports "soundfiler" (in /usr/sbin), a program similar in
- spirit to SOX but with a GUI. Soundfiler plays aiff, aifc, NeXT/Sun
- and .wav formats. It can do conversions between any of these formats
- and to and from raw formats including mulaw. It also does sample rate
- conversions.
-
- Three shell commands are also provided that give the same functionality:
- "sfplay", "sfconvert", and "aifcresample" (all in /usr/sbin).
-
- Amiga
- -----
-
- Mike Cramer's SoundZAP can do no effects except rate change and it
- only does conversions to IFF, but it is generally much faster than
- SOX. (Ftp'able from the same directory as amisox above.)
-
- Newer versions of OmniPlay (see below) will also convert to IFF.
-
- Tandy
- -----
-
- The Tandy 1000 uses a (proprietary?) compressed format. There is a PD
- Mac to Tandy conversion program called CONVERT. Leonard Erickson
- <leonard@qiclab.scn.rain.com> writes: There is a WAV driver from Tandy
- if people ask. There also appears to be a program that purports to
- convert other formats to Tandy, but I haven't tested this one yet.
-
- Apple Macintosh
- ---------------
-
- Bill Houle sent the following list:
-
- Popular commercial apps are indicated with a [*]. All other programs
- mentioned are shareware/freeware available from SUMEX and the various
- mirror sites, or check archie for the nearest FTP location.
-
- MAC SOUND CONVERSION PROGRAMS
-
- SoundHack [Tom Erbe, tom@mills.edu]
- Can read/write Sound Designer II, Audio IFF, IRCAM, DSP Designer and NeXT
- .snd (or Sun .au); 8-bit uLaw, 8-bit linear, 32-bit floating point and 16-bit
- linear data encoding. Can read (but not write) raw data files. Implements
- soundfile convolution, a phase vocoder, a binaural filter and an amplitude
- analysis & gain change module.
-
- SoundExtractor [Alberto Ricci, FRicci@polito.it]
- Extracts 'snd' resources, AIFF, SoundEdit, VOC, and WAV data from
- practically anything, converting to 'snd' files.
-
- Balthazar [Craig Marciniak, AOL:TemplarDev]
- Converts WAV files to 'snd'.
-
- Brian's Sound Tool [Brian Scott, bscott@ironbark.ucnv.edu.au]
- Converts 'snd' or SoundEdit to WAV. Can also convert WAV, VOC, AIFF, Amiga
- 8SVX and uLaw to 'snd'.
-
- AmigaSndConverter [Povl H. Pederson, eco861771@ecostat.aau.dk]
- Converts Amiga IFF/8SVX to Mac 'snd'.
-
- au<->Mac [Victor J. Heinz, vic:wbst128@xerox.com]
- Converts Sun uLaw to Mac 'snd'.
-
- ULAW [Rod Kennedy, rod@faceng.anu.edu.au]
- Converts 'snd' to Sun uLaw.
-
- UUTool [Bernie Wieser, wieser@acs.ucalgary.ca]
- Primarily a uuencode/decode program, but in true Swiss Army Knife
- fashion can also read/write Sun uLaw, AIFF, and 'snd' files.
-
- ModVoicer [Kip Walker, Kip_Walker@mcimail.com]
- Converts Amiga MOD voices into SoundEdit files or 'snd' resources.
-
- Music 5 Mac [Simone Bettini, space@maya.dei.unipd.it]
- Primarily a Music Synthesis system, but can also convert between 'snd', AIFF,
- and IBM .DAT(?).
-
- See also the section on players -- some players also do conversions.
-
-
- Playing audio files on UNIX
- ---------------------------
-
- The commands needed to play an audio file depend on the file format
- and the available hardware and software. Most systems can only
- directly play sound in their native format; use a conversion program
- (see above) to play other formats.
-
- Sun Sparcstation running SunOS 4.x
- ----------------------------------
-
- Raw U-LAW files can be played using "cat file >/dev/audio".
-
- A whole package for dealing with ".au" files is provided by Sun on an
- experimental basis, in /usr/demo/SOUND. You may have to compile the
- programs first. (If you can't find this directory, either you are not
- running SunOS 4.1 yet, or your system administrator hasn't installed
- it -- go ask him for it, not me!) The program "play" in this
- directory recognizes all files in Sun/NeXT format, but a SS 1 or 2 can
- play only those using U-LAW encoding at 8 k -- the SS 10 hardware
- plays other encodings, too.
-
- If you can't find "play", you can also cat a ".au" file to /dev/audio,
- if it uses U-LAW; the header will sound like a short burst of noise
- but the rest of the data will sound OK (really, the only difference in
- this case between raw U-LAW and ".au" files is the header; the U-LAW
- data is exactly the same).
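-
- To avoid the burst of noise you can strip the header first. The
- following is a minimal sketch in Python, not anything shipped by Sun:
- it assumes the usual Sun/NeXT header layout of six big-endian 32-bit
- words (magic ".snd", data offset, data size, encoding, sampling rate,
- channels), and the function name split_au is made up for this example.

```python
import struct

AU_MAGIC = 0x2E736E64  # the four bytes ".snd"

def split_au(data):
    """Return (header_fields, sample_bytes) for a Sun/NeXT audio file."""
    # Fixed part of the header: magic, data offset, data size,
    # encoding, sampling rate, channel count (all big-endian).
    if len(data) < 24:
        raise ValueError("too short for a Sun/NeXT header")
    magic, offset, size, encoding, rate, channels = struct.unpack(
        ">6I", data[:24])
    if magic != AU_MAGIC:
        raise ValueError("not a Sun/NeXT audio file")
    header = {"encoding": encoding, "rate": rate, "channels": channels}
    # Everything from the data offset onward is sample data; the
    # optional info string between byte 24 and the offset is skipped.
    return header, data[offset:]
```

- If the header reports encoding 1 (8-bit U-LAW) at 8000 samples per
- second, the returned sample bytes are raw U-LAW and can be written to
- /dev/audio as described above.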
-
- Finally, OpenWindows 3.0 has a full-fledged audio tool. You can drop
- audio file icons into it, edit them, etc.
-
- Sun Sparcstation running Solaris 2.0
- ------------------------------------
-
- Under SVR4 (and hence Solaris 2.0), writing to /dev/audio from the
- shell is a bad idea, because the device driver will flush its queue as
- soon as the file is closed. Use "audioplay" instead. The supported
- formats and sampling rates are the same as above.
-
- NeXT
- ----
-
- On NeXT machines, the standard "sndplay" program can play all NeXT
- format files (this includes Sun ".au" files). It supports at least
- U-LAW at 8 k and 16-bit samples at 22 or 44.1 k. It attempts
- on-the-fly conversions for other formats.
-
- Sound files are also played if you double-click on them in the file
- browser.
-
- SGI Indigo, Indigo2, Indy and Personal IRIS
- -------------------------------------------
-
- On SGI Indigo, Indigo2, Indy and the 4D/30 and /35 Personal IRIS workstations,
- "WorkSpace" plays audio files in .aiff, .aifc, .au, and .wav formats if
- you double click them and the sampling rate is one of 8000, 11025,
- 16000, 22050, 32000, 44100, or 48000. On the Personal IRIS, you need
- to have the audio board installed (check the output from hinv) and you
- must run IRIX 3.3.2 or 4.0 or higher. These files can also be played
- with "soundfiler" and "sfplay". ".aiff" and ".aifc" files at the above
- sampling rates can also be played with playaifc. (All in /usr/sbin)
-
- There is no simple /dev/audio interface on these SGI machines. (There
- was one on 4D/25 machines, reading and writing signed linear 8-bit
- samples at rates of 8, 16 and 32 k.)
-
- A program "playulaw" was posted as part of the "radio 2.0" release
- that I posted to several source groups; it plays raw U-LAW files on
- the Indigo, Indigo2, Indy or Personal IRIS audio hardware.
-
- Sony NEWS
- ---------
-
- The whole current Sony NEWS line (laptop, desktop, server) has
- built-in sound capabilities. You can buy an external board for the
- older NEWS machines. In the default mode (8k/8-bit mulaw), Sun .au
- files are directly supported (you can 'cat' .au files to /dev/sb0 and
- have them play.) The /usr/sony/bin/sbplay command on NEWS-OS 6.0
- also supports Sun .au files.
-
- Others
- ------
-
- Most other UNIX boxes don't have audio hardware and thus can't play
- audio data. This is actually rapidly changing and most new hardware
- that hits the market has some form of audio support. Unfortunately
- there is no single portable interface for audio that comes near the
- acceptance and functionality (let alone code size :-) of X11 for
- graphics. There are at least two network-transparent packages, both
- in some way based on the X11 architecture, that attempt to fill the
- gap:
-
- DEC CRL's AudioFile supports Digital RISC systems running Ultrix,
- Digital Alpha AXP systems running OSF/1, Sun Sparcs, and SGI
- AL-capable systems (e.g., Indigo, Indy). The source kit is located at
- ftp site crl.dec.com [192.58.206.2] in /pub/DEC/AF.
-
- NCD's NetAudio supports NCD's MCX line of X terminals as well as
- Sparcs running either SunOS 4.1.3 or Solaris 2.2, using the /dev/audio
- interface (they claim it should be easy to port). The source is
- located at ftp.x.org [198.112.44.100] in contrib/netaudio. It is also
- ported to SGI (tested on IRIX 5.x), and there are unconfirmed rumors
- that it is being ported to SCI and Linux.
-
-
- Playing audio files on the Vaxstation 4000 (VMS)
- ------------------------------------------------
-
- 1) Without DECsound
-
- ".au" files can be played by COPYING them to device "SOA0:". This
- device is set up by enabling the driver SODRIVER. You can use the
- following command file:
-
- $!---------------- cut here -------------------------------
- $! sound_setup.com enable SOUND driver
- $ run sys$system:sysgen
- connect soa0 /adapter=0 /csr=%x0e00 /vector=%o304 /driver=sodriver
- exit
- $ exit
- $!----------------- cut here ------------------------------------
-
- 2) With DECsound (bundled with motif)
-
- Just start DECsound by selecting it from the session manager in the
- applications menu. (If it is not there, use
- "@vue$library:sound$vue_startup".) Make sure the device type
- (vaxstation 4000) and play settings (headphone jack) are selected.
- To play files from the DCL prompt (handy if you want to play sounds
- on a remote workstation), set a symbol up as follows:
- PLAY == "$DECSOUND -VOLUME 50 -PLAY"
- Usage:
- DCL> play sound.au
-
- 3) Audio port
-
- The external audio port comes with a telephone-jack-like port. For
- starters, you can plug a telephone RECEIVER right into this port to
- hear your first sound files. After that, you can use the adapter
- (that came with the VaxStation), and plug in a small set of stereo
- speakers or headphones (the kind you'd plug into a WALKMAN, for
- example), for more volume. The adapter also has a microphone plug so
- that you can record sounds if DECsound is installed.
-
-
- Playing audio files on micros
- -----------------------------
-
- Most micros have at least a speaker built in, so theoretically all you
- need is the right software. Unfortunately most systems don't come
- bundled with sound-playing software, so there are many public domain
- or shareware software packages, each with their own bugs and features.
- Most separate sound recording hardware also comes with playing
- software, most of which can play sound (in the file format used by
- that hardware) even on machines that don't have that hardware
- installed.
-
- PC or compatible
- ----------------
-
- Chris S. Craig announces the following software for PCs:
-
- ScopeTrax This is a complete PC sound player/editor package. Sounds
- can be played back at ANY rate between 1kHz and 65kHz through
- the PC speaker or the Sound Blaster. It supports several
- file formats including VOC, IFF/8SVX, raw signed and raw
- unsigned. A separate executable is provided to convert
- .au and mu-law to raw format. ScopeTrax requires EGA/VGA
- graphics for editing and displaying sounds on a REALTIME
- oscilloscope. The package also includes:
- * An expanded memory player which can play sounds
- larger than 640K in size.
- * Basic (rough) sound compression/uncompression
- utilities.
- * Complete documentation.
- The package is FREEWARE! It is available on SIMTEL in the
- PD1:[MSDOS.SOUND] directory.
-
- One of the appendices below contains a list of more programs to play
- sound on the PC.
-
- Atari
- -----
-
- For sounds on Atari STs - programs are in the atari/sound/players
- directory on atari.archive.umich.edu (141.211.164.8).
-
- Tandy
- -----
-
- On a Tandy 1000, sounds can be played and recorded with DeskMate Sound
- (SOUND.PDM), or if they are not stored in compressed format, they can
- also be played by a program called PLAYSND. No indication of whether
- PLAYSND is PD or not. It hasn't been updated since March of 89.
-
- Amiga
- -----
-
- On the Amiga, OmniPlay by David Champion <dgc3@midway.uchicago.edu>
- plays and converts IFF-8SVX, AIFF, WAV, VOC, .au, .snd, and 8 bit raw
- (signed, unsigned, u-law) samples. As of version 1.23, OmniPlay will
- also convert any playable sample to 8SVX. Files: wuarchive.wustl.edu
- in /systems/amiga/audio/sampleplayers/oplay123.lha (?)
- amiga.physik.unizh.ch in mus/play/oplay123.lha
-
- Apple Macintosh
- ---------------
-
- Malcolm Slaney from Apple writes:
-
- "We do have tools to play sound back on most of our Unix hosts. We wrote
- a program called TcpPlay that lets us read a sound file on a Unix host,
- open a TCP/IP connection to the Mac on my desk, and plays the file. We
- think of it as X windows for sound (at least a step in that direction.)
-
- This software is available for anonymous FTP from ftp.apple.com
- [IP address 130.43.2.3 -- Guido].
- Look for ~ftp/pub/TcpPlay/TcpPlay.sit.hqx.
-
- Finally, there are MANY tools for working with sound on the Macintosh. Three
- applications that come to mind immediately are SoundEdit (formerly by
- Farallon and now by MacroMind/Paracomp), Alchemy and Eric Keller's Signalyze.
- There are lots of other tools available for sound editing (including some
- of the QuickTime Movie tools.)"
-
- Bill Houle sent the following lists:
-
- Popular commercial apps are indicated with a [*]. All other programs
- mentioned are shareware/freeware available from SUMEX and the various
- mirror sites, or check archie for the nearest FTP location.
-
- MAC SOUND EDITORS
-
- Sample Editor [Garrick McFarlane, McFarlaneGA@Kirk.Vax.Aston.Ac.UK]
- Plays AIFF and 'snd' sounds. Can convert between AIFF and 'snd'.
- Can record from built-in mic. Can add effects such as fade,
- normalize, delay, etc.
-
- Wavicle [Lee Fyock]
- Plays SoundEdit files. Can convert to 'snd'. Can record from built-in mic.
- Can add effects such as fade, filter, reverb, etc.
-
- [*]SoundEdit/SoundEdit Pro [Farallon/MacroMind*Paracomp]
- Plays SoundEdit and 'snd' sounds. Can read/write SoundEdit files and 'snd'
- sounds. Can record from built-in mic. Can add effects such as
- echo, filter, reverb, etc.
-
-
- MAC SOUND PLAYERS
-
- Sound-Tracker [Frank Seide]
- Plays Amiga SoundTracker files in foreground or background.
-
- Macintosh Tracker [Thomas R. Lawrance, tomlaw@world.std.com]
- Plays Amiga SoundTracker files in foreground or background. A port of Marc
- Espie's Unix Tracker version with Frank Seide's core player thrown in for
- good measure.
-
- The Player [Antoine Rosset & Mike Venturi]
- Plays AIFF, SoundEdit, MOD, and 'snd' files.
-
- SoundMaster (aka [*]Kaboom!) [Bruce Tomlin]
- Associates SoundEdit files to MacOS events.
-
- SndControl [Riccardo Ettore, 72277.1344@compuserve.com]
- Associates 'snd' sounds to MacOS events.
-
- Canon 2 [Glenn Anderson, glenn@otago.ac.nz; Jeff Home, jeff@otago.ac.nz]
- Plays AIFF or 'snd' files in foreground or background.
-
- Another Mac play/convert program: "It's called SoundApp. I wrote it,
- (franke1@llnl.gov) and it's FreeWare. It will play: SoundCap,
- SoundEdit, WAVE, VOC, MOD, Amiga IFF (8SVX), Sound Designer, AIFF, AU,
- Mac Resource, and DVI ADPCM. It can convert all the above to System 7
- sound resources (except MOD where just the samples are extracted.) And
- it will double buffer."
-
-
- The Sound Site Newsletter
- -------------------------
-
- An electronic publication with lots of info about digitised sound and
- sound formats, albeit mostly on PCs, is "The Sound Site Newsletter",
- maintained by David Komatsu <davek@uhunix.uhcc.hawaii.edu>.
- Issue 14 appeared in July 1993. As of that issue, the Sound Site
- Newsletter has expanded its charter to include commercial products and
- will appear monthly. There is now also a sound site network of ftp
- servers, bulletin boards and authors. The Sound Site Newsletter (once
- again!) has its own ftp site: sound.usach.cl.
-
- The Sound Newsletter is posted to: comp.sys.ibm.pc.soundcard
- comp.sys.ibm.pc.misc
- rec.games.misc
- FTP: oak.oakland.edu (misc/sound)
- garbo.uwasa.fi (pc/sound)
- sound.usach.cl (pub/Sound/Newsltr) [Home Base]
-
-
- Posting sounds
- --------------
-
- The newsgroup alt.binaries.sounds.misc is dedicated to postings
- containing sound. (Discussions related to such postings belong in
- alt.binaries.sounds.d.)
-
- There is no set standard for posting sounds; uuencoded files in most
- popular formats are welcome, if split in parts under 50 kBytes. To
- accommodate automatic decoding software (such as the ":decode" command
- of the nn newsreader), please place a part indicator of the form
- (mm/nn) at the end of your subject, meaning this is number mm of a
- total of nn parts.
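-
- As an illustration, here is a rough Python sketch that uuencodes a
- file, splits the text into parts under 50 kBytes and builds the
- subject lines. The function names and the "begin 644" mode are
- choices made for this example, and the part numbers are not
- zero-padded; check what your decoding software expects.

```python
import binascii

def uuencode_parts(data, name, max_bytes=50000):
    """Uuencode data and split the text into parts under max_bytes."""
    lines = ["begin 644 %s\n" % name]
    for i in range(0, len(data), 45):   # uuencode packs 45 bytes per line
        lines.append(binascii.b2a_uu(data[i:i + 45]).decode("ascii"))
    lines.append(binascii.b2a_uu(b"").decode("ascii"))  # zero-length line
    lines.append("end\n")

    parts, current, used = [], [], 0
    for line in lines:
        # Start a new part before this line would reach the limit.
        if current and used + len(line) >= max_bytes:
            parts.append("".join(current))
            current, used = [], 0
        current.append(line)
        used += len(line)
    parts.append("".join(current))
    return parts

def subjects(title, count):
    """Build one subject per part, ending in the (mm/nn) indicator."""
    return ["%s (%d/%d)" % (title, i, count) for i in range(1, count + 1)]
```

- Each element of the list returned by uuencode_parts goes into one
- article, with the matching entry of subjects(title, len(parts)) as
- its subject.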
-
- It is recommended to post sounds in the format that was used for the
- original recording; conversions to other formats often lose
- information and do no favor to people with the same hardware as the
- poster. For instance, converting 8-bit linear sound to U-LAW loses
- the lower few bits of the data, and rate changing conversions almost
- always add noise. Converting from U-LAW to linear requires expansion
- to 16 bit samples if no information loss is allowed!
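-
- To see why 8 bits are not enough, here is a small Python sketch of
- the commonly used U-LAW expansion (sign bit, 3-bit exponent, 4-bit
- mantissa, bias 0x84). It is an illustration only and assumes the
- usual G.711-style encoding; see the appendix "U-LAW and A-LAW
- definitions" in part 2 for the definitions themselves.

```python
def ulaw_to_linear(u):
    """Expand one 8-bit U-LAW byte to a signed linear sample."""
    u = ~u & 0xFF                 # U-LAW bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07    # 3-bit segment number
    mantissa = u & 0x0F           # 4-bit step within the segment
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude

# Decode every possible U-LAW byte to inspect the output range.
samples = [ulaw_to_linear(b) for b in range(256)]
```

- With this expansion the decoded values run from -32124 to 32124,
- which fits in a signed 16-bit sample but not in 8 bits.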
-
- U-LAW data is best posted with a NeXT/Sun header.
-
- If you have to post a file in a headerless format (usually 8-bit
- linear, like ".snd"), please add a description giving at least the
- sampling rate and whether the bytes are signed (zero at 0) or unsigned
- (zero at 0200). However, it is highly recommended to add a header
- that indicates the sampling rate and encoding scheme; if necessary you
- can use SOX to add a header of your choice to raw data.
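-
- The difference between the two 8-bit linear conventions is only the
- top bit of each byte. A minimal Python sketch (the function name is
- made up for this example):

```python
def flip_sign_convention(data):
    """Convert 8-bit linear samples between unsigned and signed.

    Unsigned data has silence at 0200 octal (0x80); signed
    (two's complement) data has silence at 0. Flipping the top
    bit of every byte converts in either direction.
    """
    return bytes(b ^ 0x80 for b in data)
```

- Playing signed data as unsigned (or the reverse) without this
- conversion gives heavily distorted sound, which is why the
- description should say which convention the file uses.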
-
- Compression of sound files usually isn't worth it; the standard
- "compress" algorithm doesn't save much when applied to sound data
- (typically at most 10-20 percent), and compression algorithms
- specifically designed for sound (e.g. NeXT's) are usually
- proprietary. (See also the section "Compression schemes" earlier.)