- Path: sparky!uunet!cis.ohio-state.edu!pacific.mps.ohio-state.edu!linac!uwm.edu!ogicse!news.u.washington.edu!stein.u.washington.edu!hlab
- From: bkarr@carson.u.washington.edu (Brian Karr)
- Newsgroups: sci.virtual-worlds
- Subject: Re: SCI: Three dimensional sound?
- Message-ID: <1992Nov23.161203.25170@u.washington.edu>
- Date: 23 Nov 92 01:19:25 GMT
- Article-I.D.: u.1992Nov23.161203.25170
- References: <1992Nov19.072149.29598@hitl.washington.edu>
- Sender: news@u.washington.edu (USENET News System)
- Organization: Human Interface Technology Lab, Seattle
- Lines: 251
- Approved: cyberoid@milton.u.washington.edu
- Originator: hlab@stein.u.washington.edu
-
-
-
- In article <1992Nov19.072149.29598@hitl.washington.edu> "Human Int.
- Technology" <hlab@milton.u.washington.edu> writes:
-
- >From: fsjdj1@acad3.alaska.edu
- >Subject: Re: SCI: Three dimensional sound?
- >Date: Wed, 18 Nov 1992 20:16:59 GMT
- >Organization: University of Alaska Fairbanks
- >
- >In article <1992Nov16.095935.10365@u.washington.edu>, mcmains@unt.edu (Sean
- > McMains) writes:
- >>
- >> What is the theory behind creating the illusion of a sound emanating
- >> from a particular point in three dimensional space? With regard to
- >> lateral motion, the amplitude and timing of the sounds entering each
- >> ear could obviously be adjusted to create the desired effect. How
- >> would one create the illusion of a sound coming from above or below
- >> the listener? Or is this effect only possible through adjusting what
- >> the listener hears as he moves his head?
- >
- > [Mentions articles regarding bi- and multi-directional sound recording.]
-
- _-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
-
- Not sure if my postings are the ones you are referring to, but I will repost
- the following info as it seems relevant to the question. Much of it is
- recycled, and the rest is related news since those postings.
-
- -bk
- (Brian Karr)
-
- _-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
-
- Here is a quick and dirty explanation of 3D sound filter functions:
-
- Most of the cues needed for presenting a spatial audio image are
- embedded in the 'earprint' or HRTF (Head Related Transfer Function).
- An HRTF is a description of how the ears of a test subject filter
- sound at various points on a sphere, the listener's head being at the
- center. This is derived by placing mics in the ears of the subject
- and chirping pseudo-random noise (impulses) at them from the various
- directions. If a Fourier Transform is performed on what is picked up
- by the mics, the resulting spectra show how the ears and head shape
- sounds before they reach the eardrum. The sampled signals therefore
- contain the impulse response (amplitude and phase) of the ear at that
- angle. Phase is implied since the two ears are supposedly sampled
- phase-coherently. The earprint then is an array of these responses
- which are used as filter coefficients for shaping the input signal to
- be spatialized. If we wish to hear a sound where there has been no
- actual measurement, the nearest responses are interpolated.
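-
- To make the interpolation step concrete, here is a rough sketch in
- Python/NumPy (stand-in data, not a real earprint), assuming one
- measured response every 15 degrees of azimuth:

```python
import numpy as np

# Stand-in 'earprint': one 128-tap impulse response per measured
# azimuth, every 15 degrees.  A real table comes from measurements.
measured_step = 15
azimuths = np.arange(0, 360, measured_step)
rng = np.random.default_rng(0)
hrir_table = rng.standard_normal((len(azimuths), 128))

def interpolated_hrir(azimuth):
    # Linearly blend the two nearest measured responses.
    azimuth = azimuth % 360
    lo = int(azimuth // measured_step) % len(azimuths)
    hi = (lo + 1) % len(azimuths)
    frac = (azimuth % measured_step) / measured_step
    return (1 - frac) * hrir_table[lo] + frac * hrir_table[hi]

h = interpolated_hrir(22.5)  # halfway between the 15 and 30 degree entries
```

- (In practice, blending raw impulse responses this way can smear the
- phase; fancier schemes interpolate magnitude and delay separately.)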
-
- To implement a simulation of this, we effectively need two
- time-dependent realtime filters for each sound source we wish to
- localize. As a frequency-domain analogy, imagine two graphic
- equalizers whose sliders move to new positions whenever we want the
- source to appear to move to a new location in space. Spatial sound
- systems use a mathematical version of this called convolution to
- filter signals digitally.
-
- So the major computation going on is interpolation of coefficient sets
- and the convolution of the input signal to be localized with the
- appropriate filter responses.
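-
- As a rough sketch (plain Python/NumPy, not any particular product's
- API), the convolution step for one source looks like this; the
- impulse responses are stand-ins for one entry of a measured earprint:

```python
import numpy as np

def spatialize(mono, hrir_left, hrir_right):
    # Convolve the mono source with each ear's impulse response
    # to produce a 2 x N binaural signal for headphones.
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right])

mono = np.ones(4)                    # trivial test signal
hrir_left = np.array([1.0, 0.5])     # stand-in impulse responses
hrir_right = np.array([0.5, 0.25])
out = spatialize(mono, hrir_left, hrir_right)
```

- Realtime systems do the same thing block by block (often in the
- frequency domain, where convolution becomes multiplication).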
-
- That provides a 'free field' spatial display, meaning there are no
- environment cues since the earprint is usually derived in an
- acoustically insulated booth of some kind. Also, distance is
- simulated by 1/distance-squared attenuation. For the anechoic models
- used here, this may be more like 1/d^.5 since the reverberant energy
- is no longer included. This method is not entirely correct because
- the ear tends to normalize volume of sounds that don't have an
- intrinsic volume. To give convincing distance and environment cues, a
- reference wall or walls can be placed in the image by manipulating the
- earprint (adding the responses of the reflections before convolving).
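-
- The two attenuation laws mentioned above are easy to compare
- numerically (a sketch, with illustrative numbers only):

```python
def gain_inverse_square(d):
    # Usual free-field rule: gain falls off as 1/d^2.
    return 1.0 / d ** 2

def gain_anechoic(d):
    # Gentler 1/d^0.5 rule suggested above for anechoic material.
    return 1.0 / d ** 0.5

# At 4 distance units, inverse-square leaves 1/16 of the gain,
# while the anechoic rule still leaves half.
g_sq = gain_inverse_square(4.0)  # 0.0625
g_an = gain_anechoic(4.0)        # 0.5
```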
-
- For a far better description of all of this, see:
-
- Blauert, Jens, 1983. _Spatial Hearing: The Psychophysics of Human Sound
- Localization_, Cambridge, MA: MIT Press
-
- Lehnert, Hilmar & Blauert, Jens, 1992. 'Aspects of Auralization in
- Binaural Room Simulation.' AES Proceedings, 93rd Convention
- October 1-4, 1992.
-
- Møller, Henrik, 1992. 'Fundamentals of Binaural Technology.' _Applied
- Acoustics_, _36_, pp. 171-218.
-
- Wenzel, Elizabeth M., 1992. 'Localization in Virtual Acoustic
- Displays.' _Presence_, _1_(1).
-
- Wightman, F.L. & Kistler, D.J. 1989a,b. 'Headphone Simulation of Free-field
- Listening I, II.' _Journal of the Acoustical Society of America_,
- _85_, 858-878.
-
-
- Hope this info helps.
-
- -Brian
-
- _-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
-
-
- There are a few spatial audio display systems out these days that I am
- aware of. Some use speaker arrays and some use ordinary headphones.
-
- There is of course surround sound, which requires a loudspeaker array
- and previously encoded material. Decoders are commonplace today while
- encoders are not so ubiquitous and are expensive. There is also the
- 'Ambisonic' system that uses a special 4-element microphone to record
- and encode natural sound environments. This also requires
- loudspeakers (4 or 6). There is also of course quadraphonic sound.
- This is used in the headset of the Virtuality arcade games. They use
- two speakers in each ear. This display gave me a good azimuth
- impression, especially with the head tracking, but elevation was not
- so convincing.
-
- Mannequin heads with microphones in the ears have been used to make
- binaural recordings for decades. Recording this way (or with real
- folks' heads and tiny mics) gives an excellent 3D impression and can
- really sound like you are _there_.
-
- The benefit of using headphones and head tracking is that many people
- in the same room can have different sonic environments, or cohabit
- the same environment from different perspectives, simultaneously.
-
- The AL-100, developed by the Air Force to take advantage of this,
- was a coffin-sized box fitted with a binaural mannequin
- head which was spun around in front of a loudspeaker with high-torque
- motors. This way, a sound could be made to travel around the listener
- by moving the head appropriately. Last I heard, this box is still
- working.
-
- This whole process has since been realized computationally in a number
- of excellent systems. Jens Blauert's group in Germany developed the
- first such system some years back. The group is now working on a new version
- of their 'Binaural Mixing Console,' and are doing some excellent work
- with room simulation (localized reflections in addition to the direct
- localized source).
-
-
- Commercial Systems:
-
- There are many systems that I know of available right now. One is a
- Macintosh II-based system called Focal Point (Gehring Inc., Bo Gehring)
- that uses a special DSP card (Audiomedia, which has recently been
- discontinued by digidesign, alas) for each independently localized
- sound. This can be used with a CDEV interface or a MIDI application
- right out of the box, and also comes with a Think C interface to use
- in your own apps. I also saw/heard an early version of this on the
- NeXT, but a commercial version is not being pursued for the NeXT.
-
- This system is also now available for the IBM-PC flavored machine
- under the same product name. Bo tells me that it has the added
- benefit that it can alternatively run without the bus. You simply
- give it power and it wakes up spatializing whatever signal is at its
- input. Position commands can then be sent to a serial port built onto
- the card. Handy if you want to skip the PC host. This card
- spatializes two sounds simultaneously and independently. Focal Point
- 3D Audio, Niagara Falls, NY. Bo Gehring (716) 285-3930.
-
- There is also the IBM-PC based Convolvotron (Crystal River
- Engineering, Scott Foster) which localizes 4 independent sounds for
- each 2-card set. This system lets you switch 'earprints' (HRTF's) and
- comes with a set of earprints, C-programming libraries and sample
- programs. This system also optionally includes a reflection package
- which localizes reflections to give an impression of objects (walls)
- present in the environment and adds another crucial distance cue.
-
- This now has been implemented for the PC on a Turtle Beach DSP card.
- CRE is calling this the 'Beachtron.' This card spatializes two sounds
- simultaneously and independently. It has a sample-based synthesizer
- on the card and a MIDI port (Yes!). Multiple cards can be cascaded,
- which avoids the need for mixing and the cabling plague. CRE has
- developed a protocol and software libraries that let you load your AT
- or an AT backplane up with B-trons or C-trons and talk to it as if it
- were an audio resource pool. The code autosenses what is on the bus
- and does the right thing. This makes a lot of your code portable
- between C-trons and B-trons. The B-tron, however, does not do room
- simulations. This audio resource pool package is called the
- 'Acoustetron.' Crystal River Engineering, Groveland, CA. Scott
- Foster (209) 962-6382.
-
- VPL has worked with the CRE crowd to port this to a Mac-based card for
- VPL's VR systems. They are calling this the 'CosmTron,' for their
- MicroCosm system. VPL Research, Foster City, CA. (415) 361-1710.
-
- There is also a pro-audio system called the Sound Space processor
- (Roland Co., Curtis Chan) which is designed to give a 3D image using
- two loudspeakers. The idea is to compute sounds in their locations as
- usual and then compensate for speaker cross-talk before the signal
- goes to the speakers (this is called 'transaural processing'). The
- result is a sweet spot which is actually the line of points
- equidistant from both speakers. Chances are you have
- already heard this on the radio. Bob Todrank is now the contact at
- Roland for this machine. RSS processor. Roland Pro Audio/Video Group
- (213) 685-5141.
-
- These systems (F.P., A,B,C-tron, RSS) require no decoding and the
- signal can be stored on regular audio cassettes (preferably on DAT,
- Hi-Fi VHS or MO.) Focal Point and the Convolvotron are designed for
- headphones, while Roland's box is designed for headphones or speakers.
- I heard some interesting 'effects' with speakers, though the spatial
- image didn't always come across.
-
- There are transaural processors available if you must use loudspeakers
- with the personal computer based systems such as F.P. and the
- A,B,C-tron.
-
- Related stuff:
-
- The 'Spatializer,' from Audio Intervisual Design is a system that
- produces eight moving sources in azimuth only. This is not binaural
- processing in the sense of filtering with pinna responses but I
- thought I would mention it for completeness. Audio Intervisual
- Design. (213) 845-1155.
-
- The 'Intelliverb,' from RSP Technologies is an ultra-parameterized
- reverb unit. It can be configured to produce many standard effects
- but the room simulation effects are most relevant here. Variables
- include room width, height, and depth, source position, listener
- position, reverb ducking (can be used to change room absorption).
- These parameters can all be controlled with MIDI (i.e. from your VR
- code). These variables affect the 'early reflections' in the
- simulation. The more diffuse late reflections are added in from a
- selection of algorithms. This box stands out from the vast array of
- effects processors in my opinion because of the attention to external
- control of the right variables in the delay/reverb, and some excellent
- audio specs. The early reflections are not, however, spatialized.
- They are intended to be correct in time. I have found reverb effects
- to give an excellent enhancement of presence in VR and this type of
- box seems to be a good alternative for completely correct room
- simulations until they become real and affordable. I will post a more
- complete review of the box after the holidays. RSP Technologies,
- Rochester Hills, MI. (313) 853-3055.
-
- Any others? I'm not sure. It is of course possible to do it with
- slow hardware in non-realtime so I wouldn't be surprised if many people
- have developed spatial displays. The critical part of the process is
- getting a good earprint. Much of the work being done today with room
- simulation (manipulating impulse responses with raytraced reflections)
- is being done off-line because of the computation needs.
-
- _-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
-
- I hope this is helpful to enough folks to justify the bandwidth.
-
- -Brian
-
- bk@hitl.washington.edu
-
- _-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
-