Some topics to be discussed in a lecture on Spatialization and Reverberation
-----

Why stereo? 'Cause we have two ears. For various reasons our two ears receive slightly different versions of the sound space: interaural time difference (ITD), interaural intensity difference (IID), and the diffraction/filtering effects of the head, shoulders, pinnae, etc.

What is the effect of the pinnae, the head, and the shoulders? They create a filtering effect, duplicated in computer music using head-related transfer functions (HRTFs).

How does sound propagate in an open space? Spherically, with intensity dropping off inversely with the square of the distance (the inverse square law).

How can we imitate that distance effect in computer audio? Amplitude (A) squared is proportional to intensity (I), and I is proportional to 1 over distance (D) squared, so A is proportional to 1 over D. Thus, if a sound with amplitude 1 seems to be 3 meters away, the same sound with amplitude 0.5 would seem to be 6 meters away.

How do wall reflections complicate this? They make the overall sound louder, because reflected sound mixes (mostly constructively) with the direct sound, although the absorption of the wall material must be taken into account. In a theoretical vector model of sound propagation, reflections contribute a complex mixture of delayed (usually attenuated and possibly lowpass-filtered) versions of the direct sound, known as reverberation (reverb).

Does reverb affect our sense of distance? Yes. We also make distance judgments on the basis of the ratio of direct to reverberated sound. A close sound gives mostly direct sound and relatively little reverberation; a distant sound gives a nearly equal balance of direct and reverberated sound, and the delay of the reverberated sound relative to the direct sound will be smaller.

Can we give a sense of the direction of a sound with just ITD? Yes, but the effect is subtle and depends on audience placement relative to the two speakers.
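The amplitude-versus-distance relation above can be checked numerically. A minimal Python sketch (the function name and the reference distance/amplitude values are invented for illustration):

```python
def amplitude_for_distance(ref_amp, ref_dist, dist):
    # Intensity falls off as 1/D^2, and amplitude is proportional to
    # the square root of intensity, so amplitude falls off as 1/D.
    return ref_amp * ref_dist / dist

# A sound with amplitude 1.0 that seems 3 m away, moved to 6 m:
print(amplitude_for_distance(1.0, 3.0, 6.0))  # 0.5 -- halving amplitude doubles apparent distance
```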
(To demonstrate ITD, try an example in MSP with two clicks at the same volume at the same time, then delay one of them by a couple of milliseconds, and see if your sense of location changes.)

Can we give a sense of the direction of a sound with just IID? Yes. This is known as intensity panning; it is generally a bit more blatant than the ITD technique, but it still depends on audience placement relative to the two speakers. (Try an example in MSP with two clicks at the same volume at the same time, then change the amplitude balance of the two speakers and see if your sense of location changes.)

Show an example of linear amplitude panning in MSP. Note the "hole in the middle" effect: the sound seems to get softer (i.e., more "distant") in the middle because the intensity (which is more closely related to our subjective sense of "loudness") is the sum of the SQUARES of the amplitudes. Something panned hard left has an "intensity" of 1.0 squared plus 0.0 squared = 1.0, whereas something panned in the middle has an "intensity" of 0.5 squared (i.e., 0.25) plus 0.5 squared (i.e., 0.25) = 0.5, so it seems approximately 1.414 (i.e., the square root of 2) times as distant.

Explain how, by instead taking the square root of the calculated amplitude for each speaker, we get an "equal intensity" or "constant power" pan. Demonstrate in MSP.

Explain how, because of the trigonometric identity sin² + cos² = 1, the same constant-power effect can be achieved by looking up the value in 1/4 cycle of a cosine table for one speaker and 1/4 cycle of a sine table for the other speaker. Show how this can be implemented in MSP.

Briefly discuss how quadraphonic panning can work.
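The arithmetic behind linear versus constant-power panning can be sketched outside of MSP as well. A minimal Python sketch (function names are invented; pan position runs 0 = hard left to 1 = hard right):

```python
import math

def linear_pan(pos):
    # Naive linear crossfade: amplitudes sum to 1, but intensity does not.
    return (1.0 - pos, pos)

def sqrt_pan(pos):
    # Square root of each linear amplitude: L^2 + R^2 == 1 at every position.
    return (math.sqrt(1.0 - pos), math.sqrt(pos))

def constant_power_pan(pos):
    # Quarter-cycle cosine/sine curves; sin^2 + cos^2 = 1 guarantees
    # constant power across the pan.
    theta = pos * math.pi / 2
    return (math.cos(theta), math.sin(theta))

def intensity(left, right):
    # Perceived loudness tracks intensity: the sum of the squared amplitudes.
    return left * left + right * right

# The "hole in the middle" of linear panning, and the fix:
print(intensity(*linear_pan(0.5)))          # 0.5 -- center sounds more distant
print(intensity(*sqrt_pan(0.5)))            # ~1.0
print(intensity(*constant_power_pan(0.5)))  # ~1.0
```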
Using the azimuth angle method: when the virtual location of a sound is 90 degrees (pi over 2 radians) or more from the location of a given speaker, the amplitude of that speaker should be 0; when the azimuth angle of the virtual location falls between two adjacent speakers, the sound should be constant-power panned between them, with the overall amplitude scaled inversely according to the virtual distance of the sound.

One can also use a simple "two-fader" or "x-y" model, in which one pan value indicates location on the x axis and another pan value indicates location on the y axis.

Briefly discuss 5.1 surround as it is used in cinema soundtracks.