TR#373: Vision-Steered Audio for Interactive Environments

Sumit Basu, Michael Casey, William Gardner, Ali Azarbayejani, Alex Pentland

IMAGE'COM '96 Proceedings
Bordeaux, France

We present novel techniques for obtaining and producing audio information in an interactive virtual environment using vision information. These techniques are free of mechanisms that would encumber the user, such as clip-on microphones or headphones. Methods are described both for extracting sound from a given position in space and for rendering an "auditory scene," i.e., given a user location, producing sounds that appear to the user to come from an arbitrary point in 3-D space. In both cases, vision information about user position guides the algorithms, yielding solutions to problems that are difficult, and often impossible, to solve robustly in the auditory domain alone.