Perfect auditory illusion over loudspeakers and headphones: How to use the properties of the human ears and brain
The dream of a perfect recreation of sound has always consisted of two parts: the reproduction of monaural sounds such that they seem to be exact copies of an original signal, and the plausible recreation of complex sound environments, the possibility to be immersed in sound. The latter goal seems to be much more difficult, especially if we consider reproduction over headphones. The talk will touch both on historic developments, including MP3 and AAC, and on new results, especially regarding spatial hearing, and on how to get nearer to the dream of a perfect recreation of sound.

Modern music distribution systems largely depend on lossy audio coding. Including the properties of the human ear in the design of signal processing systems made it possible to achieve equal audio quality at much lower bit rates. The talk will briefly go into the basics of such systems. However, our knowledge about human perception is far from complete. Faced with tasks like reproducing spatial sound perfectly, both over loudspeakers and over headphones, we have to acknowledge that our current models are either plainly wrong or at least not accurate enough.

The second part of the talk will present new results for headphone listening with the aim of externalizing the sound. From standard two-channel stereo reproduced over headphones, through artificial-head recordings, to the inclusion of HRTFs and binaural room impulse responses, something was always missing for a perfect auditory illusion. With refinements such as individually adapted HRTFs, these methods work for many people, but not for everybody. As we know now, in addition to the static, source- and listener-dependent modifications of the headphone signal, we need to pay attention to cognitive effects: the perceived presence of an acoustical room rendering changes depending on our expectations.
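The binaural rendering chain mentioned above boils down to filtering a source signal with a left and a right head-related (or binaural room) impulse response. A minimal sketch, using synthetic placeholder impulse responses rather than measured HRTFs (a real system would use measured, ideally individual, responses):

```python
import numpy as np
from scipy.signal import fftconvolve

def render_binaural(mono, hrir_left, hrir_right):
    """Convolve a mono source with left/right impulse responses
    and stack the results into a two-channel binaural signal."""
    left = fftconvolve(mono, hrir_left)
    right = fftconvolve(mono, hrir_right)
    return np.stack([left, right], axis=0)

# Placeholder "HRIRs": a pure interaural time and level difference,
# i.e. the right ear hears a delayed, attenuated copy of the source.
fs = 48000
hrir_left = np.zeros(64)
hrir_left[0] = 1.0
hrir_right = np.zeros(64)
hrir_right[30] = 0.7          # ~0.6 ms interaural delay at 48 kHz

mono = np.random.randn(fs)    # one second of noise as a test signal
stereo = render_binaural(mono, hrir_left, hrir_right)
```

Real HRTF sets additionally encode the direction-dependent spectral filtering of head, pinna, and torso, and binaural room impulse responses add the room's reflections on top; the convolution step itself stays the same.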
Prominent context effects include, for example, acoustic divergence between the listening room and the synthesized scene, visibility of the listening room, and prior knowledge triggered by where we have been before. Furthermore, cognitive effects are mostly time-variant, which includes anticipation and assimilation processes caused by training and adaptation. We present experiments demonstrating some of these contextual effects by investigating features like distance perception, externalization, and localization; these features are shifted by adaptation and training. Furthermore, we present proposals for reaching the next level of fidelity in headphone listening. This includes the use of room simulation software and the adaptation of its auralization to different listening rooms by changing acoustical parameters.
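One concrete way such an adaptation could look, as a minimal broadband sketch (not the actual method used in the talk): reshape the exponential decay of a simulated room impulse response so that its reverberation time matches that of the target listening room.

```python
import numpy as np

def reshape_rt60(rir, fs, t60_src, t60_dst):
    """Rescale the exponential decay of an impulse response so that its
    reverberation time changes from t60_src to t60_dst seconds.
    Broadband sketch; a practical system would work per frequency band."""
    t = np.arange(len(rir)) / fs
    decay = np.log(1000.0)  # 60 dB of decay expressed as a natural-log factor
    gain = np.exp(-decay * t * (1.0 / t60_dst - 1.0 / t60_src))
    return rir * gain

# Synthetic test RIR: exponentially decaying noise with T60 = 0.8 s,
# adapted toward a drier listening room with T60 = 0.4 s.
fs = 48000
t60_src, t60_dst = 0.8, 0.4
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
rir = rng.standard_normal(fs) * np.exp(-np.log(1000.0) * t / t60_src)

adapted = reshape_rt60(rir, fs, t60_src, t60_dst)
```

The gain curve simply converts one exponential envelope into the other; applied to the two channels of a binaural room impulse response, the same idea lets a room simulation be steered toward the acoustics the listener currently sees and expects.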