Doctoral Thesis
Title
A Study of the Method for the Reduction of Information in Sound Field Reproduction Based on Wave Field Synthesis
Summary
Wave field synthesis is a sound field reproduction technique that synthesizes wave fronts based on Huygens' principle. It allows multiple listeners to move about in a listening area or turn their heads and still hear the same sound field. This is not possible with conventional sound field reproduction techniques such as binaural and transaural reproduction. In practical applications, people in different places can take part in events such as conferences (teleconferencing systems) and music concerts (tele-ensemble systems) at the same time. The use of telecommunication and virtual reality systems in society is therefore expected to increase rapidly, because these systems can create more realistic environments than conventional systems (TV phone and 5.1-channel audio).
However, this technique requires an enormous amount of information. Previous studies have examined the reproduction of wave fronts, but not the reduction of the amount of information necessary to reproduce the sound field. If listeners perceive the same realistic sensation even when the amount of information is reduced, then the amount of information necessary to reproduce the sound field can be reduced. From this point of view, this thesis studies the reduction of the amount of information necessary to reproduce the sound field, on the basis of subjective assessments.
This thesis consists of six chapters.
In Chapter 1, the sound field reproduction technique based on wave field synthesis and the aim of this thesis are described.
In Chapter 2, the diagram of the sound field reproduction system based on wave field synthesis that is constructed in this thesis is described. Because the conditions under which wave fronts are accurately reproduced in this system have not been studied, they are examined by computer simulation. The simulation results indicate that wave fronts are accurately reproduced when the spacing of the microphones and loudspeakers is less than half a wavelength and when unidirectional or shotgun microphones are used.
According to the result of Chapter 2, the spacing of the microphones and loudspeakers must be less than half a wavelength, which is about 1 cm when the bandwidth of the sound is at least 16 kHz, as is the case for musical sound. It is therefore very difficult to construct the system, because microphones and loudspeakers are not that small. In Chapter 3, the number of microphones and loudspeakers necessary to construct the system is studied. The effect of the number of microphones and loudspeakers on the realistic sensation, defined in this thesis as directional perception and spatial impression, is evaluated by two subjective assessments. The results indicate that directional perception and spatial impression are adequately reproduced even when wave fronts are reproduced only in a relatively low frequency range, and that a practical system can be constructed with 24 microphones and loudspeakers.
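The figure of about 1 cm follows directly from the half-wavelength condition. The short sketch below reproduces the arithmetic; the speed of sound (343 m/s, air at roughly 20 degrees C) and the function names are assumptions for illustration and are not taken from the thesis.

```python
# Assumed speed of sound in air at about 20 degrees C (not stated in the thesis).
SPEED_OF_SOUND_M_S = 343.0


def max_spacing_m(max_frequency_hz: float, c: float = SPEED_OF_SOUND_M_S) -> float:
    """Return half the wavelength at max_frequency_hz, in metres.

    Transducer spacing must stay below this value to satisfy the
    half-wavelength condition up to max_frequency_hz.
    """
    wavelength_m = c / max_frequency_hz
    return wavelength_m / 2.0


def frequency_limit_hz(spacing_m: float, c: float = SPEED_OF_SOUND_M_S) -> float:
    """Return the frequency at which spacing_m equals half a wavelength."""
    return c / (2.0 * spacing_m)


if __name__ == "__main__":
    # Half a wavelength at 16 kHz, expressed in centimetres.
    print(f"{max_spacing_m(16_000.0) * 100:.2f} cm")
```

Read in the other direction, `frequency_limit_hz` gives the frequency up to which a chosen spacing satisfies the condition, which is why a coarser array reproduces wave fronts accurately only in a lower frequency range.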
In the system constructed based on the results of Chapter 3, the amount of data to be transmitted is enormous compared with conventional systems. Thus, when the system also incorporates a visual display, the available transmission capacity is insufficient. In Chapter 4, a method for reducing the amount of transmission is proposed. In the proposed method, source signals are extracted from the channel signals recorded by the microphones by convolving them with inverse filters calculated from the room impulse responses between the sound sources and the microphones, and the extracted source signals are transmitted. To evaluate the performance of the proposed method, an experiment is performed in which the amount of transmission is reduced from the number of channel signals (24) to the number of sound sources (5). The results of the subjective assessment confirm that the distortion introduced by the proposed method has little effect on the perceptual quality of the sound field, even if the source signals are not completely extracted.
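The idea of recovering a few source signals from many channel signals can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the regularized least-squares inversion per frequency bin, the random synthetic impulse responses, and all function and parameter names are assumptions chosen for illustration. Only the channel and source counts (24 and 5) come from the text.

```python
import numpy as np


def extract_sources(mic_signals, rirs, n_fft, reg=1e-8):
    """Estimate source signals from microphone signals.

    mic_signals: array (n_mics, n_samples)
    rirs:        array (n_mics, n_sources, rir_len), known impulse responses
    Returns an array (n_sources, n_fft).
    Assumption: inverse filters are a regularized pseudo-inverse per bin.
    """
    X = np.fft.rfft(mic_signals, n=n_fft, axis=-1)        # (M, F)
    H = np.fft.rfft(rirs, n=n_fft, axis=-1)               # (M, S, F)
    H = np.moveaxis(H, -1, 0)                             # (F, M, S)
    X = np.moveaxis(X, -1, 0)[..., None]                  # (F, M, 1)
    Hh = np.conj(np.swapaxes(H, -1, -2))                  # (F, S, M)
    eye = np.eye(H.shape[-1])
    G = np.linalg.solve(Hh @ H + reg * eye, Hh)           # inverse filters (F, S, M)
    S_hat = (G @ X)[..., 0]                               # (F, S)
    return np.fft.irfft(S_hat.T, n=n_fft, axis=-1)        # (S, n_fft)


def simulate(seed=0, n_mics=24, n_sources=5, rir_len=32, n_samples=256):
    """Build synthetic sources, random impulse responses, and mic signals."""
    rng = np.random.default_rng(seed)
    sources = rng.standard_normal((n_sources, n_samples))
    rirs = rng.standard_normal((n_mics, n_sources, rir_len))
    mics = np.zeros((n_mics, n_samples + rir_len - 1))
    for m in range(n_mics):
        for s in range(n_sources):
            mics[m] += np.convolve(sources[s], rirs[m, s])
    return sources, rirs, mics


sources, rirs, mics = simulate()
# n_fft must be at least the microphone signal length to avoid wrap-around.
estimates = extract_sources(mics, rirs, n_fft=512)
print(mics.shape[0], "channels ->", estimates.shape[0], "source signals")
```

In this idealized setting the impulse responses are known exactly and there is no noise, so the extraction is essentially perfect. With measured room impulse responses the extraction would be incomplete, which is the case the subjective assessment in Chapter 4 addresses.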
In Chapter 5, a method for reducing the amount of transmission that can be applied when the sound sources are moving is proposed. The proposed method combines the method described in Chapter 4 with a position sensor for the sound sources. To evaluate the performance of the proposed method, a simulated reverberant sound field is synthesized by using an image method, and the amount of transmission is reduced from the number of channel signals (24) to the number of moving sound sources (1). The results of the subjective assessment indicate that the proposed method has little effect on the perception of sound movement and on perceptual quality.
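As a rough illustration of how an image method models reverberation, the sketch below computes the direct path and the six first-order reflections in a rectangular room by mirroring the source in each wall. It is a simplified, first-order version written for illustration only; the room dimensions, the positions, the reflection coefficient, and the speed of sound are hypothetical values, not those used in the thesis.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value, not stated in the thesis


def first_order_images(source, room):
    """Mirror the source in each of the six walls of a rectangular room.

    source: (x, y, z) in metres; room: (Lx, Ly, Lz) with walls at 0 and L.
    """
    images = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            image = list(source)
            image[axis] = 2.0 * wall - source[axis]
            images.append(tuple(image))
    return images


def arrivals(source, receiver, room, reflection=0.8, c=SPEED_OF_SOUND_M_S):
    """Return sorted (delay_s, gain) pairs: direct path plus six reflections.

    Gain is the wall reflection coefficient divided by the path length
    (spherical spreading); the direct path uses a coefficient of 1.
    """
    paths = [(source, 1.0)]
    paths += [(image, reflection) for image in first_order_images(source, room)]
    result = []
    for position, coefficient in paths:
        distance = math.dist(position, receiver)
        result.append((distance / c, coefficient / distance))
    return sorted(result)


if __name__ == "__main__":
    # Hypothetical 5 m x 4 m x 3 m room with source and receiver 3 m apart.
    for delay_s, gain in arrivals((1.0, 1.0, 1.0), (4.0, 1.0, 1.0), (5.0, 4.0, 3.0)):
        print(f"{delay_s * 1000:.2f} ms, gain {gain:.3f}")
```

For a moving source, the same calculation would be repeated as the source position changes over time, so the delays and gains become time-varying.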
The results of the experiments described above indicate that listeners can adequately feel the realistic sensation even if wave fronts are accurately reproduced only in a relatively low frequency range when the system is constructed with 24 microphones and loudspeakers, and that listeners do not perceive distortion of the sound field even when the method for reducing the amount of transmission is applied. If the results of this thesis are applied, a practical sound field reproduction system can be realized, because the information can be adequately transmitted over the current communications infrastructure. In Chapter 6, the significance of this thesis and the future vision of sound field reproduction are described.