AES Show 2024 NY has ended
Exhibits+ badges provide access to the ADAM Audio Immersive Room, the Genelec Immersive Room, Tech Tours, and the presentations on the Main Stage.

All Access badges provide access to all content in the Program (Tech Tours still require registration).

View the Exhibit Floor Plan.
Poster
Tuesday, October 8
 

2:00pm EDT

Bitrate adaptation in object-based audio coding in communication immersive voice and audio systems
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Object-based audio is one of the spatial audio representations that provide an immersive audio experience. While it can be found in a wide variety of audio reproduction systems, its use in communication systems is very limited because it faces constraints such as system complexity, short delay requirements, and the limited bitrate available for coding and transmission. This paper presents a new bitrate adaptation method for object-based audio coding systems that overcomes these constraints and enables their use in 5G voice and audio communication systems. The method distributes the available codec bit budget across the waveforms of the individual audio objects based on a classification of each object's subjective importance in a given frame. It has been adopted in the Immersive Voice and Audio Services (IVAS) codec, recently standardized by 3GPP, but it can be employed in other codecs as well. Test results show the performance advantage of the bitrate adaptation method over conventional uniform bitrate distribution. The paper also presents IVAS selection test results for object-based audio with four audio objects, rendered to a binaural headphone representation, in which the presented method plays a substantial role.
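As a rough illustration of the idea (not code from the paper or from IVAS), the sketch below distributes a frame's bit budget across objects in proportion to hypothetical importance scores; the function name and the minimum-allocation rule are assumptions made for the example.

```python
# Hypothetical sketch of importance-weighted bit allocation across audio
# objects for one frame; the actual IVAS classification and allocation
# rules are not reproduced here.

def allocate_bits(importance, total_bits, min_bits=1000):
    """Split a frame's bit budget across objects in proportion to their
    importance scores, guaranteeing each object a minimum allocation."""
    n = len(importance)
    spare = max(total_bits - min_bits * n, 0)
    weight_sum = sum(importance) or 1.0
    bits = [min_bits + int(spare * w / weight_sum) for w in importance]
    # Give any rounding remainder to the most important object.
    bits[max(range(n), key=lambda i: importance[i])] += total_bits - sum(bits)
    return bits

# Uniform vs. importance-weighted allocation for four objects in one frame.
print(allocate_bits([0.25, 0.25, 0.25, 0.25], 24000))  # [6000, 6000, 6000, 6000]
print(allocate_bits([0.70, 0.15, 0.10, 0.05], 24000))  # most bits go to object 0
```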
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Enhancing Realism for Digital Piano Players: A Perceptual Evaluation of Head-Tracked Binaural Audio
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
This paper outlines a process for implementing and perceptually evaluating a head-tracked binaural audio system designed to enhance realism for players of digital pianos. Using an Ambisonic microphone to sample an acoustic piano and leveraging off-the-shelf equipment, the system lets players wearing headphones experience real-time changes in the sound field as they rotate their heads, with three degrees of freedom (3DoF). The evaluation criteria included spatial clarity, spectral clarity, envelopment, and preference. These criteria were assessed across three listening systems: stereo speakers, stereo headphones, and head-tracked binaural audio. Results showed a strong preference for the head-tracked binaural audio system, with players noting significantly greater realism and immersion.
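To illustrate the 3DoF head-tracking principle behind such a system (this is not the authors' implementation), the sketch below counter-rotates a first-order Ambisonic (B-format) signal by the tracked head yaw before any binaural decoding; the function name and channel conventions are assumptions for the example.

```python
import numpy as np

def rotate_foa_yaw(w, x, y, z, head_yaw_rad):
    """Counter-rotate a first-order B-format (W, X, Y, Z) signal so that a
    head rotated by `head_yaw_rad` (counterclockwise) hears a stable scene.
    A hypothetical sketch: real renderers also handle pitch and roll and
    use HRTF-based decoding rather than this bare rotation."""
    c, s = np.cos(head_yaw_rad), np.sin(head_yaw_rad)
    x_rot = c * x + s * y
    y_rot = -s * x + c * y
    return w, x_rot, y_rot, z  # W and Z are unaffected by yaw

# Example: a source straight ahead (X=1, Y=0); after the head turns 90° to
# the left, it should appear at the listener's right (Y' = -1 here).
print(rotate_foa_yaw(1.0, 1.0, 0.0, 0.0, np.pi / 2))
```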
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Exploring Immersive Opera: Recording and Post-Production with Spatial Multi-Microphone System and Volumetric Microphone Array
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Traditional opera recording techniques using large microphone systems are typically less flexible towards experimental singer choreographies, which have the potential of being adapted to immersive and interactive representations such as Virtual Reality (VR) applications. The authors present an engineering report on implementing two microphone systems for recording an experimental opera production in a medium-sized theatre: a 7.0.4 hybrid array of Lindberg’s 2L and the Bowles spatial arrays and a volumetric array consisting of three higher-order Ambisonic microphones in Left/Center/Right (LCR) formation. Details of both microphone setups are first described, followed by post-production techniques for multichannel loudspeaker playback and 6 degrees-of-freedom (6DoF) binaural rendering for VR experiences. Finally, the authors conclude with observations from informal listening critique sessions and discuss the technical challenges and aesthetic choices involved during the recording and post-production stages in the hope of inspiring future projects on a larger scale.
Speakers
Jiawen Mao
PhD student, McGill University
Authors
Jiawen Mao
PhD student, McGill University
Michael Ikonomidis
Doctoral student, McGill University
Michael Ikonomidis (Michail Oikonomidis) is an accomplished audio engineer and PhD student in Sound Recording at McGill University, specializing in immersive audio, high-channel-count orchestral recordings, and scoring sessions. With a diverse background in music production, live sound...
Richard King
Professor, McGill University
Richard King is an educator, researcher, and a Grammy Award-winning recording engineer. Richard has garnered Grammy Awards in various fields, including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School...
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Exploring the Directivity of the Lute, Lavta, and Oud Plucked String Instruments
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
This study investigates the spherical directivity and radiation patterns of the Lute, Lavta, and Oud, pear-shaped traditional plucked-string instruments from the Middle East, Turkey, Greece, and the surrounding areas, providing insights into the acoustic qualities of their propagated sound in a three-dimensional space. Data was recorded in an acoustically controlled environment with a 29-microphone array, using multiple instruments of each type, performed by several professional musicians. Directivity is investigated in terms of sound projection and radiation patterns. Instruments were categorized according to string material. The analysis revealed that all instruments, regardless of their variations in geometry and material, exhibit similar radiation patterns across all frequency bands, justifying their intuitive classification within the “Lute family”. Nevertheless, variations in sound projection across all directions are evident between instrument types, which can be attributed to differences in construction details and string material. The impact of the musician's body on directivity is also observed. Practical implications of this study include the development of guidelines for the proper recording of these instruments, as well as the simulation of their directivity properties for use in spatial auralizations and acoustic simulations with direct applications in extended reality environments and remote collaborative music performances.
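As a hypothetical illustration of how a radiation pattern can be summarised from a surrounding microphone array (not the authors' analysis code), the sketch below computes each microphone's band-limited RMS energy relative to the loudest channel; the function name, filter order, and band edges are assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def band_energy_pattern(mic_signals, fs, band=(200.0, 400.0)):
    """Per-microphone RMS energy in one frequency band, normalised to the
    loudest channel; a sketch of summarising sound projection from a
    surrounding array (here assumed to be 29 channels)."""
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    filtered = sosfilt(sos, mic_signals, axis=1)   # shape: (n_mics, n_samples)
    rms = np.sqrt(np.mean(filtered ** 2, axis=1))
    return rms / np.max(rms)

# Example with synthetic data: 29 microphones, 1 s of noise-like signals.
fs = 48000
signals = np.random.default_rng(0).normal(size=(29, fs))
print(band_energy_pattern(signals, fs).shape)  # (29,)
```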
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Generate acoustic responses of virtual microphone arrays from a single set of measured FOA responses. - Apply to multiple sound sources.
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
V2MA (VSVerb Virtual Microphone Array)
Demos and related docs are available at https://bit.ly/3BmDBbL.
Once we have measured a set of four impulse responses (IRs) with an A-format microphone in a hall, we can make a virtual recording using a virtual microphone array placed anywhere in the hall. The measurement does not require the A-format microphone and the loudspeaker to be placed at specific positions: typical positions, such as an audience seat and a spot on the stage, are recommended, but you can place them anywhere you like. Any type of virtual microphone response in the target room can then be generated from this single, straightforward IR measurement.
-------------------------------
We propose a method, V2MA, that virtually generates the acoustic responses of any type of microphone array from a single set of FOA responses measured in a target room. An A-format microphone is used for the measurement, but no Ambisonics operations are involved in the processing. V2MA is based on geometrical acoustics: we calculate sound intensities in the x, y, and z directions from a measured FOA response and then detect the virtual sound sources of the room from them. Although it is desirable to place the A-format microphone close to the intended position of the virtual microphone array in the room, this is not mandatory. Since the method can generate SRIRs at arbitrary receiver positions by updating the acoustic properties of the virtual sound sources detected at a given position in the room, the A-format microphone can be placed anywhere you like. On the other hand, the loudspeaker must be placed at the source position where a player is assumed to be. Because the positions of the virtual sound sources change when the real sound source moves, responses previously had to be measured for each assumed source position. To remove this inconvenient restriction, we developed a technique for updating the positions of the virtual sound sources when the real sound source moves from its original position. Although the technique requires some approximations, the generated SRIRs were found to provide good acoustic properties in both physical and auditory respects.
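The following sketch illustrates the intensity-based idea described above, estimating the direction of arrival within a short frame of a B-format impulse response from the time-averaged pseudo-intensity vector; it is a simplified stand-in, not the V2MA implementation, and the function name and frame handling are assumptions.

```python
import numpy as np

def intensity_doa(w, x, y, z, frame):
    """Estimate the direction of arrival within a short frame of a B-format
    room impulse response from the time-averaged pseudo-intensity vector
    (pressure times the particle-velocity components)."""
    p = w[frame]
    ix = np.mean(p * x[frame])
    iy = np.mean(p * y[frame])
    iz = np.mean(p * z[frame])
    norm = np.linalg.norm([ix, iy, iz]) + 1e-12
    azimuth = np.degrees(np.arctan2(iy, ix))
    elevation = np.degrees(np.arcsin(iz / norm))
    return azimuth, elevation

# Synthetic example: a single reflection arriving from azimuth 30°, elevation 0°.
n = 2048
w_ir, x_ir, y_ir, z_ir = (np.zeros(n) for _ in range(4))
az = np.radians(30.0)
w_ir[960], x_ir[960], y_ir[960] = 1.0, np.cos(az), np.sin(az)
print(intensity_doa(w_ir, x_ir, y_ir, z_ir, slice(940, 980)))  # ~ (30.0, 0.0)
```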
Speakers
Masataka Nakahara
Acoustic Designer / Acoustician, SONA Corp. / ONFUTURE Ltd.
Masataka Nakahara is an acoustician specializing in studio acoustic design and R&D work on room acoustics, as well as an educator. After studying acoustics at the Kyushu Institute of Design, he joined SONA Corporation and began his career as an acoustic designer. In 2005, he received...
Authors
Masataka Nakahara
Acoustic Designer / Acoustician, SONA Corp. / ONFUTURE Ltd.

Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Measurement and Applications of Directional Room Impulse Responses (DRIRs) for Immersive Sound Reproduction
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Traditional methods for characterizing Room Impulse Responses (RIRs) with omnidirectional microphones do not fully capture the spatial properties of sound in an acoustic space. In this paper, we explore a method for characterizing room acoustics using Directional Room Impulse Responses (DRIRs), which include the direction of arrival of the reflected sound waves in addition to their time of arrival and strength. We measured DRIRs using a commercial 3D sound intensity probe (Weles Acoustics WA301) containing x, y, and z acoustic velocity channels in addition to a scalar pressure channel. We then employed the measured DRIRs to predict the binaural signals that would be captured by binaural dummy-head microphones placed at the same location in the room where the DRIR was measured. The predictions can then be compared to the actual measured binaural signals. Successful implementation of DRIRs could significantly enhance applications in AR/VR and immersive sound reproduction by providing listeners with room-specific directional cues for early room reflections in addition to the diffuse reverberant tail of the impulse response.
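As one way to quantify how close a predicted binaural signal comes to the measured one (a hypothetical metric, not the comparison used in the paper), the sketch below computes a per-ear normalised RMS error after a least-squares level alignment.

```python
import numpy as np

def binaural_prediction_error(predicted, measured):
    """Compare predicted and measured binaural signals (shape: (2, n_samples))
    after level alignment; returns per-ear normalised RMS error in dB.
    More negative values indicate a closer match."""
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    errs = []
    for ear in range(2):
        p, m = predicted[ear], measured[ear]
        # Least-squares gain to remove any overall level mismatch.
        g = np.dot(p, m) / (np.dot(p, p) + 1e-12)
        e = np.sqrt(np.mean((g * p - m) ** 2) / (np.mean(m ** 2) + 1e-12))
        errs.append(20 * np.log10(e + 1e-12))
    return errs  # [left_dB, right_dB]

# Example with synthetic signals: identical signals give a very low error.
sig = np.random.default_rng(1).normal(size=(2, 4800))
print(binaural_prediction_error(sig, sig))
```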
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Quantitative Assessment of Acoustical Attributes and Listener Preferences in Binaural Renderers with Head-tracking Function
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
The rapid advancement of immersive audio technologies has popularized binaural renderers that create 3D auditory experiences using head-related transfer functions (HRTFs). Various renderers with unique algorithms have emerged, offering head-tracking functionality for real-time adjustments to spatial audio perception. Building on our previous study, we compared binauralized music from five renderers with the dynamic head-tracking function enabled, focusing on how differences in HRTFs and algorithms affect listener perceptions. Participants assessed overall preference, spatial fidelity, and timbral fidelity by comparing paired stimuli. Consistent with our earlier findings, one renderer received the highest ratings for overall preference and spatial fidelity, while others rated lower in these attributes. Physical analysis showed that interaural time differences (ITD), interaural level differences (ILD), and frequency response variations contributed to these outcomes. Notably, hierarchical cluster analysis of participants' timbral fidelity evaluations revealed two distinct groups, suggesting variability in individual sensitivities to timbral nuances. While spatial cues, enhanced by head tracking, were generally found to be more influential in determining overall preference, the results also highlight that timbral fidelity plays a significant role for certain listener groups, indicating that both spatial and timbral factors should be considered in future developments.
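As a simplified illustration of the kind of physical analysis mentioned above (not the authors' pipeline), the sketch below estimates a broadband ITD from the cross-correlation peak and an ILD from the RMS level ratio of a binaural pair; the function name and sign conventions are assumptions.

```python
import numpy as np

def itd_ild(left, right, fs):
    """Broadband ITD (cross-correlation peak) and ILD (RMS level ratio) of a
    binaural signal pair."""
    # Lag (in samples) by which the right-ear signal trails the left-ear one;
    # a positive ITD therefore means the sound reaches the left ear first.
    lag = np.argmax(np.correlate(right, left, mode="full")) - (len(left) - 1)
    itd_ms = 1000.0 * lag / fs
    ild_db = 20 * np.log10(np.sqrt(np.mean(left ** 2)) /
                           (np.sqrt(np.mean(right ** 2)) + 1e-12))
    return itd_ms, ild_db

# Example: right-ear signal delayed by 10 samples and attenuated by 6 dB.
fs = 48000
left = np.random.default_rng(2).normal(size=fs // 10)
right = 0.5 * np.roll(left, 10)
print(itd_ild(left, right, fs))  # approx. (0.208 ms, 6.02 dB)
```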
Speakers
Rai Sato
Ph.D. Student, Korea Advanced Institute of Science and Technology
Rai Sato (佐藤 来) is currently pursuing a PhD at the Graduate School of Culture Technology at the Korea Advanced Institute of Science and Technology. He holds a Bachelor of Music from Tokyo University of the Arts, where he specialized in immersive audio recording and psychoacoustics...
Authors
Rai Sato
Ph.D. Student, Korea Advanced Institute of Science and Technology
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Review: Head-Related Impulse Response Measurement Methods
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
This review paper discusses advancements in Head-Related Impulse Response (HRIR) measurement methods. HRIR measurement methods, often referred to as HRTF (Head-Related Transfer Function) measurement methods, have undergone significant changes over the last few decades [1]. A frequently employed approach is the discrete stop-and-go method [1][2], which involves changing the location of a single loudspeaker used as the sound source and recording the impulse response at each location [2]. Since only one source location is measured at a time, the discrete stop-and-go method is time-consuming [1]; improvements are therefore required to enhance the efficiency of the measurement process, such as using more sound sources (loudspeakers) [1][3]. A typical HRTF measurement is conducted in an anechoic chamber to approximate a free-field condition without room reverberation. It measures the transfer function between the source and the ears, capturing localisation cues such as interaural time differences (ITDs), interaural level differences (ILDs), and monaural spectral cues [4]. Newer techniques such as the Multiple Exponential Sweep Method (MESM) and the reciprocal method offer alternatives; these methods enhance measurement efficiency and address challenges like inter-reflections and low-frequency response [5][6]. Individualised HRTF measurement techniques can be categorised into acoustical measurement, anthropometric data, and perceptual feedback [7]. Interpolation methods and measurements in non-anechoic environments have expanded the practical applicability and feasibility of HRTF measurements [8][9][10][7].
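As a minimal sketch of the sweep-based measurement the review refers to (a single exponential sine sweep with its inverse filter in the widely used Farina formulation, not MESM itself), the code below generates the sweep and deconvolves an idealised recording; the function name and parameter choices are assumptions.

```python
import numpy as np
from scipy.signal import fftconvolve

def exp_sweep(f1, f2, duration, fs):
    """Exponential sine sweep and its inverse filter; a minimal sketch of a
    sweep-based impulse response measurement, not a full HRIR chain."""
    t = np.arange(int(duration * fs)) / fs
    r = np.log(f2 / f1)
    sweep = np.sin(2 * np.pi * f1 * duration / r * (np.exp(t / duration * r) - 1))
    # Time-reverse and apply a decaying envelope so that convolving the sweep
    # with the inverse filter approximates a band-limited impulse.
    inverse = sweep[::-1] * np.exp(-t / duration * r)
    return sweep, inverse

# Example: the "recording" is the sweep itself (an ideal, reflection-free
# system), so deconvolution peaks near the middle sample of the result.
fs = 48000
sweep, inv = exp_sweep(20.0, 20000.0, 2.0, fs)
ir = fftconvolve(sweep, inv)
print(np.argmax(np.abs(ir)), len(ir) // 2)
```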
Speakers
Jeremy Tsuaye
New York University
Authors
Jeremy Tsuaye
New York University
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

The effects of interaural time difference and interaural level difference on sound source localization on the horizontal plane
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Interaural Time Difference (ITD) and Interaural Level Difference (ILD) are the main cues used by the human auditory system to localize sound sources on the horizontal plane. To explore the relationship between ITD, ILD, and perceived azimuth, a study was conducted to measure and analyze localization on the horizontal plane for combinations of ITD and ILD. Pure tones were used as sound sources in the experiment. For each of three frequency bands, 25 combinations of ITD and ILD test values were selected. These combinations were applied to sound perceived as coming from directly in front of the listener (pure-tone signals recorded with an artificial head in an anechoic chamber). The tests were conducted using the 1-up/2-down and two-alternative forced-choice (2AFC) psychophysical testing methods. The results showed that the perceived azimuth at 350 Hz and 570 Hz was generally larger than at 1000 Hz, and that the perceived azimuth at 350 Hz and 570 Hz was similar under certain combinations. The experimental data and conclusions can provide foundational data and theoretical support for efficient compression of multi-channel audio.
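As a hypothetical sketch of how such ITD/ILD test conditions can be produced for headphone playback (not the authors' stimulus-processing chain), the code below applies a chosen ITD and ILD combination to a pure tone; the function name and the sign convention are assumptions.

```python
import numpy as np

def itd_ild_stimulus(freq, itd_us, ild_db, duration=0.5, fs=48000):
    """Generate a headphone stimulus pair with a chosen ITD/ILD combination
    applied to a pure tone."""
    t = np.arange(int(duration * fs)) / fs
    itd = itd_us * 1e-6
    gain = 10 ** (ild_db / 20.0)
    # Positive ITD/ILD here favour the left ear: the right-ear signal is
    # delayed and attenuated relative to the left-ear signal.
    left = np.sin(2 * np.pi * freq * t)
    right = (1.0 / gain) * np.sin(2 * np.pi * freq * (t - itd))
    return left, right

# Example: a 570 Hz tone with ITD = 300 µs and ILD = 6 dB toward the left.
left, right = itd_ild_stimulus(570.0, 300.0, 6.0)
print(left.shape, right.shape)
```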
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster
 