AES Show 2024 NY has ended
Exhibits+ badges provide access to the ADAM Audio Immersive Room, the Genelec Immersive Room, Tech Tours, and the presentations on the Main Stage.

All Access badges provide access to all content in the Program (Tech Tours still require registration).

View the Exhibit Floor Plan.
1E03
Thursday, October 10
 

10:00am EDT

Delay detection in hearing with moving audio objects at various azimuths and bandwidths
Thursday October 10, 2024 10:00am - 10:20am EDT
To design efficient binaural rendering systems for 3D audio, it is important to understand how delays in updating the directions of sound sources relative to the listener, as compensation for the listener's head movements, affect the sense of realism. However, this problem has not yet been sufficiently studied. We therefore investigated the delay detection capability (threshold) of hearing during localization of audio objects. We used moving sound sources emitted from loudspeakers, emulating both smooth and delayed updates of head-related transfer functions (HRTFs), and investigated the delay detection threshold for different bandwidths, directions, and speeds of the sound source signals. The delay detection thresholds in this experiment were found to be approximately 100 ms to 500 ms, and the thresholds were observed to vary with the bandwidth and direction of the sound source. On the other hand, no significant variation in detection thresholds was observed with the speed of sound source movement.
Moderators
Sascha Dick

Sascha Dick received his Dipl.-Ing. degree in Information and Communication Technologies from the Friedrich Alexander University (FAU) of Erlangen-Nuremberg, Germany in 2011, with a thesis on an improved psychoacoustic model for spatial audio coding, and joined the Fraunhofer Institute…
Speakers
Masayuki Nishiguchi

Professor, Akita Prefectural University
Masayuki Nishiguchi received his B.E., M.S., and Ph.D. degrees from Tokyo Institute of Technology, University of California Santa Barbara, and Tokyo Institute of Technology, in 1981, 1989, and 2006, respectively. He was with Sony Corporation from 1981 to 2015, where he was involved…
Authors
Masayuki Nishiguchi

Professor, Akita Prefectural University
Thursday October 10, 2024 10:00am - 10:20am EDT
1E03

10:20am EDT

Expanding and Analyzing ODAQ - The Open Dataset of Audio Quality
Thursday October 10, 2024 10:20am - 10:40am EDT
Datasets of processed audio signals along with subjective quality scores are instrumental for research into perception-based audio processing algorithms and objective audio quality metrics. However, openly available datasets are scarce, because of the effort listening tests require and because copyright concerns limit the distribution of audio material in existing datasets. To address this problem, the Open Dataset of Audio Quality (ODAQ) was introduced, containing audio material along with extensive subjective test results under permissive licenses. The dataset comprises processed audio material with six different classes of signal impairments at multiple levels of processing strength, covering a wide range of quality levels. The subjective quality evaluation has recently been extended and now comprises results from three international laboratories, providing a total of 42 listeners and 10,080 subjective scores. Furthermore, ODAQ was recently expanded with a performance evaluation of common objective metrics for perceptual quality evaluation in their ability to predict subjective scores. The wide variety of audio material and test subjects provides insight into influences and biases in subjective evaluation, which we investigated by statistical analysis, finding listener-based, training-based, and lab-based influences. We also demonstrate the methodology for contributing to ODAQ and invite additional contributors. In conclusion, the diversity of processing methods and quality levels, along with a large pool of international listeners and permissive licenses, makes ODAQ particularly suited for further research into subjective and objective audio quality.
Moderators
Sascha Dick
Authors
Christoph Thompson

Director of Music Media Production, Ball State University
Christoph Thompson is vice-chair of the AES audio education committee. He is the chair of the AES Student Design Competition and the Matlab Plugin Design Competition. He is the director of the music media production program at Ball State University. His research topics include audio…
Pablo Delgado

Fraunhofer IIS
Pablo Delgado is part of the scientific staff of the Advanced Audio Research Group at the Fraunhofer Institute for Integrated Circuits (IIS) in Erlangen, Germany. He specializes in psychoacoustics applied to audio and speech coding, as well as machine learning applications in audio…
Thursday October 10, 2024 10:20am - 10:40am EDT
1E03

10:40am EDT

Perceptual Evaluation of Hybrid Immersive Audio Systems in Orchestral Settings
Thursday October 10, 2024 10:40am - 11:00am EDT
This study investigates the perceptual strengths and weaknesses of various immersive audio capture techniques within an orchestral setting, employing channel-based, object-based, and scene-based methodologies concurrently. Conducted at McGill University’s Pollack Hall in Montreal, Canada, the research featured orchestral works by Boulanger, Prokofiev, and Schubert, performed by the McGill Symphony Orchestra in April 2024.
The innovative aspect of this study lies in the simultaneous use of multiple recording techniques, employing traditional microphone setups such as a Decca tree with outriggers, alongside an experimental pyramidal immersive capture system and a 6th order Ambisonic em64 “Eigenmike.” These diverse methodologies were selected to capture the performance with high fidelity and spatial accuracy, detailing both the performance's nuances and the sonic characteristics imparted by the room. The capture of this interplay is the focus of this study.
The project aimed to document the hall's sound quality in its last orchestral performance before closing for 2 years for renovations, providing the methodology and documentation needed for future comparative recordings of the acoustics before and after. The pyramidal system, designed with exaggerated spacing, improves decorrelation at low frequencies, allowing for the impression of a large room within a smaller listening space. Meanwhile, Ambisonic recordings provided insights into single-point versus spaced multi-viewpoint capture.
Preliminary results from informal subjective listening sessions suggest that combining different systems offers potential advantages over any single method alone, supporting exploration of hybrid solutions as a promising area of study for audio recording, enhancing the realism and spatial immersion of orchestral music recordings.
Moderators
Sascha Dick
Speakers / Authors
Kathleen Ying-Ying Zhang

PhD Candidate, McGill University
Ying-Ying Zhang is a music technology researcher and sound engineer. She is currently a PhD candidate at McGill University in the Sound Recording program, where her research focuses on musician-centered virtual acoustic applications in recording environments. She received her Masters…
Richard King

Professor, McGill University
Richard King is an educator, researcher, and Grammy Award-winning recording engineer. He has garnered Grammy Awards in various fields, including Best Engineered Album in both the Classical and Non-Classical categories. He is an Associate Professor at the Schulich School…
Thursday October 10, 2024 10:40am - 11:00am EDT
1E03

11:00am EDT

Perception of the missing fundamental in vibrational complex tones
Thursday October 10, 2024 11:00am - 11:20am EDT
We present a study on the perception of the missing fundamental in vibrational complex tones. When asked to match an audible frequency to the frequency of a vibrational tone with a missing fundamental frequency, participants in two experiments associated the audible frequency with lower frequencies than those present in the vibration, often corresponding to the missing fundamental of the vibrational tone. This association was found regardless of whether the vibration was presented on the back (first experiment) or the feet (second experiment). One possible application of this finding could be the reinforcement of low frequencies via vibrational motors, even when such motors have high resonance frequencies.
Moderators
Sascha Dick
Thursday October 10, 2024 11:00am - 11:20am EDT
1E03

11:20am EDT

Perceptual loudness compensation for evaluation of personalized earbud equalization
Thursday October 10, 2024 11:20am - 11:40am EDT
Ear canal geometry differs considerably between individuals, resulting in significant variations in SPL at the drum reference point (DRP). Knowledge of the personalized transfer function from a near-field microphone (NFM) in the earbud speaker tip to the DRP allows personalized equalization (P.EQ). We developed a method to compensate for loudness perception when evaluating different personalized equalization filters for earbuds. The method includes: measurements at the NFM to estimate the personalized transfer function from the NFM point to the DRP, calibration of the NFM microphone, acquisition of the transfer function from the earbuds' speaker terminals to the NFM, and estimation of perceptual loudness in phons at the DRP when applying the different equalization filters. Loudness was computed using the Moore-Glasberg method as implemented in ISO 532-2. The gain was adjusted recursively using pink noise until the estimated loudness difference at the DRP was within 0.1 dB, and the resulting gains were applied to the different conditions under evaluation. A listening test was performed to evaluate three conditions using the described method for loudness compensation.
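The recursive gain adjustment described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: a simple RMS level proxy stands in for the Moore-Glasberg loudness model (ISO 532-2), and the function names are hypothetical, not the authors' implementation.

```python
import math

def level_db(signal):
    """Level proxy in dB (RMS). The paper's method instead estimates
    loudness in phons at the DRP via ISO 532-2."""
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    return 20.0 * math.log10(rms)

def match_level(signal, target_db, tol_db=0.1, max_iter=50):
    """Recursively adjust a linear gain until the measured level is
    within tol_db (0.1 dB in the abstract) of the target."""
    gain = 1.0
    for _ in range(max_iter):
        delta = target_db - level_db([gain * x for x in signal])
        if abs(delta) <= tol_db:
            break
        gain *= 10.0 ** (delta / 20.0)  # convert the dB error to a linear gain step
    return gain
```

With a linear RMS proxy this converges in one step; with a nonlinear loudness model the loop may need several passes, which is why a recursive adjustment is used at all.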
Moderators
Sascha Dick
Speakers
Adrian Celestinos

Samsung Research America
Thursday October 10, 2024 11:20am - 11:40am EDT
1E03

11:40am EDT

The audibility of true peak distortion (0 dBFS+)
Thursday October 10, 2024 11:40am - 12:00pm EDT
In a recent study, the authors interviewed five professional mastering engineers on the topic of contemporary loudness practices in music. Among the findings, all five mastering engineers targeted peak levels very close to 0 dBFS and seemed somewhat unconcerned about true peak distortion emerging in the transcoding process, not adhering to the current recommendation of not exceeding −1 dB true peak. Furthermore, true peak measurements over the last four decades show that quite a few releases even measure true peaks above 0 dBFS in full quality. The aim of this study is to investigate the audibility of such overshoots by conducting a tailored listening test. The results indicate that even experienced and trained listeners may not be very sensitive to true peak distortion.
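True peaks above 0 dBFS (0 dBFS+) arise because the continuous waveform reconstructed between samples can exceed the largest stored sample value. The sketch below estimates the inter-sample peak by truncated-sinc interpolation; it is a simplified stand-in for the 4x-oversampling true-peak meter of ITU-R BS.1770, with an illustrative (unwindowed) kernel rather than the standard's filter.

```python
import math

def true_peak(samples, oversample=4, taps=16):
    """Estimate the inter-sample ("true") peak in dB by evaluating a
    truncated-sinc reconstruction at `oversample` points per sample."""
    n = len(samples)
    peak = max(abs(s) for s in samples)  # start from the plain sample peak
    for i in range(n * oversample):
        t = i / oversample  # fractional position, in samples
        acc = 0.0
        for k in range(max(0, int(t) - taps), min(n, int(t) + taps + 1)):
            x = t - k
            # sinc interpolation: each sample contributes sin(pi x)/(pi x)
            acc += samples[k] if x == 0 else samples[k] * math.sin(math.pi * x) / (math.pi * x)
        peak = max(peak, abs(acc))
    return 20.0 * math.log10(peak)  # dB relative to full scale
```

For example, a sine at a quarter of the sampling rate whose samples all sit at ±0.707 (−3 dBFS sample peak) reconstructs to roughly 0 dBFS between samples, which is how material mastered "very close to 0 dBFS" can overshoot after transcoding.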
Moderators
Sascha Dick
Speakers
Pål Erik Jensen

University College Teacher, Høyskolen Kristiania
Teaching, audio production, music studio production, Pro Tools, guitar, bass
Authors
Pål Erik Jensen

University College Teacher, Høyskolen Kristiania
Tore Teigland

Professor, Kristiania University College
Thursday October 10, 2024 11:40am - 12:00pm EDT
1E03

12:00pm EDT

Evaluation of sound colour in headphones used for monitoring
Thursday October 10, 2024 12:00pm - 12:20pm EDT
Extensive studies have been made into achieving generally enjoyable sound colour in headphone listening, but few publications focus on the demanding requirements of a single audio professional and what they actually hear. The present paper describes a structured and practical method, based on in-room monitoring, for getting to know yourself as a headphone listener, and the particular model and pair you are using. Headphones provide fundamentally different listening results from in-room monitoring that adheres to professional standards, considering imaging, auditory envelopment, localization, haptic cues, etc. Moreover, in headphone listening there may be no direct connection between the frequency response measured with a generic manikin and what a given user hears. Finding out just how a pair of headphones deviates from neutral sound colour must therefore be done personally. An evaluation scheme based on an ultra-nearfield reference system is described, augmented by a defined test setup and procedure.
Moderators
Sascha Dick
Speakers
Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund has authored papers on human perception, spatialisation, loudness, sound exposure, and true-peak level. He is a researcher at Genelec and convenor of a working group on hearing health under the European Commission. Out of a medical background, Thomas previously served in…
Authors
Thomas Lund

Senior Technologist, Genelec Oy
Thursday October 10, 2024 12:00pm - 12:20pm EDT
1E03

2:20pm EDT

2nd order Boom Microphone
Thursday October 10, 2024 2:20pm - 2:40pm EDT
Boom microphones are mostly used in noisy environments and are equipped with directional microphones to enhance speech pickup from talkers and suppress noise from the surroundings. Theory suggests that using a 2nd order directional microphone array, that is, a pair of directional microphones, would greatly increase the amount of far-field noise rejection with negligible degradation of the near-field pickup. We conducted a series of laboratory measurements to validate the theory and assess the feasibility of 2nd order boom microphone applications. The measurements used a Knowles Electronics Manikin for Acoustic Research (KEMAR) with two loudspeakers playing cockpit noise, one placed on-axis in front of the KEMAR and one to the side; a mouth simulator inside the KEMAR played an artificial voice. Measurements were made at two distances, 6 and 25 mm, from the KEMAR mouth with a 1st order boom microphone (a single directional boom microphone) and a 2nd order boom microphone. SPICE simulations were also conducted to support the experimental findings. Both the measurements and the SPICE simulations confirmed the theory until the 2nd order boom microphone was placed near the KEMAR: there, reflections off the KEMAR head degraded the performance of the 2nd order microphone while improving that of the 1st order microphone. The net result shows that the 2nd order microphone is not superior to the 1st order microphone. This article describes the theory, our experimental measurements, and the findings in detail.
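The near-field-versus-far-field argument behind higher-order differential arrays can be illustrated numerically. The sketch below models each microphone as an ideal omni element sampling a free-field spherical wave and takes Nth-order finite differences along the source axis; the spacing, distances, and frequency are illustrative assumptions, not the paper's measurement setup, and head reflections (the effect that broke the theory near the KEMAR) are deliberately absent.

```python
import cmath
import math

C = 343.0  # speed of sound in air, m/s

def mic_pressure(r, f):
    """Complex pressure of a point source at distance r (spherical wave)."""
    k = 2.0 * math.pi * f / C  # wavenumber
    return cmath.exp(-1j * k * r) / r

def diff_array_output(r_src, f, d, order):
    """On-axis output magnitude of an order-N differential array:
    the Nth finite difference of pressures at mics spaced d apart."""
    p = [mic_pressure(r_src + i * d, f) for i in range(order + 1)]
    for _ in range(order):
        p = [p[i] - p[i + 1] for i in range(len(p) - 1)]
    return abs(p[0])
```

Comparing a talker-distance source (e.g. 25 mm) against a far noise source (e.g. 2 m) at a speech frequency, the 2nd-order difference yields a larger near-to-far output ratio than the 1st-order one: extra far-field rejection at modest near-field cost, exactly the free-field prediction the paper set out to test.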
Moderators
Paul Geluso

Director of the Music Technolo, New York University
Speakers
Jeong Nyeon Kim

Senior Electro-Acoustic Application Engineer, Knowles Electronics
Authors
Jeong Nyeon Kim

Senior Electro-Acoustic Application Engineer, Knowles Electronics
Thursday October 10, 2024 2:20pm - 2:40pm EDT
1E03

2:40pm EDT

Improved Analogue-to-Digital Converter for High-quality Audio
Thursday October 10, 2024 2:40pm - 3:00pm EDT
A highly oversampled, low-bit modulator typical of modern audio ADCs needs a downsampler to provide PCM audio at sampling rates ranging from 44.1 kHz to 768 kHz. Traditionally, a multistage downsampler requantizes at each stage, raising questions about audio transparency. We present a decimator design with no requantization other than a single dithered quantization, applied only when necessary to produce a final audio output of finite precision such as 24 bits. All processing is minimum-phase and concordant with the principles introduced in [1], which optimize for a specific compact impulse response and minimal (zero) modulation noise. [2]
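The contrast with a multistage chain can be sketched generically: keep full floating-point precision through a single FIR filter-and-decimate stage, then quantize once with dither at the output. This is an illustration of the general single-quantization idea, under assumed placeholder coefficients and TPDF dither; it is not the paper's minimum-phase filter design.

```python
import random

def decimate(x, h, factor):
    """Single-stage FIR decimation: convolve with h and keep every
    `factor`-th output sample, all in full (float) precision."""
    y = []
    for n in range(0, len(x), factor):
        acc = 0.0
        for k, hk in enumerate(h):
            if 0 <= n - k < len(x):
                acc += hk * x[n - k]
        y.append(acc)
    return y

def quantize_tpdf(x, bits=24):
    """The single, final quantization: round to `bits` bits with
    TPDF dither of +/-1 LSB (for signals scaled to +/-1.0)."""
    q = 2.0 ** (1 - bits)  # quantization step
    out = []
    for v in x:
        d = (random.random() - random.random()) * q  # triangular-PDF dither
        out.append(round((v + d) / q) * q)
    return out
```

In a multistage design each `decimate` stage would be followed by its own requantization; here the word length is only reduced once, at the very end, which is the transparency argument the abstract makes.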
Moderators
Paul Geluso
Thursday October 10, 2024 2:40pm - 3:00pm EDT
1E03

3:00pm EDT

The sound of storytelling: the role of sound design and music in the ’drama’ genre of film
Thursday October 10, 2024 3:00pm - 3:20pm EDT
The integration of sound effects and music plays a central role in shaping the audience's emotional engagement and narrative comprehension in film. The 'drama' genre is primarily concerned with depicting human emotions and narrative-based storytelling. Ten scenes were analysed, with participants in three groups exposed to six combinations of audio and visual stimuli. Participants reported salient sounds and their interpretations, focusing on context and emotional responses. One hypothesis is that effective sound design blurs the line between music and sound effects; another is that music conveys more emotion while sound effects enhance immersion. The results showed that 63% of participants found the score more relevant to the context. The evaluation highlights that music alone emphasizes certain emotions more, while sound effects alone create moderate variability between emotion and sound identification.
Moderators
Paul Geluso
Thursday October 10, 2024 3:00pm - 3:20pm EDT
1E03
 