AES Show 2024 NY has ended
Exhibits+ badges provide access to the ADAM Audio Immersive Room, the Genelec Immersive Room, Tech Tours, and the presentations on the Main Stage.

All Access badges provide access to all content in the Program (Tech Tours still require registration).

View the Exhibit Floor Plan.
Tuesday, October 8
 

8:00am EDT

Attendee Registration
Tuesday October 8, 2024 8:00am - 6:00pm EDT
Crystal Palace South

9:00am EDT

Student Welcome Meeting
Tuesday October 8, 2024 9:00am - 9:30am EDT
Students! Join us so we can find out where you are from and tell you about all the exciting things happening at the convention.
Speakers
Ian Corbett

Coordinator & Professor, Audio Engineering & Music Technology, Kansas City Kansas Community College
Dr. Ian Corbett is the Coordinator and Professor of Audio Engineering and Music Technology at Kansas City Kansas Community College. He also owns and operates off-beat-open-hats LLC, providing live sound, recording, and audio production services to clients in the Kansas City area...

Angela Piva

Angela Piva, Audio Pro/Audio Professor, highly skilled in all aspects of music & audio production, recording, mixing and mastering with over 35 years of professional audio engineering experience and accolades. Known as an innovator in sound technology, and for contributing to the...
Tuesday October 8, 2024 9:00am - 9:30am EDT
1E08

9:00am EDT

NPR's Tiny Desk: A look back and a look forward
Tuesday October 8, 2024 9:00am - 10:00am EDT
NPR's Tiny Desk Concert Series has seen an increase in popularity post-pandemic and is now a must for up-and-coming as well as established artists. Neil Tevault started the concert series when Bob Boilen came to NPR's music studio with an idea to record an artist at his desk. Josh Newell, the current technical director of NPR Music, is working to keep the standards set at the highest level and move the concert series into the future.

Neil will present a couple of concerts from early in the series, looking at the small recording footprint that was employed: the very first concert, with Laura Gibson, which used a single microphone and a DI, and Bela Fleck, Edgar Meyer, and Zakir Hussain, who played around a single stereo shotgun microphone.

Josh will present a couple of recent concerts to show how everything has grown: the microphones have moved closer and increased in number to capture the sound the Tiny Desk is known for. Josh will also look at what the team is doing to make the recording process work better and run more efficiently on concert day. We'll discuss Josh's vision for the sound of the Tiny Desk moving into the future.

We'll also leave ample time for Q&A.

Tuesday October 8, 2024 9:00am - 10:00am EDT
1E09

9:00am EDT

Personalized Spatial Audio for Accessibility in Xbox Gaming
Tuesday October 8, 2024 9:00am - 10:00am EDT
In this enlightening panel, technologists from the Xbox platform and creatives from Xbox studios will join us to discuss how they are driving audio innovation and game sound design towards the vision of gaming for everyone. The discussion will focus on how personalized spatial audio can foster inclusivity by accommodating the unique auditory profiles of different ethnicities, genders, and age groups of gamers. By integrating these personalized Head-Related Transfer Functions (HRTFs) into audio middleware, we aim to enhance the Xbox gaming experience for all gamers. This approach not only enriches the auditory landscape but also breaks down barriers, making immersive gaming a truly inclusive experience. Join us as we explore the future of spatial audio on Xbox, where every gamer is heard and can fully engage with the immersive worlds we create.
Speakers
Kaushik Sunder

VP of Engineering, Embody

Kapil Jain

CEO, Embody
We are Embody, and we believe in the power of AI to push the boundaries of immersive sound experience in gaming and entertainment. We are a team of data scientists, audio engineers, musicians and gamers who are building an ecosystem of tools and technologies for immersive entertainment...

Robert Ridihalgh

Technical Audio Specialist, Microsoft
A 33-year veteran of the games industry, Robert is an audio designer, composer, integrator, voice expert, and programmer with a passion for future audio technologies and audio accessibility.
Tuesday October 8, 2024 9:00am - 10:00am EDT
1E16

9:00am EDT

Adapting Immersive Microphone Techniques for Different Acoustic Spaces
Tuesday October 8, 2024 9:00am - 10:00am EDT
By now we know a lot about how immersive microphone arrays work, from fully coincident, to semi-coincident, to fully spaced, and how they can be used to make spectacular 3D productions. But what happens when we use the same recording techniques in very different acoustic spaces, with different-sized ensembles? Various high-resolution immersive recordings of acoustic music from around the globe, made in extraordinary acoustic spaces including Skywalker Scoring Stage, Zlin Concert Hall, Notre Dame University, NYU's Paulson Center, CNSO Studios, Smecky Studios, and other exceptional churches and concert halls, will be presented. Along with critical listening exercises, we will share microphone set-ups, multi-track recordings, videos, and photo documentation of the spaces with the participants.
Speakers
Paul Geluso

Director of the Music Technology Program, New York University

David Bowles

Owner, Swineshead Productions, LLC
David v.R Bowles formed Swineshead Productions, LLC as a classical recording production company in 1995. His recordings have been GRAMMY- and JUNO-nominated and critically acclaimed worldwide. His releases in 3D Dolby Atmos can be found on Avie, OutHere Music (Delos) and Navona labels. Mr...
Tuesday October 8, 2024 9:00am - 10:00am EDT
1E06

9:00am EDT

Best practices for wireless audio in live productions
Tuesday October 8, 2024 9:00am - 10:00am EDT
Wireless audio, both mics and in-ear monitors, has become essential in many live productions of music and theatre, but it is often fraught with uneasiness and uncertainty. The panel of presenters will draw on their varied experience and knowledge to show how practitioners can use best engineering practices to ensure the reliability and performance of their wireless mic and in-ear monitor systems.
Speakers
Bob Lee

Applications Engineer / Trainer, RF Venue, Inc.
I'm a fellow of the AES, an RF and electronics geek, and live audio specialist, especially in both amateur and professional theater. My résumé includes Sennheiser, ARRL, and a 27-year-long tenure at QSC. Now I help live audio practitioners up their wireless mic and IEM game. I play...

Henry Cohen

ASR Co-Chair, AES 151
Tuesday October 8, 2024 9:00am - 10:00am EDT
1E07

9:00am EDT

dLive Certification
Tuesday October 8, 2024 9:00am - 12:00pm EDT
In addition to long-form dLive Certification classes, training sessions will cover a variety of topics relevant to live sound engineers, including: Mixing Monitors from FOH; Vocal Processing; Groups, Matrices and DCAs; Active Dynamics; and Gain Staging.
 
The training will be led by industry veterans Michael Bangs and Jake Hartsfield. Bangs, whose career includes experience as a monitor engineer and production manager, has worked with A-list artists, including Aerosmith, Katy Perry, Tom Petty, Lynyrd Skynyrd and Kid Rock. Hartsfield is a seasoned live sound engineer, having mixed for artists like Vulfpeck, Ben Rector, Fearless Flyers, and more.

Sign up link: https://zfrmz.com/DmSlX5gyZCfjrJUHa6bV
Tuesday October 8, 2024 9:00am - 12:00pm EDT
1E05

9:30am EDT

An Industry Focused Investigation into Immersive Commercial Melodic Rap Production - Part Two
Tuesday October 8, 2024 9:30am - 9:50am EDT
In part one of this study, five professional mixing engineers were asked to create a Dolby Atmos 7.1.4 mix of the same melodic rap song adhering to the following commercial music industry specifications: follow the framework of the stereo reference, implement binaural distance settings, and conform to -18 LKFS and -1 dBTP loudness levels. An analysis of the mix sessions and post-mix interviews with the engineers revealed that they felt creatively limited in their approaches due to the imposed industry specifications. The restricted approaches were evident through the minimal applications of mix processing, automation, and traditional positioning of key elements in the completed mixes.
In part two of this study, the same mix engineers were asked to complete a second mix of the same song without any imposed limitations and were encouraged to approach the mix creatively. Intra-subject comparisons between the restricted and unrestricted mixes were explored to identify differences in element positioning, mix processing techniques, panning automation, loudness levels, and binaural distance settings. Analysis of the mix sessions and interviews showed that when no restrictions were imposed on their work, the mix engineers emphasized the musical narrative through more diverse element positioning, increased use of automation, and applications of additional reverb with characteristics that differed from the reverb in the source material.
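For readers who want to check their own renders against the spec above, here is a minimal Python sketch (assuming the pyloudnorm and soundfile packages are installed; the file name is hypothetical) that reports integrated loudness and an oversampled true-peak estimate:

```python
# Minimal loudness-compliance check against a -18 LKFS / -1 dBTP spec.
import numpy as np
import soundfile as sf
import pyloudnorm as pyln
from scipy.signal import resample_poly

data, rate = sf.read("binaural_fold_down.wav")  # hypothetical render

# Integrated loudness per ITU-R BS.1770 (pyloudnorm implements the gating).
meter = pyln.Meter(rate)
lkfs = meter.integrated_loudness(data)

# Approximate true peak by 4x oversampling before taking the maximum.
oversampled = resample_poly(data, up=4, down=1, axis=0)
dbtp = 20 * np.log10(np.max(np.abs(oversampled)))

print(f"Integrated loudness: {lkfs:.1f} LKFS (target -18)")
print(f"Approx. true peak:   {dbtp:.1f} dBTP (ceiling -1)")
```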
Moderators
Agnieszka Roginska

Professor, New York University
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented...
Speakers
Christal Jerez

Engineer, Christal's Sonic Lab
Christal Jerez is an audio engineer with experience recording, mixing and mastering music. After studying audio production at American University for her B.A. in Audio Production and at New York University for her Masters degree in Music Technology, she started working professionally...
Authors

Christal Jerez

Engineer, Christal's Sonic Lab
Christal Jerez is an audio engineer with experience recording, mixing and mastering music. After studying audio production at American University for her B.A. in Audio Production and at New York University for her Masters degree in Music Technology, she started working professionally...

Andrew Scheps

Owner, Tonequake Records
Andrew Scheps has worked with some of the biggest bands in the world: Green Day, Red Hot Chili Peppers, Weezer, Audioslave, Black Sabbath, Metallica, Linkin Park, Hozier, Kaleo and U2. He’s worked with legends such as Johnny Cash, Neil Diamond and Iggy Pop, as well as indie artists...

Hyunkook Lee

Professor, Applied Psychoacoustics Lab, University of Huddersfield
Tuesday October 8, 2024 9:30am - 9:50am EDT
1E03

9:30am EDT

Physiological measurement of the arousing effect of bass amplification in music
Tuesday October 8, 2024 9:30am - 10:00am EDT
Music's amazing ability to evoke emotions has been the focus of various scientific studies, with researchers testing how different musical structures or interpretations impact the emotions induced in the listener. However, in the context of amplified music, little is known about the influence of the sound reinforcement system. In this study, we investigate whether the amount of low-frequency amplification produced by a sound system impacts the listener's arousal. We organized two listening experiments in which we measured the skin conductance of participants while they listened to music excerpts with different levels of low-frequency amplification. Our results indicate that an increase in the level of bass is correlated with a small but measurable rise in electrodermal activity, which is in turn correlated with arousal. In addition, this effect seems to depend on the nature of the music.
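As a toy illustration of the kind of analysis described, the sketch below correlates bass level with mean electrodermal activity per trial; the arrays are synthetic placeholders, not the study's data:

```python
# Toy sketch: correlate bass gain with mean electrodermal activity (EDA).
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
bass_gain_db = np.tile([-6, 0, 6], 20)  # 3 bass conditions x 20 trials
# Synthetic EDA values with a small positive dependence on bass level.
eda_microsiemens = 2.0 + 0.02 * bass_gain_db + rng.normal(0, 0.1, bass_gain_db.size)

r, p = pearsonr(bass_gain_db, eda_microsiemens)
print(f"r = {r:.2f}, p = {p:.3g}")  # a small positive r mirrors the reported effect
```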
Moderators
Brett Leonard

Director of Music Technology Programs, University of Indianapolis
Speakers

Nicolas Epain

Application Research Engineer, L-Acoustics
Authors

Nicolas Epain

Application Research Engineer, L-Acoustics

Thomas Mouterde

Field application research engineer, L-Acoustics
Thomas Mouterde is a field application research engineer at L-Acoustics, a French manufacturer of loudspeakers, amplifiers, and signal processing devices. He is a member of the “Education and Scientific Outreach” department that aims at developing the education program of the...
Tuesday October 8, 2024 9:30am - 10:00am EDT
1E04

9:45am EDT

Introducing the 2025 AES AI and ML for Audio Conference and its New Format
Tuesday October 8, 2024 9:45am - 10:30am EDT
The upcoming 2025 AES International Conference on Artificial Intelligence and Machine Learning for Audio (AIMLA) aims to foster a collaborative environment where researchers and practitioners from academia and industry can converge to share their latest work in Artificial Intelligence (AI) and Machine Learning (ML) for Audio.

We want to advertise the upcoming AIMLA at the AES Show, to encourage early involvement and awareness from the AES community. To better accommodate the central themes of the conference, we propose new additions to the typical AES proceedings, such as challenges and long workshops, that can more appropriately showcase the rapidly growing state of the art. In this presentation, we plan to give an overview and a discussion space about the upcoming conference, and the changes we want to bring into play, tailored for AI/ML research communities, with references to successfully organized cases outside of AES. Finally, we propose a standardized template with guidelines for hosting crowdsourced challenges and presenting long workshops.

Challenges are a staple in the ML/AI community, providing a platform where specific problems are tackled by multiple teams who develop and submit models to address the given issue. These events not only spur competition but also encourage collaboration and knowledge sharing, ultimately driving forward the collective understanding and capabilities of the community.

Complementing the challenges, we introduce long-format workshops to exchange knowledge about emerging AI approaches in audio. These workshops can help develop novel approaches from the ground up and produce high-quality material for diffusion among participants. Both additions could help the conference become an exciting and beneficial event at the forefront of AI/ML for audio, as they intend to cultivate a setting where ideas can be exchanged effectively, drawing inspiration from established conferences such as ISMIR, DCASE, and ICASSP, which have successfully fostered AI/ML communities.

As evidenced by the recent AES International Symposium on AI and the Musician, we believe AI and ML will play an increasingly important role in audio and music engineering. To facilitate and standardize the procedures for featuring and conducting challenges and long-form workshops, we will present a complete guideline for hosting long-form workshops and challenges at AES conferences.

Our final goal is to promote the upcoming 2025 International Conference on AI and Machine Learning for Audio, generate a space to discuss the new additions and ideas, connect with interested parties, advertise and provide guidelines regarding the calls for crowd-sourced challenges and workshops, and ultimately get feedback from the AES as a whole to tailor the new conference to the requirements of both our AES and the AI/ML communities.
Speakers
Soumya Sai Vanka

PhD Researcher, Queen Mary University of London
I am a doctoral researcher at the Centre for Digital Music, Queen Mary University of London, under the AI and Music Centre for Doctoral Training Program. My research focuses on the design of user-centric, context-aware AI-based tools for music production. As a hobbyist musician and producer myself, I am interested in developing tools that can support creativity and collaboration resulting in emergence and novelty. I am also interested...

Franco Caspe

Student, Queen Mary University of London
I’m an electronic engineer, a maker, hobbyist musician and a PhD Student at the Artificial Intelligence and Music CDT at Queen Mary University of London. I have experience in development of real-time systems for applications such as communication, neural network inference, and DSP...

Brecht De Man

Head of Research, PXL University of Applied Sciences and Arts
Brecht is an audio engineer with a broad background comprising research, software development, management and creative practice. He holds a PhD from the Centre for Digital Music at Queen Mary University of London on the topic of intelligent software tools for music production, and...
Tuesday October 8, 2024 9:45am - 10:30am EDT
1E08

9:50am EDT

Investigation of spatial resolution of first and high order ambisonics microphones as capturing tool for auralization of real spaces in recording studios equipped with virtual acoustics systems
Tuesday October 8, 2024 9:50am - 10:10am EDT
This paper proposes a methodology for studying the spatial resolution of a collection of first-order and high-order ambisonic microphones when employed as a capturing tool of Spatial Room Impulse Responses (SRIRs) for virtual acoustics applications. In this study, the spatial resolution is defined as the maximum number of statistically independent mono impulse responses that can be extracted through beamforming techniques and used in multichannel convolution reverbs. The correlation of the responses is assessed as a function of the beam angle and frequency bands, adapted to the frequency response of the loudspeakers in use, with the aim of use in recording studios equipped with virtual acoustics systems that operate in the creation of the spatial impression of reverberation of real environments. The study examines the differences introduced by the physical characteristics of the microphones, the normalization methodologies of the spherical harmonics, and the number of spherical harmonics introduced in the encoding (ambisonic order). Preliminary results show that the correlation is inversely proportional to frequency as a function of wavelength.
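To make the core measurement concrete, here is a reduced first-order sketch (assuming ACN/SN3D "AmbiX" channel ordering; the SRIR array is hypothetical) that steers virtual cardioid beams at a B-format SRIR and reports the correlation between two beam directions:

```python
# Steer virtual cardioids at a first-order AmbiX SRIR and measure how
# correlated the extracted responses are as beam separation grows.
import numpy as np

def cardioid_beam(srir_wyzx, azimuth_rad):
    """Horizontal virtual cardioid from a first-order AmbiX (W, Y, Z, X) SRIR."""
    w, y, _, x = srir_wyzx
    # SN3D first order: cardioid = 0.5*W + 0.5*(X*cos(az) + Y*sin(az))
    return 0.5 * w + 0.5 * (x * np.cos(azimuth_rad) + y * np.sin(azimuth_rad))

def beam_correlation(srir_wyzx, az1, az2):
    a = cardioid_beam(srir_wyzx, az1)
    b = cardioid_beam(srir_wyzx, az2)
    return np.corrcoef(a, b)[0, 1]

# e.g. srir = np.load("srir_ambix.npy")  # shape (4, num_samples), hypothetical
# print(beam_correlation(srir, 0.0, np.pi / 3))
```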
Moderators
Agnieszka Roginska

Professor, New York University
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented...
Speakers
Gianluca Grazioli

McGill University, Montreal, Canada
Authors
Tuesday October 8, 2024 9:50am - 10:10am EDT
1E03

10:00am EDT

Exploring trends in audio mixes and masters: Insights from a dataset analysis
Tuesday October 8, 2024 10:00am - 10:30am EDT
We present an analysis of a dataset of audio metrics and aesthetic considerations about mixes and masters provided by the web platform MixCheck studio. The platform is designed for educational purposes, primarily targeting amateur music producers, and aimed at analysing their recordings prior to them being released. The analysis focuses on the following data points: integrated loudness, mono compatibility, presence of clipping and phase issues, compression and tonal profile across 30 user-specified genres. Both mixed (mixes) and mastered audio (masters) are included in the analysis, where mixes refer to the initial combination and balance of individual tracks, and masters refer to the final refined version optimized for distribution. Results show that loudness-related issues along with dynamics issues are the most prevalent, particularly in mastered audio. However, mastered audio presents better compression results than mixes alone. Additionally, results show that mastered audio has a lower percentage of stereo field and phase issues.
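Two of the checks mentioned, inter-channel correlation and mono fold-down loss, can be sketched in a few lines of Python; the file name is hypothetical and this is an illustration, not MixCheck studio's implementation:

```python
# Phase and mono-compatibility checks on a stereo file.
import numpy as np
import soundfile as sf

audio, rate = sf.read("master.wav")  # hypothetical, shape (n, 2)
left, right = audio[:, 0], audio[:, 1]

# Correlation near -1 signals phase problems; near +1 is mono-like.
phase_corr = np.corrcoef(left, right)[0, 1]

# RMS lost when summing to mono, in dB; large drops imply cancellation.
rms = lambda x: np.sqrt(np.mean(x**2))
mono_loss_db = 20 * np.log10(rms(0.5 * (left + right)) / rms(audio))

print(f"phase correlation: {phase_corr:+.2f}, mono loss: {mono_loss_db:.1f} dB")
```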
Moderators
Brett Leonard

Director of Music Technology Programs, University of Indianapolis
Speakers
David Ronan

CEO, RoEx
Authors
Tuesday October 8, 2024 10:00am - 10:30am EDT
1E04

10:10am EDT

A comparative study of volumetric microphone techniques and methods in a classical recording context
Tuesday October 8, 2024 10:10am - 10:30am EDT
This paper studies volumetric microphone techniques (i.e., using configurations of multiple Ambisonic microphones) in a classical recording context. A pilot study with expert opinions was designed to show its feasibility. Based on the findings from the pilot study, a trio recording of piano, violin, and cello was conducted in which six Ambisonic microphones formed a hexagon. Such a volumetric approach is believed to improve the sound characteristics; the recordings were processed with the SoundField by RØDE Ambisonic decoder and produced for a 7.0.4 loudspeaker system. A blinded subjective experiment was designed in which participants were asked to evaluate the volumetric hexagonal configuration, comparing it to a more traditional 5.0 immersive configuration and a single Ambisonic microphone, all of which were mixed with spot microphones. The results were quantitatively analyzed and revealed that the volumetric configuration is the most localized of all, but less immersive than the single Ambisonic microphone. No significant difference occurred in focus, naturalness, and preference. The analyses generalize across listeners, as the demographic backgrounds of the participants had no effect on the rated sound characteristics.
Moderators
Agnieszka Roginska

Professor, New York University
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented...
Speakers Authors
Parichat Songmuang

Studio Manager/PhD Student, New York University
Parichat Songmuang graduated from New York University with her Master of Music degree in Music Technology and Advanced Certificate in Tonmeister Studies. As an undergraduate, she studied for her Bachelor of Science in Electronic Media and Film with a concentration...

Paul Geluso

Director of the Music Technology Program, New York University
Tuesday October 8, 2024 10:10am - 10:30am EDT
1E03

10:15am EDT

For Future Use - Surround and Immersive from Familiar Sources
Tuesday October 8, 2024 10:15am - 11:15am EDT
In a perfect world, one would record projects in the perfect space, with musicians performing in perfect balance, with perfect results. But for us the perfect world doesn't exist, and we're presented with projects that were never intended for surround or immersive presentation. In this illustrated talk, we will play projects that were recorded (even in the last century!) and have been turned into successful immersive and surround recordings.
Speakers
Jim Anderson

Producer/Engineer, Anderson Audio New York

Ulrike Schwarz

Engineer/Producer, Co-Founder, Anderson Audio New York
Tuesday October 8, 2024 10:15am - 11:15am EDT
1E06

10:15am EDT

Adventures in Livestreaming
Tuesday October 8, 2024 10:15am - 11:45am EDT
During the Covid pandemic, streaming was the only way for musicians and artists to connect. As the pandemic subsided, many organizations continued to stream and livestream, and oftentimes it was the audio engineering team who were left to figure out how to produce video. Join this panel of experienced engineers and educators who found themselves learning how to produce compelling live concert video. The special challenges of livestreaming (i.e. streaming without a net) will be discussed, along with lessons learned. It is the intention of the panelists to provide encouragement to fellow engineers who are learning this new skillset themselves.
Speakers
Scott Burgess

Director, Audio and Media Production, Aspen Music Festival and School
Scott Burgess has worked on all facets of symphonic production. A graduate of Interlochen and the Cleveland Institute of Music, he has played bassoon and contrabassoon, sung with the Cleveland Orchestra Chorus, and produced, engineered, or edited numerous orchestral and chamber music...

Mary Mazurek

Audio Educator/Recording Engineer, University of Lethbridge
Tuesday October 8, 2024 10:15am - 11:45am EDT
1E09

10:15am EDT

The Plugin Kitchen: Coding a custom effect and putting it in the mix
Tuesday October 8, 2024 10:15am - 11:45am EDT
We are excited to present an educational workshop centered around the creation of custom audio plugins in Matlab. This hands-on session is designed to introduce both students and educators to the endless possibilities of plugin development, while also encouraging participation in AES student competitions and fostering increased membership.

The workshop, titled “The Plugin Kitchen: Coding a custom effect and putting it in the mix,” will offer a unique blend of real-time coding and practical application. Participants will be guided through the process of developing a comprehensive channel strip plugin, incorporating an EQ, a compressor, and a reverb effect.

Our session will begin with an introduction to audio plugins and their importance in modern music production. We will then dive into the hands-on coding segments:

1. Multi-band Parametric EQ:
Participants will learn to implement a parametric EQ with multiple bands, adjusting parameters such as frequency and gain.

2. Compressor:
We will use Matlab’s built-in functions to create a compressor, covering essential concepts like threshold, ratio, attack, and release times. Attendees will see the immediate impact of these parameters on audio dynamics.

3. Reverb:
The final coding segment will focus on adding a reverb effect, including experimenting with decay time and pre-delay.

The workshop will culminate in a live demonstration by renowned engineer Paul Womack, showcasing the practical application of the developed plugin in a mixing session. This will illustrate the real-world benefits and versatility of the plugin, inspiring participants to explore further and engage with AES competitions.

Throughout the session, we emphasize interaction and practical application, ensuring that participants leave with both theoretical knowledge and a functional plugin they can continue to develop. Join us to unlock the potential of Matlab for audio plugin development and take a step towards innovative audio engineering.
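The workshop itself codes in Matlab; as a language-agnostic taste of the compressor segment, here is a minimal Python sketch of a feed-forward compressor with threshold, ratio, and one-pole attack/release smoothing of the gain:

```python
# Feed-forward compressor sketch: static gain curve plus smoothed gain.
import numpy as np

def compress(x, rate, threshold_db=-20.0, ratio=4.0, attack_ms=5.0, release_ms=100.0):
    eps = 1e-12
    level_db = 20 * np.log10(np.abs(x) + eps)
    # Static curve: reduce level above threshold according to the ratio.
    over = np.maximum(level_db - threshold_db, 0.0)
    target_gain_db = -over * (1.0 - 1.0 / ratio)
    # Smooth the gain with separate attack and release time constants.
    a_att = np.exp(-1.0 / (rate * attack_ms / 1000.0))
    a_rel = np.exp(-1.0 / (rate * release_ms / 1000.0))
    gain_db = np.empty_like(x)
    g = 0.0
    for n, tgt in enumerate(target_gain_db):
        a = a_att if tgt < g else a_rel  # falling gain = attack phase
        g = a * g + (1.0 - a) * tgt
        gain_db[n] = g
    return x * (10.0 ** (gain_db / 20.0))
```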
Speakers
Christoph Thompson

Director of Music Media Production, Ball State University
Christoph Thompson is vice-chair of the AES audio education committee. He is the chair of the AES Student Design Competition and the Matlab Plugin Design Competition. He is the director of the music media production program at Ball State University. His research topics include audio...

Chris Bennett

Professor, University of Miami

Paul Womack

Record Producer/Recording Engineer
A producer, engineer and sonic artist, Paul "Willie Green" Womack has built a discography boasting names such as Armand Hammer, Wiz Khalifa, The Alchemist, The Roots, Billy Woods, ELUCID and many more, and established himself as one of the top names in independent Hip-Hop & R&B. Currently...
Tuesday October 8, 2024 10:15am - 11:45am EDT
1E07

10:15am EDT

The Anatomy of a Recording Session: Where Audio Technology and Musical Creativity Intersect (Part III)
Tuesday October 8, 2024 10:15am - 11:45am EDT
Abstract: Humans and machines interact to create new and interesting music content. Here we look at video excerpts from a particular recording session where the behind-the-scenes action comes to the forefront. Artists working with each other, artists working with the producer and engineer, and the influence (good or bad) of the technology with which they work will all be discussed during the workshop.

Summary:
The workshop will center on a discussion of using recording studio sessions to study creativity as a collaborative, but often complex and subtle practice. Our interdisciplinary team of recording engineers/producers and musicologists aims to understand how the interactions of musicians, engineers, recording technology, and musical instruments shape a recording’s outcome. Statements by participant-observers and the analysis of video footage from recording sessions will provide the starting point for discussions. In addition to first-hand recollections by members of our team, we are interviewing musicians who participated in the sessions. The workshop will focus on both musical interactions and on the interpersonal dynamics that affect the flow of various contributions and ideas during the recording process. Technology used also plays a role and will be analyzed as part of the workshop. The first workshop of this kind was a huge success in Helsinki at the 154th AES Convention. The room was packed, and we had an engaging discussion between panelists and audience members. Part II took place last week in Madrid (156th Convention), with many attendees saying it was “one of the highlights of the convention”. The workshop room was quite full, even though it was in the last timeslot of the last day. For this third workshop we plan on looking under the hood of a recording session involving a funk band with full horn section, and three lead singers. We plan to dig deeply into analyzing underlying events that percolate over time; the “quiet voices” that subtly influence the outcome of a recording session.
Along with our regular team of experts we are very excited to invite a guest panelist this time around – a venerable expert in music production, collaboration, and perception, Dr. Susan Rogers. This workshop was also proposed for AES 155th New York last fall but was declined due to lack of presentation space. Our requirements are simply a PowerPoint video presentation with stereo audio playback.
Speakers
Richard King

Professor, McGill University
Richard King is an Educator, Researcher, and a Grammy Award winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School...

Lisa Barg

Associate Professor, McGill University
Lisa Barg is Associate Professor of Music History and Musicology at the Schulich School of Music at McGill University and Associate Dean of Graduate Studies. She has published articles on race and modernist opera, Duke Ellington, Billy Strayhorn, Melba Liston and Paul Robeson. She...

David Brackett

Professor, McGill University
David Brackett is Professor of Music History and Musicology at the Schulich School of Music of McGill University, and Canada Research Chair in Popular Music Studies. His publications include Interpreting Popular Music (2000), The Pop, Rock, and Soul Reader: Histories and Debates...

Susan Rogers

Professor, Berklee Online
Susan Rogers holds a doctoral degree in experimental psychology from McGill University (2010). Prior to her science career, Susan was a multiplatinum-earning record producer, engineer, mixer and audio technician. She is best known for her work with Prince during his peak creative...

George Massenburg

Associate Professor of Sound Recording, Massenburg Design Works
George Y. Massenburg is a Grammy award-winning recording engineer and inventor. Working principally in Baltimore, Los Angeles, Nashville, and Macon, Georgia, Massenburg is widely known for submitting a paper to the Audio Engineering Society in 1972 regarding the parametric equali...
Tuesday October 8, 2024 10:15am - 11:45am EDT
1E16

10:15am EDT

SEIDS guide to Building Sustainable Businesses for Music Creators
Tuesday October 8, 2024 10:15am - 11:45am EDT
In this workshop, you'll learn how to turn your love for music into a successful business. Acclaimed music producer Sabrina Seidman, aka SEIDS, whose tutorials have garnered thousands of views online, will talk about how to decide what you want to do, find the right clients, and make the products or services they need. You'll also learn how to win clients and create your own chances to succeed. By the end of the workshop, you'll have the tools to start building a music career that makes you happy and earns you money.
Speakers
Tuesday October 8, 2024 10:15am - 11:45am EDT
1E15

10:30am EDT

Bestiari: a hypnagogic experience created by combining complementary state-of-the-art spatial sound technologies, Catalan Pavilion, Venice Art Biennale 2024
Tuesday October 8, 2024 10:30am - 10:50am EDT
Bestiari, by artist Carlos Casas, is a spatial audio installation created as the Catalan pavilion for the 2024 Venice Art Biennale. The installation was designed for ambulant visitors and the use of informal seating arrangements distributed throughout the reproduction space, so the technical installation design did not focus on listeners’ presence in a single “sweet-spot”. While high-quality conventional spatial loudspeaker arrays typically provide excellent surround-sound experiences, the particular challenge of this installation was to reach into the proximate space of individual, dispersed and mobile listeners, rather than providing an experience that was only peripherally enveloping. To that end, novel spatial audio workflows and combinations of reproduction technologies were employed, including: High-order Ambisonic (HoA), Wavefield Synthesis (WFS), beamforming icosahedral (IKO), directional/parametric ultrasound, and infrasound. The work features sound recordings made for each reproduction technology, e.g., ambient Ambisonic soundfields recorded in Catalan national parks combined with mono and stereo recordings of specific insects in that habitat simultaneously projected via the WFS system. In-situ production provided an opportunity to explore the differing attributes of the reproduction devices and their interactions with the acoustical characteristics of the space – a concrete and brick structure with a trussed wooden roof, built in the late 1800s for the Venetian shipping industry. The practitioners’ reflections on this exploration, including their perception of the capabilities of this unusual combination of spatial technologies, are presented. Design, workflows and implementation are detailed.
Moderators
Agnieszka Roginska

Professor, New York University
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented...
Speakers
Craig Cieciura

Research Fellow, University of Surrey
Craig graduated from the Music and Sound Recording (Tonmeister) course at The University of Surrey in 2016. He then completed his PhD at the same institution in 2022. His PhD topic concerned reproduction of object-based audio in the domestic environment using combinations of installed...
Authors

Craig Cieciura

Research Fellow, University of Surrey
Craig graduated from the Music and Sound Recording (Tonmeister) course at The University of Surrey in 2016. He then completed his PhD at the same institution in 2022. His PhD topic concerned reproduction of object-based audio in the domestic environment using combinations of installed...
Tuesday October 8, 2024 10:30am - 10:50am EDT
1E03

10:30am EDT

Audience Effect in the Low-Frequency Range, Part 2: Impact on Time Alignment of Loudspeaker Systems
Tuesday October 8, 2024 10:30am - 11:00am EDT
A sound reinforcement system typically combines a full-range system with a subwoofer system to deliver a consistent frequency bandwidth. The two systems must be time-aligned, which is usually done without an audience. This paper investigates the impact of the audience on the time alignment of loudspeaker systems at low frequencies. The study demonstrates, through on-site measurements and simulations, that the audience significantly affects sound propagation. The research highlights the greater phase shift observed with ground-stacked subwoofers compared to flown systems due to the audience’s presence, requiring adjustment of the system time alignment with the audience in place when flown and ground-stacked sources are used together. Moreover, in this case, the results demonstrate the lower quality of the summation with the audience even after the alignment adjustment. Lastly, recommendations for system design and calibration are proposed.
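The underlying alignment measurement can be illustrated with a short sketch: estimate the delay between main-system and subwoofer impulse responses at the mix position by cross-correlating them in the crossover band (the IR arrays and sample rate are hypothetical):

```python
# Band-limited cross-correlation delay estimate between two measured IRs.
import numpy as np
from scipy.signal import butter, sosfiltfilt, correlate

def alignment_delay_ms(ir_main, ir_sub, rate, band=(40.0, 120.0)):
    sos = butter(4, band, btype="bandpass", fs=rate, output="sos")
    a = sosfiltfilt(sos, ir_main)
    b = sosfiltfilt(sos, ir_sub)
    lag = np.argmax(correlate(a, b, mode="full")) - (len(b) - 1)
    return 1000.0 * lag / rate  # positive: sub arrives earlier than mains

# Comparing empty-room vs with-audience IRs would expose the phase-shift
# effect the study quantifies.
```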
Moderators
Brett Leonard

Director of Music Technology Programs, University of Indianapolis
Speakers

Thomas Mouterde

Field application research engineer, L-Acoustics
Thomas Mouterde is a field application research engineer at L-Acoustics, a French manufacturer of loudspeakers, amplifiers, and signal processing devices. He is a member of the “Education and Scientific Outreach” department that aims at developing the education program of the...
Authors

Thomas Mouterde

Field application research engineer, L-Acoustics
Thomas Mouterde is a field application research engineer at L-Acoustics, a French manufacturer of loudspeakers, amplifiers, and signal processing devices. He is a member of the “Education and Scientific Outreach” department that aims at developing the education program of the...

Nicolas Epain

Application Research Engineer, L-Acoustics
Tuesday October 8, 2024 10:30am - 11:00am EDT
1E04

10:45am EDT

Generative AI For Novel Audio Content Creation
Tuesday October 8, 2024 10:45am - 11:45am EDT
The presence and hype associated with generative AI across most forms of recorded media have become undeniable realities. Generative AI tools are becoming increasingly more prevalent, with applications ranging from conversational chatbots to text-to-image generation. More recently, we have witnessed an influx of generative audio models which have the potential of disrupting how music may be created in the very near future. In this talk, we will highlight some of the core technologies that enable novel audio content creation for music production, reviewing some seminal text-to-music works from the past year. We will then delve deeper into common research themes and subsequent works which intend to map these technologies closer to musicians’ needs.


We will begin the talk by outlining a common framework underlying the generative audio models that we will touch on, consisting of an audio synthesizer “back-end” paired with a latent representation modeling “front-end.” Accordingly, we will overview two primary forms of back-ends in the forms of neural audio codecs and variational auto-encoders (with examples), and illustrate how they pair naturally with transformer language model (LM) and latent diffusion model (LDM) front-ends, respectively. Furthermore, we will briefly touch on CLAP and T5 embeddings as conditioning signals that enable text as an input interface, and explain the means by which they are integrated into modern text-to-audio systems.
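As a schematic of that back-end/front-end split (not any specific system's code), the following PyTorch-style skeleton pairs a hypothetical codec token stream with a small transformer LM conditioned on a text embedding:

```python
# Schematic of the codec back-end / language-model front-end pattern.
# `codec` and the text embedding are hypothetical stand-ins, not a real API;
# causal masking and many details are omitted for brevity.
import torch
import torch.nn as nn

class TokenLM(nn.Module):
    def __init__(self, vocab_size, dim=512, text_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.text_proj = nn.Linear(text_dim, dim)  # e.g. a CLAP/T5 embedding
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=6)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, tokens, text_embedding):
        h = self.embed(tokens) + self.text_proj(text_embedding).unsqueeze(1)
        return self.head(self.backbone(h))  # next-token logits

# Training sketch: the codec (back-end) encodes audio to discrete tokens,
# the LM (front-end) learns to predict them from text; generation reverses it.
# tokens = codec.encode(waveform)                      # hypothetical codec
# logits = lm(tokens[:, :-1], text_emb)
# loss = nn.functional.cross_entropy(
#     logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1))
```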

Next, we will review some seminal works that have been released within the past year(s) (primarily in the field of text-to-music generation), and roughly categorize them according to the common framework that we have built up thus far. At the time of writing this proposal, we would naturally consider MusicLM/FX (LM), MusicGen (LM), Stable Audio (LDM), etc. as exemplary candidates for review. We will contextualize these new capabilities in terms of what they can enable for music production and opportunities for future improvements. Accordingly, we will draw on some subsequent works that intend to meet musicians a bit closer to the creative process. At the time of writing this proposal, this may include but is not limited to ControlNet (LDM), SingSong (LM), StemGen (LM), VampNet (LM), as well as our own previous work, as time permits. We will cap off our talk by providing some perspectives on what AI researchers could stand to understand about music creators, and what musicians could stand to understand about scientific research. Time permitting, we may allow ourselves to conduct a live coding demonstration whereby we exemplify constructing, training, and inferring audio examples from a generative audio model on a toy data example leveraging several prevalent open source libraries.


We hope that such a talk would be both accessible and fruitful for technologists and musicians alike. It would assume no background knowledge in generative modeling, and may perhaps assume only the most notional conception as to how machine learning works. The goal of this talk would be for the audience at large to walk out with a rough understanding of the underlying technologies and challenges associated with novel audio content creation using generative AI.
Tuesday October 8, 2024 10:45am - 11:45am EDT
1E08

10:50am EDT

Influence of Dolby Atmos versus Stereo Formats on Narrative Engagement: A Comparative Study Using Physiological and Self-Report Measures
Tuesday October 8, 2024 10:50am - 11:10am EDT
As spatial audio technology rapidly evolves, the conversation around immersion becomes ever more relevant, particularly in how these advancements enhance the creation of compelling sonic experiences. However, immersion is a complex, multidimensional construct, making it challenging to study in its entirety. This paper narrows the focus to one particular dimension, narrative engagement, to explore how it shapes the immersive experience. Specifically, we investigate whether the multichannel audio format, here 7.1.4, enhances narrative engagement compared to traditional stereo storytelling. Participants were exposed to two storytelling examples: one in an immersive format and another in a stereo fold-down. Physiological responses were recorded during listening sessions, followed by a self-report survey adapted from the Narrative Engagement Scale. The lack of significant differences between the two formats in both subjective and objective measures is discussed in the context of existing studies.
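For illustration, a stereo fold-down of a 7.1.4 bed might look like the sketch below; the channel order and -3 dB gains are common conventions assumed here, not necessarily the study's exact downmix:

```python
# Fold a 7.1.4 bed to stereo with -3 dB on center, surrounds, and heights.
import numpy as np

DB_M3 = 10 ** (-3 / 20)
# Assumed order: L R C LFE Lss Rss Lrs Rrs Ltf Rtf Ltr Rtr (LFE omitted here)
LEFT  = np.array([1, 0, DB_M3, 0, DB_M3, 0, DB_M3, 0, DB_M3, 0, DB_M3, 0])
RIGHT = np.array([0, 1, DB_M3, 0, 0, DB_M3, 0, DB_M3, 0, DB_M3, 0, DB_M3])

def fold_down(bed_714):
    """bed_714: float array of shape (num_samples, 12); returns (num_samples, 2)."""
    return np.stack([bed_714 @ LEFT, bed_714 @ RIGHT], axis=1)
```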
Moderators
Agnieszka Roginska

Professor, New York University
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented...
Speakers
Hyunkook Lee

Professor, Applied Psychoacoustics Lab, University of Huddersfield
Authors

Hyunkook Lee

Professor, Applied Psychoacoustics Lab, University of Huddersfield
Tuesday October 8, 2024 10:50am - 11:10am EDT
1E03

11:00am EDT

Part of the Band: Virtual Acoustic Space as a Participant in Musical Performance
Tuesday October 8, 2024 11:00am - 11:30am EDT
We detail a real-time application of active acoustics used to create a shared virtual environment over a closed audio network as a research-creation project exploring the concept of room participation in musical performance. As part of a concert given in the Immersive Media Lab at McGill University, musicians and audience members were located in a virtual acoustic environment while a second audience was located in an adjacent but acoustically isolated space on the same audio network. Overall, the blending of computer-generated and acoustic sources created a specific use case for virtual acoustics while the immersive capture and distribution method examined an avenue for producing a real-time shared experience. Future work in this area includes audio networks with multiple virtual acoustic environments.
Moderators
Brett Leonard

Director of Music Technology Programs, University of Indianapolis
Speakers
Kathleen Ying-Ying Zhang

PhD Candidate, McGill University
Ying-Ying Zhang is a music technology researcher and sound engineer. She is currently a PhD candidate at McGill University in the Sound Recording program where her research focuses on musician-centered virtual acoustic applications in recording environments. She received her Masters...
Authors

Kathleen Ying-Ying Zhang

PhD Candidate, McGill University
Ying-Ying Zhang is a music technology researcher and sound engineer. She is currently a PhD candidate at McGill University in the Sound Recording program where her research focuses on musician-centered virtual acoustic applications in recording environments. She received her Masters...

Mihai-Vlad Baran

McGill University

Richard King

Professor, McGill University
Richard King is an Educator, Researcher, and a Grammy Award winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School...
Tuesday October 8, 2024 11:00am - 11:30am EDT
1E04

11:10am EDT

Creation of representative head-related impulse responses for binaural rendering of moving audio objects
Tuesday October 8, 2024 11:10am - 11:30am EDT
To achieve highly realistic 3D audio reproduction in virtual reality (VR) or augmented reality (AR) through binaural rendering, we must address the considerable computational complexity involved in convolving head-related impulse responses (HRIRs). To reduce this complexity, an algorithm is proposed where audio signals are distributed to pre-defined representative directions through panning. Only the distributed signals are then convolved with the corresponding HRIRs. In this study, we explored a method for generating representative HRIRs through learning, utilizing a full-sphere HRIR set. This approach takes into account smooth transitions and minimal degradation introduced during rendering, for both moving and static audio objects. Compared with conventional panning, the proposed method reduces average distortion by approximately 47% while maintaining the runtime complexity of the rendering.
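A minimal sketch of the conventional-panning baseline (not the paper's learned method) might look like this; the representative azimuth set and HRIR dictionary are hypothetical:

```python
# Pan each object between its two nearest representative azimuths, then
# convolve only the bus signals with their HRIRs.
import numpy as np
from scipy.signal import fftconvolve

REP_AZ = np.array([-90.0, -30.0, 30.0, 90.0])  # representative directions (deg)

def render(objects, hrirs):
    """objects: list of (signal, azimuth_deg); hrirs: {az: (left_ir, right_ir)}."""
    buses = {az: 0.0 for az in REP_AZ}
    for sig, az in objects:
        idx = np.argsort(np.abs(REP_AZ - az))[:2]        # two nearest directions
        lo, hi = sorted(REP_AZ[idx])
        frac = 0.0 if hi == lo else (az - lo) / (hi - lo)
        g_lo, g_hi = np.cos(frac * np.pi / 2), np.sin(frac * np.pi / 2)
        buses[lo] = buses[lo] + g_lo * sig               # power-preserving pair
        buses[hi] = buses[hi] + g_hi * sig
    # Convolve only non-empty buses; HRIR count stays fixed per frame.
    out_l = sum(fftconvolve(b, hrirs[az][0]) for az, b in buses.items() if np.ndim(b))
    out_r = sum(fftconvolve(b, hrirs[az][1]) for az, b in buses.items() if np.ndim(b))
    return np.stack([out_l, out_r], axis=0)
```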
Moderators
Agnieszka Roginska

Professor, New York University
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented...
Speakers Authors
Masayuki Nishiguchi

Professor, Akita Prefectural University
Masayuki Nishiguchi received his B.E., M.S., and Ph.D. degrees from Tokyo Institute of Technology, University of California Santa Barbara, and Tokyo Institute of Technology, in 1981, 1989, and 2006 respectively. He was with Sony Corporation from 1981 to 2015, where he was involved...
Tuesday October 8, 2024 11:10am - 11:30am EDT
1E03

11:30am EDT

Quantifying the Impact of Head-Tracked Spatial Audio on Common User Auditory Experiences using Facial Microexpressions
Tuesday October 8, 2024 11:30am - 11:50am EDT
The study aims to enhance the understanding of how Head Tracked Spatial Audio technology influences both emotional responses and immersion levels among listeners. By employing micro facial gesture recognition technology, it quantifies the depth of immersion and the intensity of emotional responses elicited by various types of binaural content, measuring categories such as Neutral, Happy, Sad, Angry, Surprised, Scared, Disgusted, Contempt, Valence, and Arousal. Subjects were presented with a randomized set of audio stimuli consisting of stereo music, stereo speech, and 5.1 movie content. Each audio piece lasted 15 seconds, and the Spatial Audio processing was randomly switched on or off throughout the experiment. FaceReader software continuously detected the subjects' facial microexpressions. Statistical analysis was conducted using R software, applying Granger causality tests in time series, t-tests, and the p-value criterion for hypothesis validation. After consolidating the records of 78 participants, the final database consisted of 212,862 unique data points. With 95% confidence, it was determined that the average level of "Arousal" is significantly higher when Head Tracked Spatial Audio is activated compared to when it is deactivated, suggesting that HT technology increases the emotional arousal of audio listeners. Regarding the happiness reaction, the highest levels were recorded in mode 5 (HT on and Voice), with an average of 0.038, while the lowest levels were detected in mode 6 (HT off and Voice). Preliminary conclusions indicate that surprise effectively causes a decrease in neutrality, supporting the dynamic interaction between these emotional variables.
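The headline comparison reduces to a two-sample t-test; the sketch below shows it in Python with synthetic placeholder values standing in for the FaceReader "Arousal" series:

```python
# Compare mean Arousal with head tracking on vs. off (synthetic placeholders).
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)
arousal_ht_on = rng.normal(0.42, 0.1, 5000)   # placeholder samples, HT on
arousal_ht_off = rng.normal(0.40, 0.1, 5000)  # placeholder samples, HT off

t, p = ttest_ind(arousal_ht_on, arousal_ht_off, equal_var=False)
print(f"t = {t:.2f}, p = {p:.3g}")  # p < 0.05 would support the reported effect
```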
Moderators
Agnieszka Roginska

Professor, New York University
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented...
Speakers Authors
Tuesday October 8, 2024 11:30am - 11:50am EDT
1E03

11:30am EDT

Influence of Recording Technique and Ensemble Size on Apparent Source Width
Tuesday October 8, 2024 11:30am - 12:00pm EDT
Listeners' impression of aurally “seeing” the size of a performing entity is crucial to the success of both a concert hall and a reproduced sound field. Previous studies have looked at how different concert halls with different lateral reflections affect apparent source width. Yet the perceptual effects of different source distributions, captured with different recording techniques, on apparent source width are not well understood. This study explores how listeners perceive the width of an orchestra using four stereo recording techniques, one binaural recording technique, and three wave field synthesis ensemble settings. Subjective experiments were conducted using stereo loudspeakers and headphones to play back the recorded clips, asking listeners to rate the perceived width of the sound source. Results show that recording techniques greatly influence how wide an orchestra is perceived to be. The primary mechanism used to judge auditory spatial impression differs between stereo loudspeaker and headphone listening. When a western classical symphony is recorded and reproduced by two-channel stereophony, changes in instrument positions that increase or reduce the physical source width do not lead to an obvious increase or reduction in the spatial impression of the performing entity.
Moderators
Brett Leonard

Director of Music Technology Programs, University of Indianapolis
Speakers Authors
Tuesday October 8, 2024 11:30am - 12:00pm EDT
1E04

11:50am EDT

Investigating the Role of Customized Interaural Time Differences on First-Person Shooter Gaming Performance
Tuesday October 8, 2024 11:50am - 12:10pm EDT
Binaural listening with personalized Head-Related Transfer Functions (HRTFs) is known to enhance a listener's auditory localization in virtual environments, including gaming. However, the methods for achieving personalized HRTFs are often inaccessible for average game players due to measurement complexity and cost. This study explores a simplified approach to improving game performance, particularly in First-Person Shooter (FPS) games, by optimizing Interaural Time Difference (ITD). Recognizing that horizontal localization is particularly important for identifying opponent positions in FPS games, this study hypothesizes that optimizing ITD alone may be sufficient for better game performance, potentially alleviating the need for full HRTF personalization. To test this hypothesis, a simplified FPS game environment was developed in Unity. Participants performed tasks to detect sound positions under three HRTF conditions: MIT-KEMAR, Steam Audio's default HRTF, and the proposed ITD optimization method. The results indicated that our proposed method significantly reduced players' response times compared to the other HRTF conditions. These findings allow players to improve their gaming performance within FPS games through simplified HRTF optimization, broadening accessibility to optimized HRTFs for a wider range of game users.
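The kind of ITD model being optimized can be sketched with the Woodworth spherical-head approximation, where head radius is the single personalized parameter; the values and application below are illustrative assumptions, not the study's implementation:

```python
# Woodworth spherical-head ITD, applied as a simple channel delay.
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def woodworth_itd(azimuth_rad, head_radius_m=0.0875):
    """ITD in seconds for a horizontal source (valid for |azimuth| <= pi/2)."""
    return (head_radius_m / SPEED_OF_SOUND) * (azimuth_rad + np.sin(azimuth_rad))

def apply_itd(mono, rate, azimuth_rad, head_radius_m=0.0875):
    delay = int(round(abs(woodworth_itd(azimuth_rad, head_radius_m)) * rate))
    delayed = np.concatenate([np.zeros(delay), mono])[: len(mono)]
    # Positive azimuth = source on the right, so the left ear gets the delay.
    return np.stack([delayed, mono] if azimuth_rad > 0 else [mono, delayed])
```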
Moderators
Agnieszka Roginska

Professor, New York University
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented...
Speakers
Sungjoon Kim

Research Intern, Korea Advanced Institute of Science and Technology
Authors

Sungjoon Kim

Research Intern, Korea Advanced Institute of Science and Technology

Rai Sato

Ph.D. Student, Korea Advanced Institute of Science and Technology
Rai Sato (佐藤 来) is currently pursuing a PhD at the Graduate School of Culture Technology at the Korea Advanced Institute of Science and Technology. He holds a Bachelor of Music from Tokyo University of the Arts, where he specialized in immersive audio recording and psychoacoustics...
Tuesday October 8, 2024 11:50am - 12:10pm EDT
1E03

12:00pm EDT

Opening & Awards Ceremony
Tuesday October 8, 2024 12:00pm - 12:45pm EDT
Join the AES Committee and Chairs as we celebrate award recipients from the past year. The Opening and Awards will be followed by the keynote speech by Ebonie Smith.
Speakers
Leslie Gaston-Bird

Owner, Mix Messiah Productions
Leslie Gaston-Bird (AMPS, MPSE) is author of the book "Women in Audio", part of the AES Presents series and published by Focal Press (Routledge). She is a voting member of the Recording Academy (The Grammys®) and its P&E (Producers and Engineers) Wing. Currently, she is a freelance...

Gary Gottlieb

AES President-Elect, Mendocino College
President-Elect, Co-Chair of the Events Coordination Committee, Chair of the Conference Policy Committee, and former Vice President of the Eastern Region, US and Canada; AES Fellow, Engineer, Author, Educator and Guest Speaker Gary Gottlieb refers to himself as a music generalist...

Michael Hagen

System Administrator and Studio Manager, New York University - Clive Davis Institute
Michael Hagen is the driving force behind the technical operations at the Clive Davis Institute at New York University, where he oversees the maintenance and continual improvement of the institute’s advanced production facilities. With extensive hands-on experience in recording...

Jeanne Montalvo

Engineer/Producer, Self
Jeanne Montalvo is a Grammy-nominated audio engineer and award-winning radio producer. She was selected amongst thousands of applicants as the 2018 EQL resident at Spotify Studios and Electric Lady Studios in New York City, assisting in the recording process for artists like John...
Tuesday October 8, 2024 12:00pm - 12:45pm EDT
Stage

12:45pm EDT

Keynote: Ebonie Smith
Tuesday October 8, 2024 12:45pm - 1:45pm EDT
Ebonie Smith is a celebrated music producer, audio engineer, and singer-songwriter, based in the vibrant hub of Los Angeles. As a prominent figure in the industry, she currently holds the esteemed roles of senior audio engineer and producer at Atlantic Records. Ebonie's remarkable portfolio features notable credits, including the Broadway cast album of Hamilton, Janelle Monae's groundbreaking Dirty Computer, and Cardi B's chart-topping Invasion Of Privacy.

Notably, Ebonie serves as the Co-Chair of the Producers & Engineers Wing of The Recording Academy, underscoring her dedication to advancing excellence in music production. Beyond her professional achievements, she is the visionary founder and president of Gender Amplified, Inc., a nonprofit organization committed to celebrating and empowering women music producers.

Ebonie's educational foundation includes a master's degree in music technology from New York University and an undergraduate degree from Barnard College, Columbia University, solidifying her position as a distinguished leader in the music industry.
Speakers
avatar for Ebonie Smith

Ebonie Smith

Ebonie Smith is a celebrated music producer, audio engineer, and singer-songwriter, based in the vibrant hub of Los Angeles. As a prominent figure in the industry, she currently holds the esteemed roles of senior audio engineer and producer at Atlantic Records. Ebonie's remarkable... Read More →
Tuesday October 8, 2024 12:45pm - 1:45pm EDT
Stage

1:00pm EDT

Mixing Monitors from FOH
Tuesday October 8, 2024 1:00pm - 2:00pm EDT
In addition to long-form dLive Certification classes, training sessions will cover a variety of topics relevant to live sound engineers, including: Mixing Monitors from FOH; Vocal Processing; Groups, Matrices and DCAs; Active Dynamics; and Gain Staging.
 
The training will be led by industry veterans Michael Bangs and Jake Hartsfield. Bangs, whose career includes experience as a monitor engineer and production manager, has worked with A-list artists, including Aerosmith, Katy Perry, Tom Petty, Lynyrd Skynyrd and Kid Rock. Hartsfield is a seasoned live sound engineer, having mixed for artists like Vulfpeck, Ben Rector, Fearless Flyers, and more.

Sign up link: https://zfrmz.com/DmSlX5gyZCfjrJUHa6bV
Tuesday October 8, 2024 1:00pm - 2:00pm EDT
1E05

1:00pm EDT

Exhibit Hall
Tuesday October 8, 2024 1:00pm - 6:00pm EDT
Step into the heart of innovation at the Audio Engineering Society’s Annual Conference Exhibit Hall. This dynamic space brings together leading companies and cutting-edge technologies from across the audio engineering industry. Attendees will have the opportunity to explore the latest advancements in audio equipment, software, and services, engage with industry experts, and discover new solutions to enhance their projects. Whether you’re looking to network with professionals, gain insights from live demonstrations, or simply stay ahead of the curve, the Exhibit Hall is the place to be. Don’t miss this chance to immerse yourself in the future of audio engineering!
Tuesday October 8, 2024 1:00pm - 6:00pm EDT
Exhibit Hall

1:45pm EDT

AES Ice Cream Social
Tuesday October 8, 2024 1:45pm - 2:45pm EDT
Tuesday October 8, 2024 1:45pm - 2:45pm EDT
AES Membership Booth

2:00pm EDT

Towards prediction of high-fidelity earplug subjective ratings using acoustic metrics
Tuesday October 8, 2024 2:00pm - 2:30pm EDT
High-fidelity earplugs are used by musicians and live sound engineers to prevent hearing damage while allowing musical sounds to reach the eardrum without distortion. To determine objective methods for judging earplug fidelity in a manner similar to headphones or loudspeakers, a small sample of trained listeners was asked to judge the attenuation level and clarity of music through seven commercially available passive earplugs. These scores were then compared to acoustic/musical metrics measured in a laboratory. It was found that the Noise Reduction Rating (NRR) is strongly predictive of both attenuation and clarity scores, and that insertion loss flatness provides no advantage over NRR. A different metric measuring spectral flatness distortion appears to predict clarity independently of attenuation and will be the subject of further study.
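The paper's exact spectral-flatness-distortion metric is not spelled out in this abstract; the following sketch shows one plausible formulation, comparing the spectral flatness of music with and without the earplug's insertion loss applied (all names are illustrative assumptions).

import numpy as np

def spectral_flatness(power_spectrum, eps=1e-12):
    # Geometric mean over arithmetic mean of the power spectrum (0..1).
    ps = np.maximum(power_spectrum, eps)
    return np.exp(np.mean(np.log(ps))) / np.mean(ps)

def flatness_distortion(open_ear, through_plug, nfft=8192):
    # Larger difference = more spectral coloration added by the earplug.
    f_open = spectral_flatness(np.abs(np.fft.rfft(open_ear, nfft)) ** 2)
    f_plug = spectral_flatness(np.abs(np.fft.rfft(through_plug, nfft)) ** 2)
    return abs(f_open - f_plug)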
Moderators Speakers
avatar for David Anderson

David Anderson

Assistant Professor, University of Minnesota Duluth
Authors
avatar for David Anderson

David Anderson

Assistant Professor, University of Minnesota Duluth
Tuesday October 8, 2024 2:00pm - 2:30pm EDT
1E03

2:00pm EDT

Fourier Paradoxes
Tuesday October 8, 2024 2:00pm - 2:30pm EDT
Fourier theory is ubiquitous in modern audio signal processing. However, this framework is often at odds with our intuitions about audio signals. Strictly speaking, Fourier theory is ideal for analyzing periodic behaviors, but when periodicities change across time it is easy to misinterpret its results. Of course, we have developed strategies around this, like the Short Time Fourier Transform, yet our interpretations often fall beyond what the theory really says. This paper pushes the exact theoretical description, showing examples where our interpretation of the data is incorrect. Furthermore, it shows specific instances where we incorrectly make decisions based on such a paradoxical framework.
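A worked example of the kind of misreading the paper targets (a sketch, not taken from the paper): a linear chirp has exactly one instantaneous frequency at any moment, yet a single DFT of the whole signal reports energy across the entire swept band.

import numpy as np

fs = 48000
t = np.arange(fs) / fs
# Linear chirp: instantaneous frequency sweeps 200 Hz -> 2000 Hz in 1 s.
x = np.sin(2 * np.pi * (200 * t + 0.5 * 1800 * t ** 2))

spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(len(x), 1 / fs)
band = (freqs >= 200) & (freqs <= 2000)
print("energy fraction in 200-2000 Hz:",
      (spectrum[band] ** 2).sum() / (spectrum ** 2).sum())
# The DFT reports energy "at" every frequency in the band, although at
# no single instant does the signal contain more than one frequency.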
Moderators
avatar for Rob Maher

Rob Maher

Professor, Montana State University
Audio digital signal processing, audio forensics, music analysis and synthesis.
Speakers
avatar for Juan Sierra

Juan Sierra

NYU
Currently, I am a PhD Candidate in Music Technology at NYU, based in NYUAD as part of the Global Fellowship program. As a professional musician, my expertise lies in Audio Engineering, and I hold a master's degree in Music, Science, and Technology from the prestigious... Read More →
Authors
avatar for Juan Sierra

Juan Sierra

NYU
Currently, I am a PhD Candidate in Music Technology at NYU, based in NYUAD as part of the Global Fellowship program. As a professional musician, my expertise lies in Audio Engineering, and I hold a master's degree in Music, Science, and Technology from the prestigious... Read More →
Tuesday October 8, 2024 2:00pm - 2:30pm EDT
1E04

2:00pm EDT

Archiving multi-track and multi-channel: challenges and recommendations
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
Where are the files of my immersive production and why don't they open correctly on another DAW?
This workshop will outline some of the major challenges when working with archival materials and it will discuss ongoing activities in AES Standards Group SC-03-06 relating to multi-track and multi-channel audio. The workshop will give an overview of the status of the document, "Implementation of a Recommended Workflow for the Creation and Archiving of Digital Archival Materials from Professional Audio Production Formats," as well as why the interaction with related partner institutions (NARAS P&E Wing, SMPTE) and internal working groups is so important. The topic is relevant not only for preservation in an archival setting, but far beyond, for keeping audio productions safely stored and accessible in general.
Speakers
avatar for Nadja Wallaszkovits

Nadja Wallaszkovits

Stuttgart State Academy of fine Arts
avatar for Brad McCoy

Brad McCoy

Audio Engineer, Retired, Library of Congress
Audio Engineer (Archiving/Preservation)
avatar for Ulrike Schwarz

Ulrike Schwarz

Engineer/Producer, Co-Founder, Anderson Audio New York
Engineer/Producer, Co-Founder
avatar for Jim Anderson

Jim Anderson

Producer/Engineer, Anderson Audio New York
Producer/Engineer
avatar for Jeff Willens

Jeff Willens

Media Preservation Engineer, New York Public Library
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
1E09

2:00pm EDT

Designing With Constraints in an Era of Abundance
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
Every year, the availability and capabilities of processors and sensors expand greatly while their cost decreases correspondingly. As an instrument designer, the temptation to include every possible advance is an obvious one, but one that comes at the cost of additional complexity and, perhaps more concerningly, reduced *character*.

Looking back on electronic instruments from the past, what we love about the classics are the quirks and idiosyncrasies that come from the technical limitations of the time, and how the designers ended up leveraging those limitations creatively. This panel will bring together designers to discuss how we are self-imposing constraints on our designs in the face of that temptation to include everything that's possible.

As Brian Eno said: "Whatever you now find weird, ugly, uncomfortable and nasty about a new medium will surely become its signature. CD distortion, the jitteriness of digital video, the crap sound of 8-bit - all of these will be cherished and emulated as soon as they can be avoided."
Speakers
avatar for Brett Porter

Brett Porter

Lead Software Engineer, Artiphon
Brett g Porter is a software developer and engineering manager with 3 decades of experience in the pro audio/music instrument industry; currently Lead Software Engineer at Artiphon, he leads the team that develops companion applications for the company's family of instruments. Previously... Read More →
AF

Alexandra Fierra

Eternal Research
AM

Adam McHeffey

CMO, Artiphon
avatar for Ben Neill

Ben Neill

Former Professor, Ramapo College
Composer/performer Ben Neill is the inventor of the mutantrumpet, a hybrid electro-acoustic instrument, and is widely recognized as a musical innovator through his recordings, performances and installations. Neill has recorded ten CDs of his music on the Universal/Verve, Thirsty Ear... Read More →
avatar for Nick Yulman

Nick Yulman

Kickstarter
Nick Yulman has worked with Kickstarter’s community of creators for the last ten years and currently leads the company’s Outreach team, helping designers, technologists, and artists of all kinds bring their ideas to life through crowdfunding. He was previously Kickstarter’s... Read More →
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
1E08

2:00pm EDT

Recent Advances in Volumetric Recording Techniques for Music Production
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
Volumetric sound recording techniques, where multiple 3D sound signals are captured throughout the recording area and mapped later for playback, have shown great promise in the world of positional-tracking immersive experiences. But is volumetric sound capture an effective and plausible technique for music production? What advantages does volumetric capture have over more traditional recording techniques and single-position Ambisonic recordings, and how are the signals combined for a stationary immersive listening experience? These questions will be addressed by sharing immersive recordings and documentation, and by discussing the outcomes of recent and ongoing work conducted at New York University and McGill University.
Speakers
PG

Paul Geluso

Director of the Music Technology Program, New York University
avatar for Ying-Ying Zhang

Ying-Ying Zhang

PhD Candidate, McGill University
Ying-Ying Zhang is a music technology researcher and sound engineer. She is currently a PhD candidate at McGill University in the Sound Recording program where her research focuses on musician-centered virtual acoustic applications in recording environments. She received her Masters... Read More →
avatar for Parichat Songmuang

Parichat Songmuang

Studio Manager/PhD Student, New York University
Parichat Songmuang graduated from New York University with her Master of Music degree in Music Technology and an Advanced Certificate in Tonmeister Studies. As an undergraduate, she studied for her Bachelor of Science in Electronic Media and Film with a concentration... Read More →
avatar for Michael Ikonomidis

Michael Ikonomidis

Doctoral student, McGill University
Michael Ikonomidis (Michail Oikonomidis) is an accomplished audio engineer and PhD student in Sound Recording at McGill University, specializing in immersive audio, high-channel-count orchestral recordings and scoring sessions. With a diverse background in music production, live sound... Read More →
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
1E06

2:00pm EDT

Richard King: Enveloping Masterclass
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
Richard King plays high-resolution 7.1.4 music recordings of Yo-Yo Ma and Eric Clapton, and from the excellent Chevalier soundtrack album, also describing some of the techniques used.

This masterclass series, featuring remarkable recording artists, is a chance to hear 3D audio at its best, as we discuss factors of production, distribution and reproduction that make it worth the effort. Thomas exemplifies the important quale of auditory envelopment (AE), and we evaluate how well the AE latent in the content - from intimate to grand - comes across in this particular listening room.

Seats are limited to keep playback variation at bay, and the session is concluded with Q&A. If you attend multiple masterclasses, consider choosing different seats each time.
Speakers
avatar for Thomas Lund

Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund has authored papers on human perception, spatialisation, loudness, sound exposure and true-peak level. He is a researcher at Genelec, and convenor of a working group on hearing health under the European Commission. Out of a medical background, Thomas previously served in... Read More →
avatar for Richard King

Richard King

Professor, McGill University
Richard King is an Educator, Researcher, and a Grammy Award winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School... Read More →
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
3D06

2:00pm EDT

Bridging the Gap: Lessons for Live Media Networking from IT
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
The rapid evolution of live media networking has brought it closer to converged networking, where robust and efficient communication is paramount. While protocols such as MILAN/AVB, Dante and AES67 are staples, significant opportunities exist to enhance live media networking by adopting architectural blueprints, tools, and widely used protocols from the Information Technology (IT) sector. This workshop explores the specific requirements of live media networking, identifies potential learnings from IT workflows, and examines how other industries, particularly the broadcast and video markets, have successfully integrated IT principles, in order to propose technical recommendations.
Live media networking, encompassing audio, video, and control signals, demands high precision, low latency, and synchronization. Unlike traditional IT networks, which prioritize data integrity and security, live media networks must ensure seamless real-time transmission without compromising quality. The workshop will delve into these specificities, highlighting the challenges unique to live media, how they differ from typical IT networking scenarios, and the use of Time Sensitive Networking (TSN).
A significant challenge in this transition is the learning curve faced by sound technicians. Traditionally focused on audio-specific knowledge, these professionals now need to acquire IT networking skills to manage complex media networks effectively. This gap in expertise necessitates a new role emerging in the industry: the "Live Media Network Manager," a specialist who bridges the knowledge gap between traditional sound engineering and advanced IT networking.
A key focus area will be examining IT architectural blueprints and their applicability to live media networking. IT networks often leverage scalable, redundant, and resilient architectures to ensure uninterrupted service delivery. By adopting similar principles, live media networks can achieve greater reliability and scalability. The workshop will discuss how concepts such as network segmentation, redundancy, and failover mechanisms from IT can be tailored to meet the stringent requirements of live media.
Additionally, we will explore the tools and protocols widely used in IT that can benefit live media networking. Network monitoring and management tools, such as SNMP and Syslog, offer comprehensive insights into network performance and can aid in proactive maintenance and troubleshooting. Furthermore, protocols like QoS can be adapted to prioritize media traffic, ensuring that critical audio and video streams are delivered with minimal delay and jitter.
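As a concrete example of borrowing an IT mechanism of the kind mentioned above, a sender can mark media packets for priority treatment with DSCP, which QoS-enabled switches act on; a minimal sketch, with placeholder address and payload:

import socket

DSCP_EF = 46  # "Expedited Forwarding": the low-latency class media typically uses

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# The IP TOS byte carries the DSCP value in its upper six bits.
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, DSCP_EF << 2)
sock.sendto(b"...media payload...", ("203.0.113.10", 5004))  # placeholders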
The workshop will also draw parallels from the broadcast and video markets, which have already embraced IT-based solutions to enhance their networking capabilities. These industries have developed technical recommendations and standards, such as SMPTE ST 2110 for professional media over managed IP networks, which can serve as valuable references for the live media domain. By examining these examples, participants will gain a broader perspective on how cross-industry learnings can drive innovation in live media networking.
This workshop will provide a comprehensive overview of the specific needs of live media networking and present actionable insights from IT workflows and other industries. Participants will leave with a deeper understanding of how to leverage IT principles to enhance the efficiency, reliability, and scalability of live media networks, paving the way for a more integrated and future-proof approach.
Speakers
avatar for Nicolas Sturmel

Nicolas Sturmel

Directout GmbH
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
1E16

2:00pm EDT

HELA Certification: Elevating standards in live event sound management
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
The Healthy Ears, Limited Annoyance (HELA) certification scheme, which originated within the AES Technical Committee on Acoustics and Sound Reinforcement, launched in summer 2024. Tailored for event organizers, sound engineers, venue managers, musicians, and all other key live event industry stakeholders, HELA offers a comprehensive framework for delivering live music experiences that protect audience hearing and minimize neighborhood disturbances. This session will delve into the balance between sound quality, hearing health and community harmony. Attendees will gain practical insights into HELA's guidance on sound level management and effective communication strategies, fostering a community dedicated to sustainable live event production. Join us to discover how HELA Certification can set a new industry standard, creating memorable yet safe and respectful experiences for everyone involved.
Speakers
avatar for Adam Hill

Adam Hill

Associate Professor of Electroacoustics, University of Derby
Adam Hill is an Associate Professor of Electroacoustics at the University of Derby where he leads the Electro-Acoustics Research Lab (EARLab) and runs the MSc Audio Engineering program. He received a Ph.D. from the University of Essex, an M.Sc. in Acoustics and Music Technology from... Read More →
avatar for Jon Burton

Jon Burton

Senior Lecturer, Derby University
A live sound engineer with over 40 years of concert touring experience. Jon has toured internationally with artists such as Bryan Ferry, Stereophonics, Biffy Clyro and The Prodigy. Jon is also a partner in a five-studio recording complex in Sheffield, UK. Involved in education for... Read More →
avatar for Laura Sinnott

Laura Sinnott

Owner, Sound Culture
A longtime audio engineer for film, Laura expanded her career into hearing health as an audiologist. She ran the hearing clinic at Sensaphonics, a Chicago-based institution that has served musicians for over 30 years. Now based in Central New York, she sees patients in her Utica, NY... Read More →
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
1E07

2:00pm EDT

WMAS – The way forward for multichannel wireless audio
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
Join the industry experts on this exciting panel to learn more about Wireless Multichannel Audio Systems, the wide scope of approaches that WMAS facilitates, and how the various technologies can help you with multi-channel wireless set ups in challenging RF environments.
Speakers
avatar for Joe Ciaudelli

Joe Ciaudelli

Sennheiser
Joe Ciaudelli was hired by Sennheiser in 1987 upon graduating from Columbia University with an electrical engineering degree.  He provided frequency coordination for large multi-channel wireless microphone systems used by Broadway productions, major theme parks, and broadcast networks... Read More →
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
1E15

2:00pm EDT

Jack Antonoff & Laura Sisk, Up Close and Personal
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
GRAMMY® Award-winning producer Jack Antonoff and GRAMMY® Award-winning recording and mix engineer Laura Sisk will be the focal point of a special event session titled “Up Close and Personal.” Jack Antonoff is an eleven-time GRAMMY® Award-winning producer, artist, songwriter, and musician, as well as the creative force behind Bleachers. In February 2024, Antonoff won Producer of the Year at the GRAMMY Awards® for an incredible third consecutive year, becoming only the second producer in history to win three years running. He will be joined by five-time GRAMMY® Award-winning recording and mix engineer and long-time collaborator Laura Sisk in a session revealing their studio work with such leading artists as Taylor Swift, Lana Del Rey, Sabrina Carpenter, Kendrick Lamar, Nick Cave, St. Vincent, Diana Ross and more. The event will be moderated by producer/engineer/musician Glenn Lorbecki.
Speakers
avatar for Jack Antonoff

Jack Antonoff

Credited by the BBC for having “redefined pop music,” the globally celebrated, eleven-time Grammy Award-winning singer, songwriter, musician, and producer, Jack Antonoff has collaborated with the likes of Taylor Swift, Kendrick Lamar, Lana Del Rey, The 1975, Diana Ross, Lorde... Read More →
avatar for Laura Sisk

Laura Sisk

Laura Sisk is a five-time GRAMMY® Award-winning recording and mix engineer, widely recognized for her work with producer Jack Antonoff on Taylor Swift, as well as working with renowned artists like Lana Del Rey, Jon Batiste, Florence + The Machine, Diana Ross, Lorde, St. Vincent... Read More →
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
Stage

2:00pm EDT

Bitrate adaptation in object-based audio coding in communication immersive voice and audio systems
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Object-based audio is one of the spatial audio representations providing an immersive audio experience. While it can be found in a wide variety of audio reproduction systems, its use in communication systems is very limited as it faces many constraints, such as system complexity, short-delay requirements, and the limited bitrate available for coding and transmission. This paper presents a new bitrate adaptation method for object-based audio coding systems that overcomes these constraints and enables their use in 5G voice and audio communication systems. The presented method distributes an available codec bit budget to encode waveforms of the individual audio objects based on a classification of the objects’ subjective importance in particular frames. The presented method has been used in the Immersive Voice and Audio Services (IVAS) codec, recently standardized by 3GPP, but it can be employed in other codecs as well. Test results show the performance advantage of the bitrate adaptation method over the conventional uniformly distributed bitrate method. The paper also presents IVAS selection test results for object-based audio with four audio objects, rendered to binaural headphone representation, in which the presented method plays a substantial role.
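The abstract's core idea, distributing a frame's bit budget by object importance, can be sketched as follows; the per-object floor and proportional rule are illustrative assumptions, not the IVAS algorithm:

def allocate_bits(importance, total_bits, floor_bits=2000):
    # Split a frame's bit budget across objects in proportion to their
    # importance scores, guaranteeing each object a minimum.
    n = len(importance)
    spare = total_bits - n * floor_bits
    assert spare >= 0, "budget too small for the per-object floor"
    total = sum(importance) or 1.0
    return [floor_bits + int(spare * w / total) for w in importance]

# Four objects, one currently dominant in the mix:
print(allocate_bits([0.6, 0.2, 0.1, 0.1], total_bits=24000))
# -> [11600, 5200, 3600, 3600]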
Speakers Authors
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Enhancing Realism for Digital Piano Players: A Perceptual Evaluation of Head-Tracked Binaural Audio
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
This paper outlines a process for achieving and perceptually evaluating a head-tracked binaural audio system designed to enhance realism for players of digital pianos. Using an Ambisonic microphone to sample an acoustic piano and leveraging off-the-shelf equipment, the system allows players wearing headphones to experience changes in the sound field in real time as they rotate their heads, with three degrees of freedom (3DoF). The evaluation criteria included spatial clarity, spectral clarity, envelopment, and preference. These criteria were assessed across three different listening systems: stereo speakers, stereo headphones, and head-tracked binaural audio. Results showed a strong preference for the head-tracked binaural audio system, with players noting significantly greater realism and immersion.
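For a system like the one described, the head-tracking step reduces, at first order, to a yaw counter-rotation of the B-format signals before binaural decoding; a minimal sketch, assuming the convention that X points front and Y left:

import numpy as np

def rotate_foa_yaw(w, x, y, z, head_yaw_rad):
    # Counter-rotate a first-order B-format frame against the listener's
    # head yaw so the virtual scene stays world-fixed while the head
    # turns; W and Z are yaw-invariant.
    c, s = np.cos(head_yaw_rad), np.sin(head_yaw_rad)
    return w, c * x + s * y, c * y - s * x, z

# Per audio block: read yaw from the tracker, rotate the B-format,
# then feed the result to any static binaural decoder.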
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Exploring Immersive Opera: Recording and Post-Production with Spatial Multi-Microphone System and Volumetric Microphone Array
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Traditional opera recording techniques using large microphone systems are typically less flexible towards experimental singer choreographies, which have the potential of being adapted to immersive and interactive representations such as Virtual Reality (VR) applications. The authors present an engineering report on implementing two microphone systems for recording an experimental opera production in a medium-sized theatre: a 7.0.4 hybrid array of Lindberg’s 2L and the Bowles spatial arrays and a volumetric array consisting of three higher-order Ambisonic microphones in Left/Center/Right (LCR) formation. Details of both microphone setups are first described, followed by post-production techniques for multichannel loudspeaker playback and 6 degrees-of-freedom (6DoF) binaural rendering for VR experiences. Finally, the authors conclude with observations from informal listening critique sessions and discuss the technical challenges and aesthetic choices involved during the recording and post-production stages in the hope of inspiring future projects on a larger scale.
Speakers
JM

Jiawen Mao

PhD student, McGill University
Authors
JM

Jiawen Mao

PhD student, McGill University
avatar for Michael Ikonomidis

Michael Ikonomidis

Doctoral student, McGill University
Michael Ikonomidis (Michail Oikonomidis) is an accomplished audio engineer and PhD student in Sound Recording at McGill University, specializing in immersive audio, high-channel-count orchestral recordings and scoring sessions. With a diverse background in music production, live sound... Read More →
avatar for Richard King

Richard King

Professor, McGill University
Richard King is an Educator, Researcher, and a Grammy Award winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School... Read More →
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Exploring the Directivity of the Lute, Lavta, and Oud Plucked String Instruments
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
This study investigates the spherical directivity and radiation patterns of the Lute, Lavta, and Oud, pear-shaped traditional plucked-string instruments from the Middle East, Turkey, Greece, and the surrounding areas, providing insights into the acoustic qualities of their propagated sound in a three-dimensional space. Data was recorded in an acoustically controlled environment with a 29-microphone array, using multiple instruments of each type, performed by several professional musicians. Directivity is investigated in terms of sound projection and radiation patterns. Instruments were categorized according to string material. The analysis revealed that all instruments, regardless of their variations in geometry and material, exhibit similar radiation patterns across all frequency bands, justifying their intuitive classification within the “Lute family”. Nevertheless, variations in sound projection across all directions are evident between instrument types, which can be attributed to differences in construction details and string material. The impact of the musician's body on directivity is also observed. Practical implications of this study include the development of guidelines for the proper recording of these instruments, as well as the simulation of their directivity properties for use in spatial auralizations and acoustic simulations with direct applications in extended reality environments and remote collaborative music performances.
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Generate acoustic responses of virtual microphone arrays from a single set of measured FOA responses. - Apply to multiple sound sources.
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
V2MA (VSVerb Virtual Microphone Array)
Demos and related docs are available at https://bit.ly/3BmDBbL .
Once we have measured a set of four impulse responses (IRs) with an A-format microphone in a hall, we can make a virtual recording using a virtual microphone array placed anywhere in the hall. The measurement does not require the A-format microphone and loudspeaker to be placed at specific positions in the hall. Typical positions, such as at an audience seat and on a stage, are recommended, but you can place them anywhere you like. We will generate any type of virtual microphone response in a target room from an easy one-time IR measurement.
-------------------------------
We propose a method, V2MA, that virtually generates acoustic responses of any type of microphone array from a single set of FOA responses measured in a target room. An A-format microphone is used for the measurement, but no Ambisonics operation is included in the processing. V2MA is a method based on geometrical acoustics. We calculate sound intensities in the x, y, and z directions from a measured FOA response, and the virtual sound sources of the room are then detected from them. Although it is desirable to place the A-format microphone close to the intended position of the virtual microphone array in the room, this is not a mandatory requirement. Since our method can generate SRIRs at arbitrary receiver positions in the room by updating the acoustic properties of the virtual sound sources detected at a certain position, the A-format microphone can be placed anywhere you like. On the other hand, a loudspeaker must be placed at the source position where a player is assumed to be. Since the positions of virtual sound sources change when a real sound source moves, we previously had to measure the responses for each assumed real source position. To remove this inconvenient restriction, we developed a technique for updating the positions of the virtual sound sources when a real sound source moves from its original position. Although the technique requires some approximations, we have ascertained that the generated SRIRs provide fine acoustic properties in both physical and auditory aspects.
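The source-detection step rests on the standard relation between B-format signals and the acoustic intensity vector; a minimal sketch of that step alone (not the authors' implementation), under the usual convention that W carries pressure and X/Y/Z carry velocity components up to a calibration gain:

import numpy as np

def intensity_doa(w, x, y, z):
    # Average the instantaneous intensity I = p * v over a frame, with
    # pressure p ~ W and particle velocity v ~ (X, Y, Z) up to gain.
    v = np.array([np.mean(w * x), np.mean(w * y), np.mean(w * z)])
    return v / (np.linalg.norm(v) + 1e-12)  # unit direction estimate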
Speakers
avatar for Masataka Nakahara

Masataka Nakahara

Acoustic Designer / Acoustician, SONA Corp. / ONFUTURE Ltd.
Masataka Nakahara is an acoustician specializing in studio acoustic design and R&D work on room acoustics, as well as an educator. After studying acoustics at the Kyushu Institute of Design, he joined SONA Corporation and began his career as an acoustic designer. In 2005, he received... Read More →
Authors
avatar for Masataka Nakahara

Masataka Nakahara

Acoustic Designer / Acoustician, SONA Corp. / ONFUTURE Ltd.
Masataka Nakahara is an acoustician specializing in studio acoustic design and R&D work on room acoustics, as well as an educator. After studying acoustics at the Kyushu Institute of Design, he joined SONA Corporation and began his career as an acoustic designer. In 2005, he received... Read More →

Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Measurement and Applications of Directional Room Impulse Responses (DRIRs) for Immersive Sound Reproduction
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Traditional methods for characterizing Room Impulse Responses (RIRs) employing omnidirectional microphones do not fully capture the spatial properties of sound in an acoustic space. In this paper we explore a method for the characterization of room acoustics employing Directional Room Impulse Responses (DRIRs), which include the direction of arrival of the reflected sound waves in an acoustic space in addition to their time of arrival and strength. We measured DRIRs using a commercial 3D sound intensity probe (Weles Acoustics WA301) containing x, y, z acoustic velocity channels in addition to a scalar pressure channel. We then employed the measured DRIRs to predict the binaural signals that would be measured by binaural dummy head microphones placed at the same location in the room where the DRIR was measured. The predictions can then be compared to the actual measured binaural signals. Successful implementation of DRIRs could significantly enhance applications in AR/VR and immersive sound reproduction by providing listeners with room-specific directional cues for early room reflections in addition to the diffuse reverberant impulse response tail.
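A sketch of the binaural prediction idea: once reflections (delay, gain, direction) have been extracted from a DRIR, each can be filtered with the nearest measured HRIR and summed; `hrir_lookup` is a hypothetical helper, not part of the paper.

import numpy as np

def render_binaural(reflections, hrir_lookup, fs, length):
    # Sum HRIR-filtered impulses for reflections extracted from a DRIR,
    # each given as (delay_s, gain, unit_direction).
    out = np.zeros((length, 2))
    for delay_s, gain, direction in reflections:
        hrir = hrir_lookup(direction)  # hypothetical: nearest measured HRIR, shape (taps, 2)
        start = int(round(delay_s * fs))
        stop = min(start + len(hrir), length)
        out[start:stop] += gain * hrir[:stop - start]
    return out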
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Quantitative Assessment of Acoustical Attributes and Listener Preferences in Binaural Renderers with Head-tracking Function
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
The rapid advancement of immersive audio technologies has popularized binaural renderers that create 3D auditory experiences using head-related transfer functions (HRTFs). Various renderers with unique algorithms have emerged, offering head-tracking functionality for real-time adjustments to spatial audio perception. Building on our previous study, we compared binauralized music from five renderers with the dynamic head-tracking function enabled, focusing on how differences in HRTFs and algorithms affect listener perceptions. Participants assessed overall preference, spatial fidelity, and timbral fidelity by comparing paired stimuli. Consistent with our earlier findings, one renderer received the highest ratings for overall preference and spatial fidelity, while others rated lower in these attributes. Physical analysis showed that interaural time differences (ITD), interaural level differences (ILD), and frequency response variations contributed to these outcomes. Notably, hierarchical cluster analysis of participants' timbral fidelity evaluations revealed two distinct groups, suggesting variability in individual sensitivities to timbral nuances. While spatial cues, enhanced by head tracking, were generally found to be more influential in determining overall preference, the results also highlight that timbral fidelity plays a significant role for certain listener groups, indicating that both spatial and timbral factors should be considered in future developments.
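For readers unfamiliar with one of the physical measures named above, ITD is commonly estimated as the lag maximizing the interaural cross-correlation; a minimal sketch (not the study's analysis code), with left/right as NumPy arrays:

import numpy as np

def estimate_itd(left, right, fs, max_itd_s=1e-3):
    # ITD = lag (s) maximizing the interaural cross-correlation,
    # searched within +/-1 ms; positive means the left ear lags.
    full = np.correlate(left, right, mode="full")
    lags = np.arange(-len(right) + 1, len(left))
    keep = np.abs(lags) <= max_itd_s * fs
    return lags[keep][np.argmax(full[keep])] / fs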
Speakers
avatar for Rai Sato

Rai Sato

Ph.D. Student, Korea Advanced Institute of Science and Technology
Rai Sato (佐藤 来) is currently pursuing a PhD at the Graduate School of Culture Technology at the Korea Advanced Institute of Science and Technology. He holds a Bachelor of Music from Tokyo University of the Arts, where he specialized in immersive audio recording and psychoacoustics... Read More →
Authors
avatar for Rai Sato

Rai Sato

Ph.D. Student, Korea Advanced Institute of Science and Technology
Rai Sato (佐藤 来) is currently pursuing a PhD at the Graduate School of Culture Technology at the Korea Advanced Institute of Science and Technology. He holds a Bachelor of Music from Tokyo University of the Arts, where he specialized in immersive audio recording and psychoacoustics... Read More →
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Review: Head-Related Impulse Response Measurement Methods
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
This review paper discusses advancements in Head-Related Impulse Response (HRIR) measurement methods. HRIR measurement methods, often referred to as HRTF (Head-Related Transfer Function) measurement methods, have undergone significant changes over the last few decades [1]. A frequently employed method is the discrete stop-and-go method [1][2], which involves changing the location of a single speaker, used as the sound source, and recording the impulse response at each location [2]. Since the measurement covers one source location at a time, the discrete stop-and-go method is time-consuming [1]; hence, improvements such as using more sound sources (speakers) are required to enhance the efficiency of the measurement process [1][3]. A typical HRTF measurement is usually conducted in an anechoic chamber to achieve a simulated free-field measurement condition without room reverberation. It measures the transfer function between the source and the ears to capture localisation cues such as inter-aural time differences (ITDs), inter-aural level differences (ILDs), and monaural spectral cues [4]. Newer techniques such as the Multiple Exponential Sweep Method (MESM) and the reciprocal method offer alternatives; these methods enhance measurement efficiency and address challenges like inter-reflections and low-frequency response [5][6]. Individualised HRTF measurement techniques can be categorised into acoustical measurement, anthropometric data, and perceptual feedback [7]. Interpolation methods and non-anechoic environment measurements have expanded the practical application and feasibility of HRTF measurements [8][9][10][7].
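As background for the sweep-based methods the review covers (MESM builds on overlapping exponential sweeps), here is a minimal sketch of a single exponential sine sweep and its deconvolution into an impulse response; the regularization constant is an illustrative assumption.

import numpy as np

def exp_sweep(f1, f2, dur, fs):
    # Exponential sine sweep from f1 to f2 Hz over dur seconds.
    t = np.arange(int(dur * fs)) / fs
    r = np.log(f2 / f1)
    return np.sin(2 * np.pi * f1 * dur / r * (np.exp(t * r / dur) - 1))

def deconvolve_ir(recorded, sweep):
    # Recover the impulse response by spectral division; the small
    # constant is a crude regularizer against near-zero bins.
    n = len(recorded) + len(sweep) - 1
    return np.fft.irfft(np.fft.rfft(recorded, n) /
                        (np.fft.rfft(sweep, n) + 1e-8), n)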
Speakers
avatar for Jeremy Tsuaye

Jeremy Tsuaye

New York University
Authors
avatar for Jeremy Tsuaye

Jeremy Tsuaye

New York University
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

The effects of interaural time difference and interaural level difference on sound source localization on the horizontal plane
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Interaural Time Difference (ITD) and Interaural Level Difference (ILD) are the main cues used by the human auditory system to localize sound sources on the horizontal plane. To explore the relationship between ITD, ILD, and the perceived azimuth, a study was conducted to measure and analyze localization effects on the horizontal plane for combinations of ITD and ILD. Pure tones were used as sound sources in the experiment. For each of three different frequency bands, 25 combinations of ITD and ILD test values were selected. These combinations were applied to a sound perceived as coming from directly in front of the listener (pure-tone signals recorded with an artificial head in an anechoic chamber). The tests were conducted using the 1-up/2-down and two-alternative forced-choice (2AFC) psychophysical testing methods. The results showed that the perceived azimuth at 350 Hz and 570 Hz was generally higher than at 1000 Hz. Additionally, the perceived azimuth at 350 Hz and 570 Hz was similar under certain combinations. The experimental data and conclusions can provide foundational data and theoretical support for efficient compression of multi-channel audio.
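For readers unfamiliar with the adaptive procedure named above, a minimal sketch of a 1-up/2-down track; the experiment's actual stimulus handling is not shown, and `present_trial` is a hypothetical callback.

def run_staircase(present_trial, start_level, step, n_trials=60):
    # 1-up/2-down track: the level steps down after two consecutive
    # correct 2AFC responses and up after any incorrect one, converging
    # on the ~70.7%-correct point of the psychometric function.
    level, streak, track = start_level, 0, []
    for _ in range(n_trials):
        track.append(level)
        if present_trial(level):  # hypothetical: runs one 2AFC trial
            streak += 1
            if streak == 2:
                level, streak = level - step, 0
        else:
            level, streak = level + step, 0
    return track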
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:30pm EDT

Decoding Emotions: Lexical and Acoustical Cues in Vocal Affects
Tuesday October 8, 2024 2:30pm - 3:00pm EDT
This study investigates listeners’ ability to detect emotion from a diverse set of speech samples, including both spontaneous conversations and actor-posed speech. It explores the contributions of lexical content and acoustic properties when native listeners rate seven pairs of affective attributes. Two experimental conditions were employed: a text condition, where participants evaluated emotional attributes from written transcripts without vocal information, and a voice condition, where participants listened to audio recordings to assess emotions. Results showed that the importance of lexical and vocal cues varies across 14 affective states for posed and spontaneous speech. Vocal cues enhanced the expression of sadness and anger in posed speech, while they had less impact on conveying happiness. Notably, vocal cues tended to mitigate negative emotions conveyed by the lexical content in spontaneous speech. Further analysis on correlations between emotion ratings in text and voice conditions indicated that lexical meanings suggesting anger or hostility could be interpreted as positive affective states like intimacy or confidence. Linear regression analyses indicated that emotional ratings by native listeners could be predicted up to 59% by lexical content and up to 26% by vocal cues. Listeners relied more on vocal cues to perceive emotional tone when the lexical content was ambiguous in terms of feeling and attitude. Finally, the analysis identified statistically significant basic acoustical parameters and other non/para-linguistic information, after controlling for the effect of lexical content.
Moderators Speakers
EO

Eunmi Oh

Research Professor, Yonsei University
Authors
EO

Eunmi Oh

Research Professor, Yonsei University
Tuesday October 8, 2024 2:30pm - 3:00pm EDT
1E03

2:30pm EDT

Nonlinear distortion in analog modeled DSP plugins in consequence of recording levels
Tuesday October 8, 2024 2:30pm - 3:00pm EDT
The nominal audio level is where developers of professional analog equipment design their units to have optimal performance. Audio levels above the nominal level will at some point lead to increased harmonic distortion and eventually clipping. DSP plugins emulating such nonlinear behavior must – in the same manner as analog equipment – align to a nominal level that is simulated within the digital environment. A listening test was tailored to investigate whether, or to what extent, misalignments between the recording level and the simulated nominal level in analog-modeled DSP plugins are audible, thus affecting the outcome depending on which level you choose to record at. The results of this study indicate that harmonic distortion in analog-modeled DSP plugins may become audible as the recording level increases. However, for the plugins included in this study, the immediate consequence of the added harmonics is not critical and, in most cases, not noticed by the listener.
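A minimal sketch of the effect under test, assuming a generic tanh waveshaper as a stand-in for an analog-modeled plugin and -18 dBFS as the simulated nominal level; real plugins' alignment and transfer curves will differ.

import numpy as np

def thd_percent(level_dbfs, nominal_dbfs=-18.0, fs=48000, f0=1000):
    # Drive a tanh stage with a 1 kHz sine at the given level (relative
    # to the assumed simulated nominal level), return THD in percent.
    t = np.arange(fs) / fs
    drive = 10 ** ((level_dbfs - nominal_dbfs) / 20)
    y = np.tanh(drive * np.sin(2 * np.pi * f0 * t))
    spec = np.abs(np.fft.rfft(y * np.hanning(len(y))))
    bins = [k * f0 for k in range(1, 6)]  # N = fs, so bin index = Hz
    harm = np.sqrt(sum(spec[b] ** 2 for b in bins[1:]))
    return 100 * harm / spec[bins[0]]

for lvl in (-18, -12, -6, 0):
    print(f"{lvl:>4} dBFS: {thd_percent(lvl):.2f} % THD")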
Moderators
avatar for Rob Maher

Rob Maher

Professor, Montana State University
Audio digital signal processing, audio forensics, music analysis and synthesis.
Speakers
avatar for Tore Teigland

Tore Teigland

Professor, Kristiania University College
Authors
avatar for Tore Teigland

Tore Teigland

Professor, Kristiania University College
Tuesday October 8, 2024 2:30pm - 3:00pm EDT
1E04

2:30pm EDT

Vocal Processing
Tuesday October 8, 2024 2:30pm - 3:30pm EDT
In addition to long-form dLive Certification classes, training sessions will cover a variety of topics relevant to live sound engineers, including: Mixing Monitors from FOH; Vocal Processing; Groups, Matrices and DCAs; Active Dynamics; and Gain Staging.

The training will be led by industry veterans Michael Bangs and Jake Hartsfield. Bangs, whose career includes experience as a monitor engineer and production manager, has worked with A-list artists, including Aerosmith, Katy Perry, Tom Petty, Lynyrd Skynyrd and Kid Rock. Hartsfield is a seasoned live sound engineer, having mixed for artists like Vulfpeck, Ben Rector, Fearless Flyers, and more.

Signup link: https://zfrmz.com/DmSlX5gyZCfjrJUHa6bV
Tuesday October 8, 2024 2:30pm - 3:30pm EDT
1E05

3:00pm EDT

A comparison of in-ear headphone target curves for the Brüel & Kjær Head & Torso Simulator Type 5128
Tuesday October 8, 2024 3:00pm - 3:30pm EDT
Controlled listening tests were conducted on five different in-ear (IE) headphone target curves measured on the latest ITU-T Type 4.3 ear simulator (e.g. the Brüel & Kjær Head & Torso Simulator Type 5128). A total of 32 listeners rated each target on a 100-point scale based on preference for three different music programs with two observations each. When averaged across all listeners, two target curves were found to be equally preferred over the other choices. Agglomerative hierarchical clustering analysis further revealed two classes of listeners based on dissimilarities in their preferred target curves. Class 1 (72% of listeners) preferred the two top-rated targets. Class 2 (28% of listeners) preferred targets with 2 dB less bass and 2 dB more treble than the target curves preferred by Class 1. Among the demographic factors examined, age was the best predictor of membership in each class.
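A sketch of how the clustering step might be reproduced with SciPy, assuming a listeners-by-targets rating matrix; random data stands in for the study's ratings.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Rows = listeners, columns = mean preference per target curve.
ratings = np.random.default_rng(0).uniform(0, 100, (32, 5))  # stand-in data

Z = linkage(ratings, method="ward")              # agglomerative, Ward criterion
labels = fcluster(Z, t=2, criterion="maxclust")  # cut the tree at 2 classes
for c in (1, 2):
    members = ratings[labels == c]
    print(f"class {c}: n={len(members)}, "
          f"mean profile={members.mean(axis=0).round(1)}")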
Moderators Speakers Authors
Tuesday October 8, 2024 3:00pm - 3:30pm EDT
1E03

3:00pm EDT

A Survey of Methods for the Discretization of Phonograph Record Playback Filters
Tuesday October 8, 2024 3:00pm - 3:30pm EDT
Since the inception of electrical recording for phonograph records in 1924, records have been intentionally cut with a non-uniform frequency response to maximize the information density on a disc and to improve the signal-to-noise ratio. To reproduce a nominally flat signal within the available bandwidth, the effects of this cutting curve must be undone by applying an inverse curve on playback. Until 1953, with the introduction of what has become known as the RIAA curve, the playback curve required for any particular disc could vary by record company and over time. As a consequence, anyone seeking to hear or restore the information on a disc must have access to equipment that is capable of implementing multiple playback equalizations. This correction may be accomplished with either analog hardware or digital processing. The digital approach has the advantages of reduced cost and expanded versatility, but requires a transformation from continuous time, where the original curves are defined, to discrete time. This transformation inevitably comes with some deviations from the continuous-time response near the Nyquist frequency. There are many established methods for discretizing continuous-time filters, and these vary in performance, computational cost, and inherent latency. In this work, several methods for performing this transformation are explored in the context of phonograph playback equalization, and the performance of each approach is quantified. This work is intended as a resource for anyone developing systems for digital playback equalization or similar applications that require approximating the response of a continuous-time filter digitally.
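As one concrete instance of the transformations the paper surveys, here is a sketch discretizing the RIAA playback curve with the bilinear transform, whose response deviates near Nyquist in the way the abstract describes; the 1 kHz unity-gain normalization is a common convention, an assumption here rather than something the paper specifies.

import numpy as np
from scipy.signal import bilinear

T1, T2, T3 = 3180e-6, 318e-6, 75e-6  # RIAA playback time constants (s)
fs = 96000

# Analog de-emphasis: zero at 1/T2, poles at 1/T1 and 1/T3.
b_s = [T2, 1.0]                          # (1 + s*T2)
a_s = np.polymul([T1, 1.0], [T3, 1.0])   # (1 + s*T1)(1 + s*T3)
b_z, a_z = bilinear(b_s, a_s, fs)        # one of several possible methods

# Normalize to unity gain at 1 kHz (common convention).
w = np.exp(2j * np.pi * 1000 / fs)
b_z = b_z / abs(np.polyval(b_z, w) / np.polyval(a_z, w))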
Moderators
avatar for Rob Maher

Rob Maher

Professor, Montana State University
Audio digital signal processing, audio forensics, music analysis and synthesis.
Speakers
avatar for Benjamin Thompson

Benjamin Thompson

PhD Student, University of Rochester
Authors
Tuesday October 8, 2024 3:00pm - 3:30pm EDT
1E04

3:00pm EDT

Immersive and interactive audio management and production for new entertainment venues.
Tuesday October 8, 2024 3:00pm - 3:45pm EDT
A wide range of new immersive audio venues and applications has emerged, such as The Sphere, the COSM entertainment domes, theatres, live events, and museum and art installations. These venues need new audio management as well as production tools, but the way content is created has not changed: music and audio producers work in regular studios, not least because it is rarely possible to get time on site for extensive mixing sessions. Based on close to 30 years of immersive and binaural audio production experience, New Audio Technology provides these tools and will give an insight into mixing, management, and playback strategies for these applications.
Speakers
avatar for Tom Ammermann

Tom Ammermann

New Audio Technology
Grammy-nominated music producer Tom Ammermann began his journey as a musician and music producer in the 1980s. At the turn of the 21st century, Tom produced unique surround audio productions for music and film projects as well as pioneering the very first surround mixes for headphones... Read More →
Tuesday October 8, 2024 3:00pm - 3:45pm EDT
3D04 (AES and AIMS Alliance Program Room)
  Sponsor Session

3:15pm EDT

The Devil in the Details
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
What happens when you realize in the middle of a mass digitization project that most of your video assets have multi-track production audio instead of finished mixed audio, and your vendor doesn't offer a service to address the issue? Digitizing the Carlton Pearson Collection for the Harvard Divinity School produced just such a conundrum. This workshop will walk through a case study of identifying problems in vendor work and of the QC and production workflows that had to be put into place to correct the issues that surfaced as the project progressed. It includes a look at the technology stack developed internally in response, among it a full GUI video editor built for QC of audio and video and for mass top/tail editing of assets while offering individual edit decision points. From problem identification to audio mixing, video trimming, closed captioning using AI solutions, and deposit to preservation repositories, the project team had only three months to complete the work on just shy of 4000 assets.
Speakers
avatar for Kaylie Ackerman

Kaylie Ackerman

Head of Media Preservation, Harvard Library
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
1E16

3:15pm EDT

Audio Design Roundtable
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
Join us at the AES NY 2024 convention for an exciting panel featuring top-tier audio equipment designers who have set new standards in the industry! Geared toward aspiring audio designers, students, and educators, this session will dive into the creative and technical processes behind groundbreaking audio gear used in studios, live sound, and beyond. Hear from industry leaders as they share career highlights and insights on staying ahead in a rapidly evolving field. In addition, students will learn about the various student design competitions presented by AES, offering unique opportunities to showcase their skills and gain recognition. Stick around for a Q&A session, where you'll have the chance to ask these leading experts your burning questions and gain valuable knowledge to power your future in audio design!
Speakers
avatar for Christoph Thompson

Christoph Thompson

Director of Music Media Production, Ball State University
Christoph Thompson is vice-chair of the AES audio education committee. He is the chair of the AES Student Design Competition and the Matlab Plugin Design Competition. He is the director of the music media production program at Ball State University. His research topics include audio... Read More →
avatar for Brecht De Man

Brecht De Man

Head of Research, PXL University of Applied Sciences and Arts
Brecht is an audio engineer with a broad background comprising research, software development, management and creative practice. He holds a PhD from the Centre for Digital Music at Queen Mary University of London on the topic of intelligent software tools for music production, and... Read More →
avatar for George Massenburg

George Massenburg

Associate Professor of Sound Recording, Massenburg Design Works
George Y. Massenburg is a Grammy award-winning recording engineer and inventor. Working principally in Baltimore, Los Angeles, Nashville, and Macon, Georgia, Massenburg is widely known for submitting a paper to the Audio Engineering Society in 1972 regarding the parametric equali... Read More →
avatar for Anthony Agnello

Anthony Agnello

Managing Director, Eventide
Tony Agnello was born in Brooklyn, graduated from Brooklyn Technical High School in 1966, received the BSEE from City College of NY in 1971, the MSEE from the City University of NY in 1974 followed by post graduate studies in Digital Signal Processing at Brooklyn’s Polytechnical... Read More →
MW

Marek Walaszek

General, Addicted To Music
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
Stage

3:15pm EDT

Mary Campbell In Memoriam
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
Mary Campbell – In Memoriam
Manager Electric Lady Studios & Unique Recording, Sony Studios Booking Admin.

Mary Culum-Campbell was a legend of the NY studio scene for decades. She came from Montana in 1982 at age 21, and with a forged backstage pass and some other tales of delinquency, talked her way into a job in the shop at Electric Lady Studios with tech extraordinaire Sal Greco, who went on to build Paisley Park for Prince. Recalling her job interview at her memorial, Sal said: 'She's trouble. She'll fit right in.' Within six months Mary became the studio manager. A trailblazer with attitude who also loved the music around her as a genuine fan, Mary was one of the few women in a power position at a major recording studio in the '80s. Working in an intense and occasionally cutthroat environment during a tumultuous period in both music and music technology, she kept the studios booked and partied with the best (and worst) of them while managing both Electric Lady and the equally electric Unique Recording. She also had a stint working at the massive Sony Music Studios. The panel discussion will cover some of Mary's history as well as some of the history of the console and technology changes she presided over. Speakers will include former chief technicians Jim Gillis and Brian Macaluso, as well as associate Tony Drootin and engineer extraordinaire Ron Saint Germain.

Held together by Eliot Kissileff, with co-moderator Roey Shamir.

https://helenafuneralhome.com/obituaries/mary-c-campbell-age-61/
Speakers
avatar for Angela Piva

Angela Piva

Angela Piva, Audio Pro/Audio Professor, highly skilled in all aspects of music & audio production, recording, mixing and mastering with over 35 years of professional audio engineering experience and accolades. Known as an innovator in sound technology, and for contributing to the... Read More →
avatar for Eliot Kissileff

Eliot Kissileff

Owner, Music and Media Preservation Services
After graduating from New York University’s Tisch School of the Arts with a BFA in film production, Eliot Kissileff attended the Institute of Audio Research before landing a job at Electric Lady Studios in 1996. With the goal of learning how to make vintage-sounding recordings for... Read More →
avatar for Roey Shamir

Roey Shamir

Mixmaster, INFX PRODUCTIONS INC.
• Freelance Audio Engineer, Mixer and Producer • Audio Engineer, Mixer, Producer, Remixer for extensive major label records and film productions (Mix to Picture) • Mixed and recorded music / albums for numerous RIAA multi-platinum and Grammy-nominated... Read More →
avatar for Ron St. Germain

Ron St. Germain

Owner, Saint Sounds Inc & 'Saint's Place' Studio
Ron's career in the music business began in 1970. He learned the art of recording at two of America's busiest and best recording studios, Record Plant Studios and Media Sound Studios, both in NYC. Some of Ron's ‘colleagues’ during those formative years were Tony Bongiovi, Bob... Read More →
avatar for Brian Macaluso

Brian Macaluso

Owner, Clandestine Recording
A lifelong music technology nerd, musician and songwriter, recording engineer, studio owner and former Chief Tech for Electric Lady Studios, JSM Music, and Dreamhire Professional Audio Rentals in NYC. Owner of Clandestine Recording, a private project studio in Kingston NY.
avatar for Tony Drootin

Tony Drootin

Manager, Sound on Sound Studios
Tony graduated with a degree in Music Performance Percussion in 1984. Upon graduating college he took a position as receptionist at Unique Recording Studios in Times Square. Tony worked at Unique for 13 years, 10 of which he managed the facility. While at Unique Recording he formed... Read More →

Tuesday October 8, 2024 3:15pm - 4:15pm EDT
1E07

3:15pm EDT

Mastery in Ambisonic Recording
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
Ambisonic microphones offer a dynamic and multifaceted approach to capturing and creating immersive audio experiences. This workshop delves into the process of transforming multichannel A-format raw recordings into versatile Ambisonic B-format files. These files can then be decoded into various audio formats, including stereo, binaural, surround, and spatial 3D audio. This makes it an ideal technique for crafting head-tracking binaural audio for 360/3D videos and VR gaming environments.
Participants will gain hands-on experience with the Zoom H3-VR portable recorder and learn how to create Ambisonic recordings. The session will guide you through processing raw recordings using Reaper and the dearVR AMBI MICRO plugin, and will illustrate the workflow for integrating Ambisonic audio into 360/3D video and VR games with head-tracked binaural sound. Attendees are encouraged to bring their laptops and headphones, install the necessary software, and download the provided demo projects for an interactive experience. Additionally, the workshop will offer insights into capturing immersive audio with various Ambisonic microphones, including the new em64 Eigenmike Spherical Microphone Array.
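For orientation, the A-format-to-B-format step the workshop covers reduces, in its idealized form, to a fixed matrix; a sketch assuming the classic tetrahedral capsule order (FLU, FRD, BLD, BRU) and ignoring the capsule-spacing correction filters that a production tool such as dearVR AMBI MICRO also applies.

import numpy as np

# Tetrahedral capsule order FLU, FRD, BLD, BRU -> B-format W, X, Y, Z.
A2B = 0.5 * np.array([
    [1,  1,  1,  1],   # W: omni sum
    [1,  1, -1, -1],   # X: front minus back
    [1, -1,  1, -1],   # Y: left minus right
    [1, -1, -1,  1],   # Z: up minus down
])

def a_to_b(a_format):
    # a_format: array of shape (4, n_samples) in FLU/FRD/BLD/BRU order.
    return A2B @ a_format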
Speakers
avatar for Ming-Lun Lee

Ming-Lun Lee

Professor of Electrical and Computer Engineering, University of Rochester
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
1E09

3:15pm EDT

Susan Rogers: Enveloping Masterclass
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
Susan talks about her incredible career in science and music, working with artists such as Julia Darling, David Byrne, Prince, Laurie Anderson, Tevin Campbell and many more; all garnished with high-resolution listening examples.

This masterclass series, featuring remarkable recording artists, is a chance to hear stereo and 3D music at its best as we discuss important factors of production, distribution and reproduction. Thomas exemplifies the underrated quale of auditory envelopment (AE), and we evaluate how robustly intimacy and AE latent in the content may be heard across this particular listening room.

Seats are limited to keep playback variation at bay, and the session is concluded with Q&A. If you attend multiple masterclasses, consider choosing different seats each time.
Speakers
avatar for Thomas Lund

Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund has authored papers on human perception, spatialisation, loudness, sound exposure and true-peak level. He is researcher at Genelec, and convenor of a working group on hearing health under the European Commission. Out of a medical background, Thomas previously served in... Read More →
avatar for Susan Rogers

Susan Rogers

Professor, Berklee Online
Susan Rogers holds a doctoral degree in experimental psychology from McGill University (2010). Prior to her science career, Susan was a multiplatinum-earning record producer, engineer, mixer and audio technician. She is best known for her work with Prince during his peak creative... Read More →
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
3D06

3:15pm EDT

The Art of Mixing in Stereo and ATMOS Simultaneously
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
Join veteran audio engineer Matt Boudreau, who has mixed in ATMOS for Alanis Morissette, Green Day, and Deafheaven, as he delves into immersive audio mixing. In this session, Matt will discuss his approach to mixing in ATMOS and stereo simultaneously and provide a look into his Pro Tools mixing template. Drawing from his extensive experience and his Mixing in Stereo and ATMOS course, Matt will share practical tips for creating dynamic mixes that excel in both formats. Whether you're new to ATMOS or refining your skills, this presentation offers valuable insights for all audio professionals.
Speakers
MB

Matt Boudreau

Audio engineer at Matt Boudreau MIXING|MASTERING and producer/host/author of the Working Class Audio Podcast.
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
1E06

3:15pm EDT

Copying and attributing training data in audio generative models
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
MOTIVATION: Generative AI for audio is quickly proliferating in both the commercial and open-source software communities. However, there is still no technical consensus on how novel the generated audio data is, and what characteristics (if any) from the training data are commonly replicated. This workshop will explore existing technical approaches for detecting memorized audio training data, how often memorization happens, and what characteristics of the audio are memorized. Additionally, we plan to share recent research on how audio similarity algorithms can be used to attribute audio samples produced by a generative model to specific audio samples in the training set.

FORMAT: a panel discussion in three parts (listed below). The purpose is to inform the AES community about the technical feasibility of detecting memorization and attributing training data in audio generative models, give audio examples of the results, offer an outlook on future developments, and solicit feedback from the audience.

• Memorization and generalization in deep learning
• Searching for memorized training data in audio generative models
• Training data attribution using audio similarity measures
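As a rough illustration of the third bullet, similarity-based attribution can be prototyped with off-the-shelf tools. The sketch below is not the panelists' method; it assumes the librosa library, compares time-averaged log-mel spectra by cosine similarity, and uses hypothetical file paths.

```python
# Hedged sketch: attribute a generated clip to its nearest training example
# by cosine similarity of time-averaged log-mel spectra. Illustrative only;
# real attribution research uses far more robust similarity measures.
import numpy as np
import librosa  # assumed available

def embed(path: str, sr: int = 22_050, n_mels: int = 64) -> np.ndarray:
    y, _ = librosa.load(path, sr=sr, mono=True, duration=30.0)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    v = np.log(mel + 1e-6).mean(axis=1)       # average over time -> fixed size
    return v / (np.linalg.norm(v) + 1e-12)

def attribute(generated: str, training_set: list[str]) -> tuple[str, float]:
    g = embed(generated)
    scores = {path: float(np.dot(g, embed(path))) for path in training_set}
    best = max(scores, key=scores.get)
    return best, scores[best]                  # closest training clip + score

# match, score = attribute("generated.wav", ["train_001.wav", "train_002.wav"])
```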
Speakers
avatar for Gordon Wichern

Gordon Wichern

Senior Principal Research Scientist, Speech and Audio Team, MERL
Audio signal processing and machine learning researcher
avatar for Yuki Mitsufuji

Yuki Mitsufuji

Lead Research Scientist/VP of AI Research, Sony AI
avatar for Philipp Lengeling

Philipp Lengeling

Senior Counsel, RafterMarsh
Philipp G. Lengeling, Mag. iur., LL.M. (New York), Esq. is an attorney based in New York (U.S.A.) and Hamburg (Germany) who heads the New York and Hamburg offices of RafterMarsh, a transatlantic boutique law firm (California, New York, U.K., Germany).
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
1E15

3:15pm EDT

Reporting from the Frontlines of the Recording Industry: Hear from Studio Owners and Chief Engineers on Present / Future Trends, Best Practices, and the State of the Recording Industry
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
Panel Description:
Join us for an in-depth, candid discussion with prominent NYC studio owners on the state of the recording industry today and where it might go tomorrow. This panel will explore trends in studio bookings, predictions for the future, and the evolving landscape of recording studios. Our panelists will share their experiences of owning and managing studios, offering valuable insights for engineers on essential skills and best practices for success in today’s recording environment. We'll also discuss the importance of high-quality recording, and how these practices are crucial to maintaining the soul of music against the backdrop of AI and digital homogenization.

Topics also include:
What are studios looking for from aspiring engineers, and what does it take to work in a major market studio?
Studio best practices from deliverables to client relations
To Atmos or not to Atmos?
Proper crediting for engineers and staff
Working with record labels and film studios
Can recording and mix engineers find the same labor protections as the film industry?

Panelists (in alphabetical order)

Amon Drum, Owner / Engineer, The Bridge Studio (https://www.bridgerecordingstudio.com/)
Amon Drum is a recordist, producer, and the owner, chief engineer, and acoustic designer of The Bridge Studio in Williamsburg, Brooklyn. He has expertise in both analog and digital recording techniques and specializes in recording non-western acoustic instruments, large ensembles, and live music for video. Amon is also a percussionist, having trained with master musicians including M’Bemba Bangora and Mamady Keita in Guinea, West Africa, and he brings this folkloric training to his recordings and productions. He has worked with a wide variety of artists and genres, from Jason Moran to Adi Oasis, Run The Jewels to Il Divo, as well as many Afro-Cuban ensembles from the diaspora.

Ben Kane, Owner, Electric Garden (https://electricgarden.com/)
Ben Kane is a Grammy Award-winning recording and mix engineer, producer, and owner of Electric Garden, a prominent recording studio known for its focus on technical and engineering excellence and a unique handcrafted ambiance. Kane is known for his work with D'Angelo, Emily King, Chris Dave, and PJ Morton.

Shahzad Ismaily, Owner / Engineer, Figure 8 Recording (https://www.figure8recording.com/)
Shahzad Ismaily is a Grammy-nominated multi-instrumentalist, composer, owner and engineer at Figure 8 Recording in Brooklyn. Renowned for his collaborative spirit and eclectic musical range, Shahzad has contributed his unique sonic vision to projects spanning various genres and artists worldwide.

Zukye Ardella, Partner / Engineer, s5studio (https://www.s5studiony.com/)
Zukye is a New York City-based, gold-certified audio engineer, music producer and commercial studio owner. In 2015, she began her professional career at the original s5studio located in Brooklyn, New York. Zukye teamed up with s5studio’s founder (Sonny Carson) to move s5studio to its current location in Chelsea, Manhattan. Over the years, Ardella has been an avid spokesperson for female empowerment organizations such as Women’s Audio Mission and She is the Music. Talents she’s worked with include NeYo, WizKid, Wale, Conway The Machine, A$AP Ferg, Lil Tecca, AZ, Dave East, Phillip Lawrence, Tay Keith, Lola Brooke, Princess Nokia, Vory, RMR, Yeat, DJ KaySlay, Kyrie Irving, Maliibu Mitch, Flipp Dinero, Fred The Godson, Jerry Wonda, ASAP 12vy and more.


--

Moderator:

Mona Kayhan, Owner, The Bridge Studio (https://www.bridgerecordingstudio.com/)
Mona is a talent manager for Grammy Award-winning artists, and consults for music and media companies on marketing and operations. She has experience as a tour manager and an international festival producer, and got her start in NYC working at Putumayo World Music. As an owner of The Bridge Studio, Mona focuses on client relationships, studio operations, and strategic partnerships. She also has an insatiable drive to support and advocate for the recording arts industry.
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
1E08

3:30pm EDT

Leveraging TSN Protocols to Support AES67: Achieving AVB Quality with Layer 3 Benefits
Tuesday October 8, 2024 3:30pm - 3:50pm EDT
This paper investigates using Time-Sensitive Networking (TSN) protocols, particularly from Audio Video Bridging (AVB), to support AES67 audio transport. By leveraging the IEEE 1588 Layer 3 Precision Time Protocol (PTP) Media Profile, packet scheduling, and bandwidth reservation, we demonstrate that AES67 can be transported with AVB-equivalent quality guarantees while benefiting from Layer 3 networking advantages. The evolution of professional audio networking has increased the demand for high-quality, interoperable, and efficiently managed networks. AVB provides robust Layer 2 delivery guarantees but is limited by Layer 2 constraints. AES67 offers Layer 3 interoperability but lacks strict quality of service (QoS) guarantees. This paper proposes combining the strengths of both approaches by using TSN protocols to support AES67, ensuring precise audio transmission with Layer 3 flexibility. TSN extends AVB standards for time synchronization, traffic shaping, and resource reservation, ensuring low latency, low jitter, and minimal packet loss. AES67, a standard for high-performance audio over IP, leverages ubiquitous IP infrastructure for scalability and flexibility but lacks the QoS needed for professional audio. Integrating TSN protocols with AES67 achieves AVB's QoS guarantees in a Layer 3 environment. The IEEE 1588 Layer 3 PTP Media Profile ensures precise synchronization, packet scheduling reduces latency and jitter, and bandwidth reservation prevents congestion. Experiments show that TSN protocols enable AES67 to achieve latency, jitter, and packet loss performance on par with AVB, providing reliable audio transmission suitable for professional applications in modern, scalable networks.
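The paper's testbed is not reproduced here, but the bandwidth-reservation arithmetic it relies on is easy to illustrate. A minimal sketch, assuming AES67 class defaults (48 kHz, 24-bit samples, 1 ms packet time) and ignoring Ethernet framing overhead:

```python
# Back-of-envelope sketch (not from the paper): the bandwidth a TSN
# reservation would need for one AES67 stream with default parameters.
SAMPLE_RATE = 48_000
PACKET_TIME_S = 0.001               # AES67 default packet time
CHANNELS = 8
BYTES_PER_SAMPLE = 3                # L24 payload format
IP_UDP_RTP_OVERHEAD = 20 + 8 + 12   # bytes per packet (IPv4 + UDP + RTP)

samples_per_packet = int(SAMPLE_RATE * PACKET_TIME_S)         # 48
payload = samples_per_packet * CHANNELS * BYTES_PER_SAMPLE    # 1152 bytes
packet_bytes = payload + IP_UDP_RTP_OVERHEAD                  # 1192 bytes
packets_per_second = 1 / PACKET_TIME_S                        # 1000
bandwidth_mbps = packet_bytes * 8 * packets_per_second / 1e6  # ~9.54 Mbit/s

print(f"{packet_bytes} B/packet -> reserve ~{bandwidth_mbps:.2f} Mbit/s")
```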
Moderators
avatar for Rob Maher

Rob Maher

Professor, Montana State University
Audio digital signal processing, audio forensics, music analysis and synthesis.
Speakers
avatar for Nicolas Sturmel

Nicolas Sturmel

Directout GmbH
Authors
Tuesday October 8, 2024 3:30pm - 3:50pm EDT
1E04

3:30pm EDT

A cepstrum analysis approach to perceptual modelling of the precedence effect
Tuesday October 8, 2024 3:30pm - 4:00pm EDT
The precedence effect describes our ability to perceive the spatial characteristics of lead and lag sound signals. When the time delay between the lead and lag is sufficiently small we cease to hear two distinct sounds, instead perceiving the lead and lag as a single fused sound with its own spatial characteristics. Historically, precedence effect models have had difficulty differentiating between lead/lag signals and their fusions. The likelihood of fusion occurring is increased when the signal contains periodicity, such as in the case of music. In this work we present a cepstral-analysis-based perceptual model of the precedence effect, CEPBIMO, which is more resilient to the presence of fusions than its predecessors. To evaluate our model we employ four datasets of various signal types, each containing 10,000 synthetically generated room impulse responses. The results of the CEPBIMO model are then compared against results of the BICAM. Our results show that the CEPBIMO model is more resilient to the presence of fusions and signal periodicity than previous precedence effect models.
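The authors' CEPBIMO implementation is not given in the abstract, but the underlying cue is easy to demonstrate: an attenuated lag adds a periodic ripple to the log spectrum, which the real cepstrum turns into a peak at the lag delay. A minimal sketch, assuming NumPy and a synthetic noise source:

```python
# Hedged sketch of the core idea (not the authors' CEPBIMO code): the real
# cepstrum of a lead+lag mixture peaks at the lag delay.
import numpy as np

def real_cepstrum(x: np.ndarray) -> np.ndarray:
    spectrum = np.fft.rfft(x)
    return np.fft.irfft(np.log(np.abs(spectrum) + 1e-12))

rng = np.random.default_rng(0)
sr = 48_000
lead = rng.standard_normal(sr // 2)      # 0.5 s noise "source"
delay = int(0.004 * sr)                  # 4 ms lag
mix = lead.copy()
mix[delay:] += 0.7 * lead[:-delay]       # add attenuated lag

ceps = real_cepstrum(mix)
peak = np.argmax(ceps[1:sr // 100]) + 1  # search 0-10 ms, skip quefrency 0
print(f"estimated lag: {peak / sr * 1000:.2f} ms")  # ~4 ms
```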
Speakers
avatar for Jeramey Tyler

Jeramey Tyler

Samtec
Jeramey is in the 3rd person. So it goes.
Authors
avatar for Jeramey Tyler

Jeramey Tyler

Samtec
Jeramey is in the 3rd person. So it goes.
Tuesday October 8, 2024 3:30pm - 4:00pm EDT
1E03

3:50pm EDT

Harnessing Diffuse Signal Processing (DiSP) to Mitigate Coherent Interference
Tuesday October 8, 2024 3:50pm - 4:10pm EDT
Coherent sound wave interference is a persistent challenge in live sound reinforcement, where phase differences between multiple loudspeakers lead to destructive interference, resulting in inconsistent audio coverage. This review paper presents a modern solution: Diffuse Signal Processing (DiSP), which utilizes Temporally Diffuse Impulses (TDIs) to mitigate phase cancellation. Unlike traditional methods focused on phase alignment, DiSP manipulates the temporal and spectral characteristics of sound, effectively diffusing coherent wavefronts. TDIs, designed to spread acoustic energy over time, are synthesized and convolved with audio signals to reduce the likelihood of interference. This process maintains the original sound’s perceptual integrity while enhancing spatial consistency, particularly in large-scale sound reinforcement systems. Practical implementation methods are demonstrated, including a MATLAB-based workflow for generating TDIs and optimizing them for specific frequency ranges or acoustic environments. Furthermore, dynamic DiSP is introduced as a method for addressing interference caused by early reflections in small- to medium-sized rooms. This technique adapts TDIs in real-time, ensuring ongoing decorrelation in complex environments. The potential for future developments, such as integrating DiSP with immersive audio systems or creating dedicated hardware for real-time signal processing, is also discussed.
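The paper describes a MATLAB workflow; as a language-neutral illustration, a TDI-style decorrelation impulse can be sketched in a few lines. This is a hedged approximation of the idea described above (decaying noise whose magnitude spectrum is flattened to near-allpass), not the authors' algorithm:

```python
# Hedged sketch of a Temporally Diffuse Impulse (TDI): exponentially decaying
# noise whose magnitude spectrum is flattened, leaving a near-allpass impulse
# that spreads energy over a few milliseconds. Parameters are illustrative.
import numpy as np

def make_tdi(sr: int = 48_000, dur_ms: float = 10.0, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    n = int(sr * dur_ms / 1000)
    envelope = np.exp(-np.arange(n) / (0.002 * sr))   # ~2 ms decay constant
    noise = rng.standard_normal(n) * envelope
    spec = np.fft.rfft(noise)
    flat = np.exp(1j * np.angle(spec))                # keep phase, flat magnitude
    tdi = np.fft.irfft(flat, n)
    return tdi / np.max(np.abs(tdi))

tdi = make_tdi()
# Convolving each loudspeaker feed with its own TDI (e.g. np.convolve(x, tdi))
# decorrelates adjacent speakers that radiate the same program material.
```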
Moderators
avatar for Rob Maher

Rob Maher

Professor, Montana State University
Audio digital signal processing, audio forensics, music analysis and synthesis.
Speakers
TS

Tommy Spurgeon

Physics Student & Undergraduate Researcher, University of South Carolina
Authors
TS

Tommy Spurgeon

Physics Student & Undergraduate Researcher, University of South Carolina
Tuesday October 8, 2024 3:50pm - 4:10pm EDT
1E04

4:00pm EDT

Categorical Perception of Neutral Thirds Within the Musical Context
Tuesday October 8, 2024 4:00pm - 4:30pm EDT
This paper investigates the contextual recognition of neutral thirds in music by integrating real-world musical context into the study of categorical perception. Traditionally, categorical perception has been studied using isolated auditory stimuli in controlled laboratory settings. However, music is typically experienced within a circumstantial framework, significantly influencing its reception. Our study involved musicians from various specializations who listened to precomposed musical fragments, each concluding with a 350-cent interval preceded by different harmonic contexts. The fragments included a monophonic synthesizer and orchestral mockups, with contexts such as major chords, minor chords, a single pitch, neutral thirds, and natural fifths. The results indicate that musical context markedly affects the recognition of pseudotonal chords. Participants' accuracy in judging interval size varied based on the preceding harmonic context. A statistical analysis was conducted to determine if there were significant differences in neutral third perception across the different harmonic contexts. The test led to the rejection of the null hypothesis: the findings underscore the need to consider real-world listening experiences in research on auditory processing and cognition.
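For readers unfamiliar with the stimulus, the 350-cent "neutral third" sits halfway between a minor third (300 cents) and a major third (400 cents); cents convert to a frequency ratio via 2^(cents/1200). A minimal synthesis sketch, assuming NumPy and arbitrary choices of root pitch and duration:

```python
# Illustration of the stimulus arithmetic (not the study's materials):
# synthesize a two-note "neutral third" interval of 350 cents.
import numpy as np

def interval_tone(f0: float, cents: float, sr: int = 48_000, dur: float = 1.0):
    t = np.arange(int(sr * dur)) / sr
    f1 = f0 * 2 ** (cents / 1200)        # upper note, 350 cents above the root
    return np.sin(2 * np.pi * f0 * t) + np.sin(2 * np.pi * f1 * t)

tone = interval_tone(440.0, 350.0)
print(f"upper note: {440.0 * 2 ** (350 / 1200):.2f} Hz")  # ~538.58 Hz
```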
Tuesday October 8, 2024 4:00pm - 4:30pm EDT
1E03

4:00pm EDT

Revolutionizing Loudspeaker Design with Ultra Thin Glass Technology
Tuesday October 8, 2024 4:00pm - 4:45pm EDT
Discover the groundbreaking potential of our latest Ultra Thin Glass (UTG) technology, engineered for speaker diaphragm applications. This innovation marries toughness, flexibility, and a distinctive finish with full ESG compliance. With thicknesses as fine as 25μm, UTG seamlessly integrates into earphones, headphones, micro-speakers, and a wide range of loudspeakers. Join us to explore how this cutting-edge material can elevate your audio designs to new heights.
Speakers
KC

Kwunkit Chan

Glass Acoustic Innovations Co., LTD
Tuesday October 8, 2024 4:00pm - 4:45pm EDT
3D04 (AES and AIMS Alliance Program Room)

4:30pm EDT

Mastering for Vinyl
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
Women's Audio Mission presents a panel discussion with award-winning mastering engineers and vinyl cutters who will talk about the intricacies of the vinyl format and the strategies used in mastering modern music for vinyl, including how to address factors such as sibilance, managing low-frequency stereo information, sequencing strategies, program length and volume, etc.

Moderator: Terri Winston

Panelists:

Kim Rosen - Kim Rosen is a GRAMMY-nominated mastering engineer who works out of her own “Knack Mastering” located in Ringwood, NJ, USA. Recordings Kim has mastered have garnered twenty-six Grammy nominations, with five wins across multiple categories, including Bonnie Raitt’s Song of the Year. She has mastered projects for Wynonna Judd, Johnny Cash, Aimee Mann, The Milk Carton Kids, Allison Russell, Superdrag, and Flogging Molly.

Margaret Luthar - Margaret Luthar is a Grammy-nominated mastering engineer who has worked at Chicago Mastering Service (where she learned to cut vinyl) and at Welcome to 1979 in Nashville. She is currently a Broadcast Recording Technician at NPR and a freelance mastering engineer based in Los Angeles. She has mastered thousands of projects, including work for The Lumineers, Tinashe, Spiritualized, and Soccer Mommy, and has significant experience in audio restoration from her work at the Norwegian Institute of Recorded Sound.

Piper Payne - Piper Payne is a mastering engineer based in Nashville, TN. Piper has mastered a wide variety of music by nationally renowned artists such as Janis Ian (Best Folk Album Grammy nomination, 2023), Dolly Parton, Third Eye Blind, LeAnn Rimes, The Go-Go’s, Madame Gandhi, and many more. After moving her mastering studio from the San Francisco Bay Area to Nashville, she ventured into vinyl manufacturing, opening Physical Music Products.

Moderated by Terri Winston, Executive Director of Women's Audio Mission. Women's Audio Mission (WAM) is dedicated to closing the chronic gender gap in the music, recording and technology industries. Less than 5% of the people creating and shaping the sounds that make up the soundtrack of our lives are women and gender-expansive people. WAM is “changing the face of sound” by providing training, mentoring and career advancement that inspires the next generation of women and gender-expansive music producers, recording engineers and audio technologists and radically changes the narrative in these industries.
Speakers
avatar for Terri Winston

Terri Winston

Executive Director, Women's Audio Mission
Women's Audio Mission (WAM) is dedicated to closing the chronic gender gap in the music, recording and technology industries. Less than 5% of the people creating and shaping the sounds that make up the soundtrack of our lives are women and gender-expansive people. WAM is “changing... Read More →
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
1E07

4:30pm EDT

ADM-OSC 1.0
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
ADM-OSC is an industry initiative aimed at standardising Object-Based Audio (OBA) positioning data by implementing the Audio Definition Model (ADM) over Open Sound Control (OSC). As immersive audio gains traction across various industries, from music streaming to gaming, and from live sound to broadcasting, the Audio Definition Model (ADM) is becoming a popular standard for metadata. This includes Serial ADM for broadcast and ADM BWF or XML files for studio use.

The ADM-OSC workgroup was formed four years ago to bridge the gap between immersive live and studio ecosystems. It now includes leading developers and manufacturers who aim to facilitate the sharing of audio object metadata across different environments, from studios to broadcasts to live performances.

Since its initial draft implementation, ADM-OSC has been supported by various audio industry tools, including live rendering engines, digital audio workstations (DAWs), controllers, live tracking systems, and media server solutions. It is currently being deployed in both live and studio productions, with increasing interest from technology developers wanting to join and implement this standard.

ADM-OSC 1.0 is now the published specification, aiming to provide a basic interoperability layer between Object Editors and Object Renderers.

This presentation and workshop will take a deep dive into ADM-OSC and will cover:
- The origins of ADM-OSC
- Presentation of the ADM-OSC 1.0 specification
- Use case/application demonstrations:
  - DAW object positional data to external rendering engine(s)
  - Controllers’ data to an object panner in a DAW for automation recording
  - Live tracking (actors, artists) positional data to live rendering engine(s)
- Plugin fest 2023 and 2024 report
- Future considerations
- Application-specific subgroups such as broadcast, VR/gaming, live rendering, and show control
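For a concrete feel of the interoperability layer, an Object Editor can transmit positions with any OSC library. The sketch below assumes the python-osc package and a renderer listening on a hypothetical host/port; the /adm/obj/<n>/... address space follows the published ADM-OSC specification, but exact ranges and sign conventions should be checked against the 1.0 document:

```python
# Hedged sketch: send ADM-OSC object positions from an Object Editor to an
# Object Renderer. Host, port, and object index are assumptions.
from pythonosc.udp_client import SimpleUDPClient

client = SimpleUDPClient("127.0.0.1", 4001)   # renderer address (assumed)

# Move object 1 to 30 degrees azimuth, 10 degrees elevation, 2 m distance.
client.send_message("/adm/obj/1/azim", 30.0)  # degrees
client.send_message("/adm/obj/1/elev", 10.0)  # degrees
client.send_message("/adm/obj/1/dist", 2.0)   # metres
```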
Speakers
avatar for Michael Zbyszynski

Michael Zbyszynski

Software Development Engineer, L-Acoustics
Michael Zbyszyński is a musician, researcher, teacher and developer in the field of contemporary electroacoustic music. He is currently part of the Creative Technologies R&D group at L-Acoustics. As a musician, his work spans from brass bands to symphony orchestras, including composition... Read More →
avatar for Hugo Larin

Hugo Larin

Senior Mgr. Business Development | FLUX:: GPLM, Harman International
Hugo Larin is a key collaborator on the FLUX:: SPAT Revolution project and has deep roots in audio mixing, design and operation, as well as in networked control and data distribution. He leads FLUX:: business development at HARMAN. His recent involvements and interests include object-based spatial audio mixing workflows, interoperability... Read More →
avatar for Mathieu Delquignies

Mathieu Delquignies

Education & Application Support France, d&b audiotechnik
Mathieu holds a Bachelor's degree in applied physics from Paris 7 University and a Master's degree in sound engineering from ENS Louis Lumière (2003). He has years of diverse international freelance experience in mixing and system design, as well as in loudspeakers, amplifiers, DSP... Read More →
avatar for Lucas Zwicker

Lucas Zwicker

Senior Director, Workflow and Integration, CTO Office, Lawo AG
Lucas joined Lawo in 2014, having previously worked as a freelancer in the live sound and entertainment industry for several years. He holds a degree in event technology and a Bachelor of Engineering in electrical engineering and information technology from the University of Applied... Read More →
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
1E15

4:30pm EDT

Upmix and Format Conversion of Multichannel Audio: An Opportunity for AI-Based Breakthroughs?
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
The demand for the production, distribution and playback of audio content in surround or immersive multichannel formats and flexible playback setups has continued to grow in the cinema, broadcast and music industries. This trend motivates the continued development of multichannel audio signal processing methods for converting recordings between different multichannel and spatial audio formats or layouts, including down- or up-mixing. These operations leverage a well-researched frequency-domain processing framework, nowadays commonly referred to as parametric spatial audio signal processing, to solve challenges including the primary-ambient decomposition of audio recordings. Bringing ML/AI techniques to bear on the development of novel spatial audio analysis/synthesis methods can unlock practical applications with transformative impact. In this workshop, we will review and discuss the fundamentals of and recent progress in these methods.
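As one concrete example of the parametric framework mentioned above, primary-ambient decomposition can be prototyped from inter-channel coherence in the STFT domain. The sketch below is a textbook-style illustration, assuming NumPy/SciPy, a simple moving-average smoother, and soft coherence masks; it is not any panelist's production method:

```python
# Hedged sketch: primary-ambient decomposition of a stereo signal via
# time-smoothed inter-channel coherence in the STFT domain.
import numpy as np
from scipy.signal import stft, istft

def _tsmooth(x: np.ndarray, n: int = 8) -> np.ndarray:
    # Moving average over time frames (axis 1); works for complex arrays.
    kernel = np.ones(n) / n
    return np.stack([np.convolve(row, kernel, mode="same") for row in x])

def primary_ambient(left, right, sr, nperseg=2048):
    _, _, L = stft(left, fs=sr, nperseg=nperseg)
    _, _, R = stft(right, fs=sr, nperseg=nperseg)
    eps = 1e-12
    # Coherence in [0, 1]: high where both channels carry the same
    # (primary) content, low for diffuse ambience.
    coh = np.abs(_tsmooth(L * np.conj(R))) / np.sqrt(
        _tsmooth(np.abs(L) ** 2) * _tsmooth(np.abs(R) ** 2) + eps)
    _, primary = istft(coh * L, fs=sr, nperseg=nperseg)
    _, ambient = istft((1.0 - coh) * L, fs=sr, nperseg=nperseg)
    return primary, ambient  # left-channel estimates; repeat with R for right
```

In an upmix, the ambient estimate would typically feed the surround/height channels while the primary component stays in the frontal image.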
Speakers
avatar for Jean-Marc Jot

Jean-Marc Jot

Founder and Principal, Virtuel Works LLC
Spatial audio and music technology expert and innovator. Virtuel Works provides audio technology strategy, IP creation and licensing services to help accelerate the development of audio and music spatial computing technology and interoperability solutions.
avatar for Gordon Wichern

Gordon Wichern

Senior Principal Research Scientist, Speech and Audio Team, MERL
Audio signal processing and machine learning researcher
avatar for Sunil Bharitkar

Sunil Bharitkar

Samsung Research America
avatar for Carlos Freitas

Carlos Freitas

CMO - Mastering Engineer, Spatial9
Nine Latin Grammy nominations in the Best Engineering category; mastering engineer on 34 Grammy Award-winning records, with more than 100 nominations. The albums include artists such as Nathan East, Andres Cepeda, Ines Gaviria, João Gilberto, Caetano Veloso, Gilberto Gil... Read More →
avatar for Alan Silva

Alan Silva

CTO, Spatial9
Alan Silva is a seasoned researcher and engineer who focuses on leveraging machine learning algorithms and distributed computing to create innovative solutions. With a solid commitment to open-source development, Alan actively contributes to collaborative projects that drive... Read More →
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
1E06

4:30pm EDT

Psychoacoustics for Immersive Productions
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
3D audio has enormous potential to touch the audience emotionally: the strongest effect occurs when the auditory system is given the illusion of being in a natural environment. When this happens with impressive music, everyone gets goosebumps. Psychoacoustics forms the basis for remarkable results in music productions.
In the first part, Konrad Strauss explains the basics of psychoacoustics in the context of music production:
• How immersive differs from stereo
• Sound localization and perception
• The eye/ear/brain link
• Implications for recording and mixing in immersive
• Transitioning from stereo to immersive: center speaker, LFE, working with the diffuse surround field and height channels.
In the second part, Lasse Nipkow introduces the quasi-binaural spot miking technique he uses to capture the beautiful sound of acoustic instruments during his recordings. He explains the strategy for microphone placement and shows, using stereo and 3D audio sound examples, the potential of these signals for immersive productions.
This first contribution is linked to a second, subsequent presentation by Lasse Nipkow and Ronald Prent: ‘Tools for Impressive Immersive Productions’.
Speakers
avatar for Lasse Nipkow

Lasse Nipkow

CEO, Silent Work LLC
Since 2010, Lasse Nipkow has been a renowned keynote speaker in the field of 3D audio music production. His expertise spans from seminars to conferences, both online and offline, and has gained significant popularity. As one of the leading experts in Europe, he provides comprehensive... Read More →
avatar for Konrad Strauss

Konrad Strauss

Professor, Indiana University Jacobs School of Music
Konrad Strauss is a Professor of Music in the Department of Audio Engineering and Sound Production at Indiana University’s Jacobs School of Music. He served as department chair and director of Recording Arts from 2001 to 2022. Prior to joining the faculty of IU, he worked as an... Read More →
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
1E16

4:30pm EDT

Morten Lindberg: Enveloping Masterclass
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
Morten details 3D music recordings from his 2L catalogue, including high-resolution listening examples from new releases.

This masterclass series, featuring remarkable recording artists, is a chance to hear 3D audio at its best as we discuss the factors of production, distribution and reproduction that make it worth the effort. Thomas exemplifies the underrated quale of auditory envelopment (AE), and we evaluate how robustly AE latent in the content may be heard across this particular listening room.

Seats are limited to keep playback variation at bay, and the session is concluded with Q&A. If you attend multiple masterclasses, consider choosing different seats.
Speakers
avatar for Thomas Lund

Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund has authored papers on human perception, spatialisation, loudness, sound exposure and true-peak level. He is researcher at Genelec, and convenor of a working group on hearing health under the European Commission. Out of a medical background, Thomas previously served in... Read More →
avatar for Morten Lindberg

Morten Lindberg

Producer and Engineer, 2L (Lindberg Lyd, Norway)
Recording Producer and Balance Engineer with 43 GRAMMY nominations, 35 of these in the craft categories Best Engineered Album, Best Surround Sound Album, Best Immersive Audio Album and Producer of the Year. Founder and CEO of the record label 2L. Grammy Award winner 2020.
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
3D06

4:30pm EDT

How Do We Embrace AI and Make It Work For Us in Audio? Hey AI, Get Yer Stinkin’ Hands Offa My Job! AES and SMPTE Joint Exploration
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
In this timely panel, let’s discuss some of the pressing matters AI confronts us with in audio, and how we can turn a perceived foe into an ally.
We’ll discuss issues including:
During production, AI cannot deal with any artistic issues related to production and engineering, most of which depend on personal interaction as well as perception.

In post-production, AI could be of use in repetitive tasks: making a track conform to a click track while maintaining proper pitch, performing pitch correction on a track, dealing with extraneous clicks (without removing important vocal consonants), and performing ambience matching (particularly on live recordings), to name a few. Can AI running in the background on our DAW build macros for us?

The more we can use it as a tool for creativity and to enhance our revenue streams, the more it becomes a practical, positive approach. Many composers are using it to create or enhance their musical ideas almost instantaneously. The key here is that they are our ideas that AI adds to.

How do we embrace AI, adapt to it, and help it to adapt to us? Can we get to the point where we incorporate it as we have past innovations rather than fear it? How do we take control of AI instead of AI taking control of us?

What should we, the audio community, be asking AI to do?
Speakers
avatar for Gary Gottlieb

Gary Gottlieb

AES President-Elect, Mendocino College
President-Elect, Co-Chair of the Events Coordination Committee, Chair of the Conference Policy Committee, and former Vice President of the Eastern Region, US and Canada; AES Fellow, Engineer, Author, Educator and Guest Speaker Gary Gottlieb refers to himself as a music generalist... Read More →
avatar for Lenise Bent

Lenise Bent

Producer/Engineer/Editor/AES Governor, Soundflo Productions
Audio Recording History, Women and Diversity in Audio, Analog Tape Recording, Post Production/Sound Design/Foley, Vinyl Records, Audio Recording Archiving, Repair and Preservation, Basic and Essential Recording Techniques, Opportunities in the Audio Industry, Audio Adventurers
avatar for Franco Caspe

Franco Caspe

Student, Queen Mary University of London
I’m an electronic engineer, a maker, hobbyist musician and a PhD Student at the Artificial Intelligence and Music CDT at Queen Mary University of London. I have experience in development of real-time systems for applications such as communication, neural network inference, and DSP... Read More →
avatar for Soumya Sai Vanka

Soumya Sai Vanka

PhD Researcher, Queen Mary University of London
I am a doctoral researcher at the Centre for Digital Music, Queen Mary University of London, under the AI and Music Centre for Doctoral Training Program. My research focuses on the design of user-centric, context-aware, AI-based tools for music production. As a hobbyist musician and producer myself, I am interested in developing tools that can support creativity and collaboration, resulting in emergence and novelty. I am also interested... Read More →
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
1E08

4:30pm EDT

Sonic Mastery - A Room of One's Own
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
This panel brings together a group of audio engineers to explore their experiences of navigating and shaping the audio industry as Black women. As they delve into the creation and significance of safe spaces, the panelists will discuss the importance of community in a field where representation is often lacking. Through sharing stories of unique and impactful projects, the conversation will highlight the creativity and resilience required to thrive in a male-dominated industry. Additionally, we will celebrate the legacy of unsung women producers and their groundbreaking contributions to music and production. This session is a discussion by and tribute to the talent and influence of Black women in audio engineering, both past and present.
Speakers
avatar for Eve Horne

Eve Horne

Peak Music
avatar for Leslie Gaston-Bird

Leslie Gaston-Bird

Owner, Mix Messiah Productions
Leslie Gaston-Bird (AMPS, MPSE) is author of the book "Women in Audio", part of the AES Presents series and published by Focal Press (Routledge). She is a voting member of the Recording Academy (The Grammys®) and its P&E (Producers and Engineers) Wing. Currently, she is a freelance... Read More →
avatar for Gloria Kaba

Gloria Kaba

Studio Manager, Power Station at BerkleeNYC
Gloria Kaba is a Ghanaian-American sound engineer, producer, mixer, and writer with over a decade of experience in the studio, often operating under the moniker Redsoul. She’s worked on A Tribe Called Quest’s final album We Got It From Here...Thank You For Your Service and Solange’s... Read More →
avatar for Ebonie Smith

Ebonie Smith

Ebonie Smith is a celebrated music producer, audio engineer, and singer-songwriter, based in the vibrant hub of Los Angeles. As a prominent figure in the industry, she currently holds the esteemed roles of senior audio engineer and producer at Atlantic Records. Ebonie's remarkable... Read More →
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
Stage

5:00pm EDT

Student Mixer at Telefunken Booth
Tuesday October 8, 2024 5:00pm - 6:00pm EDT
Open to anyone with a student badge.
Tuesday October 8, 2024 5:00pm - 6:00pm EDT
Booth 335

6:00pm EDT

Tech Tour: The Sonic Room at Amazon Studios
Tuesday October 8, 2024 6:00pm - 8:00pm EDT
All Tech Tours are full. Know Before You Go emails were sent to accepted Tech Tour registrants with information about the tour.  

Duration: 2 hours
Capacity: 50 (Once capacity is reached, registrants will be added to a wait list)

Tour Description
Come check out the new Sonic Studio126 at Amazon Music in Brooklyn and meet multi-Grammy Award-winning producer and engineer Mr Sonic. Take a tour of the various studios, writer's rooms and event spaces at Amazon Music's Brooklyn location, and finish the evening with a mixer to meet and network with other creatives and professionals.

Arrival Instructions:  Arrive at the South Lobby of 25 Kent Avenue, Brooklyn, NY, and check in by presenting your full name and ID at the front desk of security. You will be heading to the 7th floor (Amazon Music) for the event.
Tuesday October 8, 2024 6:00pm - 8:00pm EDT
Offsite
 