AES Show 2024 NY has ended
Exhibits+ badges provide access to the ADAM Audio Immersive Room, the Genelec Immersive Room, Tech Tours, and the presentations on the Main Stage.

All Access badges provide access to all content in the Program (Tech Tours still require registration).

View the Exhibit Floor Plan.
Tuesday, October 8
 

8:00am EDT

Attendee Registration
Tuesday October 8, 2024 8:00am - 6:00pm EDT
Crystal Palace South

9:00am EDT

Student Welcome Meeting
Tuesday October 8, 2024 9:00am - 9:30am EDT
Students! Join us so we can find out where you are from and tell you about all the exciting things happening at the convention.
Speakers
Ian Corbett

Coordinator & Professor, Audio Engineering & Music Technology, Kansas City Kansas Community College
Dr. Ian Corbett is the Coordinator and Professor of Audio Engineering and Music Technology at Kansas City Kansas Community College. He also owns and operates off-beat-open-hats LLC, providing live sound, recording, and audio production services to clients in the Kansas City area...

Angela Piva

Angela Piva, Audio Pro/Audio Professor, highly skilled in all aspects of music & audio production, recording, mixing and mastering with over 35 years of professional audio engineering experience and accolades. Known as an innovator in sound technology, and for contributing to the...
Tuesday October 8, 2024 9:00am - 9:30am EDT
1E08

9:00am EDT

NPR's Tiny Desk: A look back and a look forward
Tuesday October 8, 2024 9:00am - 10:00am EDT
NPR's Tiny Desk Concert Series has seen an increase in popularity post-pandemic and is now a must for up-and-coming as well as established artists. Neil Tevault started the concert series when Bob Boilen came to NPR's music studio with an idea to record an artist at his desk. Josh Newell, the current technical director of NPR Music, is working to keep the standards set at the highest level and move the concert series into the future.

Neil will present a couple of concerts from early in the series, looking at the small recording footprint that was employed: the very first concert, with Laura Gibson, which used a single microphone and a DI, and Bela Fleck, Edgar Meyer, and Zakir Hussain, who played around a single stereo shotgun microphone.

Josh will present a couple of recent concerts to show how everything has grown: the microphones have moved closer and increased in number to capture the sound the Tiny Desk is known for. Josh will also look at what the team is doing to make the recording process work better and run more efficiently on concert day. We'll discuss Josh's vision for the sound of the Tiny Desk moving into the future.

We'll also leave ample time for Q&A.

Tuesday October 8, 2024 9:00am - 10:00am EDT
1E09

9:00am EDT

Personalized Spatial Audio for Accessibility in Xbox Gaming
Tuesday October 8, 2024 9:00am - 10:00am EDT
In this enlightening panel, technologists from the Xbox platform and creatives from Xbox studios will join us to discuss how they are driving audio innovation and game sound design towards the vision of gaming for everyone. The discussion will focus on how personalized spatial audio can foster inclusivity by accommodating the unique auditory profiles of different ethnicities, genders, and age groups of gamers. By integrating these personalized Head-Related Transfer Functions (HRTFs) into audio middleware, we aim to enhance the Xbox gaming experience for all gamers. This approach not only enriches the auditory landscape but also breaks down barriers, making immersive gaming a truly inclusive experience. Join us as we explore the future of spatial audio on Xbox, where every gamer is heard and can fully engage with the immersive worlds we create.
Speakers
Kaushik Sunder

VP of Engineering, Embody

Kapil Jain

CEO, Embody
We are Embody, and we believe in the power of AI to push the boundaries of immersive sound experience in gaming and entertainment. We are a team of data scientists, audio engineers, musicians and gamers who are building an ecosystem of tools and technologies for immersive entertainment...

Robert Ridihalgh

Technical Audio Specialist, Microsoft
A 33-year veteran of the games industry, Robert is an audio designer, composer, integrator, voice expert, and programmer with a passion for future audio technologies and audio accessibility.
Tuesday October 8, 2024 9:00am - 10:00am EDT
1E16

9:00am EDT

Adapting Immersive Microphone Techniques for Different Acoustic Spaces
Tuesday October 8, 2024 9:00am - 10:00am EDT
By now we know a lot about how immersive microphone arrays work, from fully coincident, to semi-coincident, to fully spaced, and how they can be used to make spectacular 3D productions. But what happens when we use the same recording techniques in very different acoustic spaces, with different-sized ensembles? Various high-resolution immersive recordings of acoustic music from around the globe, made in extraordinary acoustic spaces including Skywalker Scoring Stage, Zlin Concert Hall, Notre Dame University, NYU's Paulson Center, CNSO Studios, Smecky Studios, and other exceptional churches and concert halls, will be presented. Along with critical listening exercises, we will share microphone set-ups, multi-track recordings, videos, and photo documentation of the spaces with the participants.
Speakers
Paul Geluso

Director of the Music Technology Program, New York University

David Bowles

Owner, Swineshead Productions, LLC
David v.R Bowles formed Swineshead Productions, LLC as a classical recording production company in 1995. His recordings have been GRAMMY- and JUNO-nominated and critically acclaimed worldwide. His releases in 3D Dolby Atmos can be found on Avie, OutHere Music (Delos) and Navona labels. Mr...
Tuesday October 8, 2024 9:00am - 10:00am EDT
1E06

9:00am EDT

Best practices for wireless audio in live productions
Tuesday October 8, 2024 9:00am - 10:00am EDT
Wireless audio, both mics and in-ear monitors, has become essential in many live productions of music and theatre, but it is often fraught with uneasiness and uncertainty. The panel of presenters will draw on their varied experience and knowledge to show how practitioners can use best engineering practices to ensure the reliability and performance of their wireless mic and in-ear monitor systems.
Speakers
Bob Lee

Applications Engineer / Trainer, RF Venue, Inc.
I'm a fellow of the AES, an RF and electronics geek, and live audio specialist, especially in both amateur and professional theater. My résumé includes Sennheiser, ARRL, and a 27-year-long tenure at QSC. Now I help live audio practitioners up their wireless mic and IEM game. I play...

Henry Cohen

ASR Co-Chair, AES 151
Tuesday October 8, 2024 9:00am - 10:00am EDT
1E07

9:00am EDT

dLive Certification
Tuesday October 8, 2024 9:00am - 12:00pm EDT
In addition to long-form dLive Certification classes, training sessions will cover a variety of topics relevant to live sound engineers, including: Mixing Monitors from FOH; Vocal Processing; Groups, Matrices and DCAs; Active Dynamics; and Gain Staging.
 
The training will be led by industry veterans Michael Bangs and Jake Hartsfield. Bangs, whose career includes experience as a monitor engineer and production manager, has worked with A-list artists, including Aerosmith, Katy Perry, Tom Petty, Lynyrd Skynyrd and Kid Rock. Hartsfield is a seasoned live sound engineer, having mixed for artists like Vulfpeck, Ben Rector, Fearless Flyers, and more.

Sign up link: https://zfrmz.com/DmSlX5gyZCfjrJUHa6bV
Tuesday October 8, 2024 9:00am - 12:00pm EDT
1E05

9:30am EDT

An Industry Focused Investigation into Immersive Commercial Melodic Rap Production - Part Two
Tuesday October 8, 2024 9:30am - 9:50am EDT
In part one of this study, five professional mixing engineers were asked to create a Dolby Atmos 7.1.4 mix of the same melodic rap song adhering to the following commercial music industry specifications: follow the framework of the stereo reference, implement binaural distance settings, and conform to -18 LKFS and -1 dBTP loudness levels. An analysis of the mix sessions and post-mix interviews with the engineers revealed that they felt creatively limited in their approaches due to the imposed industry specifications. The restricted approaches were evident through the minimal applications of mix processing, automation, and traditional positioning of key elements in the completed mixes.
In part two of this study, the same mix engineers were asked to complete a second mix of the same song without any imposed limitations and were encouraged to approach the mix creatively. Intra-subject comparisons between the restricted and unrestricted mixes were explored to identify differences in element positioning, mix processing techniques, panning automation, loudness levels, and binaural distance settings. Analysis of the mix sessions and interviews showed that when no restrictions were imposed on their work, the mix engineers emphasized the musical narrative through more diverse element positioning, increased use of automation, and applications of additional reverb with characteristics that differed from the reverb in the source material.
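For readers who want to check their own renders against the spec above, here is a minimal Python sketch (assuming the pyloudnorm and soundfile packages are installed; the file name is hypothetical) that reports integrated loudness and an oversampled true-peak estimate:

```python
# Minimal loudness-compliance check against a -18 LKFS / -1 dBTP spec.
import numpy as np
import soundfile as sf
import pyloudnorm as pyln
from scipy.signal import resample_poly

data, rate = sf.read("binaural_fold_down.wav")  # hypothetical render

# Integrated loudness per ITU-R BS.1770 (pyloudnorm implements the gating).
meter = pyln.Meter(rate)
lkfs = meter.integrated_loudness(data)

# Approximate true peak by 4x oversampling before taking the maximum.
oversampled = resample_poly(data, up=4, down=1, axis=0)
dbtp = 20 * np.log10(np.max(np.abs(oversampled)))

print(f"Integrated loudness: {lkfs:.1f} LKFS (target -18)")
print(f"Approx. true peak:   {dbtp:.1f} dBTP (ceiling -1)")
```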
Moderators
Agnieszka Roginska

Professor, New York University
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented...
Speakers
Christal Jerez

Engineer, Christal's Sonic Lab
Christal Jerez is an audio engineer with experience recording, mixing and mastering music. After studying audio production at American University for her B.A. in Audio Production and at New York University for her Masters degree in Music Technology, she started working professionally...
Authors

Christal Jerez

Engineer, Christal's Sonic Lab
Christal Jerez is an audio engineer with experience recording, mixing and mastering music. After studying audio production at American University for her B.A. in Audio Production and at New York University for her Masters degree in Music Technology, she started working professionally...

Andrew Scheps

Owner, Tonequake Records
Andrew Scheps has worked with some of the biggest bands in the world: Green Day, Red Hot Chili Peppers, Weezer, Audioslave, Black Sabbath, Metallica, Linkin Park, Hozier, Kaleo and U2. He’s worked with legends such as Johnny Cash, Neil Diamond and Iggy Pop, as well as indie artists...

Hyunkook Lee

Professor, Applied Psychoacoustics Lab, University of Huddersfield
Tuesday October 8, 2024 9:30am - 9:50am EDT
1E03

9:30am EDT

Physiological measurement of the arousing effect of bass amplification in music
Tuesday October 8, 2024 9:30am - 10:00am EDT
Music's amazing ability to evoke emotions has been the focus of various scientific studies, with researchers testing how different musical structures or interpretations impact the emotions induced in the listener. However, in the context of amplified music, little is known about the influence of the sound reinforcement system. In this study, we investigate whether the amount of low-frequency amplification produced by a sound system impacts the listener's arousal. We organized two listening experiments in which we measured the skin conductance of participants while they listened to music excerpts with different levels of low-frequency amplification. Our results indicate that an increase in the level of bass is correlated with a small but measurable rise in electrodermal activity, which is in turn correlated with arousal. In addition, this effect seems to depend on the nature of the music.
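As a toy illustration of the kind of analysis described, the sketch below correlates bass level with mean electrodermal activity per trial; the arrays are synthetic placeholders, not the study's data:

```python
# Toy sketch: correlate bass gain with mean electrodermal activity (EDA).
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
bass_gain_db = np.tile([-6, 0, 6], 20)  # 3 bass conditions x 20 trials
# Synthetic EDA values with a small positive dependence on bass level.
eda_microsiemens = 2.0 + 0.02 * bass_gain_db + rng.normal(0, 0.1, bass_gain_db.size)

r, p = pearsonr(bass_gain_db, eda_microsiemens)
print(f"r = {r:.2f}, p = {p:.3g}")  # a small positive r mirrors the reported effect
```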
Moderators
Brett Leonard

Director of Music Technology Programs, University of Indianapolis
Speakers

Nicolas Epain

Application Research Engineer, L-Acoustics
Authors

Nicolas Epain

Application Research Engineer, L-Acoustics

Thomas Mouterde

Field application research engineer, L-Acoustics
Thomas Mouterde is a field application research engineer at L-Acoustics, a French manufacturer of loudspeakers, amplifiers, and signal processing devices. He is a member of the “Education and Scientific Outreach” department that aims at developing the education program of the...
Tuesday October 8, 2024 9:30am - 10:00am EDT
1E04

9:45am EDT

Introducing the 2025 AES AI and ML for Audio Conference and its New Format
Tuesday October 8, 2024 9:45am - 10:30am EDT
The upcoming 2025 AES International Conference on Artificial Intelligence and Machine Learning for Audio (AIMLA) aims to foster a collaborative environment where researchers and practitioners from academia and industry can converge to share their latest work in Artificial Intelligence (AI) and Machine Learning (ML) for Audio.

We want to advertise the upcoming AIMLA at the AES Show, to encourage early involvement and awareness from the AES community. To better accommodate the central themes of the conference, we propose new additions to the typical AES proceedings, such as challenges and long workshops, that can more appropriately showcase the rapidly growing state of the art. In this presentation, we plan to give an overview and a discussion space about the upcoming conference, and the changes we want to bring into play, tailored for AI/ML research communities, with references to successfully organized cases outside of AES. Finally, we propose a standardized template with guidelines for hosting crowdsourced challenges and presenting long workshops.

Challenges are a staple in the ML/AI community, providing a platform where specific problems are tackled by multiple teams who develop and submit models to address the given issue. These events not only spur competition but also encourage collaboration and knowledge sharing, ultimately driving forward the collective understanding and capabilities of the community.

Complementing the challenges, we introduce long-format workshops to exchange knowledge about emerging AI approaches in audio. These workshops can help develop novel approaches from the ground up and produce high-quality material for diffusion among participants. Both additions could help the conference become an exciting and beneficial event at the forefront of AI/ML for audio, as they intend to cultivate a setting where ideas can be exchanged effectively, drawing inspiration from established conferences such as ISMIR, DCASE, and ICASSP, which have successfully fostered AI/ML communities.

As evidenced by the recent AES International Symposium on AI and the Musician, we believe AI and ML will play an increasingly important role in audio and music engineering. To facilitate and standardize the procedures for featuring and conducting challenges and long-form workshops, we will present a complete guideline for hosting long-form workshops and challenges at AES conferences.

Our final goal is to promote the upcoming 2025 International Conference on AI and Machine Learning for Audio, generate a space to discuss the new additions and ideas, connect with interested parties, advertise and provide guidelines regarding the calls for crowd-sourced challenges and workshops, and ultimately get feedback from the AES as a whole to tailor the new conference to the requirements of both our AES and the AI/ML communities.
Speakers
Soumya Sai Vanka

PhD Researcher, Queen Mary University of London
I am a doctoral researcher at the Centre for Digital Music, Queen Mary University of London, under the AI and Music Centre for Doctoral Training Program. My research focuses on the design of user-centric, context-aware AI-based tools for music production. As a hobbyist musician and producer myself, I am interested in developing tools that can support creativity and collaboration resulting in emergence and novelty. I am also interested...

Franco Caspe

Student, Queen Mary University of London
I’m an electronic engineer, a maker, hobbyist musician and a PhD Student at the Artificial Intelligence and Music CDT at Queen Mary University of London. I have experience in development of real-time systems for applications such as communication, neural network inference, and DSP...

Brecht De Man

Head of Research, PXL University of Applied Sciences and Arts
Brecht is an audio engineer with a broad background comprising research, software development, management and creative practice. He holds a PhD from the Centre for Digital Music at Queen Mary University of London on the topic of intelligent software tools for music production, and...
Tuesday October 8, 2024 9:45am - 10:30am EDT
1E08

9:50am EDT

Investigation of spatial resolution of first and high order ambisonics microphones as capturing tool for auralization of real spaces in recording studios equipped with virtual acoustics systems
Tuesday October 8, 2024 9:50am - 10:10am EDT
This paper proposes a methodology for studying the spatial resolution of a collection of first-order and high-order ambisonic microphones when employed as a capturing tool of Spatial Room Impulse Responses (SRIRs) for virtual acoustics applications. In this study, the spatial resolution is defined as the maximum number of statistically independent mono impulse responses that can be extracted through beamforming techniques and used in multichannel convolution reverbs. The correlation of the responses is assessed as a function of the beam angle and frequency bands, adapted to the frequency response of the loudspeakers in use, with the aim of use in recording studios equipped with virtual acoustics systems that operate in the creation of the spatial impression of reverberation of real environments. The study examines the differences introduced by the physical characteristics of the microphones, the normalization methodologies of the spherical harmonics, and the number of spherical harmonics introduced in the encoding (ambisonic order). Preliminary results show that the correlation is inversely proportional to frequency as a function of wavelength.
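To make the core measurement concrete, here is a reduced first-order sketch (assuming ACN/SN3D "AmbiX" channel ordering; the SRIR array is hypothetical) that steers virtual cardioid beams at a B-format SRIR and reports the correlation between two beam directions:

```python
# Steer virtual cardioids at a first-order AmbiX SRIR and measure how
# correlated the extracted responses are as beam separation grows.
import numpy as np

def cardioid_beam(srir_wyzx, azimuth_rad):
    """Horizontal virtual cardioid from a first-order AmbiX (W, Y, Z, X) SRIR."""
    w, y, _, x = srir_wyzx
    # SN3D first order: cardioid = 0.5*W + 0.5*(X*cos(az) + Y*sin(az))
    return 0.5 * w + 0.5 * (x * np.cos(azimuth_rad) + y * np.sin(azimuth_rad))

def beam_correlation(srir_wyzx, az1, az2):
    a = cardioid_beam(srir_wyzx, az1)
    b = cardioid_beam(srir_wyzx, az2)
    return np.corrcoef(a, b)[0, 1]

# e.g. srir = np.load("srir_ambix.npy")  # shape (4, num_samples), hypothetical
# print(beam_correlation(srir, 0.0, np.pi / 3))
```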
Moderators
Agnieszka Roginska

Professor, New York University
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented...
Speakers
Gianluca Grazioli

McGill University, Montreal, Canada
Authors
Tuesday October 8, 2024 9:50am - 10:10am EDT
1E03

10:00am EDT

Exploring trends in audio mixes and masters: Insights from a dataset analysis
Tuesday October 8, 2024 10:00am - 10:30am EDT
We present an analysis of a dataset of audio metrics and aesthetic considerations about mixes and masters provided by the web platform MixCheck studio. The platform is designed for educational purposes, primarily targeting amateur music producers, and aimed at analysing their recordings prior to them being released. The analysis focuses on the following data points: integrated loudness, mono compatibility, presence of clipping and phase issues, compression and tonal profile across 30 user-specified genres. Both mixed (mixes) and mastered audio (masters) are included in the analysis, where mixes refer to the initial combination and balance of individual tracks, and masters refer to the final refined version optimized for distribution. Results show that loudness-related issues along with dynamics issues are the most prevalent, particularly in mastered audio. However, mastered audio presents better compression results than mixes alone. Additionally, results show that mastered audio has a lower percentage of stereo field and phase issues.
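Two of the checks mentioned, inter-channel correlation and mono fold-down loss, can be sketched in a few lines of Python; the file name is hypothetical and this is an illustration, not MixCheck studio's implementation:

```python
# Phase and mono-compatibility checks on a stereo file.
import numpy as np
import soundfile as sf

audio, rate = sf.read("master.wav")  # hypothetical, shape (n, 2)
left, right = audio[:, 0], audio[:, 1]

# Correlation near -1 signals phase problems; near +1 is mono-like.
phase_corr = np.corrcoef(left, right)[0, 1]

# RMS lost when summing to mono, in dB; large drops imply cancellation.
rms = lambda x: np.sqrt(np.mean(x**2))
mono_loss_db = 20 * np.log10(rms(0.5 * (left + right)) / rms(audio))

print(f"phase correlation: {phase_corr:+.2f}, mono loss: {mono_loss_db:.1f} dB")
```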
Moderators
Brett Leonard

Director of Music Technology Programs, University of Indianapolis
Speakers
David Ronan

CEO, RoEx
Authors
Tuesday October 8, 2024 10:00am - 10:30am EDT
1E04

10:10am EDT

A comparative study of volumetric microphone techniques and methods in a classical recording context
Tuesday October 8, 2024 10:10am - 10:30am EDT
This paper studies volumetric microphone techniques (i.e., using configurations of multiple Ambisonic microphones) in a classical recording context. A pilot study with expert opinions was designed to show its feasibility. Based on the findings from the pilot study, a trio recording of piano, violin, and cello was conducted in which six Ambisonic microphones formed a hexagon. Such a volumetric approach is believed to improve the sound characteristics; the recordings were processed with the SoundField by RØDE Ambisonic decoder and produced for a 7.0.4 loudspeaker system. A blinded subjective experiment was designed in which participants were asked to evaluate the volumetric hexagonal configuration, comparing it to a more traditional 5.0 immersive configuration and a single Ambisonic microphone, all of which were mixed with spot microphones. The results were quantitatively analyzed and revealed that the volumetric configuration is the most localized of all, but less immersive than the single Ambisonic microphone. No significant difference occurred in focus, naturalness, and preference. The analyses generalize across listeners, as the demographic backgrounds of the participants had no effect on the rated sound characteristics.
Moderators
Agnieszka Roginska

Professor, New York University
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented...
Speakers Authors
Parichat Songmuang

Studio Manager/PhD Student, New York University
Parichat Songmuang graduated from New York University with her Master of Music degree in Music Technology and Advanced Certificate in Tonmeister Studies. As an undergraduate, she studied for her Bachelor of Science in Electronic Media and Film with a concentration...

Paul Geluso

Director of the Music Technology Program, New York University
Tuesday October 8, 2024 10:10am - 10:30am EDT
1E03

10:15am EDT

For Future Use - Surround and Immersive from Familiar Sources
Tuesday October 8, 2024 10:15am - 11:15am EDT
In a perfect world, one would record projects in the perfect space, with musicians performing in perfect balance, with perfect results. But for us the perfect world doesn't exist, and we're presented with projects that were never intended for surround or immersive presentation. In this illustrated talk, we will play projects that were recorded (even in the last century!) and have been turned into successful immersive and surround recordings.
Speakers
Jim Anderson

Producer/Engineer, Anderson Audio New York

Ulrike Schwarz

Engineer/Producer, Co-Founder, Anderson Audio New York
Tuesday October 8, 2024 10:15am - 11:15am EDT
1E06

10:15am EDT

Adventures in Livestreaming
Tuesday October 8, 2024 10:15am - 11:45am EDT
During the Covid pandemic, streaming was the only way for musicians and artists to connect. As the pandemic subsided, many organizations continued to stream and livestream, and oftentimes it was the audio engineering team who were left to figure out how to produce video. Join this panel of experienced engineers and educators who found themselves learning how to produce compelling live concert video. The special challenges of livestreaming (i.e. streaming without a net) will be discussed, along with lessons learned. It is the intention of the panelists to provide encouragement to fellow engineers who are learning this new skillset themselves.
Speakers
Scott Burgess

Director, Audio and Media Production, Aspen Music Festival and School
Scott Burgess has worked on all facets of symphonic production. A graduate of Interlochen and the Cleveland Institute of Music, he has played bassoon and contrabassoon, sung with the Cleveland Orchestra Chorus, and produced, engineered, or edited numerous orchestral and chamber music...

Mary Mazurek

Audio Educator/Recording Engineer, University of Lethbridge
Tuesday October 8, 2024 10:15am - 11:45am EDT
1E09

10:15am EDT

The Plugin Kitchen: Coding a custom effect and putting it in the mix
Tuesday October 8, 2024 10:15am - 11:45am EDT
We are excited to present an educational workshop centered around the creation of custom audio plugins in Matlab. This hands-on session is designed to introduce both students and educators to the endless possibilities of plugin development, while also encouraging participation in AES student competitions and fostering increased membership.

The workshop, titled “The Plugin Kitchen: Coding a custom effect and putting it in the mix,” will offer a unique blend of real-time coding and practical application. Participants will be guided through the process of developing a comprehensive channel strip plugin, incorporating an EQ, a compressor, and a reverb effect.

Our session will begin with an introduction to audio plugins and their importance in modern music production. We will then dive into the hands-on coding segments:

1. Multi-band Parametric EQ:
Participants will learn to implement a parametric EQ with multiple bands, adjusting parameters such as frequency and gain.

2. Compressor:
We will use Matlab’s built-in functions to create a compressor, covering essential concepts like threshold, ratio, attack, and release times. Attendees will see the immediate impact of these parameters on audio dynamics.

3. Reverb:
The final coding segment will focus on adding a reverb effect, including experimenting with decay time and pre-delay.

The workshop will culminate in a live demonstration by renowned engineer Paul Womack, showcasing the practical application of the developed plugin in a mixing session. This will illustrate the real-world benefits and versatility of the plugin, inspiring participants to explore further and engage with AES competitions.

Throughout the session, we emphasize interaction and practical application, ensuring that participants leave with both theoretical knowledge and a functional plugin they can continue to develop. Join us to unlock the potential of Matlab for audio plugin development and take a step towards innovative audio engineering.
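The workshop itself codes in Matlab; as a language-agnostic taste of the compressor segment, here is a minimal Python sketch of a feed-forward compressor with threshold, ratio, and one-pole attack/release smoothing of the gain:

```python
# Feed-forward compressor sketch: static gain curve plus smoothed gain.
import numpy as np

def compress(x, rate, threshold_db=-20.0, ratio=4.0, attack_ms=5.0, release_ms=100.0):
    eps = 1e-12
    level_db = 20 * np.log10(np.abs(x) + eps)
    # Static curve: reduce level above threshold according to the ratio.
    over = np.maximum(level_db - threshold_db, 0.0)
    target_gain_db = -over * (1.0 - 1.0 / ratio)
    # Smooth the gain with separate attack and release time constants.
    a_att = np.exp(-1.0 / (rate * attack_ms / 1000.0))
    a_rel = np.exp(-1.0 / (rate * release_ms / 1000.0))
    gain_db = np.empty_like(x)
    g = 0.0
    for n, tgt in enumerate(target_gain_db):
        a = a_att if tgt < g else a_rel  # falling gain = attack phase
        g = a * g + (1.0 - a) * tgt
        gain_db[n] = g
    return x * (10.0 ** (gain_db / 20.0))
```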
Speakers
Christoph Thompson

Director of Music Media Production, Ball State University
Christoph Thompson is vice-chair of the AES audio education committee. He is the chair of the AES Student Design Competition and the Matlab Plugin Design Competition. He is the director of the music media production program at Ball State University. His research topics include audio...

Chris Bennett

Professor, University of Miami

Paul Womack

Record Producer/Recording Engineer
A producer, engineer and sonic artist, Paul "Willie Green" Womack has built a discography boasting names such as Armand Hammer, Wiz Khalifa, The Alchemist, The Roots, Billy Woods, ELUCID and many more, and established himself as one of the top names in independent Hip-Hop & R&B. Currently...
Tuesday October 8, 2024 10:15am - 11:45am EDT
1E07

10:15am EDT

The Anatomy of a Recording Session: Where Audio Technology and Musical Creativity Intersect (Part III)
Tuesday October 8, 2024 10:15am - 11:45am EDT
Abstract: Humans and machines interact to create new and interesting music content. Here we look at video excerpts from a particular recording session where the behind-the-scenes action comes to the forefront. Artists working with each other, artists working with the producer and engineer, and the influence (good or bad) of the technology with which they work will all be discussed during the workshop.

Summary:
The workshop will center on a discussion of using recording studio sessions to study creativity as a collaborative, but often complex and subtle practice. Our interdisciplinary team of recording engineers/producers and musicologists aims to understand how the interactions of musicians, engineers, recording technology, and musical instruments shape a recording’s outcome. Statements by participant-observers and the analysis of video footage from recording sessions will provide the starting point for discussions. In addition to first-hand recollections by members of our team, we are interviewing musicians who participated in the sessions. The workshop will focus on both musical interactions and on the interpersonal dynamics that affect the flow of various contributions and ideas during the recording process. Technology used also plays a role and will be analyzed as part of the workshop. The first workshop of this kind was a huge success in Helsinki at the 154th AES Convention. The room was packed, and we had an engaging discussion between panelists and audience members. Part II took place last week in Madrid (156th Convention), with many attendees saying it was “one of the highlights of the convention”. The workshop room was quite full, even though it was in the last timeslot of the last day. For this third workshop we plan on looking under the hood of a recording session involving a funk band with full horn section, and three lead singers. We plan to dig deeply into analyzing underlying events that percolate over time; the “quiet voices” that subtly influence the outcome of a recording session.
Along with our regular team of experts we are very excited to invite a guest panelist this time around – a venerable expert in music production, collaboration, and perception, Dr. Susan Rogers. This workshop was also proposed for AES 155th New York last fall but was declined due to lack of presentation space. Our requirements are simply a PowerPoint video presentation with stereo audio playback.
Speakers
Richard King

Professor, McGill University
Richard King is an Educator, Researcher, and a Grammy Award winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School...

Lisa Barg

Associate Professor, McGill University
Lisa Barg is Associate Professor of Music History and Musicology at the Schulich School of Music at McGill University and Associate Dean of Graduate Studies. She has published articles on race and modernist opera, Duke Ellington, Billy Strayhorn, Melba Liston and Paul Robeson. She...

David Brackett

Professor, McGill University
David Brackett is Professor of Music History and Musicology at the Schulich School of Music of McGill University, and Canada Research Chair in Popular Music Studies. His publications include Interpreting Popular Music (2000), The Pop, Rock, and Soul Reader: Histories and Debates...

Susan Rogers

Professor, Berklee Online
Susan Rogers holds a doctoral degree in experimental psychology from McGill University (2010). Prior to her science career, Susan was a multiplatinum-earning record producer, engineer, mixer and audio technician. She is best known for her work with Prince during his peak creative...

George Massenburg

Associate Professor of Sound Recording, Massenburg Design Works
George Y. Massenburg is a Grammy award-winning recording engineer and inventor. Working principally in Baltimore, Los Angeles, Nashville, and Macon, Georgia, Massenburg is widely known for submitting a paper to the Audio Engineering Society in 1972 regarding the parametric equali...
Tuesday October 8, 2024 10:15am - 11:45am EDT
1E16

10:15am EDT

SEIDS guide to Building Sustainable Businesses for Music Creators
Tuesday October 8, 2024 10:15am - 11:45am EDT
In this workshop, you'll learn how to turn your love for music into a successful business. Acclaimed music producer Sabrina Seidman, aka SEIDS, whose tutorials have garnered thousands of views online, will talk about how to decide what you want to do, find the right clients, and make the products or services they need. You'll also learn how to win clients and create your own chances to succeed. By the end of the workshop, you'll have the tools to start building a music career that makes you happy and earns you money.
Speakers
Tuesday October 8, 2024 10:15am - 11:45am EDT
1E15

10:30am EDT

Bestiari: a hypnagogic experience created by combining complementary state-of-the-art spatial sound technologies, Catalan Pavilion, Venice Art Biennale 2024
Tuesday October 8, 2024 10:30am - 10:50am EDT
Bestiari, by artist Carlos Casas, is a spatial audio installation created as the Catalan pavilion for the 2024 Venice Art Biennale. The installation was designed for ambulant visitors and the use of informal seating arrangements distributed throughout the reproduction space, so the technical installation design did not focus on listeners’ presence in a single “sweet-spot”. While high-quality conventional spatial loudspeaker arrays typically provide excellent surround-sound experiences, the particular challenge of this installation was to reach into the proximate space of individual, dispersed and mobile listeners, rather than providing an experience that was only peripherally enveloping. To that end, novel spatial audio workflows and combinations of reproduction technologies were employed, including: High-order Ambisonic (HoA), Wavefield Synthesis (WFS), beamforming icosahedral (IKO), directional/parametric ultrasound, and infrasound. The work features sound recordings made for each reproduction technology, e.g., ambient Ambisonic soundfields recorded in Catalan national parks combined with mono and stereo recordings of specific insects in that habitat simultaneously projected via the WFS system. In-situ production provided an opportunity to explore the differing attributes of the reproduction devices and their interactions with the acoustical characteristics of the space – a concrete and brick structure with a trussed wooden roof, built in the late 1800s for the Venetian shipping industry. The practitioners’ reflections on this exploration, including their perception of the capabilities of this unusual combination of spatial technologies, are presented. Design, workflows and implementation are detailed.
Moderators
Agnieszka Roginska

Professor, New York University
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented...
Speakers
Craig Cieciura

Research Fellow, University of Surrey
Craig graduated from the Music and Sound Recording (Tonmeister) course at The University of Surrey in 2016. He then completed his PhD at the same institution in 2022. His PhD topic concerned reproduction of object-based audio in the domestic environment using combinations of installed...
Authors

Craig Cieciura

Research Fellow, University of Surrey
Craig graduated from the Music and Sound Recording (Tonmeister) course at The University of Surrey in 2016. He then completed his PhD at the same institution in 2022. His PhD topic concerned reproduction of object-based audio in the domestic environment using combinations of installed...
Tuesday October 8, 2024 10:30am - 10:50am EDT
1E03

10:30am EDT

Audience Effect in the Low-Frequency Range, Part 2: Impact on Time Alignment of Loudspeaker Systems
Tuesday October 8, 2024 10:30am - 11:00am EDT
A sound reinforcement system typically combines a full-range system with a subwoofer system to deliver a consistent frequency bandwidth. The two systems must be time-aligned, which is usually done without an audience. This paper investigates the impact of the audience on the time alignment of loudspeaker systems at low frequencies. The study demonstrates, through on-site measurements and simulations, that the audience significantly affects sound propagation. The research highlights the greater phase shift observed with ground-stacked subwoofers compared to flown systems due to the audience’s presence, requiring adjustment of the system time alignment with the audience in place when flown and ground-stacked sources are used together. Moreover, in this case, the results demonstrate the lower quality of the summation with the audience even after the alignment adjustment. Lastly, recommendations for system design and calibration are proposed.
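The underlying alignment measurement can be illustrated with a short sketch: estimate the delay between main-system and subwoofer impulse responses at the mix position by cross-correlating them in the crossover band (the IR arrays and sample rate are hypothetical):

```python
# Band-limited cross-correlation delay estimate between two measured IRs.
import numpy as np
from scipy.signal import butter, sosfiltfilt, correlate

def alignment_delay_ms(ir_main, ir_sub, rate, band=(40.0, 120.0)):
    sos = butter(4, band, btype="bandpass", fs=rate, output="sos")
    a = sosfiltfilt(sos, ir_main)
    b = sosfiltfilt(sos, ir_sub)
    lag = np.argmax(correlate(a, b, mode="full")) - (len(b) - 1)
    return 1000.0 * lag / rate  # positive: sub arrives earlier than mains

# Comparing empty-room vs with-audience IRs would expose the phase-shift
# effect the study quantifies.
```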
Moderators
Brett Leonard

Director of Music Technology Programs, University of Indianapolis
Speakers

Thomas Mouterde

Field application research engineer, L-Acoustics
Thomas Mouterde is a field application research engineer at L-Acoustics, a French manufacturer of loudspeakers, amplifiers, and signal processing devices. He is a member of the “Education and Scientific Outreach” department that aims at developing the education program of the...
Authors

Thomas Mouterde

Field application research engineer, L-Acoustics
Thomas Mouterde is a field application research engineer at L-Acoustics, a French manufacturer of loudspeakers, amplifiers, and signal processing devices. He is a member of the “Education and Scientific Outreach” department that aims at developing the education program of the...

Nicolas Epain

Application Research Engineer, L-Acoustics
Tuesday October 8, 2024 10:30am - 11:00am EDT
1E04

10:45am EDT

Generative AI For Novel Audio Content Creation
Tuesday October 8, 2024 10:45am - 11:45am EDT
The presence and hype associated with generative AI across most forms of recorded media have become undeniable realities. Generative AI tools are becoming increasingly more prevalent, with applications ranging from conversational chatbots to text-to-image generation. More recently, we have witnessed an influx of generative audio models which have the potential of disrupting how music may be created in the very near future. In this talk, we will highlight some of the core technologies that enable novel audio content creation for music production, reviewing some seminal text-to-music works from the past year. We will then delve deeper into common research themes and subsequent works which intend to map these technologies closer to musicians’ needs.


We will begin the talk by outlining a common framework underlying the generative audio models that we will touch on, consisting of an audio synthesizer “back-end” paired with a latent representation modeling “front-end.” Accordingly, we will overview two primary forms of back-ends in the forms of neural audio codecs and variational auto-encoders (with examples), and illustrate how they pair naturally with transformer language model (LM) and latent diffusion model (LDM) front-ends, respectively. Furthermore, we will briefly touch on CLAP and T5 embeddings as conditioning signals that enable text as an input interface, and explain the means by which they are integrated into modern text-to-audio systems.
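As a schematic of that back-end/front-end split (not any specific system's code), the following PyTorch-style skeleton pairs a hypothetical codec token stream with a small transformer LM conditioned on a text embedding:

```python
# Schematic of the codec back-end / language-model front-end pattern.
# `codec` and the text embedding are hypothetical stand-ins, not a real API;
# causal masking and many details are omitted for brevity.
import torch
import torch.nn as nn

class TokenLM(nn.Module):
    def __init__(self, vocab_size, dim=512, text_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.text_proj = nn.Linear(text_dim, dim)  # e.g. a CLAP/T5 embedding
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=6)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, tokens, text_embedding):
        h = self.embed(tokens) + self.text_proj(text_embedding).unsqueeze(1)
        return self.head(self.backbone(h))  # next-token logits

# Training sketch: the codec (back-end) encodes audio to discrete tokens,
# the LM (front-end) learns to predict them from text; generation reverses it.
# tokens = codec.encode(waveform)                      # hypothetical codec
# logits = lm(tokens[:, :-1], text_emb)
# loss = nn.functional.cross_entropy(
#     logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1))
```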

Next, we will review some seminal works that have been released within the past year(s) (primarily in the field of text-to-music generation), and roughly categorize them according to the common framework that we have built up thus far. At the time of writing this proposal, we would naturally consider MusicLM/FX (LM), MusicGen (LM), Stable Audio (LDM), etc. as exemplary candidates for review. We will contextualize these new capabilities in terms of what they can enable for music production and opportunities for future improvements. Accordingly, we will draw on some subsequent works that intend to meet musicians a bit closer to the creative process. At the time of writing this proposal, this may include but is not limited to ControlNet (LDM), SingSong (LM), StemGen (LM), VampNet (LM), as well as our own previous work, as time permits. We will cap off our talk by providing some perspectives on what AI researchers could stand to understand about music creators, and what musicians could stand to understand about scientific research. Time permitting, we may allow ourselves to conduct a live coding demonstration whereby we exemplify constructing, training, and inferring audio examples from a generative audio model on a toy data example leveraging several prevalent open source libraries.


We hope that such a talk would be both accessible and fruitful for technologists and musicians alike. It would assume no background knowledge in generative modeling, and may perhaps assume only the most notional conception as to how machine learning works. The goal of this talk would be for the audience at large to walk out with a rough understanding of the underlying technologies and challenges associated with novel audio content creation using generative AI.
Tuesday October 8, 2024 10:45am - 11:45am EDT
1E08

10:50am EDT

Influence of Dolby Atmos versus Stereo Formats on Narrative Engagement: A Comparative Study Using Physiological and Self-Report Measures
Tuesday October 8, 2024 10:50am - 11:10am EDT
As spatial audio technology rapidly evolves, the conversation around immersion becomes ever more relevant, particularly in how these advancements enhance the creation of compelling sonic experiences. However, immersion is a complex, multidimensional construct, making it challenging to study in its entirety. This paper narrows the focus to one particular dimension, narrative engagement, to explore how it shapes the immersive experience. Specifically, we investigate whether the multichannel audio format, here 7.1.4, enhances narrative engagement compared to traditional stereo storytelling. Participants were exposed to two storytelling examples: one in an immersive format and another in a stereo fold-down. Physiological responses were recorded during listening sessions, followed by a self-report survey adapted from the Narrative Engagement Scale. The lack of significant differences between the two formats in both subjective and objective measures is discussed in the context of existing studies.
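For illustration, a stereo fold-down of a 7.1.4 bed might look like the sketch below; the channel order and -3 dB gains are common conventions assumed here, not necessarily the study's exact downmix:

```python
# Fold a 7.1.4 bed to stereo with -3 dB on center, surrounds, and heights.
import numpy as np

DB_M3 = 10 ** (-3 / 20)
# Assumed order: L R C LFE Lss Rss Lrs Rrs Ltf Rtf Ltr Rtr (LFE omitted here)
LEFT  = np.array([1, 0, DB_M3, 0, DB_M3, 0, DB_M3, 0, DB_M3, 0, DB_M3, 0])
RIGHT = np.array([0, 1, DB_M3, 0, 0, DB_M3, 0, DB_M3, 0, DB_M3, 0, DB_M3])

def fold_down(bed_714):
    """bed_714: float array of shape (num_samples, 12); returns (num_samples, 2)."""
    return np.stack([bed_714 @ LEFT, bed_714 @ RIGHT], axis=1)
```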
Moderators
Agnieszka Roginska

Professor, New York University
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented...
Speakers
Hyunkook Lee

Professor, Applied Psychoacoustics Lab, University of Huddersfield
Authors

Hyunkook Lee

Professor, Applied Psychoacoustics Lab, University of Huddersfield
Tuesday October 8, 2024 10:50am - 11:10am EDT
1E03

11:00am EDT

Part of the Band: Virtual Acoustic Space as a Participant in Musical Performance
Tuesday October 8, 2024 11:00am - 11:30am EDT
We detail a real-time application of active acoustics used to create a shared virtual environment over a closed audio network as a research-creation project exploring the concept of room participation in musical performance. As part of a concert given in the Immersive Media Lab at McGill University, musicians and audience members were located in a virtual acoustic environment while a second audience was located in an adjacent but acoustically isolated space on the same audio network. Overall, the blending of computer-generated and acoustic sources created a specific use case for virtual acoustics while the immersive capture and distribution method examined an avenue for producing a real-time shared experience. Future work in this area includes audio networks with multiple virtual acoustic environments.
Moderators
Brett Leonard

Director of Music Technology Programs, University of Indianapolis
Speakers
Kathleen Ying-Ying Zhang

PhD Candidate, McGill University
Ying-Ying Zhang is a music technology researcher and sound engineer. She is currently a PhD candidate at McGill University in the Sound Recording program where her research focuses on musician-centered virtual acoustic applications in recording environments. She received her Masters...
Authors

Kathleen Ying-Ying Zhang

PhD Candidate, McGill University
Ying-Ying Zhang is a music technology researcher and sound engineer. She is currently a PhD candidate at McGill University in the Sound Recording program where her research focuses on musician-centered virtual acoustic applications in recording environments. She received her Masters...

Mihai-Vlad Baran

McGill University

Richard King

Professor, McGill University
Richard King is an Educator, Researcher, and a Grammy Award winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School...
Tuesday October 8, 2024 11:00am - 11:30am EDT
1E04

11:10am EDT

Creation of representative head-related impulse responses for binaural rendering of moving audio objects
Tuesday October 8, 2024 11:10am - 11:30am EDT
To achieve highly realistic 3D audio reproduction in virtual reality (VR) or augmented reality (AR) through binaural rendering, we must address the considerable computational complexity involved in convolving head-related impulse responses (HRIRs). To reduce this complexity, an algorithm is proposed where audio signals are distributed to pre-defined representative directions through panning. Only the distributed signals are then convolved with the corresponding HRIRs. In this study, we explored a method for generating representative HRIRs through learning, utilizing a full-sphere HRIR set. This approach takes into account smooth transitions and minimal degradation introduced during rendering, for both moving and static audio objects. Compared with conventional panning, the proposed method reduces average distortion by approximately 47% while maintaining the runtime complexity of the rendering.
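A minimal sketch of the conventional-panning baseline (not the paper's learned method) might look like this; the representative azimuth set and HRIR dictionary are hypothetical:

```python
# Pan each object between its two nearest representative azimuths, then
# convolve only the bus signals with their HRIRs.
import numpy as np
from scipy.signal import fftconvolve

REP_AZ = np.array([-90.0, -30.0, 30.0, 90.0])  # representative directions (deg)

def render(objects, hrirs):
    """objects: list of (signal, azimuth_deg); hrirs: {az: (left_ir, right_ir)}."""
    buses = {az: 0.0 for az in REP_AZ}
    for sig, az in objects:
        idx = np.argsort(np.abs(REP_AZ - az))[:2]        # two nearest directions
        lo, hi = sorted(REP_AZ[idx])
        frac = 0.0 if hi == lo else (az - lo) / (hi - lo)
        g_lo, g_hi = np.cos(frac * np.pi / 2), np.sin(frac * np.pi / 2)
        buses[lo] = buses[lo] + g_lo * sig               # power-preserving pair
        buses[hi] = buses[hi] + g_hi * sig
    # Convolve only non-empty buses; HRIR count stays fixed per frame.
    out_l = sum(fftconvolve(b, hrirs[az][0]) for az, b in buses.items() if np.ndim(b))
    out_r = sum(fftconvolve(b, hrirs[az][1]) for az, b in buses.items() if np.ndim(b))
    return np.stack([out_l, out_r], axis=0)
```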
Moderators
Agnieszka Roginska

Professor, New York University
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented...
Speakers Authors
Masayuki Nishiguchi

Professor, Akita Prefectural University
Masayuki Nishiguchi received his B.E., M.S., and Ph.D. degrees from Tokyo Institute of Technology, University of California Santa Barbara, and Tokyo Institute of Technology, in 1981, 1989, and 2006 respectively. He was with Sony Corporation from 1981 to 2015, where he was involved...
Tuesday October 8, 2024 11:10am - 11:30am EDT
1E03

11:30am EDT

Quantifying the Impact of Head-Tracked Spatial Audio on Common User Auditory Experiences using Facial Microexpressions
Tuesday October 8, 2024 11:30am - 11:50am EDT
The study aims to enhance the understanding of how Head Tracked Spatial Audio technology influences both emotional responses and immersion levels among listeners. By employing micro facial gesture recognition technology, it quantifies the depth of immersion and the intensity of emotional responses elicited by various types of binaural content, measuring categories such as Neutral, Happy, Sad, Angry, Surprised, Scared, Disgusted, Contempt, Valence, and Arousal. Subjects were presented with a randomized set of audio stimuli consisting of stereo music, stereo speech, and 5.1 movie content. Each audio piece lasted 15 seconds, and the Spatial Audio processing was randomly switched on or off throughout the experiment. FaceReader software continuously detected the subjects' facial microexpressions. Statistical analysis was conducted using R software, applying Granger causality tests in time series, t-tests, and the p-value criterion for hypothesis validation. After consolidating the records of 78 participants, the final database consisted of 212,862 unique data points. With 95% confidence, it was determined that the average level of "Arousal" is significantly higher when Head Tracked Spatial Audio is activated compared to when it is deactivated, suggesting that HT technology increases the emotional arousal of audio listeners. Regarding the happiness reaction, the highest levels were recorded in mode 5 (HT on and Voice), with an average of 0.038, while the lowest levels were detected in mode 6 (HT off and Voice). Preliminary conclusions indicate that surprise effectively causes a decrease in neutrality, supporting the dynamic interaction between these emotional variables.
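The headline comparison reduces to a two-sample t-test; the sketch below shows it in Python with synthetic placeholder values standing in for the FaceReader "Arousal" series:

```python
# Compare mean Arousal with head tracking on vs. off (synthetic placeholders).
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)
arousal_ht_on = rng.normal(0.42, 0.1, 5000)   # placeholder samples, HT on
arousal_ht_off = rng.normal(0.40, 0.1, 5000)  # placeholder samples, HT off

t, p = ttest_ind(arousal_ht_on, arousal_ht_off, equal_var=False)
print(f"t = {t:.2f}, p = {p:.3g}")  # p < 0.05 would support the reported effect
```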
Moderators
Agnieszka Roginska

Professor, New York University
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented...
Speakers Authors
Tuesday October 8, 2024 11:30am - 11:50am EDT
1E03

11:30am EDT

Influence of Recording Technique and Ensemble Size on Apparent Source Width
Tuesday October 8, 2024 11:30am - 12:00pm EDT
Listeners' impression of aurally “seeing” the size of a performing entity is crucial to the success of both a concert hall and a reproduced sound field. Previous studies have looked at how different concert halls with different lateral reflections affect apparent source width. Yet the perceptual effects of different source distributions, captured with different recording techniques, on apparent source width are not well understood. This study explores how listeners perceive the width of an orchestra using four stereo recording techniques, one binaural recording technique, and three wave field synthesis ensemble settings. Subjective experiments were conducted using stereo loudspeakers and headphones to play back the recorded clips, asking listeners to rate the perceived width of the sound source. Results show that recording techniques greatly influence how wide an orchestra is perceived to be. The primary mechanism used to judge auditory spatial impression differs between stereo loudspeaker and headphone listening. When a western classical symphony is recorded and reproduced by two-channel stereophony, changes in instrument positions that increase or reduce the physical source width do not lead to an obvious increase or reduction in the spatial impression of the performing entity.
Moderators
Brett Leonard

Director of Music Technology Programs, University of Indianapolis
Speakers Authors
Tuesday October 8, 2024 11:30am - 12:00pm EDT
1E04

11:50am EDT

Investigating the Role of Customized Interaural Time Differences on First-Person Shooter Gaming Performance
Tuesday October 8, 2024 11:50am - 12:10pm EDT
Binaural listening with personalized Head-Related Transfer Functions (HRTFs) is known to enhance a listener's auditory localization in virtual environments, including gaming. However, the methods for achieving personalized HRTFs are often inaccessible for average game players due to measurement complexity and cost. This study explores a simplified approach to improving game performance, particularly in First-Person Shooter (FPS) games, by optimizing Interaural Time Difference (ITD). Recognizing that horizontal localization is particularly important for identifying opponent positions in FPS games, this study hypothesizes that optimizing ITD alone may be sufficient for better game performance, potentially alleviating the need for full HRTF personalization. To test this hypothesis, a simplified FPS game environment was developed in Unity. Participants performed tasks to detect sound positions under three HRTF conditions: MIT-KEMAR, Steam Audio's default HRTF, and the proposed ITD optimization method. The results indicated that our proposed method significantly reduced players' response times compared to the other HRTF conditions. These findings allow players to improve their gaming performance within FPS games through simplified HRTF optimization, broadening accessibility to optimized HRTFs for a wider range of game users.
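The kind of ITD model being optimized can be sketched with the Woodworth spherical-head approximation, where head radius is the single personalized parameter; the values and application below are illustrative assumptions, not the study's implementation:

```python
# Woodworth spherical-head ITD, applied as a simple channel delay.
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def woodworth_itd(azimuth_rad, head_radius_m=0.0875):
    """ITD in seconds for a horizontal source (valid for |azimuth| <= pi/2)."""
    return (head_radius_m / SPEED_OF_SOUND) * (azimuth_rad + np.sin(azimuth_rad))

def apply_itd(mono, rate, azimuth_rad, head_radius_m=0.0875):
    delay = int(round(abs(woodworth_itd(azimuth_rad, head_radius_m)) * rate))
    delayed = np.concatenate([np.zeros(delay), mono])[: len(mono)]
    # Positive azimuth = source on the right, so the left ear gets the delay.
    return np.stack([delayed, mono] if azimuth_rad > 0 else [mono, delayed])
```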
Moderators
Agnieszka Roginska

Professor, New York University
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented...
Speakers
Sungjoon Kim

Research Intern, Korea Advanced Institute of Science and Technology
Authors

Sungjoon Kim

Research Intern, Korea Advanced Institute of Science and Technology

Rai Sato

Ph.D. Student, Korea Advanced Institute of Science and Technology
Rai Sato (佐藤 来) is currently pursuing a PhD at the Graduate School of Culture Technology at the Korea Advanced Institute of Science and Technology. He holds a Bachelor of Music from Tokyo University of the Arts, where he specialized in immersive audio recording and psychoacoustics...
Tuesday October 8, 2024 11:50am - 12:10pm EDT
1E03

12:00pm EDT

Opening & Awards Ceremony
Tuesday October 8, 2024 12:00pm - 12:45pm EDT
Join the AES Committee and Chairs as we celebrate award recipients from the past year. The Opening and Awards will be followed by the keynote speech by Ebonie Smith.
Speakers
Leslie Gaston-Bird

Owner, Mix Messiah Productions
Leslie Gaston-Bird (AMPS, MPSE) is author of the book "Women in Audio", part of the AES Presents series and published by Focal Press (Routledge). She is a voting member of the Recording Academy (The Grammys®) and its P&E (Producers and Engineers) Wing. Currently, she is a freelance...

Gary Gottlieb

AES President-Elect, Mendocino College
President-Elect, Co-Chair of the Events Coordination Committee, Chair of the Conference Policy Committee, and former Vice President of the Eastern Region, US and Canada; AES Fellow, Engineer, Author, Educator and Guest Speaker Gary Gottlieb refers to himself as a music generalist...

Michael Hagen

System Administrator and Studio Manager, New York University - Clive Davis Institute
Michael Hagen is the driving force behind the technical operations at the Clive Davis Institute at New York University, where he oversees the maintenance and continual improvement of the institute’s advanced production facilities. With extensive hands-on experience in recording...

Jeanne Montalvo

Engineer/Producer, Self
Jeanne Montalvo is a Grammy-nominated audio engineer and award-winning radio producer. She was selected amongst thousands of applicants as the 2018 EQL resident at Spotify Studios and Electric Lady Studios in New York City, assisting in the recording process for artists like John...
Tuesday October 8, 2024 12:00pm - 12:45pm EDT
Stage

12:45pm EDT

Keynote: Ebonie Smith
Tuesday October 8, 2024 12:45pm - 1:45pm EDT
Ebonie Smith is a celebrated music producer, audio engineer, and singer-songwriter, based in the vibrant hub of Los Angeles. As a prominent figure in the industry, she currently holds the esteemed roles of senior audio engineer and producer at Atlantic Records. Ebonie's remarkable portfolio features notable credits, including the Broadway cast album of Hamilton, Janelle Monae's groundbreaking Dirty Computer, and Cardi B's chart-topping Invasion Of Privacy.

Notably, Ebonie serves as the Co-Chair of the Producers & Engineers Wing of The Recording Academy, underscoring her dedication to advancing excellence in music production. Beyond her professional achievements, she is the visionary founder and president of Gender Amplified, Inc., a nonprofit organization committed to celebrating and empowering women music producers.

Ebonie's educational foundation includes a master's degree in music technology from New York University and an undergraduate degree from Barnard College, Columbia University, solidifying her position as a distinguished leader in the music industry.
Speakers
avatar for Ebonie Smith

Ebonie Smith

Ebonie Smith is a celebrated music producer, audio engineer, and singer-songwriter, based in the vibrant hub of Los Angeles. As a prominent figure in the industry, she currently holds the esteemed roles of senior audio engineer and producer at Atlantic Records. Ebonie's remarkable... Read More →
Tuesday October 8, 2024 12:45pm - 1:45pm EDT
Stage

1:00pm EDT

Mixing Monitors from FOH
Tuesday October 8, 2024 1:00pm - 2:00pm EDT
In addition to long-form dLive Certification classes, training sessions will cover a variety of topics relevant to live sound engineers, including: Mixing Monitors from FOH; Vocal Processing; Groups, Matrices and DCAs; Active Dynamics; and Gain Staging.
 
The training will be led by industry veterans Michael Bangs and Jake Hartsfield. Bangs, whose career includes experience as a monitor engineer and production manager, has worked with A-list artists, including Aerosmith, Katy Perry, Tom Petty, Lynyrd Skynyrd and Kid Rock. Hartsfield is a seasoned live sound engineer, having mixed for artists like Vulfpeck, Ben Rector, Fearless Flyers, and more.

Sign up link: https://zfrmz.com/DmSlX5gyZCfjrJUHa6bV
Tuesday October 8, 2024 1:00pm - 2:00pm EDT
1E05

1:00pm EDT

Exhibit Hall
Tuesday October 8, 2024 1:00pm - 6:00pm EDT
Step into the heart of innovation at the Audio Engineering Society’s Annual Conference Exhibit Hall. This dynamic space brings together leading companies and cutting-edge technologies from across the audio engineering industry. Attendees will have the opportunity to explore the latest advancements in audio equipment, software, and services, engage with industry experts, and discover new solutions to enhance their projects. Whether you’re looking to network with professionals, gain insights from live demonstrations, or simply stay ahead of the curve, the Exhibit Hall is the place to be. Don’t miss this chance to immerse yourself in the future of audio engineering!
Tuesday October 8, 2024 1:00pm - 6:00pm EDT
Exhibit Hall

1:45pm EDT

AES Ice Cream Social
Tuesday October 8, 2024 1:45pm - 2:45pm EDT
Tuesday October 8, 2024 1:45pm - 2:45pm EDT
AES Membership Booth

2:00pm EDT

Towards prediction of high-fidelity earplug subjective ratings using acoustic metrics
Tuesday October 8, 2024 2:00pm - 2:30pm EDT
High-fidelity earplugs are used by musicians and live sound engineers to prevent hearing damage while allowing musical sounds to reach the eardrum without distortion. To determine objective methods for judging earplug fidelity in a manner similar to headphones or loudspeakers, a small sample of trained listeners was asked to judge the attenuation level and clarity of music through seven commercially available passive earplugs. These scores were then compared to acoustic/musical metrics measured in a laboratory. It was found that the Noise Reduction Rating (NRR) is strongly predictive of both attenuation and clarity scores, and that insertion loss flatness provides no advantage over NRR. A different metric measuring spectral flatness distortion appears to predict clarity independently of attenuation and will be the subject of further study.
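The paper's exact spectral-flatness-distortion metric is not spelled out in this abstract; the following sketch shows one plausible formulation, comparing the spectral flatness of music with and without the earplug's insertion loss applied (all names are illustrative assumptions).

import numpy as np

def spectral_flatness(power_spectrum, eps=1e-12):
    # Geometric mean over arithmetic mean of the power spectrum (0..1).
    ps = np.maximum(power_spectrum, eps)
    return np.exp(np.mean(np.log(ps))) / np.mean(ps)

def flatness_distortion(open_ear, through_plug, nfft=8192):
    # Larger difference = more spectral coloration added by the earplug.
    f_open = spectral_flatness(np.abs(np.fft.rfft(open_ear, nfft)) ** 2)
    f_plug = spectral_flatness(np.abs(np.fft.rfft(through_plug, nfft)) ** 2)
    return abs(f_open - f_plug)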
Moderators Speakers
avatar for David Anderson

David Anderson

Assistant Professor, University of Minnesota Duluth
Authors
avatar for David Anderson

David Anderson

Assistant Professor, University of Minnesota Duluth
Tuesday October 8, 2024 2:00pm - 2:30pm EDT
1E03

2:00pm EDT

Fourier Paradoxes
Tuesday October 8, 2024 2:00pm - 2:30pm EDT
Fourier theory is ubiquitous in modern audio signal processing. However, this framework is often at odds with our intuitions about audio signals. Strictly speaking, Fourier theory is ideal for analyzing periodic behaviors, but when periodicities change across time it is easy to misinterpret its results. Of course, we have developed strategies around this, like the Short Time Fourier Transform, yet our interpretations often fall beyond what the theory really says. This paper pushes the exact theoretical description, showing examples where our interpretation of the data is incorrect. Furthermore, it shows specific instances where we incorrectly make decisions based on such a paradoxical framework.
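A worked example of the kind of misreading the paper targets (a sketch, not taken from the paper): a linear chirp has exactly one instantaneous frequency at any moment, yet a single DFT of the whole signal reports energy across the entire swept band.

import numpy as np

fs = 48000
t = np.arange(fs) / fs
# Linear chirp: instantaneous frequency sweeps 200 Hz -> 2000 Hz in 1 s.
x = np.sin(2 * np.pi * (200 * t + 0.5 * 1800 * t ** 2))

spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(len(x), 1 / fs)
band = (freqs >= 200) & (freqs <= 2000)
print("energy fraction in 200-2000 Hz:",
      (spectrum[band] ** 2).sum() / (spectrum ** 2).sum())
# The DFT reports energy "at" every frequency in the band, although at
# no single instant does the signal contain more than one frequency.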
Moderators
avatar for Rob Maher

Rob Maher

Professor, Montana State University
Audio digital signal processing, audio forensics, music analysis and synthesis.
Speakers
avatar for Juan Sierra

Juan Sierra

NYU
Currently, I am a PhD Candidate in Music Technology at NYU, based in NYUAD as part of the Global Fellowship program. As a professional musician, my expertise lies in Audio Engineering, and I hold a master's degree in Music, Science, and Technology from the prestigious... Read More →
Authors
avatar for Juan Sierra

Juan Sierra

NYU
Currently, I am a PhD Candidate in Music Technology at NYU, based in NYUAD as part of the Global Fellowship program. As a professional musician, my expertise lies in Audio Engineering, and I hold a master's degree in Music, Science, and Technology from the prestigious... Read More →
Tuesday October 8, 2024 2:00pm - 2:30pm EDT
1E04

2:00pm EDT

Archiving multi-track and multi-channel: challenges and recommendations
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
Where are the files of my immersive production and why don't they open correctly on another DAW?
This workshop will outline some of the major challenges when working with archival materials and it will discuss ongoing activities in AES Standards Group SC-03-06 relating to multi-track and multi-channel audio. The workshop will give an overview of the status of the document, "Implementation of a Recommended Workflow for the Creation and Archiving of Digital Archival Materials from Professional Audio Production Formats," as well as why the interaction with related partner institutions (NARAS P&E Wing, SMPTE) and internal working groups is so important. The topic is relevant not only for preservation in an archival setting, but far beyond, for keeping audio productions safely stored and accessible in general.
Speakers
avatar for Nadja Wallaszkovits

Nadja Wallaszkovits

Stuttgart State Academy of fine Arts
avatar for Brad McCoy

Brad McCoy

Audio Engineer, Retired, Library of Congress
Audio Engineer (Archiving/Preservation)
avatar for Ulrike Schwarz

Ulrike Schwarz

Engineer/Producer, Co-Founder, Anderson Audio New York
Engineer/Producer, Co-Founder
avatar for Jim Anderson

Jim Anderson

Producer/Engineer, Anderson Audio New York
Producer/Engineer
avatar for Jeff Willens

Jeff Willens

Media Preservation Engineer, New York Public Library
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
1E09

2:00pm EDT

Designing With Constraints in an Era of Abundance
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
Every year, the availability and capabilities of processors and sensors expand greatly while their cost decreases correspondingly. As an instrument designer, the temptation to include every possible advance is an obvious one, but one that comes at the cost of additional complexity and, perhaps more concerningly, reduced *character*.

Looking back on electronic instruments from the past, what we love about the classics are the quirks and idiosyncrasies that come from the technical limitations of the time, and how the designers ended up leveraging those limitations creatively. This panel will bring together designers to discuss how we are self-imposing constraints on our designs in the face of that temptation to include everything that's possible.

As Brian Eno said: "Whatever you now find weird, ugly, uncomfortable and nasty about a new medium will surely become its signature. CD distortion, the jitteriness of digital video, the crap sound of 8-bit - all of these will be cherished and emulated as soon as they can be avoided."
Speakers
avatar for Brett Porter

Brett Porter

Lead Software Engineer, Artiphon
Brett g Porter is a software developer and engineering manager with 3 decades of experience in the pro audio/music instrument industry; currently Lead Software Engineer at Artiphon, he leads the team that develops companion applications for the company's family of instruments. Previously... Read More →
AF

Alexandra Fierra

Eternal Research
AM

Adam McHeffey

CMO, Artiphon
avatar for Ben Neill

Ben Neill

Former Professor, Ramapo College
Composer/performer Ben Neill is the inventor of the mutantrumpet, a hybrid electro-acoustic instrument, and is widely recognized as a musical innovator through his recordings, performances and installations. Neill has recorded ten CDs of his music on the Universal/Verve, Thirsty Ear... Read More →
avatar for Nick Yulman

Nick Yulman

Kickstarter
Nick Yulman has worked with Kickstarter’s community of creators for the last ten years and currently leads the company’s Outreach team, helping designers, technologists, and artists of all kinds bring their ideas to life through crowdfunding. He was previously Kickstarter’s... Read More →
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
1E08

2:00pm EDT

Recent Advances in Volumetric Recording Techniques for Music Production
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
Volumetric sound recording techniques, where multiple 3D sound signals are captured throughout the recording area and mapped later for playback, have shown great promise in the world of positional-tracking immersive experiences. But is volumetric sound capture an effective and plausible technique for music production? What advantages does volumetric capture have over more traditional recording techniques and single-position Ambisonic recordings, and how are the signals combined for a stationary immersive listening experience? These questions will be addressed by sharing immersive recordings and documentation, and by discussing the outcomes of recent and ongoing work conducted at New York University and McGill University.
Speakers
PG

Paul Geluso

Director of the Music Technology Program, New York University
avatar for Ying-Ying Zhang

Ying-Ying Zhang

PhD Candidate, McGill University
Ying-Ying Zhang is a music technology researcher and sound engineer. She is currently a PhD candidate at McGill University in the Sound Recording program where her research focuses on musician-centered virtual acoustic applications in recording environments. She received her Masters... Read More →
avatar for Parichat Songmuang

Parichat Songmuang

Studio Manager/PhD Student, New York University
Parichat Songmuang graduated from New York University with her Master of Music degree in Music Technology and an Advanced Certificate in Tonmeister Studies. As an undergraduate, she studied for her Bachelor of Science in Electronic Media and Film with a concentration... Read More →
avatar for Michael Ikonomidis

Michael Ikonomidis

Doctoral student, McGill University
Michael Ikonomidis (Michail Oikonomidis) is an accomplished audio engineer and PhD student in Sound Recording at McGill University, specializing in immersive audio, high-channel-count orchestral recordings and scoring sessions. With a diverse background in music production, live sound... Read More →
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
1E06

2:00pm EDT

Richard King: Enveloping Masterclass
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
Richard King plays high-resolution 7.1.4 music recordings of Yo-Yo Ma and Eric Clapton, and from the excellent Chevalier soundtrack album, also describing some of the techniques used.

This masterclass series, featuring remarkable recording artists, is a chance to hear 3D audio at its best, as we discuss factors of production, distribution and reproduction that make it worth the effort. Thomas exemplifies the important quale of auditory envelopment (AE), and we evaluate how well the AE latent in the content - from intimate to grand - comes across in this particular listening room.

Seats are limited to keep playback variation at bay, and the session is concluded with Q&A. If you attend multiple masterclasses, consider choosing different seats each time.
Speakers
avatar for Thomas Lund

Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund has authored papers on human perception, spatialisation, loudness, sound exposure and true-peak level. He is a researcher at Genelec, and convenor of a working group on hearing health under the European Commission. Out of a medical background, Thomas previously served in... Read More →
avatar for Richard King

Richard King

Professor, McGill University
Richard King is an Educator, Researcher, and a Grammy Award winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School... Read More →
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
3D06

2:00pm EDT

Bridging the Gap: Lessons for Live Media Networking from IT
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
The rapid evolution of live media networking has brought it closer to converged networking, where robust and efficient communication is paramount. While protocols such as MILAN/AVB, Dante and AES67 are staples, significant opportunities exist to enhance live media networking by adopting architectural blueprints, tools, and widely used protocols from the Information Technology (IT) sector. This workshop explores the specific requirements of live media networking, identifies potential learnings from IT workflows, and examines how other industries, particularly the broadcast and video markets, have successfully integrated IT principles, in order to propose technical recommendations.
Live media networking, encompassing audio, video, and control signals, demands high precision, low latency, and synchronization. Unlike traditional IT networks, which prioritize data integrity and security, live media networks must ensure seamless real-time transmission without compromising quality. The workshop will delve into these specificities, highlighting the challenges unique to live media, how they differ from typical IT networking scenarios, and the use of Time Sensitive Networking (TSN).
A significant challenge in this transition is the learning curve faced by sound technicians. Traditionally focused on audio-specific knowledge, these professionals now need to acquire IT networking skills to manage complex media networks effectively. This gap in expertise necessitates a new role emerging in the industry: the "Live Media Network Manager," a specialist who bridges the knowledge gap between traditional sound engineering and advanced IT networking.
A key focus area will be examining IT architectural blueprints and their applicability to live media networking. IT networks often leverage scalable, redundant, and resilient architectures to ensure uninterrupted service delivery. By adopting similar principles, live media networks can achieve greater reliability and scalability. The workshop will discuss how concepts such as network segmentation, redundancy, and failover mechanisms from IT can be tailored to meet the stringent requirements of live media.
Additionally, we will explore the tools and protocols widely used in IT that can benefit live media networking. Network monitoring and management tools, such as SNMP and Syslog, offer comprehensive insights into network performance and can aid in proactive maintenance and troubleshooting. Furthermore, protocols like QoS can be adapted to prioritize media traffic, ensuring that critical audio and video streams are delivered with minimal delay and jitter.
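As a concrete example of borrowing an IT mechanism of the kind mentioned above, a sender can mark media packets for priority treatment with DSCP, which QoS-enabled switches act on; a minimal sketch, with placeholder address and payload:

import socket

DSCP_EF = 46  # "Expedited Forwarding": the low-latency class media typically uses

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# The IP TOS byte carries the DSCP value in its upper six bits.
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, DSCP_EF << 2)
sock.sendto(b"...media payload...", ("203.0.113.10", 5004))  # placeholders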
The workshop will also draw parallels from the broadcast and video markets, which have already embraced IT-based solutions to enhance their networking capabilities. These industries have developed technical recommendations and standards, such as SMPTE ST 2110 for professional media over managed IP networks, which can serve as valuable references for the live media domain. By examining these examples, participants will gain a broader perspective on how cross-industry learnings can drive innovation in live media networking.
This workshop will provide a comprehensive overview of the specific needs of live media networking and present actionable insights from IT workflows and other industries. Participants will leave with a deeper understanding of how to leverage IT principles to enhance the efficiency, reliability, and scalability of live media networks, paving the way for a more integrated and future-proof approach.
Speakers
avatar for Nicolas Sturmel

Nicolas Sturmel

Directout GmbH
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
1E16

2:00pm EDT

HELA Certification: Elevating standards in live event sound management
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
The Healthy Ears, Limited Annoyance (HELA) certification scheme, which originated within the AES Technical Committee on Acoustics and Sound Reinforcement, launched in summer 2024. Tailored for event organizers, sound engineers, venue managers, musicians, and all other key live event industry stakeholders, HELA offers a comprehensive framework for delivering live music experiences that protect audience hearing and minimize neighborhood disturbances. This session will delve into the balance between sound quality, hearing health and community harmony. Attendees will gain practical insights into HELA's guidance on sound level management and effective communication strategies, fostering a community dedicated to sustainable live event production. Join us to discover how HELA Certification can set a new industry standard, creating memorable yet safe and respectful experiences for everyone involved.
Speakers
avatar for Adam Hill

Adam Hill

Associate Professor of Electroacoustics, University of Derby
Adam Hill is an Associate Professor of Electroacoustics at the University of Derby where he leads the Electro-Acoustics Research Lab (EARLab) and runs the MSc Audio Engineering program. He received a Ph.D. from the University of Essex, an M.Sc. in Acoustics and Music Technology from... Read More →
avatar for Jon Burton

Jon Burton

Senior Lecturer, Derby University
A live sound engineer with over 40 years of concert touring experience. Jon has toured internationally with artists such as Bryan Ferry, Stereophonics, Biffy Clyro and The Prodigy. Jon is also a partner in a five-studio recording complex in Sheffield, UK. Involved in education for... Read More →
avatar for Laura Sinnott

Laura Sinnott

Owner, Sound Culture
A longtime audio engineer for film, Laura expanded her career into hearing health as an audiologist. She ran the hearing clinic at Sensaphonics, a Chicago-based institution that has served musicians for over 30 years. Now based in Central New York, she sees patients in her Utica, NY... Read More →
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
1E07

2:00pm EDT

WMAS – The way forward for multichannel wireless audio
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
Join the industry experts on this exciting panel to learn more about Wireless Multichannel Audio Systems, the wide scope of approaches that WMAS facilitates, and how the various technologies can help you with multi-channel wireless set ups in challenging RF environments.
Speakers
avatar for Joe Ciaudelli

Joe Ciaudelli

Sennheiser
Joe Ciaudelli was hired by Sennheiser in 1987 upon graduating from Columbia University with an electrical engineering degree.  He provided frequency coordination for large multi-channel wireless microphone systems used by Broadway productions, major theme parks, and broadcast networks... Read More →
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
1E15

2:00pm EDT

Jack Antonoff & Laura Sisk, Up Close and Personal
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
GRAMMY® Award-winning producer Jack Antonoff and GRAMMY® Award-winning recording and mix engineer Laura Sisk will be the focal point of a special event session titled “Up Close and Personal.” Jack Antonoff is an eleven-time GRAMMY® Award-winning producer, artist, songwriter, and musician, as well as the creative force behind Bleachers. In February 2024, Antonoff won Producer of the Year at the GRAMMY Awards® for an incredible third consecutive year, becoming only the second producer in history to win three years running. He will be joined by five-time GRAMMY® Award-winning recording and mix engineer and long-time collaborator Laura Sisk in a session revealing their studio work with such leading artists as Taylor Swift, Lana Del Rey, Sabrina Carpenter, Kendrick Lamar, Nick Cave, St. Vincent, Diana Ross and more. The event will be moderated by producer/engineer/musician Glenn Lorbecki.
Speakers
avatar for Jack Antonoff

Jack Antonoff

Credited by the BBC for having “redefined pop music,” the globally celebrated, eleven-time Grammy Award-winning singer, songwriter, musician, and producer, Jack Antonoff has collaborated with the likes of Taylor Swift, Kendrick Lamar, Lana Del Rey, The 1975, Diana Ross, Lorde... Read More →
avatar for Laura Sisk

Laura Sisk

Laura Sisk is a five-time GRAMMY® Award-winning recording and mix engineer, widely recognized for her work with producer Jack Antonoff on Taylor Swift, as well as working with renowned artists like Lana Del Rey, Jon Batiste, Florence + The Machine, Diana Ross, Lorde, St. Vincent... Read More →
Tuesday October 8, 2024 2:00pm - 3:00pm EDT
Stage

2:00pm EDT

Bitrate adaptation in object-based audio coding in communication immersive voice and audio systems
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Object-based audio is one of the spatial audio representations providing an immersive audio experience. While it can be found in a wide variety of audio reproduction systems, its use in communication systems is very limited as it faces many constraints, such as system complexity, short-delay requirements, and the limited bitrate available for coding and transmission. This paper presents a new bitrate adaptation method for object-based audio coding systems that overcomes these constraints and enables their use in 5G voice and audio communication systems. The presented method distributes an available codec bit budget to encode waveforms of the individual audio objects based on a classification of the objects’ subjective importance in particular frames. The presented method has been used in the Immersive Voice and Audio Services (IVAS) codec, recently standardized by 3GPP, but it can be employed in other codecs as well. Test results show the performance advantage of the bitrate adaptation method over the conventional uniformly distributed bitrate method. The paper also presents IVAS selection test results for object-based audio with four audio objects, rendered to binaural headphone representation, in which the presented method plays a substantial role.
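The abstract's core idea, distributing a frame's bit budget by object importance, can be sketched as follows; the per-object floor and proportional rule are illustrative assumptions, not the IVAS algorithm:

def allocate_bits(importance, total_bits, floor_bits=2000):
    # Split a frame's bit budget across objects in proportion to their
    # importance scores, guaranteeing each object a minimum.
    n = len(importance)
    spare = total_bits - n * floor_bits
    assert spare >= 0, "budget too small for the per-object floor"
    total = sum(importance) or 1.0
    return [floor_bits + int(spare * w / total) for w in importance]

# Four objects, one currently dominant in the mix:
print(allocate_bits([0.6, 0.2, 0.1, 0.1], total_bits=24000))
# -> [11600, 5200, 3600, 3600]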
Speakers Authors
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Enhancing Realism for Digital Piano Players: A Perceptual Evaluation of Head-Tracked Binaural Audio
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
This paper outlines a process for achieving and perceptually evaluating a head-tracked binaural audio system designed to enhance realism for players of digital pianos. Using an Ambisonic microphone to sample an acoustic piano and leveraging off-the-shelf equipment, the system allows players wearing headphones to experience changes in the sound field in real time as they rotate their heads, with three degrees of freedom (3DoF). The evaluation criteria included spatial clarity, spectral clarity, envelopment, and preference. These criteria were assessed across three different listening systems: stereo speakers, stereo headphones, and head-tracked binaural audio. Results showed a strong preference for the head-tracked binaural audio system, with players noting significantly greater realism and immersion.
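For a system like the one described, the head-tracking step reduces, at first order, to a yaw counter-rotation of the B-format signals before binaural decoding; a minimal sketch, assuming the convention that X points front and Y left:

import numpy as np

def rotate_foa_yaw(w, x, y, z, head_yaw_rad):
    # Counter-rotate a first-order B-format frame against the listener's
    # head yaw so the virtual scene stays world-fixed while the head
    # turns; W and Z are yaw-invariant.
    c, s = np.cos(head_yaw_rad), np.sin(head_yaw_rad)
    return w, c * x + s * y, c * y - s * x, z

# Per audio block: read yaw from the tracker, rotate the B-format,
# then feed the result to any static binaural decoder.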
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Exploring Immersive Opera: Recording and Post-Production with Spatial Multi-Microphone System and Volumetric Microphone Array
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Traditional opera recording techniques using large microphone systems are typically less flexible towards experimental singer choreographies, which have the potential of being adapted to immersive and interactive representations such as Virtual Reality (VR) applications. The authors present an engineering report on implementing two microphone systems for recording an experimental opera production in a medium-sized theatre: a 7.0.4 hybrid array of Lindberg’s 2L and the Bowles spatial arrays and a volumetric array consisting of three higher-order Ambisonic microphones in Left/Center/Right (LCR) formation. Details of both microphone setups are first described, followed by post-production techniques for multichannel loudspeaker playback and 6 degrees-of-freedom (6DoF) binaural rendering for VR experiences. Finally, the authors conclude with observations from informal listening critique sessions and discuss the technical challenges and aesthetic choices involved during the recording and post-production stages in the hope of inspiring future projects on a larger scale.
Speakers
JM

Jiawen Mao

PhD student, McGill University
Authors
JM

Jiawen Mao

PhD student, McGill University
avatar for Michael Ikonomidis

Michael Ikonomidis

Doctoral student, McGill University
Michael Ikonomidis (Michail Oikonomidis) is an accomplished audio engineer and PhD student in Sound Recording at McGill University, specializing in immersive audio, high-channel-count orchestral recordings and scoring sessions. With a diverse background in music production, live sound... Read More →
avatar for Richard King

Richard King

Professor, McGill University
Richard King is an Educator, Researcher, and a Grammy Award winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School... Read More →
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Exploring the Directivity of the Lute, Lavta, and Oud Plucked String Instruments
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
This study investigates the spherical directivity and radiation patterns of the Lute, Lavta, and Oud, pear-shaped traditional plucked-string instruments from the Middle East, Turkey, Greece, and the surrounding areas, providing insights into the acoustic qualities of their propagated sound in a three-dimensional space. Data was recorded in an acoustically controlled environment with a 29-microphone array, using multiple instruments of each type, performed by several professional musicians. Directivity is investigated in terms of sound projection and radiation patterns. Instruments were categorized according to string material. The analysis revealed that all instruments, regardless of their variations in geometry and material, exhibit similar radiation patterns across all frequency bands, justifying their intuitive classification within the “Lute family”. Nevertheless, variations in sound projection across all directions are evident between instrument types, which can be attributed to differences in construction details and string material. The impact of the musician's body on directivity is also observed. Practical implications of this study include the development of guidelines for the proper recording of these instruments, as well as the simulation of their directivity properties for use in spatial auralizations and acoustic simulations with direct applications in extended reality environments and remote collaborative music performances.
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Generate acoustic responses of virtual microphone arrays from a single set of measured FOA responses. - Apply to multiple sound sources.
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
V2MA (VSVerb Virtual Microphone Array)
Demos and related docs are available at https://bit.ly/3BmDBbL .
Once we have measured a set of four impulse responses (IRs) with an A-format microphone in a hall, we can make a virtual recording using a virtual microphone array placed anywhere in the hall. The measurement does not require the A-format microphone and loudspeaker to be placed at specific positions in the hall. Typical positions, such as at an audience seat and on a stage, are recommended, but you can place them anywhere you like. We will generate any type of virtual microphone response in a target room from an easy one-time IR measurement.
-------------------------------
We propose a method, V2MA, that virtually generates acoustic responses of any type of microphone array from a single set of FOA responses measured in a target room. An A-format microphone is used for the measurement, but no Ambisonics operation is included in the processing. V2MA is a method based on geometrical acoustics. We calculate sound intensities in the x, y, and z directions from a measured FOA response, and the virtual sound sources of the room are then detected from them. Although it is desirable to place the A-format microphone close to the intended position of the virtual microphone array in the room, this is not a mandatory requirement. Since our method can generate SRIRs at arbitrary receiver positions in the room by updating the acoustic properties of the virtual sound sources detected at a certain position, the A-format microphone can be placed anywhere you like. On the other hand, a loudspeaker must be placed at the source position where a player is assumed to be. Since the positions of virtual sound sources change when a real sound source moves, we previously had to measure the responses for each assumed real source position. To remove this inconvenient restriction, we developed a technique for updating the positions of the virtual sound sources when a real sound source moves from its original position. Although the technique requires some approximations, we have ascertained that the generated SRIRs provide fine acoustic properties in both physical and auditory aspects.
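The source-detection step rests on the standard relation between B-format signals and the acoustic intensity vector; a minimal sketch of that step alone (not the authors' implementation), under the usual convention that W carries pressure and X/Y/Z carry velocity components up to a calibration gain:

import numpy as np

def intensity_doa(w, x, y, z):
    # Average the instantaneous intensity I = p * v over a frame, with
    # pressure p ~ W and particle velocity v ~ (X, Y, Z) up to gain.
    v = np.array([np.mean(w * x), np.mean(w * y), np.mean(w * z)])
    return v / (np.linalg.norm(v) + 1e-12)  # unit direction estimate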
Speakers
avatar for Masataka Nakahara

Masataka Nakahara

Acoustic Designer / Acoustician, SONA Corp. / ONFUTURE Ltd.
Masataka Nakahara is an acoustician specializing in studio acoustic design and R&D work on room acoustics, as well as an educator. After studying acoustics at the Kyushu Institute of Design, he joined SONA Corporation and began his career as an acoustic designer. In 2005, he received... Read More →
Authors
avatar for Masataka Nakahara

Masataka Nakahara

Acoustic Designer / Acoustician, SONA Corp. / ONFUTURE Ltd.
Masataka Nakahara is an acoustician specializing in studio acoustic design and R&D work on room acoustics, as well as an educator. After studying acoustics at the Kyushu Institute of Design, he joined SONA Corporation and began his career as an acoustic designer. In 2005, he received... Read More →

Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Measurement and Applications of Directional Room Impulse Responses (DRIRs) for Immersive Sound Reproduction
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Traditional methods for characterizing Room Impulse Responses (RIRs) employing omnidirectional microphones do not fully capture the spatial properties of sound in an acoustic space. In this paper we explore a method for the characterization of room acoustics employing Directional Room Impulse Responses (DRIRs), which include the direction of arrival of the reflected sound waves in an acoustic space in addition to their time of arrival and strength. We measured DRIRs using a commercial 3D sound intensity probe (Weles Acoustics WA301) containing x, y, z acoustic velocity channels in addition to a scalar pressure channel. We then employed the measured DRIRs to predict the binaural signals that would be measured by binaural dummy head microphones placed at the same location in the room where the DRIR was measured. The predictions can then be compared to the actual measured binaural signals. Successful implementation of DRIRs could significantly enhance applications in AR/VR and immersive sound reproduction by providing listeners with room-specific directional cues for early room reflections in addition to the diffuse reverberant impulse response tail.
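A sketch of the binaural prediction idea: once reflections (delay, gain, direction) have been extracted from a DRIR, each can be filtered with the nearest measured HRIR and summed; `hrir_lookup` is a hypothetical helper, not part of the paper.

import numpy as np

def render_binaural(reflections, hrir_lookup, fs, length):
    # Sum HRIR-filtered impulses for reflections extracted from a DRIR,
    # each given as (delay_s, gain, unit_direction).
    out = np.zeros((length, 2))
    for delay_s, gain, direction in reflections:
        hrir = hrir_lookup(direction)  # hypothetical: nearest measured HRIR, shape (taps, 2)
        start = int(round(delay_s * fs))
        stop = min(start + len(hrir), length)
        out[start:stop] += gain * hrir[:stop - start]
    return out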
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Quantitative Assessment of Acoustical Attributes and Listener Preferences in Binaural Renderers with Head-tracking Function
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
The rapid advancement of immersive audio technologies has popularized binaural renderers that create 3D auditory experiences using head-related transfer functions (HRTFs). Various renderers with unique algorithms have emerged, offering head-tracking functionality for real-time adjustments to spatial audio perception. Building on our previous study, we compared binauralized music from five renderers with the dynamic head-tracking function enabled, focusing on how differences in HRTFs and algorithms affect listener perceptions. Participants assessed overall preference, spatial fidelity, and timbral fidelity by comparing paired stimuli. Consistent with our earlier findings, one renderer received the highest ratings for overall preference and spatial fidelity, while others rated lower in these attributes. Physical analysis showed that interaural time differences (ITD), interaural level differences (ILD), and frequency response variations contributed to these outcomes. Notably, hierarchical cluster analysis of participants' timbral fidelity evaluations revealed two distinct groups, suggesting variability in individual sensitivities to timbral nuances. While spatial cues, enhanced by head tracking, were generally found to be more influential in determining overall preference, the results also highlight that timbral fidelity plays a significant role for certain listener groups, indicating that both spatial and timbral factors should be considered in future developments.
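For readers unfamiliar with one of the physical measures named above, ITD is commonly estimated as the lag maximizing the interaural cross-correlation; a minimal sketch (not the study's analysis code), with left/right as NumPy arrays:

import numpy as np

def estimate_itd(left, right, fs, max_itd_s=1e-3):
    # ITD = lag (s) maximizing the interaural cross-correlation,
    # searched within +/-1 ms; positive means the left ear lags.
    full = np.correlate(left, right, mode="full")
    lags = np.arange(-len(right) + 1, len(left))
    keep = np.abs(lags) <= max_itd_s * fs
    return lags[keep][np.argmax(full[keep])] / fs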
Speakers
avatar for Rai Sato

Rai Sato

Ph.D. Student, Korea Advanced Institute of Science and Technology
Rai Sato (佐藤 来) is currently pursuing a PhD at the Graduate School of Culture Technology at the Korea Advanced Institute of Science and Technology. He holds a Bachelor of Music from Tokyo University of the Arts, where he specialized in immersive audio recording and psychoacoustics... Read More →
Authors
avatar for Rai Sato

Rai Sato

Ph.D. Student, Korea Advanced Institute of Science and Technology
Rai Sato (佐藤 来) is currently pursuing a PhD at the Graduate School of Culture Technology at the Korea Advanced Institute of Science and Technology. He holds a Bachelor of Music from Tokyo University of the Arts, where he specialized in immersive audio recording and psychoacoustics... Read More →
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

Review: Head-Related Impulse Response Measurement Methods
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
This review paper discusses advancements in Head-Related Impulse Response (HRIR) measurement methods. HRIR measurement methods, often referred to as HRTF (Head-Related Transfer Function) measurement methods, have undergone significant changes over the last few decades [1]. A frequently employed method is the discrete stop-and-go method [1][2], which involves changing the location of a single speaker, used as the sound source, and recording the impulse response at each location [2]. Since the measurement covers one source location at a time, the discrete stop-and-go method is time-consuming [1]; hence, improvements such as using more sound sources (speakers) are required to enhance the efficiency of the measurement process [1][3]. A typical HRTF measurement is usually conducted in an anechoic chamber to achieve a simulated free-field measurement condition without room reverberation. It measures the transfer function between the source and the ears to capture localisation cues such as inter-aural time differences (ITDs), inter-aural level differences (ILDs), and monaural spectral cues [4]. Newer techniques such as the Multiple Exponential Sweep Method (MESM) and the reciprocal method offer alternatives; these methods enhance measurement efficiency and address challenges like inter-reflections and low-frequency response [5][6]. Individualised HRTF measurement techniques can be categorised into acoustical measurement, anthropometric data, and perceptual feedback [7]. Interpolation methods and non-anechoic environment measurements have expanded the practical application and feasibility of HRTF measurements [8][9][10][7].
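As background for the sweep-based methods the review covers (MESM builds on overlapping exponential sweeps), here is a minimal sketch of a single exponential sine sweep and its deconvolution into an impulse response; the regularization constant is an illustrative assumption.

import numpy as np

def exp_sweep(f1, f2, dur, fs):
    # Exponential sine sweep from f1 to f2 Hz over dur seconds.
    t = np.arange(int(dur * fs)) / fs
    r = np.log(f2 / f1)
    return np.sin(2 * np.pi * f1 * dur / r * (np.exp(t * r / dur) - 1))

def deconvolve_ir(recorded, sweep):
    # Recover the impulse response by spectral division; the small
    # constant is a crude regularizer against near-zero bins.
    n = len(recorded) + len(sweep) - 1
    return np.fft.irfft(np.fft.rfft(recorded, n) /
                        (np.fft.rfft(sweep, n) + 1e-8), n)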
Speakers
avatar for Jeremy Tsuaye

Jeremy Tsuaye

New York University
Authors
avatar for Jeremy Tsuaye

Jeremy Tsuaye

New York University
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:00pm EDT

The effects of interaural time difference and interaural level difference on sound source localization on the horizontal plane
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Interaural Time Difference (ITD) and Interaural Level Difference (ILD) are the main cues used by the human auditory system to localize sound sources on the horizontal plane. To explore the relationship between ITD, ILD, and the perceived azimuth, a study was conducted to measure and analyze localization effects on the horizontal plane for combinations of ITD and ILD. Pure tones were used as sound sources in the experiment. For each of three different frequency bands, 25 combinations of ITD and ILD test values were selected. These combinations were applied to a sound perceived as coming from directly in front of the listener (pure-tone signals recorded with an artificial head in an anechoic chamber). The tests were conducted using the 1-up/2-down and two-alternative forced-choice (2AFC) psychophysical testing methods. The results showed that the perceived azimuth at 350 Hz and 570 Hz was generally higher than at 1000 Hz. Additionally, the perceived azimuth at 350 Hz and 570 Hz was similar under certain combinations. The experimental data and conclusions can provide foundational data and theoretical support for efficient compression of multi-channel audio.
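For readers unfamiliar with the adaptive procedure named above, a minimal sketch of a 1-up/2-down track; the experiment's actual stimulus handling is not shown, and `present_trial` is a hypothetical callback.

def run_staircase(present_trial, start_level, step, n_trials=60):
    # 1-up/2-down track: the level steps down after two consecutive
    # correct 2AFC responses and up after any incorrect one, converging
    # on the ~70.7%-correct point of the psychometric function.
    level, streak, track = start_level, 0, []
    for _ in range(n_trials):
        track.append(level)
        if present_trial(level):  # hypothetical: runs one 2AFC trial
            streak += 1
            if streak == 2:
                level, streak = level - step, 0
        else:
            level, streak = level + step, 0
    return track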
Tuesday October 8, 2024 2:00pm - 4:00pm EDT
Poster

2:30pm EDT

Decoding Emotions: Lexical and Acoustical Cues in Vocal Affects
Tuesday October 8, 2024 2:30pm - 3:00pm EDT
This study investigates listeners’ ability to detect emotion from a diverse set of speech samples, including both spontaneous conversations and actor-posed speech. It explores the contributions of lexical content and acoustic properties when native listeners rate seven pairs of affective attributes. Two experimental conditions were employed: a text condition, where participants evaluated emotional attributes from written transcripts without vocal information, and a voice condition, where participants listened to audio recordings to assess emotions. Results showed that the importance of lexical and vocal cues varies across 14 affective states for posed and spontaneous speech. Vocal cues enhanced the expression of sadness and anger in posed speech, while they had less impact on conveying happiness. Notably, vocal cues tended to mitigate negative emotions conveyed by the lexical content in spontaneous speech. Further analysis on correlations between emotion ratings in text and voice conditions indicated that lexical meanings suggesting anger or hostility could be interpreted as positive affective states like intimacy or confidence. Linear regression analyses indicated that emotional ratings by native listeners could be predicted up to 59% by lexical content and up to 26% by vocal cues. Listeners relied more on vocal cues to perceive emotional tone when the lexical content was ambiguous in terms of feeling and attitude. Finally, the analysis identified statistically significant basic acoustical parameters and other non/para-linguistic information, after controlling for the effect of lexical content.
Moderators Speakers
EO

Eunmi Oh

Research Professor, Yonsei University
Authors
EO

Eunmi Oh

Research Professor, Yonsei University
Tuesday October 8, 2024 2:30pm - 3:00pm EDT
1E03

2:30pm EDT

Nonlinear distortion in analog modeled DSP plugins in consequence of recording levels
Tuesday October 8, 2024 2:30pm - 3:00pm EDT
The nominal audio level is where developers of professional analog equipment design their units to have optimal performance. Audio levels above the nominal level will at some point lead to increased harmonic distortion and eventually clipping. DSP plugins emulating such nonlinear behavior must – in the same manner as analog equipment – align to a nominal level that is simulated within the digital environment. A listening test was tailored to investigate whether, or to what extent, misalignments between the recording level and the simulated nominal level in analog-modeled DSP plugins are audible, thus affecting the outcome depending on which level you choose to record at. The results of this study indicate that harmonic distortion in analog-modeled DSP plugins may become audible as the recording level increases. However, for the plugins included in this study, the immediate consequence of the added harmonics is not critical and, in most cases, not noticed by the listener.
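A minimal sketch of the effect under test, assuming a generic tanh waveshaper as a stand-in for an analog-modeled plugin and -18 dBFS as the simulated nominal level; real plugins' alignment and transfer curves will differ.

import numpy as np

def thd_percent(level_dbfs, nominal_dbfs=-18.0, fs=48000, f0=1000):
    # Drive a tanh stage with a 1 kHz sine at the given level (relative
    # to the assumed simulated nominal level), return THD in percent.
    t = np.arange(fs) / fs
    drive = 10 ** ((level_dbfs - nominal_dbfs) / 20)
    y = np.tanh(drive * np.sin(2 * np.pi * f0 * t))
    spec = np.abs(np.fft.rfft(y * np.hanning(len(y))))
    bins = [k * f0 for k in range(1, 6)]  # N = fs, so bin index = Hz
    harm = np.sqrt(sum(spec[b] ** 2 for b in bins[1:]))
    return 100 * harm / spec[bins[0]]

for lvl in (-18, -12, -6, 0):
    print(f"{lvl:>4} dBFS: {thd_percent(lvl):.2f} % THD")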
Moderators
avatar for Rob Maher

Rob Maher

Professor, Montana State University
Audio digital signal processing, audio forensics, music analysis and synthesis.
Speakers
avatar for Tore Teigland

Tore Teigland

Professor, Kristiania University College
Authors
avatar for Tore Teigland

Tore Teigland

Professor, Kristiania University College
Tuesday October 8, 2024 2:30pm - 3:00pm EDT
1E04

2:30pm EDT

Vocal Processing
Tuesday October 8, 2024 2:30pm - 3:30pm EDT
In addition to long-form dLive Certification classes, training sessions will cover a variety of topics relevant to live sound engineers, including: Mixing Monitors from FOH; Vocal Processing; Groups, Matrices and DCAs; Active Dynamics; and Gain Staging.

The training will be led by industry veterans Michael Bangs and Jake Hartsfield. Bangs, whose career includes experience as a monitor engineer and production manager, has worked with A-list artists, including Aerosmith, Katy Perry, Tom Petty, Lynyrd Skynyrd and Kid Rock. Hartsfield is a seasoned live sound engineer, having mixed for artists like Vulfpeck, Ben Rector, Fearless Flyers, and more.

Signup link: https://zfrmz.com/DmSlX5gyZCfjrJUHa6bV
Tuesday October 8, 2024 2:30pm - 3:30pm EDT
1E05

3:00pm EDT

A comparison of in-ear headphone target curves for the Brüel & Kjær Head & Torso Simulator Type 5128
Tuesday October 8, 2024 3:00pm - 3:30pm EDT
Controlled listening tests were conducted on five different in-ear (IE) headphone target curves measured on the latest ITU-T Type 4.3 ear simulator (e.g. the Brüel & Kjær Head & Torso Simulator Type 5128). A total of 32 listeners rated each target on a 100-point scale based on preference for three different music programs with two observations each. When averaged across all listeners, two target curves were found to be equally preferred over the other choices. Agglomerative hierarchical clustering analysis further revealed two classes of listeners based on dissimilarities in their preferred target curves. Class 1 (72% of listeners) preferred the two top-rated targets. Class 2 (28% of listeners) preferred targets with 2 dB less bass and 2 dB more treble than the target curves preferred by Class 1. Among the demographic factors examined, age was the best predictor of membership in each class.
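A sketch of how the clustering step might be reproduced with SciPy, assuming a listeners-by-targets rating matrix; random data stands in for the study's ratings.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Rows = listeners, columns = mean preference per target curve.
ratings = np.random.default_rng(0).uniform(0, 100, (32, 5))  # stand-in data

Z = linkage(ratings, method="ward")              # agglomerative, Ward criterion
labels = fcluster(Z, t=2, criterion="maxclust")  # cut the tree at 2 classes
for c in (1, 2):
    members = ratings[labels == c]
    print(f"class {c}: n={len(members)}, "
          f"mean profile={members.mean(axis=0).round(1)}")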
Moderators Speakers Authors
Tuesday October 8, 2024 3:00pm - 3:30pm EDT
1E03

3:00pm EDT

A Survey of Methods for the Discretization of Phonograph Record Playback Filters
Tuesday October 8, 2024 3:00pm - 3:30pm EDT
Since the inception of electrical recording for phonograph records in 1924, records have been intentionally cut with a non-uniform frequency response to maximize the information density on a disc and to improve the signal-to-noise ratio. To reproduce a nominally flat signal within the available bandwidth, the effects of this cutting curve must be undone by applying an inverse curve on playback. Until 1953, with the introduction of what has become known as the RIAA curve, the playback curve required for any particular disc could vary by record company and over time. As a consequence, anyone seeking to hear or restore the information on a disc must have access to equipment that is capable of implementing multiple playback equalizations. This correction may be accomplished with either analog hardware or digital processing. The digital approach has the advantages of reduced cost and expanded versatility, but requires a transformation from continuous time, where the original curves are defined, to discrete time. This transformation inevitably comes with some deviations from the continuous-time response near the Nyquist frequency. There are many established methods for discretizing continuous-time filters, and these vary in performance, computational cost, and inherent latency. In this work, several methods for performing this transformation are explored in the context of phonograph playback equalization, and the performance of each approach is quantified. This work is intended as a resource for anyone developing systems for digital playback equalization or similar applications that require approximating the response of a continuous-time filter digitally.
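As one concrete instance of the transformations the paper surveys, here is a sketch discretizing the RIAA playback curve with the bilinear transform, whose response deviates near Nyquist in the way the abstract describes; the 1 kHz unity-gain normalization is a common convention, an assumption here rather than something the paper specifies.

import numpy as np
from scipy.signal import bilinear

T1, T2, T3 = 3180e-6, 318e-6, 75e-6  # RIAA playback time constants (s)
fs = 96000

# Analog de-emphasis: zero at 1/T2, poles at 1/T1 and 1/T3.
b_s = [T2, 1.0]                          # (1 + s*T2)
a_s = np.polymul([T1, 1.0], [T3, 1.0])   # (1 + s*T1)(1 + s*T3)
b_z, a_z = bilinear(b_s, a_s, fs)        # one of several possible methods

# Normalize to unity gain at 1 kHz (common convention).
w = np.exp(2j * np.pi * 1000 / fs)
b_z = b_z / abs(np.polyval(b_z, w) / np.polyval(a_z, w))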
Moderators
avatar for Rob Maher

Rob Maher

Professor, Montana State University
Audio digital signal processing, audio forensics, music analysis and synthesis.
Speakers
avatar for Benjamin Thompson

Benjamin Thompson

PhD Student, University of Rochester
Authors
Tuesday October 8, 2024 3:00pm - 3:30pm EDT
1E04

3:00pm EDT

Immersive and interactive audio management and production for new entertainment venues.
Tuesday October 8, 2024 3:00pm - 3:45pm EDT
A wide range of new immersive audio venues and applications has emerged, such as The Sphere, the COSM entertainment domes, theatres, live events, and museum and art installations. These venues need new audio management as well as production tools, but the way content is created has not changed: music and audio producers work in regular studios, not least because it is rarely possible to get time on site for extensive mixing sessions. Based on close to 30 years of immersive and binaural audio production experience, New Audio Technology provides these tools and will give an insight into mixing, management, and playback strategies for these applications.
Speakers
avatar for Tom Ammermann

Tom Ammermann

New Audio Technology
Grammy-nominated music producer Tom Ammermann began his journey as a musician and music producer in the 1980s. At the turn of the 21st century, Tom produced unique surround audio productions for music and film projects as well as pioneering the very first surround mixes for headphones... Read More →
Tuesday October 8, 2024 3:00pm - 3:45pm EDT
3D04 (AES and AIMS Alliance Program Room)
  Sponsor Session

3:15pm EDT

The Devil in the Details
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
What happens when you realize in the middle of a mass digitization project that most of your video assets have multi-track production audio instead of finished mixed audio, and your vendor doesn't offer a service to address the issue? Digitizing the Carlton Pearson Collection for the Harvard Divinity School produced just such a conundrum. This workshop will walk through a case study of identifying problems in vendor work and of the QC and production workflows that had to be put into place to correct the issues that surfaced as the project progressed. It includes a look at the technology stack developed internally in response, among it a full GUI video editor built for QC of audio and video and for mass top/tail editing of assets while offering individual edit decision points. From problem identification to audio mixing, video trimming, closed captioning using AI solutions, and deposit to preservation repositories, the project team had only three months to complete the work on just shy of 4000 assets.
Speakers
avatar for Kaylie Ackerman

Kaylie Ackerman

Head of Media Preservation, Harvard Library
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
1E16

3:15pm EDT

Audio Design Roundtable
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
Join us at the AES NY 2024 convention for an exciting panel featuring top-tier audio equipment designers who have set new standards in the industry! Geared toward aspiring audio designers, students, and educators, this session will dive into the creative and technical processes behind groundbreaking audio gear used in studios, live sound, and beyond. Hear from industry leaders as they share career highlights and insights on staying ahead in a rapidly evolving field. In addition, students will learn about the various student design competitions presented by AES, offering unique opportunities to showcase their skills and gain recognition. Stick around for a Q&A session, where you'll have the chance to ask these leading experts your burning questions and gain valuable knowledge to power your future in audio design!
Speakers
avatar for Christoph Thompson

Christoph Thompson

Director of Music Media Production, Ball State University
Christoph Thompson is vice-chair of the AES audio education committee. He is the chair of the AES Student Design Competition and the Matlab Plugin Design Competition. He is the director of the music media production program at Ball State University. His research topics include audio... Read More →
avatar for Brecht De Man

Brecht De Man

Head of Research, PXL University of Applied Sciences and Arts
Brecht is an audio engineer with a broad background comprising research, software development, management and creative practice. He holds a PhD from the Centre for Digital Music at Queen Mary University of London on the topic of intelligent software tools for music production, and... Read More →
avatar for George Massenburg

George Massenburg

Associate Professor of Sound Recording, Massenburg Design Works
George Y. Massenburg is a Grammy award-winning recording engineer and inventor. Working principally in Baltimore, Los Angeles, Nashville, and Macon, Georgia, Massenburg is widely known for submitting a paper to the Audio Engineering Society in 1972 regarding the parametric equali... Read More →
avatar for Anthony Agnello

Anthony Agnello

Managing Director, Eventide
Tony Agnello was born in Brooklyn, graduated from Brooklyn Technical High School in 1966, received the BSEE from City College of NY in 1971, the MSEE from the City University of NY in 1974 followed by post graduate studies in Digital Signal Processing at Brooklyn’s Polytechnical... Read More →
MW

Marek Walaszek

General, Addicted To Music
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
Stage

3:15pm EDT

Mary Campbell In Memoriam
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
Mary Campbell – In Memoriam
Manager Electric Lady Studios & Unique Recording, Sony Studios Booking Admin.

Mary Culum-Campbell was a legend of the NY studio scene for decades. She came from Montana in 1982 at age 21, and with a forged backstage pass and some other tales of delinquency, talked her way into a job in the shop at Electric Lady Studios with tech extraordinaire Sal Greco, who went on to build Paisley Park for Prince. Recalling her job interview at her memorial, Sal said: 'She's trouble. She'll fit right in.' Within six months Mary became the studio manager. A trailblazer with attitude who also loved the music around her as a genuine fan, Mary was one of the few women in a power position at a major recording studio in the '80s. Working in an intense and occasionally cutthroat environment during a tumultuous period in both music and music technology, she kept the studios booked and partied with the best (and worst) of them while managing both Electric Lady and the equally electric Unique Recording. She also had a stint working at the massive Sony Music Studios. The panel discussion will cover some of Mary's history as well as some of the history of the console and technology changes she presided over. Speakers will include former chief technicians Jim Gillis and Brian Macaluso, as well as associate Tony Drootin and engineer extraordinaire Ron Saint Germain.

Held together by Eliot Kissileff, with co-moderator Roey Shamir.

https://helenafuneralhome.com/obituaries/mary-c-campbell-age-61/
Speakers
avatar for Angela Piva

Angela Piva

Angela Piva, Audio Pro/Audio Professor, highly skilled in all aspects of music & audio production, recording, mixing and mastering with over 35 years of professional audio engineering experience and accolades. Known as an innovator in sound technology, and for contributing to the... Read More →
avatar for Eliot Kissileff

Eliot Kissileff

Owner, Music and Media Preservation Services
After graduating from New York University’s Tisch School of the Arts with a BFA in film production, Eliot Kissileff attended the Institute of Audio Research before landing a job at Electric Lady Studios in 1996. With the goal of learning how to make vintage-sounding recordings for... Read More →
avatar for Roey Shamir

Roey Shamir

Mixmaster, INFX PRODUCTIONS INC.
• Freelance Audio Engineer, Mixer and Producer • Audio Engineer, Mixer, Producer, Remixer for extensive major label records and film productions (Mix to Picture) • Mixed and recorded music / albums for numerous RIAA multi-platinum and Grammy-nominated... Read More →
avatar for Ron St. Germain

Ron St. Germain

Owner, Saint Sounds Inc & 'Saint's Place' Studio
Ron's career in the music business began in 1970. He learned the art of recording at two of America's busiest and best recording studios, Record Plant Studios and Media Sound Studios, both in NYC. Some of Ron's ‘colleagues’ during those formative years were Tony Bongiovi, Bob... Read More →
avatar for Brian Macaluso

Brian Macaluso

Owner, Clandestine Recording
A lifelong music technology nerd, musician and songwriter, recording engineer, studio owner and former Chief Tech for Electric Lady Studios, JSM Music, and Dreamhire Professional Audio Rentals in NYC. Owner of Clandestine Recording, a private project studio in Kingston NY.
avatar for Tony Drootin

Tony Drootin

Manager, Sound on Sound Studios
Tony graduated with a degree in Music Performance Percussion in 1984. Upon graduating college he took a position as receptionist at Unique Recording Studios in Times Square. Tony worked at Unique for 13 years, 10 of which he managed the facility. While at Unique Recording he formed... Read More →

Tuesday October 8, 2024 3:15pm - 4:15pm EDT
1E07

3:15pm EDT

Mastery in Ambisonic Recording
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
Ambisonic microphones offer a dynamic and multifaceted approach to capturing and creating immersive audio experiences. This workshop delves into the process of transforming multichannel A-format raw recordings into versatile Ambisonic B-format files. These files can then be decoded into various audio formats, including stereo, binaural, surround, and spatial 3D audio. This makes it an ideal technique for crafting head-tracking binaural audio for 360/3D videos and VR gaming environments.
Participants will gain hands-on experience with the Zoom H3-VR portable recorder and learn how to create Ambisonic recordings. The session will guide you through processing raw recordings using Reaper and the dearVR AMBI MICRO plugin, and will illustrate the workflow for integrating Ambisonic audio into 360/3D video and VR games with head-tracked binaural sound. Attendees are encouraged to bring their laptops and headphones, install the necessary software, and download the provided demo projects for an interactive experience. Additionally, the workshop will offer insights into capturing immersive audio with various Ambisonic microphones, including the new em64 Eigenmike Spherical Microphone Array.
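For orientation, the A-format-to-B-format step the workshop covers reduces, in its idealized form, to a fixed matrix; a sketch assuming the classic tetrahedral capsule order (FLU, FRD, BLD, BRU) and ignoring the capsule-spacing correction filters that a production tool such as dearVR AMBI MICRO also applies.

import numpy as np

# Tetrahedral capsule order FLU, FRD, BLD, BRU -> B-format W, X, Y, Z.
A2B = 0.5 * np.array([
    [1,  1,  1,  1],   # W: omni sum
    [1,  1, -1, -1],   # X: front minus back
    [1, -1,  1, -1],   # Y: left minus right
    [1, -1, -1,  1],   # Z: up minus down
])

def a_to_b(a_format):
    # a_format: array of shape (4, n_samples) in FLU/FRD/BLD/BRU order.
    return A2B @ a_format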
Speakers
avatar for Ming-Lun Lee

Ming-Lun Lee

Professor of Electrical and Computer Engineering, University of Rochester
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
1E09

3:15pm EDT

Susan Rogers: Enveloping Masterclass
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
Susan talks about her incredible career in science and music, working with artists such as Julia Darling, David Byrne, Prince, Laurie Anderson, Tevin Campbell and many more; all garnished with high-resolution listening examples.

This masterclass series, featuring remarkable recording artists, is a chance to hear stereo and 3D music at its best as we discuss important factors of production, distribution and reproduction. Thomas exemplifies the underrated quale of auditory envelopment (AE), and we evaluate how robustly intimacy and AE latent in the content may be heard across this particular listening room.

Seats are limited to keep playback variation at bay, and the session is concluded with Q&A. If you attend multiple masterclasses, consider choosing different seats each time.
Speakers
avatar for Thomas Lund

Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund has authored papers on human perception, spatialisation, loudness, sound exposure and true-peak level. He is researcher at Genelec, and convenor of a working group on hearing health under the European Commission. Out of a medical background, Thomas previously served in... Read More →
avatar for Susan Rogers

Susan Rogers

Professor, Berklee Online
Susan Rogers holds a doctoral degree in experimental psychology from McGill University (2010). Prior to her science career, Susan was a multiplatinum-earning record producer, engineer, mixer and audio technician. She is best known for her work with Prince during his peak creative... Read More →
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
3D06

3:15pm EDT

The Art of Mixing in Stereo and ATMOS Simultaneously
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
Join veteran audio engineer Matt Boudreau, who has mixed in ATMOS for Alanis Morissette, Green Day, and Deafheaven, as he delves into immersive audio mixing. In this session, Matt will discuss his approach to mixing in ATMOS and stereo simultaneously and provide a look into his Pro Tools mixing template. Drawing from his extensive experience and his Mixing in Stereo and ATMOS course, Matt will share practical tips for creating dynamic mixes that excel in both formats. Whether you're new to ATMOS or refining your skills, this presentation offers valuable insights for all audio professionals.
Speakers
MB

Matt Boudreau

Audio engineer at Matt Boudreau MIXING|MASTERING and producer/host/author of the Working Class Audio Podcast.
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
1E06

3:15pm EDT

Copying and attributing training data in audio generative models
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
MOTIVATION: Generative AI for audio is quickly proliferating in both the commercial and open-source software communities. However, there is still no technical consensus on how novel the generated audio data is, and what characteristics (if any) from the training data are commonly replicated. This workshop will explore existing technical approaches for detecting memorized audio training data, how often memorization happens, and what characteristics of the audio are memorized. Additionally, we plan to share recent research on how audio similarity algorithms can be used to attribute audio samples produced by a generative model to specific audio samples in the training set.

FORMAT: a panel discussion in three parts (listed below). The purpose is to inform the AES community about the technical feasibility of detecting memorization and attributing training data in audio generative models, give audio examples of the results, offer an outlook on future developments, and solicit feedback from the audience.

• Memorization and generalization in deep learning
• Searching for memorized training data in audio generative models
• Training data attribution using audio similarity measures
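As a rough illustration of the third bullet, similarity-based attribution can be prototyped with off-the-shelf tools. The sketch below is not the panelists' method; it assumes the librosa library, compares time-averaged log-mel spectra by cosine similarity, and uses hypothetical file paths.

```python
# Hedged sketch: attribute a generated clip to its nearest training example
# by cosine similarity of time-averaged log-mel spectra. Illustrative only;
# real attribution research uses far more robust similarity measures.
import numpy as np
import librosa  # assumed available

def embed(path: str, sr: int = 22_050, n_mels: int = 64) -> np.ndarray:
    y, _ = librosa.load(path, sr=sr, mono=True, duration=30.0)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    v = np.log(mel + 1e-6).mean(axis=1)       # average over time -> fixed size
    return v / (np.linalg.norm(v) + 1e-12)

def attribute(generated: str, training_set: list[str]) -> tuple[str, float]:
    g = embed(generated)
    scores = {path: float(np.dot(g, embed(path))) for path in training_set}
    best = max(scores, key=scores.get)
    return best, scores[best]                  # closest training clip + score

# match, score = attribute("generated.wav", ["train_001.wav", "train_002.wav"])
```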
Speakers
avatar for Gordon Wichern

Gordon Wichern

Senior Principal Research Scientist, Speech and Audio Team, MERL
Audio signal processing and machine learning researcher
avatar for Yuki Mitsufuji

Yuki Mitsufuji

Lead Research Scientist/VP of AI Research, Sony AI
avatar for Philipp Lengeling

Philipp Lengeling

Senior Counsel, RafterMarsh
Philipp G. Lengeling, Mag. iur., LL.M. (New York), Esq. is an attorney based in New York (U.S.A.) and Hamburg (Germany) who heads the New York and Hamburg offices of RafterMarsh, a transatlantic boutique law firm (California, New York, U.K., Germany).
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
1E15

3:15pm EDT

Reporting from the Frontlines of the Recording Industry: Hear from Studio Owners and Chief Engineers on Present / Future Trends, Best Practices, and the State of the Recording Industry
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
Panel Description:
Join us for an in-depth, candid discussion with prominent NYC studio owners on the state of the recording industry today and where it might go tomorrow. This panel will explore trends in studio bookings, predictions for the future, and the evolving landscape of recording studios. Our panelists will share their experiences of owning and managing studios, offering valuable insights for engineers on essential skills and best practices for success in today’s recording environment. We'll also discuss the importance of high-quality recording, and how these practices are crucial to maintaining the soul of music against the backdrop of AI and digital homogenization.

Topics also include:
What are studios looking for from aspiring engineers, and what does it take to work in a major market studio?
Studio best practices from deliverables to client relations
To Atmos or not to Atmos?
Proper crediting for engineers and staff
Working with record labels and film studios
Can recording and mix engineers find the same labor protections as the film industry?

Panelists (in alphabetical order)

Amon Drum, Owner / Engineer, The Bridge Studio (https://www.bridgerecordingstudio.com/)
Amon Drum is a recordist, producer, and the owner, chief engineer, and acoustic designer of The Bridge Studio in Williamsburg, Brooklyn. He has expertise in both analog and digital recording techniques and specializes in recording non-western acoustic instruments, large ensembles, and live music for video. Amon is also a percussionist, having trained with master musicians including M’Bemba Bangora and Mamady Keita in Guinea, West Africa, and he brings this folkloric training to his recordings and productions. He has worked with a wide variety of artists and genres, from Jason Moran to Adi Oasis, Run The Jewels to Il Divo, as well as many Afro-Cuban ensembles from the diaspora.

Ben Kane, Owner, Electric Garden (https://electricgarden.com/)
Ben Kane is a Grammy Award-winning recording and mix engineer, producer, and owner of Electric Garden, a prominent recording studio known for its focus on technical and engineering excellence and a unique handcrafted ambiance. Kane is known for his work with D'Angelo, Emily King, Chris Dave, and PJ Morton.

Shahzad Ismaily, Owner / Engineer, Figure 8 Recording (https://www.figure8recording.com/)
Shahzad Ismaily is a Grammy-nominated multi-instrumentalist, composer, owner and engineer at Figure 8 Recording in Brooklyn. Renowned for his collaborative spirit and eclectic musical range, Shahzad has contributed his unique sonic vision to projects spanning various genres and artists worldwide.

Zukye Ardella, Partner / Engineer, s5studio (https://www.s5studiony.com/)
Zukye is a New York City-based, gold-certified audio engineer, music producer and commercial studio owner. In 2015, she began her professional career at the original s5studio located in Brooklyn, New York. Zukye teamed up with s5studio’s founder (Sonny Carson) to move s5studio to its current location in Chelsea, Manhattan. Over the years, Ardella has been an avid spokesperson for female empowerment organizations such as Women’s Audio Mission and She is the Music. Talents she’s worked with include NeYo, WizKid, Wale, Conway The Machine, A$AP Ferg, Lil Tecca, AZ, Dave East, Phillip Lawrence, Tay Keith, Lola Brooke, Princess Nokia, Vory, RMR, Yeat, DJ KaySlay, Kyrie Irving, Maliibu Mitch, Flipp Dinero, Fred The Godson, Jerry Wonda, ASAP 12vy and more.


--

Moderator:

Mona Kayhan, Owner, The Bridge Studio (https://www.bridgerecordingstudio.com/)
Mona is a talent manager for Grammy Award-winning artists, and consults for music and media companies on marketing and operations. She has experience as a tour manager and an international festival producer, and got her start in NYC working at Putumayo World Music. As an owner of The Bridge Studio, Mona focuses on client relationships, studio operations, and strategic partnerships. She also has an insatiable drive to support and advocate for the recording arts industry.
Tuesday October 8, 2024 3:15pm - 4:15pm EDT
1E08

3:30pm EDT

Leveraging TSN Protocols to Support AES67: Achieving AVB Quality with Layer 3 Benefits
Tuesday October 8, 2024 3:30pm - 3:50pm EDT
This paper investigates using Time-Sensitive Networking (TSN) protocols, particularly from Audio Video Bridging (AVB), to support AES67 audio transport. By leveraging the IEEE 1588 Layer 3 Precision Time Protocol (PTP) Media Profile, packet scheduling, and bandwidth reservation, we demonstrate that AES67 can be transported with AVB-equivalent quality guarantees while benefiting from Layer 3 networking advantages. The evolution of professional audio networking has increased the demand for high-quality, interoperable, and efficiently managed networks. AVB provides robust Layer 2 delivery guarantees but is limited by Layer 2 constraints. AES67 offers Layer 3 interoperability but lacks strict quality of service (QoS) guarantees. This paper proposes combining the strengths of both approaches by using TSN protocols to support AES67, ensuring precise audio transmission with Layer 3 flexibility. TSN extends AVB standards for time synchronization, traffic shaping, and resource reservation, ensuring low latency, low jitter, and minimal packet loss. AES67, a standard for high-performance audio over IP, leverages ubiquitous IP infrastructure for scalability and flexibility but lacks the QoS needed for professional audio. Integrating TSN protocols with AES67 achieves AVB's QoS guarantees in a Layer 3 environment. The IEEE 1588 Layer 3 PTP Media Profile ensures precise synchronization, packet scheduling reduces latency and jitter, and bandwidth reservation prevents congestion. Experiments show that TSN protocols enable AES67 to achieve latency, jitter, and packet loss performance on par with AVB, providing reliable audio transmission suitable for professional applications in modern, scalable networks.
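The paper's testbed is not reproduced here, but the bandwidth-reservation arithmetic it relies on is easy to illustrate. A minimal sketch, assuming AES67 class defaults (48 kHz, 24-bit samples, 1 ms packet time) and ignoring Ethernet framing overhead:

```python
# Back-of-envelope sketch (not from the paper): the bandwidth a TSN
# reservation would need for one AES67 stream with default parameters.
SAMPLE_RATE = 48_000
PACKET_TIME_S = 0.001               # AES67 default packet time
CHANNELS = 8
BYTES_PER_SAMPLE = 3                # L24 payload format
IP_UDP_RTP_OVERHEAD = 20 + 8 + 12   # bytes per packet (IPv4 + UDP + RTP)

samples_per_packet = int(SAMPLE_RATE * PACKET_TIME_S)         # 48
payload = samples_per_packet * CHANNELS * BYTES_PER_SAMPLE    # 1152 bytes
packet_bytes = payload + IP_UDP_RTP_OVERHEAD                  # 1192 bytes
packets_per_second = 1 / PACKET_TIME_S                        # 1000
bandwidth_mbps = packet_bytes * 8 * packets_per_second / 1e6  # ~9.54 Mbit/s

print(f"{packet_bytes} B/packet -> reserve ~{bandwidth_mbps:.2f} Mbit/s")
```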
Moderators
avatar for Rob Maher

Rob Maher

Professor, Montana State University
Audio digital signal processing, audio forensics, music analysis and synthesis.
Speakers
avatar for Nicolas Sturmel

Nicolas Sturmel

Directout GmbH
Authors
Tuesday October 8, 2024 3:30pm - 3:50pm EDT
1E04

3:30pm EDT

A cepstrum analysis approach to perceptual modelling of the precedence effect
Tuesday October 8, 2024 3:30pm - 4:00pm EDT
The precedence effect describes our ability to perceive the spatial characteristics of lead and lag sound signals. When the time delay between the lead and lag is sufficiently small we cease to hear two distinct sounds, instead perceiving the lead and lag as a single fused sound with its own spatial characteristics. Historically, precedence effect models have had difficulty differentiating between lead/lag signals and their fusions. The likelihood of fusion occurring is increased when the signal contains periodicity, such as in the case of music. In this work we present a cepstral-analysis-based perceptual model of the precedence effect, CEPBIMO, which is more resilient to the presence of fusions than its predecessors. To evaluate our model we employ four datasets of various signal types, each containing 10,000 synthetically generated room impulse responses. The results of the CEPBIMO model are then compared against results of the BICAM. Our results show that the CEPBIMO model is more resilient to the presence of fusions and signal periodicity than previous precedence effect models.
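The authors' CEPBIMO implementation is not given in the abstract, but the underlying cue is easy to demonstrate: an attenuated lag adds a periodic ripple to the log spectrum, which the real cepstrum turns into a peak at the lag delay. A minimal sketch, assuming NumPy and a synthetic noise source:

```python
# Hedged sketch of the core idea (not the authors' CEPBIMO code): the real
# cepstrum of a lead+lag mixture peaks at the lag delay.
import numpy as np

def real_cepstrum(x: np.ndarray) -> np.ndarray:
    spectrum = np.fft.rfft(x)
    return np.fft.irfft(np.log(np.abs(spectrum) + 1e-12))

rng = np.random.default_rng(0)
sr = 48_000
lead = rng.standard_normal(sr // 2)      # 0.5 s noise "source"
delay = int(0.004 * sr)                  # 4 ms lag
mix = lead.copy()
mix[delay:] += 0.7 * lead[:-delay]       # add attenuated lag

ceps = real_cepstrum(mix)
peak = np.argmax(ceps[1:sr // 100]) + 1  # search 0-10 ms, skip quefrency 0
print(f"estimated lag: {peak / sr * 1000:.2f} ms")  # ~4 ms
```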
Speakers
avatar for Jeramey Tyler

Jeramey Tyler

Samtec
Jeramey is in the 3rd person. So it goes.
Authors
avatar for Jeramey Tyler

Jeramey Tyler

Samtec
Jeramey is in the 3rd person. So it goes.
Tuesday October 8, 2024 3:30pm - 4:00pm EDT
1E03

3:50pm EDT

Harnessing Diffuse Signal Processing (DiSP) to Mitigate Coherent Interference
Tuesday October 8, 2024 3:50pm - 4:10pm EDT
Coherent sound wave interference is a persistent challenge in live sound reinforcement, where phase differences between multiple loudspeakers lead to destructive interference, resulting in inconsistent audio coverage. This review paper presents a modern solution: Diffuse Signal Processing (DiSP), which utilizes Temporally Diffuse Impulses (TDIs) to mitigate phase cancellation. Unlike traditional methods focused on phase alignment, DiSP manipulates the temporal and spectral characteristics of sound, effectively diffusing coherent wavefronts. TDIs, designed to spread acoustic energy over time, are synthesized and convolved with audio signals to reduce the likelihood of interference. This process maintains the original sound’s perceptual integrity while enhancing spatial consistency, particularly in large-scale sound reinforcement systems. Practical implementation methods are demonstrated, including a MATLAB-based workflow for generating TDIs and optimizing them for specific frequency ranges or acoustic environments. Furthermore, dynamic DiSP is introduced as a method for addressing interference caused by early reflections in small- to medium-sized rooms. This technique adapts TDIs in real-time, ensuring ongoing decorrelation in complex environments. The potential for future developments, such as integrating DiSP with immersive audio systems or creating dedicated hardware for real-time signal processing, is also discussed.
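The paper describes a MATLAB workflow; as a language-neutral illustration, a TDI-style decorrelation impulse can be sketched in a few lines. This is a hedged approximation of the idea described above (decaying noise whose magnitude spectrum is flattened to near-allpass), not the authors' algorithm:

```python
# Hedged sketch of a Temporally Diffuse Impulse (TDI): exponentially decaying
# noise whose magnitude spectrum is flattened, leaving a near-allpass impulse
# that spreads energy over a few milliseconds. Parameters are illustrative.
import numpy as np

def make_tdi(sr: int = 48_000, dur_ms: float = 10.0, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    n = int(sr * dur_ms / 1000)
    envelope = np.exp(-np.arange(n) / (0.002 * sr))   # ~2 ms decay constant
    noise = rng.standard_normal(n) * envelope
    spec = np.fft.rfft(noise)
    flat = np.exp(1j * np.angle(spec))                # keep phase, flat magnitude
    tdi = np.fft.irfft(flat, n)
    return tdi / np.max(np.abs(tdi))

tdi = make_tdi()
# Convolving each loudspeaker feed with its own TDI (e.g. np.convolve(x, tdi))
# decorrelates adjacent speakers that radiate the same program material.
```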
Moderators
avatar for Rob Maher

Rob Maher

Professor, Montana State University
Audio digital signal processing, audio forensics, music analysis and synthesis.
Speakers
TS

Tommy Spurgeon

Physics Student & Undergraduate Researcher, University of South Carolina
Authors
TS

Tommy Spurgeon

Physics Student & Undergraduate Researcher, University of South Carolina
Tuesday October 8, 2024 3:50pm - 4:10pm EDT
1E04

4:00pm EDT

Categorical Perception of Neutral Thirds Within the Musical Context
Tuesday October 8, 2024 4:00pm - 4:30pm EDT
This paper investigates the contextual recognition of neutral thirds in music by integrating real-world musical context into the study of categorical perception. Traditionally, categorical perception has been studied using isolated auditory stimuli in controlled laboratory settings. However, music is typically experienced within a circumstantial framework, significantly influencing its reception. Our study involved musicians from various specializations who listened to precomposed musical fragments, each concluding with a 350-cent interval preceded by different harmonic contexts. The fragments included a monophonic synthesizer and orchestral mockups, with contexts such as major chords, minor chords, a single pitch, neutral thirds, and natural fifths. The results indicate that musical context markedly affects the recognition of pseudotonal chords. Participants' accuracy in judging interval size varied based on the preceding harmonic context. A statistical analysis was conducted to determine if there were significant differences in neutral third perception across the different harmonic contexts. The test led to the rejection of the null hypothesis: the findings underscore the need to consider real-world listening experiences in research on auditory processing and cognition.
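For readers unfamiliar with the stimulus, the 350-cent "neutral third" sits halfway between a minor third (300 cents) and a major third (400 cents); cents convert to a frequency ratio via 2^(cents/1200). A minimal synthesis sketch, assuming NumPy and arbitrary choices of root pitch and duration:

```python
# Illustration of the stimulus arithmetic (not the study's materials):
# synthesize a two-note "neutral third" interval of 350 cents.
import numpy as np

def interval_tone(f0: float, cents: float, sr: int = 48_000, dur: float = 1.0):
    t = np.arange(int(sr * dur)) / sr
    f1 = f0 * 2 ** (cents / 1200)        # upper note, 350 cents above the root
    return np.sin(2 * np.pi * f0 * t) + np.sin(2 * np.pi * f1 * t)

tone = interval_tone(440.0, 350.0)
print(f"upper note: {440.0 * 2 ** (350 / 1200):.2f} Hz")  # ~538.58 Hz
```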
Tuesday October 8, 2024 4:00pm - 4:30pm EDT
1E03

4:00pm EDT

Revolutionizing Loudspeaker Design with Ultra Thin Glass Technology
Tuesday October 8, 2024 4:00pm - 4:45pm EDT
Discover the groundbreaking potential of our latest Ultra Thin Glass (UTG) technology, engineered for speaker diaphragm applications. This innovation marries toughness, flexibility, and a distinctive finish with full ESG compliance. With thicknesses as fine as 25μm, UTG seamlessly integrates into earphones, headphones, micro-speakers, and a wide range of loudspeakers. Join us to explore how this cutting-edge material can elevate your audio designs to new heights.
Speakers
KC

Kwunkit Chan

Glass Acoustic Innovations Co., LTD
Tuesday October 8, 2024 4:00pm - 4:45pm EDT
3D04 (AES and AIMS Alliance Program Room)

4:30pm EDT

Mastering for Vinyl
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
Women's Audio Mission presents a panel discussion with award-winning mastering engineers and vinyl cutters who will talk about the intricacies of the vinyl format and the strategies used in mastering modern music for vinyl, including how to address factors such as sibilance, managing low-frequency stereo information, sequencing strategies, program length and volume, etc.

Moderator: Terri Winston

Panelists:

Kim Rosen - Kim Rosen is a GRAMMY-nominated mastering engineer who works out of her own “Knack Mastering” located in Ringwood, NJ, USA. Recordings Kim has mastered have garnered twenty-six Grammy nominations, with five wins across multiple categories, including Bonnie Raitt’s Song of the Year. She has mastered projects for Wynonna Judd, Johnny Cash, Aimee Mann, The Milk Carton Kids, Allison Russell, Superdrag, and Flogging Molly.

Margaret Luthar - Margaret Luthar is a Grammy-nominated mastering engineer who has worked at Chicago Mastering Service (where she learned to cut vinyl) and at Welcome to 1979 in Nashville. She is currently a Broadcast Recording Technician at NPR and a freelance mastering engineer based in Los Angeles. She has mastered thousands of projects, including work for The Lumineers, Tinashe, Spiritualized, and Soccer Mommy, and has significant experience in audio restoration from her work at the Norwegian Institute of Recorded Sound.

Piper Payne - Piper Payne is a mastering engineer based in Nashville, TN. Piper has mastered a wide variety of music by nationally renowned artists such as Janis Ian (Best Folk Album Grammy nomination, 2023), Dolly Parton, Third Eye Blind, LeAnn Rimes, The Go-Go’s, Madame Gandhi, and many more. After moving her mastering studio from the San Francisco Bay Area to Nashville, she ventured into vinyl manufacturing, opening Physical Music Products.

Moderated by Terri Winston, Executive Director of Women's Audio Mission. Women's Audio Mission (WAM) is dedicated to closing the chronic gender gap in the music, recording and technology industries. Less than 5% of the people creating and shaping the sounds that make up the soundtrack of our lives are women and gender-expansive people. WAM is “changing the face of sound” by providing training, mentoring and career advancement that inspires the next generation of women and gender-expansive music producers, recording engineers and audio technologists and radically changes the narrative in these industries.
Speakers
avatar for Terri Winston

Terri Winston

Executive Director, Women's Audio Mission
Women's Audio Mission (WAM) is dedicated to closing the chronic gender gap in the music, recording and technology industries. Less than 5% of the people creating and shaping the sounds that make up the soundtrack of our lives are women and gender-expansive people. WAM is “changing... Read More →
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
1E07

4:30pm EDT

ADM-OSC 1.0
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
ADM-OSC is an industry initiative aimed at standardising Object-Based Audio (OBA) positioning data by implementing the Audio Definition Model (ADM) over Open Sound Control (OSC). As immersive audio gains traction across various industries, from music streaming to gaming, and from live sound to broadcasting, the Audio Definition Model (ADM) is becoming a popular standard for metadata. This includes Serial ADM for broadcast and ADM BWF or XML files for studio use.

The ADM-OSC workgroup was formed four years ago to bridge the gap between immersive live and studio ecosystems. It now includes leading developers and manufacturers who aim to facilitate the sharing of audio object metadata across different environments, from studios to broadcasts to live performances.

Since its initial draft implementation, ADM-OSC has been supported by various audio industry tools, including live rendering engines, digital audio workstations (DAWs), controllers, live tracking systems, and media server solutions. It is currently being deployed in both live and studio productions, with increasing interest from technology developers wanting to join and implement this standard.

ADM-OSC 1.0 is now the published specification, aiming to provide a basic interoperability layer between Object Editors and Object Renderers.

This presentation and workshop will take a deep dive into ADM-OSC and will cover:
- The origins of ADM-OSC
- Presentation of the ADM-OSC 1.0 specification
- Use case/application demonstrations:
  - DAW object positional data to external rendering engine(s)
  - Controllers’ data to an object panner in a DAW for automation recording
  - Live tracking (actors, artists) positional data to live rendering engine(s)
- Plugin fest 2023 and 2024 report
- Future considerations
- Application-specific subgroups such as broadcast, VR/gaming, live rendering, and show control
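For a concrete feel of the interoperability layer, an Object Editor can transmit positions with any OSC library. The sketch below assumes the python-osc package and a renderer listening on a hypothetical host/port; the /adm/obj/<n>/... address space follows the published ADM-OSC specification, but exact ranges and sign conventions should be checked against the 1.0 document:

```python
# Hedged sketch: send ADM-OSC object positions from an Object Editor to an
# Object Renderer. Host, port, and object index are assumptions.
from pythonosc.udp_client import SimpleUDPClient

client = SimpleUDPClient("127.0.0.1", 4001)   # renderer address (assumed)

# Move object 1 to 30 degrees azimuth, 10 degrees elevation, 2 m distance.
client.send_message("/adm/obj/1/azim", 30.0)  # degrees
client.send_message("/adm/obj/1/elev", 10.0)  # degrees
client.send_message("/adm/obj/1/dist", 2.0)   # metres
```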
Speakers
avatar for Michael Zbyszynski

Michael Zbyszynski

Software Development Engineer, L-Acoustics
Michael Zbyszyński is a musician, researcher, teacher and developer in the field of contemporary electroacoustic music. He is currently part of the Creative Technologies R&D group at L-Acoustics. As a musician, his work spans from brass bands to symphony orchestras, including composition... Read More →
avatar for Hugo Larin

Hugo Larin

Senior Mgr. Business Development | FLUX:: GPLM, Harman International
Hugo Larin is a key collaborator on the FLUX:: SPAT Revolution project and has deep roots in audio mixing, design and operation, as well as in networked control and data distribution. He leads FLUX:: business development at HARMAN. His recent involvements and interests include object-based spatial audio mixing workflows, interoperability... Read More →
avatar for Mathieu Delquignies

Mathieu Delquignies

Education & Application Support France, d&b audiotechnik
Mathieu holds a Bachelor's degree in applied physics from Paris 7 University and a Master's degree in sound engineering from ENS Louis Lumière (2003). He has years of diverse international freelance experience in mixing and system design, as well as in loudspeakers, amplifiers, DSP... Read More →
avatar for Lucas Zwicker

Lucas Zwicker

Senior Director, Workflow and Integration, CTO Office, Lawo AG
Lucas joined Lawo in 2014, having previously worked as a freelancer in the live sound and entertainment industry for several years. He holds a degree in event technology and a Bachelor of Engineering in electrical engineering and information technology from the University of Applied... Read More →
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
1E15

4:30pm EDT

Upmix and Format Conversion of Multichannel Audio: An Opportunity for AI-Based Breakthroughs?
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
The demand for the production, distribution and playback of audio content in surround or immersive multichannel formats and flexible playback setups has continued to grow in the cinema, broadcast and music industries. This trend motivates the continued development of multichannel audio signal processing methods for converting recordings between different multichannel and spatial audio formats or layouts, including down- or up-mixing. These operations leverage a well-researched frequency-domain processing framework, nowadays commonly referred to as parametric spatial audio signal processing, to solve challenges including the primary-ambient decomposition of audio recordings. Bringing ML/AI techniques to bear on the development of novel spatial audio analysis/synthesis methods can unlock practical applications with transformative impact. In this workshop, we will review and discuss the fundamentals of and recent progress in these methods.
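As one concrete example of the parametric framework mentioned above, primary-ambient decomposition can be prototyped from inter-channel coherence in the STFT domain. The sketch below is a textbook-style illustration, assuming NumPy/SciPy, a simple moving-average smoother, and soft coherence masks; it is not any panelist's production method:

```python
# Hedged sketch: primary-ambient decomposition of a stereo signal via
# time-smoothed inter-channel coherence in the STFT domain.
import numpy as np
from scipy.signal import stft, istft

def _tsmooth(x: np.ndarray, n: int = 8) -> np.ndarray:
    # Moving average over time frames (axis 1); works for complex arrays.
    kernel = np.ones(n) / n
    return np.stack([np.convolve(row, kernel, mode="same") for row in x])

def primary_ambient(left, right, sr, nperseg=2048):
    _, _, L = stft(left, fs=sr, nperseg=nperseg)
    _, _, R = stft(right, fs=sr, nperseg=nperseg)
    eps = 1e-12
    # Coherence in [0, 1]: high where both channels carry the same
    # (primary) content, low for diffuse ambience.
    coh = np.abs(_tsmooth(L * np.conj(R))) / np.sqrt(
        _tsmooth(np.abs(L) ** 2) * _tsmooth(np.abs(R) ** 2) + eps)
    _, primary = istft(coh * L, fs=sr, nperseg=nperseg)
    _, ambient = istft((1.0 - coh) * L, fs=sr, nperseg=nperseg)
    return primary, ambient  # left-channel estimates; repeat with R for right
```

In an upmix, the ambient estimate would typically feed the surround/height channels while the primary component stays in the frontal image.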
Speakers
avatar for Jean-Marc Jot

Jean-Marc Jot

Founder and Principal, Virtuel Works LLC
Spatial audio and music technology expert and innovator. Virtuel Works provides audio technology strategy, IP creation and licensing services to help accelerate the development of audio and music spatial computing technology and interoperability solutions.
avatar for Gordon Wichern

Gordon Wichern

Senior Principal Research Scientist, Speech and Audio Team, MERL
Audio signal processing and machine learning researcher
avatar for Sunil Bharitkar

Sunil Bharitkar

Samsung Research America
avatar for Carlos Freitas

Carlos Freitas

CMO - Mastering Engineer, Spatial9
Nine Latin Grammy nominations in the Best Engineering category; mastering engineer on 34 Grammy Award-winning records, with more than 100 nominations. The albums include artists such as Nathan East, Andres Cepeda, Ines Gaviria, João Gilberto, Caetano Veloso, Gilberto Gil... Read More →
avatar for Alan Silva

Alan Silva

CTO, Spatial9
Alan Silva is a seasoned researcher and engineer who focuses on leveraging machine learning algorithms and distributed computing to create innovative solutions. With a solid commitment to open-source development, Alan actively contributes to collaborative projects that drive... Read More →
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
1E06

4:30pm EDT

Psychoacoustics for Immersive Productions
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
3D audio has enormous potential to touch the audience emotionally: the strongest effect occurs when the auditory system is given the illusion of being in a natural environment. When this happens with impressive music, everyone gets goosebumps. Psychoacoustics forms the basis for remarkable results in music productions.
In the first part, Konrad Strauss explains the basics of psychoacoustics in the context of music production:
• How immersive differs from stereo
• Sound localization and perception
• The eye/ear/brain link
• Implications for recording and mixing in immersive
• Transitioning from stereo to immersive: center speaker, LFE, working with the diffuse surround field and height channels.
In the second part, Lasse Nipkow introduces the quasi-binaural spot miking technique he uses to capture the beautiful sound of acoustic instruments during his recordings. He explains the strategy for microphone placement and shows, using stereo and 3D audio sound examples, the potential of these signals for immersive productions.
This first contribution is linked to a second, subsequent presentation by Lasse Nipkow and Ronald Prent: ‘Tools for Impressive Immersive Productions’.
Speakers
avatar for Lasse Nipkow

Lasse Nipkow

CEO, Silent Work LLC
Since 2010, Lasse Nipkow has been a renowned keynote speaker in the field of 3D audio music production. His expertise spans from seminars to conferences, both online and offline, and has gained significant popularity. As one of the leading experts in Europe, he provides comprehensive... Read More →
avatar for Konrad Strauss

Konrad Strauss

Professor, Indiana University Jacobs School of Music
Konrad Strauss is a Professor of Music in the Department of Audio Engineering and Sound Production at Indiana University’s Jacobs School of Music. He served as department chair and director of Recording Arts from 2001 to 2022. Prior to joining the faculty of IU, he worked as an... Read More →
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
1E16

4:30pm EDT

Morten Lindberg: Enveloping Masterclass
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
Morten details 3D music recordings from his 2L catalogue, including high-resolution listening examples from new releases.

This masterclass series, featuring remarkable recording artists, is a chance to hear 3D audio at its best as we discuss the factors of production, distribution and reproduction that make it worth the effort. Thomas exemplifies the underrated quale of auditory envelopment (AE), and we evaluate how robustly AE latent in the content may be heard across this particular listening room.

Seats are limited to keep playback variation at bay, and the session is concluded with Q&A. If you attend multiple masterclasses, consider choosing different seats.
Speakers
avatar for Thomas Lund

Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund has authored papers on human perception, spatialisation, loudness, sound exposure and true-peak level. He is researcher at Genelec, and convenor of a working group on hearing health under the European Commission. Out of a medical background, Thomas previously served in... Read More →
avatar for Morten Lindberg

Morten Lindberg

Producer and Engineer, 2L (Lindberg Lyd, Norway)
Recording Producer and Balance Engineer with 43 GRAMMY nominations, 35 of these in the craft categories Best Engineered Album, Best Surround Sound Album, Best Immersive Audio Album and Producer of the Year. Founder and CEO of the record label 2L. Grammy Award winner 2020.
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
3D06

4:30pm EDT

How Do We Embrace AI and Make It Work For Us in Audio? Hey AI, Get Yer Stinkin’ Hands Offa My Job! AES and SMPTE Joint Exploration
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
In this timely panel, let’s discuss some of the pressing matters AI confronts us with in audio, and how we can turn a perceived foe into an ally.
We’ll discuss issues including:
During production, AI cannot deal with any artistic issues related to production and engineering, most of which depend on personal interaction as well as perception.

In post-production, AI could be of use in repetitive tasks: making a track conform to a click track while maintaining proper pitch, performing pitch correction on a track, dealing with extraneous clicks (without removing important vocal consonants), and performing ambience matching (particularly on live recordings), to name a few. Can AI running in the background on our DAW build macros for us?

The more we can use it as a tool for creativity and to enhance our revenue streams, the more it becomes a practical, positive approach. Many composers are using it to create or enhance their musical ideas almost instantaneously. The key here is that they are our ideas that AI adds to.

How do we embrace AI, adapt to it, and help it to adapt to us? Can we get to the point where we incorporate it as we have past innovations rather than fear it? How do we take control of AI instead of AI taking control of us?

What should we, the audio community, be asking AI to do?
Speakers
avatar for Gary Gottlieb

Gary Gottlieb

AES President-Elect, Mendocino College
President-Elect, Co-Chair of the Events Coordination Committee, Chair of the Conference Policy Committee, and former Vice President of the Eastern Region, US and Canada; AES Fellow, Engineer, Author, Educator and Guest Speaker Gary Gottlieb refers to himself as a music generalist... Read More →
avatar for Lenise Bent

Lenise Bent

Producer/Engineer/Editor/AES Governor, Soundflo Productions
Audio Recording History, Women and Diversity in Audio, Analog Tape Recording, Post Production/Sound Design/Foley, Vinyl Records, Audio Recording Archiving, Repair and Preservation, Basic and Essential Recording Techniques, Opportunities in the Audio Industry, Audio Adventurers
avatar for Franco Caspe

Franco Caspe

Student, Queen Mary University of London
I’m an electronic engineer, a maker, hobbyist musician and a PhD Student at the Artificial Intelligence and Music CDT at Queen Mary University of London. I have experience in development of real-time systems for applications such as communication, neural network inference, and DSP... Read More →
avatar for Soumya Sai Vanka

Soumya Sai Vanka

PhD Researcher, Queen Mary University of London
I am a doctoral researcher at the Centre for Digital Music, Queen Mary University of London, under the AI and Music Centre for Doctoral Training Program. My research focuses on the design of user-centric, context-aware, AI-based tools for music production. As a hobbyist musician and producer myself, I am interested in developing tools that can support creativity and collaboration, resulting in emergence and novelty. I am also interested... Read More →
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
1E08

4:30pm EDT

Sonic Mastery - A Room of One's Own
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
This panel brings together a group of audio engineers to explore their experiences of navigating and shaping the audio industry as Black women. As they delve into the creation and significance of safe spaces, the panelists will discuss the importance of community in a field where representation is often lacking. Through sharing stories of unique and impactful projects, the conversation will highlight the creativity and resilience required to thrive in a male-dominated industry. Additionally, we will celebrate the legacy of unsung women producers and their groundbreaking contributions to music and production. This session is a discussion by and tribute to the talent and influence of Black women in audio engineering, both past and present.
Speakers
avatar for Eve Horne

Eve Horne

Peak Music
avatar for Leslie Gaston-Bird

Leslie Gaston-Bird

Owner, Mix Messiah Productions
Leslie Gaston-Bird (AMPS, MPSE) is author of the book "Women in Audio", part of the AES Presents series and published by Focal Press (Routledge). She is a voting member of the Recording Academy (The Grammys®) and its P&E (Producers and Engineers) Wing. Currently, she is a freelance... Read More →
avatar for Gloria Kaba

Gloria Kaba

Studio Manager, Power Station at BerkleeNYC
Gloria Kaba is a Ghanaian-American sound engineer, producer, mixer, and writer with over a decade of experience in the studio, often operating under the moniker Redsoul. She’s worked on A Tribe Called Quest’s final album We Got It From Here...Thank You For Your Service and Solange’s... Read More →
avatar for Ebonie Smith

Ebonie Smith

Ebonie Smith is a celebrated music producer, audio engineer, and singer-songwriter, based in the vibrant hub of Los Angeles. As a prominent figure in the industry, she currently holds the esteemed roles of senior audio engineer and producer at Atlantic Records. Ebonie's remarkable... Read More →
Tuesday October 8, 2024 4:30pm - 5:30pm EDT
Stage

5:00pm EDT

Student Mixer at Telefunken Booth
Tuesday October 8, 2024 5:00pm - 6:00pm EDT
Open to anyone with a student badge.
Tuesday October 8, 2024 5:00pm - 6:00pm EDT
Booth 335

6:00pm EDT

Tech Tour: The Sonic Room at Amazon Studios
Tuesday October 8, 2024 6:00pm - 8:00pm EDT
All Tech Tours are full. Know Before You Go emails were sent to accepted Tech Tour registrants with information about the tour.  

Duration: 2 hours
Capacity: 50 (Once capacity is reached, registrants will be added to a wait list)

Tour Description
Come check out the new Sonic Studio126 at Amazon Music in Brooklyn and meet multi-Grammy Award-winning producer and engineer Mr Sonic. Take a tour of the various studios, writer's rooms and event spaces at Amazon Music's Brooklyn location, and finish the evening with a mixer to meet and network with other creatives and professionals.

Arrival Instructions:  Arrive at the South Lobby of 25 Kent Avenue, Brooklyn, NY, and check in by presenting your full name and ID at the front desk of security. You will be heading to the 7th floor (Amazon Music) for the event.
Tuesday October 8, 2024 6:00pm - 8:00pm EDT
Offsite
 