Music's remarkable ability to evoke emotions has been the focus of numerous scientific studies, with researchers testing how different musical structures or interpretations impact the emotions induced in the listener. However, in the context of amplified music, little is known about the influence of the sound reinforcement system. In this study, we investigate whether the amount of low-frequency amplification produced by a sound system impacts the listener's arousal. We organized two listening experiments in which we measured the skin conductance of participants while they listened to music excerpts with different levels of low-frequency amplification. Our results indicate that an increase in the level of bass is associated with a small but measurable rise in electrodermal activity, a physiological correlate of arousal. In addition, this effect appears to depend on the nature of the music.
Thomas Mouterde is a field application research engineer at L-Acoustics, a French manufacturer of loudspeakers, amplifiers, and signal processing devices. He is a member of the “Education and Scientific Outreach” department that aims at developing the education program of the... Read More →
We present an analysis of a dataset of audio metrics and aesthetic considerations about mixes and masters provided by the web platform MixCheck studio. The platform is designed for educational purposes, primarily targeting amateur music producers, and analyses their recordings prior to release. The analysis focuses on the following data points: integrated loudness, mono compatibility, presence of clipping and phase issues, compression, and tonal profile across 30 user-specified genres. Both mixed (mixes) and mastered audio (masters) are included in the analysis, where mixes refer to the initial combination and balance of individual tracks, and masters refer to the final refined version optimized for distribution. Results show that loudness-related issues, along with dynamics issues, are the most prevalent, particularly in mastered audio. However, mastered audio presents better results in compression than audio that has only been mixed. Additionally, results show that mastered audio has a lower percentage of stereo-field and phase issues.
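As an illustration of one of the checks described above, a minimal clipping detector might flag runs of consecutive samples at digital full scale (a hypothetical sketch; the platform's actual detection logic is not public):

```python
def count_clipped_runs(samples, threshold=0.999, min_run=3):
    """Count runs of >= min_run consecutive samples at or near full scale."""
    runs, current = 0, 0
    for s in samples:
        if abs(s) >= threshold:
            current += 1
        else:
            if current >= min_run:
                runs += 1
            current = 0
    if current >= min_run:   # don't miss a run that ends the buffer
        runs += 1
    return runs

clean   = [0.5, -0.4, 0.6, -0.7, 0.2]
clipped = [0.5, 1.0, 1.0, 1.0, -0.2, 1.0, 1.0, 1.0, 1.0, 0.1]
print(count_clipped_runs(clean), count_clipped_runs(clipped))  # 0 2
```

Requiring a minimum run length distinguishes genuine waveform clipping from a single sample that merely touches full scale.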
A sound reinforcement system typically combines a full-range system with a subwoofer system to deliver a consistent frequency bandwidth. The two systems must be time-aligned, which is usually done without an audience. This paper investigates the impact of the audience on the time alignment of loudspeaker systems at low frequencies. The study demonstrates, through on-site measurements and simulations, that the audience significantly affects sound propagation. The research highlights the greater phase shift observed with ground-stacked subwoofers compared to flown systems in the presence of an audience, requiring the system time alignment to be adjusted with the audience in place when flown and ground-stacked sources are used together. Moreover, in this case, the results show that summation quality remains degraded with the audience present even after the alignment adjustment. Lastly, recommendations for system design and calibration are proposed.
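As a rough illustration of why audience-induced delay changes degrade summation (toy numbers, not from the paper): a delay offset Δt between two sources produces a phase offset of 360·f·Δt degrees, and two equal-amplitude coherent sources sum to 20·log10|1 + e^(jφ)| dB relative to one source alone:

```python
import cmath, math

def phase_shift_deg(freq_hz, delay_s):
    """Phase offset produced by a time offset at a given frequency."""
    return 360.0 * freq_hz * delay_s

def summation_db(phase_deg):
    """Level of the sum of two equal-amplitude coherent sources,
    relative to one source alone, for a given phase offset."""
    return 20 * math.log10(abs(1 + cmath.exp(1j * math.radians(phase_deg))))

# A 1 ms change in effective propagation delay:
print(phase_shift_deg(80, 0.001))    # 28.8 degrees at 80 Hz
print(round(summation_db(0), 2))     # 6.02 dB gain with perfect alignment
print(round(summation_db(120), 2))   # 0.0 dB: no summation benefit left
```

At 120° of offset the pair is no louder than a single source, which is why alignment in the sub/main crossover region is sensitive to even millisecond-scale propagation changes.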
Thomas Mouterde is a field application research engineer at L-Acoustics, a French manufacturer of loudspeakers, amplifiers, and signal processing devices. He is a member of the “Education and Scientific Outreach” department that aims at developing the education program of the... Read More →
We detail a real-time application of active acoustics used to create a shared virtual environment over a closed audio network as a research-creation project exploring the concept of room participation in musical performance. As part of a concert given in the Immersive Media Lab at McGill University, musicians and audience members were located in a virtual acoustic environment while a second audience was located in an adjacent but acoustically isolated space on the same audio network. Overall, the blending of computer-generated and acoustic sources created a specific use case for virtual acoustics while the immersive capture and distribution method examined an avenue for producing a real-time shared experience. Future work in this area includes audio networks with multiple virtual acoustic environments.
Ying-Ying Zhang is a music technology researcher and sound engineer. She is currently a PhD candidate at McGill University in the Sound Recording program where her research focuses on musician-centered virtual acoustic applications in recording environments. She received her Masters... Read More →
Richard King is an educator, researcher, and Grammy Award-winning recording engineer. Richard has garnered Grammy Awards in various fields, including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School... Read More →
A listener's impression of aurally “seeing” the size of a performing entity is crucial to the success of both a concert hall and a reproduced sound field. Previous studies have examined how lateral reflections in different concert halls affect apparent source width. Yet the perceptual effects of different source distributions, captured with different recording techniques, on apparent source width are not well understood. This study explores how listeners perceive the width of an orchestra using four stereo recording techniques, one binaural recording technique, and three wave field synthesis ensemble settings. Subjective experiments were conducted using stereo loudspeakers and headphones to play back the recorded clips, asking listeners to rate the perceived width of the sound source. Results show that recording techniques greatly influence how wide an orchestra is perceived to be. The primary mechanism used to judge auditory spatial impression differs between stereo loudspeaker and headphone listening. When a western classical symphony is recorded and reproduced by two-channel stereophony, changes in instrument positions that increase or reduce the physical source width do not lead to a corresponding increase or reduction in the spatial impression of the performing entity.
Fourier theory is ubiquitous in modern audio signal processing. However, this framework is often at odds with our intuitions about audio signals. Strictly speaking, Fourier theory is ideal for analyzing periodic behavior, but when periodicities change across time it is easy to misinterpret its results. Of course, we have developed strategies around this, like the Short-Time Fourier Transform, yet our interpretations often go beyond what the theory really says. This paper follows the exact theoretical description, showing examples where our interpretation of the data is incorrect. Furthermore, it shows specific instances where we make incorrect decisions based on this seemingly paradoxical framework.
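A classic instance of the gap between theory and intuition is spectral leakage: the DFT describes the windowed signal as exactly periodic, so a sinusoid that does not complete an integer number of cycles smears across many bins (a self-contained sketch, not an example taken from the paper):

```python
import math

def dft_mag(x):
    """Magnitude of the naive DFT of a real sequence."""
    N = len(x)
    return [abs(sum(x[n] * complex(math.cos(2 * math.pi * k * n / N),
                                   -math.sin(2 * math.pi * k * n / N))
                    for n in range(N))) for k in range(N)]

def significant_bins(mag, rel=0.01):
    """Count bins holding more than `rel` of the peak magnitude."""
    peak = max(mag)
    return sum(1 for m in mag if m > rel * peak)

N = 64
periodic  = [math.sin(2 * math.pi * 10   * n / N) for n in range(N)]  # integer cycles
aperiodic = [math.sin(2 * math.pi * 10.5 * n / N) for n in range(N)]  # half a cycle short

print(significant_bins(dft_mag(periodic)))   # 2: bin 10 and its mirror image
print(significant_bins(dft_mag(aperiodic)))  # far more: energy leaks everywhere
```

Both inputs are "one sinusoid" to the ear, yet the second spectrum suggests broadband content, which is exactly the kind of misreading the paper warns about.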
I am currently a PhD Candidate in Music Technology at NYU, based at NYUAD as part of the Global Fellowship program. As a professional musician, my expertise lies in audio engineering, and I hold a master's degree in Music, Science, and Technology from the prestigious... Read More →
The nominal audio level is the level at which developers of professional analog equipment design their units to perform optimally. Audio levels above the nominal level will at some point lead to increased harmonic distortion and eventually clipping. DSP plugins emulating such nonlinear behavior must, in the same manner as analog equipment, align to a nominal level that is simulated within the digital environment. A listening test was designed to investigate whether, and to what extent, misalignments between the audio level and the simulated nominal level in analog-modeled DSP plugins are audible, thus affecting the outcome depending on the chosen recording level. The results of this study indicate that harmonic distortion in analog-modeled DSP plugins may become audible as the recording level increases. However, for the plugins included in this study, the immediate consequence of the added harmonics is not critical and, in most cases, not noticed by the listener.
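A minimal sketch of the effect under study, assuming a generic tanh soft-clipper rather than any specific plugin's model: driving a sinusoid harder into the nonlinearity raises the level of the added harmonics.

```python
import math

def harmonic_level(drive, harmonic, N=2048):
    """Fourier-project tanh(drive * sin t) onto sin(harmonic * t)
    and return the magnitude of that harmonic's coefficient."""
    acc = 0.0
    for n in range(N):
        t = 2 * math.pi * n / N
        acc += math.tanh(drive * math.sin(t)) * math.sin(harmonic * t)
    return abs(2 * acc / N)

low  = harmonic_level(0.1, 3)   # input well below the simulated nominal level
high = harmonic_level(3.0, 3)   # hot input driving the nonlinearity
print(low, high)                # third harmonic grows by orders of magnitude
```

At low drive the third harmonic sits far below audibility, while a hot input pushes the same circuit model toward square-wave-like distortion, mirroring the level-dependence the listening test probes.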
Since the inception of electrical recording for phonograph records in 1924, records have been intentionally cut with a non-uniform frequency response to maximize the information density on a disc and to improve the signal-to-noise ratio. To reproduce a nominally flat signal within the available bandwidth, the effects of this cutting curve must be undone by applying an inverse curve on playback. Until 1953, with the introduction of what has become known as the RIAA curve, the playback curve required for any particular disc could vary by record company and over time. As a consequence, anyone seeking to hear or restore the information on a disc must have access to equipment that is capable of implementing multiple playback equalizations. This correction may be accomplished with either analog hardware or digital processing. The digital approach has the advantages of reduced cost and expanded versatility, but requires a transformation from continuous time, where the original curves are defined, to discrete time. This transformation inevitably comes with some deviations from the continuous-time response near the Nyquist frequency. There are many established methods for discretizing continuous-time filters, and these vary in performance, computational cost, and inherent latency. In this work, several methods for performing this transformation are explored in the context of phonograph playback equalization, and the performance of each approach is quantified. This work is intended as a resource for anyone developing systems for digital playback equalization or similar applications that require approximating the response of a continuous-time filter digitally.
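A minimal sketch of one established discretization method, the bilinear transform, applied to the standard RIAA playback curve (time constants 3180 µs, 318 µs, 75 µs). Evaluating the discretized response directly shows high accuracy at low frequencies and a growing deviation toward the Nyquist frequency:

```python
import cmath, math

T1, T2, T3 = 3180e-6, 318e-6, 75e-6   # standard RIAA time constants (s)

def analog_riaa_playback(s):
    """Continuous-time RIAA playback EQ (unnormalized)."""
    return (1 + s * T2) / ((1 + s * T1) * (1 + s * T3))

def digital_response(f, fs):
    """Bilinear-transformed response at frequency f (Hz): substitute
    s = 2*fs*(1 - z^-1)/(1 + z^-1) and evaluate on the unit circle."""
    z = cmath.exp(2j * math.pi * f / fs)
    s = 2 * fs * (1 - 1 / z) / (1 + 1 / z)
    return analog_riaa_playback(s)

fs = 96000
ref = abs(digital_response(1000, fs))   # normalize to 0 dB at 1 kHz
for f in (20, 1000, 20000):
    db = 20 * math.log10(abs(digital_response(f, fs)) / ref)
    print(f"{f:>5} Hz: {db:+.2f} dB")
```

At 20 Hz the result matches the exact continuous-time value (about +19.3 dB re 1 kHz), while at 20 kHz the bilinear frequency warping pulls the response below the exact −19.6 dB, illustrating the near-Nyquist deviations the paper quantifies across methods.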
This paper investigates using Time-Sensitive Networking (TSN) protocols, particularly from Audio Video Bridging (AVB), to support AES67 audio transport. By leveraging the IEEE 1588 Precision Time Protocol (PTP) Media Profile over Layer 3, packet scheduling, and bandwidth reservation, we demonstrate that AES67 can be transported with AVB-equivalent quality guarantees while benefiting from Layer 3 networking advantages. The evolution of professional audio networking has increased the demand for high-quality, interoperable, and efficiently managed networks. AVB provides robust Layer 2 delivery guarantees but is limited by Layer 2 constraints. AES67 offers Layer 3 interoperability but lacks strict quality of service (QoS) guarantees. This paper proposes combining the strengths of both approaches by using TSN protocols to support AES67, ensuring precise audio transmission with Layer 3 flexibility. TSN extends AVB standards for time synchronization, traffic shaping, and resource reservation, ensuring low latency, low jitter, and minimal packet loss. AES67, a standard for high-performance audio over IP, leverages ubiquitous IP infrastructure for scalability and flexibility but lacks the QoS needed for professional audio. Integrating TSN protocols with AES67 achieves AVB's QoS guarantees in a Layer 3 environment. The IEEE 1588 PTP Media Profile over Layer 3 ensures precise synchronization, packet scheduling reduces latency and jitter, and bandwidth reservation prevents congestion. Experiments show that TSN protocols enable AES67 to achieve latency, jitter, and packet loss performance on par with AVB, providing reliable audio transmission suitable for professional applications in modern, scalable networks.
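The bandwidth-reservation arithmetic for a typical AES67 stream is straightforward; the sketch below assumes L24 (24-bit) linear PCM at 48 kHz with a 1 ms packet time, which are common AES67 defaults, and RTP over UDP over IPv4 framing:

```python
def aes67_stream(channels, fs=48000, bits=24, packet_time_s=0.001):
    """Payload size, packet rate, and on-the-wire bitrate of an AES67 flow."""
    samples_per_packet = int(fs * packet_time_s)          # 48 at defaults
    payload = samples_per_packet * channels * bits // 8   # RTP payload, bytes
    headers = 12 + 8 + 20                                 # RTP + UDP + IPv4
    pps = int(1 / packet_time_s)                          # packets per second
    wire_bps = (payload + headers) * pps * 8
    return payload, pps, wire_bps

payload, pps, bps = aes67_stream(channels=8)
print(payload, pps, bps)   # 1152 bytes, 1000 packets/s, 9536000 bit/s
```

Figures like these feed directly into the bandwidth reservation that TSN traffic shaping enforces.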
Coherent sound wave interference is a persistent challenge in live sound reinforcement, where phase differences between multiple loudspeakers lead to destructive interference, resulting in inconsistent audio coverage. This review paper presents a modern solution: Diffuse Signal Processing (DiSP), which utilizes Temporally Diffuse Impulses (TDIs) to mitigate phase cancellation. Unlike traditional methods focused on phase alignment, DiSP manipulates the temporal and spectral characteristics of sound, effectively diffusing coherent wavefronts. TDIs, designed to spread acoustic energy over time, are synthesized and convolved with audio signals to reduce the likelihood of interference. This process maintains the original sound’s perceptual integrity while enhancing spatial consistency, particularly in large-scale sound reinforcement systems. Practical implementation methods are demonstrated, including a MATLAB-based workflow for generating TDIs and optimizing them for specific frequency ranges or acoustic environments. Furthermore, dynamic DiSP is introduced as a method for addressing interference caused by early reflections in small-to-medium sized rooms. This technique adapts TDIs in real-time, ensuring ongoing decorrelation in complex environments. The potential for future developments, such as integrating DiSP with immersive audio systems or creating dedicated hardware for real-time signal processing, is also discussed.
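A simplified sketch of the underlying idea, assuming a generic decaying-noise impulse rather than the paper's MATLAB TDI synthesis: convolving two loudspeaker feeds with different unit-energy diffuse impulses leaves each feed's energy intact while decorrelating the pair, so their wavefronts no longer interfere coherently.

```python
import random, math

random.seed(1)

def make_tdi(length=128, decay=0.01):
    """Toy temporally diffuse impulse: random-polarity taps under an
    exponential decay envelope, normalized to unit energy."""
    taps = [random.choice((-1.0, 1.0)) * math.exp(-decay * n)
            for n in range(length)]
    norm = math.sqrt(sum(t * t for t in taps))
    return [t / norm for t in taps]

def convolve(x, h):
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def corrcoef(a, b):
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = math.sqrt(sum((x - ma) ** 2 for x in a))
    vb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (va * vb)

x  = [random.gauss(0.0, 1.0) for _ in range(256)]   # broadband program signal
y1 = convolve(x, make_tdi())                        # feed for loudspeaker 1
y2 = convolve(x, make_tdi())                        # a different TDI for 2
print(abs(corrcoef(y1, y2)))   # well below 1: the two feeds are decorrelated
```

The paper's TDIs are additionally shaped to preserve the source's perceptual character, which this random-tap toy does not attempt.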
Spatial and immersive audio has become increasingly mainstream, presented in concert halls and more recently through music streaming services. There is a diverse ecosystem of hardware and software controllers and renderers in both live and studio settings that would benefit from a standardized communication protocol. In 2019 a group of industry stakeholders began designing ADM-OSC to fill this need. ADM-OSC is a standard for transmitting metadata for object-based audio by implementing a namespace in parallel with the Audio Definition Model (ADM), a metadata standard developed in the broadcast industry. Open Sound Control (OSC) is a well-established data transport protocol developed for flexible and accurate communication of real-time performance data. By leveraging these open standards, we have created a lightweight specification that can be easily implemented in audio software, plugins, game engines, consoles, and controllers. ADM-OSC has matured across multiple implementations and is ready for an official 1.0 release. This paper will discuss the design of ADM-OSC 1.0 and how it was developed to facilitate interoperability for a range of stakeholders and use cases. The core address space for position data is described, as well as extensions for live control data. We conclude with an overview of future ADM-OSC development, including next steps in bringing together ideas and discussion from multiple industry partners.
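For illustration, an ADM-OSC position message can be built with nothing more than OSC 1.0's encoding rules (NUL-terminated strings padded to 4-byte boundaries, big-endian float32 arguments). The `/adm/obj/1/azim` address follows the published ADM-OSC object address space; the azimuth value here is arbitrary:

```python
import struct

def osc_string(s):
    """OSC 1.0 string: UTF-8, NUL-terminated, padded to a 4-byte boundary."""
    b = s.encode("utf-8") + b"\x00"
    while len(b) % 4:
        b += b"\x00"
    return b

def osc_message(address, *args):
    """Minimal OSC message carrying float32 arguments only."""
    typetags = osc_string("," + "f" * len(args))
    payload = b"".join(struct.pack(">f", a) for a in args)
    return osc_string(address) + typetags + payload

# Set object 1's azimuth (degrees):
pkt = osc_message("/adm/obj/1/azim", 45.0)
print(len(pkt), pkt[:16])
```

Because the wire format is this small, the protocol is practical to implement in plugins, consoles, and embedded controllers alike.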
Michael Zbyszyński is a musician, researcher, teacher, and developer in the field of contemporary electroacoustic music. He is currently part of the Creative Technologies R&D group at L-Acoustics. As a musician, his work spans from brass bands to symphony orchestras, including composition... Read More →
Senior Mgr. Business Development | FLUX:: GPLM, Harman International
Hugo Larin is a key collaborator on the FLUX:: SPAT Revolution project and has deep roots in audio mixing, design, and operation, as well as in networked control and data distribution. He leads FLUX:: business development at HARMAN. His recent involvements and interests include object-based spatial audio mixing workflows, interoperability... Read More →
Senior Director, Workflow and Integration, CTO Office, Lawo AG
Lucas joined Lawo in 2014, having previously worked as a freelancer in the live sound and entertainment industry for several years. He holds a degree in event technology and a Bachelor of Engineering in electrical engineering and information technology from the University of Applied... Read More →
Education & Application Support France, d&b audiotechnik
Mathieu holds a Bachelor's degree in applied physics from Paris 7 University and a Master's degree in sound engineering from ENS Louis Lumière (2003). He has years of diverse international freelance experience in mixing and system design, as well as in loudspeakers, amplifiers, DSP... Read More →
Modern Ultra-Wideband (UWB) transceiver radio systems enhance digital audio wireless transmission by eliminating the need for the audio data compression required by narrowband technologies such as Bluetooth. UWB systems, characterized by their high bandwidth, bypass compression computation delays, enabling audio data transmission from transmitter to receiver in under 10 milliseconds. This paper presents an analytical study of audio signals transmitted using a contemporary UWB transceiver system. The analysis confirms that UWB technology can deliver high-resolution (96 kHz/24-bit) audio that is free from artifacts and comparable in quality to a wired digital link. This study underscores the potential of UWB systems to revolutionize wireless audio transmission by maintaining integrity and reducing latency, aligning with the rigorous demands of high-fidelity audio applications.
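The bandwidth argument can be checked with simple arithmetic: the raw payload rate of the cited format (assuming two channels, which the abstract does not specify) exceeds what narrowband audio links typically carry, which is why those links must compress:

```python
# Raw (uncompressed) payload rate for 96 kHz / 24-bit audio, two channels
# assumed for illustration:
fs, bits, channels = 96000, 24, 2
raw_bitrate = fs * bits * channels    # bit/s, before any framing overhead
print(raw_bitrate)                    # 4608000, i.e. ~4.6 Mbit/s
# Bluetooth audio profiles typically budget well under 1 Mbit/s of payload,
# hence lossy codecs; a high-bandwidth UWB link can carry the raw stream.
```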
Composer Iannis Xenakis (1922-2001), in his book Formalized Music (1963), includes a piece of Fortran IV computer code which produces a composition of stochastically generated music. This early example of indeterminate music made by a computer was in fulfillment of his goal of creating a musical composition of minimum constraints by letting the computer stochastically generate parameters. Stochasticity is used on all levels of the composition, macro (note density and length of movements) and micro (length of notes and pitches). This paper carefully analyzes the output composition and the variations the program might produce. Efforts are then made to convert these compositions into MIDI format in order to promote their preservation and increase their accessibility. The preservation of the Free Stochastic Music Program is beneficial to understand one of the first examples of indeterminate computer music, with similar techniques being found in modern day music creation software.
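In the spirit of Xenakis's stochastic approach (an illustrative sketch, not a reconstruction of his actual Fortran IV program), note onsets can be drawn from a Poisson process via exponential inter-onset intervals, with pitches drawn uniformly:

```python
import random

random.seed(42)

def stochastic_cloud(duration_s=10.0, density=3.0, pitch_range=(36, 84)):
    """Generate (onset_seconds, midi_pitch) events: exponential inter-onset
    intervals give a Poisson process with `density` notes/s on average."""
    notes, t = [], 0.0
    while True:
        t += random.expovariate(density)   # mean interval = 1/density s
        if t >= duration_s:
            return notes
        notes.append((t, random.randint(*pitch_range)))

cloud = stochastic_cloud()
print(len(cloud), cloud[:3])
```

Event lists of this form map directly onto MIDI note-on/note-off pairs, which is the representation the paper targets for preservation.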
This study investigates the phenomenon of increased emergency room visits by teenagers following the broadcast of ‘High School Rapper 2’, which features self-harm in its lyrics. The rate of emergency room visits for self-harm among adolescents aged 10 to 24 notably increased during and shortly after the program. Specifically, in the 10-14 age group, visits per 100,000 population tripled from 0.9 to 3.1. For those aged 15-19, the rate rose from 5.7 to 10.8, and in the 20-24 age group, it increased from 7.3 to 11.0. This study aims to clarify the relationship between lyrics and self-harm rates among adolescents. We analyzed the lyrics of the top 20 songs performed on Melon, a popular music streaming platform for teenagers, from April to October 2018. Using Python's KoNLPy and Kkma libraries for tokenization, part-of-speech tagging, and stop-word filtering, we identified the top 50 frequently appearing words and narrowed them down to the top 20 contextually significant words. A correlation analysis (Pearson R) with "Emergency Department Visits due to Self-Harm" data revealed that the words 'sway', 'think', 'freedom' and 'today' had a statistically significant correlation (p < 0.05). Additionally, we found that males’ self-harm tendency was less influenced by the broadcast compared to females. This research suggests a computational approach to understanding the influence of music lyrics on adolescent self-harm behavior. This pilot study demonstrated correlations between specific words in K-pop lyrics and increased adolescent self-harm rates, with notable gender differences.
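The correlation step above can be sketched as follows; the monthly word frequencies and visit rates below are hypothetical toy data, not the study's (which used KoNLPy/Kkma token counts of Korean lyrics against emergency-department statistics):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical monthly frequency of one lyric word vs. ER-visit rate:
monthly_word_freq = [3, 5, 8, 12, 15, 18, 21]
er_visit_rate     = [1.0, 1.8, 2.9, 4.2, 5.1, 6.3, 7.0]
r = pearson_r(monthly_word_freq, er_visit_rate)
print(round(r, 3))
```

In the study, a coefficient like this would then be tested for significance (p < 0.05) before a word is reported as correlated.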