As AI technology advances, there have been many attempts to offer users new experiences by applying it to a variety of multimedia devices. Most of these technologies are delivered through server-based AI models because of the large model size. In particular, most audio AI technologies are delivered through apps and run as offline, server-based models. However, AI technology that runs in real time is clearly important and attractive for streaming devices such as TVs. This paper introduces an on-device automatic speech remastering solution. The solution extracts speech in real time on the device and automatically adjusts the speech level according to the current background sound and the volume level of the device. In addition, an automatic speech normalization technique is applied to reduce the variation in speech level from one piece of content to another. The proposed solution improves users' understanding of and immersion in content by automatically enhancing speech delivery and normalizing speech levels, without requiring manual volume control. This paper makes three key contributions: first, a deep learning speech extraction model that runs in real time on TV devices; second, an optimized implementation method using the DSP and NPU; and third, audio signal processing for speech remastering that improves speech intelligibility.
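To make the level-adjustment idea concrete, the following is a minimal sketch of how a per-frame speech gain could be derived from the extracted speech level, the background level, and the device volume. It is not the method of this paper: the function names (`rms_db`, `speech_gain_db`, `remaster_frame`), the target level of -23 dB, the 6 dB margin, the gain limits, and the rule that lower playback volume asks for a larger speech-to-background margin are all illustrative assumptions, and the speech and background signals are assumed to have already been separated by an extraction model.

```python
import math


def rms_db(frame, floor_db=-100.0):
    """RMS level of a frame of float samples in dB (1.0 = full scale)."""
    if not frame:
        return floor_db
    mean_sq = sum(x * x for x in frame) / len(frame)
    if mean_sq <= 0.0:
        return floor_db
    return max(floor_db, 10.0 * math.log10(mean_sq))


def speech_gain_db(speech_db, background_db, volume_step,
                   target_speech_db=-23.0, min_ratio_db=6.0,
                   max_volume_step=100, max_boost_db=12.0, max_cut_db=6.0):
    """Gain (dB) to apply to the extracted speech for one frame.

    All default values are hypothetical and chosen only for illustration.
    """
    # Normalization term: pull the speech level toward a common target
    # so that speech level varies less between pieces of content.
    norm_gain = target_speech_db - speech_db

    # Assumption: at low playback volume, require a larger margin
    # between speech and background (up to 6 dB extra at volume 0).
    low_volume_bonus = 6.0 * (1.0 - volume_step / max_volume_step)
    required_ratio = min_ratio_db + low_volume_bonus

    # Gain needed for speech to exceed the background by that margin.
    ratio_gain = (background_db + required_ratio) - speech_db

    # Take the larger of the two demands, then limit boost and cut.
    gain = max(norm_gain, ratio_gain)
    return max(-max_cut_db, min(max_boost_db, gain))


def remaster_frame(speech, background, volume_step):
    """Rescale the speech frame and mix it back with the background."""
    gain_db = speech_gain_db(rms_db(speech), rms_db(background), volume_step)
    g = 10.0 ** (gain_db / 20.0)  # dB to linear amplitude
    return [g * s + b for s, b in zip(speech, background)]
```

For example, under these assumed defaults, `speech_gain_db(-30.0, -28.0, 50)` asks for a boost of about 11 dB, because the background is louder than the speech and the volume is at mid-scale. A real-time implementation would also smooth the gain over time to avoid audible pumping, which this sketch omits.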