Maybe this is just my phone (and laptop), but volume control is irritating when some tracks are configured so that I need to set the volume to 70-80% and some tracks are so “naturally loud” that the lowest setting (5% ish for my phone) is distractingly loud.
On some of my tracks (especially for the classical music ones), within the same track I need to change the volume from 20% to 80% depending on what part I am listening to if I want to hear everything without killing my ear drums.
I get that it would be difficult to do anything about this for streaming or live audio since the phone doesn’t know in advance what the input will be, but for a pre-recorded mp3 file, couldn’t my phone do some digital signal processing?
Do I just have terrible electronics, or is this an issue other people experience too? Or is this problem just harder to solve than I am expecting?


What you are looking for, at least in part, is called compression (dynamic range compression, not file compression). You could compress tracks so they play at a similar volume with an external tool like Audacity (which, as far as I know, is still GPL-licensed but went through some controversy over its privacy policy) or ffmpeg.
There are at least 2 drawbacks. First, you’d need to run your software over the tracks externally and re-encode them, which could be a pain in the ass unless you write a script. If you haven’t written one before, it shouldn’t be hard to do: list the files in a directory and feed them into ffmpeg with whatever compression parameters you want.
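For what it's worth, here is a rough sketch of that kind of script in Python. It assumes ffmpeg is on your PATH and uses ffmpeg's `acompressor` filter; the threshold, ratio, attack and release values are placeholders I picked for illustration, not recommendations, and the function names are made up. It writes to a separate output directory so the originals stay untouched.

```python
import subprocess
from pathlib import Path

# Assumed filter settings for illustration only; tune them by ear.
AUDIO_FILTER = "acompressor=threshold=0.125:ratio=3:attack=20:release=250"


def build_ffmpeg_command(src: Path, dst_dir: Path) -> list[str]:
    """Return the ffmpeg argument list that compresses one file."""
    dst = dst_dir / src.name
    # -n tells ffmpeg never to overwrite an existing output file.
    return ["ffmpeg", "-n", "-i", str(src), "-af", AUDIO_FILTER, str(dst)]


def compress_directory(src_dir: str, dst_dir: str, dry_run: bool = True) -> list[list[str]]:
    """Build (and, unless dry_run is set, run) one ffmpeg command per .mp3 in src_dir."""
    out = Path(dst_dir)
    commands = [build_ffmpeg_command(p, out) for p in sorted(Path(src_dir).glob("*.mp3"))]
    if not dry_run:
        out.mkdir(parents=True, exist_ok=True)
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands
```

Calling `compress_directory("music", "music_compressed")` only returns the commands so you can inspect them first; pass `dry_run=False` to actually run ffmpeg.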
The second issue is the loss of sound quality. You can somewhat avoid this if you set the compression range in dB wide enough.
All that said, the easiest thing would be an app that either is a media player that does this for you, or acts as a filter for your media player. The latter can be done on Linux with PipeWire/JACK without using ffmpeg or Audacity.
Not a direct answer to your question, but I hope that helps 🙂
As an audio engineer, this suggestion makes my skin crawl.
Don’t apply any extra compression to your files like this, it will ruin them.
Modern audio streaming services and good audio players use loudness normalization to achieve consistent playback loudness. The way they do this is by measuring the integrated loudness of each song and increasing or, in most cases, reducing the playback gain of the song to hit an arbitrary target (e.g. Spotify has chosen -14 LUFS, which is pretty quiet when you consider most pop music is mastered to somewhere between -10 LUFS and -3 LUFS).
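To make the arithmetic concrete, here is a minimal sketch of the gain calculation, assuming the integrated loudness has already been measured by some other tool. The function names are hypothetical and this is not how any particular player implements it. The point is that one fixed gain is applied to the whole track, so the dynamics inside the song are left alone.

```python
def normalization_gain_db(measured_lufs: float, target_lufs: float = -14.0) -> float:
    """Gain in dB that moves a track from its measured loudness to the target."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to the factor each sample is multiplied by."""
    return 10 ** (gain_db / 20)


def apply_gain(samples: list[float], gain_db: float) -> list[float]:
    """Scale every sample by the same factor; relative dynamics are unchanged."""
    factor = db_to_linear(gain_db)
    return [s * factor for s in samples]
```

For example, a track measured at -8 LUFS gets `normalization_gain_db(-8.0)`, which is -6 dB, so every sample is roughly halved.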
OP should just find a better audio player or figure out how to enable loudness normalization.
I have a semi-related question if you don’t mind. People often complain about the voice tracks in movies being hard to hear, especially if you don’t have a speaker for the center channel (but even then I have trouble)
Why haven’t they solved this problem by packaging the voice track separately on the bluray/stream so you can turn up the volume of the voices only without blowing your ears out when the music hits?
I don’t know why they don’t. I work in music rather than TV/film, but it infuriates me too! Give me a voice volume control! It would be technically very easy to implement as a standard, but the powers that be just haven’t come together and done it!
I’m glad to hear I’m not the only one thinking it!
Do you think it could be done by diffing a few of the different language tracks?
Unfortunately no, audio files are actually really dumb in that they’re basically just a file of 44100 (or 48000 or 96000 etc) amplitude numbers per second.
So there’s nothing really to diff because it’s basically just a squiggly line, set of squiggly lines or, when compressed, a mathematical expression that when decompressed, recreates a squiggly line.
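If it helps to picture it, this small sketch generates one second of a plain 440 Hz tone as a list of amplitude numbers, which is essentially what uncompressed audio is. The names are made up for illustration, and real files store these numbers as integers or floats in a container like WAV.

```python
import math

SAMPLE_RATE = 44100  # amplitude numbers per second, as in CD audio


def sine_samples(freq_hz: float, seconds: float, sample_rate: int = SAMPLE_RATE) -> list[float]:
    """Return the amplitude values of a sine tone, one number per sample."""
    count = int(seconds * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(count)]


one_second = sine_samples(440.0, 1.0)
print(len(one_second))   # number of samples in one second
print(one_second[:3])    # the first few amplitude values
```

Nothing in that list says "this part is voice" or "this part is music"; everything that was playing at a given moment is already added together into one number.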
You could isolate the dialog if you got ahold of a version with no dialog at all, inverted its polarity, and summed it with the original, but it’s unlikely you’ll find a version without any vocals.
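Here is a toy sketch of that polarity trick with made-up sample values. It assumes the no-dialog version is sample-for-sample identical to the music in the full mix (same timing, same level), which is rarely true for real releases.

```python
def invert_polarity(samples: list[float]) -> list[float]:
    """Flip the sign of every sample."""
    return [-s for s in samples]


def mix(a: list[float], b: list[float]) -> list[float]:
    """Sum two equally long signals sample by sample."""
    return [x + y for x, y in zip(a, b)]


# Made-up values standing in for real audio.
music = [0.10, -0.20, 0.30, 0.05]
dialog = [0.00, 0.25, -0.15, 0.10]

full_mix = mix(music, dialog)                       # what ships on the disc
recovered = mix(full_mix, invert_polarity(music))   # music cancels, dialog is left
```

Up to floating-point rounding, `recovered` matches `dialog`, because the music and its inverted copy cancel to zero.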
Machine learning vocal isolation tools are probably going to be the best way to go about it as a DIY approach. Ultimate Vocal Remover 5 with the demucs 4 algo is great FOSS software to extract vocals and you could sum that with the original track and adjust the gain to get louder dialogue… it would be a lot of work though…
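The "sum it with the original and adjust the gain" step could look roughly like this. It assumes you already have the extracted vocal stem as a list of samples that is perfectly time-aligned with the original track; `boost_dialogue` is a hypothetical helper, not part of Ultimate Vocal Remover or Demucs.

```python
def boost_dialogue(original: list[float], vocals: list[float], boost_db: float) -> list[float]:
    """Raise the dialogue inside `original` by boost_db using an aligned vocal stem.

    Since the original already contains the vocals once, only the extra
    amount (factor - 1) of the stem is added on top.
    """
    factor = 10 ** (boost_db / 20)
    extra = factor - 1.0
    boosted = [o + extra * v for o, v in zip(original, vocals)]
    # Clamp to the valid range so loud passages do not wrap around.
    return [max(-1.0, min(1.0, s)) for s in boosted]
```

The clamp is a crude safety net; in practice you would probably lower the overall level a little instead, so the louder dialogue does not clip.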
I don’t really understand still but thanks for trying all the same.
Yeah I figured it was an issue with my software.