I will check with the dev team on their roadmap for audio output, but as far as I know you are correct. Composer is primarily used to output single frames. The mp4 output is really just produced by a post-render script.
As mentioned, I would just treat the rendered output and the audio as separate elements and comp the audio in any standard package like After Effects.
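If you would rather script that last step than open a compositing package, here is a rough sketch of the same idea using ffmpeg from Python. To be clear, this is not something Composer provides: it assumes ffmpeg is installed and on your PATH, and the file names and the `build_mux_command` / `mux` helpers are made up for illustration.

```python
import shutil
import subprocess


def build_mux_command(video_path, audio_path, output_path):
    """Build an ffmpeg argument list that adds an audio track to a rendered video."""
    return [
        "ffmpeg",
        "-y",               # overwrite the output file if it already exists
        "-i", video_path,   # input 0: the rendered video (hypothetical name)
        "-i", audio_path,   # input 1: the separate audio file (hypothetical name)
        "-map", "0:v:0",    # take the video stream from input 0
        "-map", "1:a:0",    # take the audio stream from input 1
        "-c:v", "copy",     # copy the video stream without re-encoding
        "-c:a", "aac",      # encode the audio as AAC for the mp4 container
        "-shortest",        # stop at the end of the shorter stream
        output_path,
    ]


def mux(video_path, audio_path, output_path):
    """Run ffmpeg to combine the two files; raises if ffmpeg is missing or fails."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(build_mux_command(video_path, audio_path, output_path), check=True)
```

Something like `mux("render.mp4", "track.wav", "final.mp4")` would then write the combined file. Because the video stream is copied rather than re-encoded, the rendered frames are left untouched.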