Better AAC Audio with Handbrake ~Solution~

musicvid10 wrote on 5/26/2013, 10:11 PM
My biggest complaint with Handbrake to date has been the lack of a respectable AAC audio encoder. Most web delivery demands AAC with MP4, yet the AAC encoders Handbrake offers are all crappy, and the alternatives are the free but discontinued Nero AAC encoder or a very expensive commercial option. You can use Nero AAC with MeGUI / AviSynth, but not inside Handbrake, because the Handbrake developers are such sticklers about licensing concerns.

Here's how I did it:

-- Edit in Vegas, render in DNxHD, and encode in Handbrake per the , or your own variant.
-- While still in Vegas, render a separate PCM WAV audio file at 44.1 kHz.
-- Having installed the Nero AAC Codec on my computer, I used one of the readily available Nero AAC Frontends to convert the audio to AAC, then Yamb with MP4Box to remux the video with the new AAC audio. Yamb allows one to recreate the original audio / video delay, or tweak it.
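
For anyone who would rather script the last step than use the GUI front ends, the audio encode and remux can be sketched roughly as below. This is only a sketch under assumptions: it assumes the command-line tools `neroAacEnc` and `MP4Box` are installed and on the PATH, and the file names, the quality value and the delay are placeholders, not the settings used for the video above. It only builds and prints the two command lines; nothing is executed.

```python
def nero_encode_cmd(wav_path, m4a_path, quality=0.5):
    """Build a neroAacEnc command line.

    -q is the target quality (0..1); -if and -of are the input and output files.
    The default of 0.5 is a placeholder, not a recommendation.
    """
    return ["neroAacEnc", "-q", str(quality), "-if", wav_path, "-of", m4a_path]


def mp4box_remux_cmd(video_mp4, audio_m4a, out_mp4, audio_delay_ms=0):
    """Build an MP4Box command line that keeps only the video track of the
    Handbrake file and adds the separately encoded audio to a new file."""
    audio_spec = audio_m4a
    if audio_delay_ms:
        # Optional start delay for the audio track, in milliseconds.
        audio_spec += f":delay={audio_delay_ms}"
    return ["MP4Box", "-add", f"{video_mp4}#video", "-add", audio_spec, "-new", out_mp4]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(nero_encode_cmd("audio.wav", "audio.m4a")))
    print(" ".join(mp4box_remux_cmd("handbrake.mp4", "audio.m4a", "final.mp4", 0)))
```

Yamb drives MP4Box for you, so this is just the same remux without the GUI.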

The results can be heard here:
http://dl.dropboxusercontent.com/u/20519276/Benskyfall-1nero.mp4

Note that this is not public on Youtube, which only recrappifies the audio, although not as badly as if it were fed faac! I suggest you listen with your earbuds on.

Audio is the James Bond "Skyfall" theme, courtesy of a Yamaha Baby Grand, a Zoom H4 placed seven feet directly in front of the open lid, and a very slight reverb added using a TB_Reverb plugin in Vegas.

Comments

farss wrote on 5/27/2013, 7:10 AM
Thanks, this info might be of some use to me with a new client.
I have to provide H.264 video files limited to 50MB for 10mins, only talking heads and only to be viewed on office PCs or laptops so pristine audio is not a priority.

I've tried dropping the audio bitrate with the Sony AVC encoder and going too low gives me audio that plays back at double speed. I'm going to give HB a go, first to wrangle as much as I can for the vision but secondly so I can mux the audio as you've suggested using the most efficient encoder.

You mentioned "expensive commercial options", I might have some money to throw at this, any suggestions? I've been looking into Squeeze.

Bob.
musicvid10 wrote on 5/27/2013, 8:37 AM
I think a lot of people are using Squeeze, but I have no direct experience.
musicvid10 wrote on 5/27/2013, 8:45 AM
That's less than 1 Mbps, tell them not to move their heads!
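
The arithmetic behind that, as a quick sketch. It assumes 1 MB = 1,000,000 bytes and ignores container overhead, and the 64 kbps audio figure is only an illustrative placeholder, not a recommendation.

```python
def total_bitrate_kbps(size_mb, duration_s):
    """Average total bitrate (video + audio) that fits size_mb into duration_s."""
    bits = size_mb * 1_000_000 * 8
    return bits / duration_s / 1000


def video_budget_kbps(size_mb, duration_s, audio_kbps):
    """What is left for video once the audio bitrate is set aside."""
    return total_bitrate_kbps(size_mb, duration_s) - audio_kbps


if __name__ == "__main__":
    total = total_bitrate_kbps(50, 10 * 60)        # 50 MB over 10 minutes
    video = video_budget_kbps(50, 10 * 60, 64)     # 64 kbps audio is an assumption
    print(f"total: {total:.0f} kbps, video: {video:.0f} kbps")
```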
bdg wrote on 5/27/2013, 11:58 AM
I just tried your method and I guess it works - I end up with an mp4 that sounds identical to the one HB produces except that it is bigger and takes more time to produce.
I can't tell any difference in the quality of the sound, no better, no worse; but then my ears are on the way out.
Perhaps there is a difference and I just can't hear it.
musicvid10 wrote on 5/27/2013, 12:18 PM
If you're happy with what you're getting from Vegas, then that is what you should stick with. Audio is clearer and more transparent with Nero to my ears.

Comparing Handbrake outputs, the difference between their native AAC encoders and Nero is ridiculous. I played them both for the pianist in the video, and he heard the difference immediately.
bdg wrote on 5/27/2013, 12:39 PM
I exported uncompressed avi from Vegas with 44.1 16bit Stereo PCM Uncompressed.
In HB, I encoded the mp4 with AAC Passthrough.
Results:
HB - 8.51M
Nero - 9.43M
No difference in audio quality to my very imperfect and non-musical ears.
musicvid10 wrote on 5/27/2013, 12:44 PM
"I exported uncompressed avi from Vegas with 44.1 16bit Stereo PCM Uncompressed."

When there is no AAC to pass through, Handbrake falls back to encoding 160 Kbps using faac.

I'll try to put up a side-by-side comparison between faac and Nero aac, time permitting. The difference should be immediately apparent.
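
To avoid being caught by the silent fallback, it can help to check what audio codec the source file actually contains before choosing AAC Passthrough. A rough sketch, assuming `ffprobe` (from FFmpeg, not mentioned elsewhere in this thread) is installed; the helper names are made up for illustration.

```python
import json
import subprocess


def ffprobe_cmd(path):
    """ffprobe command that reports the codec of the first audio stream as JSON."""
    return ["ffprobe", "-v", "error", "-select_streams", "a:0",
            "-show_entries", "stream=codec_name", "-of", "json", path]


def codec_from_ffprobe_json(text):
    """Pull codec_name out of ffprobe's JSON output; None if there is no audio stream."""
    streams = json.loads(text).get("streams", [])
    return streams[0].get("codec_name") if streams else None


def first_audio_codec(path):
    """Run ffprobe on a file and return the first audio stream's codec name."""
    result = subprocess.run(ffprobe_cmd(path), capture_output=True, text=True, check=True)
    return codec_from_ffprobe_json(result.stdout)


def aac_passthrough_possible(codec_name):
    """AAC passthrough can only copy a stream that is already AAC."""
    return codec_name == "aac"
```

For a PCM source like the uncompressed AVI above, `first_audio_codec` would report a PCM codec name and `aac_passthrough_possible` would return False, which is the case where Handbrake re-encodes instead of passing through.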
bdg wrote on 5/27/2013, 1:31 PM
Ahh, so my grand plan didn't work!
I had not done my homework, and therefore didn't understand what AAC referred to, or that HB silently fell back to faac.
Now I know, thanks to a quick shooftie at Wikipedia.

It was much easier when we could imagine the little needle flapping sideways in the groove. Or the little electron cloud surrounding the cathode in the valve and being sucked through the grid on its way to the plate.
That all made sense; but this current stuff - who knows what is real and what is imagined anymore?
farss wrote on 5/27/2013, 3:44 PM
[I]"That's less than 1 Mbps, tell them not to move their heads"[/I]

I'm using 512Kbps 856x480, Sony AVC and AME look pretty much the same.
This client has been editing themselves, they convert the camera original to that spec and go through several generations and it still holds up, although the audio does suffer...enough for it to be noticeable on anything.

Moving heads are OK; dissolves, as expected, are a disaster.

Bob.
musicvid10 wrote on 5/27/2013, 3:49 PM
Jerry's got some low-bitrate tests up on his dog's site:
http://www.jazzythedog.com/testing/DNxHD/hd-guide.aspx#LBR
amendegw wrote on 5/27/2013, 4:35 PM
"Jerry's got some low-bitrate tests up on his dog's site"

Heh! Jazzy hopes he can help!



...Jerry

System Model:     Alienware M18 R1
System:           Windows 11 Pro
Processor:        13th Gen Intel(R) Core(TM) i9-13980HX, 2200 Mhz, 24 Core(s), 32 Logical Processor(s)

Installed Memory: 64.0 GB
Display Adapter:  NVIDIA GeForce RTX 4090 Laptop GPU (16GB), Nvidia Studio Driver 566.14 Nov 2024
Overclock Off

Display:          1920x1200 240 hertz
Storage (8TB Total):
    OS Drive:       NVMe KIOXIA 4096GB
        Data Drive:     NVMe Samsung SSD 990 PRO 4TB
        Data Drive:     Glyph Blackbox Pro 14TB

Vegas Pro 22 Build 239

Cameras:
Canon R5 Mark II
Canon R3
Sony A9

farss wrote on 5/27/2013, 4:56 PM
[I]"Heh! Jazzy hopes he can help!"[/I]

Cute head and I'm sure he's got more talent than what I have to work with :)
Can he talk though, if so I might have a job for him after I'm done with the human doctors, we might move onto the animal doctors.

That aside my challenge now is a bit different, the lower the bitrate I end up encoding at the more profit the client makes. I used to run vision and sound at 128Kbps over ISDN 20 years ago but that was with hardware encoders in real time.

If I can get reasonable SD at 256Kbps the client will be impressed by their savings.

I know all the tricks on the vision side, good lighting, large sensor cameras locked off etc, etc. There's no music and I only need intelligible speech. From experience a few years ago encoding audio at low bitrates is similar to vision, the cleaner it is from the get go the better it will hold up.
For example 15ips masters with Dolby SR would survive low bitrate MP3 way better than scratchy acetates. Cleaning up the acetates as best I could with SF before encoding helped a lot.

Bob.
amendegw wrote on 5/27/2013, 5:09 PM
All three of Jazzy's tests were 1200Kbps / 1280x720 29.97fps. Of course, you can get by with lower bitrates at a smaller framesize. In Jazzy's opinion, the HandBrake render looks very acceptable. Mainconcept was 2nd, Sony in last.

The other thing I've (err... Jazzy has) found is that minimizing bitrate is very much a trial-and-error process. As you mentioned in an earlier post, very low bitrates are more easily attained for talking heads than action scenes.
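
One rough way to compare bitrates across frame sizes is bits per pixel per frame. The sketch below is only a crude heuristic: it ignores content, encoder efficiency and settings, and it assumes the smaller frame runs at the same frame rate as Jazzy's tests, which nobody in this thread has stated.

```python
def bits_per_pixel(bitrate_kbps, width, height, fps):
    """Average number of bits spent on each pixel of each frame."""
    return bitrate_kbps * 1000 / (width * height * fps)


def scaled_bitrate_kbps(bitrate_kbps, src_size, dst_size):
    """Bitrate giving the same bits per pixel at a different frame size,
    assuming the frame rate stays the same."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    return bitrate_kbps * (dst_w * dst_h) / (src_w * src_h)


if __name__ == "__main__":
    print(f"{bits_per_pixel(1200, 1280, 720, 29.97):.4f} bits per pixel")
    print(f"{scaled_bitrate_kbps(1200, (1280, 720), (856, 480)):.0f} kbps at 856x480")
```

By this measure, 1200Kbps at 1280x720 works out to roughly the same bits per pixel as a little over 500Kbps at 856x480, so the trial-and-error starting point scales with the pixel count.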

...Jerry

musicvid10 wrote on 5/27/2013, 5:32 PM
Or just use stills of the heads -- you can get full HD down to 256K easily.
farss wrote on 5/27/2013, 7:02 PM
[I]"Or just use stills of the heads"[/I]

Now there's a thought. Perhaps use Crazytalk so the lips move and add some random eye movements, just to sell it.

Probably not a good look though for a plastic surgeon :)

Bob.
R0cky wrote on 5/28/2013, 9:48 PM
DNxHD has an audio option of low complexity AAC. MusicVid10, does this pass through in HB encoding? If so, is it better than the HB encoders?

thx,
rocky
musicvid10 wrote on 5/28/2013, 10:40 PM
rocky,
I don't have that option in 8.0c with QT 7.6 and Avid LE 2.3.8
Could you render me a couple of short DNxHD files, one with PCM and one with the low-complexity AAC option, and I'll compare to Nero AAC.
With great source audio, of course!

It wouldn't take much for the result to be better than either of the AAC encoders in Handbrake. I've had a recent discussion with a Handbrake developer about including Fraunhofer AAC, which would be a godsend!

;?)

Laurence wrote on 5/29/2013, 4:48 PM
It's so strange that I'm not noticing any audio degradation with Handbrake. Are you sure you guys aren't doing a sample rate conversion or something with Handbrake? Another thing, I stopped using the Avid codec a while back with Handbrake because I would get this weird occasional distortion just with that codec. I am now using XDcam mp4 as my Handbrake intermediate. For Youtube and Vimeo, I always set the audio to the maximum bitrate, and I really can't hear a difference between the source and the Handbrake versions. Strange that some of you are getting such different results. I would think that maybe I'm just not hearing it except that I have made a living as an audio engineer for the past twenty years or so, and I usually hear things long before everyone else does.

Edit: I realize that the above statement doesn't make technical sense. After all, DNxHD is a video codec and should have nothing to do with the audio. None the less, I experience audio distortion when I use a Quicktime wrapper with DNxHD video even though the source audio is uncompressed. I was happy with the video in this format. It was the audio quality that made me change. I tried XDcam .mxf to avoid this but got only one side of the stereo in my Handbrake encodes. XDcam mp4 gives me clean video and audio in Handbrake, at least I think it does. Am I wrong? I could well be. I have been doing sound reinforcement so long that I have three tinnitus frequencies ringing constantly in my ears, and I know when I test my ears with an iPad hearing test app that I hear nothing above 12k. Still, I am suspicious that the problem might have more to do with the QuickTime wrapper and Handbrake compatibility than it does with the actual Handbrake AAC audio encoders.
musicvid10 wrote on 5/29/2013, 7:27 PM
Laurence,
Of course you didn't buy into the hype on the audio forums that faac is the crappiest AAC codec ever! And that ffmpeg AAC is quirky. So your mind wasn't already made up for you.
;?)

Seriously, I didn't think it was that bad either when I was just encoding for Youtube, whose AAC encoding is equally awful. It was only when I started putting several audio versions on the timeline and selectively soloing tracks that I noticed that the gasheads on hydrogenaudio were right, and more than just a little bit.

The video in my first post sounds so much better than its faac version that the kid playing the piano, who's your son's age, noticed and mentioned the difference after the first few seconds.

I'm going to figure out a way to do a head-to-head comparison between the two encoders on a level playing field and post it here; not to prove anything, but to give others the opportunity to compare with their own ears, and decide for themselves whether it is worth the extra effort to encode NeroAAC and mux it with their Handbrake files.
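
For anyone who wants to go beyond impressions when comparing, one common approach is a blind ABX-style test: the listener is played an unlabeled clip and has to say which encoder it came from, repeated over a number of trials. A minimal sketch of how such a result could be scored follows; the trial counts are made up for illustration and are not results from any test in this thread.

```python
from math import comb


def abx_p_value(trials, correct):
    """Probability of getting at least `correct` answers right out of `trials`
    by pure guessing (one-sided binomial test with p = 0.5)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Hypothetical session: 12 correct identifications out of 16 trials.
    print(f"p = {abx_p_value(16, 12):.4f}")
```

A small p-value only says the listener was probably not guessing; it says nothing about which encoder sounds better.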

Meanwhile, the parallel discussion over on the Handbrake forum has got one of their developers interested in exposing the Fraunhofer AAC encoder (fdk-aac), which I have a feeling is on par with NeroAAC, maybe even better at low bitrates.

In some of us old guys with sensorineural hearing loss, the effects of Q noise, aliasing, and harmonic distortion are suppressed, and in others, including me, they can become quite exaggerated, similar if not identical to hyperacusis and misophonia.