Encoding the Audio portion with Handbrake
A – Separating the myths from the facts
There’s a term called Audiophile. It refers to people who “care” more about audio than your average bear. All over the internet you can find examples of Audiophiles saying that this is better than that, pointing out that there’s a difference between “most humans” and those with “golden ears” or whatever topic is in fashion at the time. By and large, it’s rubbish.
A great old example comes from 1984, when CD technology was still new. A gentleman named Ivor Tiefenbrun was making claims about what was ruining sound. Ivor was instrumental in developing high-fidelity audio equipment throughout the 70s and 80s and received an “Order of the British Empire” honour from Queen Elizabeth II in 1992. He was a renowned Audiophile who said that digital technologies were not able to reproduce audio as well as analogue could. He was challenged to prove this by listening to a record played through his own brand’s equipment with a switch. In the A position, the audio went entirely through his own equipment. In the B position, the audio went into a Sony PCM digital box, was converted into a digital signal, then converted back into an analogue signal and played through the rest of his equipment. In the X position, the switch would be set to either A or B, but the user would not know which one. The user was freely able to switch between A/B/X as many times as they wanted before making a choice by saying X matched A or B.
The results? Absolutely no ability to differentiate the two. The PCM digital box did not affect the sound enough for Ivor to tell the difference. And as the tester points out, it was effectively only using 13 bits of its 16-bit capability!!! If the founder of a top quality audio equipment company, awarded by the Queen, could not hear an effect of digital coding at 13-bit depth vs the original analogue audio, then who can?
Sadly, people everywhere claim things like “I can tell the difference between 16-bit and 24-bit” or “people with golden ears can tell the difference between 44.1KHz and 192KHz”. They are wrong. Not just subjectively wrong, they are fundamentally wrong.
It stems from the idea that digital audio is stored as blocky steps instead of a smooth curve.
People think that if you make the blocks smaller, the audio quality will be better. This is mathematically incorrect. The Nyquist-Shannon Sampling Theorem is the basis for digital audio. It doesn’t “explain” digital audio, rather, digital audio was created AROUND the Sampling Theorem. It states that a wave containing no frequencies above a given limit can be reproduced PERFECTLY, as long as it is sampled at more than twice that limit. Human ears top out at about 20KHz, so the sample rate needed is 40KHz. CDs sample audio at 44.1KHz just to be “safe” and some people distribute audio at 48KHz just to be “safer”. There is a reason to use higher sample rates when you’re an engineer inventing new sound effects, but once the sound effect is finalised, playing it at 44.1KHz is identical to playing it at any higher sampling rate. The Guinness Book of World Records 2017 states that the human limit is “almost 20KHz”, which is perfectly sampled by a mere 40KHz, about 10% lower than that of a CD.
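The sample rate arithmetic is short enough to sketch. This is only an illustration in Python, assuming the 20KHz hearing limit quoted above; the function names are made up for the example.

```python
def min_sample_rate(highest_hz):
    """Sample rate (Hz) that the highest frequency must stay below half of.

    Strictly, the theorem asks for a rate just above this value.
    """
    return 2 * highest_hz


def highest_frequency(sample_rate_hz):
    """Highest frequency (Hz) a given sample rate can represent."""
    return sample_rate_hz / 2


HEARING_LIMIT_HZ = 20_000  # assumption: the "almost 20KHz" figure from the text

print(min_sample_rate(HEARING_LIMIT_HZ))  # 40000

# How far beyond human hearing each common sample rate reaches.
for rate in (44_100, 48_000, 192_000):
    limit = highest_frequency(rate)
    headroom = limit - HEARING_LIMIT_HZ
    print(f"{rate} Hz covers up to {limit:.0f} Hz, {headroom:.0f} Hz beyond hearing")
```

Everything past the first couple of thousand Hz of headroom is spent on frequencies nobody can hear.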
So that’s the sampling rate part, what about bit-depth? Well that’s to do with dynamic range. The way digital audio works, errors or inaccuracies in bit-depth are dithered to produce white noise in addition to the sound wave. This white noise at 16-bit is pretty small. The white noise at 24-bit is 256 times smaller. The reality of this difference is akin to listening to a guy jackhammer the road 1 metre away and still knowing that behind him there is either 1 or 2 mosquitos buzzing around. People who claim they can hear the difference are mistaken.
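The “256 times smaller” figure can be checked in a few lines. This Python sketch uses the plain theoretical numbers for undithered audio (about 6dB per bit); shaped dither can push the effective range of 16-bit audio higher than the figure printed here. The function names are made up for the example.

```python
import math


def noise_ratio(bits_a, bits_b):
    """How many times smaller the quantisation step is at bits_b than at bits_a."""
    return 2 ** (bits_b - bits_a)


def dynamic_range_db(bits):
    """Theoretical gap between full scale and one quantisation step, in dB."""
    return 20 * math.log10(2 ** bits)


print(noise_ratio(16, 24))             # 256
print(round(dynamic_range_db(16), 1))  # 96.3
print(round(dynamic_range_db(24), 1))  # 144.5
```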
Here’s a nice bullet point list about fact vs myth:
- Your ears do not hear the “blockiness” of digital audio. The “blockiness” of digital audio limits the frequencies at which it can PERFECTLY reproduce a sound wave.
- A human who can hear frequencies not allowed by a 48KHz sample rate is equivalent to a human who has 4 cones in their eye retina instead of the usual three, and can see x-rays as easily as the rest of us can see the colour green. Or a human who has eyes in the back of their head to see predators coming. In 100 years of searching for these people, neither one has ever been found. The “golden ears” are equally as rare as eyes in the back of your head.
- The Guinness book of world records 2017 states that the human limit is “almost 20KHz” which is perfectly sampled by a mere 40KHz, 10% lower than that of a CD.
- A guy awarded by the Queen for contribution to audio electronics claimed he could hear the difference between original sound and 16-bit 44.1KHz digital sound. He put his reputation on the line to participate in a test that proved he couldn’t do it even with 13-bit digital sound in a noise protected studio under controlled conditions. At least he had the guts to do it.
- The noise floor of a recording studio is usually around 30dB. The dynamic range of a 16-bit CD is about 96dB, and shaped dithering extends the effective range to around 120dB. So to hear the white noise above the room in a soundproof recording studio, the loudest part of the CD would have to play at about 150dB, enough to cause permanent hearing loss in seconds. That is as loud as firing a sniper rifle.
- The same sum for 24-bit audio, with about 144dB of theoretical range before any dither, comes to at least 174dB. Around 180dB is enough to kill any human being from impact by pressure from the sound wave. In fact, people have been killed by 160dB.
- Just to reiterate the above 2 points: to hear the white noise in 16-bit digital audio the volume needs to be as loud as if you were firing a sniper rifle, and in 24-bit audio to hear the white noise you would need to turn the volume up enough TO ACTUALLY DIE. Yet Apple sells 24-bit audio files and people everywhere swear they can hear the difference.
B – Definitions, formats and my choices
Transparent – Where the encoded LOSSY sound cannot be told apart from the original. Bear in mind that if you encode a sound 100 times “transparently” you will end up with garbage if the encoding is lossy. So this can usually only be done once or twice.
Passthrough – Where the audio in the original is passed directly into the encoded file. You’re encoding the video, but keeping the audio unchanged.
PCM – This is an uncompressed pure WAV or CD style format. The raw audio. It can have a bit depth and a sample rate, but almost always they are 16-bit 44.1KHz. You will have these if you recorded gameplay with Fraps or maybe some other software like DxTory.
Dolby Surround – Technically this is stereo. But while plain stereo is just 2.0 channels for 2 speakers, Dolby Surround folds extra surround information into those 2 channels (this is called matrix encoding). So it plays as normal stereo on headphones or 2 speakers, while a home theatre system with a Dolby decoder can pull the surround part back out and send it to the rear speakers. Stereo alone cannot do this. However for the purposes of this guide, we won’t really talk about Dolby Surround as being different to stereo, and all 2.0 channel might just be called stereo for simplicity.
AC3 – Also known as Dolby Digital. This is what you usually get on a DVD and it’s usually in 5.1 channels (6 channels) at 384Kbit/sec, giving 64Kbit/sec to each channel. Dolby TrueHD is a lossless relative, commonly 24-bit at 48KHz or 96KHz, sometimes found on Blu-Ray and a mandatory format for HD-DVD players.
DTS – The name once stood for Digital Theater Systems. It’s an audio codec that competes with Dolby Digital. DTS is widely accepted to be slightly better than AC3 and current Blu-Ray discs often contain DTS-HD Master Audio (DTS-HD MA), which is a lossless version, up to 192KHz in 24-bit.
AAC – Apple’s codec of choice (they didn’t invent it, it’s an MPEG standard, but iTunes and the iPod made it popular). Usually considered transparent at 128Kbit/sec stereo, or 64Kbit/sec per channel.
Opus – Fully sick new codec that is usually considered transparent at 112Kbit/sec stereo or 56Kbit/sec per channel. Outperforms AAC, Vorbis and MP3 at all bitrates up to 112Kbit/sec after which point they start all sounding like the original, except MP3 which needs more like 160kbit/sec. Also quite smart, can use unused bits from some channels to support others that require help. Latency is also much better with Opus, but that’s not really related to this guide. Finally, if you look up YouTube’s recommended upload specs, they say MP4 with AAC audio. But, if your video becomes popular or stays up for long enough, they actually convert it to WebM with Opus audio. Upload with Opus audio at the start and the only conversion that happens is they will create lower bitrate versions for viewers with poor internet connections. Your top quality video/audio streams will remain unconverted.
Don’t believe me about Opus? Check these blind tests out:
This WAV file example contains the original audio, a version in Opus at 16kbps and a version in MP3 at 24kbps. You can clearly hear that even at a measly 16kbps Opus still sounds OK and is far superior to the 50% higher bitrate MP3.
Dolby Pro Logic 2 – DPL2 is a strange beast. It’s 2 channels encoded in a way that will sound great on a stereo system, but also have surround sound qualities on a 5.1 system. It has some secret data in the encoding that will help your 5.1 system extract sound out of the stereo part to add it to the rear or centre speakers with particular delays that make it sound like real surround sound. It’s not real however, it’s an illusion, but it’s a pretty decent one.
Recommendation for DVDs: Converting AC3 5.1 channel into stereo often produces undesired effects. Sounds can be left out or appear to come through in an unintended way. DVDs will often include a 5.1 sound track AND a 2.0 stereo track, where the stereo has been mastered to sound the way the author wants it to sound (on 2 speakers). Using Handbrake, I recommend taking the AC3 5.1 sound track and down-mixing it into Dolby Pro Logic 2. The encoder will use half the bitrate for each channel and in that, a little can be taken aside to allow for surround effects. The LFE (subwoofer) channel will be dropped completely and your sub will just use the general soundwave. YOU DO LOSE QUALITY doing this (only really through speaker separation, not soundwaves) but the playback on a stereo system or headphones will be transparent, yet using a 5.1 surround sound system will be better than if you encoded a regular stereo stream, but not as good as the original. It’s a compromise in some situations, but not in others. 160Kbit/sec for DPL2 gives 80Kbit/sec to each channel when the original only had 64 so the extra 16Kbit/sec can be dedicated to adding the secret coding for surround effects (this isn’t really how it works, but it’s a decent way to think about it). It saves you 101MiB per hour of footage. If you are an “audiophile” and you want the original surround sound then by all means, passthrough the AC3 5.1 channel track, but remember, playback on a stereo system or headphones will sound bad and you’re doing this to prevent that, so you’ll have to include the stereo track which is another 128Kbit/sec so another 57MiB per hour. With 100 hours of footage you’re looking at 15.8GiB so make the decision yourself, but I personally do the DPL2 mixdown.
Recommendation for OBS Recordings: Passthrough. Basically when you record in OBS I recommend Lossless video so that you can effectively encode it in Handbrake later, since Handbrake produces such better quality but it can’t be done in real time without using obscene amounts of space. The Audio however can be encoded quite easily since it’s considerably less complicated. So, OBS by default will encode into AAC 128Kbit/sec which if you look above, is the transparent rate for stereo AAC. UPDATE: OBS now has a default bitrate of 160kbps AAC which I recommend changing to 128kbps. Just pass that through in Handbrake and your audio will remain unchanged. If you’re clever, you’ll encode it in OBS to Opus at 112Kbit/sec and then pass it through Handbrake saving 7.2MiB per hour. But if you didn’t do that, still passthrough in Handbrake because your audio has already been recorded lossy at a rate JUST high enough to be transparent. If you convert it again then you could start to hear differences because you’re introducing 2 lossy iterations.
Recommendation for Fraps Recordings: Encode to Opus 112Kbit/sec stereo. Fraps records your Audio in PCM just like a WAV file so you HAVE to convert it or it will cost you roughly 635MB (about 606MiB) per hour. It’s a bitrate of 1411.2kbps just for stereo!!! If you have a surround sound setup and gameplay you MIGHT want to save it as DPL2 to keep some of that, or even 5.1 channel Opus at 336Kbit/sec to mimic DVD surround sound, or possibly 256Kbit/sec since Opus will intelligently allocate less to the low-requiring LFE and more to the other channels as they need it. Either way, surround sound Opus at these bitrates will sound more like the original than Dolby Digital would. However playback on YouTube will be stereo for most viewers, or even mono for their phones, so if YouTube is your goal then you’re pretty much wasting bandwidth. My recommendation: don’t stress about it. 112Kbit/sec stereo Opus sounds amazing. You CAN do FLAC for lossless purposes. I recommend FLAC if you’re going to be doing lots of editing, clipping and effects so that you don’t lose quality for every render, but your editing software needs to be able to handle FLAC. Regardless, the final version can be Opus and nobody has a right to hate this.
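The PCM bitrate above is just sample rate × bit depth × channels. Here is a small Python sketch of that sum, assuming the usual 16-bit 44.1KHz stereo recording; the function names are made up for the illustration, and the result lands in the same ballpark as the per-hour figure quoted above.

```python
def pcm_bits_per_second(sample_rate_hz, bit_depth, channels):
    """Raw bitrate of uncompressed PCM audio."""
    return sample_rate_hz * bit_depth * channels


def bytes_per_hour(bits_per_second):
    """Bytes needed to store one hour at the given bitrate."""
    return bits_per_second * 3600 // 8


pcm = pcm_bits_per_second(44_100, 16, 2)
print(pcm / 1000)                                # 1411.2 (kbps)
print(bytes_per_hour(pcm) / 1_000_000)           # 635.04 (MB per hour)
print(round(bytes_per_hour(pcm) / 2 ** 20, 1))   # 605.6 (MiB per hour)

# The same hour as 112kbps Opus, for comparison.
print(bytes_per_hour(112_000) / 1_000_000)       # 50.4 (MB per hour)
```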
C – Examples of the Handbrake Audio Tab
The above screenshot shows what the Audio Tab looks like when the source video was recorded with Fraps. In this example the source has one single audio track in an “Unknown” language. The format is Pulse Code Modulation (PCM) it’s in 16-bit, is signed (can be positive or negative) and in little endian (backwards to normal handwriting). These details are unimportant, all we need to know is that Handbrake recognises it, therefore Handbrake can convert it into another format. PCM cannot be passed through with Handbrake (because it would be stupid to do so).
One detail that IS important is that the source audio is in 2.0 channels meaning stereo. You will only be able to mixdown to stereo or mono. Handbrake will not “invent” new channels.
You have 3 main options:
- Encode lossless to FLAC. Change the codec to FLAC 16-bit. Don’t use 24-bit because the Audio rate will be 50% larger for no difference in soundwave and it also means you didn’t read about it above so you haven’t done your research. This is compatible on good media players and PCs, but often not car stereos or cheap/old home theatre systems. When you set the codec to FLAC, the bitrate part will vanish because FLAC is always lossless, bitrate changes to accomplish that. Usually you end up with 700kbps (about half a CD bitrate) or 308 MiBytes per hour.
- Encode to AAC 128kbps bitrate. 64kbps per channel is widely considered transparent for the avcodec AAC encoding (avcodec is Handbrake’s AAC encoder of choice). Because this is stereo, you’ll need 128kbps. This will be most compatible with players since it’s the format Apple adopted and it has been in use for yonks. MP3 is just as compatible but requires more bitrate, so no reason to do MP3. 128kbps is 56.25 MiBytes per hour.
- RECOMMENDED Encode to Opus 112kbps bitrate. Less compatible on players than AAC, but WILL become more and more widely used in the immediate future. Opus IS the future. 112kbps is just under 49.25 MiBytes per hour.
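If you would rather script this than click through the Audio Tab, the same three choices can be sketched for HandBrakeCLI. Treat this as an illustration only: the file names are placeholders, and the encoder names (`flac16`, `av_aac`, `opus`) and flags (`--aencoder`, `--ab`, `--mixdown`) are assumptions based on recent HandBrakeCLI builds, so confirm them with `HandBrakeCLI --help` on your version. The Python here only builds and prints the command.

```python
import shlex

# Assumed HandBrakeCLI encoder names and bitrates for the three options above.
AUDIO_OPTIONS = {
    "flac": ["--aencoder", "flac16"],                # lossless, no bitrate needed
    "aac": ["--aencoder", "av_aac", "--ab", "128"],  # most compatible
    "opus": ["--aencoder", "opus", "--ab", "112"],   # recommended
}


def handbrake_audio_args(source, output, choice):
    """Build a HandBrakeCLI command for a stereo source (hypothetical helper)."""
    return [
        "HandBrakeCLI", "-i", source, "-o", output,
        *AUDIO_OPTIONS[choice],
        "--mixdown", "stereo",
    ]


# Placeholder file names; pass the list to subprocess.run() to actually encode.
print(shlex.join(handbrake_audio_args("fraps.avi", "out.mkv", "opus")))
```

Video settings are left out on purpose, so add your own video options to the list before running it.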
This example above is what the Audio Tab looks like from one of my OBS recordings. The video was recorded lossless but is HUGE so I’m using Handbrake to remedy this. The audio however was encoded DURING the recording into AAC 128kbps stereo, which is the default setting for OBS.
There is no point doing FLAC with this. The audio has been lossy encoded and lost quality. Putting it into FLAC now will save that lost quality in a track using 5 to 6 times more data and you gain nothing.
You have 2 main options:
- Encode to Opus 112kbps bitrate. You lose some quality because you’re lossy encoding for a second time on the original audio. You also lose compatibility on some players. But it will PROBABLY still sound exactly the same and will use about 12.5% less data on your disk at just under 49.25 MiBytes per hour.
- RECOMMENDED Auto Passthru. Set this in the codec drop-down and your AAC 128kbps audio track will be copied bit for bit into the final compressed video file. In other words, it remains unchanged, so you won’t lose any more audio quality than you did when you first recorded it in OBS. You’re looking at 56.25 MiBytes per hour and to me it sounds identical to when I played the game.
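The per-hour sizes quoted in these examples follow from the bitrate alone. A minimal Python sketch, assuming the convention these figures appear to use (kilobits per second turned into kilobytes per hour, then divided by 1024); `mib_per_hour` is a name made up for the illustration.

```python
def mib_per_hour(kbps):
    """Size of one hour of audio at the given bitrate.

    Assumed convention: kbps * 3600 seconds / 8 bits gives kilobytes
    per hour, and dividing by 1024 gives the "MiBytes" figure.
    """
    return kbps * 3600 / 8 / 1024


aac = mib_per_hour(128)
opus = mib_per_hour(112)

print(aac)                # 56.25
print(opus)               # 49.21875
print(1 - opus / aac)     # 0.125, i.e. 12.5% less data
print(mib_per_hour(160))  # 70.3125
```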
This time the example is from a DVD. I like to archive my disks in H265 so I don’t have bookcases full of them and I can play one straight away on my TV with a media player without having to find it on the shelf or worry about damaging and losing them.
You’ll notice the source disk could have many audio tracks in multiple languages. This one has 2 in English. One is 5.1 channel AC3 (Dolby Digital) designed for home theatre systems that have 5 speakers and a subwoofer. The other is 2.0 channel AC3 (Dolby Surround or just stereo) and it is REMASTERED to sound the way the author wants it to on 2 speakers. It’s not the same as just the left and right of the 5.1 version, it’s actually a little different. Playing 5.1 on a stereo system can sometimes sound a little awful, so players will recognise that they can only play stereo and therefore choose the 2.0 channel track to play.
You have 3 main options:
- Encode to Opus 112kbps bitrate stereo only. If you really don’t give much of a crap about the sound quality of this DVD you can delete the second audio track and just leave a single one from the 2.0 stereo source. Encode this with Opus at 112kbps mixdown Stereo or Dolby Surround or Dolby Pro Logic II and you’re good to go. Play it on a 5.1 home theatre system and the rear speakers might just play their front speaker counterpart’s audio, or they might not, while the sub just guesses what to do. Your final file will have only one audio track and it will be under 49.25 MiBytes per hour.
- Passthru the surround sound and encode a second stereo track. If you’re fussy about the sound you can passthrough the AC3 5.1 channel audio track so your final file will match the original DVD on a 5.1 home theatre system. These are often 384-448kbps or 169-197 MiBytes per hour for all 6 channels. As stated a couple of times above, if you play this back on a stereo system it could be crap and you might miss sounds, so since you care so much about perfect audio, you should keep a stereo track for playback on stereo systems. Considering this is for low-end systems, Opus might be incompatible so forget about that. You can passthrough the stereo track for another 192kbps or 84.4 MiBytes per hour, combined with the first track making 576-640kbps or 253-281 MiBytes per hour. Or you can say you care about the surround quality but not the stereo quality, you’re only keeping it for compatibility with low-end systems. In this case take the 2.0 stereo track from the source and encode it to AAC 128kbps or 56.25 MiBytes per hour, combined with track 1 to make a total of 512-576kbps or 225-253 MiBytes per hour.
- RECOMMENDED Mixdown to Dolby Pro Logic II in Opus 160kbps. Option 1 sounds perfect on stereo but has no surround sound capability @ 49.25 MiBytes per hour. Option 2 will sound perfect on surround sound and great to perfect on stereo but uses between 225-281 MiBytes per hour. If you make the final file contain only 1 track and use the 5.1 channel AC3 source but mixdown to Dolby Pro Logic II with a bitrate of 160kbps you have something else entirely. It will capture all of the surround sound waves, yet not be remastered specifically for stereo playback nor surround sound playback, but rather a combination of both. Playback on a stereo system will sound NEARLY as good as option 1 and 2, possibly transparent for most DVDs to most people. Playback on a surround sound system will sound NEARLY as good as option 2, also possibly transparent for most DVDs to most people. You end up with a single track that is stereo, with a little extra data to give a good home theatre system some information about how to turn that into 5.1 surround sound. AAC can do this, but Opus is much better at it with more intelligent bit allocation for sound separation vs quality. Your system will need to play Opus, but it will sound fantastic no matter the speaker setup you have and still only 160kbps or 70.3 MiBytes per hour. My media player cost me $80 at the end of 2016, it plays these files and it sounds fantastic.
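For anyone scripting their DVD archive, the recommended mixdown might look like this with HandBrakeCLI. It is a sketch under assumptions: the track number and file names are placeholders, and `--audio`, `--aencoder opus`, `--ab` and `--mixdown dpl2` are the names expected from recent HandBrakeCLI builds, so check `HandBrakeCLI --help` before relying on them.

```python
def dpl2_opus_args(source, output, track=1, kbps=160):
    """Build a HandBrakeCLI command that keeps a single audio track,
    downmixed from the 5.1 source to Dolby Pro Logic II in Opus.

    Hypothetical helper; 'track' is the number of the 5.1 AC3 source track.
    """
    return [
        "HandBrakeCLI", "-i", source, "-o", output,
        "--audio", str(track),   # pick the 5.1 track, drop the others
        "--aencoder", "opus",
        "--ab", str(kbps),
        "--mixdown", "dpl2",     # assumed name for the Pro Logic II mixdown
    ]


# Placeholder paths; add your video options before running the command.
print(" ".join(dpl2_opus_args("dvd_title.iso", "movie.mkv")))
```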
NOTE: If your DVD has only 1 audio track in the source and it’s 5.1 but you want option 1, then just use the 5.1 source at 112kbps Opus and mixdown to Dolby Pro Logic II. It means that they didn’t remaster it specifically for stereo systems but you can capture all the sounds by downmixing. If you wanted to do Option 2, you will only need one track, just passthrough the AC3 or DTS. Also note that if you see DTS, it’s conceptually the same as AC3 and all the same principles apply, just that often some people think DTS originals sound better than AC3s.
So in the end, there’s a whole bunch of different ways DVDs will have their audio, affecting how you might decide to make your final file. But ultimately when you settle on the quality you’re happy with, you’ll worry about it less and less. Blu-Rays will have slightly different codecs too, but your end result will be the same. Generally speaking, if you want the original audio, passthrough. But if you’re happy with a slight, possibly unnoticeable quality drop, then you can encode to save a lot of space. Audio has come a long way, when DVDs and Blu-Rays were invented audio needed a lot of space to be truly spectacular. Nowadays, if your media player can play Opus in Dolby Pro Logic II, you can make massive savings. If there was another disk format being released this year to supersede Blu-Ray, it would surely allow Opus. In a way there is, YouTube and Netflix are both adopting it.
When it comes to the recordings from OBS or Fraps, they are much more straightforward as you can see!