x264 presets and settings – Quality comparison of 720p60 Apex Legends

Introduction

This is the first post in a long series I plan on making to hash out as much data as possible. After the release of Turing RTX cards, I started to evaluate NVENC’s potential as an x264 replacement. Some people expressed a desire to see similar tests done with some variations. This could be additional settings applied, or other quality metrics included.

Note that currently I’m only collecting data for single-pass or equivalent encode methods. This is to remain relevant for live streaming. For example, NVENC 2-pass can use CUDA or the CPU for the first pass, and NVENC for the second. By doing both simultaneously, it’s still live encoding and suitable for streaming. x264 2-pass requires completely scanning the input then starting again for a second time. Therefore, I only list single-pass encodes here.

And thus the quest begins!

TLDR

If you aren’t sure yet that you want to read all the results before seeing some images, here’s a preview:

Fast g480 aq0 vs VerySlow

x264 Medium vs Slow

Progress

Older posts that led to this study

720p 60 fps

  • Apex Legends: H.264 – x264 (this post), H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.
  • Overwatch: H.264 – x264, H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.
  • Heroes of the Storm: H.264 – x264, H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.

1080p 60 fps

  • Apex Legends: H.264 – x264, H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.
  • Overwatch: H.264 – x264, H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.
  • Heroes of the Storm: H.264 – x264, H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.

1440p 60 fps

  • Apex Legends: H.264 – x264, H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.
  • Overwatch: H.264 – x264, H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.
  • Heroes of the Storm: H.264 – x264, H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.

Test Bench

The test bench isn’t particularly relevant in a quality analysis. No matter what hardware you use, you will get identical quality so long as it actually works. Speed tests, where hardware matters more, will start coming through later.

  • CPU = Intel i7-8086k stock clock. Coffee Lake QuickSync Version 6.
  • Graphics Card = NVIDIA RTX 2070. Turing NVENC 6th Generation.
  • RAM = 16GB
  • Encoding by FFMPEG v4.1.1 12th February 2019 with libx264 core 157 r2935 545de2f by VideoLAN http://www.videolan.org/x264.html
  • Scoring by FFMPEG version N-93394-g14eea7c with libvmaf v1.3.14 built with gcc 7 (Ubuntu 7.3.0-27ubuntu1~18.04) all compiled from source on 17th March 2019.
  • libvmaf model vmaf_v0.6.1.pkl

All original sources are recorded lossless from original gameplay at the resolution in question. Instead of scaling the same video down for alternative resolutions, I recorded each one individually. This is to avoid introducing any scaling artifacts.
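For anyone wanting to reproduce the scoring step with the tools listed above, here is a rough sketch of how an FFmpeg command using the libvmaf filter could be put together. This is not the exact command used for this post. The file names are placeholders, and the filter options (model_path, psnr, ms_ssim, log_path, log_fmt) are assumed to match FFmpeg builds from around this time; newer builds have renamed some of them, so check `ffmpeg -h filter=libvmaf` on your own build.

```python
def build_vmaf_command(encoded, reference, log_path,
                       model_path="vmaf_v0.6.1.pkl"):
    """Return an ffmpeg argument list that scores `encoded` against `reference`.

    Assumption: the libvmaf filter accepts the option names used below.
    """
    filter_opts = ":".join([
        f"model_path={model_path}",
        "psnr=1",        # also compute PSNR
        "ms_ssim=1",     # also compute MS-SSIM
        f"log_path={log_path}",
        "log_fmt=json",
    ])
    return [
        "ffmpeg",
        "-i", encoded,     # first input: the encoded (distorted) video
        "-i", reference,   # second input: the lossless reference
        "-lavfi", f"libvmaf={filter_opts}",
        "-f", "null", "-",  # discard output frames, only the log is wanted
    ]


# Placeholder file names, for illustration only.
print(" ".join(build_vmaf_command("encoded.mp4", "lossless.mkv", "scores.json")))
```

The list form can be passed straight to `subprocess.run` if FFmpeg with libvmaf is installed.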

Testing Strategy

Feedback from previous x264 tests included a request for fixed GOP length. I tried 4, 8 and 16 seconds on a couple of test bitrates. 8 seconds (-g 480) showed an improvement over 4 seconds (-g 240). However, 16 seconds made no difference, or sometimes gave worse results. So I ran the full test with 8 seconds (-g 480). This is indicated in the data by “g480”.
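The -g value is just the GOP length in frames, so at 60 fps it is the number of seconds multiplied by 60. A small sketch of that arithmetic, plus one way an encode command could be assembled from it, is below. The helper names are made up for illustration, and using -b:v for the target bitrate is an assumption rather than the exact rate control settings used in these tests.

```python
def gop_frames(seconds, fps=60):
    """Convert a GOP length in seconds to a frame count for ffmpeg's -g option."""
    return int(round(seconds * fps))


def build_x264_command(source, output, preset, bitrate_kbps,
                       gop_seconds=None, fps=60):
    """Return an ffmpeg argument list for a single-pass libx264 encode.

    Hypothetical helper: only the -g handling mirrors the text above.
    """
    cmd = ["ffmpeg", "-i", source,
           "-c:v", "libx264",
           "-preset", preset,
           "-b:v", f"{bitrate_kbps}k"]   # assumed way of setting the bitrate
    if gop_seconds is not None:
        cmd += ["-g", str(gop_frames(gop_seconds, fps))]
    cmd.append(output)
    return cmd


for seconds in (4, 8, 16):
    print(seconds, "seconds at 60 fps ->", "-g", gop_frames(seconds))
```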

Another idea was to include variations in adaptive quantization. Disabling adaptive quantization (-aq-mode 0) improved VMAF and PSNR scores, but lowered MS-SSIM scores. So I included it for people to make their own informed choices. This is indicated in the data by “aq0”.
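Putting the two options together gives four variants per preset. As a sketch, the labels used in the data and the extra FFmpeg flags behind them could be generated like this. The function names are hypothetical and this is not the script used to run the tests.

```python
from itertools import product

# Lowercase names as accepted by ffmpeg's -preset option for libx264.
PRESETS = ["fast", "medium", "slow", "veryslow"]


def variant_flags(use_g480, use_aq0):
    """Extra ffmpeg flags for one variant."""
    flags = []
    if use_g480:
        flags += ["-g", "480"]       # 8 second GOP at 60 fps
    if use_aq0:
        flags += ["-aq-mode", "0"]   # disable adaptive quantization
    return flags


def variant_label(preset, use_g480, use_aq0):
    """Label in the style used in the data, e.g. 'fast g480 aq0'."""
    parts = [preset]
    if use_g480:
        parts.append("g480")
    if use_aq0:
        parts.append("aq0")
    return " ".join(parts)


def all_variants():
    """Map every label to its extra flags: 4 presets x 4 variants."""
    return {
        variant_label(p, g, a): variant_flags(g, a)
        for p, g, a in product(PRESETS, [False, True], [False, True])
    }


for label, flags in all_variants().items():
    print(label, flags)
```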

One of the more controversial results from earlier tests was how faster presets beat slower presets in VMAF. Spoiler alert… This test will show the same. I know it’s not intuitive, but it is indeed what actually happens. Bear in mind that with different resolutions and different source gameplay footage, this may change. So stay tuned for 720p60 in both Overwatch and Heroes of the Storm, followed by the same again at higher resolutions.

Results

x264 Fast Preset

VMAF

VMAF scores – Fast preset variations for x264
  • Disabling adaptive quantization (aq0) provides a decent buff to the quality per bitrate.
  • Setting Group of Pictures length to 8 seconds (g480) provides a tiny additional buff. This affects both the standard Fast preset and the aq0 variant.
  • x264 Fast preset with both aq0 and g480 performs better than the other variants on VMAF.

MS-SSIM

MS-SSIM scores – Fast preset variations for x264
  • g480 provides a minimal buff to the MS-SSIM score.
  • aq0 hurts the score a great deal. aq0 at 5500kbps scores the same as 4400kbps without the aq0 flag.
  • Fast preset with only g480 performs better than the other variants on MS-SSIM.
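To put the 5500kbps vs 4400kbps observation above into perspective, here is the arithmetic as a tiny sketch. The function name is made up for illustration; the two bitrates are the ones quoted in the list.

```python
def extra_bitrate_percent(bitrate_with_flag, equivalent_bitrate_without):
    """How much more bitrate (in percent) one variant needs to match another."""
    extra = bitrate_with_flag - equivalent_bitrate_without
    return extra / equivalent_bitrate_without * 100.0


# aq0 at 5500kbps matches the MS-SSIM score of 4400kbps without aq0.
print(extra_bitrate_percent(5500, 4400))  # 25.0
```

In other words, by MS-SSIM alone, aq0 on the Fast preset costs about a quarter more bitrate at that point on the curve.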

PSNR

PSNR scores – Fast preset variations for x264
  • The quality improvement by bitrate according to PSNR is much closer to linear than the other metrics.
  • aq0 provides a decent buff to PSNR scores over the non-aq0 variants.
  • g480 provides a minimal buff to the scores.
  • x264 Fast preset with both aq0 and g480 gives the best scores by a tiny margin.

x264 Medium Preset

VMAF

VMAF scores – Medium preset variations for x264
  • g480 outperforms the other variants for the middle of the curve.
  • aq0 & g480 together win on the low and high end of the curve.

MS-SSIM

MS-SSIM scores – Medium preset variations for x264
  • As before, MS-SSIM performs worse with aq0 and a tiny bit better with g480.

PSNR

PSNR scores – Medium preset variations for x264
  • On Medium, the PSNR scores do the same as they did on Fast preset. The variant with both aq0 and g480 performs the best.

x264 Slow Preset

VMAF

VMAF scores – Slow preset variations for x264
  • aq0 begins to show a more consistent detriment to x264 when we get to the Slow preset.
  • g480 still provides a small buff to non-g480.
  • Slow g480 WITHOUT aq0 will be selected as the best performer from this graph.

MS-SSIM

MS-SSIM scores – Slow preset variations for x264
  • The MS-SSIM scores work exactly as in the previous presets.

PSNR

PSNR scores – Slow preset variations for x264
  • The PSNR scores on the Slow preset show the same pattern as with the previous presets. The only thing to notice is that the lines are getting closer together as the preset gets slower.

x264 Very Slow Preset

VMAF

VMAF scores – VerySlow preset variations for x264
  • For the most part, all variants are quite similar in scores. The largest separation is at the very lowest end of the curve, where aq0 provides a quality buff.
  • The variant with both g480 and aq0 manages to achieve the best scores overall.
  • In summary, different x264 presets at different bitrates have varying degrees of success with VMAF. Generally speaking, g480 and aq0 both provide a buff, just not in EVERY circumstance.

MS-SSIM

MS-SSIM scores – VerySlow preset variations for x264
  • Again, the MS-SSIM scores follow the same pattern. g480 provides a tiny buff and aq0 hurts the score.
  • In summary, MS-SSIM agrees with VMAF and PSNR regarding the small quality buff of g480. However it disagrees with the other metrics regarding aq0.

PSNR

PSNR scores – VerySlow preset variations for x264
  • PSNR also consistently views each variant as either a buff or nerf in all tested presets at all tested bitrates. aq0 is a decent buff, and g480 is a very minor one.
  • The only thing that changes as the x264 preset gets slower is the gap between the variants.

Finalists

Here we select the best curve from each preset and compare them per quality metric.
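One simple way to pick a "best" curve programmatically is to take the variant with the highest average score across the tested bitrates. The sketch below assumes that rule, which is not necessarily how the finalists here were chosen, and the numbers in it are placeholders rather than measurements from this post.

```python
def best_variant(curves):
    """Return the variant label whose scores have the highest mean.

    `curves` maps a variant label to a {bitrate_kbps: score} dict.
    """
    def mean_score(label):
        scores = list(curves[label].values())
        return sum(scores) / len(scores)

    return max(curves, key=mean_score)


# Placeholder numbers for illustration only; NOT data from these tests.
example_curves = {
    "slow":          {3000: 70.0, 4500: 80.0, 6000: 86.0},
    "slow g480":     {3000: 70.5, 4500: 80.4, 6000: 86.3},
    "slow aq0":      {3000: 70.2, 4500: 79.6, 6000: 85.8},
    "slow g480 aq0": {3000: 70.4, 4500: 79.9, 6000: 86.0},
}
print(best_variant(example_curves))  # slow g480
```

Averaging hides cases where one variant wins only in the middle of the curve and another wins at the ends, so it is still worth looking at the graphs.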

VMAF

VMAF scores – best variant of each x264 preset

The variant with aq0 AND g480 is used for all presets except Slow, where the g480-only variant is selected. The VMAF scores for x264 presets are the exact opposite of what most people expect. Fast comes out as the winner, followed by Medium, then Slow and finally VerySlow. In other resolutions and games the ranking between the presets may be different. But that is a graph for another day. Stay tuned!

PSNR

PSNR scores – best variant of each x264 preset

All presets for PSNR use the aq0 & g480 variant since they consistently scored the best. Compared together, they are remarkably close in scores. The Slow preset performs the worst all round, while Fast performs the best. Medium starts equal to Fast at low bitrates, but drops off. On the other hand, VerySlow catches up with Fast at the higher bitrates. VerySlow and Medium swap places around 4350kbps.
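A crossover point like the one mentioned above can be estimated from the sampled scores by linear interpolation between the two bitrates where the difference changes sign. The sketch below shows the idea with placeholder numbers, not the PSNR data from this test; the function name is hypothetical.

```python
def crossover_bitrate(bitrates, scores_a, scores_b):
    """Estimate the bitrate where curve A and curve B swap places.

    All three lists must be the same length and sorted by bitrate.
    Returns None if the curves never cross in the sampled range.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    for i in range(len(diffs) - 1):
        d0, d1 = diffs[i], diffs[i + 1]
        if d0 == 0:
            return float(bitrates[i])
        if d0 * d1 < 0:
            # Linear interpolation to where the difference reaches zero.
            t = d0 / (d0 - d1)
            return bitrates[i] + t * (bitrates[i + 1] - bitrates[i])
    if diffs and diffs[-1] == 0:
        return float(bitrates[-1])
    return None


# Placeholder scores for illustration only.
print(crossover_bitrate([4000, 5000], [40.0, 41.0], [40.5, 40.5]))  # 4500.0
```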

MS-SSIM

MS-SSIM scores – best variant of each x264 preset

This graph is the more commonly expected one. VerySlow scores the highest across all bitrates. Slow and Medium are approximately equal from start to finish. Fast starts off well below, but catches up to Slow and Medium at around 3200kbps.

Sample frames

Webspace limitations at this time prevent me from just making a bunch of videos available for download. This may change in the future and this post may get updated. For now, at least we can look at some sample frames.

Fast g480 aq0 vs VerySlow

To point out the disagreement between VMAF & PSNR on one side and MS-SSIM on the other, I picked a frame of high motion. VMAF & PSNR believe that the left side is better while MS-SSIM believes that the right side is better. In my own personal opinion, MS-SSIM makes the better choice in this instance.

Unfortunately there is a catch. This catch is one of the problems with subjective opinions on quality. I believe that the MS-SSIM pick is better because I can see the detail that it is trying to preserve. On the VMAF choice, some of this quality is blended and blurred in. The MS-SSIM winner does seem to retain more details, however those small details look blocky and pixelated. The VMAF winner tends to look smoother and more consistent. Personally, the blockiness bothers me less than the blurriness in this particular video. But this is because I have access to the original, while a stream viewer would not!

I have to admit something. If I did not have both versions AND the original lossless source material, and did not play Apex on a regular basis, I may very well not have been aware of the detail that is lost in the VMAF-preferred frame. A random viewer who doesn’t have the original to check, AND is not aware of the detail that is blurred out, COULD easily reach the opposite conclusion: that the VMAF-preferred video looks smoother and cleaner, while the MS-SSIM video looks blocky and pixelated. This is the catch with being subjective, and the reason why we need metrics. It’s good to learn which metric prefers which differences so we can form better conclusions.

x264 Medium vs Slow

This is the same frame but from 2 more encodes. This time, I chose encodes to point out what VMAF sees that MS-SSIM does not. With neither the g480 nor aq0 options, MS-SSIM rates Medium and Slow almost exactly the same at this bitrate. However, VMAF and PSNR both say that Medium is at least a little bit better than Slow at this bitrate.

A good place to look in this frame is the rocks and grass on the far right-hand side. Also, the flare from the gunfire just to the left of the weapon. This frame looks much the same on both encodes but the subtle difference is most apparent in these two places. Similar to the above comparison, the Slow version retains slightly more detail for the price of some slight blockiness. MS-SSIM thinks that these two versions are effectively the same, with a SLIGHT preference for the Medium. VMAF and PSNR both show clear preference for the Medium version. MS-SSIM doesn’t consider some blockiness to be detrimental to the score. However, VMAF and PSNR both prefer elimination of blockiness by smoothing out some detail.

Conclusion

For this test video, Apex Legends in 720p at 60 fps, the differences are honestly not particularly huge. Choosing a slower preset or aq0 is just a matter of preferring MS-SSIM scores or VMAF and PSNR. The only consistent way of improving ALL scores is to take whatever preset you have and add g480. Even then, the benefit is very minor and not easily noticeable.

I would hazard a guess that the gradual optimisation of x264 over the past decade and a half has largely been based on SSIM as a benchmark. Adaptive quantization, as well as the search range, direction and weighting across pixels and between frames, have likely been developed on a basic premise of “if it improves the SSIM then it’s good, if it doesn’t then it’s not helping”. At least to some degree. While PSNR has been around the entire time, VMAF was only adopted by Netflix from VQA in mid 2016. This is well after the release of HEVC and VP9, as well as several alternative and hardware encoders for the AVC (H.264) format.

In the end, for this resolution and game, you simply need to select your own preference. Would you pay the price of blockiness to retain some detail? Or do you prefer the image to be smoother at the price of blurring out some detail? In this case, I personally prefer the MS-SSIM results, but I can understand why some people prefer the other. Especially if they don’t have access to the original for comparison.

Data
