x264 presets and settings – Quality comparison of 1080p60 Apex Legends

Introduction

Welcome to the second post in this series. This time I ran x264 encodes on a 1080p 60 fps Apex Legends clip. The main differences compared to 720p are in the PSNR scores, while VMAF and MS-SSIM tend to maintain their patterns.

Note that currently I’m only collecting data for single-pass or equivalent encode methods, so that the results remain relevant for live streaming. For example, NVENC 2-pass can use CUDA or the CPU for the first pass and NVENC for the second. Because both passes run simultaneously, it’s still live encoding and suitable for streaming. x264 2-pass, on the other hand, requires scanning the entire input and then starting again for the second pass. Therefore, I only list single-pass encodes here.

TLDR

If you aren’t sure yet that you want to read all the results before seeing some images, here’s a preview:

x264 Fast g480 aq0 4390kbps vs VerySlow 4898kbps

x264 Medium 3946kbps vs 4914kbps

Progress

Older posts that led to this study

720p 60 fps

  • Apex Legends: H.264 – x264, H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.
  • Overwatch: H.264 – x264, H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.
  • Heroes of the Storm: H.264 – x264, H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.

1080p 60 fps

  • Apex Legends: H.264 – x264 (this post), H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.
  • Overwatch: H.264 – x264, H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.
  • Heroes of the Storm: H.264 – x264, H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.

1440p 60 fps

  • Apex Legends: H.264 – x264, H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.
  • Overwatch: H.264 – x264, H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.
  • Heroes of the Storm: H.264 – x264, H.264 – NVENC Turing, H.264 – QuickSync Coffee Lake.

Test Bench

The test bench isn’t particularly relevant in a quality analysis. No matter what hardware you use, you will get identical quality so long as it actually works. Speed tests, where hardware matters more, will start coming through later.

  • CPU = Intel i7-8086k stock clock. Coffee Lake QuickSync Version 6.
  • Graphics Card = NVIDIA RTX 2070. Turing NVENC 6th Generation.
  • RAM = 16GB
  • Encoding by FFMPEG v4.1.1 12th February 2019 with libx264 core 157 r2935 545de2f by VideoLAN http://www.videolan.org/x264.html
  • Scoring by FFMPEG version N-93394-g14eea7c with libvmaf v1.3.14 built with gcc 7 (Ubuntu 7.3.0-27ubuntu1~18.04) all compiled from source on 17th March 2019.
  • libvmaf model vmaf_v0.6.1.pkl
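For anyone wanting to repeat the scoring step, here is a minimal sketch of how such a command could be assembled. It is not the exact command used for this post. The option names (model_path, log_path, log_fmt, psnr, ms_ssim) are the ones documented for the libvmaf filter in FFmpeg 4.x, and newer FFmpeg releases have renamed some of them. The file names are hypothetical placeholders.

```python
def build_vmaf_command(encoded, reference, model_path, log_path):
    """Return an ffmpeg argument list that scores `encoded` against `reference`.

    Assumes an FFmpeg 4.x build with libvmaf enabled; file names are placeholders.
    """
    # Ask libvmaf to also report PSNR and MS-SSIM alongside VMAF.
    filter_options = ":".join([
        f"model_path={model_path}",
        f"log_path={log_path}",
        "log_fmt=json",
        "psnr=1",
        "ms_ssim=1",
    ])
    return [
        "ffmpeg",
        "-i", encoded,      # first input: the distorted (encoded) video
        "-i", reference,    # second input: the lossless reference
        "-lavfi", f"libvmaf={filter_options}",
        "-f", "null", "-",  # discard the video output; only the scores matter
    ]


cmd = build_vmaf_command(
    "fast_4000.mp4", "lossless.mkv", "vmaf_v0.6.1.pkl", "fast_4000.json"
)
print(" ".join(cmd))
```

The list form can be passed straight to subprocess.run. Paths containing colons would need escaping inside the filter string, which this sketch does not handle.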

All original sources are recorded lossless from original gameplay at the resolution in question. Instead of scaling the same video down for alternative resolutions, I recorded each one individually. This is to avoid introducing any scaling artifacts.

x264 Testing Strategy

g480

Following feedback from earlier tests, I applied a few variations to the presets. The first sets the group of pictures (GOP) size to 480 frames (-g 480), which is 8 seconds at 60 fps. I tried 4, 8 and 16 seconds on a couple of test bitrates. 8 seconds (-g 480) showed an improvement over 4 seconds (-g 240), whereas 16 seconds made no difference or sometimes gave worse results. So I ran the full test with 8 seconds (-g 480). This is indicated in the data by “g480”.

aq0

Another idea from feedback was regarding adaptive quantization. Disabling adaptive quantization (-aq-mode 0) improved VMAF and PSNR scores, but lowered MS-SSIM scores. So I included it for people to make their own informed choices. This is indicated in the data by “aq0”.
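To make the g480 and aq0 variations concrete, here is a rough sketch of how the encode commands could be generated. The full command line used for these tests isn’t reproduced in this post, so everything beyond -preset, -g and -aq-mode (the file names, the use of -b:v for the target bitrate, the helper name build_x264_command) is an assumption for illustration.

```python
FPS = 60  # frame rate of the source clip


def build_x264_command(source, output, preset, bitrate_kbps,
                       gop_seconds=None, disable_aq=False):
    """Return an ffmpeg argument list for one single-pass libx264 encode."""
    args = [
        "ffmpeg", "-i", source,
        "-c:v", "libx264",
        "-preset", preset,
        "-b:v", f"{bitrate_kbps}k",  # target bitrate; the actual result will differ
    ]
    if gop_seconds is not None:
        # GOP length is given in frames: 8 seconds at 60 fps is 480 frames.
        args += ["-g", str(gop_seconds * FPS)]
    if disable_aq:
        # "aq0" variant: turn adaptive quantization off.
        args += ["-aq-mode", "0"]
    args.append(output)
    return args


# The "g480 aq0" variant of the Fast preset at a 4000kbps target.
print(" ".join(build_x264_command(
    "lossless.mkv", "fast_g480_aq0_4000.mp4", "fast", 4000,
    gop_seconds=8, disable_aq=True,
)))
```

Leaving gop_seconds and disable_aq at their defaults gives the unmodified preset, so the four variants of each preset come from the four combinations of those two arguments.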

Harmonic Mean

Another piece of feedback concerned how the scores are averaged. The default method is typically the arithmetic mean, but I’ve included the harmonic mean as a dotted line in the same colour for each result. This has almost no effect on MS-SSIM scores, but it does lower the overall score for VMAF and PSNR. The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals. As a simple example, for scores of [4, 4, 1] the arithmetic mean is 3, while the harmonic mean is 2. Low outliers affect the harmonic mean more, which means the odd frame that is completely warped will hurt the overall score greatly. The gap between the harmonic mean and the regular average therefore reveals how frequently the worst frames appear.

As long as the harmonic mean remains at a consistent small distance from the arithmetic mean, you can be certain that the very worst frames in a video are not too far from the average quality.
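The [4, 4, 1] example above can be checked with a few lines of Python. This is only an illustration of the two averages, assuming every per-frame score is greater than zero (a zero score would make the reciprocal undefined). It is not the pooling code used by libvmaf.

```python
def arithmetic_mean(scores):
    """Ordinary average: sum divided by count."""
    return sum(scores) / len(scores)


def harmonic_mean(scores):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    return len(scores) / sum(1.0 / s for s in scores)


frame_scores = [4, 4, 1]
print(arithmetic_mean(frame_scores))  # 3.0
print(harmonic_mean(frame_scores))    # 2.0

# The gap between the two grows when a few frames score far below the rest.
print(arithmetic_mean(frame_scores) - harmonic_mean(frame_scores))  # 1.0
```

With scores that are all equal, both functions return the same value, which is why a small and steady gap suggests there are no badly broken frames.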

Results

x264 Fast Preset

VMAF

VMAF scores – Fast preset variations for x264 on 1080p60 Apex Legends
  • Just like with 720p60, disabling adaptive quantization (aq0) provides a decent buff to the quality per bitrate.
  • Setting Group of Pictures length to 8 seconds (g480) has varying success. On the regular Fast preset it provides a small buff while under 3.6Mbps. At higher bitrates it performs worse. With aq0 set, g480 provides only a marginal buff.
  • x264 Fast preset with both aq0 and g480 performs better than the other variants on VMAF.

MS-SSIM

MS-SSIM scores – Fast preset variations for x264 on 1080p60 Apex Legends
  • g480 provides an indiscernible buff to the MS-SSIM score with aq0. Conversely, g480 gives a small buff on the regular Fast below 3600kbps which then reverses for higher bitrates.
  • aq0 decreases MS-SSIM a great deal. aq0 at 6500kbps scores the same as 5400kbps without the aq0 flag.
  • Fast preset with neither g480 nor aq0 performs better than the other variants on MS-SSIM.
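Comparisons like “aq0 at 6500kbps scores the same as 5400kbps without it” come from reading across the curves at equal score. A minimal sketch of that lookup using linear interpolation between measured points is below. The curve values in the example are made-up placeholders, not measurements from this test, and the function name is hypothetical.

```python
def bitrate_for_score(curve, target_score):
    """Estimate the bitrate at which a curve reaches target_score.

    curve: list of (bitrate_kbps, score) pairs sorted by bitrate, with scores
    that do not decrease as bitrate rises. Returns None if the target lies
    outside the measured range.
    """
    for (b0, s0), (b1, s1) in zip(curve, curve[1:]):
        if s0 <= target_score <= s1:
            if s1 == s0:
                return b0  # flat segment: take its lower end
            fraction = (target_score - s0) / (s1 - s0)
            return b0 + fraction * (b1 - b0)
    return None


# Placeholder numbers purely for illustration.
example_curve = [(4000, 90.0), (5000, 92.0), (6000, 93.0)]
print(bitrate_for_score(example_curve, 92.5))  # 5500.0
```

Running the same lookup on two variants for the same score gives the bitrate saving (or cost) of one variant over the other.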

PSNR

PSNR scores – Fast preset variations for x264 on 1080p60 Apex Legends
  • The PSNR metric has the most linear quality per bitrate rise of all the metrics.
  • aq0 provides a decent buff to PSNR scores from the standard fast preset.
  • g480 provides no discernible difference to the aq0 variant.
  • Although g480 without aq0 scores higher than the standard Fast preset, the otherwise identical command ends up using more bits, so its curve sits lower for the most part.
  • x264 Fast preset with both aq0 and g480 gives the best PSNR scores by a tiny margin.

x264 Medium Preset

VMAF

VMAF scores – Medium preset variations for x264 on 1080p60 Apex Legends
  • g480 provides an imperceptible increase on Medium, while aq0 gives a larger one.

MS-SSIM

MS-SSIM scores – Medium preset variations for x264 on 1080p60 Apex Legends
  • This graph shows MS-SSIM scoring worse with aq0 and only negligibly better with g480.

PSNR

PSNR scores – Medium preset variations for x264 on 1080p60 Apex Legends
  • On Medium, g480 makes no discernible difference to PSNR.
  • aq0 has a consistent positive buff to PSNR on Medium preset.

x264 Slow Preset

VMAF

VMAF scores – Slow preset variations for x264 on 1080p60 Apex Legends
  • g480 still provides only a tiny improvement, but aq0 provides slightly more.
  • The variant with both aq0 and g480 still performs the best on VMAF on x264 Slow.

MS-SSIM

MS-SSIM scores – Slow preset variations for x264 on 1080p60 Apex Legends
  • The MS-SSIM scores behave much the same as in the previous presets, with aq0 lowering the score significantly and g480 improving it just marginally.

PSNR

PSNR scores – Slow preset variations for x264 on 1080p60 Apex Legends
  • The PSNR scores on the Slow preset show the same pattern but with the lines getting closer together as the preset gets slower.

x264 Very Slow Preset

VMAF

VMAF scores – VerySlow preset variations for x264 on 1080p60 Apex Legends
  • g480 on VerySlow now provides around half the buff of aq0.
  • The variant with both g480 and aq0 achieves the best VMAF scores again.

MS-SSIM

MS-SSIM scores – VerySlow preset variations for x264 on 1080p60 Apex Legends
  • As usual, aq0 hurts the MS-SSIM score drastically.
  • g480 provides a small but consistent improvement across bitrates.

PSNR

PSNR scores – VerySlow preset variations for x264 on 1080p60 Apex Legends
  • Each variant is consistently either a buff or nerf in all tested presets and bitrates.

Finalists

Here we select the best curve from each preset and compare them per quality metric.

VMAF

VMAF scores – best variant of each x264 preset on 1080p60 Apex Legends

The variant with both aq0 and g480 is used for all presets. The VMAF scores for x264 presets are mixed. For the most part, Fast is the winner, followed by Medium, then Slow and finally VerySlow. The only exception, and where this differs from the 720p 60 fps Apex Legends test, is at very low bitrates, where Medium beats Fast and VerySlow beats Slow. However, these bitrates give scores that are well below acceptable quality. At bitrates this low, it may be better to lower the resolution or frame rate and accept scaling artifacts instead of encoder quantization artifacts.

PSNR

PSNR scores – best variant of each x264 preset on 1080p60 Apex Legends

All presets for the PSNR metric use the aq0 & g480 variant since they consistently scored the best. Compared together, they are remarkably close in scores. The Slow preset scores the lowest below 3200kbps. VerySlow scores the highest for the entire curve. Medium is second best until 7.5Mbps where Fast catches up to it and they even out.

MS-SSIM

MS-SSIM scores – best variant of each x264 preset on 1080p60 Apex Legends

The MS-SSIM metric agrees most closely with x264’s own preset ordering, where slower presets search more thoroughly. VerySlow scores the highest and Fast the lowest across all bitrates. Slow and Medium are approximately equal starting from 4Mbps.

Sample frames

There are 256 encoded files occupying nearly 6GB, and the original lossless version is 2GB, so I can’t really just pop them up here for download. What I can do is make some frame comparisons and provide all the log files in case somebody wants to analyse them or repeat the test themselves on another source.

x264 Fast g480 aq0 vs VerySlow

I selected a frame with medium amounts of movement, not too fast, not too slow. The left side is Fast aq0 g480 and the right side is VerySlow default. In VMAF’s opinion, the left side is a tiny bit better than the right side. PSNR rates the right side as a tiny bit better than the left. However, according to MS-SSIM, the right side is considerably better than the left.

x264 obviously agrees with MS-SSIM since that’s the one with adaptive quantization and a wider search space through the preset. In my personal opinion, the right side image is the better one.

Clearly VMAF favours smooth blurring while MS-SSIM favours fine detail. When looking at the above image, take a close look around the bullet cartridge. In the right-side image it retains more detail, but some light bleeds into the surroundings. In the left-side image it’s blurred over, but looks cleaner at the same time. Another good example is the label on the gun. The left side is smoothed over, while the right side retains more detail but looks somewhat distorted in a different way.

With this in mind, pick your preferred metric for Apex 1080p60 and use the finalists graphs to examine the best preset for you.

x264 Medium 3946kbps vs 4914kbps

Above is the same frame from two other encodes. Both use the Medium preset with no additional flags; the only difference is the bitrate. The left side was set to 4000kbps and came out at 3946kbps, while the right side was set to 5000kbps and came out at 4914kbps. The difference is not blatantly obvious, but it is noticeable.
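The gap between the bitrate that was set and the bitrate that came out can be estimated from an encoded file’s size and duration. The sketch below shows the arithmetic only. The 60-second duration and the byte count are hypothetical values chosen to reproduce the 3946kbps figure, and a real file’s size also includes container overhead and any audio track.

```python
def average_bitrate_kbps(size_bytes, duration_seconds):
    """Average bitrate in kilobits per second (1 kbit = 1000 bits)."""
    return size_bytes * 8 / duration_seconds / 1000


def percent_off_target(actual_kbps, target_kbps):
    """How far the actual bitrate landed from the target, in percent."""
    return (actual_kbps - target_kbps) / target_kbps * 100


# Hypothetical clip: 29,595,000 bytes over 60 seconds.
actual = average_bitrate_kbps(29_595_000, 60)
print(actual)  # 3946.0
print(round(percent_off_target(actual, 4000), 2))  # -1.35
```

Plotting scores against the actual bitrate rather than the target is what keeps the curves comparable when two commands with the same target end up using different numbers of bits.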

Conclusion

There is a little variation in the scoring in this test compared to the same test at 720p. The differences between MS-SSIM’s and VMAF’s ratings are more obvious, and you’ll need to choose which one you prefer.

Apex Legends actually scores pretty low across the board compared to other games. It is indeed a beautiful game, and its visual detail and the speed it plays at combine to limit how well a stream can be compressed.

In the end, for this resolution and game, you simply need to select your own preference. Would you pay the price of blockiness to retain some detail? Or do you prefer the image to be smoother at the price of blurring out some detail? In this case, I personally prefer the MS-SSIM results, but I can understand why some people prefer the other.

Data
