Recompiling FFmpeg: is it worth it?

Overview

This article documents a controlled benchmark study aimed at evaluating whether recompiling FFmpeg with different compilers and optimization flags provides measurable performance benefits on a first-generation AMD Ryzen CPU.

The focus is on CPU-bound, real-world video transcoding workloads, not synthetic microbenchmarks. All tests were performed on the same machine, using identical inputs and methodology, varying only the compiler and compilation flags.

Test System Configuration

Hardware

  • CPU: AMD Ryzen 7 1700 (Zen 1, 8 cores / 16 threads)
  • RAM: 16 GB
  • Storage: SSD
  • Architecture: x86_64

Software Environment

  • Operating System: Linux Mint 22.3 (Cinnamon, 64-bit)
  • Debian base: trixie/sid
  • Kernel: 6.8.0-90-generic
  • CPU governor: performance (fixed during benchmarks)

Toolchain

  • FFmpeg (system): 6.1.1-3ubuntu5
  • GCC: 13.3.0
  • Clang/LLVM: 18.1.3
  • Assembler:
    • NASM 2.16.01
    • YASM 1.3.0

External Libraries

  • x264: built locally from source and linked dynamically (system x264 package not used)

Test Material

All input files are unmodified Blender Foundation open movies. Files were not trimmed, re-encoded, or altered in any way. SHA-256 checksums were recorded to ensure bitwise-identical inputs across all tests.

| File | Resolution | FPS | Codec |
|------|------------|-----|-------|
| Sintel.2010.1080p.mkv | 1920×818 | 24 | H.264 High |
| Big Buck Bunny 60fps 4K - Official Blender Foundation Short Film.mp4 | 1920×1080 | 60 | H.264 High |
| Tears of Steel - Blender VFX Open Movie.mp4 | 1728×720 | 24 | H.264 High |
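
Any SHA-256 tool can record the checksums mentioned above. The following is a minimal sketch of one way to do it in Python, reading the file in chunks so that large videos need not fit in memory. The function names and the 1 MiB chunk size are illustrative choices, not taken from the original test scripts.

```python
import hashlib


def sha256_stream(stream, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a binary file-like object."""
    digest = hashlib.sha256()
    # Read fixed-size chunks until read() returns an empty bytes object.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def sha256_file(path):
    """Return the hex SHA-256 digest of the file at `path`."""
    with open(path, "rb") as f:
        return sha256_stream(f)
```

Comparing the digest recorded before the first run with one taken before each later run is enough to confirm that an input has not changed.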

Benchmark Methodology

Transcoding Scenario

All benchmarks used a realistic end-to-end transcoding workflow:

  • Video encoder: libx264
  • Preset: slow
  • Rate control: CRF 18
  • Audio: copied bit-exact (-c:a copy)
  • Threads: 16 (matching logical CPU threads)

Example command (simplified):

ffmpeg -i input \
  -c:v libx264 -preset slow -crf 18 \
  -c:a copy -threads 16 output.mkv
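
For scripted runs, the same command can be assembled as an argument list, which avoids shell quoting problems with the file names that contain spaces. This is only a sketch: the helper name `build_transcode_cmd` and its parameters are hypothetical, while the ffmpeg options are those of the simplified command above.

```python
import shlex


def build_transcode_cmd(src, dst, threads=16, preset="slow", crf=18):
    """Return the ffmpeg argument list for the transcode described above."""
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264", "-preset", preset, "-crf", str(crf),
        "-c:a", "copy",                 # audio stream copied unchanged
        "-threads", str(threads),
        dst,
    ]


cmd = build_transcode_cmd("Tears of Steel - Blender VFX Open Movie.mp4", "output.mkv")
# shlex.join shows the command as it would have to be typed in a shell.
print(shlex.join(cmd))
```

The list can be passed directly to `subprocess.run`, so no shell is involved when the benchmark is automated.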

Measurement

  • Tool: /usr/bin/time -v
  • Primary metric: wall clock time
  • Secondary metrics:
    • user CPU time
    • peak RSS memory
  • Three runs per test, arithmetic mean reported
  • Locale forced to LC_ALL=C to ensure numeric consistency
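
The three metrics above can be pulled out of the report that GNU `time -v` prints. The sketch below shows one way to parse it; the sample report is synthetic and is NOT measured data from this study, and the function name is hypothetical. The parser assumes "." as the decimal separator, which matches the LC_ALL=C setting.

```python
from statistics import mean


def parse_time_v(report):
    """Extract wall time (s), user time (s) and peak RSS (KiB) from `time -v` output."""
    fields = {}
    for line in report.splitlines():
        # Split on the LAST ": " because some labels contain colons themselves.
        label, sep, value = line.strip().rpartition(": ")
        if sep:
            fields[label] = value
    # Wall clock is printed as h:mm:ss or m:ss; fold the parts into seconds.
    wall = 0.0
    for part in fields["Elapsed (wall clock) time (h:mm:ss or m:ss)"].split(":"):
        wall = wall * 60 + float(part)
    return {
        "wall_s": wall,
        "user_s": float(fields["User time (seconds)"]),
        "peak_rss_kib": int(fields["Maximum resident set size (kbytes)"]),
    }


# Synthetic sample report, not a real measurement.
SAMPLE = "\n".join([
    '\tCommand being timed: "ffmpeg -i input.mkv output.mkv"',
    "\tUser time (seconds): 100.50",
    "\tElapsed (wall clock) time (h:mm:ss or m:ss): 1:02.25",
    "\tMaximum resident set size (kbytes): 204800",
])

runs = [parse_time_v(SAMPLE) for _ in range(3)]      # three runs per test
print(mean(r["wall_s"] for r in runs))               # arithmetic mean of wall time
```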

Tested Configurations

Baseline

  • System FFmpeg package (distribution build)

GCC Builds

  • GCC -O2
  • GCC -O2 -march=znver1 -mtune=znver1
  • GCC -O3
  • GCC -O3 -march=znver1 -mtune=znver1

Clang Builds

  • Clang -O2
  • Clang -O2 -march=znver1 -mtune=znver1

All FFmpeg builds:

  • were linked against the same locally built x264
  • used identical configure options except for compiler flags
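
The six custom builds can be described as a small table of compiler and flag combinations. The sketch below turns each entry into an FFmpeg `./configure` invocation. The article does not list the configure options that were actually used, so everything here is an assumption: the install prefix is hypothetical, and passing the optimization level through `--optflags` and the CPU flags through `--extra-cflags` is just one plausible way to apply them.

```python
# (compiler, optimization level, CPU-specific flags) for each custom build.
BUILDS = {
    "gcc-O2":          ("gcc",   "-O2", ""),
    "gcc-O2-znver1":   ("gcc",   "-O2", "-march=znver1 -mtune=znver1"),
    "gcc-O3":          ("gcc",   "-O3", ""),
    "gcc-O3-znver1":   ("gcc",   "-O3", "-march=znver1 -mtune=znver1"),
    "clang-O2":        ("clang", "-O2", ""),
    "clang-O2-znver1": ("clang", "-O2", "-march=znver1 -mtune=znver1"),
}


def configure_args(name, prefix="/opt/ffmpeg-test"):
    """Return an assumed ./configure argument list for one build (hypothetical prefix)."""
    cc, opt, arch = BUILDS[name]
    args = [
        "./configure",
        f"--prefix={prefix}/{name}",   # one install directory per build
        f"--cc={cc}",
        "--enable-gpl",                # required by FFmpeg for libx264
        "--enable-libx264",
        f"--optflags={opt}",           # overrides configure's default optimization flags
    ]
    if arch:
        args.append(f"--extra-cflags={arch}")
    return args


for name in BUILDS:
    print(" ".join(configure_args(name)))
```

Keeping every option except the compiler and its flags identical is what lets timing differences be attributed to the compiler settings.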

Results Summary

Average Encoding Time (wall clock, seconds)

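| Video | System | GCC O2 | GCC O2 znver1 | Clang O2 |
|-------|--------|--------|---------------|----------|
| Sintel | 554.00 | 550.68 | 548.06 | 547.48 |
| Big Buck Bunny | 598.33 | 595.53 | 592.65 | 589.88 |
| Tears of Steel | 308.67 | 306.59 | 306.04 | 305.26 |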

(Standard deviation across runs was consistently below 1 second. The GCC -O3 builds and the Clang znver1 build are not listed in this summary table.)
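
The relative gains discussed in the analysis follow directly from these timings. As a quick check, the sketch below recomputes each build's improvement over the system package as a percentage reduction in wall-clock time. The timings are copied from the table; the helper name `gain_pct` is illustrative.

```python
# Mean wall-clock times in seconds, copied from the results table.
TIMES = {
    "Sintel":         {"System": 554.00, "GCC O2": 550.68, "GCC O2 znver1": 548.06, "Clang O2": 547.48},
    "Big Buck Bunny": {"System": 598.33, "GCC O2": 595.53, "GCC O2 znver1": 592.65, "Clang O2": 589.88},
    "Tears of Steel": {"System": 308.67, "GCC O2": 306.59, "GCC O2 znver1": 306.04, "Clang O2": 305.26},
}


def gain_pct(baseline, candidate):
    """Percentage reduction in wall time relative to the baseline."""
    return (baseline - candidate) / baseline * 100.0


for video, row in TIMES.items():
    gains = {
        build: round(gain_pct(row["System"], seconds), 2)
        for build, seconds in row.items()
        if build != "System"
    }
    print(video, gains)
```

For example, Clang O2 on Sintel gives (554.00 − 547.48) / 554.00 ≈ 1.18%.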

Analysis

GCC

  • Recompiling FFmpeg with GCC -O2 provides a small but consistent improvement (~0.5–0.7%).
  • Adding Zen1-specific tuning (-march=znver1 -mtune=znver1) yields an additional ~0.2–0.5%.
  • -O3 does not consistently improve performance and may slightly degrade results depending on workload.

Best GCC configuration:

-O2 -march=znver1 -mtune=znver1

Clang

  • Clang -O2 outperforms all GCC configurations tested; against GCC -O2 znver1, the margin in the summary table is roughly 0.1–0.5%.
  • CPU-specific tuning (znver1) does not improve results with Clang and can be marginally negative.
  • Memory usage and stability remain comparable to GCC builds.

Best overall configuration:

Clang -O2

Final Conclusion

On a first-generation AMD Zen processor (Ryzen 7 1700):

  • Recompiling FFmpeg provides measurable but modest gains.
  • The best results in this study were achieved using Clang -O2, with improvements of approximately 1.1–1.4% over the distribution build.
  • GCC benefits slightly from CPU-specific tuning; Clang does not.
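  • No configuration produced dramatic gains, most likely because the performance-critical paths in x264 are already hand-optimized in assembly, which compiler flags do not affect, and because every custom FFmpeg build was linked against the same x264 library.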

Practical Recommendation

Recompiling FFmpeg is worthwhile only if all of the following hold:

  • encoding is CPU-bound
  • workloads are frequent or long-running
  • maintaining a custom build is acceptable

For general desktop usage or GPU-based encoding, the system FFmpeg package remains the most practical choice.
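
To put the recommendation in numbers, the sketch below estimates how much wall-clock time a given percentage gain saves, and how many encoding hours are needed before the savings offset the time spent maintaining a custom build. The workload and maintenance figures are hypothetical inputs, not data from this study, and equating machine hours with maintenance hours is a deliberate simplification.

```python
def hours_saved(encode_hours, gain_pct):
    """Wall-clock hours saved when encode time shrinks by gain_pct percent."""
    return encode_hours * gain_pct / 100.0


def break_even_hours(maintenance_hours, gain_pct):
    """Encoding hours needed before the savings equal the maintenance effort."""
    return maintenance_hours * 100.0 / gain_pct


# Hypothetical workload: 1000 hours of encoding, 2 hours of build maintenance,
# and a 1.2% gain (within the range measured for Clang -O2).
print(hours_saved(1000, 1.2))
print(break_even_hours(2, 1.2))
```

With a 1.2% gain, 1000 hours of encoding save about 12 hours, and 2 hours of maintenance are recovered only after roughly 167 hours of encoding.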