Hi.
I’m trying to encode frames with NVENC’s AV1 hardware encoder, targeting the best possible quality.
I’m surprised to see that -tune ull (ultra low latency) yields better quality than -tune uhq (ultra high quality) on an RTX 5090 with the 570.124.06 Linux driver.
The ffmpeg command I use is:
ffmpeg -hide_banner -loglevel error -i raw_frames.yuv -c:v av1_nvenc -video_track_timescale 50 -preset p7 -tune uhq -rc constqp -qp 1 -bitrate 1G -y encoded_frames.mp4
Here is a minimal script that downloads a sample video, encodes it with each tune and rate-control mode, and reports PSNR, SSIM, and file size for comparison:
#!/usr/bin/env bash
set -e

# Reference clip: 500 frames, 720p, YUV 4:2:0, 50 FPS.
url_source="https://media.xiph.org/video/derf/y4m/ducks_take_off_420_720p50.y4m"
checksum="547cce45773077e27c71fd02d3411237"
ref_path="/tmp/encoding_ref.y4m"

# Download the reference clip once and reuse it on later runs.
if [ ! -f "$ref_path" ]; then
    echo "Downloading video sample to '$ref_path'."
    wget "$url_source" -O "$ref_path"
fi

# Verify the download; md5sum -c expects "<hash>  <file>" (two spaces).
if ! echo "$checksum  $ref_path" | md5sum -c --status; then
    echo "Checksum mismatch. The file will be deleted, please retry."
    rm -f "$ref_path"
    exit 1
fi

# Results are keyed by "<mode>_<tune>"; keys preserves the encoding order.
declare -A ssim_results
declare -A psnr_results
declare -A size_results
keys=()
tunes="ull ll hq uhq"
modes="vbr constqp"

for mode in $modes; do
    for tune in $tunes; do
        echo "Encoding video in mode '$mode' using tune '$tune'"
        key="${mode}_${tune}"
        keys+=("$key")
        encode_path="/tmp/encoded_${key}.mp4"

        if [ "$mode" == "vbr" ]; then
            ffmpeg -hide_banner -loglevel error -i "$ref_path" -c:v av1_nvenc -video_track_timescale 50 -preset p7 -tune "$tune" -rc vbr -bitrate 1G -y "$encode_path"
        else
            ffmpeg -hide_banner -loglevel error -i "$ref_path" -c:v av1_nvenc -video_track_timescale 50 -preset p7 -tune "$tune" -rc constqp -qp 1 -bitrate 1G -y "$encode_path"
        fi

        # Compare the encoded file (input 0) against the reference (input 1).
        result=$(ffmpeg -i "$encode_path" -i "$ref_path" -lavfi "[0:v][1:v]ssim;[0:v][1:v]psnr" -f null - 2>&1)
        psnr=$(echo "$result" | grep -oP "PSNR .*? average:\K[0-9.]+")
        ssim=$(echo "$result" | grep -oP "SSIM .*? All:\K[0-9.]+")
        size=$(du -h "$encode_path" | cut -f1)

        psnr_results["$key"]=$psnr
        ssim_results["$key"]=$ssim
        size_results["$key"]=$size
    done
done

echo
echo "======== Summary of SSIM and PSNR results ========"
for key in "${keys[@]}"; do
    echo "- $key:"
    echo " PSNR: ${psnr_results[$key]}"
    echo " SSIM: ${ssim_results[$key]}"
    echo " Size: ${size_results[$key]}"
done
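As a side note, du -h reports disk usage rounded to a human-readable unit, so it can hide small size differences between encodes. Below is a minimal sketch of a helper that prints the average bitrate from the exact byte count instead. The name avg_bitrate_mbps is hypothetical and not part of the script above, and the 10-second duration is an assumption derived from the 500-frame, 50 FPS sample.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the original script.
# Usage: avg_bitrate_mbps <file> <duration in seconds>
# Prints the average bitrate in Mbit/s (1 Mbit = 1e6 bits).
avg_bitrate_mbps() {
    local path="$1" duration_s="$2"
    local bytes
    # wc -c gives the exact byte count of the file, unlike du -h.
    bytes=$(wc -c < "$path")
    awk -v b="$bytes" -v d="$duration_s" \
        'BEGIN { printf "%.2f\n", b * 8 / d / 1e6 }'
}

# Example call, assuming the file exists and the clip lasts 10 seconds:
# avg_bitrate_mbps /tmp/encoded_constqp_ull.mp4 10
```

Applied to the rounded sizes reported below, this works out to roughly 2 Mbit/s for the vbr runs and roughly 270 to 320 Mbit/s for the constqp runs.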
It produces the following output:
======== Summary of SSIM and PSNR results ========
- vbr_ull:
PSNR: 26.435912
SSIM: 0.735398
Size: 2.6M
- vbr_ll:
PSNR: 26.435912
SSIM: 0.735398
Size: 2.6M
- vbr_hq:
PSNR: 27.462104
SSIM: 0.786980
Size: 2.7M
- vbr_uhq:
PSNR: 27.468363
SSIM: 0.789680
Size: 2.6M
- constqp_ull:
PSNR: 58.942369
SSIM: 0.999355
Size: 381M
- constqp_ll:
PSNR: 58.942369
SSIM: 0.999355
Size: 381M
- constqp_hq:
PSNR: 52.918545
SSIM: 0.998359
Size: 335M
- constqp_uhq:
PSNR: 49.381956
SSIM: 0.996885
Size: 327M
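To put the constqp gap in perspective, a PSNR difference can be converted into a ratio of mean squared errors. The sketch below assumes the usual definition PSNR = 10 * log10(MAX^2 / MSE) with the same MAX for both encodes; mse_ratio is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: given PSNR_a and PSNR_b in dB, print MSE_b / MSE_a.
# Assumes PSNR = 10 * log10(MAX^2 / MSE) with identical MAX for both inputs,
# so MSE_b / MSE_a = 10^((PSNR_a - PSNR_b) / 10).
mse_ratio() {
    awk -v a="$1" -v b="$2" \
        'BEGIN { printf "%.1f\n", exp(log(10) * (a - b) / 10) }'
}

# constqp_ull versus constqp_uhq, using the PSNR values reported above.
mse_ratio 58.942369 49.381956
```

For these two values the ratio is about 9, meaning the uhq encode has roughly nine times the mean squared error of the ull encode at the same QP.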
Why are the PSNR/SSIM values better with -tune ull than with -tune uhq in constqp mode?