Performance question about nvenc

Has anyone tested the compression ratio of NVENC? I tried to measure it using NVEncC from rigaya/NVEnc.

I tested HEVC and H.264 with the same settings, but the compression ratio of HEVC was lower than that of H.264 (53.80 MB vs. 44.04 MB of output), which is quite strange.

NVEncC64.exe --input-res 3840x2160 --fps 30 --raw --input-csp yuv420p -i D:\Test\asia.yuv -c hevc --cqp 26:26:26 -u performance -b 0 --gop-len 256 --mv-precision Q-pel --tier high  -o D:\asia.h265
--------------------------------------------------------------------------------
D:\asia.h265
--------------------------------------------------------------------------------
NVEncC (x64) 5.46 (r2126) by rigaya, Mar  5 2022 12:31:25 (VC 1929/Win)
OS Version     Windows Server 2016 x64 (17763) [CP936]
CPU            Intel Xeon(R) Platinum 8163 @ 2.50GHz [TB: 2.70GHz] (5C/10T)
GPU            #0: GRID T4-8Q (2560 cores, 1590 MHz)[2147483.64]
NVENC / CUDA   NVENC API 9.0, CUDA 10.1, schedule mode: auto
Input Buffers  CUDA, 13 frames
Input Info     raw(yv12)->nv12 [AVX2], 3840x2160, 30/1 fps
Vpp Filters    copyHtoD
Output Info    H.265/HEVC main @ Level auto
               3840x2160p 1:1 30.000fps (30/1fps)
Encoder Preset performance
Rate Control   CQP  I:26  P:26  B:26
Lookahead      off
GOP length     256 frames
B frames       0 frames [ref mode: disabled]
Ref frames     3 frames
AQ             off
CU max / min   auto / auto
Others         mv:Q-pel

encoded 1388 frames, 11.69 fps, 9753.77 kbps, 53.80 MB
encode time 0:01:58, CPU: 1.0%, GPU: 3.5%, VE: 5.5%
frame type IDR    6
frame type I      6,  total size   3.01 MB
frame type P   1382,  total size  50.78 MB

The same test with H.264:
NVEncC64.exe --input-res 3840x2160 --fps 30 --raw --input-csp yuv420p -i D:\Test\asia.yuv -c h264 --cqp 26:26:26 -u performance -b 0 --gop-len 256 --mv-precision Q-pel --tier high  -o D:\asia.h264
--------------------------------------------------------------------------------
D:\asia.h264
--------------------------------------------------------------------------------
NVEncC (x64) 5.46 (r2126) by rigaya, Mar  5 2022 12:31:25 (VC 1929/Win)
OS Version     Windows Server 2016 x64 (17763) [CP936]
CPU            Intel Xeon(R) Platinum 8163 @ 2.50GHz [TB: 2.71GHz] (5C/10T)
GPU            #0: GRID T4-8Q (2560 cores, 1590 MHz)[2147483.64]
NVENC / CUDA   NVENC API 9.0, CUDA 10.1, schedule mode: auto
Input Buffers  CUDA, 13 frames
Input Info     raw(yv12)->nv12 [AVX2], 3840x2160, 30/1 fps
Vpp Filters    copyHtoD
Output Info    H.264/AVC high @ Level auto
               3840x2160p 1:1 30.000fps (30/1fps)
Encoder Preset performance
Rate Control   CQP  I:26  P:26  B:26
Lookahead      off
GOP length     256 frames
B frames       0 frames [ref mode: disabled]
Ref frames     3 frames
AQ             off
Others         mv:Q-pel cabac deblock adapt-transform:auto

encoded 1388 frames, 153.66 fps, 7984.46 kbps, 44.04 MB
encode time 0:00:09, CPU: 11.7%, GPU: 54.5%, VE: 99.0%
frame type IDR    6
frame type I      6,  total size   3.24 MB
frame type P   1382,  total size  40.80 MB

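The QP scales of HEVC and AVC are somewhat different, so the same QP value does not mean the same quality in both codecs. There is no way to judge compression efficiency without also measuring quality, for example with SSIM.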
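
One way to act on this is to encode each codec at several QP values with `--ssim` enabled and then compare bitrates at similar SSIM, rather than at equal QP. The sketch below only builds the command lines, reusing flags that already appear in this thread. The QP range, the `build_cmd` helper, and the output file names are assumptions for illustration, not anything NVEncC prescribes.

```python
from itertools import product

# File extensions as used in this thread for each codec.
EXT = {"h264": "h264", "hevc": "h265"}


def build_cmd(codec: str, qp: int) -> list[str]:
    """Build one NVEncC command line (flags copied from this thread)."""
    return [
        "NVEncC64.exe", "--input-res", "3840x2160", "--fps", "30",
        "--raw", "--input-csp", "yuv420p", "-i", r"D:\Test\asia.yuv",
        "-c", codec, "--cqp", str(qp), "-u", "quality", "-b", "0",
        "--gop-len", "256", "--ssim",
        # Hypothetical output naming so the runs do not overwrite each other.
        "-o", rf"D:\asia_qp{qp}.{EXT[codec]}",
    ]


CODECS = ["h264", "hevc"]
QPS = [22, 24, 26, 28, 30]  # assumed sweep range, adjust as needed

commands = [build_cmd(codec, qp) for codec, qp in product(CODECS, QPS)]

for cmd in commands:
    # Print only; pass each list to subprocess.run(cmd) to actually encode.
    print(" ".join(cmd))
```

Plotting bitrate against SSIM for each codec from the resulting logs then shows which codec needs fewer bits at a given quality.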

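I also measured SSIM, but HEVC still does not come out ahead. At nearly identical overall SSIM (0.987210 vs. 0.987203), the HEVC file is larger than the H.264 one (46.30 MB vs. 43.31 MB).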

NVEncC64.exe --input-res 3840x2160 --fps 30 --raw --input-csp yuv420p -i D:\Test\asia.yuv -c h264 --cqp 26 -u quality -b 0 --gop-len 256 --mv-precision Q-pel --tier high  --strict-gop --ref 1 -o D:\asia.h264 --ssim
--------------------------------------------------------------------------------
D:\asia.h264
--------------------------------------------------------------------------------
NVEncC (x64) 5.46 (r2126) by rigaya, Mar  5 2022 12:31:25 (VC 1929/Win)
OS Version     Windows Server 2016 x64 (17763) [CP936]
CPU            Intel Xeon(R) Platinum 8163 @ 2.50GHz [TB: 2.70GHz] (5C/10T)
GPU            #0: GRID T4-8Q (2560 cores, 1590 MHz)[2147483.64]
NVENC / CUDA   NVENC API 9.0, CUDA 10.1, schedule mode: auto
Input Buffers  CUDA, 13 frames
Input Info     raw(yv12)->nv12 [AVX2], 3840x2160, 30/1 fps
Vpp Filters    copyHtoD
               ssim (yv12)
Output Info    H.264/AVC high @ Level auto
               3840x2160p 1:1 30.000fps (30/1fps)
Encoder Preset quality
Rate Control   CQP  I:26  P:26  B:26
Lookahead      off
GOP length     256 frames
B frames       0 frames [ref mode: disabled]
Ref frames     1 frames
AQ             off
Others         mv:Q-pel cabac deblock adapt-transform:auto

encoded 1388 frames, 84.69 fps, 7851.97 kbps, 43.31 MB
encode time 0:00:16, CPU: 12.0%, GPU: 49.5%, VE: 99.3%, VD: 46.9%
frame type IDR    6
frame type I      6,  total size   3.18 MB
frame type P   1382,  total size  40.13 MB
ssim/psnr/vmaf: SSIM YUV: 0.987859 (19.157489), 0.984302 (18.041543), 0.987482 (19.024692), All: 0.987203 (18.929057), (Frames: 1388)

The HEVC run, at CQP 27:
NVEncC64.exe --input-res 3840x2160 --fps 30 --raw --input-csp yuv420p -i D:\Test\asia.yuv -c h265 --cqp 27 -u quality -b 0 --gop-len 256 --mv-precision Q-pel --tier high  --strict-gop --ref 1 -o D:\asia.h265 --ssim
--------------------------------------------------------------------------------
D:\asia.h265
--------------------------------------------------------------------------------
NVEncC (x64) 5.46 (r2126) by rigaya, Mar  5 2022 12:31:25 (VC 1929/Win)
OS Version     Windows Server 2016 x64 (17763) [CP936]
CPU            Intel Xeon(R) Platinum 8163 @ 2.50GHz [TB: 2.70GHz] (5C/10T)
GPU            #0: GRID T4-8Q (2560 cores, 1590 MHz)[2147483.64]
NVENC / CUDA   NVENC API 9.0, CUDA 10.1, schedule mode: auto
Input Buffers  CUDA, 13 frames
Input Info     raw(yv12)->nv12 [AVX2], 3840x2160, 30/1 fps
Vpp Filters    copyHtoD
               ssim (yv12)
Output Info    H.265/HEVC main @ Level auto
               3840x2160p 1:1 30.000fps (30/1fps)
Encoder Preset quality
Rate Control   CQP  I:27  P:27  B:27
Lookahead      off
GOP length     256 frames
B frames       0 frames [ref mode: disabled]
Ref frames     1 frames
AQ             off
CU max / min   auto / auto
Others         mv:Q-pel

encoded 1388 frames, 62.37 fps, 8394.74 kbps, 46.30 MB
encode time 0:00:22, CPU: 11.4%, GPU: 36.6%, VE: 99.5%, VD: 18.0%
frame type IDR    6
frame type I      6,  total size   2.78 MB
frame type P   1382,  total size  43.52 MB
ssim/psnr/vmaf: SSIM YUV: 0.988690 (19.465536), 0.982960 (17.685348), 0.985537 (18.397503), All: 0.987210 (18.931247), (Frames: 1388)
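
To put a number on the two SSIM runs above, here is a small sketch that pulls the bitrate and the overall SSIM out of the summary lines and computes the relative bitrate difference. The regular expressions are written against the log lines pasted in this thread and may need adjusting for other NVEncC versions. The value in parentheses after each SSIM score appears to be -10*log10(1 - SSIM); that is an inference from the numbers, checked below.

```python
import math
import re

# Summary lines copied from the two --ssim runs above.
H264_LOG = """encoded 1388 frames, 84.69 fps, 7851.97 kbps, 43.31 MB
ssim/psnr/vmaf: SSIM YUV: 0.987859 (19.157489), 0.984302 (18.041543), 0.987482 (19.024692), All: 0.987203 (18.929057), (Frames: 1388)"""

HEVC_LOG = """encoded 1388 frames, 62.37 fps, 8394.74 kbps, 46.30 MB
ssim/psnr/vmaf: SSIM YUV: 0.988690 (19.465536), 0.982960 (17.685348), 0.985537 (18.397503), All: 0.987210 (18.931247), (Frames: 1388)"""


def parse(log: str) -> tuple[float, float]:
    """Return (bitrate in kbps, overall SSIM) from an NVEncC summary."""
    kbps = float(re.search(r"([\d.]+) kbps", log).group(1))
    ssim = float(re.search(r"All: ([\d.]+)", log).group(1))
    return kbps, ssim


def ssim_db(ssim: float) -> float:
    """SSIM on a dB scale, assumed to be -10*log10(1 - SSIM)."""
    return -10.0 * math.log10(1.0 - ssim)


h264_kbps, h264_ssim = parse(H264_LOG)
hevc_kbps, hevc_ssim = parse(HEVC_LOG)

# Positive value: HEVC needed more bits than H.264.
extra_pct = (hevc_kbps / h264_kbps - 1.0) * 100.0

print(f"H.264: {h264_kbps:.2f} kbps, SSIM {h264_ssim:.6f} ({ssim_db(h264_ssim):.3f} dB)")
print(f"HEVC : {hevc_kbps:.2f} kbps, SSIM {hevc_ssim:.6f} ({ssim_db(hevc_ssim):.3f} dB)")
print(f"HEVC bitrate relative to H.264: {extra_pct:+.1f}%")
```

With these two logs, HEVC uses about 6.9% more bitrate for an overall SSIM that differs only in the sixth decimal place. Note that the per-plane scores differ: HEVC is slightly higher on Y and lower on U and V.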