Details about NVENC in Turing?

Now that Turing has been publicly announced, would it be possible for Nvidia to share more detailed information about the NVENC implementation?

I am particularly interested in whether the HEVC encoder now supports B-frames and whether 4:2:2 support has been added, but performance comparisons and any other information are also of interest.

Thanks for sharing.

/Anders

Yes come on Nvidia, spill the beans.

I’ve read that Turing delivers the same quality at a 25% lower bitrate for HEVC. I’d like to know which technologies have been added (B-frames, maybe?) and whether this improvement will work out of the box or whether FFmpeg etc. will need to add support.

More info please.

I would also be interested to know whether NVDEC is improved, for example whether 10- and 12-bit VP9 decode is supported on all GPUs, and whether the GPU can assist with decoding the new AV1 codec in any way.

So any chance of some feedback, Nvidia?

How about updating the GPU support matrix with the new hardware?

I am also interested in these issues.
But there is no information.

I am very interested in the quality of HEVC encoding. Previously, it was not very good.
And what about AV1 support?

Some information can be found here:


but there’s no Turing

Not really the depth I want, but it’s what NVIDIA seems to have released to us so far.

Turing GPUs also ship with an enhanced NVENC encoder unit that adds support for H.265 (HEVC) 8K encode at 30 fps. The new NVENC encoder provides up to 25% bitrate savings for HEVC and up to 15% bitrate savings for H.264.

Turing’s new NVDEC decoder has also been updated to support decoding of HEVC YUV444 10/12b HDR at 30 fps, H.264 8K, and VP9 10/12b HDR.

Turing improves encoding quality compared to prior generation Pascal GPUs and compared to software encoders. Figure 11 shows that on common Twitch and YouTube streaming settings, Turing’s video encoder exceeds the quality of the x264 software-based encoder using the fast encode settings, with dramatically lower CPU utilization. 4K streaming is too heavy a workload for encoding on typical CPU setups, but Turing’s encoder makes 4K streaming possible.

https://devblogs.nvidia.com/nvidia-turing-architecture-in-depth/

/Anders

Thanks - it’s not clear whether these improvements will be automatic or will need software changes and an updated SDK, which apparently hasn’t been released yet.

NVIDIA?

We have an Asus RTX 2080 card and I did a quick test of NVENC:

  1. It looks like the RTX 2080 has only one NVENC engine, while the GTX 1080 has two.
  2. A single instance is 2 times slower than on the GTX 1080 (150 fps vs 300 fps), so with the GTX 1080’s two engines it is actually 4 times slower overall!
  3. HEVC quality is almost the same.

Tested on driver 410.57, FHD 25 fps, bitrate 1 Mbit/s.
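The "4 times slower" figure in point 2 comes from multiplying per-session speed by the number of engines. Here is a minimal sketch of that arithmetic, assuming each engine can sustain the single-session rate independently (an assumption, not something measured in this test). The function name is only illustrative.

```python
def aggregate_fps(per_session_fps: float, engines: int) -> float:
    """Total throughput if every engine runs one session at full speed (assumption)."""
    return per_session_fps * engines

# Figures from the quick test above
gtx_1080 = aggregate_fps(300, engines=2)  # two NVENC engines observed
rtx_2080 = aggregate_fps(150, engines=1)  # one NVENC engine observed

slowdown = gtx_1080 / rtx_2080
print(f"GTX 1080: {gtx_1080:.0f} fps total, RTX 2080: {rtx_2080:.0f} fps total, ratio {slowdown:.1f}x")
```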

At least something is good: NVDEC is more than 2 times faster on the RTX card :)

The RTX card also still doesn’t support B-frames for HEVC, so I think the 25% quality improvement is just not true.

I hope Nvidia will fix this with new drivers, if possible.

We also did a quality comparison using PSNR; without any modification, quality looks very similar:

Improvement over GTX 1080:
H264 - BF: 3, PRESET: SLOW, PROFILE: HIGH - 0.5 dB PSNR
HEVC - BF: 0, PRESET: SLOW - 0.2 dB PSNR
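For anyone who wants to reproduce numbers like these, here is a minimal sketch of how PSNR is computed for 8-bit samples. It is not the tool used for the figures above (that was not stated); the `psnr` helper and the toy sample values are made up for illustration.

```python
import math

def psnr(reference, encoded, max_value=255):
    """PSNR in dB between two equally sized sequences of 8-bit samples."""
    if len(reference) != len(encoded):
        raise ValueError("frames must have the same number of samples")
    # Mean squared error over all samples
    mse = sum((r - e) ** 2 for r, e in zip(reference, encoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(max_value ** 2 / mse)

# Toy 4-sample "frames", purely illustrative
ref = [16, 128, 200, 235]
enc = [17, 127, 202, 235]
print(f"{psnr(ref, enc):.2f} dB")
```

In a real comparison the two sequences would be the decoded luma planes of the source and of each encoder's output at the same bitrate.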

I think it will be better when Nvidia releases Video Codec SDK 9 or 10 for the Turing platform.

Thanks for the test results, much appreciated!

Yes, it seems that the mentioned 25% improvement/bitrate reduction is going to need software support.

Do we know for sure that the RTX hardware doesn’t support B frames for HEVC?

Tomorrow I will have an RTX 2080 Ti for new tests; still waiting for the new Video SDK from Nvidia.

What about the CUDA 10.0 SDK? Doesn’t that package include the correct drivers for Turing NVENC/NVDEC?

/Anders

CUDA is for computing only; it is not for video decoding/encoding.

I had a few members of a Greek forum test NVENC on their new RTX 2080 and RTX 2080 Ti cards. The results are really bad. Single-encode performance is lower than on a GTX 1070 Ti: 240-260 fps for Turing vs 310 fps for Pascal (1070 Ti), using this command:

ffmpeg.exe -hwaccel cuvid -c:v h264_cuvid -f mpegts -i HD-h264.ts -vcodec h264_nvenc -preset slow -c:a copy -f mpegts -y output.ts

The input file is from the FFmpeg samples: https://samples.ffmpeg.org/V-codecs/h264/HD-h264.ts
The tests were done on Windows, since the owners got the cards for gaming and don’t have Linux.

And the worst thing: both the RTX 2080 and the 2080 Ti drop to half performance with 2 concurrent encodes, indicating only one NVENC engine.
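If anyone wants to repeat the single vs concurrent comparison, here is a rough sketch that starts several copies of the command above and reads the last `fps=` value from ffmpeg's progress output. The helper names, the `output{i}.ts` file names and the sample progress line are made up for illustration; it assumes `ffmpeg` is on the PATH and `HD-h264.ts` is in the current directory.

```python
import re
import subprocess

# Command from the post above, without the output file name.
CMD = ["ffmpeg", "-hwaccel", "cuvid", "-c:v", "h264_cuvid", "-f", "mpegts",
       "-i", "HD-h264.ts", "-vcodec", "h264_nvenc", "-preset", "slow",
       "-c:a", "copy", "-f", "mpegts", "-y"]

FPS_RE = re.compile(r"fps=\s*([\d.]+)")

def last_fps(stderr_text):
    """Return the last fps value found in ffmpeg's progress output, or None."""
    matches = FPS_RE.findall(stderr_text)
    return float(matches[-1]) if matches else None

def run_concurrent(n):
    """Start n encodes at once and return the final fps reported by each."""
    procs = [subprocess.Popen(CMD + [f"output{i}.ts"],
                              stderr=subprocess.PIPE, text=True)
             for i in range(n)]
    # communicate() waits for each process and returns (stdout, stderr)
    return [last_fps(p.communicate()[1]) for p in procs]

# Illustrative progress line (made up) to show what the parser extracts
sample = "frame= 1500 fps=255 q=28.0 size=   12345kB time=00:01:00.00 bitrate=1685.4kbits/s speed=10.2x"
print(last_fps(sample))
```

Calling `run_concurrent(1)` and then `run_concurrent(2)` should show whether the per-session fps roughly halves with two sessions, which is what a single encoder engine would suggest.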

So, if you need NVENC performance, you should stick with Pascal, at least for now. Epic fail for Turing.

Thanks for sharing your feedback.

How about quality - personally I’m more interested in what the quality of the encode is like vs Pascal.

Clearly they’ve only included one NVENC unit, so performance of parallel encodes is not going to be as good, but do your members see any difference in the quality?

Quality was mentioned a few posts above; currently it seems similar to Pascal.
Performance is worse even with a single encode, although in my case the difference (250 vs 310 fps, around 20% worse) is not as big as the one Thunderm reported (150 vs 300 fps).

I just got an RTX 2080 Ti.

Quality is even worse than on the Pascal architecture; at low bitrates (1 Mbit/s for FHD) it is clearly visible.

The performance drop depends on the GPU and the codec used; here is the speed in fps for an SD channel.

GPU            DEC    H264   H265   POSTPROCESSING
GTX 1070       2600   2600   1800   5000
GTX 1080       2600   5200   2600   10000
GTX 1080 TI    2600   5200   2600   10000
RTX 2080       5800   1700   600    5800
RTX 2080 TI    5800   1700   600    5800
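To make the table easier to read, here is a small sketch that expresses the RTX 2080 figures as a fraction of the GTX 1080 figures. The numbers are copied from the table above; nothing else is measured.

```python
# fps figures copied from the table above (SD channel)
fps = {
    "GTX 1070":    {"DEC": 2600, "H264": 2600, "H265": 1800, "POSTPROCESSING": 5000},
    "GTX 1080":    {"DEC": 2600, "H264": 5200, "H265": 2600, "POSTPROCESSING": 10000},
    "GTX 1080 TI": {"DEC": 2600, "H264": 5200, "H265": 2600, "POSTPROCESSING": 10000},
    "RTX 2080":    {"DEC": 5800, "H264": 1700, "H265": 600,  "POSTPROCESSING": 5800},
    "RTX 2080 TI": {"DEC": 5800, "H264": 1700, "H265": 600,  "POSTPROCESSING": 5800},
}

baseline = fps["GTX 1080"]
# Ratio above 1 means the RTX 2080 is faster, below 1 means slower
ratios = {col: value / baseline[col] for col, value in fps["RTX 2080"].items()}

for col, ratio in ratios.items():
    print(f"{col}: RTX 2080 runs at {ratio:.2f}x the GTX 1080 speed")
```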

So what software are you using for your encodes? FFmpeg?

It seems odd they’d lower the quality, maybe it needs the new SDK…

Tested today with the new 410.66 drivers; there is a small improvement:

NVDEC 5800 FPS -> 8000 FPS
NVENC H264 1600 FPS -> 2000 FPS
NVENC H265 600 FPS -> 750 FPS
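As a quick check of how large the driver gain is in relative terms, this sketch computes the percentage change for each line above. Only the before/after fps values come from the test; the variable names are illustrative.

```python
# (before, after) fps with the 410.66 drivers, copied from the lines above
before_after = {
    "NVDEC": (5800, 8000),
    "NVENC H264": (1600, 2000),
    "NVENC H265": (600, 750),
}

# Relative gain in percent: (new - old) / old * 100
gains = {name: (new - old) / old * 100 for name, (old, new) in before_after.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain:.1f}%")
```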

Quality is still the same as with the previous drivers.

What is the point of having high speed with poor quality?
It seems that Nvidia is not going to develop video encoding. Therefore, we have no feedback about NVENC from Nvidia.

Nvidia, say something, please. Nvidia, #Nvidia, @Nvidia