NvMediaIEP bad keyframes after 3330s

Hardware Platform: DRIVE AGX Xavier™ Developer Kit
Software Version: DRIVE Software 10
Host Machine Version: native Ubuntu 18.04

NvMediaIEP seems to start returning very small keyframes after 3330–3410s of encoding, resulting in terrible video quality.

Normal:

Once the issue occurs the quality stays like this:

NvMediaIEP bitstream output buffer sizes plotted a bit prior to and after the issue begins:

Note that the keyframe peaks get significantly smaller and don’t recover.

Reproducer:

#include <dw/core/Context.h>
#include <dw/core/VersionCurrent.h>
#include <dw/sensors/camera/Camera.h>

#include <nvmedia_iep.h>

#include <algorithm>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <thread>

constexpr std::uint32_t bitrate{4000000};

std::int32_t main()
{
	dwContextObject* context;
	dwInitialize(&context, DW_VERSION, nullptr);

	dwSALObject* sal;
	dwSAL_initialize(&sal, context);

	dwSensorObject* camera;
	dwSAL_createSensor(&camera, {.protocol = "camera.gmsl", .parameters = "camera-type=ar0231-rccb-bae-sf3324,camera-group=a,camera-count=1,output-format=yuv,isp-mode=yuv420-uint8"}, sal);
	dwSensor_start(camera);

	dwCameraProperties cam_props;
	dwSensorCamera_getSensorProperties(&cam_props, camera);

	dwImageProperties img_props;
	dwSensorCamera_getImageProperties(&img_props, DW_CAMERA_OUTPUT_NATIVE_PROCESSED, camera);

	NVM_SURF_FMT_DEFINE_ATTR(format_attrs);
	NVM_SURF_FMT_SET_ATTR_YUV(format_attrs, YUV, 420, SEMI_PLANAR, UINT, 8, BL);
	const std::uint32_t iep_format{NvMediaSurfaceFormatGetType(format_attrs, 7)};

	NvMediaDevice* nvmedia{NvMediaDeviceCreate()};

	NvMediaSurfAllocAttr surface_attrs[3]{
		{
			.type = NVM_SURF_ATTR_WIDTH,
			.value = img_props.width,
		},
		{
			.type = NVM_SURF_ATTR_HEIGHT,
			.value = img_props.height,
		},
		{
			.type = NVM_SURF_ATTR_CPU_ACCESS,
			.value = NVM_SURF_ATTR_CPU_ACCESS_CACHED,
		},
	};
	NvMediaImage* block_image{NvMediaImageCreateNew(nvmedia, iep_format, surface_attrs, 3, 0)};

	NvMediaEncodeInitializeParamsH265 enc_params{};
	enc_params.encodeWidth = img_props.width;
	enc_params.encodeHeight = img_props.height;
	enc_params.frameRateNum = 30;
	enc_params.frameRateDen = 1;
	enc_params.profile = NVMEDIA_ENCODE_PROFILE_HIGH;
	enc_params.level = NVMEDIA_ENCODE_LEVEL_AUTOSELECT;
	enc_params.maxNumRefFrames = 1;
	NvMediaIEP* iep{NvMediaIEPCreate(nvmedia, NVMEDIA_IMAGE_ENCODE_HEVC, &enc_params, iep_format, 4, 6, NVMEDIA_ENCODER_INSTANCE_0)};

	NvMediaEncodeConfigH265 configuration{};
	configuration.gopLength = 30;
	configuration.rcParams.rateControlMode = NVMEDIA_ENCODE_PARAMS_RC_CBR;
	configuration.rcParams.params.cbr.averageBitRate = bitrate;
	configuration.repeatSPSPPS = NVMEDIA_ENCODE_SPSPPS_REPEAT_INTRA_FRAMES;
	NvMediaIEPSetConfiguration(iep, &configuration);

	std::uint8_t* buffer{reinterpret_cast<std::uint8_t*>(std::malloc(bitrate * 2))};
	NvMediaBitstreamBuffer bitstream{};
	bitstream.bitstream = buffer;
	bitstream.bitstreamSize = bitrate * 2;

	std::thread receiver_thread{[&]()
	{
		// When everything is fine, the biggest buffer per second (the keyframe) is usually
		// at least a fifth of the per-second byte budget (bitrate / 8).
		const std::uint32_t expected_biggest{bitrate / 8 / 5};

		const std::chrono::steady_clock::time_point beginning{std::chrono::steady_clock::now()};
		std::chrono::steady_clock::time_point prev{std::chrono::steady_clock::now()};
		std::uint32_t biggest{};

		while (true)
		{
			std::uint32_t count;
			NvMediaIEPGetBitsEx(iep, &count, 1, &bitstream, nullptr);
			biggest = std::max(biggest, count);

			const std::chrono::steady_clock::time_point now{std::chrono::steady_clock::now()};

			if (now - prev >= std::chrono::seconds{1})
			{
				prev = now;
				std::printf("%llds %uB%s\n", static_cast<long long>(std::chrono::duration_cast<std::chrono::seconds>(now - beginning).count()), biggest, biggest < expected_biggest ? " bad!" : "");
				biggest = 0;
			}
		}
	}};

	while (true)
	{
		dwCameraFrame* frame;
		dwSensorCamera_readFrame(&frame, 0, 1000000, camera);

		dwImageObject* image;
		dwSensorCamera_getImage(&image, DW_CAMERA_OUTPUT_NATIVE_PROCESSED, frame);

		dwImageNvMedia* nvmedia;
		dwImage_getNvMedia(&nvmedia, image);

		NvMediaImageSurfaceMap map, block_map;
		NvMediaImageLock(nvmedia->img, NVMEDIA_IMAGE_ACCESS_READ, &map);
		NvMediaImageLock(block_image, NVMEDIA_IMAGE_ACCESS_WRITE, &block_map);

		void* planes[]{map.surface[0].mapping, map.surface[1].mapping, map.surface[2].mapping};
		std::uint32_t pitches[]{map.surface[0].pitch, map.surface[1].pitch, map.surface[2].pitch};
		NvMediaImagePutBits(block_image, nullptr, planes, pitches);

		NvMediaImageUnlock(nvmedia->img);
		NvMediaImageUnlock(block_image);
		dwSensorCamera_returnFrame(&frame);

		NvMediaEncodePicParamsH265 config{};
		NvMediaIEPFeedFrame(iep, block_image, nullptr, &config, NVMEDIA_ENCODER_INSTANCE_AUTO);
	}
}

The code can be compiled as a DriveWorks sample. It requires an SF3324 attached to group A port 1 to run.
The reproducer prints the size of the biggest bitstream buffer received from NvMediaIEP during each second.

The GOP length is set to 30 frames and the frame rate to 30 fps, so there is one keyframe per second.
A healthy keyframe is usually at least 1/5th of the per-second byte budget, i.e. bitrate / 8 / 5 = 100,000 B at 4 Mbit/s. If this condition is not met, the logged message contains the text “bad!”.
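The arithmetic behind that threshold can be written out as a small sketch. The helper names below (`bytes_per_second`, `min_keyframe_bytes`, `keyframes_per_second`) are made up for illustration and are not part of the reproducer or of NvMedia. The 1/5 factor is only the empirical rule of thumb stated above.

```cpp
#include <cassert>
#include <cstdint>

// Bytes the encoder may spend per second at a given average bitrate (in bits/s).
constexpr std::uint32_t bytes_per_second(std::uint32_t bitrate_bits)
{
	return bitrate_bits / 8;
}

// Rule-of-thumb threshold from the reproducer: a healthy keyframe is at least
// a fifth of the per-second byte budget. This is an empirical value, not an API guarantee.
constexpr std::uint32_t min_keyframe_bytes(std::uint32_t bitrate_bits)
{
	return bytes_per_second(bitrate_bits) / 5;
}

// Number of keyframes per second for a given frame rate and GOP length.
inline double keyframes_per_second(std::uint32_t fps, std::uint32_t gop_length)
{
	assert(gop_length > 0);
	return static_cast<double>(fps) / gop_length;
}

// Values for the configuration used in the reproducer (4 Mbit/s).
static_assert(bytes_per_second(4'000'000) == 500'000, "4 Mbit/s is 500,000 B/s");
static_assert(min_keyframe_bytes(4'000'000) == 100'000, "threshold is 100,000 B");
```

With 30 fps and a GOP length of 30, `keyframes_per_second(30, 30)` yields exactly one keyframe per second, which is why the biggest buffer in each one-second window should be the keyframe.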

After 3330–3410s the size of the biggest buffer in each second drops dramatically and never recovers. Be patient, as the time it takes for the issue to begin varies a bit.
The onset of the issue coincides with a dramatic drop in video quality in our real-world application.
The symptoms look as if no more keyframes are being produced.

Notably, the average bitrate is still met. The keyframes just seem to be very small (and the remaining frames bigger to compensate), so the resulting video is a mush.

This seems like a bug in NvMediaIEP. We’ve worked around it by recreating the encoder every 3330s (the shortest time we’ve seen for the issue to occur), but that’s obviously not an ideal solution.
Could this issue be looked into, and could you advise how to configure NvMediaIEP so that this doesn’t occur?
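A periodic-recreation workaround of this kind could be structured along the following lines. This is only a sketch under assumptions: `PeriodicRecreator` and its `recreate` callback are hypothetical names, and the callback body (destroying the encoder and creating and configuring it again, as done once at start-up in the reproducer) is left to the application. Recreating right before a keyframe is due is an assumption about what keeps the stream decodable.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <utility>

// Hypothetical helper, not part of NvMedia: invokes `recreate` whenever
// `interval` has elapsed since the encoder was last (re)created.
class PeriodicRecreator
{
public:
	using clock = std::chrono::steady_clock;

	PeriodicRecreator(std::chrono::seconds interval, std::function<void()> recreate, clock::time_point start = clock::now())
		: interval_{interval}, recreate_{std::move(recreate)}, created_{start}
	{
		assert(interval_ > std::chrono::seconds{0});
		assert(recreate_ != nullptr);
	}

	// Call once per frame before feeding it to the encoder.
	// Returns true if the encoder was recreated during this call.
	bool tick(clock::time_point now = clock::now())
	{
		if (now - created_ < interval_)
			return false;

		// The application would tear down and set up the encoder here.
		recreate_();
		created_ = now;
		return true;
	}

private:
	std::chrono::seconds interval_;
	std::function<void()> recreate_;
	clock::time_point created_;
};
```

In the feed loop one would call `tick()` before each `NvMediaIEPFeedFrame` and make sure the receiver thread is not reading from the old encoder while it is being replaced. That synchronisation is deliberately left out of the sketch.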

Hi @raul.tambre,

Thank you for the detailed information! We will check with the team and get back to you here.

We tried to simulate your application with our own application.
We ran it for almost 2 hours but still could not reproduce the issue.
May I know whether it happens only with particular content, or is it content-agnostic?

I believe it’s content-agnostic, but much easier to notice with a higher baseline keyframe size.
Thus I would advise having the camera look at something “interesting” to make the issue more noticeable, i.e. not a blank wall or a cardboard box. A baseline keyframe size of more than 40’000B would be very good.

Do I understand correctly that you didn’t try the minimal example code I posted? Please try with that.

I tested again and am still able to reproduce this.
For your convenience I’ve improved the sample: it now handles SIGINT, logs all bitstream buffer sizes to /var/log/nvm_repro.log and prints the actual output rate in bytes per second:

#include <dw/core/Context.h>
#include <dw/core/VersionCurrent.h>
#include <dw/sensors/camera/Camera.h>

#include <nvmedia_iep.h>

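#include <algorithm>
#include <atomic>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <thread>

#include <signal.h>

constexpr std::uint32_t bitrate{4'000'000}; // 4 Mbit/s
// Written by the SIGINT handler and read by both threads, so it has to be atomic.
std::atomic<bool> running{true};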

void sigint(std::int32_t)
{
	running = false;
}

std::int32_t main()
{
	struct sigaction action{};
	action.sa_handler = &sigint;
	sigaction(SIGINT, &action, nullptr);

	dwContextObject* context;
	dwInitialize(&context, DW_VERSION, nullptr);

	dwSALObject* sal;
	dwSAL_initialize(&sal, context);

	dwSensorObject* camera;
	dwSAL_createSensor(&camera, {.protocol = "camera.gmsl", .parameters = "camera-type=ar0231-rccb-bae-sf3324,camera-group=a,camera-count=1,output-format=yuv,isp-mode=yuv420-uint8"}, sal);
	dwSensor_start(camera);

	dwCameraProperties cam_props;
	dwSensorCamera_getSensorProperties(&cam_props, camera);

	dwImageProperties img_props;
	dwSensorCamera_getImageProperties(&img_props, DW_CAMERA_OUTPUT_NATIVE_PROCESSED, camera);

	NVM_SURF_FMT_DEFINE_ATTR(format_attrs);
	NVM_SURF_FMT_SET_ATTR_YUV(format_attrs, YUV, 420, SEMI_PLANAR, UINT, 8, BL);
	const std::uint32_t iep_format{NvMediaSurfaceFormatGetType(format_attrs, 7)};

	NvMediaDevice* nvmedia{NvMediaDeviceCreate()};

	NvMediaSurfAllocAttr surface_attrs[3]{
		{
			.type = NVM_SURF_ATTR_WIDTH,
			.value = img_props.width,
		},
		{
			.type = NVM_SURF_ATTR_HEIGHT,
			.value = img_props.height,
		},
		{
			.type = NVM_SURF_ATTR_CPU_ACCESS,
			.value = NVM_SURF_ATTR_CPU_ACCESS_CACHED,
		},
	};
	NvMediaImage* block_image{NvMediaImageCreateNew(nvmedia, iep_format, surface_attrs, 3, 0)};

	NvMediaEncodeInitializeParamsH265 enc_params{};
	enc_params.encodeWidth = img_props.width;
	enc_params.encodeHeight = img_props.height;
	enc_params.frameRateNum = 30;
	enc_params.frameRateDen = 1;
	enc_params.profile = NVMEDIA_ENCODE_PROFILE_HIGH;
	enc_params.level = NVMEDIA_ENCODE_LEVEL_AUTOSELECT;
	enc_params.maxNumRefFrames = 1;
	NvMediaIEP* iep{NvMediaIEPCreate(nvmedia, NVMEDIA_IMAGE_ENCODE_HEVC, &enc_params, iep_format, 4, 6, NVMEDIA_ENCODER_INSTANCE_0)};

	NvMediaEncodeConfigH265 configuration{};
	configuration.gopLength = 30;
	configuration.rcParams.rateControlMode = NVMEDIA_ENCODE_PARAMS_RC_CBR;
	configuration.rcParams.params.cbr.averageBitRate = bitrate;
	configuration.repeatSPSPPS = NVMEDIA_ENCODE_SPSPPS_REPEAT_INTRA_FRAMES;
	NvMediaIEPSetConfiguration(iep, &configuration);

	std::thread receiver_thread{[&]()
	{
		std::uint8_t* buffer{reinterpret_cast<std::uint8_t*>(std::malloc(bitrate * 2))};
		NvMediaBitstreamBuffer bitstream{};
		bitstream.bitstream = buffer;
		bitstream.bitstreamSize = bitrate * 2;

		const std::chrono::steady_clock::time_point beginning{std::chrono::steady_clock::now()};
		std::chrono::steady_clock::time_point prev{std::chrono::steady_clock::now()};
		std::uint32_t biggest{};
		std::uint32_t bytes{};

		std::ofstream data{"/var/log/nvm_repro.log", std::ios_base::out | std::ios_base::trunc};

		while (running)
		{
			std::uint32_t count;
			NvMediaIEPGetBitsEx(iep, &count, 1, &bitstream, nullptr);
			biggest = std::max(biggest, count);
			bytes += count;
			data << count << "\n";

			const std::chrono::steady_clock::time_point now{std::chrono::steady_clock::now()};

			if (now - prev >= std::chrono::seconds{1})
			{
				prev = now;
				std::printf("%llds %uB %uB/s\n", static_cast<long long>(std::chrono::duration_cast<std::chrono::seconds>(now - beginning).count()), biggest, bytes);
				biggest = 0;
				bytes = 0;
			}
		}
	}};

	while (running)
	{
		dwCameraFrame* frame;
		dwSensorCamera_readFrame(&frame, 0, 1000000, camera);

		dwImageObject* image;
		dwSensorCamera_getImage(&image, DW_CAMERA_OUTPUT_NATIVE_PROCESSED, frame);

		dwImageNvMedia* nvmedia;
		dwImage_getNvMedia(&nvmedia, image);

		NvMediaImageSurfaceMap map, block_map;
		NvMediaImageLock(nvmedia->img, NVMEDIA_IMAGE_ACCESS_READ, &map);
		NvMediaImageLock(block_image, NVMEDIA_IMAGE_ACCESS_WRITE, &block_map);

		void* planes[]{map.surface[0].mapping, map.surface[1].mapping, map.surface[2].mapping};
		std::uint32_t pitches[]{map.surface[0].pitch, map.surface[1].pitch, map.surface[2].pitch};
		NvMediaImagePutBits(block_image, nullptr, planes, pitches);

		NvMediaImageUnlock(nvmedia->img);
		NvMediaImageUnlock(block_image);
		dwSensorCamera_returnFrame(&frame);

		NvMediaEncodePicParamsH265 config{};
		NvMediaIEPFeedFrame(iep, block_image, nullptr, &config, NVMEDIA_ENCODER_INSTANCE_AUTO);
	}

	receiver_thread.join();
}

Pre-compiled: nvm_repro.tar.gz (252.4 KB)

Logs from an example run:
stdout.txt (120.7 KB)
nvm_repro.log (967.3 KB)

Graphs based on those logs:



I believe the buffer size scatter plot demonstrates the degradation pretty well.
Do let me know if you need any assistance in reproducing the issue, I’d be happy to help.
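The per-buffer log can also be checked without plotting. Below is a minimal sketch, assuming the log contains one decimal buffer size per line, as written by the sample above. The function names `window_maxima` and `degradation_start` are made up for illustration. With a GOP length of 30, every window of 30 consecutive buffers contains exactly one keyframe, so the maximum of each window approximates the keyframe size even if the windows are not aligned to GOP boundaries.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Largest buffer size in each consecutive window of `window` entries.
// A trailing partial window is dropped.
std::vector<std::uint32_t> window_maxima(std::istream& in, std::size_t window)
{
	assert(window > 0);

	std::vector<std::uint32_t> maxima;
	std::uint32_t size{};
	std::uint32_t biggest{};
	std::size_t seen{};

	while (in >> size)
	{
		biggest = std::max(biggest, size);

		if (++seen == window)
		{
			maxima.push_back(biggest);
			biggest = 0;
			seen = 0;
		}
	}

	return maxima;
}

// Convenience overload for in-memory text, e.g. for quick experiments.
std::vector<std::uint32_t> window_maxima(const std::string& text, std::size_t window)
{
	std::istringstream in{text};
	return window_maxima(in, window);
}

// Index of the first window from which all remaining windows stay below
// `threshold`. Returns maxima.size() if the last window is still healthy.
std::size_t degradation_start(const std::vector<std::uint32_t>& maxima, std::uint32_t threshold)
{
	std::size_t start{maxima.size()};

	for (std::size_t i{maxima.size()}; i-- > 0;)
	{
		if (maxima[i] >= threshold)
			break;
		start = i;
	}

	return start;
}
```

To use it on a real log, open the file with `std::ifstream`, pass it to `window_maxima` with a window of 30 and call `degradation_start` with a threshold such as 100,000 B for 4 Mbit/s. The returned index is roughly the number of seconds into the run at which the keyframes collapsed.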
