dwImageTransformation empty lines and incorrect colours

I’m trying to use dwImageTransformation API to stitch together multiple tonemapped images from SF3324-100 cameras on DRIVE AGX Xavier dev kit with DRIVE SW 10.

The second camera’s output in the stitched image is missing lines and the colours are incorrect.
Screenshot: https://i.vgy.me/xd2Z3e.png

If I change

output.x = i * width;


output.x = 0;

so the second camera’s image is output to the same location as the first and overwrites it, it looks fine: https://i.vgy.me/9TPuW0.png

How can I make the second image be placed correctly next to the first?

My code simplified and with error checking ommitted:

u8 cameras = 2;
u16 width;
u16 height;

dwContextHandle_t sdk;
dwSALHandle_t sal;
dwSoftISPHandle_t isp;
dwSensorHandle_t sensor;

// Demosaic output (unused)
dwImageHandle_t rcb;
dwImageCUDA* rcb_cuda;

// Tonemap outputs
std::vector<dwImageHandle_t> rgba(cameras);
std::vector<dwImageCUDA*> rgba_cuda(cameras);

// Final stitched picture
dwImageTransformationHandle_t stitcher;
cudaStream_t stitcher_stream;
dwImageHandle_t stitched;
dwImageCUDA* stitched_cuda;

dwImageStreamerHandle_t streamer;

void render(void* out)
	for (u8 i = 0; i < cameras; i++)
		dwCameraFrameHandle_t frame;
		dwSensorCamera_readFrame(&frame, i, DW_TIMEOUT_INFINITE, sensor);

		dwImageHandle_t image;
		dwSensorCamera_getImage(&image, DW_CAMERA_OUTPUT_CUDA_RAW_UINT16, frame);

		dwImageCUDA* image_cuda;
		dwImage_getCUDA(&image_cuda, image);

		dwSoftISP_bindInputRaw(image_cuda, isp);
		dwSoftISP_bindOutputTonemap(rgba_cuda[i], isp);

		// Copy into result image
		dwRect output;
		output.x = i * width;
		output.y = 0;
		output.width = width;
		output.height = height;

		dwImageTransformation_copy(stitched, rgba[i], &output, nullptr, stitcher);

	// Wait for stitching to complete

	// Stream to the CPU and output it.
	dwImageStreamer_producerSend(stitched, streamer);

	dwImageHandle_t stitched;
	dwImageStreamer_consumerReceive(&stitched, DW_TIMEOUT_INFINITE, streamer);

	dwImageCPU* stitched_image;
	dwImage_getCPU(&stitched_image, stitched);

	// Copy the image
	std::memcpy(out, stitched_image->data[0], (cameras * width) * height);

	dwImageStreamer_consumerReturn(&stitched, streamer);
	dwImageStreamer_producerReturn(nullptr, DW_TIMEOUT_INFINITE, streamer);

void initialize()
	dwInitialize(&sdk, DW_VERSION, nullptr);
	dwSAL_initialize(&sal, sdk);

	dwSensorParams params;
	params.protocol = "camera.gmsl";
	params.parameters = "output-format=raw,camera-type=ar0231-rccb-bae-sf3324,camera-group=a,camera-count=2";
	params.auxiliarydata = nullptr;

	dwSAL_createSensor(&sensor, params, sal);

	dwCameraProperties sensor_properties;
	dwSensorCamera_getSensorProperties(&sensor_properties, sensor);

	dwSoftISPParams isp_params;
	dwSoftISP_initParamsFromCamera(&isp_params, &sensor_properties);
	dwSoftISP_initialize(&isp, &isp_params, sdk);

	// Demosaicing output image
	dwImageProperties demosaic_props;
	dwSoftISP_getDemosaicImageProperties(&demosaic_props, isp);

	width = demosaic_props.width;
	height = demosaic_props.height;

	dwImage_create(&rcb, demosaic_props, sdk);
	dwImage_getCUDA(&rcb_cuda, rcb);
	dwSoftISP_bindOutputDemosaic(rcb_cuda, isp);

	// Tonemap output images
	dwImageProperties tonemap_props{};
	tonemap_props.format = DW_IMAGE_FORMAT_RGBA_UINT8;
	tonemap_props.type = DW_IMAGE_CUDA;
	tonemap_props.width = demosaic_props.width;
	tonemap_props.height = demosaic_props.height;

	for (u8 i = 0; i < cameras; i++)
		dwImage_create(&rgba[i], tonemap_props, sdk);
		dwImage_getCUDA(&rgba_cuda[i], rgba[i]);

	// Stitched output image
	dwImageProperties properties{};
	properties.format = DW_IMAGE_FORMAT_RGBA_UINT8;
	properties.type = DW_IMAGE_CUDA;
	properties.width = cameras * width;
	properties.height = height;

	dwImage_create(&stitched, properties, sdk);
	dwImage_getCUDA(&stitched_cuda, stitched);
	dwImageStreamer_initialize(&streamer, &properties, DW_IMAGE_CPU, sdk);

	// Stitcher
	dwImageTransformationParameters stitcher_params;
	stitcher_params.ignoreAspectRatio = true;

	dwImageTransformation_initialize(&stitcher, stitcher_params, sdk);
	dwImageTransformation_setCUDAStream(stitcher_stream, stitcher);

Dear raul.tambre,
The dwImageTransformation_copy() in code snippet seems to be correct. Did you check displaying just second camera toned image output to confirm if it issue with image or stitching?

I’ve confirmed that the second tonemapped camera image is fine by itself: https://i.vgy.me/L3FwR7.png

I tested a bit more and it seems the issue only happens past the half width of the destination image.
Example with the second image offset at x=320: https://i.vgy.me/b0zRMP.png

Any other ideas why this is happening?

Dear raul.tambre,
It is difficult to give comment with the provided snippet code. Is it possible to share complete code as a DW sample?

Sure, here’s a modified version of camera_gmsl_raw showcasing the issue:

// Core
#include <dw/core/Context.h>
#include <dw/core/Logger.h>
#include <dw/core/VersionCurrent.h>

// HAL
#include <dw/sensors/Sensors.h>
#include <dw/sensors/SensorSerializer.h>
#include <dw/sensors/camera/Camera.h>

// Image
#include <dw/interop/streamer/ImageStreamer.h>

// ISP
#include <dw/isp/SoftISP.h>

// Renderer
#include <dwvisualization/core/Renderer.h>

// Transform
#include <dw/imageprocessing/geometry/imageTransformation/ImageTransformation.h>

// Sample Includes
#include <framework/DriveWorksSample.hpp>
#include <framework/Log.hpp>
#include <framework/DataPath.hpp>
#include <framework/WindowGLFW.hpp>

using namespace dw_samples::common;

class CameraGMSLRawSample : public DriveWorksSample
	dwContextHandle_t m_sdk                = DW_NULL_HANDLE;
	dwVisualizationContextHandle_t m_viz   = DW_NULL_HANDLE;
	dwSALHandle_t m_sal                    = DW_NULL_HANDLE;
	dwRendererHandle_t m_renderer          = DW_NULL_HANDLE;

	std::unique_ptr<ScreenshotHelper> m_screenshot;

	dwSoftISPHandle_t m_isp = DW_NULL_HANDLE;
	dwSensorHandle_t m_camera = DW_NULL_HANDLE;
	dwImageProperties m_cameraImageProperties;
	dwCameraProperties m_cameraProperties;
	dwImageStreamerHandle_t m_streamerCUDAtoGL = DW_NULL_HANDLE;
	dwSensorSerializerHandle_t m_serializer = DW_NULL_HANDLE;

	dwCameraFrameHandle_t m_frame = DW_NULL_HANDLE;

	dwImageHandle_t m_rcbImage = DW_NULL_HANDLE;
	dwImageCUDA* m_rcbCUDAImage;

	dwImageHandle_t m_rgbaImage = DW_NULL_HANDLE;
	dwImageCUDA* m_rgbaCUDAImage;

	dwImageHandle_t m_frameCUDA = DW_NULL_HANDLE;

	uint32_t m_ispOutput;

	bool m_recordCamera = false;

	cudaStream_t m_stream = nullptr;

	dwImageTransformationHandle_t stitcher;
	cudaStream_t stitcher_stream;
	dwImageHandle_t stitched_image;

	/// -----------------------------
	/// Initialize application
	/// -----------------------------
	CameraGMSLRawSample(const ProgramArguments& args)
		: DriveWorksSample(args)

	/// -----------------------------
	/// Initialize Renderer, Sensors, and Image Streamers, Egomotion
	/// -----------------------------
	bool onInitialize() override
		// -----------------------------------------
		// Initialize DriveWorks SDK context and SAL
		// -----------------------------------------
			// initialize logger to print verbose message on console in color

			// initialize SDK context, using data folder
			dwContextParameters sdkParams = {};

			#ifdef VIBRANTE
			sdkParams.eglDisplay = getEGLDisplay();

			CHECK_DW_ERROR_MSG(dwInitialize(&m_sdk, DW_VERSION, &sdkParams),
							   "Cannot initialize Drive-Works SDK Context");


		// initializes rendering subpart
		// - the rendering module
		// - the render buffers
		// - projection and modelview matrices
		// - renderer settings
		// -----------------------------------------
			CHECK_DW_ERROR( dwVisualizationInitialize(&m_viz, m_sdk) );
			CHECK_DW_ERROR( dwRenderer_initialize(&m_renderer, m_viz) );

			dwRect rect;
			rect.width  = getWindowWidth();
			rect.height = getWindowHeight();
			rect.x      = 0;
			rect.y      = 0;
			dwRenderer_setRect(rect, m_renderer);

		// initializes camera
		// - the SensorCamera module
		// -----------------------------------------


			CHECK_DW_ERROR(dwSAL_initialize(&m_sal, m_sdk));

			if (getArgument("camera-type").compare("ar0144-cccc-none-gazet1") == 0 ||
				getArgument("camera-type").compare("ov2311-cccc-none-none") == 0) {
				// ar0144 only supports direct tonemap output

			dwSensorParams params;
			std::string parameterString = std::string("output-format=raw+data,camera-type=") +
			parameterString             += std::string(",camera-group=") + getArgument("camera-group").c_str();
			parameterString             += std::string(",format=") + std::string(getArgument("serializer-type"));
			parameterString             += std::string(",fifo-size=") + std::string(getArgument("camera-fifo-size"));
			parameterString             += std::string(",slave=") + std::string(getArgument("tegra-slave"));
			params.parameters           = parameterString.c_str();
			params.protocol             = "camera.gmsl";

			CHECK_DW_ERROR(dwSAL_createSensor(&m_camera, params, m_sal));

			CHECK_DW_ERROR(dwSensorCamera_getSensorProperties(&m_cameraProperties, m_camera));
			log("Successfully initialized camera with resolution of %dx%d at framerate of %f FPS\n",
				m_cameraProperties.resolution.x, m_cameraProperties.resolution.y, m_cameraProperties.framerate);


		// initializes software ISP for processing RAW RCCB images
		// - the SensorCamera module
		// -----------------------------------------
			dwSoftISPParams softISPParams;
			CHECK_DW_ERROR(dwSoftISP_initParamsFromCamera(&softISPParams, &m_cameraProperties));
			CHECK_DW_ERROR(dwSoftISP_initialize(&m_isp, &softISPParams, m_sdk));

			if ((m_ispOutput & DW_SOFTISP_PROCESS_TYPE_DEMOSAIC) &&
				std::stoi(getArgument("interpolationDemosaic")) > 0) {

		// initializes camera
		// - the SensorCamera module
		// -----------------------------------------
			// we need to allocate memory for a demosaic image and bind it to the ISP
			dwImageProperties rcbProperties{};
				// getting the properties directly from the ISP
				CHECK_DW_ERROR(dwSoftISP_getDemosaicImageProperties(&rcbProperties, m_isp));
				CHECK_DW_ERROR(dwImage_create(&m_rcbImage, rcbProperties, m_sdk));
				CHECK_DW_ERROR(dwImage_getCUDA(&m_rcbCUDAImage, m_rcbImage));
				// bind the image as the output for demosaic process to the ISP, will be filled at the call of
				// dwSoftISP_processDeviceAsync
				CHECK_DW_ERROR(dwSoftISP_bindOutputDemosaic(m_rcbCUDAImage, m_isp));

			// in order ot visualize we prepare the properties of the tonemapped image
			dwImageProperties rgbaImageProperties{};
			rgbaImageProperties.format = DW_IMAGE_FORMAT_RGBA_UINT8;
			rgbaImageProperties.type = DW_IMAGE_CUDA;

				rgbaImageProperties.width = rcbProperties.width;
				rgbaImageProperties.height = rcbProperties.height;
			} else {
				// In case no demosaic operation is performed, assume
				// the width/height of the raw image.
				dwImageProperties rawImageProperties{};
							&rawImageProperties, DW_CAMERA_OUTPUT_CUDA_RAW_UINT16, m_camera));

				rgbaImageProperties.width = rawImageProperties.width;
				rgbaImageProperties.height = rawImageProperties.height;

			// Create image transformer
			dwImageTransformationParameters stitcher_params;
			stitcher_params.ignoreAspectRatio = true;

			CHECK_DW_ERROR(dwImageTransformation_initialize(&stitcher, stitcher_params, m_sdk));
			CHECK_DW_ERROR(dwImageTransformation_setCUDAStream(stitcher_stream, stitcher));

			// Final stitched image
			dwImageProperties stitched_props = rgbaImageProperties;
			stitched_props.width *= 2;
			CHECK_DW_ERROR(dwImage_create(&stitched_image, stitched_props, m_sdk));
			CHECK_DW_ERROR(dwImageStreamerGL_initialize(&m_streamerCUDAtoGL, &stitched_props, DW_IMAGE_GL, m_sdk));

			// allocate the rgba image
			CHECK_DW_ERROR(dwImage_create(&m_rgbaImage, rgbaImageProperties, m_sdk));
			CHECK_DW_ERROR(dwImage_getCUDA(&m_rgbaCUDAImage, m_rgbaImage));
			CHECK_DW_ERROR(dwSoftISP_bindOutputTonemap(m_rgbaCUDAImage, m_isp));

			m_screenshot.reset(new ScreenshotHelper(m_sdk, m_sal, getWindowWidth(), getWindowHeight(), "CameraGMSL_Raw"));

		// initializes serializer
		// -----------------------------------------
			m_recordCamera = !getArgument("write-file").empty();

			if (m_recordCamera) {
				dwSerializerParams serializerParams;
				serializerParams.parameters = "";
				std::string newParams = "";
				newParams += std::string("format=") + std::string(getArgument("serializer-type"));
				newParams += std::string(",type=disk,file=") + std::string(getArgument("write-file"));
				newParams += std::string(",encoder-instance=") + std::string(getArgument("encoder-instance"));

				serializerParams.parameters = newParams.c_str();
				serializerParams.onData     = nullptr;

				CHECK_DW_ERROR(dwSensorSerializer_initialize(&m_serializer, &serializerParams, m_camera));

		// sensor can take some time to start, it's possible to call the read function and check if the return status is ok
		// before proceding

		dwCameraFrameHandle_t frame;
		dwStatus status = DW_NOT_READY;
		do {
			status = dwSensorCamera_readFrame(&frame, 0, 500000, m_camera);
		} while (status == DW_NOT_READY);

		// something wrong happened, aborting
		if (status != DW_SUCCESS) {
			throw std::runtime_error("Cameras did not start correctly");

		CHECK_DW_ERROR(dwSensorCamera_setCUDAStream(m_stream, m_camera));
		return true;

	/// Free up used memory here
	void onRelease() override
		if (m_frame)

		if (m_camera) {

		if(m_serializer) {

		if (m_camera) {

		if (m_rcbImage) {

		if (m_rgbaImage) {

		if (m_isp) {

		if (m_streamerCUDAtoGL) {

		if (m_renderer)



	/// Main processing of the sample (combined processing and renering for more clarity)
	///     - read from camera
	///     - get an image with a useful format
	///     - use softISP to convert from raw and tonemap
	///     - render
	void onProcess() override
			// return frame
		if (m_frame)
			m_frame = DW_NULL_HANDLE;

		dwTime_t timeout = 500000;

		// read from camera
		uint32_t cameraSiblingID = 0;

		CHECK_DW_ERROR(dwSensorCamera_readFrame(&m_frame, cameraSiblingID, timeout, m_camera));

		// get an image with the desired output format (async operation)
		CHECK_DW_ERROR(dwSensorCamera_getImageAsync(&m_frameCUDA, DW_CAMERA_OUTPUT_CUDA_RAW_UINT16, m_frame));

		if (m_recordCamera) {
			dw::common::ProfileCUDASection s(getProfilerCUDA(), "serializeAsync");
			dwStatus status = dwSensorSerializer_serializeCameraFrameAsync(m_frame, m_serializer);
			if (status == DW_BUFFER_FULL)
				logError("SensorSerializer failed to serialize data, aborting.");


	void onRender() override
		if (!m_frame || isOffscreen())

		dwTime_t timeout = 33000;

		cudaStream_t cameraStream = nullptr;
		CHECK_DW_ERROR(dwSensorCamera_getCUDAStream(&cameraStream, m_camera));

		if (m_frameCUDA == DW_NULL_HANDLE)
			logError("Failed to get cuda image, aborting");

		// raw images need to be processed through the softISP
		dwImageCUDA* rawImageCUDA;
		CHECK_DW_ERROR(dwImage_getCUDA(&rawImageCUDA, m_frameCUDA));
		CHECK_DW_ERROR(dwSoftISP_bindInputRaw(rawImageCUDA, m_isp));
		// request the softISP to perform a demosaic and a tonemap. This is for edmonstration purposes, the demosaic
		// output will not be used in this sample, only the tonemap output

		// Copy the image onto one that has 2x the width with a slight x offset to demonstrate a dwImageTransformation problem.
		// https://devtalk.nvidia.com/default/topic/1068829/general/dwimagetransformation-empty-lines-and-incorrect-colours/
		dwRect output;
		output.x = 320;
		output.y = 0;
		output.width = 960;
		output.height = 604;

		CHECK_DW_ERROR(dwImageTransformation_copy(stitched_image, m_rgbaImage, &output, nullptr, stitcher));

		// stream that tonamap image to the GL domain
		CHECK_DW_ERROR(dwImageStreamerGL_producerSend(stitched_image, m_streamerCUDAtoGL));

		// receive the streamed image as a handle
		dwImageHandle_t frameGL;
		CHECK_DW_ERROR(dwImageStreamerGL_consumerReceive(&frameGL, timeout, m_streamerCUDAtoGL));

		// get the specific image struct to be able to access texture ID and target
		dwImageGL* imageGL;
		CHECK_DW_ERROR(dwImage_getGL(&imageGL, frameGL));

		// render received texture
		CHECK_DW_ERROR(dwRenderer_renderTexture(imageGL->tex, imageGL->target, m_renderer));

		// returned the consumed image
		CHECK_DW_ERROR(dwImageStreamerGL_consumerReturn(&frameGL, m_streamerCUDAtoGL));

		// notify the producer that the work is done
		CHECK_DW_ERROR(dwImageStreamerGL_producerReturn(nullptr, timeout, m_streamerCUDAtoGL));

		// screenshot if required

int main(int argc, const char **argv)

	ProgramArguments args(argc, argv,
		ProgramArguments::Option_t("camera-type", "ar0231-rccb-bae-sf3324", "camera gmsl type (see sample_sensors_info for all available camera types on this platform)\n"),

		ProgramArguments::Option_t("camera-group", "a", "Camera group [values between a and d, default a]"),

		ProgramArguments::Option_t("interpolationDemosaic", "0", "activates softISP interpolation at full resolution"),
		ProgramArguments::Option_t("serializer-type", "raw", "Serialization type for raw images, either raw or lraw"),
		ProgramArguments::Option_t("write-file", "", "If this string is not empty, then the serializer will record in this location\n"),
		ProgramArguments::Option_t("tegra-slave", "0", "Optional parameter used only for Tegra B, enables slave mode."),
		ProgramArguments::Option_t("camera-fifo-size", "3", "Size of the internal camera fifo (minimum 3). "
							  "A larger value might be required during recording due to slowdown"),
		ProgramArguments::Option_t("encoder-instance", "", "Instance of the HW encoder for LRAW. By default the instance is "
							  "altered at every frame. By specifying 0 or 1, the selected encoder will always run"),

	}, "DriveWorks camera GMSL Raw sample");

	// -------------------
	// initialize and start a window application (with offscreen support if required)
	CameraGMSLRawSample app(args);

	app.initializeWindow("Camera GMSL Raw Sample", 1280, 800, args.enabled("offscreen"));

	return app.run();

Simply run it on a DRIVE with SF3324-100 and you should see the issue.
Example screenshot: https://i.vgy.me/Ip5rnh.png

Hi raul.tambre,
I was able to reproduce the issue you are raising.

to summarize the issue: you are facing some unexpected behavior of the api dwImageTransformation_copy

we are checking it internally and will update here ASAP on the conclusion.

While we are checking this, you might consider using render engine tiles for your purpose of displaying 2 or more images in parallel (look at render engine sample)


Dear raul.tambre,
Please check the following to confirm if it is a CUDA stream synchronization issue.

  1. Add cudaStreamSynchronize(stitcher_stream) after dwImageTransformation_copy() function and see if it helps.
  2. If above does not help, remove using separate CUDA stream for imageTransformation handle

Let us know the update.

  1. Doesn't help. Nothing changed.
  2. Doesn't help either. Nothing changed.

Let me know if you’d like me to trying anything else to aid the investigation.

Render engine seems to be intended for visualizations that end up being displayed on the DRIVE. That isn’t my end goal.
I think there are a few easier workarounds I could try, but I’d prefer to use dwImageTransformation with a fix for this issue.

Hi raul.tambre,

Unfortunately, It seems we have a bug on the API dwImageTransformation_copy.

we will be working on getting it solved for the next release version of DW.

at the meantime, if you have a possible workaround, than I suggest you do that.

we’ll update when this will be solved.


Dear raul.tambre,
FYI, We have fixed this issue and it is available in next DRIVE SW release.


Great to hear!
Any idea as to when the next release might be?

Dear raul.tambre,
We will update you once the release schedule is fixed.