Usage of cudaMalloc3D()

Could someone give example code that uses the cudaMalloc3D() function to allocate a logical 3D data structure in device memory?

The following snippet of code gives me an error:

struct cudaPitchedPtr {
    void *ptr;
    size_t pitch;
    size_t xsize;
    size_t ysize;
};

struct cudaExtent {
    size_t width;
    size_t height;
    size_t depth;
};

cudaExtent extent;
extent.width = 30;
extent.height = 30;
extent.depth = 30;

cudaError_t cudaMalloc3D(struct cudaPitchedPtr* pitchDevPtr, struct cudaExtent extent);

The error message is: "use of a local type to declare a function".






float element;
cudaPitchedPtr devPitchedPtr;
cudaExtent extent = make_cudaExtent(3, 3, 3);
cudaMalloc3D(&devPitchedPtr, extent);
cudaMemset3D(devPitchedPtr, 50, extent);

char* devPtr = (char*)devPitchedPtr.ptr;
size_t pitch = devPitchedPtr.pitch;
size_t slicePitch = pitch * extent.height;

for (int z = 0; z < extent.depth; z++) {
    char* slice = devPtr + z * slicePitch;
    for (int y = 0; y < extent.height; y++) {
        float* row = (float*)(slice + y * pitch);
        for (int x = 0; x < extent.width; x++) {
            element = row[x];
        }
    }
}

return 0;


When I run this code in emulator mode, the output should be the values of the 3D array elements set by cudaMemset3D, right?

The output should have been 50 printed 27 times, but I don't get that output. Could somebody please try this code?
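
A note on the expected value, offered as a sketch rather than a full diagnosis: cudaMemset3D, like memset, writes the given value into every byte of the region. A float read from memory filled with the byte 50 therefore has the bit pattern 0x32323232 (about 1e-8), not the value 50.0f. The host-only snippet below illustrates this with plain memset; the helper names floatFromByteFill and bitsFromByteFill are made up for this illustration and are not part of CUDA.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: the float you get when every byte of its storage
// is set to byteValue, which is how a byte-wise fill behaves.
float floatFromByteFill(int byteValue) {
    float f;
    std::memset(&f, byteValue, sizeof f);
    return f;
}

// Hypothetical helper: the raw 32-bit pattern produced by the same fill.
std::uint32_t bitsFromByteFill(int byteValue) {
    std::uint32_t u;
    std::memset(&u, byteValue, sizeof u);
    return u;
}
```

Under this reading, filling with 0 is the only byte value that gives an obviously meaningful float (0.0f); other float values would have to be written by a kernel or copied in from the host.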



Aren’t you forgetting to transfer the data from device to host before printing the values (whether using emulator mode or not)?


No, I am not forgetting. I am not transferring the data because I do not know how to transfer 3D datasets from the device to the host or back.

But this will not affect the way the code executes because I am using emulator mode. Therefore, I should be able to see the values of the elements.
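
On the question of moving 3D data between host and device, here is a minimal sketch using cudaMemcpy3D. It assumes the volume holds floats, so the extent width is given in bytes (nx * sizeof(float)), and it assumes the host buffer is tightly packed. The function name roundTripVolume and its parameters are made up for this example; only the cuda* and make_cuda* calls come from the CUDA runtime API.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: uploads an nx*ny*nz float volume filled with `fill`,
// copies it back, and returns how many elements survived the round trip.
// Returns -1 if any CUDA call fails.
int roundTripVolume(size_t nx, size_t ny, size_t nz, float fill) {
    std::vector<float> hostIn(nx * ny * nz, fill);
    std::vector<float> hostOut(nx * ny * nz, 0.0f);

    // For cudaMalloc3D the extent width is in bytes, height and depth in rows/slices.
    cudaExtent extent = make_cudaExtent(nx * sizeof(float), ny, nz);
    cudaPitchedPtr dev;
    if (cudaMalloc3D(&dev, extent) != cudaSuccess) return -1;

    // Host -> device. The host buffer is tightly packed, so its pitch is nx * sizeof(float).
    cudaMemcpy3DParms up = {0};
    up.srcPtr = make_cudaPitchedPtr(hostIn.data(), nx * sizeof(float), nx, ny);
    up.dstPtr = dev;
    up.extent = extent;
    up.kind   = cudaMemcpyHostToDevice;
    if (cudaMemcpy3D(&up) != cudaSuccess) { cudaFree(dev.ptr); return -1; }

    // Device -> host, into a second tightly packed buffer.
    cudaMemcpy3DParms down = {0};
    down.srcPtr = dev;
    down.dstPtr = make_cudaPitchedPtr(hostOut.data(), nx * sizeof(float), nx, ny);
    down.extent = extent;
    down.kind   = cudaMemcpyDeviceToHost;
    if (cudaMemcpy3D(&down) != cudaSuccess) { cudaFree(dev.ptr); return -1; }

    cudaFree(dev.ptr);

    // Count matching elements on the host, where the data can be read safely.
    int matches = 0;
    for (size_t i = 0; i < hostOut.size(); ++i) {
        if (hostOut[i] == fill) ++matches;
    }
    return matches;
}
```

Calling roundTripVolume(3, 3, 3, 50.0f) on a machine with a working CUDA device should return 27, one match per element. Reading the values from the host copy, rather than through devPitchedPtr.ptr, keeps the code valid outside emulator mode as well.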