How to free memory allocated by cudaMalloc3D?

merlin74 · June 9, 2009, 1:24am

Hi All,

Suppose I have the following code:

float* deviceData = 0;

cudaPitchedPtr pitchPtr = make_cudaPitchedPtr( (void*)deviceData, dimx*sizeof(float), dimx, dimy );

printf("cudaPitchedPtr: %s\n", cudaGetErrorString(cudaGetLastError()));

cudaExtent ca_extent;

ca_extent.width  = dimx;

ca_extent.height = dimy;

ca_extent.depth  = dimz;

cudaMalloc3D( &pitchPtr, ca_extent);

How should I free the memory that pitchPtr points to? According to the reference manual, cudaFree() seems to work only with memory from cudaMalloc() or cudaMallocPitch().

Currently, I use

cudaFree(deviceData);

which doesn’t report an error, but I don’t know whether it really frees the memory.

Also, if I have calls like

cudaPitchedPtr  obj_gpu;

ca_extent = make_cudaExtent(dim[0]*sizeof(char), dim[1], dim[2]);

cudaMalloc3D( &obj_gpu, ca_extent);

cudaMemset3D( obj_gpu, 0, ca_extent);

I don’t have an additional pointer to work with. How do I free this block of memory?

bitminer · June 29, 2010, 4:32pm

I believe

cudaFree(deviceData);

should be changed to

cudaFree(pitchPtr.ptr);

This post is somewhat dated, but I came across it and thought I should reply.

I don’t think the call to make_cudaPitchedPtr is needed: cudaMalloc3D fills in pitchPtr based on the specified extent, and pitchPtr.ptr then points to the memory allocated on the GPU.
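A minimal sketch of the allocate-and-free pattern described here, assuming a float volume. The helper name allocAndFreeVolume and its parameters are hypothetical, not from the original post. Two details are worth noting. First, for linear memory the extent width is given in bytes, so a float volume needs dimx * sizeof(float). Second, cudaFree(deviceData) in the original question most likely reported no error because deviceData was still a null pointer: cudaMalloc3D writes the new address into pitchPtr.ptr, not into deviceData, and cudaFree on a null pointer is documented as a no-op that returns cudaSuccess.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: allocates a dimx x dimy x dimz float volume with
// cudaMalloc3D, clears it, and frees it. Returns the first error seen.
cudaError_t allocAndFreeVolume(size_t dimx, size_t dimy, size_t dimz)
{
    // For linear memory the extent width is in bytes, not elements.
    cudaExtent extent = make_cudaExtent(dimx * sizeof(float), dimy, dimz);

    // No make_cudaPitchedPtr call is needed: cudaMalloc3D fills in
    // ptr, pitch, xsize and ysize itself.
    cudaPitchedPtr pitchPtr;
    cudaError_t err = cudaMalloc3D(&pitchPtr, extent);
    if (err != cudaSuccess) return err;

    printf("ptr=%p pitch=%zu xsize=%zu ysize=%zu\n",
           pitchPtr.ptr, pitchPtr.pitch, pitchPtr.xsize, pitchPtr.ysize);

    // Zero the whole volume, padding included.
    err = cudaMemset3D(pitchPtr, 0, extent);

    // Free through the ptr field; this is the only handle to the block.
    cudaError_t freeErr = cudaFree(pitchPtr.ptr);
    return (err != cudaSuccess) ? err : freeErr;
}

// Shows why freeing the untouched pointer "worked": it is still null,
// and cudaFree on a null pointer succeeds without releasing anything.
cudaError_t freeUntouchedPointer()
{
    float* deviceData = 0;
    return cudaFree(deviceData);
}
```

Calling allocAndFreeVolume(64, 64, 64) from main and checking the result against cudaSuccess is enough to try it out. The same cudaFree(obj_gpu.ptr) call applies to the second snippet in the question, where no separate pointer exists.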

From the NVIDIA CUDA 3.0 Programming Guide:

cudaPitchedPtr devPitchedPtr; 

cudaExtent extent = make_cudaExtent(64, 64, 64); 

cudaMalloc3D(&devPitchedPtr, extent);

From the CUDA 3.0 Reference Manual:

cudaError_t cudaMalloc3D (struct cudaPitchedPtr * pitchedDevPtr, struct cudaExtent extent)

Allocates at least width * height * depth bytes of linear memory on the device and returns a cudaPitchedPtr in which ptr is a pointer to the allocated memory. The function may pad the allocation to ensure hardware alignment requirements are met. The pitch returned in the pitch field of pitchedDevPtr is the width in bytes of the allocation.

The returned cudaPitchedPtr contains additional fields xsize and ysize, the logical width and height of the allocation, which are equivalent to the width and height extent parameters provided by the programmer during allocation.

For allocations of 2D and 3D objects, it is highly recommended that programmers perform allocations using cudaMalloc3D() or cudaMallocPitch(). Due to alignment restrictions in the hardware, this is especially true if the application will be performing memory copies involving 2D or 3D objects (whether linear memory or CUDA arrays).

Parameters:

pitchedDevPtr - Pointer to allocated pitched device memory

extent - Requested allocation size
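Because the allocation may be padded, elements have to be addressed through the returned pitch rather than the logical width. The following is a rough sketch of that addressing, assuming float data: consecutive rows are pitch bytes apart and consecutive slices are pitch * height bytes apart. The names elementAt, fillVolume and fillAndReadBack are hypothetical and only serve the illustration.

```cuda
#include <cuda_runtime.h>

// Address of element (x, y, z) in a pitched float volume. Rows are
// `pitch` bytes apart and slices are `pitch * height` bytes apart.
__host__ __device__ inline float* elementAt(cudaPitchedPtr p, size_t height,
                                            size_t x, size_t y, size_t z)
{
    char* base = static_cast<char*>(p.ptr);
    char* row  = base + (z * height + y) * p.pitch;
    return reinterpret_cast<float*>(row) + x;
}

// Each thread handles one (x, y) column and walks through all slices.
__global__ void fillVolume(cudaPitchedPtr p, int width, int height, int depth)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;
    for (int z = 0; z < depth; ++z)
        *elementAt(p, height, x, y, z) = float(x + 10 * y + 100 * z);
}

// Hypothetical driver: fills a volume on the device and reads back one
// element. Returns -1.0f if the allocation fails.
float fillAndReadBack(int width, int height, int depth, int x, int y, int z)
{
    cudaPitchedPtr p;
    cudaExtent extent = make_cudaExtent(width * sizeof(float), height, depth);
    if (cudaMalloc3D(&p, extent) != cudaSuccess) return -1.0f;

    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);
    fillVolume<<<grid, block>>>(p, width, height, depth);

    // The host only computes the device address here; cudaMemcpy does
    // the actual read after the kernel has finished.
    float value = -1.0f;
    cudaMemcpy(&value, elementAt(p, height, x, y, z), sizeof(float),
               cudaMemcpyDeviceToHost);

    cudaFree(p.ptr);
    return value;
}
```

With this layout, fillAndReadBack(8, 4, 2, 3, 2, 1) reads the element the kernel wrote as 3 + 10 * 2 + 100 * 1. Passing the height explicitly mirrors the Programming Guide's style; p.ysize should carry the same value.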
