Hi All,
Suppose I have the following code:
float* deviceData = 0;
cudaPitchedPtr pitchPtr = make_cudaPitchedPtr( (void*)deviceData, dimx*sizeof(float), dimx, dimy );
printf("cudaPitchedPtr: %s\n", cudaGetErrorString(cudaGetLastError()));
cudaExtent ca_extent;
ca_extent.width = dimx * sizeof(float);  // cudaMalloc3D expects the extent width in bytes
ca_extent.height = dimy;
ca_extent.depth = dimz;
cudaMalloc3D( &pitchPtr, ca_extent);
How should I free the memory pointed to by pitchPtr? According to the reference manual, cudaFree() seems to work only with memory allocated by cudaMalloc() or cudaMallocPitch().
Currently, I use
cudaFree(deviceData);
which doesn’t give me any errors, but I don’t know whether it really frees the memory.
Also if I have calls like
cudaPitchedPtr obj_gpu;
ca_extent = make_cudaExtent(dim[0]*sizeof(char), dim[1], dim[2]);
cudaMalloc3D( &obj_gpu, ca_extent);
cudaMemset3D( obj_gpu, 0, ca_extent);
Here I don’t have an additional pointer to work with. How do I free this block of memory?
I believe
cudaFree(deviceData);
should be changed to
cudaFree(pitchPtr.ptr);
This post is somewhat dated, but I came across it and thought I should reply.
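I don’t think the call to make_cudaPitchedPtr is needed: cudaMalloc3D fills in pitchPtr based on the extent specified, and pitchPtr.ptr then points to the memory allocated on the GPU. deviceData itself is never updated and is still 0, and cudaFree(0) performs no operation and reports no error, which is why the original call appears to work without actually freeing anything.
From the NVIDIA CUDA 3.0 Programming Guide: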
cudaPitchedPtr devPitchedPtr;
cudaExtent extent = make_cudaExtent(64, 64, 64);
cudaMalloc3D(&devPitchedPtr, extent);
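Putting the points above together, here is a minimal sketch of the allocate / clear / free cycle. The helper name allocClearFree3D and its parameters are hypothetical (they are not from the thread or the guide); the CUDA runtime calls are the ones already discussed. It assumes the extent width is given in bytes, as cudaMalloc3D expects.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper for illustration: allocate a pitched 3D block,
// zero it, and free it again. Returns the first CUDA error seen,
// or cudaSuccess if every call succeeded.
cudaError_t allocClearFree3D(size_t widthBytes, size_t height, size_t depth)
{
    // Width is in bytes; height and depth are counts of rows and slices.
    cudaExtent extent = make_cudaExtent(widthBytes, height, depth);

    // No make_cudaPitchedPtr call is needed: cudaMalloc3D fills in
    // ptr, pitch, xsize and ysize itself.
    cudaPitchedPtr pitched;
    cudaError_t err = cudaMalloc3D(&pitched, extent);
    if (err != cudaSuccess) {
        return err;
    }

    // The pitch may be larger than widthBytes because of padding.
    printf("requested width: %zu bytes, pitch: %zu bytes\n",
           widthBytes, pitched.pitch);

    // Zero the whole block.
    err = cudaMemset3D(pitched, 0, extent);

    // Free through the ptr field; this is the only handle to the allocation.
    cudaError_t freeErr = cudaFree(pitched.ptr);

    return (err != cudaSuccess) ? err : freeErr;
}
```

A call such as allocClearFree3D(64 * sizeof(float), 64, 64) would exercise this for a 64x64x64 float volume. The same pattern answers the second case in the question: cudaFree(obj_gpu.ptr).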
From the CUDA 3.0 Reference Manual:
cudaError_t cudaMalloc3D (struct cudaPitchedPtr * pitchedDevPtr, struct cudaExtent extent)
Allocates at least width * height * depth bytes of linear memory on the device and returns a cudaPitchedPtr in
which ptr is a pointer to the allocated memory. The function may pad the allocation to ensure hardware alignment
requirements are met. The pitch returned in the pitch field of pitchedDevPtr is the width in bytes of the
allocation.
The returned cudaPitchedPtr contains additional fields xsize and ysize, the logical width and height of the allocation,
which are equivalent to the width and height extent parameters provided by the programmer during
allocation.
For allocations of 2D and 3D objects, it is highly recommended that programmers perform allocations using cudaMalloc3D()
or cudaMallocPitch(). Due to alignment restrictions in the hardware, this is especially true if the application
will be performing memory copies involving 2D or 3D objects (whether linear memory or CUDA arrays).
Parameters:
pitchedDevPtr - Pointer to allocated pitched device memory
extent - Requested allocation size
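Because of the padding described above, elements have to be addressed through the returned pitch rather than the requested width. The following is a sketch only: pitchedByteOffset and fillVolume are hypothetical names, and it assumes a volume of float elements with width, height and depth given in elements, rows and slices.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: byte offset of element (x, y, z) in a pitched
// 3D allocation. pitch is the padded row width in bytes, height is the
// number of rows per slice, elemSize is the size of one element in bytes.
__host__ __device__ inline size_t pitchedByteOffset(size_t pitch, size_t height,
                                                    size_t x, size_t y, size_t z,
                                                    size_t elemSize)
{
    size_t slicePitch = pitch * height;  // bytes per 2D slice, padding included
    return z * slicePitch + y * pitch + x * elemSize;
}

// Hypothetical kernel: each thread handles one (x, y) column and walks
// through all slices in z, writing the same value everywhere.
__global__ void fillVolume(cudaPitchedPtr vol, int width, int height, int depth,
                           float value)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) {
        return;
    }

    char* base = (char*)vol.ptr;  // byte pointer, so offsets are in bytes
    for (int z = 0; z < depth; ++z) {
        size_t offset = pitchedByteOffset(vol.pitch, height, x, y, z, sizeof(float));
        *(float*)(base + offset) = value;
    }
}
```

The offset is computed in bytes on purpose: the pitch is a byte count and is not guaranteed to equal width * sizeof(float), so plain float-index arithmetic over the requested width could land in the wrong row.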