Crashes lead to 'out of memory'

mattis · October 19, 2009, 4:49pm

Hi everyone, this is my first post here and I'm having some trouble with a memory allocation:

Is it OK to allocate, for instance, 64x34x64 3D textures?
My program crashes at “cutilSafeCall( cudaMemcpy3D(&copyParams));” with a segmentation fault at 64x34x64, but never when I'm using 64x64x64.
After several failures and crashes, none of my CUDA code will execute due to “out of memory”. I've heard that the device should free the memory, even if the launch fails.
Sometimes my data_h pointer has the memory address 0x2aaac488b010, is this normal? It seems very large compared to what I'm used to :o)

My questions relate to the following code snippet (data_d is the device cudaArray pointer and data_h is the host pointer):

[codebox]
void allocateVolumeMemory(cudaArray *(&data_d), cudaExtent extent)
{
    cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc<float>();
    cutilSafeCall(cudaMalloc3DArray(&data_d, &channelDesc, extent));
    checkCUDAError("Failed allocateVolumeMemory");
}

void transferVolumeDataToDevice(cudaArray *(&data_d), const float *data_h, cudaExtent extent)
{
    cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc<float>();

    // copy from host memory to device memory
    cudaMemcpy3DParms copyParams = {0};
    copyParams.srcPtr = make_cudaPitchedPtr((void*)data_h, extent.width*sizeof(float), extent.height, extent.width);
    copyParams.dstArray = data_d;
    copyParams.extent = extent;
    copyParams.kind = cudaMemcpyHostToDevice;

    cutilSafeCall( cudaMemcpy3D(&copyParams));
    checkCUDAError("Failed transferVolumeDataToDevice");
}
[/codebox]

I'm running Linux x64.

Thank you

Smokey · October 19, 2009, 9:56pm

This is common on all platforms if the crash was severe enough, or if you have enough crashes to trigger some condition inside the CUDA API/drivers… I encounter the same issue using the Driver API quite often when working on new kernels, debugging crashes and other issues.

I’m not sure if this has officially been brought up with NVIDIA before though…

Sadly I don’t know enough about the Runtime API to actually help you with what could be going wrong though.

Linh_Ha · October 20, 2009, 4:08am

Nothing official says it is not allowed; however, I second the problem. It took me a whole day to figure out what was going on and finally realize that I cannot use every kernel size. When I try an 80 x 112 x 80 kernel it is fine, but 72 x 96 x 80 is not. It seems that the width should be a multiple of 16, but in your case it still crashes even with a multiple of 16. It should be a bug; I’m waiting for the fix.

mattis · October 20, 2009, 7:08am

I don’t really get this, kernel size? I'm talking about a texture memory allocation (3D); how does that affect kernel sizes?

Also, if I allocate extra memory for the host array, say 128x128x128 floats (on the host), there are no crashes.

eyalhir74 · October 20, 2009, 7:34am

Sounds more like an out-of-bounds error and not a driver/CUDA problem. Once the kernel/copy fails, can you print the error code/message to make sure it's not an out-of-bounds problem?

Also, if you want to debug it, try valgrind on Linux, or on Windows try to figure out why you’re accessing data too far beyond your pointer.

This certainly applies to the other post (from Linh Ha): there is no multiple-of-16 limit either on the kernel size or on the memory you allocate.

eyal

mattis · October 20, 2009, 8:41am

First of all, thanks to all of you for the fast replies!

There seems to be a problem with the size of the host array.

If I allocate new float[width * height * depth] and height < width it will cause a crash. Thus

128x127x128 → crash

128x128x128 → works

128x128x64 → works

and so on…

Can you see any problem with my allocations?

eyalhir74 · October 20, 2009, 8:46am

I doubt it has anything to do with the allocations. It's probably because, when you access the array in the kernel, you go beyond the boundaries of the array.

You probably have some sort of code like this at the end of the kernel:

myFaultyArray[ iOutputPosition] = fCalculatedValue;

iOutputPosition is probably causing an out-of-bounds access. Try to access fixed positions in the array, like: myFaultyArray[ 0 ] += fCalculatedValue;

and see if this crashes. If it doesn't (and it probably won’t), try to see why iOutputPosition is not calculated correctly, probably for the last block or some other situation…

eyal

mattis · October 20, 2009, 8:52am

Actually, it crashes at:

cutilSafeCall( cudaMemcpy3D(&copyParams));

Thus, before any kernel execution.

mattis · October 20, 2009, 9:50am

copyParams.srcPtr = make_cudaPitchedPtr((void*)data_h, extent.width*sizeof(float), extent.height, extent.width);

That's the problem…

height and width should swap positions… Sometimes you just don’t see the most basic errors :) Thank you all for trying to help me!
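
For anyone landing here later: make_cudaPitchedPtr takes its arguments as (ptr, pitch, xsize, ysize), so the corrected call would presumably be make_cudaPitchedPtr((void*)data_h, extent.width*sizeof(float), extent.width, extent.height). The sketch below is only a rough model of why the swapped version segfaults when height < width. It assumes the copy steps pitch * ysize bytes from one slice to the next; PitchedPtr, makePitched, allocatedBytes, bytesTouched and overruns are hypothetical names made up for illustration (not CUDA API), written in plain C++ so it runs without the toolkit.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for cudaPitchedPtr (data pointer omitted);
// only the layout fields matter for this arithmetic.
struct PitchedPtr {
    std::size_t pitch;  // bytes per row
    std::size_t xsize;  // logical width
    std::size_t ysize;  // logical height, i.e. rows per slice
};

// Same argument order as make_cudaPitchedPtr(ptr, pitch, xsize, ysize).
PitchedPtr makePitched(std::size_t pitch, std::size_t xsize, std::size_t ysize) {
    return PitchedPtr{pitch, xsize, ysize};
}

// Bytes the host really owns after new float[w * h * d].
std::size_t allocatedBytes(std::size_t w, std::size_t h, std::size_t d) {
    return w * h * d * sizeof(float);
}

// One past the last source byte a w x h x d float copy would read,
// ASSUMING consecutive slices sit pitch * ysize bytes apart.
std::size_t bytesTouched(const PitchedPtr& p, std::size_t w, std::size_t h, std::size_t d) {
    return (d - 1) * p.pitch * p.ysize + (h - 1) * p.pitch + w * sizeof(float);
}

// True when the modelled copy runs past the end of the host buffer.
bool overruns(const PitchedPtr& p, std::size_t w, std::size_t h, std::size_t d) {
    return bytesTouched(p, w, h, d) > allocatedBytes(w, h, d);
}

// Replays the sizes reported in this thread under the model above.
void checkThreadCases() {
    const std::size_t f = sizeof(float);
    // Swapped arguments (height, width), as in the original post.
    assert(overruns(makePitched(64 * f, 34, 64), 64, 34, 64));
    assert(overruns(makePitched(128 * f, 127, 128), 128, 127, 128));
    assert(!overruns(makePitched(128 * f, 128, 128), 128, 128, 128));
    // Corrected order (width, height).
    assert(!overruns(makePitched(64 * f, 64, 34), 64, 34, 64));
}
```

Under this model the swapped call makes every slice look `width` rows tall, so with height < width the copy walks past the end of data_h, which would match the segfault at 64x34x64 and 128x127x128. When width equals height the swap is harmless, which would explain why 64x64x64 and 128x128x128 worked.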
