Textures don’t support float3 though, so I need to copy the data into float4 values. Copying the above array directly gives a wrong result, as there are only three values for each vector. Is there any way to copy these values into a 3D texture of float4 and set the last element of each vector to 0?
The corresponding thing from linear host memory to linear device memory would be:
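The snippet this referred to appears to have been lost from the thread; a minimal sketch of what a linear-to-linear 3D copy looks like (variable names are my own, not from the original post) is:

```cuda
// Sketch (names are assumptions): copy a width x height x depth block of
// float4 data from linear host memory to linear device memory.
// Note: when neither endpoint is a cudaArray, the extent width is in BYTES.
cudaMemcpy3DParms p = {0};
p.srcPtr = make_cudaPitchedPtr(h_data, width * sizeof(float4), width, height);
p.dstPtr = make_cudaPitchedPtr(d_data, width * sizeof(float4), width, height);
p.extent = make_cudaExtent(width * sizeof(float4), height, depth);
p.kind   = cudaMemcpyHostToDevice;
cudaMemcpy3D(&p);
```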
For the moment I’m just realigning the memory myself, so if anyone knows of a better way, I’d be pleased to hear it :)
I have another question though. The kernel used for the advection “crashes” when the density fields get too big. It doesn’t really crash, as the program keeps executing, but the values returned from the GPU are all zeroes. I guess I’m overstepping the global device memory or something, since it works in emulation mode but not otherwise. Although I think it “crashes” too early for it to be global memory.
Is there any way to check why it crashes? I’ve tried putting CUT_CHECK_ERROR("kernel failed") after the kernel call and wrapping all the other cuda* calls in CUDA_SAFE_CALL(). Don’t know if that’s the right way to do it though.
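For what it’s worth, CUT_CHECK_ERROR and CUDA_SAFE_CALL come from the cutil convenience library, and CUT_CHECK_ERROR compiles to a no-op in release builds, so it may silently report nothing. A more explicit check after a launch (the kernel name and arguments below are placeholders) would be:

```cuda
advectKernel<<<grid, block>>>(d_in, d_out);   // placeholder kernel and args

cudaError_t err = cudaGetLastError();         // catches launch-config errors
if (err != cudaSuccess)
    fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(err));

err = cudaThreadSynchronize();                // catches errors during execution
if (err != cudaSuccess)                       // (e.g. an out-of-bounds access)
    fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
```

After a kernel aborts this way the error state is sticky, and subsequent copies back from the device tend to return garbage or zeroes, which matches the symptom described above.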
On another note, does anyone know how much space a 3D array and texture take up in global memory?
For a program I’m allocating two 3D textures (one float and one float4), where one is a density field and the other a vector field used to push the densities around. I also have a result array in linear memory, the same size as the density field.
If I allocate all fields at 240x240x240, everything runs smoothly. But if I change to 250x250x250, it runs out of memory on the last cudaMalloc, for the result field.
I run on a Quadro FX 3700, which means about 512 MB of global memory. If I were to have everything in linear memory, the program would need about:
(250 x 250 x 250 x 16 + 250 x 250 x 250 x 4 x 2) / 1024 / 1024 ≈ 358 MB of memory (disregarding any other junk and the small constant-memory pieces allocated)
Can the CUDA arrays really be occupying the remaining ~150 MB of memory, or am I doing something wrong?
Typically, CUDA eats up ~50 MiB. If this GPU is running a display (especially one with a fancy compositing desktop) that usage can be higher. You can check how much memory is free with cuMemGetInfo (see any of a number of recent threads on this topic).
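A minimal driver-API sketch of that call (in the CUDA versions of this era the parameters were unsigned int pointers; newer versions take size_t):

```cuda
#include <cuda.h>     // driver API
#include <stdio.h>

int main(void)
{
    cuInit(0);
    CUdevice dev;  cuDeviceGet(&dev, 0);
    CUcontext ctx; cuCtxCreate(&ctx, 0, dev);

    unsigned int freeMem, totalMem;       // size_t in newer CUDA versions
    cuMemGetInfo(&freeMem, &totalMem);
    printf("free: %u MiB / total: %u MiB\n", freeMem >> 20, totalMem >> 20);

    cuCtxDestroy(ctx);
    return 0;
}
```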
Ah, thanks a lot. I’ve been looking for a way to see available CUDA memory. I only started with CUDA recently, so I haven’t dared to wander into the driver API yet, but I’ll definitely check it out. Can’t try it right now as I don’t have access to that computer at the moment. Thanks for the help.