Out of memory?

Hi there, I'm trying to figure out why a CUDA_SAFE_CALL returns an out-of-memory error in my application.

Important lines are as follows:

#define KERNEL_W  15
#define KERNEL_H  15
#define DATA_W  8000
#define DATA_H  8000

const int KERNEL_SIZE = KERNEL_W * KERNEL_H * sizeof(float);
const int DATA_SIZE   = DATA_W   * DATA_H   * sizeof(float);

__device__ __constant__ float d_Kernel[KERNEL_W * KERNEL_H];  // convolution kernel in constant memory

texture<float, 2, cudaReadModeElementType> texData;           // 2D texture for the input data

float *d_Result;          // linear output buffer
cudaArray *a_Data;        // array backing the texture
cudaChannelFormatDesc floatTex = cudaCreateChannelDesc<float>();

CUDA_SAFE_CALL( cudaMalloc((void **)&d_Result, DATA_SIZE) );
CUDA_SAFE_CALL( cudaMallocArray(&a_Data, &floatTex, DATA_W, DATA_H) );

This last line is the one that fails, and I can't really see why. 7000x7000 works fine, but 8000x8000 is a no-go.

Unless I'm missing something, this amounts to 8000x8000x2 floats to allocate, if we ignore the small kernel, which sits in constant memory and is well under the 64 KB limit. The x2 is for the data texture array plus the result array. That works out to 122 MB of data.
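In case it helps, here is a small diagnostic I could drop in just before the failing call to see how much device memory is actually free. It's only a rough sketch, assuming the runtime provides cudaMemGetInfo; the helper name printFreeMem is just for illustration.

#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: report how much device memory is currently free.
static void printFreeMem(const char *tag)
{
    size_t freeBytes = 0, totalBytes = 0;
    if (cudaMemGetInfo(&freeBytes, &totalBytes) != cudaSuccess) {
        printf("%s: cudaMemGetInfo failed\n", tag);
        return;
    }
    printf("%s: %.1f MiB free of %.1f MiB total\n", tag,
           freeBytes  / (1024.0 * 1024.0),
           totalBytes / (1024.0 * 1024.0));
}

Calling printFreeMem("before cudaMallocArray") right before the failing line should at least show whether the request can fit in what is left.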

I have an 8800 GT with 512 MB of VRAM.

Can anyone enlighten me on this?

Thanks!

Check your math: (8000 * 8000 * 2 * 4) / (1024^2) = 488.28125 MiB.
Some memory is already used for the framebuffer and other things, so you can't get the full 512 MiB.
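As a quick sanity check in code (just a sketch; the cudaArray can take even a bit more than this because of internal alignment):

#include <cstdio>

#define DATA_W 8000
#define DATA_H 8000

int main(void)
{
    /* sizeof(float) is 4 bytes, not 1 */
    size_t resultBytes = (size_t)DATA_W * DATA_H * sizeof(float);  /* linear d_Result buffer    */
    size_t arrayBytes  = (size_t)DATA_W * DATA_H * sizeof(float);  /* the cudaArray, at minimum */
    printf("required: %.2f MiB\n",
           (resultBytes + arrayBytes) / (1024.0 * 1024.0));        /* prints ~488.28 MiB */
    return 0;
}

That is already more than you can realistically get on a 512 MB card once the display has taken its share.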

Duh, I had 1 byte per float all along! It was late and all… you know, other excuses like that!

Thanks!