cudaError_enum on large Arrays

Hey guys - I got this pretty common error (I think), but I can’t figure out a solution even though I read the other threads about it :/

What I have is actually some pretty simple code:

__global__ void sobelThis(Pixel* pixels, int iWidth, int iHeight) {

	// a single thread walks over every pixel of the image
	for(long i = 0; i < iWidth * iHeight; i++) {

		pixels[i].Red += 1;

	}

}

void bridgeSobelThis(Pixel* pixels, int iWidth, int iHeight) {

	printf("Sobeling... ");

	Pixel *devicePic;

	// alloc memory on device

	CUDA_SAFE_CALL(cudaMalloc((void**)&devicePic, sizeof(Pixel) * iWidth * iHeight));

	CUDA_SAFE_CALL(cudaMemcpy(devicePic, pixels, sizeof(Pixel) * iWidth * iHeight, cudaMemcpyHostToDevice));

	sobelThis<<<1, 1>>>(devicePic, iWidth, iHeight);


	CUDA_SAFE_CALL(cudaMemcpy(pixels, devicePic, sizeof(Pixel) * iWidth * iHeight, cudaMemcpyDeviceToHost));

	// release the used memory on the device
	CUDA_SAFE_CALL(cudaFree(devicePic));

}

What I've got is an array of Pixels with iWidth * iHeight elements - I want to do some simple computation on every pixel.

  • So I allocate some global memory on my device

  • Copy the data from my pixels array to the device

  • Call the kernel (all it does atm is increase a value… ;))

  • Copy the computed data back to the host

  • Release the used memory on the device

The thing is - when I call “pixels[i].Red += 1;” in my kernel, my driver crashes and therefore my program too (VS says the error is cudaError_enum).

When I comment that part out, so that the kernel does nothing, everything is fine.

Also, when my Pixel array is pretty small (1000 x 750 pixels), everything is OK too.

The crash happens for pictures of size 3000 x 2000, for example. That would be about 25 Megs of device memory, which should be OK.
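Since a kernel launch is asynchronous, a failure inside the kernel is only reported by a later CUDA call (here the cudaMemcpy back to the host). A minimal sketch of how the actual error could be made visible right after the launch is shown here. checkCuda and checkKernel are made-up helper names, not part of CUDA or of my code above, and I am assuming a toolkit that has cudaDeviceSynchronize (older toolkits used cudaThreadSynchronize instead):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints a readable message for a CUDA error
// and returns true only if the call succeeded.
static bool checkCuda(cudaError_t err, const char* what) {
	if (err != cudaSuccess) {
		fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
		return false;
	}
	return true;
}

// Hypothetical helper: call right after a launch such as
// sobelThis<<<1, 1>>>(devicePic, iWidth, iHeight);
static bool checkKernel(const char* name) {
	// errors of the launch itself (e.g. a bad configuration) show up here
	if (!checkCuda(cudaGetLastError(), name)) return false;
	// errors during execution only show up once the kernel has finished
	return checkCuda(cudaDeviceSynchronize(), name);
}
```

With something like checkKernel("sobelThis") after the launch, the message printed should name the real cause instead of just cudaError_enum.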

Does someone have an idea why my driver could crash at this point?

Thx for any help ;)

FYI: I think I solved the “problem”.

When my pictures became quite big and I did all my calculations in one single thread, the execution time exceeded 5 seconds and my driver “crashed” because the kernel ran into the driver's timeout.
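For anyone running into the same thing: the usual way around the timeout is to spread the work over many threads instead of looping in one. Here is a rough sketch of what that could look like, assuming one thread per pixel and 256 threads per block (both are just my choices). The Pixel struct layout and the names sobelThisParallel and bridgeSobelThisParallel are assumptions for illustration:

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Assumed layout; the real Pixel struct is not shown in this thread.
struct Pixel {
	unsigned char Red, Green, Blue, Alpha;
};

// Each thread handles exactly one pixel instead of one thread handling all.
__global__ void sobelThisParallel(Pixel* pixels, int iWidth, int iHeight) {
	long i = (long)blockIdx.x * blockDim.x + threadIdx.x;
	if (i < (long)iWidth * iHeight) {   // the last block may be partly unused
		pixels[i].Red += 1;
	}
}

// Host side wrapper; returns the first CUDA error it runs into.
cudaError_t bridgeSobelThisParallel(Pixel* pixels, int iWidth, int iHeight) {
	size_t numPixels = (size_t)iWidth * iHeight;
	size_t numBytes = numPixels * sizeof(Pixel);
	Pixel* devicePic = NULL;

	// alloc memory on device
	cudaError_t err = cudaMalloc((void**)&devicePic, numBytes);
	if (err != cudaSuccess) return err;

	err = cudaMemcpy(devicePic, pixels, numBytes, cudaMemcpyHostToDevice);
	if (err == cudaSuccess) {
		int threadsPerBlock = 256;
		// round up so that every pixel gets a thread
		int blocks = (int)((numPixels + threadsPerBlock - 1) / threadsPerBlock);
		sobelThisParallel<<<blocks, threadsPerBlock>>>(devicePic, iWidth, iHeight);
		err = cudaGetLastError();
	}
	if (err == cudaSuccess) {
		// this copy waits for the kernel and reports its errors too
		err = cudaMemcpy(pixels, devicePic, numBytes, cudaMemcpyDeviceToHost);
	}
	cudaFree(devicePic);
	return err;
}
```

For a 3000 x 2000 picture this launches 23438 blocks, which still fits into a one-dimensional grid even on older cards (limit 65535 blocks in x). Much bigger images would need a 2D grid or a loop inside each thread.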