2D Array Filter (very simple) Error due to my C code...

Hello Forum,

I’m a raw beginner in CUDA programming. When I run the attached program, the following error messages occur:

When I change X and Y to:

the error messages change to:

Can anyone please help me, or give me a hint for solving this error? Maybe it’s quite simple… but I can’t figure it out.

I have attached the source code below.

Thank you very much for your help and your time.

Best regards,

Sandra

#include <cuda.h>
#include <stdio.h>
#include <cutil_inline.h>

// Kernel that executes on the CUDA device
__global__ void filter(int *grey[X][Y], int *color[X][Y])
{
	int x = blockIdx.x * blockDim.x + threadIdx.x;
	int y = blockIdx.y * blockDim.y + threadIdx.y;

	if (x > 1 && y > 1 && x < X-1 && y < Y-1)
	{
		color[x][y] = grey[x-1][y-1] +
			grey[x][y-1]   +
			grey[x+1][y-1] +
			grey[x-1][y]   +
			grey[x+1][y]   +
			grey[x-1][y+1] +
			grey[x][y+1]   +
			grey[x+1][y+1];
	}
}

int main(void)
{
	int X = 10;
	int Y = 10;

	int elemente = X * Y;
	int size = elemente * sizeof(int);

	int** ENbayered_h = new int[X][Y];		  // Allocate array on host
	int** DEbayered_h = new int[X][Y];		  // Allocate array on host
	int** ENbayered_d;
	int** DEbayered_d;

	cudaMalloc((void **) &ENbayered_d, size);   // Allocate array on device
	cudaMalloc((void **) &DEbayered_d, size);   // Allocate array on device

	// blocksize
	int block_size = 256;

	// number of blocks
	int n_blocks = elemente/block_size + (size%block_size == 0 ? 0:1);

	// Initialize host array
	for (int i=0; i<X; i++)
	{
		for (int j=0; j<Y; j++)
		{
			ENbayered_h[i][j] = 1; // enbayered grey values
		}
	}

	cudaMemcpy(ENbayered_d, ENbayered_h, size, cudaMemcpyHostToDevice);

	filter <<< n_blocks, block_size >>> (ENbayered_d, DEbayered_d);

	cudaMemcpy(DEbayered_h, DEbayered_d, size, cudaMemcpyDeviceToHost);

	free(ENbayered_h);
	free(DEbayered_h);
	cudaFree(ENbayered_d);
	cudaFree(DEbayered_d);
}

__global__ void filter(int *grey[X][Y], int *color[X][Y])

That syntax is totally illegal. If you are passing pointers to pointers, do it like this:

__global__ void filter(int **grey, int **color)

I should warn you that your host-side memory allocation and the copies to and from device memory are also doomed to fail as written. Consider using simple 1D arrays for your storage on both device and host; it will make life considerably less complex in the long run.
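To make the 1D-array suggestion concrete, here is a minimal host-side sketch in plain C++ (it should also compile as CUDA host code). It stores the X-by-Y grid in one flat buffer and addresses element (x, y) as x * Y + y; a kernel working on a flat int* buffer could use the same index expression. The names idx and filter_reference are hypothetical, and the loops start at index 1, which is an assumption about the intended border (the posted kernel tests x > 1 && y > 1).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major offset of element (x, y) in a flat buffer holding an X-by-Y grid.
// Y is the number of entries per value of x.
inline std::size_t idx(int x, int y, int Y) {
    return static_cast<std::size_t>(x) * Y + y;
}

// CPU reference for the 8-neighbour sum from the thread.
// Border cells have no complete neighbourhood and are left at 0.
std::vector<int> filter_reference(const std::vector<int>& grey, int X, int Y) {
    assert(grey.size() == static_cast<std::size_t>(X) * Y);
    std::vector<int> color(grey.size(), 0);
    for (int x = 1; x < X - 1; ++x) {
        for (int y = 1; y < Y - 1; ++y) {
            color[idx(x, y, Y)] =
                grey[idx(x - 1, y - 1, Y)] + grey[idx(x, y - 1, Y)] + grey[idx(x + 1, y - 1, Y)] +
                grey[idx(x - 1, y,     Y)]                          + grey[idx(x + 1, y,     Y)] +
                grey[idx(x - 1, y + 1, Y)] + grey[idx(x, y + 1, Y)] + grey[idx(x + 1, y + 1, Y)];
        }
    }
    return color;
}
```

With a flat buffer, one allocation of X * Y * sizeof(int) bytes and one copy per direction cover the whole grid, which is what the cudaMalloc and cudaMemcpy calls in the posted program already assume.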

Hi,

It seems you are declaring X and Y in the wrong scope: they are local to main, so the kernel cannot see them. You could declare them as global variables, or you could pass them along in your kernel invocation,

e.g. myKernel(grey, color, X, Y)

If X and Y are known at compile time, however, you might as well declare them as constant integers, as that might also be beneficial performance-wise.

#include <cuda.h>
#include <stdio.h>
#include <cutil_inline.h>

const int X = 10;
const int Y = 10;

// Kernel that executes on the CUDA device
__global__ void filter(int *grey[X][Y], int *color[X][Y])
{
	int x = blockIdx.x * blockDim.x + threadIdx.x;
	int y = blockIdx.y * blockDim.y + threadIdx.y;

	if (x > 1 && y > 1 && x < X-1 && y < Y-1)
	{
		color[x][y] = grey[x-1][y-1] +
			grey[x][y-1]   +
			grey[x+1][y-1] +

..............
............
................
}
...........
.........
.........
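As a rough illustration of the other option mentioned in this reply, passing the dimensions along as arguments, here is a small host-side sketch in plain C++. The helper names blocks_for and is_interior are hypothetical and not part of any CUDA API; is_interior takes X and Y as parameters instead of relying on names that exist only inside main, and blocks_for shows one common way to round the block count up. The interior test uses >= 1, which is an assumption about the intended border.

```cpp
#include <cassert>

// Number of blocks needed so that blocks * block_size >= n (ceiling division).
inline int blocks_for(int n, int block_size) {
    assert(n >= 0 && block_size > 0);
    return (n + block_size - 1) / block_size;
}

// True when (x, y) is an interior cell of an X-by-Y grid,
// i.e. all eight neighbours of the cell exist.
inline bool is_interior(int x, int y, int X, int Y) {
    return x >= 1 && y >= 1 && x < X - 1 && y < Y - 1;
}
```

A kernel could take the same two extra int parameters and apply the same bounds test before touching any neighbour.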