Segmentation Fault

Hey guys, I’m new to CUDA programming; below is my code for a Game of Life program. When I run it I get a segmentation fault. The debugger says the error is at the hBlockAll[i][j] = 0; line, but I’m not sure what this means or why I am getting this error.

#include <stdio.h>
#include <stdlib.h>
#include <cuda.h>

const int matrix = 10;
void gameOflifeOnDevice();
__global__ void gameOfLifeOnGPU(int** dBlockAll,int** hBlockAll, int matrix);

int main(){

gameOflifeOnDevice();

getchar();

}

void gameOflifeOnDevice(){

int size = matrix * matrix * sizeof(float);

int **hBlockAll,**dBlockAll;



hBlockAll = (int**)malloc(size);
dBlockAll = (int**)malloc(size);
cudaMalloc(&hBlockAll,size);
cudaMalloc(&dBlockAll,size);


for(int i = 1 ; i < matrix ; i ++){
        for(int j = 1 ; j < matrix ; j++){
            hBlockAll[i][j] = 0;
        }
}
cudaMemcpy(&dBlockAll, hBlockAll , size ,cudaMemcpyHostToDevice);



gameOfLifeOnGPU<<<1,100>>>(dBlockAll,hBlockAll,matrix);

cudaMemcpy(&hBlockAll, dBlockAll , size ,cudaMemcpyDeviceToHost);

for (int i = 1 ; i < matrix ; i++){
    for (int j = 1 ; j < matrix ; j++){
        printf("|%c",(hBlockAll[i][j] == 1)? 'x' : '-');
    }
    printf("|\n");
}

free(hBlockAll);
free(dBlockAll);
cudaFree(hBlockAll);
cudaFree(dBlockAll);

}

__global__ void gameOfLifeOnGPU(int** dBlockAll,int** hBlockAll, int matrix){

int idx = (blockIdx.x * blockDim.x) + threadIdx.x;
int idy = (blockIdx.y * blockDim.y) + threadIdx.y;
int countx;
for(int x = 0 ; x <= idx ; x++)
{
    for(int y = 0 ; y <= idy ; y++)
    {
        if (hBlockAll[x][y] == 1)
         {          
              // check block around currsor
              countx += hBlockAll[x][y+1];
              countx += hBlockAll[x][y-1];
              countx += hBlockAll[x+1][y];
              countx += hBlockAll[x+1][y+1];
              countx += hBlockAll[x+1][y-1];
              countx += hBlockAll[x-1][y];
              countx += hBlockAll[x-1][y-1];
              countx += hBlockAll[x-1][y-1];
              
              if (countx < 2) dBlockAll[x][y] = 0;
             
              if (countx > 3) dBlockAll[x][y] = 0;

              if (countx == 2 || countx == 3) dBlockAll[x][y] = 1;
         }
         else{
              // check block around currsor
              countx += hBlockAll[x][y+1];
              countx += hBlockAll[x][y-1];
              countx += hBlockAll[x+1][y];
              countx += hBlockAll[x+1][y+1];
              countx += hBlockAll[x+1][y-1];
              countx += hBlockAll[x-1][y];
              countx += hBlockAll[x-1][y-1];
              countx += hBlockAll[x-1][y-1];
              if (countx > 1 && countx < 4)
              {
                   dBlockAll[x][y] = 1;    
              }   
              
         } // end if
    }
}

}//gameOfLifeOnGPU

My guess is your indexes are out of bounds:

hBlockAll[x-1][y-1];

when x=0 or y=0 the code above goes out of bounds.

You’ll need to fix that first somehow.

Different solutions are thinkable… but I’ll leave it up to you to come up with one.
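One possible way to handle the edges, as a minimal sketch rather than the only fix: assume the grid is stored as a flat array of n * n ints (an assumption; the posted code uses int**), treat cells outside the grid as dead, and start the count at zero. The name count_neighbours is made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Count the live neighbours of cell (x, y) in a flat n-by-n grid.
   Cells outside the grid are treated as dead (an assumption; wrapping
   around the edges would be another valid choice). */
int count_neighbours(const int *grid, int n, int x, int y)
{
    int count = 0;                      /* must start at zero */
    assert(grid != NULL && n > 0);
    for (int dx = -1; dx <= 1; dx++) {
        for (int dy = -1; dy <= 1; dy++) {
            int nx = x + dx;
            int ny = y + dy;
            if (dx == 0 && dy == 0)
                continue;               /* skip the cell itself */
            if (nx < 0 || nx >= n || ny < 0 || ny >= n)
                continue;               /* skip out-of-bounds neighbours */
            count += grid[nx * n + ny];
        }
    }
    return count;
}
```

Looping over the offsets also avoids listing the eight neighbours by hand, where it is easy to repeat one of them (the posted kernel reads [x-1][y-1] twice).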

Yes, I figured that would be an issue and I was going to tackle that later on, but the error occurs before those lines are even reached. I think the problem might be that I’m allocating memory for a two-dimensional array wrong. I am used to programming in Java and not having to worry about that.

The problem is probably with this code:

for(int i = 1 ; i < matrix ; i ++){
for(int j = 1 ; j < matrix ; j++){
hBlockAll[i][j] = 0;
}
}

^

As far as I know, C/C++ does not provide a “multi-dimensional index operator” like you seem to think.

Therefore this code is probably totally wrong… C interprets hBlockAll[i][j] on an int** as indexing an array of pointers, each of which points to an array of ints.

But that’s not what your malloc does… your malloc allocates a single 1D block, and no row pointers are ever stored in it.

So to solve it you need to do:

hBlockAll[i * Width + j] = 0;

^ something like that.

So in your case something like:
hBlockAll[i * matrix + j] = 0;

Since matrix appears to be your width and height.
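Put together, the flattened version could look roughly like this on the host side. This is only a sketch: make_grid is a made-up helper name, and it assumes the grid becomes a plain int* instead of the int** in the posted code.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate an n-by-n grid as one contiguous block and zero it.
   Returns NULL if the allocation fails; the caller releases it with free(). */
int *make_grid(int n)
{
    assert(n > 0);
    int *grid = (int *)malloc((size_t)n * n * sizeof(int));
    if (grid == NULL)
        return NULL;
    for (int i = 0; i < n; i++) {          /* indices start at zero */
        for (int j = 0; j < n; j++) {
            grid[i * n + j] = 0;           /* row i, column j */
        }
    }
    return grid;
}
```

The device copy would then need its own separate int* obtained from cudaMalloc, with cudaMemcpy given the pointers themselves rather than their addresses. The posted code calls cudaMalloc on hBlockAll after malloc, which replaces the host pointer with a device pointer, and that is most likely why the host loop crashes.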

But i and j should start at zero… so to me it seems you’re a noobie programmer and a noobie C programmer :)

Good luck! =D

I have seen plenty of weird C code by now :)

So what NVIDIA can learn from this is: “noobies, beginners and average programmers” want to program CUDA too…

But C/C++ is probably way too difficult for them.

So NVIDIA would be wise to add other languages like FreeBASIC/BASIC and/or Pascal, or perhaps even Java, or anything that’s easier to program.

C has multi-dimensional arrays. However, the commonly used “a one-dimensional array and a pointer can be used interchangeably” trick doesn’t apply there, so the double pointer **a cannot be used in place of a two-dimensional array.

Declare a 2-dimensional array with

int a[SIZE_X][SIZE_Y];

and a pointer to a 2-dimensional array with

int (*p)[SIZE_Y];

Note that SIZE_X and SIZE_Y must be known at compile time. Variable-length arrays were only introduced with C99 and, AFAIK, are not available in CUDA. If you want to set the array size at runtime, you need to flatten the array into a 1-dimensional array as Skybuck showed.
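As a small illustration of the fixed-size case, here is a sketch in plain C. The values of SIZE_X and SIZE_Y and the name sum_grid are made up for the example; the pointer is declared as a pointer to a row of SIZE_Y ints, which is what lets p[i][j] work.

```c
#include <assert.h>

#define SIZE_X 4   /* hypothetical sizes, fixed at compile time */
#define SIZE_Y 3

/* Sum every element through a pointer to a row of SIZE_Y ints.
   p[i][j] works because the compiler knows each row is SIZE_Y ints wide. */
int sum_grid(int (*p)[SIZE_Y])
{
    int total = 0;
    assert(p != 0);
    for (int i = 0; i < SIZE_X; i++)
        for (int j = 0; j < SIZE_Y; j++)
            total += p[i][j];
    return total;
}
```

A caller would declare int a[SIZE_X][SIZE_Y] and pass a directly. An int** cannot be passed here, because it carries no row width for the compiler to use.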
