Can anyone here help me? I want to build a 3D array using cudaMalloc. Is that possible, and if so, how?
I wrote my own code. It compiles without errors, but I get an error when I run it.
The error is: Unhandled exception at 0x1000faa5 in CUDA-Try1.exe: 0xC0000005: Access violation writing location 0x00100000.
Here is my code. I'm just a beginner at this, so any help will be appreciated. Thanks :D
You do not allocate multi-dimensional arrays like that in CUDA; performance will be dreadful. Instead, allocate a 1D array big enough to hold all the data, and then compute locations into it (e.g. `index = ix + nx*iy`). This is what you should do on the CPU side too, unless you're looking to benchmark the caching hardware on your chip.
The numbers ix, iy, and nx are (respectively) the desired x index, the desired y index, and the number of elements in the x direction (I'm assuming that the array is stored with the x index running fastest). And copying it to a 3D array on the CPU side? As I said, you should never, never have code looking like `C[i][j][k]` on the CPU, unless you are specifically trying to benchmark the caching hardware.
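To make the indexing concrete for the 3D case asked about, here is a minimal host-side sketch. It assumes the x index runs fastest, so the 2D formula extends to `ix + nx*(iy + ny*iz)`. The names `flat_index` and `make_volume` are made up for illustration and are not from the original posts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: flat offset of element (ix, iy, iz) in an
// nx * ny * nz array stored with the x index running fastest.
inline std::size_t flat_index(std::size_t ix, std::size_t iy, std::size_t iz,
                              std::size_t nx, std::size_t ny) {
    return ix + nx * (iy + ny * iz);
}

// Hypothetical demo: allocate the whole volume as ONE contiguous block and
// store 100*iz + 10*iy + ix in each element so the layout is easy to check.
inline std::vector<float> make_volume(std::size_t nx, std::size_t ny,
                                      std::size_t nz) {
    std::vector<float> data(nx * ny * nz);
    for (std::size_t iz = 0; iz < nz; ++iz) {
        for (std::size_t iy = 0; iy < ny; ++iy) {
            // x is the innermost loop, so memory is walked sequentially.
            for (std::size_t ix = 0; ix < nx; ++ix) {
                assert(ix < nx && iy < ny && iz < nz);
                data[flat_index(ix, iy, iz, nx, ny)] =
                    static_cast<float>(100 * iz + 10 * iy + ix);
            }
        }
    }
    return data;
}
```

On the GPU side the same layout would need a single `cudaMalloc` of `nx*ny*nz*sizeof(float)` bytes and a single `cudaMemcpy` of the host buffer, with the kernel computing the same flat index.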
As a getting-started example, this is a very simple Matrix class.
To the more experienced in C++: this was a quick example, so error checking is not as robust as it might be. The point is that it allocates the 2D matrix as a single block of memory, and overloads the function-call operator (`operator()`) to allow access via 2D indices.
And yes, that code represents an extreme example of cache-cruelty too, and completely ignores the carefully inserted `_mm_malloc`.
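A class along the lines described above might look like the sketch below. This is an illustration, not the listing from the post: it assumes `float` elements and x-fastest storage, and it uses `std::vector` for the single block instead of `_mm_malloc`, so it makes no alignment guarantees. All member names are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal 2D matrix stored as one contiguous block, x index running fastest.
class Matrix {
public:
    Matrix(std::size_t nx, std::size_t ny)
        : nx_(nx), ny_(ny), data_(nx * ny, 0.0f) {}

    // 2D access through operator(): m(ix, iy) maps to data_[ix + nx*iy].
    float& operator()(std::size_t ix, std::size_t iy) {
        assert(ix < nx_ && iy < ny_);  // debug-only bounds check
        return data_[ix + nx_ * iy];
    }
    const float& operator()(std::size_t ix, std::size_t iy) const {
        assert(ix < nx_ && iy < ny_);
        return data_[ix + nx_ * iy];
    }

    std::size_t nx() const { return nx_; }
    std::size_t ny() const { return ny_; }

    // Raw pointer to the single block, e.g. as the source of one cudaMemcpy.
    float* data() { return data_.data(); }
    const float* data() const { return data_.data(); }

private:
    std::size_t nx_;
    std::size_t ny_;
    std::vector<float> data_;
};
```

Usage is then `Matrix m(4, 3); m(1, 2) = 5.0f;`, and because the storage is one block, `m.data()` can be handed to a single copy call rather than one per row.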