I am trying to load a large volume of short integers (256x256x500, possibly larger) into CUDA memory using cudaMalloc. The allocation is successful, but when I access it in the kernel as volume_array[index] at indices larger than a small number, around 10k, I get an unknown error exception. Is there a limit on the size of a single allocation, or is there an alternate (better) way to handle big blocks of data?
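For reference, the allocation and access pattern is roughly like the sketch below (simplified; the kernel body, and names like `volume_array` and `touch`, are placeholders for what the real project does). It uses a grid-stride loop because gridDim.x is capped at 65535 blocks on this generation of hardware, and checks the error code after each CUDA call:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Bounds-checked kernel: grid-stride loop so any launch size covers all voxels.
__global__ void touch(short *volume_array, size_t n) {
    size_t stride = (size_t)gridDim.x * blockDim.x;
    for (size_t index = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
         index < n; index += stride)        // guard against out-of-range access
        volume_array[index] += 1;
}

int main() {
    const size_t n = (size_t)256 * 256 * 500;   // ~32.8M voxels
    const size_t bytes = n * sizeof(short);     // ~65.5 MB

    short *volume_array = NULL;
    cudaError_t err = cudaMalloc((void **)&volume_array, bytes);
    if (err != cudaSuccess) {
        printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    const int threads = 256;
    size_t wanted = (n + threads - 1) / threads;
    const int blocks = wanted > 65535 ? 65535 : (int)wanted;  // gridDim.x limit
    touch<<<blocks, threads>>>(volume_array, n);

    // Surface launch/execution errors here instead of an "unknown error" later.
    err = cudaThreadSynchronize();
    printf("kernel: %s\n", cudaGetErrorString(err));

    cudaFree(volume_array);
    return 0;
}
```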
The GPU shouldn’t be a problem; it’s a Quadro FX 1700.
Indexing shouldn’t be an issue, as I’ve replaced the variable index in volume_array[index] with constant numbers (if that’s what you mean).
Update so far… it seems I had miscalculated: it doesn’t crash for indices 0–262143, but anything above that fails with an unknown error. Could this be an addressing issue?
Apart from that, it seems a bit unstable as well: sometimes it runs without problems (apart from the one mentioned), and sometimes it crashes with an out-of-memory error. That’s with the debug DLL; the release DLL always gives me an out-of-memory error. This could be due to my OS (Vista 64-bit), the drivers, or the actual project. It’s written in Java and uses CUDA through a C DLL.
I will try to post some code, but it might not be possible, as I’m not sure I’m allowed to :(.
Can you make anything of the information I’ve given so far?