Hey all, I was wondering if anyone knows whether cudaMallocPitch / cudaMemcpy2D / cudaMemcpy3D etc. are necessary when working with 2D/3D data, or do they just optimize the code so that memory accesses are coalesced?
I have a 2D code that uses cudaMallocPitch etc., but I am trying to write a 3D code and it seems to be such a hassle to use cudaMalloc3D etc. If I just use a linear array with cudaMalloc() and access it through index computation (like i*M + j*L*M + k, for example), will the code work? Will it be THAT much slower?
They’re not necessary, but you will get much better performance if your volume has an awkward size (a row width in bytes that is not a multiple of the alignment size), because cudaMalloc3D pads each row so that the start of every row is aligned for coalesced reads. It’s not that much hassle - there’s an example in the programming guide.
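For comparison, here is a rough sketch of both approaches side by side: a plain cudaMalloc() buffer with hand-computed indices, and a cudaMalloc3D() allocation addressed through its pitch. The kernel names, the volume dimensions (nx, ny, nz) and the CHECK macro are made up for illustration, and the pitch you get back depends on your device, so treat it as a starting point rather than a reference implementation.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err)); \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Linear layout: rows are packed back to back, x varies fastest.
__global__ void fillLinear(float* vol, int nx, int ny, int nz)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int z = blockIdx.z * blockDim.z + threadIdx.z;
    if (x >= nx || y >= ny || z >= nz) return;

    size_t idx = (size_t)z * ny * nx + (size_t)y * nx + x;
    vol[idx] = (float)(x + y + z);
}

// Pitched layout: each row is padded to vol.pitch bytes, so row
// addresses are computed in bytes before casting back to float*.
__global__ void fillPitched(cudaPitchedPtr vol, int nx, int ny, int nz)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int z = blockIdx.z * blockDim.z + threadIdx.z;
    if (x >= nx || y >= ny || z >= nz) return;

    char*  base       = (char*)vol.ptr;
    size_t slicePitch = vol.pitch * ny;  // bytes per z-slice
    float* row        = (float*)(base + z * slicePitch + y * vol.pitch);
    row[x] = (float)(x + y + z);
}

int main()
{
    // Assumed sizes: 100 floats per row is 400 bytes, which is not a
    // multiple of the usual alignment, so the pitched rows get padded.
    const int nx = 100, ny = 64, nz = 32;

    float* linear = nullptr;
    CHECK(cudaMalloc(&linear, sizeof(float) * nx * ny * nz));

    // For linear memory the extent width is given in bytes.
    cudaExtent extent = make_cudaExtent(nx * sizeof(float), ny, nz);
    cudaPitchedPtr pitched;
    CHECK(cudaMalloc3D(&pitched, extent));

    dim3 block(32, 4, 2);
    dim3 grid((nx + block.x - 1) / block.x,
              (ny + block.y - 1) / block.y,
              (nz + block.z - 1) / block.z);

    fillLinear<<<grid, block>>>(linear, nx, ny, nz);
    fillPitched<<<grid, block>>>(pitched, nx, ny, nz);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    printf("row bytes requested: %zu, pitch returned: %zu\n",
           nx * sizeof(float), pitched.pitch);

    CHECK(cudaFree(linear));
    CHECK(cudaFree(pitched.ptr));
    return 0;
}
```

Both kernels produce the same logical volume; the only difference is the extra pitch arithmetic in the second one. If your row width already happens to be a multiple of the alignment, the two layouts should behave about the same, so it is worth timing both on your actual sizes before committing to either.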