cudaMallocPitch seg fault: I've probably misunderstood the docs but can't see where

I have some fully working code that does Java => JNI => C++ => Kernel => C++ => JNI => Java, so I think I’ve got most of the problems covered.

However, when I try to expand on the code (incrementally moving work from host to device) and add the segment below, it compiles fine but appears to segfault when cudaMallocPitch executes.

Most likely I haven’t quite understood the documentation for MallocPitch, but as far as I can tell, I’m doing it right.

[codebox]vector<vector<float> > inputVector;
//Read file and put data into inputVector. When done,
//inputVector basically looks like inputVector[59][10000]
float* kernel_inputArray;
size_t* size;
int error = 0;
error = cudaMallocPitch((void**)&kernel_inputArray, kernel_size, inputVector.size() * sizeof(float), inputVector[0].size());[/codebox]

It would be helpful if you stated what kernel_size is.

I would look over the reference manual if I were you; you’re misusing cudaMallocPitch().

Hey Mr_Nuke, thank you. I was so sure I had gotten it right.

I removed the * from kernel_size and added & to it in the cudaMallocPitch invocation.

While cudaMallocPitch now executes as expected, and I again feel certain I’m doing it as in the manual, I still get the feeling that’s not quite what you meant?

Thanks for taking the time anyway.

The way you were doing it implied that you were trying to impose a pitch of kernel_size on the allocation. The pitch is decided by the function and returned in *pitch. That’s what I meant when I emphasised ‘returned’.
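To make the "pitch is an output" point concrete, here is a rough host-only sketch. `hostMallocPitch`, `rowPtr` and `packPitched` are hypothetical names invented for illustration: the first only mimics the parameter shape of `cudaMallocPitch(void** devPtr, size_t* pitch, size_t width, size_t height)` so the example runs without a GPU, and its 512-byte alignment is an assumption, since the real runtime chooses the pitch itself. The sketch also treats each inner vector as one row, which is a choice made here and not something taken from the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical host-only stand-in with the same parameter shape as
// cudaMallocPitch. The 512-byte alignment is an assumption for illustration.
int hostMallocPitch(void** ptr, std::size_t* pitch,
                    std::size_t widthBytes, std::size_t height) {
    const std::size_t align = 512;                      // assumed alignment
    *pitch = (widthBytes + align - 1) / align * align;  // OUTPUT: set by the allocator
    *ptr = std::malloc(*pitch * height);
    return *ptr ? 0 : 1;                                // 0 plays the role of cudaSuccess
}

// Start of row `row` in a pitched allocation: step by pitch BYTES, not elements.
float* rowPtr(void* base, std::size_t pitch, std::size_t row) {
    return reinterpret_cast<float*>(static_cast<char*>(base) + row * pitch);
}

// Allocates a pitched buffer, copies `rows` into it, and returns the pitch
// (0 on allocation failure). All inner vectors must have the same length.
std::size_t packPitched(const std::vector<std::vector<float>>& rows, void** out) {
    assert(!rows.empty());
    std::size_t pitch = 0;  // a real size_t object, not an uninitialised size_t*
    const std::size_t widthBytes = rows[0].size() * sizeof(float);

    // Pass the ADDRESS of pitch so the allocator has somewhere to write the result.
    if (hostMallocPitch(out, &pitch, widthBytes, rows.size()) != 0) {
        *out = nullptr;
        return 0;
    }
    for (std::size_t r = 0; r < rows.size(); ++r) {
        assert(rows[r].size() == rows[0].size());
        std::memcpy(rowPtr(*out, pitch, r), rows[r].data(), widthBytes);
    }
    return pitch;
}
```

With the real API the corresponding pattern would be `size_t pitch;` followed by `cudaMallocPitch((void**)&kernel_inputArray, &pitch, widthBytes, height);`, after which `pitch` holds whatever row stride the runtime picked. Passing an uninitialised `size_t*` instead gives the function a garbage address to write to, which would explain the segfault.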