Sending std::vector to kernel

mischan · March 6, 2009, 4:27pm

Hi all,

I am having trouble figuring out how a CUDA kernel receives a std::vector.

My host code is something like:

#include <vector>

using namespace std;

const int N = 10;  

vector<int> tmp;

for(int i = 0; i < N; i++) tmp.push_back(i);

vector<int>* tmp2;

cudaMalloc((void**)&tmp2, tmp.size() * sizeof(int));

cudaMemcpy(tmp2, &tmp, tmp.size() * sizeof(int), cudaMemcpyHostToDevice);

(First: is the above part right in and of itself?)

Then I call the kernel:

square_vector <<< nblocks, block_size >>> (tmp2, N);

The kernel:

__global__ void square_vector(int *a, int N)

What should the specification of the kernel incoming parameters be? How do you pass to it a vector?

And then how would I index the vector components?

–

I know some people will probably reply and say not to use C++ vectors, but I know this has been done successfully, and I’d like to know what I’m doing wrong above.

Thank you!

mischan · March 6, 2009, 9:21pm

Oh man, this took way longer than it should have to get working, but here is the solution for some other poor souls out there:

vector<int> tmp;

for(int i = 0; i < N; i++) tmp.push_back(i);

size_t size = N * sizeof(int);

cudaArray* cuArray = 0;

cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc<int>();

cudaError_t error = cudaMallocArray( &cuArray, &channelDesc, N, 1);  // width is given in elements (N), not in bytes

cudaError_t error1= cudaMemcpyToArray(cuArray, 0, 0, (void*)&tmp[0], size, cudaMemcpyHostToDevice);

Then pass it along as a cudaArray. Note that a kernel cannot index a cudaArray directly; it is read through a texture or surface bound to the array.
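
For anyone who only needs the kernel signature from the first post (int *a, int N) to work, plain linear device memory may be the simpler route. The sketch below is not the code from this thread: square_on_device is a hypothetical helper name, the block size of 256 is an arbitrary choice, and the kernel body is a guess at what square_vector was meant to do. The two changes relative to the first post are that the device pointer is an int* rather than a vector<int>*, and that the copy source is &v[0] (the first element) rather than &tmp (the vector object itself).

```cuda
#include <vector>
#include <cuda_runtime.h>

// Each thread squares one element. The kernel only ever sees a raw int array,
// so elements are indexed as a[i], exactly like a C array.
__global__ void square_vector(int *a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                      // the last block may have spare threads
        a[i] = a[i] * a[i];
}

// Hypothetical helper (not from the thread): squares v in place on the GPU.
// Returns false if any CUDA call reports an error.
bool square_on_device(std::vector<int> &v)
{
    if (v.empty()) return true;

    const int n = static_cast<int>(v.size());
    const size_t bytes = v.size() * sizeof(int);

    int *d_a = 0;                   // device pointer to ints, not to a vector
    if (cudaMalloc((void**)&d_a, bytes) != cudaSuccess) return false;

    // &v[0] points at the contiguous element storage; &v would point at the
    // vector object (its internal bookkeeping), which is not what we want.
    bool ok = cudaMemcpy(d_a, &v[0], bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        const int block_size = 256; // assumed value, tune as needed
        const int nblocks = (n + block_size - 1) / block_size;
        square_vector<<<nblocks, block_size>>>(d_a, n);

        // The device-to-host copy waits for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(&v[0], d_a, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    cudaFree(d_a);
    return ok;
}
```

With the vector from the first post (0 through 9), calling square_on_device(tmp) should leave tmp holding 0, 1, 4, and so on up to 81. The std::vector itself never reaches the device; only its elements do, which is why the kernel parameter stays a plain int*.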