Hi,
Can I use cudaMemcpyAsync() for multiple variables?
For example:
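for (int i = 0; i < StreamSize; ++i) {
    // third argument is the size in bytes; a goes to d_a, b goes to d_b
    cudaMemcpyAsync(d_a, a, N, cudaMemcpyHostToDevice, stream1);
    cudaMemcpyAsync(d_b, b, N, cudaMemcpyHostToDevice, stream1);
}
for (int i = 0; i < StreamSize; ++i) {
    my_kernel<<<1, N, 0, stream1>>>(d_a, d_b);
}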
Thank you, that was exactly what I wanted to avoid.
I’ve already implemented this by gathering my values into a bigger array, buffer[ [a], [b] ], so I was wondering whether it was possible to avoid all the memcpys into my buffer and just send a,b,… to my card. :)
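
To make the idea concrete, here is a minimal sketch of sending a and b separately, without packing them into one buffer first. It relies on operations queued in the same stream running in the order they were issued, so the kernel only starts after both copies have finished. The names add_kernel and run_demo, the element count, and the float type are my assumptions for illustration, not taken from the code above. The host arrays are allocated with cudaMallocHost (pinned memory), since copies from ordinary pageable memory are generally not fully asynchronous.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Bail out on the first CUDA error (a sketch: resources are not freed on failure).
#define CHECK(call) do { if ((call) != cudaSuccess) return false; } while (0)

// Hypothetical stand-in for my_kernel: adds b into a element-wise.
__global__ void add_kernel(float *d_a, const float *d_b, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d_a[i] += d_b[i];
}

// Returns true if the device result matches what the host expects.
bool run_demo() {
    const int n = 256;                       // element count (assumed)
    const size_t bytes = n * sizeof(float);  // cudaMemcpyAsync takes bytes

    float *a = nullptr, *b = nullptr, *d_a = nullptr, *d_b = nullptr;
    cudaStream_t stream;

    // Pinned host memory, so the copies can overlap with other work.
    CHECK(cudaMallocHost((void **)&a, bytes));
    CHECK(cudaMallocHost((void **)&b, bytes));
    CHECK(cudaMalloc((void **)&d_a, bytes));
    CHECK(cudaMalloc((void **)&d_b, bytes));
    CHECK(cudaStreamCreate(&stream));

    for (int i = 0; i < n; ++i) { a[i] = (float)i; b[i] = 2.0f * i; }

    // Two separate copies, one per variable, queued on the same stream.
    CHECK(cudaMemcpyAsync(d_a, a, bytes, cudaMemcpyHostToDevice, stream));
    CHECK(cudaMemcpyAsync(d_b, b, bytes, cudaMemcpyHostToDevice, stream));

    // Same stream, so the kernel runs after both copies complete.
    add_kernel<<<1, n, 0, stream>>>(d_a, d_b, n);
    CHECK(cudaGetLastError());

    // Copy the result back, then wait for everything queued on the stream.
    CHECK(cudaMemcpyAsync(a, d_a, bytes, cudaMemcpyDeviceToHost, stream));
    CHECK(cudaStreamSynchronize(stream));

    bool ok = true;
    for (int i = 0; i < n; ++i) ok = ok && (a[i] == 3.0f * i);

    cudaFree(d_a); cudaFree(d_b);
    cudaFreeHost(a); cudaFreeHost(b);
    cudaStreamDestroy(stream);
    return ok;
}

int main() {
    printf("%s\n", run_demo() ? "result matches" : "mismatch or CUDA error");
    return 0;
}
```

Whether this beats one big packed copy depends on the sizes involved: each cudaMemcpyAsync call has some fixed overhead, so for many small variables a single larger transfer may still be faster.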