Built-in Vector Type Memory Allocation


I am unable to find any documentation about memory allocation for built-in vector types such as char4.

I want to do the following in CUDA:


Declare a char4 array on the host using malloc, such as char4 *N = (char4 *)malloc(length * sizeof(char4));

Declare a char4 pointer for the device and allocate it using cudaMalloc, such as char4 *dN; cudaMalloc((void **)&dN, length * sizeof(char4));

Afterward, I want to use cudaMemcpy to copy the host char4 data to the device char4 buffer.

Any suggestions?

Best Regards,


Do exactly as you have written.
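
For reference, here is a minimal sketch of that sequence. The function name roundTripChar4 and the buffer name back are made up for illustration and are not from the question. char4 is an ordinary 4-byte struct, so malloc, cudaMalloc, and cudaMemcpy treat it like any other type. Note that the device handle is declared as a pointer (char4 *dN) and that its address is passed to cudaMalloc as void **.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical helper (name is an assumption, not from the question):
// copies `length` char4 elements host -> device -> host and returns true
// if the data comes back unchanged.
bool roundTripChar4(size_t length)
{
    const size_t bytes = length * sizeof(char4);  // char4 holds four chars

    // Host buffers: plain malloc, as in the question.
    char4 *N    = (char4 *)malloc(bytes);
    char4 *back = (char4 *)malloc(bytes);         // receives the copy-back
    if (N == NULL || back == NULL) {
        free(N);
        free(back);
        return false;
    }

    // Fill the host array with some recognizable values.
    for (size_t i = 0; i < length; ++i) {
        N[i].x = (signed char)(i % 100);
        N[i].y = 1;
        N[i].z = 2;
        N[i].w = 3;
    }

    // Device buffer: dN is a pointer, and cudaMalloc takes its address.
    char4 *dN = NULL;
    cudaError_t err = cudaMalloc((void **)&dN, bytes);

    // Host -> device, then device -> host to check the result.
    if (err == cudaSuccess)
        err = cudaMemcpy(dN, N, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = cudaMemcpy(back, dN, bytes, cudaMemcpyDeviceToHost);

    if (err != cudaSuccess)
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));

    const bool ok = (err == cudaSuccess) && memcmp(N, back, bytes) == 0;

    cudaFree(dN);  // freeing a NULL pointer is allowed
    free(N);
    free(back);
    return ok;
}
```

Calling roundTripChar4(1024) from main on a machine with a working CUDA device should return true. Checking the cudaError_t return values is optional for a quick test, but it makes allocation or copy failures much easier to spot.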

Thanks for your reply.

I just wanted to confirm this syntax, as I have not found any example related to it.