cudaMemcpyToSymbol reversing order representation of byte array


I’m new to CUDA and I’m losing my mind over a strange behaviour of cudaMemcpyToSymbol.

I need to copy an array of 20 unsigned char (bHash_h) from the host into 5 unsigned long (bHash) in the device constant memory.

The code is:

[codebox]unsigned char bHash_h[20] = { 0x62, 0x6D, 0x36, 0xE7, 0x11, 0x78, 0x26, 0xCF, 0xF4, 0xD6, 0xC5, 0xB5, 0x66, 0x9B, 0x08, 0xAF, 0xD9, 0x95, 0x14, 0x9A};

int main(int argc, char* argv[])


rc = cudaMemcpyToSymbol( "bHash", bHash_h, 20);


/******* device code ***********/

__constant__ unsigned long bHash[5];

__global__ void myFunc( unsigned long *Result )


printf(" bhash[0] %08X \n", bHash[0]);



The output is :


instead of:


How can it be?

I tried to use

rc = cudaMemcpyToSymbol( "bHash", bHash_h, 20, 0, cudaMemcpyHostToDevice);

but it didn’t work…

Thanks for your time in advance!!


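The output is correct. IA-32 and x86_64 are little-endian architectures: the first byte of a word in memory is the least significant byte and the last is the most significant byte. cudaMemcpyToSymbol copies the bytes unchanged, so each group of four bytes reads back as a word with its bytes in the opposite order from how you wrote them. You can read more here if you haven’t come across the concept before.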
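To see the effect without involving the GPU at all, here is a minimal host-side sketch in plain C. It assumes a 32-bit word, which matches the 20 bytes into 5 words in the question. The helper names load_native32 and host_is_little_endian are made up for illustration and are not part of CUDA.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret four consecutive bytes as one 32-bit
   word, exactly as a raw byte-for-byte copy would leave them in memory. */
static uint32_t load_native32(const unsigned char *bytes)
{
    uint32_t word;
    assert(bytes != NULL);
    memcpy(&word, bytes, sizeof word);  /* no reordering happens here */
    return word;
}

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void)
{
    const unsigned char probe[4] = { 0x01, 0x00, 0x00, 0x00 };
    return load_native32(probe) == 0x00000001u;
}
```

On a little-endian host, calling load_native32 on the first four bytes of bHash_h (0x62, 0x6D, 0x36, 0xE7) gives 0xE7366D62, because 0x62 sits at the lowest address and is therefore the least significant byte. Nothing was reversed by the copy; the bytes are in the same order, and only their interpretation as a word differs from what was expected.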

Oops, it is an endianness issue… Thank you!! So I guess I need to swap the bytes, or is there another (faster) method?
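One option is to do the swap once on the host, before the copy, so the kernel never has to touch it. The following is a sketch only, assuming 32-bit words and that the first byte of each group of four should be the most significant one. pack_be32 is a hypothetical helper name, not a CUDA function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack `nwords` groups of four bytes into 32-bit
   words, treating the first byte of each group as the most significant.
   Uses shifts, so the result does not depend on the host's endianness. */
static void pack_be32(const unsigned char *bytes, uint32_t *words, size_t nwords)
{
    size_t i;
    assert(bytes != NULL && words != NULL);
    for (i = 0; i < nwords; ++i) {
        words[i] = ((uint32_t)bytes[4 * i]     << 24) |
                   ((uint32_t)bytes[4 * i + 1] << 16) |
                   ((uint32_t)bytes[4 * i + 2] <<  8) |
                    (uint32_t)bytes[4 * i + 3];
    }
}
```

With the array from the question, pack_be32(bHash_h, words, 5) leaves words[0] equal to 0x626D36E7, and the 20-byte words array could then be passed to cudaMemcpyToSymbol in place of bHash_h. Since this runs once over 20 bytes on the host, its cost should be negligible. If the swap has to happen on the device instead, the __byte_perm intrinsic with selector 0x0123 is, as far as I know, the usual way to reverse the bytes of a 32-bit word.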