In the ancient lands where my ancestors come from, unsigned int is a 16-bit integer that takes up 2 bytes of memory.
It holds values from 0 to 65535.
An array of unsigned int would therefore be aligned on 2-byte boundaries.
Also in my lands, long int is a 32-bit integer that takes up 4 bytes of memory.
It holds values from about -2.1 billion to 2.1 billion (-2,147,483,648 to 2,147,483,647).
An array of long int would therefore be aligned on 4-byte boundaries.
If the above is true (and it might not be anymore), it would be highly stupid to mix-and-match long int array pointers with unsigned int array pointers.
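Whether those sizes still hold can be checked by asking the compiler directly. Here is a minimal sketch in standard C++; TypeLayout, layoutOf and sameLayout are names I made up for illustration, not anything from the SDK.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical record of how a type is laid out by the compiler building this file.
struct TypeLayout {
    std::size_t bytes;      // sizeof(T)
    std::size_t alignment;  // alignof(T)
    std::size_t bits;       // sizeof(T) * CHAR_BIT
};

// Reports the size, alignment and bit width of T on this compiler.
template <typename T>
constexpr TypeLayout layoutOf() {
    return { sizeof(T), alignof(T), sizeof(T) * CHAR_BIT };
}

// True when pointers to A and B address elements of the same size and alignment.
template <typename A, typename B>
constexpr bool sameLayout() {
    return sizeof(A) == sizeof(B) && alignof(A) == alignof(B);
}
```

Printing layoutOf<unsigned int>() and layoutOf<long>() from a small main shows what the compiler at hand actually uses. The C++ standard itself only promises at least 16 bits for unsigned int and at least 32 bits for long, so the answer depends on the platform.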
Despite this dangerous stupidity, below is some code that comes packaged with the SDK. In it you can see this brave programmer performing death-defying casts of pointers to unknown data types into unsigned int pointers.
If I compile this code on a machine running a 32-bit Windows OS, what assurances do I have that it will work at all?
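One form of assurance that does not depend on the OS is to have the compiler refuse to build when the size assumptions are wrong. The sketch below assumes the sort really does need 32-bit keys (the keybits = 32 in the excerpt suggests so). Key32 and keyBufferBytes are hypothetical names used only for illustration.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// The build stops here if unsigned int is not exactly 32 bits wide.
static_assert(sizeof(unsigned int) * CHAR_BIT == 32,
              "this code assumes a 32-bit unsigned int");

// The build stops here if float keys and unsigned int keys differ in size.
static_assert(sizeof(float) == sizeof(unsigned int),
              "float and unsigned int keys must have the same size");

// Hypothetical alias: std::uint32_t is exactly 32 bits wherever it exists,
// so it states the intent more directly than unsigned int does.
using Key32 = std::uint32_t;

// Number of bytes needed for a buffer holding `count` 32-bit keys.
constexpr std::size_t keyBufferBytes(std::size_t count) {
    return count * sizeof(Key32);
}
```

With checks like these in place, a compiler that still used a 16-bit unsigned int would fail with a readable message instead of producing a binary that sorts garbage.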
Taken from testradixsort.cpp:
//$// $$$ T IS A TEMPLATE DATA TYPE -->

template <typename T, bool floatKeys>
void testSort(int argc, char **argv)
{
    int cmdVal;
    int keybits = 32;
    .
    .
    .
    .


    //$// $$$ d_keys IS ALLOCATED AS A 'T' HERE -->
    // Copy data onto the GPU
    T *d_keys;
    unsigned int *d_values;
    cudaMalloc((void **)&d_keys, numElements*sizeof(T));
    if (!keysOnly)
        cudaMalloc((void **)&d_values, numElements*sizeof(unsigned int));
    else
        d_values = 0;

    // Create the RadixSort object
    nvRadixSort::RadixSort radixsort(numElements, keysOnly);

    cudaMemcpy(d_keys, h_keys, numElements * sizeof(T), cudaMemcpyHostToDevice);
    if (!keysOnly)
        cudaMemcpy(d_values, h_values, numElements * sizeof(unsigned int), cudaMemcpyHostToDevice);
    .
    .
    .
    .
    //$// $$$ AT SORT TIME HE CASTS d_keys AS AN UNSIGNED INT !
    //$// $$$ WHILE THIS MIGHT WORK ON A UNIX OS, THIS IS EXTREMELY
    //$// $$$ DANGEROUS AND BAD CODING ON A MSWINDOWS MACHINE.
    if (floatKeys)
        radixsort.sort((float*)d_keys, d_values, numElements, keybits, true);
    else
        radixsort.sort((unsigned int*)d_keys, d_values, numElements, keybits);
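The cast in that last line could be made less death-defying by routing it through a helper that rejects any T whose layout does not match unsigned int. This is only a sketch of the idea, written as plain host-side C++. asUintKeys is a hypothetical helper, not part of the SDK, and I am assuming the sort only needs the pointer reinterpreted, not the data converted.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: reinterprets a key pointer as unsigned int* only when
// T has the same size as unsigned int and is at least as strictly aligned.
// It converts the pointer only and never reads through it.
template <typename T>
unsigned int *asUintKeys(T *keys) {
    static_assert(sizeof(T) == sizeof(unsigned int),
                  "key type must be the same size as unsigned int");
    static_assert(alignof(T) >= alignof(unsigned int),
                  "key type must be aligned at least as strictly as unsigned int");
    static_assert(std::is_trivially_copyable<T>::value,
                  "key type must be plain data");

    // Runtime check that the address itself is suitably aligned.
    assert(reinterpret_cast<std::uintptr_t>(keys) % alignof(unsigned int) == 0);

    return reinterpret_cast<unsigned int *>(keys);
}
```

Calling radixsort.sort(asUintKeys(d_keys), d_values, numElements, keybits) in place of the raw cast would then fail to compile if testSort were ever instantiated with a 2-byte or 8-byte T, which is exactly the mix-and-match case I am worried about.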