Size_t

f600 · September 29, 2020, 9:02pm

I am used to using size_t for variables such as indices into large arrays, or for other quantities that can hold large integer values, in other programming languages such as C.

After some experiments with CUDA, I found that if I use
size_t indx = blockIdx.x*blockDim.x + threadIdx.x;
I get bad results (for memory elements, whether in global or shared memory),

while if I change it to
int indx = blockIdx.x*blockDim.x + threadIdx.x;
the results are correct.

Am I correct to conclude that I should not use size_t in CUDA?

njuffa · September 29, 2020, 10:54pm

Can you show a minimal self-contained repro code that demonstrates the issue? Also, include the exact nvcc command line used to compile the code.

There is no reason why size_t shouldn’t be fully functional; however, you may want to avoid it for performance reasons. GPUs are 32-bit machines, and 64-bit integer arithmetic must therefore be emulated. size_t is a 64-bit unsigned integer type, the same width as unsigned long long int, on any supported 64-bit platform.

In most contexts, enumerating data objects with int or unsigned int will suffice.

f600 · September 30, 2020, 11:05am

Thanks for your reply. I am running on 64-bit Linux (Fedora), so based on your answer I guess this may be the reason, given that the GPU is a 32-bit machine.

At any rate, per your request, I attach two files. One is called “size_t.txt” which shows the problem. The other is called “int.txt” which works well.

It all comes down to changing the size_t to int in the definition of the index to make it work. I am not sure why (I am still a learner of CUDA).

size_t.txt (8.8 KB)
int.txt (8.2 KB)

Cheers

njuffa · September 30, 2020, 11:22am

I am using CUDA 9.2 on Windows. I don’t know whether %zu is supported by device-side printf; please check the documentation. I changed all instances of the format specifier %zu to %llu. With that change, the output of the two program variants matches on my system (Kepler-based Quadro K420), and cuda-memcheck has no complaints.
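For illustration, the %zu to %llu change can be sketched as below. This is not the code from the attached files; the kernel name print_index, the helper global_index, and the launch configuration are made up for the example. The explicit cast to unsigned long long is an extra precaution so that the argument type matches %llu exactly on every platform.

```cuda
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper (not from the attached files): same index expression
// as in the question, with the result stored in a size_t.
__host__ __device__ inline size_t global_index(unsigned block, unsigned dim, unsigned thread)
{
    return block * dim + thread;
}

// Hypothetical kernel: each thread prints its own global index.
__global__ void print_index()
{
    size_t indx = global_index(blockIdx.x, blockDim.x, threadIdx.x);
    // Use %llu with an explicit cast instead of %zu in device code.
    printf("device: indx = %llu\n", static_cast<unsigned long long>(indx));
}

int main()
{
    // Host-side sanity check of the helper.
    assert(global_index(1, 4, 3) == 7);

    print_index<<<2, 4>>>();  // 2 blocks of 4 threads, chosen arbitrarily

    // Synchronizing also flushes the device-side printf buffer.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Host-side printf handles %zu as usual.
    size_t n = 8;
    printf("host: n = %zu\n", n);
    return 0;
}
```

Something like nvcc -o print_index print_index.cu should build it (the file name is arbitrary). The order of the device-side lines is not guaranteed.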

f600 · September 30, 2020, 11:30am

I can indeed see that your choice of %llu solves this problem.

So I guess the problem was the choice of **printf** format specifier on the GPU.

Interestingly, the printf on the host has no problem with %zu.

Thanks for your solution.
