Leaving cuda-gdb out of it for a moment: there is no support for long double in CUDA device code, so if you use it in device code anyway, I would suggest expecting UB. All bets are off.
What cuda-gdb does in that case, I don’t know, but the folks on the cuda-gdb forum may. My guess would be that cuda-gdb is behaving in a sensible fashion, and that what you are witnessing is some combination of the fact that long double is unsupported in device code and the warning you get when you try to compile this test case:
$ cat t3.cu
#include <cstdio>

__global__ void k(long double *d){
  long double xx[2] = {1.0L / 10.0L, 2.0L / 10.0L};
  unsigned char *lxx = reinterpret_cast<unsigned char *>(xx);
  unsigned char *ld = reinterpret_cast<unsigned char *>(d);
  for (int i = 0; i < 32; i++)
    printf("%d, %d\n", (int)(lxx[i]), (int)(ld[i]));
}

int main(){
  long double xx[2] = {1.0L / 10.0L, 2.0L / 10.0L};
  long double *d;
  cudaMalloc(&d, 2*sizeof(long double));
  cudaMemcpy(d, xx, 2*sizeof(long double), cudaMemcpyHostToDevice);
  k<<<1,1>>>(d);
  cudaDeviceSynchronize();
  double xxd[4] = {1.0L / 10.0L, 2.0L / 10.0L, 0, 0};
  unsigned char *lxx = reinterpret_cast<unsigned char *>(xx);
  unsigned char *lxxd = reinterpret_cast<unsigned char *>(xxd);
  for (int i = 0; i < 32; i++)
    printf("%d,%d\n", (int)(lxxd[i]), (int)(lxx[i]));
}
$ nvcc -o t3 t3.cu
t3.cu(4): warning #20208-D: 'long double' is treated as 'double' in device code
long double xx[2] = {1.0L / 10.0L, 2.0L / 10.0L};
^
Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
t3.cu(4): warning #20208-D: 'long double' is treated as 'double' in device code
t3.cu(4): warning #20208-D: 'long double' is treated as 'double' in device code
$ compute-sanitizer ./t3
========= COMPUTE-SANITIZER
154, 205
153, 204
153, 204
153, 204
153, 204
153, 204
185, 204
63, 204
154, 251
153, 63
153, 52
153, 239
153, 217
153, 73
201, 21
63, 181
0, 205
0, 204
0, 204
0, 204
0, 204
0, 204
0, 204
0, 204
0, 252
0, 63
0, 0
0, 0
0, 0
0, 0
0, 0
0, 0
154,205
153,204
153,204
153,204
153,204
153,204
185,204
63,204
154,251
153,63
153,52
153,239
153,217
153,73
201,21
63,181
0,205
0,204
0,204
0,204
0,204
0,204
0,204
0,204
0,252
0,63
0,0
0,0
0,0
0,0
0,0
0,0
========= ERROR SUMMARY: 0 errors
$
(CUDA 12.5)
Of course, we already have the statement that it’s unsupported, so if you try to use it, it might be UB, and writing a test case and then trying to infer something from it might be a fool’s errand. However, if we push past that, it seems to me that:
- The behavior of that long double declaration manifests differently in host code and in device code.
- The behavior of the declaration in device code might be indicated by the warning, whatever that warning means. It looks fairly literal to me. It appears to behave almost exactly as if you replaced long double with double. (You could nitpick about the total size of the storage space, perhaps.)
- I don’t see any evidence that cuda-gdb is doing anything untoward. It appears to be using the host definition of long double (quite sensible to me: it’s unsupported in device code, but we might want to inspect a structure that happens to be carrying one), and interpreting the bytes it finds according to that. Since use of long double in device code is unsupported, that might actually be the most useful behavior for cuda-gdb to manifest.