Is it illegal to declare a __device__ float3 and then use it in both a __global__ and a __device__ function?
The __global__ function calls the __device__ function.
So this kind of ‘global’ variable is not allowed in a .cu file, right?
I think it is possible to declare a __device__ float3 ... and then use it in both a __global__ and a __device__ function.
Thanks, but I found that the data is not right. However, if I declare it in the __global__ function and then pass it to the __device__ function as a parameter, it’s OK. I don’t know why…
if you provide a short but complete example of something that is not working, I think it’s likely that someone on this forum can explain it.
just like this:
__device__ float3 var{0.0f, 0.0f, 0.0f};

__device__ void devFunc()
{
    //use and maybe change the value of 'var';
}

__global__ void kernel()
{
    //use and maybe change the value of 'var';
    //also call devFunc();
}
It is just like this, but of course much more complicated and with more than one device function. Sorry, I don’t know how to show it to you…
It is very weird that this changes the rendered content: the image rendered by this kernel changed when I moved var outside all the functions to use it like a global variable, while it’s OK to declare it in the kernel and pass it to all the device functions that need it.
Thanks!
I don’t seem to have any trouble with what you have shown:
# cat t44.cu
#include <cstdio>

__device__ float3 var{0.0f, 0.0f, 0.0f};

__device__ void devFunc()
{
    //use and maybe change the value of 'var';
    printf("var.y: %f\n", var.y);
    var.y = 2.0f;
}

__global__ void kernel()
{
    //use and maybe change the value of 'var';
    printf("var.y: %f\n", var.y);
    var.y = 1.0f;
    //also call devFunc();
    devFunc();
    printf("var.y: %f\n", var.y);
}

int main(){
    kernel<<<1,1>>>();
    cudaDeviceSynchronize();
}
# nvcc -o t44 t44.cu
# compute-sanitizer ./t44
========= COMPUTE-SANITIZER
var.y: 0.000000
var.y: 1.000000
var.y: 2.000000
========= ERROR SUMMARY: 0 errors
#
CUDA 12.2
I think the problem, whatever it may be, lies in something you haven’t shown.
If I were initializing __device__ variables the way you are here, I would certainly use CUDA 11.3 or newer. Anyway, I won’t be able to comment further unless you can show an actual case that actually fails. You have not provided what I asked for, and I cannot debug code you haven’t shown. In any event, I believe I have now demonstrated with my complete case that a simple statement like “(it is) Illegal to declare a __device__ float3 then use it both in __global__ and __device__ func” is simply not true. Good luck!
What if I use an older version? In CUDA 11.1 there is no error or warning at compile time.
Is ‘float3 var{0.0f, 0.0f, 0.0f};’ not supported there, whether inside or outside a function?
__device__ float3 var{0.0f, 0.0f, 0.0f};

__device__ void devFunc()
{
    //use and maybe change the value of 'var';
    printf("var.y: %f\n", var.y);
    var.y = 2.0f;
}

__global__ void kernel()
{
    float3 var{0.0f, 0.0f, 0.0f};//add this, also no error while compile...the name is duplicated...
    //use and maybe change the value of 'var';
    printf("var.y: %f\n", var.y);
    var.y = 1.0f;
    //also call devFunc();
    devFunc();
    printf("var.y: %f\n", var.y);
}

__device__ float3 var{0.0f, 0.0f, 0.0f};

__device__ void devFunc2()
{
    //use and maybe change the value of 'var';
    printf("var.y: %f\n", var.y);
    var.y = 3.0f;
}

__device__ void devFunc()
{
    devFunc2();
    //use and maybe change the value of 'var';
    printf("var.y: %f\n", var.y);
    var.y = 2.0f;
}

__global__ void kernel()
{
    if (idx > width || idy > height) return;
    if (InBoundingbox()==false) return;
    //use and maybe change the value of 'var';
    printf("var.y: %f\n", var.y);
    var.y = 1.0f;
    //also call devFunc();
    devFunc();
    printf("var.y: %f\n", var.y);
}

int main(){
    dim3 grid(w, h);
    dim3 block(16, 16);
    kernel<<<grid,block>>>();
    cudaDeviceSynchronize();
}
This is closer to the actual code, using CUDA 11.1; the actual code is of course much more complicated.
According to the debug log, if I declare the __device__ float3 var above and outside all the functions, its value seems to be messed up, and I don’t know why… It’s OK when I declare it in the __global__ function and then pass it to whatever function needs it, whether as a const float3& or a float3& parameter.
Maybe it’s related to the block and grid size?
E.g. if the launch is not kernel<<<1,1>>>(); but both sizes are larger than 1, and the __device__ float3 var; does not belong to each pixel but to the block? So when it is changed for each pixel, it’s not OK to declare it outside all the functions, right?
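On that last question: a __device__ variable declared at file scope is neither per-pixel nor per-block. There is a single copy for the whole device, and every thread of every block reads and writes that same copy. With kernel<<<1,1>>> only one thread touches it, so nothing goes wrong; with a larger launch the threads overwrite each other's values without any synchronization, which would be consistent with the corrupted values described above. The following is only a minimal sketch of that difference, not the code from this thread: sharedVar, addOne, useGlobal, useLocal and countMismatches are hypothetical names, and the 65536-element, 256-threads-per-block launch is an arbitrary choice.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// One object for the whole device: all threads in all blocks share it.
__device__ float3 sharedVar;

// Hypothetical helper that works on whatever float3 it is handed.
__device__ void addOne(float3 &v) { v.y += 1.0f; }

// Every thread writes its own index into the SAME variable, then reads it back.
// This is a data race: the value read may come from another thread.
// (The compiler may also keep the value in a register, so wrong results
// are possible here, not guaranteed.)
__global__ void useGlobal(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    sharedVar.y = (float)i;
    addOne(sharedVar);
    out[i] = sharedVar.y;
}

// Each thread owns a private float3 and passes it by reference: no sharing, no race.
__global__ void useLocal(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float3 v = make_float3(0.0f, (float)i, 0.0f);
    addOne(v);
    out[i] = v.y;
}

// Runs one of the two kernels and counts elements that differ from i + 1.
// Returns -1 if a CUDA call fails.
int countMismatches(bool globalVersion)
{
    const int n = 1 << 16;
    const int block = 256;
    const int grid = (n + block - 1) / block;

    float *dOut = nullptr;
    if (cudaMalloc(&dOut, n * sizeof(float)) != cudaSuccess) return -1;

    if (globalVersion) useGlobal<<<grid, block>>>(dOut, n);
    else               useLocal<<<grid, block>>>(dOut, n);

    std::vector<float> hOut(n);
    cudaError_t err = cudaMemcpy(hOut.data(), dOut, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    if (err != cudaSuccess) return -1;

    int bad = 0;
    for (int i = 0; i < n; ++i)
        if (hOut[i] != (float)i + 1.0f) ++bad;
    return bad;
}

int main()
{
    printf("per-thread local, mismatches: %d\n", countMismatches(false));
    printf("shared __device__, mismatches: %d\n", countMismatches(true));
    return 0;
}
```

The per-thread version should always report 0 mismatches. The shared version may report 0 or some other count depending on the GPU, the compiler and the timing, which is exactly why it cannot be relied on. If the value is really per-pixel state, keeping it in a local variable in the kernel and passing it by reference, as you already found, is the straightforward fix.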