matt42
August 17, 2010, 12:19pm
1
Hello
It looks like there is a problem in CUDA 3.1 with __shared__ and __constant__ variables.
Given this:
__shared__ T a;
if T has a member that defines a constructor (even an empty one), the compiler reports:
error: can’t generate code for non empty constructors or destructors on device
Is there any workaround?
Here is a simple example to reproduce the problem:
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2010 NVIDIA Corporation
Built on Tue_Jun__8_03:55:04_PDT_2010
Cuda compilation tools, release 3.1, V0.2.1221
$ cat test.cu
struct Y
{
__host__ __device__ Y() {}
};
struct X
{
Y a;
};
__shared__ X cuda_test;
int main()
{
}
$ nvcc test.cu
test.cu
test.cu(10): error: can't generate code for non empty constructors or destructors on device
1 error detected in the compilation of "c:/windows/temp/tmpxft_00001274_00000000-6_test.cpp1.ii".
Thanks
Use an explicit init function that you call yourself instead of relying on a constructor.
matt42
August 17, 2010, 1:03pm
4
Thanks for the idea, but I won't go that route; it would require too many
changes in my code.
Well, the problem is that CUDA can't possibly know in which thread to execute the constructor code,
so it generates this error instead.
Calling init functions forces you to choose the proper thread for execution.
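To illustrate the point about choosing a thread, here is a minimal sketch. The type `Counter`, the kernel `init_kernel`, and the host helper `run_init_demo` are hypothetical names made up for this example, and error checking is omitted. It assumes a reasonably recent CUDA toolchain and has not been tried on CUDA 3.1.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Hypothetical type for illustration: no constructor, explicit init() instead.
struct Counter
{
    int value;
    __host__ __device__ void init(int start) { value = start; }
};

__global__ void init_kernel(int* out)
{
    // Accepted by the compiler because Counter declares no constructor.
    __shared__ Counter shared_counter;

    // The programmer, not the compiler, decides which thread initializes.
    if (threadIdx.x == 0)
        shared_counter.init(42);

    // Make the initialized value visible to every thread in the block.
    __syncthreads();

    out[threadIdx.x] = shared_counter.value;
}

// Host helper: launches one block and returns the value read by the last thread.
int run_init_demo()
{
    const int threads = 32;
    int* d_out = nullptr;
    int h_out[threads] = {0};

    cudaMalloc(&d_out, sizeof(h_out));
    init_kernel<<<1, threads>>>(d_out);
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_out);

    return h_out[threads - 1];
}
```

The `__syncthreads()` call matters: without it, other threads could read the shared object before thread 0 has finished initializing it.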
Hey matt,
I recently had exactly the same problem. My conclusion: there is no workaround for using structs with constructors in constant memory.
See the following topic:
http://forums.nvidia.com/index.php?showtop…p;#entry1101001
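For comparison, a struct without any constructors can live in constant memory and be filled from the host. The sketch below is only an illustration: `Params`, `d_params`, `apply_params`, and `run_constant_demo` are hypothetical names, error checking is omitted, and a recent CUDA toolchain is assumed.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Hypothetical parameter block: plain data only, no constructors or destructors.
struct Params
{
    float scale;
    int   offset;
};

// Accepted because Params is a plain aggregate.
__constant__ Params d_params;

__global__ void apply_params(int* out)
{
    out[threadIdx.x] = static_cast<int>(threadIdx.x * d_params.scale) + d_params.offset;
}

// Host helper: fills the constant object, runs one block, returns thread 3's result.
int run_constant_demo()
{
    const int threads = 8;
    int* d_out = nullptr;
    int h_out[threads] = {0};

    // Initialization happens on the host, field by field, instead of in a constructor.
    Params h_params;
    h_params.scale = 2.0f;
    h_params.offset = 1;
    cudaMemcpyToSymbol(d_params, &h_params, sizeof(Params));

    cudaMalloc(&d_out, sizeof(h_out));
    apply_params<<<1, threads>>>(d_out);
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_out);

    return h_out[3];   // thread 3 computes 3 * 2.0 + 1
}
```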
matt42
August 17, 2010, 2:35pm
10
Oh OK, now I understand why CUDA doesn't handle non-empty constructors, thanks. But in my
example, the constructor of X just calls Y's constructor, which is empty. There should be no problem…
I solved it by getting rid of the constructors completely. I did not test it, but I think the problem is that a constructor is defined at all!
This would work:
struct Y
{
__host__ __device__ void init() { /* do something */ }
};
struct X
{
Y a;
__host__ __device__ void init() { a.init(); }
};
I know it is very annoying to change all the constructors to init functions, but I did not find any other solution.
Also, do not forget to do the same for the destructors.
matt42
August 17, 2010, 2:48pm
16
Thanks for the link! Actually, you can, if the default constructor of the constant object is defined as empty.
This compiles:
struct Y
{
};
struct X
{
__host__ __device__ X() {}
Y a;
};
__shared__ X cuda_test;
But unfortunately this equivalent piece of code does not:
struct Y
{
__host__ __device__ inline Y() {}
};
struct X
{
Y a;
};
__shared__ X cuda_test;
matt42
August 17, 2010, 2:55pm
18
My problem is that, in my case, Y has a copy constructor that I cannot remove without breaking a lot
of existing code.
But I guess the only solution is to wait for the CUDA developers to fix this.
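One possible alternative that keeps the constructors, offered only as a sketch: declare raw bytes in shared memory and construct the object into them from a single thread with placement new. This is not something suggested in the thread itself. It assumes a recent CUDA toolchain that supports placement new in device code, and it has not been verified on CUDA 3.1. The constructor bodies, the alignment value, and the names `placement_kernel` and `run_placement_demo` are assumptions made for the example.

```cuda
#include <cassert>
#include <new>
#include <cuda_runtime.h>

// Y keeps its constructors, as in the original problem (bodies are illustrative).
struct Y
{
    int value;
    __host__ __device__ Y() : value(7) {}
    __host__ __device__ Y(const Y& other) : value(other.value) {}
};

struct X
{
    Y a;
};

__global__ void placement_kernel(int* out)
{
    // Raw storage: no constructor runs at the point of declaration.
    // 8 is an assumed alignment that is sufficient for X in this example.
    __shared__ __align__(8) unsigned char storage[sizeof(X)];
    X* x = reinterpret_cast<X*>(storage);

    // Thread 0 constructs X in place, which also runs Y's constructor.
    if (threadIdx.x == 0)
        new (x) X();

    __syncthreads();
    out[threadIdx.x] = x->a.value;
}

// Host helper: launches one block and returns the value read by the last thread.
int run_placement_demo()
{
    const int threads = 32;
    int* d_out = nullptr;
    int h_out[threads] = {0};

    cudaMalloc(&d_out, sizeof(h_out));
    placement_kernel<<<1, threads>>>(d_out);
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_out);

    return h_out[threads - 1];
}
```

If the type had a non-trivial destructor, one thread would also need to call `x->~X()` explicitly before the block finishes.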
Yes, the problem is that your class X is NOT allowed to have any member with a constructor, at any level of nesting.
So if Y has no constructor but contains an object of type Z, Z is not allowed to have a constructor either. It is a lot of work to get rid of all constructors.
Anyway, what I learned from working with CUDA is to keep the class structure very flat to avoid problems like this.
A flat structure also speeds up your program.
If you are completely stuck now, could you replace your copy constructors in the same way I described before? For example:
struct Y
{
int value;
__host__ __device__ void init() {}
// Replaces the copy constructor: copies the state of y into this object.
__host__ __device__ void copyConstructor(const Y& y)
{ value = y.value; }
};
struct X
{
Y a, b;
__host__ __device__ void init()
{
a.init();
b.init();
b.copyConstructor(a);
}
};