I wonder whether one can visit a std::variant in device code, since C++17 is supposedly supported in CUDA.
Consider this example (with CUDA 11.1):
#include <variant>
#include <cstdio>

struct A{};
struct B{};

struct Visitor {
    __device__
    void operator()(const A&) { printf("It's an A!\n"); }
    __device__
    void operator()(const B&) { printf("It's a B!\n"); }
};

__global__
void visitVariant() {
    std::variant<A, B> var{A{}};
    std::visit(Visitor{}, var);
}

int main() {
    visitVariant<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
First of all, it doesn’t compile unless we use --expt-relaxed-constexpr. Fine - let’s do that.
The problem, however, is that the compiler is unable to put device functions in the jump table of std::visit.
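For context, here is a simplified, host-only sketch of the table-based dispatch that std::visit implementations such as libstdc++ are built around. It is not the library's actual code: HostVisitor, invoke_alternative, table_visit_impl and table_visit are made-up names, and the return type is fixed to int to keep it short. It only illustrates that every alternative is reached through a function pointer stored in a table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <utility>
#include <variant>

struct A {};
struct B {};

// Hypothetical host-only visitor; returns an int so the result can be checked.
struct HostVisitor {
    int operator()(const A&) const { std::puts("It's an A!"); return 0; }
    int operator()(const B&) const { std::puts("It's a B!"); return 1; }
};

// One table entry: unwrap alternative I and call the visitor with it.
template <std::size_t I, class Visitor, class Variant>
int invoke_alternative(Visitor& vis, const Variant& var) {
    return vis(*std::get_if<I>(&var));
}

// Build a table with one function pointer per alternative, then index it
// with the active alternative at run time.
template <class Visitor, class... Ts, std::size_t... Is>
int table_visit_impl(Visitor& vis, const std::variant<Ts...>& var,
                     std::index_sequence<Is...>) {
    using Variant = std::variant<Ts...>;
    using Fn = int (*)(Visitor&, const Variant&);
    static constexpr Fn table[] = {&invoke_alternative<Is, Visitor, Variant>...};
    return table[var.index()](vis, var);
}

// Simplified stand-in for std::visit (single variant, int result).
template <class Visitor, class... Ts>
int table_visit(Visitor vis, const std::variant<Ts...>& var) {
    assert(!var.valueless_by_exception());  // the table has no entry for that state
    return table_visit_impl(vis, var, std::index_sequence_for<Ts...>{});
}
```

The table entries are ordinary function pointers fixed at compile time, which is presumably where the trouble with __device__-only call operators comes from.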
You can make it compile if you replace the qualifier as follows:
- __device__
+ __host__ __device__
void operator()(const A&) { printf("It's an A!\n"); }
but then the compiler places /*__host__*/ void operator() in the jump table, which leads to a crash.
It seems that std::visit is unusable in device code. Is this supposed to work?
Note: Manual jump tables work just fine:
  __global__
  void visitVariant() {
      std::variant<A, B> var{A{}};
-     std::visit(Visitor{}, var);
+     if (std::holds_alternative<A>(var)) {
+         Visitor()(std::get<A>(var));
+     } else if (std::holds_alternative<B>(var)) {
+         Visitor()(std::get<B>(var));
+     }
  }
Output:
It's an A!
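The manual dispatch above can be generalized so that the if/else chain does not have to be written out per variant type. Below is a minimal sketch that recurses over the alternative index and uses no function pointers. Assumptions: visit_by_index and the HD macro are made-up names, the visitor returns int so the result can be checked, and std::get_if is swapped in for std::get to avoid the throwing path. I have not verified that std::get_if and index() behave in device code the way std::get does in the example above. Without nvcc the HD macro expands to nothing, so the same code also builds as plain C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <variant>

// __host__ __device__ under nvcc, nothing under a plain C++ compiler.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

struct A {};
struct B {};

// Variation of the Visitor above: callable on host and device, returns an int.
struct Visitor {
    HD int operator()(const A&) const { printf("It's an A!\n"); return 0; }
    HD int operator()(const B&) const { printf("It's a B!\n"); return 1; }
};

// Hypothetical helper: checks alternative I, then I + 1, and so on.
// The dispatch is a chain of index comparisons resolved by direct calls.
template <std::size_t I = 0, class Vis, class... Ts>
HD int visit_by_index(Vis&& vis, const std::variant<Ts...>& var) {
    if constexpr (I + 1 == sizeof...(Ts)) {
        // Last alternative: it must be the active one (variant not valueless).
        assert(var.index() == I);
        return vis(*std::get_if<I>(&var));
    } else {
        if (var.index() == I) {
            return vis(*std::get_if<I>(&var));
        }
        return visit_by_index<I + 1>(vis, var);
    }
}
```

In the kernel, the std::visit line would then become visit_by_index(Visitor{}, var), still compiled with --expt-relaxed-constexpr.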