std::visit on a std::variant in device code

I wonder if one can visit a variant in device code, since C++17 is supposedly supported in CUDA.

Consider this example (with CUDA 11.1):

#include <variant>
#include <cstdio>

struct A{};
struct B{};
struct Visitor {
  __device__
  void operator()(const A&) { printf("It's an A!\n"); }
  __device__
  void operator()(const B&) { printf("It's a  B!\n"); }
};

__global__
void visitVariant() {
  std::variant<A, B> var{A{}};
  std::visit(Visitor{}, var);
}

int main() {
  visitVariant<<<1, 1>>>();
  cudaDeviceSynchronize();
  return 0;
}

First of all, it doesn’t compile unless we pass --expt-relaxed-constexpr to nvcc. Fine, let’s do that.
The real problem, however, is that the compiler is unable to put __device__ functions in the jump table of std::visit.
You can make it compile by changing the qualifiers as follows:

- __device__
+ __host__ __device__
  void operator()(const A&) { printf("It's an A!\n"); }

but then the compiler places /*__host__*/ void operator() in the jump table, which leads to a crash.

It seems that std::visit is unusable in device code. Is this supposed to work?

Note: Manual dispatch (an if/else chain instead of a jump table) works just fine:

  __global__
  void visitVariant() {
    std::variant<A, B> var{A{}};
-   std::visit(Visitor{}, var);
+   if (std::holds_alternative<A>(var)) {
+     Visitor()(std::get<A>(var));
+   } else if (std::holds_alternative<B>(var)) {
+     Visitor()(std::get<B>(var));
+   }
  }

Output:

It's an A!
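
The if/else chain above can be generalized so it does not have to be written out for every variant type. The following is only a sketch: `visit_linear`, `TagVisitor` and `classify` are hypothetical names made up for illustration, not part of CUDA or the standard library. It walks the alternative indices with `if constexpr` recursion, so no table of function pointers is built. It compiles as plain C++17 on the host; in device code it presumably needs --expt-relaxed-constexpr like the example above, and it has not been verified on a GPU here.

```cpp
#include <cstddef>
#include <variant>

// Outside nvcc the CUDA qualifiers are not defined, so make them no-ops.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Hypothetical helper: compares the active index against each alternative
// in turn and calls the visitor directly, instead of indexing a table of
// function pointers. Return values of the visitor are discarded.
template <std::size_t I = 0, class Visitor, class Variant>
__host__ __device__ void visit_linear(Visitor&& vis, Variant& var) {
  if constexpr (I < std::variant_size_v<Variant>) {
    if (var.index() == I) {
      // get_if returns a pointer and never throws; the index was just checked.
      vis(*std::get_if<I>(&var));
    } else {
      visit_linear<I + 1>(vis, var);
    }
  }
}

struct A {};
struct B {};

// Illustrative visitor: records which overload ran through a pointer.
struct TagVisitor {
  int* tag;
  __host__ __device__ void operator()(const A&) { *tag = 1; }
  __host__ __device__ void operator()(const B&) { *tag = 2; }
};

// Returns 1 if var holds an A, 2 if it holds a B.
__host__ __device__ int classify(const std::variant<A, B>& var) {
  int tag = 0;
  visit_linear(TagVisitor{&tag}, var);
  return tag;
}
```

Inside the kernel, the call would then read `visit_linear(Visitor{}, var);` in place of `std::visit(Visitor{}, var);`. The dispatch is linear in the number of alternatives, which should not matter for small variants.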

The CUDA documentation states that C++17 is supported with restrictions. I am not knowledgeable about C++17 (yet), but before you file a bug with NVIDIA, check the docs:

G.4.18. C++17 Features

Thanks for the link!

I had checked that part already (it’s directly reachable from the part of the documentation I linked at the beginning of my first post). It seems to me that the two restrictions listed there don’t apply to the problem at hand.

Let’s see if somebody has an idea.