Segmentation fault when compiling a simple kernel

I suggest:

  1. retest on the latest CUDA toolkit (12.5 at the time of writing)
  2. if it still fails, file a bug.
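
If it helps, the two steps above can be scripted so the result is easy to paste into a bug report. This is only a rough sketch, not an official procedure: the helper names (`parse_nvcc_release`, `looks_like_compiler_crash`, `retest`) and the file name `kernel.cu` are made up for illustration, and the crash check is a simple heuristic on the return code and error text. It assumes `nvcc` is on your PATH.

```python
import re
import shutil
import subprocess
import tempfile
from pathlib import Path


def parse_nvcc_release(version_text):
    """Return (major, minor) parsed from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def looks_like_compiler_crash(returncode, stderr):
    """Heuristic only: a negative return code means the process was killed
    by a signal (POSIX); otherwise look for crash wording in the message."""
    text = stderr.lower()
    return returncode < 0 or "segmentation fault" in text or "signal 11" in text


def retest(source_path, nvcc="nvcc"):
    """Compile source_path to an object file and summarize what happened.

    Returns None if nvcc cannot be found on PATH.
    """
    if shutil.which(nvcc) is None:
        return None
    version = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    with tempfile.TemporaryDirectory() as tmp:
        obj = Path(tmp) / "out.o"  # throwaway output, deleted with the temp dir
        build = subprocess.run(
            [nvcc, "-c", str(source_path), "-o", str(obj)],
            capture_output=True,
            text=True,
        )
    return {
        "release": parse_nvcc_release(version.stdout),
        "returncode": build.returncode,
        "crashed": looks_like_compiler_crash(build.returncode, build.stderr),
        "stderr": build.stderr,
    }
```

Calling `retest("kernel.cu")` (with your own file name) returns a small dictionary; if `crashed` is still true on the newest toolkit, the `release` and `stderr` fields are the kind of detail worth including when you file the bug.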