Hello. I am using the cub::DeviceReduce::ReduceByKey method from CUB, which expects an operator that performs the reduction. In my case that is summation, see below.
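cub::Sum sum_op;          // passing this to ReduceByKey works
cuda::std::plus plus_op;  // passing this to ReduceByKey crashes cudafe++
cub::DeviceReduce::ReduceByKey(
nullptr,  // null temp storage: this call only queries the required size
buffer_bytes,
// keys (in, unique out)
(uint32_t*)nullptr, (uint32_t*)nullptr,
// values (in, aggregates out)
(float*)nullptr, (float*)nullptr,
// remaining params (number of runs out, reduction operator, item count)
(uint32_t*)nullptr,
plus_op, N);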
If I use cub::Sum, it runs without problems, but if I use cuda::std::plus, I get the following compiler crash:
CUDACOMPILE : nvcc error : 'cudafe++' died with status 0xC0000005 (ACCESS_VIOLATION)
I am not advocating for or against supporting cuda::std::plus in this function. I would just like a better, more interpretable error message when I use this “wrong” operator.
This compiler crash initially led me to believe the problem was an MSVC/CUDA version mismatch or a conflict between multiple installed CUDA Toolkit versions, and it was really difficult for me to find the real cause. The bug is likely Windows-specific: the same code compiled on Ubuntu 24.04 without any issues.
This issue was found with the following setups:
CUDA Toolkit 12.8 or 13.0 with a compatible driver (580 for 13.0 and 572 for 12.8)
MSVC - Visual Studio 2022 version 17.8 or 17.14
Windows 10.0.19045
Laptop RTX 3060
If you have further questions, please feel free to ask.
Thank you kindly for your support.