Hello,
This is my first time asking in this forum, so please let me know whether I am asking in the right place.
I am currently using the Flagged() function described in the cub::DeviceSelect documentation (CUDA Core Compute Libraries). According to the documentation, num_items is a 64-bit signed integer. However, I checked the source code (I am using CUDA 12.2 on an H100), and it declares num_items as a 32-bit signed integer. Indeed, if I pass a value larger than 2^31-1, I get errors.
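To make the limit concrete, here is a minimal host-only sketch of the range check I seem to be running into. It is plain C++ with no CUB calls; the helper name fits_in_int32 is my own and not part of CUB, and the sketch assumes num_items really is a 32-bit signed integer in the version I have installed.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of CUB): returns true if an item count
// can be represented by a 32-bit signed integer without overflow.
constexpr bool fits_in_int32(std::int64_t n) {
    return n >= 0 && n <= std::numeric_limits<std::int32_t>::max();
}

// 2^31 - 1 is the largest count that still fits.
static_assert(fits_in_int32((std::int64_t{1} << 31) - 1), "2^31-1 should fit");

// 2^31 is the first count that no longer fits, which matches where I see errors.
static_assert(!fits_in_int32(std::int64_t{1} << 31), "2^31 should not fit");
```

A check like this before the call would at least tell me up front whether my input size is in the range that fails.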
Am I doing something wrong? Or is the documentation wrong?
Thank you!