Missing cache hint function in CUDA >= 11.3


I have a kernel that uses the __ldcv load function (cache hint qualifier) to load a ulonglong2, so that cached data is discarded and re-fetched from its source. The kernel compiles and works fine with the nvcc compiler shipped with the CUDA 11.2 release (Linux), but as soon as I compile it with 11.3 or any newer version, including CUDA 12.x, I get the error message

error: identifier "__ldcv" is undefined

All other cache hint qualifiers seem to work as expected: replacing __ldcv with any of them compiles into a working kernel (as long as the kernel does not rely on discarding the cache content).

The CUDA C++ Programming Guide, however, still lists __ldcv as a valid load operation; it neither names an alternative nor marks the function as deprecated. What is the correct replacement when using CUDA 11.3 or newer?

Kind regards

I have not tested it, but the include file sm_32_intrinsics.hpp in my 11.3 install contains an entry for it:

__SM_32_INTRINSICS_DECL__ ulonglong2 __ldcv(const ulonglong2 *ptr) { ulonglong2 ret; asm ("ld.global.cv.v2.u64 {%0,%1}, [%2];" : "=l"(ret.x), "=l"(ret.y) : __LDG_PTR (ptr) : "memory"); return ret; }
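
To narrow this down, a small standalone test might help. The sketch below is untested on 11.3 and makes a few assumptions: a 64-bit build (so the pointer fits an "l" asm operand), and the helper names ld_cv_u64x2, load_both and ldcv_matches_fallback, which are made up for illustration and are not part of the toolkit. It loads the same value once through __ldcv and once through a hand-written wrapper that issues the same PTX as the header entry above, then compares the two results on the host.

```cuda
#include <cuda_runtime.h>

// Hypothetical fallback (not a toolkit function): issues the same PTX as the
// header's __ldcv overload. Assumes a 64-bit build, hence the "l" constraint.
__device__ __forceinline__ ulonglong2 ld_cv_u64x2(const ulonglong2 *ptr)
{
    ulonglong2 ret;
    asm volatile("ld.global.cv.v2.u64 {%0,%1}, [%2];"
                 : "=l"(ret.x), "=l"(ret.y)
                 : "l"(ptr)
                 : "memory");
    return ret;
}

// Loads the same element through the intrinsic and through the fallback.
__global__ void load_both(const ulonglong2 *in, ulonglong2 *out)
{
    out[0] = __ldcv(in);        // intrinsic under test
    out[1] = ld_cv_u64x2(in);   // inline-PTX fallback
}

// Host-side check: true if both loads return the value that was uploaded.
bool ldcv_matches_fallback()
{
    const ulonglong2 h_in = make_ulonglong2(0x1122334455667788ULL,
                                            0x99aabbccddeeff00ULL);
    ulonglong2 h_out[2] = {};
    ulonglong2 *d_in = nullptr;
    ulonglong2 *d_out = nullptr;

    if (cudaMalloc(&d_in, sizeof(h_in)) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }

    cudaMemcpy(d_in, &h_in, sizeof(h_in), cudaMemcpyHostToDevice);
    load_both<<<1, 1>>>(d_in, d_out);
    // The blocking copy also surfaces any error from the kernel launch.
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;

    return h_out[0].x == h_in.x && h_out[0].y == h_in.y &&
           h_out[1].x == h_in.x && h_out[1].y == h_in.y;
}
```

If this file compiles on its own with the newer toolkit, the problem is more likely in how the original project is built or which headers it pulls in than in the toolkit itself. If the __ldcv line still fails, commenting it out and keeping only the ld_cv_u64x2 call should show whether the inline-PTX wrapper can serve as a stopgap.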