(Note: CUDA 12.0 and later)
Library kernels (CUkernel) are not specific to a context; they just sit in a library.
However, they do have attribute-related functions mirroring those for CUfunctions, which are context(-and-device)-specific kernels:
CUresult cuKernelGetAttribute ( int* pi, CUfunction_attribute attrib, CUkernel kernel, CUdevice dev );
CUresult cuKernelSetAttribute ( CUfunction_attribute attrib, int val, CUkernel kernel, CUdevice dev );
CUresult cuKernelSetCacheConfig ( CUkernel kernel, CUfunc_cache config, CUdevice dev );
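To make the oddity concrete, here is a sketch of how such a per-device call is used (library filename and kernel name are placeholders; error checking elided):

```c
#include <cuda.h>
#include <stdio.h>

int main(void) {
    CUdevice dev;
    CUlibrary lib;
    CUkernel kernel;
    int max_threads = 0;

    cuInit(0);
    cuDeviceGet(&dev, 0);

    /* Load a compiled library; "my_kernels.cubin" and "my_kernel"
       are placeholder names. */
    cuLibraryLoadFromFile(&lib, "my_kernels.cubin",
                          NULL, NULL, 0, NULL, NULL, 0);
    cuLibraryGetKernel(&kernel, lib, "my_kernel");

    /* The attribute query takes a CUdevice, even though the kernel
       belongs to a library, not to any context or device. */
    cuKernelGetAttribute(&max_threads,
                         CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK,
                         kernel, dev);
    printf("max threads per block on device 0: %d\n", max_threads);

    cuLibraryUnload(lib);
    return 0;
}
```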
I don’t understand how this makes sense. The contents of a library are not specific to a concrete device. If the library were to store, for each kernel, some attributes that are compute-capability-specific - sure, why not. But taking a CUdevice means being specific to the state of a specific system, which is not something that makes sense to put in a library.
And after all, one can just contextualize the kernel with cuKernelGetFunction(), then set whatever attributes one likes - on the kernel-in-context.
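For reference, that contextualize-then-set route looks roughly like this (a sketch; the context creation and the shared-memory size are illustrative, and error checking is elided):

```c
#include <cuda.h>

/* Assumes `kernel` was obtained via cuLibraryGetKernel(). */
void set_attrs_in_context(CUkernel kernel, CUdevice dev) {
    CUcontext ctx;
    CUfunction func;

    cuCtxCreate(&ctx, 0, dev);

    /* cuKernelGetFunction() yields the kernel as seen by the
       *current* context - i.e. the contextualized CUfunction. */
    cuKernelGetFunction(&func, kernel);

    /* From here on, the regular CUfunction setters apply. */
    cuFuncSetAttribute(func,
                       CU_FUNC_ATTRIBUTE_MAX_DYNAMIC_SHARED_SIZE_BYTES,
                       48 * 1024);
    cuFuncSetCacheConfig(func, CU_FUNC_CACHE_PREFER_SHARED);
}
```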
So why do these functions even exist? What’s the rationale here?