I read through the CUDA Driver API docs and found that cuStreamWriteValue32_v2 is only documented for CUDA 11.7 and CUDA 11.8. Is cuStreamWriteValue32_v2 deprecated as of CUDA 12.0? I can still use this API in CUDA 12.3.
I wasn’t able to locate any deprecation notice. However, I observe that in cuda.h for CUDA 12.x (a few versions that I looked at), the documented cuStreamWriteValue32 is “calling” cuStreamWriteValue32_v2. So my suggestion would be to switch to the (currently) documented API at your earliest convenience (if for no other reason than as a convenience to code maintainers).
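To illustrate what “calling” means here: a header can redirect the documented name to a suffixed symbol with a preprocessor alias. The following is a minimal, self-contained sketch of that pattern in plain C. All demo* names are hypothetical stand-ins, not the real driver entry points, and the actual macros in cuda.h may be more elaborate than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for CUresult; not the real driver type. */
typedef int demo_result;
enum { DEMO_SUCCESS = 0 };

/* The "new" behavior is implemented under a suffixed symbol name.
 * Here it just stores the value; the real API enqueues a write on a stream. */
static demo_result demoStreamWriteValue32_v2(uint32_t *addr, uint32_t value)
{
    assert(addr != NULL);
    *addr = value;
    return DEMO_SUCCESS;
}

/* The header redirects the unsuffixed (documented) name to the suffixed
 * symbol, so code written against the documented name compiles to a call
 * to the _v2 function. */
#define demoStreamWriteValue32 demoStreamWriteValue32_v2

/* Caller that only knows the documented, unsuffixed name. */
static uint32_t demo_write_via_default_name(uint32_t value)
{
    uint32_t slot = 0;
    demo_result rc = demoStreamWriteValue32(&slot, value);
    assert(rc == DEMO_SUCCESS);
    (void)rc; /* silence unused warning when assertions are disabled */
    return slot;
}
```

With an alias like this in place, code that spells out the _v2 name and code that uses the unsuffixed name end up calling the same function, which would explain why the explicit _v2 spelling still works in CUDA 12.3.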
I believe what happened was that there was a functional migration between two behaviors that took place in the CUDA 11.7 - 12.0 timeframe. Prior to CUDA 11.7, the use of the “default” API cuStreamWriteValue32 would use the “old” functionality, and from CUDA 12.0 onward the use of the default API selects the “new” (i.e. v2) functionality. In the CUDA 11.7-11.8 timeframe, you had an option to select either behavior.
The V1 API (according to the 11.7 documentation, CUDA Driver API :: CUDA Toolkit Documentation) needs a kernel module parameter, modprobe nvidia NVreg_EnableStreamMemOPs=1; the V2 API does not (?). Is this parameter still supported in newer driver versions at all?
If the mapping between the versions appears in the header, NVIDIA presumably expects it to stay that way permanently for any program compiled against that header.
My understanding is that prior to CUDA 11.7 (e.g. CUDA 11.4.4, to pick an example) the only “methodology” that existed was the one that required an opt-in via kernel module parameter. Since there was only one “methodology”, there was no need for other options, and there is only the “default” API, i.e. cuStreamWriteValue32.
In the 11.7-11.8 timeframe, another “methodology” was developed, which did not depend on the kernel opt-in. This was given the _v2 suffix in that timeframe. Apparently either “methodology” could be used in that timeframe. It appears that by CUDA 12.x, the designers had “standardized” on the newer “methodology”, and therefore the default API is equivalent to the _v2 suffix API, and mention of the kernel opt-in is dropped.
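As a rough summary of that timeline, here is a small sketch keyed on the CUDA_VERSION encoding used in cuda.h (major * 1000 + minor * 10, so 11070 for 11.7 and 12030 for 12.3). The demo_* names are hypothetical, and the version boundaries reflect only my understanding as described above, not an official statement.

```c
#include <assert.h>

/* Build a version number using the same encoding as CUDA_VERSION in cuda.h. */
static int demo_cuda_version(int major, int minor)
{
    return major * 1000 + minor * 10;
}

/* Hypothetical labels for the three periods described in this thread. */
typedef enum {
    DEMO_LEGACY_ONLY, /* before 11.7: only the kernel-opt-in methodology  */
    DEMO_BOTH,        /* 11.7 - 11.8: legacy default plus an explicit _v2 */
    DEMO_V2_DEFAULT   /* 12.x: the default name is equivalent to _v2      */
} demo_memop_era;

/* Map a CUDA_VERSION-style number onto one of the periods above. */
static demo_memop_era demo_memop_era_for(int cuda_version)
{
    assert(cuda_version > 0);
    if (cuda_version < 11070) {
        return DEMO_LEGACY_ONLY;
    }
    if (cuda_version < 12000) {
        return DEMO_BOTH;
    }
    return DEMO_V2_DEFAULT;
}
```

In real code the same comparison could be made at compile time against CUDA_VERSION, for instance to decide whether the explicit _v2 spelling is needed at all.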
I haven’t tried it, but my guess would be that in CUDA 12.x, the kernel opt-in has no meaning. I also see no evidence of the cuStreamWriteValue32_v1 API in the CUDA headers for 12.x.