I am using CUDA 11.6 and trying the CUDA sample 2_Concepts_and_Techniques/cuHook. I notice that although the hook on cuMemAlloc is invoked when cuMemAlloc itself is called, it is NOT invoked when cudaMalloc is called. This can be reproduced by adding the following lines to the cuHook.cpp source file:
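float *A = nullptr;
// Runtime API allocation: the cuMemAlloc hook is expected to fire here, but does not.
cudaMalloc((void **)&A, 120);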
May I know whether this implies that cudaMalloc is not calling cuMemAlloc under the hood? Or did I do something wrong? Thank you.
Thank you so much for your prompt reply. Please let me know if I understand you correctly: since the CUDA runtime has its own way of loading the underlying CUDA driver API functions, the hook that we have for cuMemAlloc does not automatically apply to cudaMalloc. Hence, for cudaMalloc calls to be intercepted as well, we need another hook that explicitly covers that path.
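To make the idea of "another hook" concrete, here is one way it could look: an LD_PRELOAD shim that interposes cudaMalloc itself instead of relying on the cuMemAlloc hook. This is only a minimal sketch under assumptions, not what the cuHook sample does. It assumes Linux, and it assumes the application links the CUDA runtime dynamically (nvcc links it statically by default, so the application would have to be built with -cudart shared). To stay free of the CUDA headers it declares the return type as int. The names cuda_error_t, kHookUnresolved and g_cudaMalloc_calls are made up for illustration.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdio>
#include <dlfcn.h>   // dlsym, RTLD_NEXT (g++ defines _GNU_SOURCE by default)

// Assumption: cudaError_t is an enum that is ABI-compatible with int,
// so int is used here to avoid depending on the CUDA headers.
using cuda_error_t = int;
using cudaMalloc_fn = cuda_error_t (*)(void **, std::size_t);

// Hypothetical fallback code returned when no real cudaMalloc is found.
// 999 is assumed to match cudaErrorUnknown in recent CUDA headers.
constexpr cuda_error_t kHookUnresolved = 999;

// Hypothetical counter so the hook's activity can be observed.
static std::atomic<int> g_cudaMalloc_calls{0};

// Same name and C linkage as the runtime's cudaMalloc, so a preloaded
// library containing this symbol is found first by the dynamic linker.
extern "C" cuda_error_t cudaMalloc(void **devPtr, std::size_t size) {
    // Look up the next definition of cudaMalloc in library search order,
    // which is the real one in libcudart.so if it is loaded dynamically.
    static cudaMalloc_fn real =
        reinterpret_cast<cudaMalloc_fn>(dlsym(RTLD_NEXT, "cudaMalloc"));

    ++g_cudaMalloc_calls;
    std::fprintf(stderr, "[hook] cudaMalloc(%zu bytes)\n", size);

    if (real == nullptr) {
        // No dynamically linked runtime was found behind this shim.
        return kHookUnresolved;
    }
    return real(devPtr, size);
}
```

If this is saved as hook.cpp, something like "g++ -shared -fPIC hook.cpp -o libhook.so -ldl" followed by "LD_PRELOAD=./libhook.so ./app" should print a line for each cudaMalloc call, provided the assumptions above hold. With a statically linked runtime the call never goes through the dynamic linker, so the shim is silently bypassed.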