cudaMalloc, parameter difference

Hi,
I’m somewhat new to CUDA and I have seen the following two ways of calling the cudaMalloc function:

  • cudaMalloc(&d_ptr,sizeof(float)*N)
  • cudaMalloc((void **) &d_ptr, sizeof(float)*N)

I have also seen that the runtime API documentation declares the first parameter as a double pointer (void **). What’s the difference between the two versions?
Thanks for your time.

They are basically the same.

The first version generally works well with nvcc and is quicker to type.

If you are compiling this code with GNU tools (gcc, g++), you may in some cases need to use the second version.

The formal function prototype:

https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY.html#group__CUDART__MEMORY_1g37d37965bfb4803b6d4e59ff26856356

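expects a pointer to a void pointer (void **). In the first form the argument is a float **, which is not a void ** and does not convert to one implicitly in standard C or C++. It generally compiles under nvcc anyway because, when the code is built as C++, cuda_runtime.h also provides a templated cudaMalloc overload that accepts any T ** and performs the cast for you. The second form makes the cast explicit, so it works with any compiler, including a plain C compiler that only sees the void ** prototype.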
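Here is a rough sketch of why both call forms can compile. It builds without the CUDA toolkit because it uses a hypothetical stand-in, fake_malloc, that only copies the parameter shape of the documented prototype and allocates host memory instead. The templated overload is my understanding of what cuda_runtime.h does for C++ callers; treat it as an assumption and check the header shipped with your toolkit. The demo function is also just an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in with the same parameter shape as the documented
// cudaMalloc prototype (void **devPtr, size_t size). It uses host malloc
// so the sketch needs no GPU; it is NOT the real CUDA API.
int fake_malloc(void **devPtr, std::size_t size) {
    *devPtr = std::malloc(size);
    return *devPtr ? 0 : 1;  // 0 plays the role of cudaSuccess
}

// Assumed convenience overload: accepts any T** and does the cast once,
// then forwards to the void** version above.
template <class T>
int fake_malloc(T **devPtr, std::size_t size) {
    return fake_malloc(reinterpret_cast<void **>(devPtr), size);
}

// Exercises both call forms from the question and reports whether
// both allocations succeeded.
bool demo() {
    const std::size_t N = 16;
    float *a = nullptr;
    float *b = nullptr;

    // Second form from the question: explicit cast, matches void** directly.
    int ra = fake_malloc((void **)&a, sizeof(float) * N);

    // First form: no cast. float** is not void**, so only the template fits.
    int rb = fake_malloc(&b, sizeof(float) * N);

    bool ok = (ra == 0) && (rb == 0) && (a != nullptr) && (b != nullptr);
    std::free(a);
    std::free(b);
    return ok;
}
```

If you delete the template and keep only the void ** function, the uncast call should stop compiling as C++, which is roughly the situation a plain C build is in and why the explicit cast is the portable choice.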

Thanks for the explanation!