"invalid argument" error when trying to cudaMalloc a `float**` directly

Dear all,

I am trying to cudaMalloc a static float **d_test with cudaMalloc((void**) d_test, size), but it fails with an "invalid argument" error. When I declare and allocate it the usual way, as static float *d_test and cudaMalloc((void**) &d_test, size), it works. I expected these two approaches to be equivalent; any clarification would be appreciated.

Below is a piece of the code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static float **d_test;

inline void wrap_cudaGetLastError (const char *msg) 
/* Wrapper for cudaGetLastError function
Useful for debugging cudacalls without "polluting" the code 
*/
{
    cudaError_t err = cudaGetLastError ();
    if (cudaSuccess != err) { 
	fprintf (stdout, "Cuda error: %s: %s\n", msg, cudaGetErrorString (err)); 
	exit(EXIT_FAILURE);
    }
}


void wrap_cudamalloc(int size){

	cudaMalloc((void**) d_test, size);
	wrap_cudaGetLastError("wrap_cudamalloc after cudaMalloc");
}

Thanks

They are not equivalent. You’ll need to get a firmer grasp of what a pointer is in the C or C++ programming language.

This:

static float **d_test;

creates storage for a pointer variable. Storage means space in memory to hold the numerical value of the pointer. That numerical value is usually an address: the location of something else in memory. But creating storage for the pointer doesn’t mean that the address held in that storage points to anything meaningful. So far, that pointer has not been set to point at anything; as a static variable, it starts out as a null pointer.

cudaMalloc intends to set a pointer value, and in order to do so, it expects you to pass a pointer to the pointer (i.e. the address of the location) that will be modified. That’s what the double-* notation means: a pointer to a pointer.

Therefore cudaMalloc expects that the first parameter will be the address of where something is located in memory; specifically the address of the location to store the allocated pointer value in.

This formulation works for that:

&d_test

because d_test (declared here as float *d_test) is a place in memory, and &d_test is the address of that location. cudaMalloc can use that address to store the pointer to the allocated space in the storage provided at d_test.

This formulation doesn’t work:

d_test

Remember, cudaMalloc will interpret the first argument as the address of the location to store something in. However, in this formulation d_test (declared as float **d_test) has never been set to point at anything. As a static variable it holds a null pointer, so it doesn’t refer to any actual space that can be used to store the address of the allocation returned by cudaMalloc, which is consistent with the "invalid argument" error you see.
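
To make the difference concrete without needing a GPU, here is a minimal host-only sketch in plain C. mock_alloc_floats is a hypothetical stand-in, not part of CUDA: it uses malloc and takes a typed float ** instead of cudaMalloc's void **, but it stores its result through an out-parameter in the same way. The names d_ok, d_bad, alloc_by_address and alloc_by_value are likewise made up for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for cudaMalloc (NOT a CUDA API): allocates host
   memory and stores the resulting address through the out-parameter.
   Returns 0 on success, 1 if there is nowhere to store the result,
   2 if the underlying malloc fails. */
int mock_alloc_floats(float **out, size_t count)
{
    if (out == NULL)
        return 1;                                   /* no location to write to */
    *out = (float *)malloc(count * sizeof(float));  /* write into the caller's pointer */
    return (*out != NULL) ? 0 : 2;
}

static float *d_ok;    /* a pointer object; &d_ok is the address of real storage */
static float **d_bad;  /* static storage duration, so it starts out as NULL */

/* Correct: pass the address of the pointer that should receive the allocation. */
int alloc_by_address(size_t count)
{
    return mock_alloc_floats(&d_ok, count);
}

/* Incorrect: pass the (null) value of d_bad as if it were such an address. */
int alloc_by_value(size_t count)
{
    return mock_alloc_floats(d_bad, count);
}
```

Calling alloc_by_address(16) fills in d_ok, while alloc_by_value(16) is rejected because the argument is a null pointer. Both calls compile, which is why the compiler gave no hint in the original code.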

Thank you for your time in clarifying this; the bit on storage and initialization was really helpful. Thinking about initialization, could substituting

static float **d_test

with

static float *d_test[1];

as in the code below,

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static float *d_test[1];

inline void wrap_cudaGetLastError (const char *msg) 
/* Wrapper for cudaGetLastError function
Useful for debugging cudacalls without "polluting" the code 
*/
{
    cudaError_t err = cudaGetLastError ();
    if (cudaSuccess != err) { 
	fprintf (stdout, "Cuda error: %s: %s\n", msg, cudaGetErrorString (err)); 
	exit(EXIT_FAILURE);
    }
}


void wrap_cudamalloc(int size){
	cudaMalloc((void**) d_test, size);

	wrap_cudaGetLastError("wrap_cudamalloc after cudaMalloc");
}

be incorrect or dangerous, or cause future problems? (It worked for a saxpy test.)

float **d_test; // d_test is a pointer to a memory location containing a pointer to a float object
float *d_test[1]; // d_test is an array of size one, each element of which is a pointer to a float object

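In the second case, when you pass d_test to a function, it “decays” into a pointer to d_test[0], so you are in fact passing a float ** that points to valid storage. The call is therefore correct, and the allocated pointer ends up in d_test[0].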
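
Here is a small host-only sketch of that decay in plain C. As an assumption for illustration, mock_alloc_floats is a hypothetical malloc-based stand-in for cudaMalloc (typed float ** rather than void **), and d_arr, alloc_into_array and decays_to_first_element are made-up names.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for cudaMalloc (NOT a CUDA API): stores a newly
   allocated host buffer through the out-parameter. Returns 0 on success. */
int mock_alloc_floats(float **out, size_t count)
{
    if (out == NULL)
        return 1;
    *out = (float *)malloc(count * sizeof(float));
    return (*out != NULL) ? 0 : 2;
}

static float *d_arr[1];  /* array of one float *; the array itself is real storage */

/* Passing the array name: it decays to &d_arr[0], a float ** pointing at
   the array's first (and only) element. */
int alloc_into_array(size_t count)
{
    return mock_alloc_floats(d_arr, count);
}

/* Returns nonzero when the decayed array name and &d_arr[0] are the same address. */
int decays_to_first_element(void)
{
    float **decayed = d_arr;
    return decayed == &d_arr[0];
}
```

After alloc_into_array(8) succeeds, the buffer is reached through d_arr[0], which is the extra indexing step the array variant costs compared with a plain float * and the & operator.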

I would argue that the first variant is more idiomatic, and should thus be preferred, unless you really need an array of multiple pointers.

There is no magic here; you just have to become familiar with the C/C++ pointer concept, which has been around for forty years. CUDA is a language in the C++ family. There is a program called cdecl that can parse declarations, and there is even an online version at cdecl.org.

Ok!! Thanks