You are not checking error codes; if you were, you would see that you have not properly initialized the driver API (cuInit, cuCtxCreate, …). If you start with cudaFree(0) or a similar runtime call that creates a context, your app will work properly. If you start out with cuMemAlloc instead (and no error checking), that call and every subsequent driver API call will return CUDA_ERROR_NOT_INITIALIZED.
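Here is a minimal sketch of what checked driver API initialization might look like; it is not a drop-in fix for your code. The CU_CHECK macro and the 1024-byte buffer size are hypothetical names and values chosen for illustration. Note that it makes a context current with cuDevicePrimaryCtxRetain plus cuCtxSetCurrent rather than cuCtxCreate. That is a substitution on my part, made because the cuCtxCreate signature can differ between toolkit versions; if you prefer cuCtxCreate, check the signature in your toolkit's cuda.h.

```c
#include <cuda.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: abort with a readable message if a driver API call fails. */
#define CU_CHECK(call)                                                    \
    do {                                                                  \
        CUresult res_ = (call);                                           \
        if (res_ != CUDA_SUCCESS) {                                       \
            const char *name_ = NULL;                                     \
            cuGetErrorName(res_, &name_);                                 \
            fprintf(stderr, "%s:%d: %s failed: %s (%d)\n", __FILE__,      \
                    __LINE__, #call, name_ ? name_ : "unknown error",     \
                    (int)res_);                                           \
            exit(EXIT_FAILURE);                                           \
        }                                                                 \
    } while (0)

int main(void)
{
    CUdevice dev;
    CUcontext ctx;
    CUdeviceptr d_buf = 0;

    /* Calling into the driver before cuInit: inspect the result instead of ignoring it. */
    CUresult early = cuMemAlloc(&d_buf, 1024);
    printf("cuMemAlloc before cuInit returned %d (CUDA_ERROR_NOT_INITIALIZED is %d)\n",
           (int)early, (int)CUDA_ERROR_NOT_INITIALIZED);

    /* Proper sequence: initialize the driver, pick a device, make a context current. */
    CU_CHECK(cuInit(0));
    CU_CHECK(cuDeviceGet(&dev, 0));
    CU_CHECK(cuDevicePrimaryCtxRetain(&ctx, dev)); /* stands in for cuCtxCreate */
    CU_CHECK(cuCtxSetCurrent(ctx));

    /* With a current context, the allocation is expected to succeed. */
    CU_CHECK(cuMemAlloc(&d_buf, 1024));
    CU_CHECK(cuMemFree(d_buf));

    /* Release the primary context retained above. */
    CU_CHECK(cuDevicePrimaryCtxRelease(dev));
    return 0;
}
```

You would need to link against the driver library (typically -lcuda) in addition to whatever you already link for the runtime. With a check like this wrapped around every cu* call, the missing initialization shows up at the first failing call instead of silently propagating.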