Driver API and Runtime API interoperability

You are not checking error codes; if you were, you would see that the driver API has not been properly initialized (cuInit, cuCtxCreate, …). If you start with cudaFree(0) or a similar runtime call that creates a context, your app will work properly. If you start with cuMemAlloc instead (and no error checking), that call and every subsequent driver call will return CUDA_ERROR_NOT_INITIALIZED.
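A minimal sketch of both points, error checking and initialization order, might look like the following. The macro names CHECK_CU and CHECK_CUDA and the helper init_via_driver are hypothetical and only for illustration. The helper uses cuDevicePrimaryCtxRetain plus cuCtxSetCurrent rather than cuCtxCreate, as an assumption about what you want when mixing APIs, since the primary context is the one the runtime API uses.

```cuda
#include <cuda.h>          // driver API (cu* functions)
#include <cuda_runtime.h>  // runtime API (cuda* functions)
#include <cstdio>
#include <cstdlib>

// Hypothetical helper macro: abort with the error name if a driver call fails.
#define CHECK_CU(call)                                                \
    do {                                                              \
        CUresult err_ = (call);                                       \
        if (err_ != CUDA_SUCCESS) {                                   \
            const char *name_ = nullptr;                              \
            cuGetErrorName(err_, &name_);                             \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    name_ ? name_ : "unknown error");                 \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical helper macro: same idea for runtime API calls.
#define CHECK_CUDA(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Alternative to the runtime call: initialize the driver explicitly and
// make the device's primary context current on this thread.
// (Assumption: device 0; the matching cuDevicePrimaryCtxRelease is
// omitted for brevity.)
static void init_via_driver() {
    CUdevice dev;
    CUcontext ctx;
    CHECK_CU(cuInit(0));
    CHECK_CU(cuDeviceGet(&dev, 0));
    CHECK_CU(cuDevicePrimaryCtxRetain(&ctx, dev));
    CHECK_CU(cuCtxSetCurrent(ctx));
}

int main(int argc, char **argv) {
    if (argc > 1) {
        // Any command-line argument selects explicit driver initialization.
        init_via_driver();
    } else {
        // A runtime call such as cudaFree(0) initializes the driver and
        // creates the context for us.
        CHECK_CUDA(cudaFree(0));
    }

    // With a context current, the driver API call now succeeds. Without
    // either initialization path above, CHECK_CU would report the failure
    // here instead of letting it pass silently.
    CUdeviceptr dptr;
    CHECK_CU(cuMemAlloc(&dptr, 1024));
    CHECK_CU(cuMemFree(dptr));

    printf("driver API allocation succeeded\n");
    return 0;
}
```

Something like nvcc example.cu -lcuda should build it; the driver library has to be linked explicitly because the runtime does not pull in the cu* symbols for you.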