Running sample gives `code=999(cudaErrorUnknown)`, how can this be solved?

I'm trying to run the fluidsGL sample with CUDA 10.1, but I always get this error:

CUDA error at fluidsGL.cpp:472 code=999(cudaErrorUnknown) "cudaGraphicsGLRegisterBuffer(&cuda_vbo_resource, vbo, cudaGraphicsMapFlagsNone)" 

I've got a Thinkpad T480s laptop running 20.04 with a dedicated Nvidia MX150 and an onboard Intel UHD 620. Because of the dual graphics (I want Intel driving the display and Nvidia running heavy programs; I do this with off-loading), I figured it would be best to install CUDA 10.1 from the .run file. So I followed the installation steps, but when I try to verify the installation by running the sample, I get the above error.

Too often I ran into other errors, workarounds, and glitches that left me restoring a timeshift snapshot and redoing the whole installation. Then I saw that people just use sudo apt-get install nvidia-cuda-toolkit to install CUDA, so I did that too. Typing nvcc -V gave me the correct CUDA version.

But then building one of the samples failed with make: /usr/local/cuda/bin/nvcc: command not found.
Using which nvcc, I found that nvcc was installed at /usr/bin/nvcc instead, so, naively, I created a symlink:
sudo ln -s /usr/bin/nvcc /usr/local/cuda/bin/nvcc

After that, the sample built without errors (once I had installed freeglut3, since I got the error /usr/bin/ld: -lglut could not be found; collect2: error: ld returned 1 exit status!).

Running

import torch
torch.cuda.is_available()

in an ipynb cell returns True, which leads me to think CUDA works.

However, running the fluidsGL example still gives me the above code=999(cudaErrorUnknown) error. Building it works.

Was what I did even feasible? Or is there something I missed?

How can I solve the error?

Cheers

Solved the issue after 2 days of research.

Turns out it's simpler than I thought; I just didn't know all the bits and pieces of Nvidia Prime (and off-loading in particular).
Setting up Prime in on-demand mode will automatically offload the computational work to the Nvidia GPU and let the Intel GPU simply handle the display. However, this does not apply to graphical programs (like that sample); for these, off-loading needs to be properly configured (follow https://download.nvidia.com/XFree86/Linux-x86_64/430.40/README/randr14.html precisely; this video helped me: https://www.youtube.com/watch?v=G2ZbIXvLGV8).
So whenever I want to run a graphical program and let the Nvidia GPU do all the heavy lifting, I need to run __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia <command>, where <command> is the graphical program to run with off-loading.
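
In case it helps anyone, here is a rough Python sketch of the same idea, for launching from a script or notebook instead of typing the variables each time. The two environment variables are the ones from above; the helper names (offload_env, run_offloaded, renderer_line) are made up for illustration, and the check assumes glxinfo (from mesa-utils) is installed. I'd expect the renderer line to name the Nvidia GPU once off-loading works, but treat that as an assumption to verify on your own machine.

```python
import os
import subprocess

# The two variables from the post, with the values given there.
OFFLOAD_VARS = {
    "__NV_PRIME_RENDER_OFFLOAD": "1",
    "__GLX_VENDOR_LIBRARY_NAME": "nvidia",
}


def offload_env(base=None):
    """Return a copy of the environment with the offload variables added."""
    env = dict(os.environ if base is None else base)
    env.update(OFFLOAD_VARS)
    return env


def run_offloaded(command):
    """Run a graphical command (given as a list of arguments) with off-loading."""
    return subprocess.run(command, env=offload_env(), check=False)


def renderer_line(glxinfo_output):
    """Extract the value of the 'OpenGL renderer string' line from glxinfo output."""
    for line in glxinfo_output.splitlines():
        if line.strip().startswith("OpenGL renderer string"):
            return line.split(":", 1)[1].strip()
    return None


# Example usage (commented out because it needs a running X session):
# out = subprocess.run(["glxinfo"], env=offload_env(),
#                      capture_output=True, text=True).stdout
# print("Renderer with off-loading:", renderer_line(out))
# run_offloaded(["./fluidsGL"])  # hypothetical path to the built sample
```

The environment is copied rather than modified in place, so only the launched program is off-loaded and everything else keeps running on the Intel GPU.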