Nsight Systems does not collect CUDA events

Here is an example session:

➜  CUDA_code nsys profile ./kernel_abc
End of file
➜  CUDA_code nsys profile --trace=cuda,cublas ./kernel_abc
Generating '/tmp/nsys-report-1d37.qdstrm'
[1/1] [========================100%] report6.nsys-rep
Generated:
    /home/alexandre/Code/CUDA_code/report6.nsys-rep
➜  CUDA_code nsys stats report6.nsys-rep 
Generating SQLite file report6.sqlite from report6.nsys-rep
Exporting 9521 events: [===================================================100%]
Using report6.sqlite for SQL queries.
Running [/opt/nvidia/nsight-systems/2021.5.1/target-linux-x64/reports/nvtxsum.py report6.sqlite]... SKIPPED: report6.sqlite does not contain NV Tools Extension (NVTX) data.

Running [/opt/nvidia/nsight-systems/2021.5.1/target-linux-x64/reports/osrtsum.py report6.sqlite]... SKIPPED: report6.sqlite does not contain OS Runtime trace data.

Running [/opt/nvidia/nsight-systems/2021.5.1/target-linux-x64/reports/cudaapisum.py report6.sqlite]... 

 Time (%)  Total Time (ns)  Num Calls     Avg (ns)         Med (ns)        Min (ns)       Max (ns)     StdDev (ns)           Name         
 --------  ---------------  ---------  ---------------  ---------------  -------------  -------------  ------------  ---------------------
     94.1    2,138,206,904          1  2,138,206,904.0  2,138,206,904.0  2,138,206,904  2,138,206,904           0.0  cudaDeviceSynchronize
      5.9      135,082,224          2     67,541,112.0     67,541,112.0        159,657    134,922,567  95,291,767.5  cudaMalloc           
      0.0           21,788          2         10,894.0         10,894.0          5,067         16,721       8,240.6  cudaMemset           
      0.0           14,896          3          4,965.3          3,223.0            666         11,007       5,386.2  cudaLaunchKernel     

Running [/opt/nvidia/nsight-systems/2021.5.1/target-linux-x64/reports/gpukernsum.py report6.sqlite]... 

 Time (%)  Total Time (ns)  Instances     Avg (ns)         Med (ns)        Min (ns)       Max (ns)     StdDev (ns)                   Name                  
 --------  ---------------  ---------  ---------------  ---------------  -------------  -------------  -----------  ---------------------------------------
     99.9    2,136,014,745          1  2,136,014,745.0  2,136,014,745.0  2,136,014,745  2,136,014,745          0.0  kernel_A(double *, int, int)           
      0.1        1,086,041          1      1,086,041.0      1,086,041.0      1,086,041      1,086,041          0.0  kernel_C(double *, const double *, int)

Running [/opt/nvidia/nsight-systems/2021.5.1/target-linux-x64/reports/gpumemtimesum.py report6.sqlite]... 

 Time (%)  Total Time (ns)  Count  Avg (ns)   Med (ns)   Min (ns)  Max (ns)  StdDev (ns)    Operation  
 --------  ---------------  -----  ---------  ---------  --------  --------  -----------  -------------
    100.0        1,121,392      2  560,696.0  560,696.0   543,084   578,308     24,907.1  [CUDA memset]

Running [/opt/nvidia/nsight-systems/2021.5.1/target-linux-x64/reports/gpumemsizesum.py report6.sqlite]... 
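
To double-check what was actually captured, independently of the summary scripts, the exported SQLite file can be inspected directly. The following is only a minimal sketch using Python's standard `sqlite3` module: it lists the table names in `report6.sqlite` (the file written by `nsys stats` above). The helper name `list_tables` is hypothetical, and I am not certain which table, if any, would hold CUDA event records, so the script makes no assumption about table names.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in an SQLite database, sorted by name."""
    # Note: sqlite3.connect creates an empty database if db_path does not exist.
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        con.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    import sys

    # Default to the file generated by `nsys stats` in the session above.
    path = sys.argv[1] if len(sys.argv) > 1 else "report6.sqlite"
    for name in list_tables(path):
        print(name)
```

Run from the directory that contains the export, this prints one table name per line, which makes it easy to see whether anything event-related was recorded at all.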

The executable was compiled with nvcc from this code: