Range profiling: "No ranges were profiled."

seabassokranetwork · March 2, 2023, 3:59pm

Hi,

I can’t get the Nsight Compute profiler to capture ranges, because calls to cudaProfilerStart/cudaProfilerStop seem to be ignored. Regular profiling with ncu works otherwise, and nsys can intercept the profiler calls correctly. What am I doing wrong? Is there a workaround?
I want to profile 2 kernels that must run in parallel.

Thanks a lot

Minimal example:

// hello_cuda.cu
#include <cstdio>  // printf is used on both host and device
#include "cuda_profiler_api.h"

__global__ void cuda_hello(){
    printf("Hello World from GPU!\n");
}

int main() {
    cudaProfilerStart();
    cuda_hello<<<1,1>>>();
    cudaDeviceSynchronize();
    cudaProfilerStop();
    printf("Hello from CPU\n");
    return 0;
}

Output:

$ nvcc -arch=sm_80 hello_cuda.cu -o hello_cuda
$ CUDA_VISIBLE_DEVICES=0 TMPDIR=. /scratch/XXXX/NVIDIA-Nsight-Compute-2023.1/ncu --export report.ncu-rep --force-overwrite --replay-mode range ./hello_cuda
==WARNING== Please consult the documentation for current range-based replay mode limitations and requirements.
==PROF== Connected to process 322593 (/scratch/XXXX/xformers/scripts/hello_cuda)
Hello World from GPU!
Hello from CPU
==PROF== Disconnected from process 322593
==WARNING== No ranges were profiled.
==WARNING== Profiling ranges launched by child processes requires the --target-processes all option.

Range profiling with nsys works:

$ nsys profile --capture-range=cudaProfilerApi ./hello_cuda
Warning: LBR backtrace method is not supported on this platform. DWARF backtrace method will be used.
Capture range started in the application.
Hello World from GPU!
Generating '/tmp/nsys-report-4070.qdstrm'
Capture range ended in the application.
[1/1] [========================100%] report1.nsys-rep
Generated:
    /scratch/XXXX/xformers/scripts/report1.nsys-rep

Setup:

$ nvidia-smi -i 0
Thu Mar  2 15:55:29 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-SXM...  On   | 00000000:10:1C.0 Off |                    0 |
| N/A   27C    P0    52W / 400W |      0MiB / 40960MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

felix_dt · March 6, 2023, 9:24am

The issue is likely that no CUDA context is active on this thread at the point where cudaProfilerStart is called. As suggested in the range replay documentation, you can try using the driver API calls cuProfilerStart/cuProfilerStop instead.
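
For illustration, here is a minimal sketch of that suggestion applied to the example above. It assumes that a call to cudaFree(0) before the range is enough to make the runtime create its context on this thread (a common idiom, not something confirmed in this thread), and the file name hello_cuda_ctx.cu is hypothetical. Whether this actually removes the "No ranges were profiled." warning on this setup has not been verified here.

```cuda
// hello_cuda_ctx.cu (hypothetical file name)
#include <cstdio>
#include <cuda.h>          // driver API types (CUresult, CUDA_SUCCESS)
#include <cudaProfiler.h>  // cuProfilerStart / cuProfilerStop
#include <cuda_runtime.h>

__global__ void cuda_hello() {
    printf("Hello World from GPU!\n");
}

int main() {
    // Assumption: this forces the runtime to create its context
    // on the calling thread before the range starts.
    cudaFree(0);

    // Driver API range start; report a failure instead of ignoring it.
    CUresult res = cuProfilerStart();
    if (res != CUDA_SUCCESS) {
        printf("cuProfilerStart failed with code %d\n", (int)res);
    }

    cuda_hello<<<1, 1>>>();
    cudaDeviceSynchronize();

    // Driver API range stop, called on the same thread as the start.
    res = cuProfilerStop();
    if (res != CUDA_SUCCESS) {
        printf("cuProfilerStop failed with code %d\n", (int)res);
    }

    printf("Hello from CPU\n");
    return 0;
}
```

Because this calls the driver API directly, it has to be linked against the driver library, for example with nvcc -arch=sm_80 hello_cuda_ctx.cu -o hello_cuda_ctx -lcuda. The ncu command line stays the same as in the original post.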

user13825 · August 7, 2024, 7:21am

It also does not work when I use the CUDA driver API. Could you please give some other advice?
