Hello,
We are having some trouble profiling our code with the Nsight Systems GUI and were hoping someone could help us resolve it. We came across two problems: the first is that we cannot get nsys to run properly from the CLI, and the second is that the GUI does not detect our use of NVTX.
At first we ran Nsight Systems as administrator and, under Target application, entered the test script name (.\prof_test.py) in the “Command line and arguments” field and its path in the “working directory” field. This resulted in an error. The full error log is given below.
I understand this is a very basic problem; however, we also tried running it the same way from Windows PowerShell in the script's working directory, after adding nsys to the Windows Path environment variable, using:
nsys profile -t nvtx,cuda --stats=true --force-overwrite=true --output=dp .\prof_test.py
This, however, resulted in the same error message.
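One variation we have not ruled out: since the error mentions a missing “#!” sequence, nsys may be trying to launch the .py file directly as an executable rather than through the Python interpreter. Below is a minimal sketch that builds the same command with the interpreter passed explicitly before the script. The helper name build_nsys_command is hypothetical, the flags are the ones we used above, and whether this actually resolves the error is an assumption on our part.

```python
import sys


def build_nsys_command(script, output="dp", interpreter=sys.executable):
    """Return the nsys argument list with the interpreter placed before the script."""
    return [
        "nsys", "profile",
        "-t", "nvtx,cuda",          # same trace selection as in our command
        "--stats=true",
        "--force-overwrite=true",
        f"--output={output}",
        interpreter,                # assumption: nsys needs a real executable here
        script,
    ]


if __name__ == "__main__":
    # Show the argument list; it could also be passed to subprocess.run
    print(build_nsys_command("prof_test.py"))
```

In the GUI, the equivalent would presumably be to put the interpreter and the script name together in the command line field, but we have not confirmed this.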
Does using the GUI negate manually added CLI arguments? For example, there is a check box for collecting NVTX traces. Does the GUI replace other CLI arguments, such as the “profile” command?
Can you please give an example of the proper syntax that is compatible with the GUI?
When running the following command using the GUI:
"nsys" profile "-t nvtx,cuda" --stats=true --force-overwrite=true --output=dp ./prof_test.py
the profiler produced a warning stating: “no NVTX events collected. Does the process uses NVTX?”
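One thing we noticed about the command above is that the quotes group `-t nvtx,cuda` into a single argument. The sketch below uses Python's shlex module to show how a POSIX-style shell would split that line. How the GUI or Windows actually tokenizes the field is an assumption, so this only illustrates a possible cause and is not a confirmed diagnosis.

```python
import shlex

# The command exactly as we entered it in the GUI
gui_field = (
    '"nsys" profile "-t nvtx,cuda" --stats=true '
    "--force-overwrite=true --output=dp ./prof_test.py"
)
quoted_args = shlex.split(gui_field)

# The same trace option without the surrounding quotes
unquoted_args = shlex.split("nsys profile -t nvtx,cuda --stats=true")

# With quotes, the option and its value arrive as one token;
# without quotes, they arrive as two separate tokens
print(quoted_args)
print(unquoted_args)
```

If the option and its value reach nsys as one token, the trace selection may not be parsed as intended.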
The script (see attached prof_test.zip, 468 bytes), which is based on this blog post, has NVTX annotations which are supposed to produce NVTX events.
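For readers who cannot open the attachment, here is a minimal sketch of the kind of annotation we mean. It is not the contents of the attached script: it assumes the `nvtx` package from PyPI, the function work and the range names are hypothetical, and a no-op stand-in is used when the package is not installed so the sketch still runs.

```python
from contextlib import contextmanager

try:
    import nvtx  # assumption: the `nvtx` package from PyPI is installed
    annotate = nvtx.annotate
except ImportError:
    @contextmanager
    def annotate(message=None, color=None):
        # No-op stand-in so the sketch runs without the package
        yield


def work(n):
    # Hypothetical workload wrapped in a named NVTX range
    with annotate("sum_squares", color="green"):
        return sum(i * i for i in range(n))


if __name__ == "__main__":
    with annotate("main", color="blue"):
        print(work(1000))
```

With NVTX tracing enabled, we would expect ranges like these to show up in the profiler timeline.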
We would greatly appreciate any help and assistance with this.
An error occurred. Unknown executable format. Is the “#!” sequence missing?
If this error persists, please restart the app and/or reboot the target.
Version information: NVIDIA Nsight Systems, 2024.4.1.61-244134315967v0 Windows-x64
Full error information:
RuntimeError (120) {
RuntimeError (120) {
OriginalExceptionClass: struct boost::wrapexcept
OriginalFile: C:\dvs\p4\build\sw\devtools\Agora\Rel\QuadD_Main\QuadD\Host\Analysis\Clients\AnalysisHelper\AnalysisStatus.cpp
OriginalLine: 79
OriginalFunction: class Nvidia::QuadD::Analysis::Data::AnalysisStatusInfo __cdecl QuadDAnalysis::AnalysisHelper::AnalysisStatus::MakeFromErrorString(enum Nvidia::QuadD::Analysis::Data::AnalysisStatus,enum Nvidia::QuadD::Analysis::Data::AnalysisErrorType::Type,const class std::basic_string<char,struct std::char_traits,class std::allocator > &,const class boost::intrusive_ptr &)
ErrorText: Unknown executable format. Is the “#!” sequence missing?
}
}
Profiling options:
DeviceId: “Local”
EventTypes {
Items: CpuCycles
Items: Cuda
Items: NvtxEvents
}
RateHz: 1000
HowToStart: Immediate
HowToStop: Manual
DeviceType: Windows
DeviceDisplayName: “AmirTDell”
WindowsPerfOptions {
CollectThreadActivity: true
CollectThreadBacktrace: true
retainEtwFiles: true
SymbolSearchVerboseLog: false
AutomaticallyGenerateReportFileNames: false
}
Processes {
HowToAttach: LaunchAnother
Command: “.\prof_test.py”
WorkingDirectory: “C:\Users\amirt\test_again\Sim”
UserName: “amirt”
CollectNvtxTrace: true
CollectCudaTrace: true
CudaFlushPeriodically: true
CudaFlushPeriod: 10000000000
CudaSkipSomeApiCalls: true
CollectGPUMemoryUsage: false
CudaGraphTraceOptions {
Mode: Graph
TraceDeviceGraphLaunch: false
}
CudaFlushOnCudaProfilerStop: true
}
ShowBacktrace: true
IncludeChildren: true
GpuMetricsOptions {
SamplingFrequency: 10000
Gpus {
Id: 0
MetricSetIndex: 6
}
}
SymbolResolutionOptions {
ResolveSymbols: false
}