Using QdstrmImporter

hwilper · September 24, 2018, 4:47pm

When you run Nsight Systems using the command line, the CLI generates a .qdstrm file.

This .qdstrm file is an intermediate result file, not intended for multiple imports. It needs to be processed, either by importing it into the GUI or by using the standalone QdstrmImporter to generate an optimized .qdrep file. Use this .qdrep file when re-opening the result on the same machine, opening the result on a different machine, or sharing results with teammates.

Importing very large (multi-gigabyte) .qdstrm files may consume all of the memory on the host computer and lock up the system. We are working to improve this, but many users have setups where the target computer is much more powerful than the host they use for visualization.

Using QdstrmImporter gives you the ability to script the .qdrep generation (on the host) or generate the .qdrep file where you have the most resources. Note that the CLI and QdstrmImporter versions must match to convert a .qdstrm file into a .qdrep file. This .qdrep file can then be opened in the same version or more recent versions of the GUI.

To run QdstrmImporter on the host system, find the QdstrmImporter binary in the Host-x86_64 directory in your installation. QdstrmImporter is available for all host platforms. See options below.

To run QdstrmImporter on the target system, copy the Linux Host-x86_64 directory to the target Linux system or install Nsight Systems for Linux host directly on the target. The Windows or macOS host QdstrmImporter will not work on a Linux target. See options below.

QdstrmImporter Options:
-h or --help ---- Help message providing information about available options and their parameters.
-v or --version ---- Output QdstrmImporter version information
-i or --input-file [filename or path] ---- Import .qdstrm file and generate a .qdrep file with the same name and in the same location.
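
As a rough illustration of scripting the conversion, here is a minimal Python sketch. It assumes only what the options above state: the importer takes `--input-file` and writes a .qdrep with the same name next to the input. The function names and the paths passed in are hypothetical; point `importer` at the QdstrmImporter binary in your own Host-x86_64 directory.

```python
import subprocess
from pathlib import Path
from typing import List


def importer_command(importer: Path, qdstrm: Path) -> List[str]:
    """Build the argument list for one QdstrmImporter run."""
    return [str(importer), "--input-file", str(qdstrm)]


def expected_report(qdstrm: Path) -> Path:
    """The .qdrep is expected next to the input, with the same base name."""
    return qdstrm.with_suffix(".qdrep")


def convert(importer: Path, qdstrm: Path) -> Path:
    """Run the importer and return where the .qdrep should have been written.

    check=True raises CalledProcessError if the importer exits non-zero.
    """
    subprocess.run(importer_command(importer, qdstrm), check=True)
    return expected_report(qdstrm)
```

Calling `convert(...)` in a loop over a directory of .qdstrm files is one way to batch the import on whichever machine has the most memory.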

mengda.yang · November 7, 2018, 12:59am

Hi, I encountered this error importing an 800MB qdstrm file.

Importing...
Import Failed with unexpected exception: /build/agent/work/20a3cfcd1c25021d/QuadD/Host/QdstrmImporter/main.cpp(36): Throw in function void {anonymous}::RunImport(boost::filesystem::path)
Dynamic exception type: boost::exception_detail::clone_impl<QuadDCommon::RuntimeException>
std::exception::what: RuntimeException
[QuadDCommon::tag_error_text*] = Status: TargetProfilingFailed
Props {
  Items {
    Type: DeviceId
    Value: "Local (CLI)"
  }
}
Error {
  Type: RuntimeError
  SubError {
    Type: ProcessEventsError
    Props {
      Items {
        Type: ErrorText
        Value: "/build/agent/work/20a3cfcd1c25021d/QuadD/Host/Analysis/SymbolAnalyzer/SymbolAnalyzer.cpp(216): Throw in function virtual QuadDSymbolAnalyzer::SymbolInfoLight QuadDSymbolAnalyzer::SymbolAnalyzer::RemoteResolveSymbol(QuadDCommon::TransferrableProcessId, const QuadDTimestamp&, uint64_t, bool)\nDynamic exception type: boost::exception_detail::clone_impl<QuadDCommon::TimeoutException>\nstd::exception::what: TimeoutException\n[QuadDCommon::tag_error_text*] = Wait time to resolve symbol expired\n"
      }
    }
  }
}

hwilper · November 7, 2018, 2:46pm

There are two things that may have happened here.

1. Is this QdstrmImporter from the same version of the tool that ran the collection? The .qdstrm format is a temporary format and is neither backward nor forward compatible. It should be converted (by the script or by importing into the GUI) using the same version of the product.
2. That is a big .qdstrm file. Is it possible that your system ran out of memory while processing it? This can especially happen if you do a long collection on a big system (DGX or cluster node) and try to process it on a smaller system. How long was the run, and how big was the system?
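
For the version question, a small Python sketch of how the check could be automated. It assumes, without confirmation from this thread, that the version text printed by the CLI and by `QdstrmImporter --version` each contain a dotted version number; the helper names are hypothetical and the functions only compare text you pass in.

```python
import re
from typing import Optional

# A dotted run of integers such as "2018.3.1.29" (assumed version format).
_VERSION = re.compile(r"\d+(?:\.\d+)+")


def extract_version(text: str) -> Optional[str]:
    """Return the first dotted version number found in text, or None."""
    match = _VERSION.search(text)
    return match.group(0) if match else None


def versions_match(cli_output: str, importer_output: str) -> bool:
    """True only if both outputs contain a version and the two are identical."""
    cli = extract_version(cli_output)
    importer = extract_version(importer_output)
    return cli is not None and cli == importer
```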

mengda.yang · November 7, 2018, 7:35pm

Hi hwilper,

Thanks for the quick response! To answer your questions:

1. Yes, they are definitely from the same version of the tool. The installation package filename is NVIDIA_Nsights_Systems_Linux_2018.3.1.29.run
2. At first, I thought that might be the case when I ran the application on my local machine. But the same error occurs when I run this on the computing server with 256GiB of RAM. Glancing at htop, the memory usage was far from 100% (around 32GiB total, mostly from the importer) before the importer crashed.

hwilper · November 7, 2018, 7:58pm

Okay, you are hitting a timeout on the symbol resolution.
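
That reading comes from the innermost exception in the log: the outer RuntimeException wraps a TimeoutException raised in the symbol analyzer. A minimal Python sketch of pulling that out of a saved log, assuming the log has the same shape as the one posted above (the function name is hypothetical, and this is not an official parser for the importer output):

```python
import re
from typing import Optional, Tuple

# Matches e.g. "Dynamic exception type: ...clone_impl<QuadDCommon::TimeoutException>"
_EXC = re.compile(r"Dynamic exception type: [\w:]+<([\w:]+)>")
# Matches the text after "[QuadDCommon::tag_error_text*] = ", stopping at a real
# newline, a quote, or the backslash of a literal "\n" inside a quoted value.
_TEXT = re.compile(r'tag_error_text\*\] = ([^\\\n"]+)')


def innermost_exception(log: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (exception type, error text) of the last exception in the log.

    Nested errors appear later in the log, so the last match is the innermost.
    """
    types = _EXC.findall(log)
    texts = _TEXT.findall(log)
    return (
        types[-1] if types else None,
        texts[-1].strip() if texts else None,
    )
```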

Is there any chance that you could put the .qdstrm someplace I could download it from? We’d like to try it on our end. We’d also like the exact CLI command line you used to generate it, if possible.

hwilper · November 7, 2018, 8:11pm

Alternatively,

Can you run for a sorter duration?
Can you run with fewer trace options (or skip sampling)?
You are hitting the timeout because of the OS runtime trace. Skipping just that trace will probably get you past this.

mengda.yang · November 7, 2018, 9:16pm

Hi hwilper,

Thanks for your help!
Unfortunately, due to our company security policies, I cannot share the .qdstrm file with you.

The (obfuscated) command is
LD_LIBRARY= ./nsys profile -o <filename.qdstrm>
This is my first time using nsys, so I used the default settings.

Also, regarding your previous suggestions:

Can you run for a sorter duration?
What does “sorter duration” mean and how do I run for one?
Can you run with fewer trace options (or skip sampling)?
Do you mean sample less frequently? I think this is worth a try if there is such a flag.
My application is written in plain CUDA. I don’t think --trace will be very helpful, but I’ll rerun the CLI with -t=cuda.
You are hitting the timeout because of the OS runtime trace. Skipping just that trace will probably get you past this.
How do I do that? What trace are you referring to?

Thanks again for the reply!

hwilper · November 7, 2018, 9:32pm

Okay. First of all “sorter duration” should have been “shorter duration” which would have made more sense …

But, since you are mostly interested in CUDA, I would change your command line so that instead of the default trace (which includes the OS Runtime trace, which is causing your issue) you use:

nsys profile --trace=cuda,nvtx -o my_test_output [application] [application-arguments]

It will trace all the CUDA APIs (on the CPU and GPU) and do normal sampling.

skottapalli · November 7, 2018, 9:39pm

Hi mengda.yang,

The CLI command you have used launches the application and profiles until the app exits.
./nsys profile -o <filename.qdstrm>
You are relying on the default CLI options to profile your application here. By default, the CLI traces CUDA, OpenGL, NVTX, and OS runtime (osrt) APIs, and CPU sampling is turned on. To see the list of CLI options and their defaults, see the output of ./nsys profile --help

In order to profile for a shorter duration, use the --duration=X switch. This was hwilper’s first suggestion.

In order to trace only CUDA APIs, use the --trace=cuda option. This turns off tracing of APIs from all the other libraries. To turn off CPU sampling, use --sample=none. This was hwilper’s second suggestion.
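
Putting the suggestions in this thread together, here is a minimal Python sketch that assembles an `nsys profile` command line from the switches mentioned above (`--trace`, `--sample`, `--duration`, `-o`). The function name, its defaults, and the example application path are hypothetical; check `nsys profile --help` for the options your version actually supports.

```python
from typing import List, Optional, Sequence


def nsys_profile_command(
    app: Sequence[str],
    output: str,
    trace: Sequence[str] = ("cuda",),
    sample: Optional[str] = None,
    duration: Optional[int] = None,
    nsys: str = "nsys",
) -> List[str]:
    """Build an argument list for 'nsys profile' with a reduced trace set.

    app      -- the application and its arguments, e.g. ["./my_app", "--n", "4"]
    trace    -- APIs to trace; leaving out "osrt" skips the OS runtime trace
    sample   -- pass "none" to turn CPU sampling off; None keeps the default
    duration -- value for --duration=X; None profiles until the app exits
    """
    cmd = [nsys, "profile", "--trace=" + ",".join(trace)]
    if sample is not None:
        cmd.append(f"--sample={sample}")
    if duration is not None:
        cmd.append(f"--duration={duration}")
    cmd += ["-o", output]
    cmd += list(app)  # the application always comes last
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to launch the collection.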
