Nsys not creating nsys-rep file

Hello,
I am trying to profile a simple Python script for PyTorch / deep learning, but instead of getting my .nsys-rep and .sqlite files, I get a .qdstrm file.

I run:

nsys profile \
    --stats=true \
    --force-overwrite true \
    --output output_name \
    --trace=cuda,nvtx,osrt,cudnn,cublas \
    --sample=cpu \
    python my_script.py

Everything seems fine until the end:

Training: 100%|__________| 20/20 [01:06<00:00,  3.30s/it, loss=0.4385]
Generating '/workspace/diffusion_benchmark/tmp/nsys-report-48e1.qdstrm'
[1/8] [=============58%            ] output_name.nsys-rep
Importer error status: An unknown error occurred.
Generated:
        /workspace/diffusion_benchmark/output_name.qdstrm

There seems to be some error when generating the .nsys-rep file (at 58%?).
I see the output is stored in output_name.qdstrm, but I cannot open this with the Nsight Systems graphical interface, so I guess I need the .nsys-rep file. But it says unknown error.

I am running this on a rented machine on vast.ai. I am not sure if there could be any permission issues with the tmp folder or something similar, so I have set export TMPDIR=$(pwd)/tmp just in case.
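For what it's worth, these are generic shell sanity checks (nothing nsys-specific) that can confirm the redirected temp directory is actually usable, since a full or unwritable TMPDIR is a plausible cause of a failed .qdstrm-to-.nsys-rep conversion:

```shell
# Confirm the redirected temp dir exists, is writable, and has free space.
mkdir -p "$(pwd)/tmp"
export TMPDIR="$(pwd)/tmp"
touch "$TMPDIR/.write_test" && rm "$TMPDIR/.write_test" \
    && echo "TMPDIR is writable"
df -h "$TMPDIR"   # check available disk space
```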

I was trying to roll back to a previous nsys version, but I was not sure how, since the archived docs all share the same installation instructions.

NVIDIA Nsight Systems version 2025.1.1.131-251135540420v0
Linux 623e89ef1779 5.15.0-125-generic #135-Ubuntu SMP Fri Sep 27 13:53:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

I am running a multi-GPU script; I am not sure if that matters.

The .qdstrm file is, in fact, the temporary file that nsys generates during the run, which is usually processed into the .nsys-rep file immediately. I suspect that you are correct about it being a permissions issue.

What you can do is run the QdstrmImporter utility yourself on the host side. See User Guide — nsight-systems 2025.1 documentation for details on how to do that (I promise, that is a direct link to CLI troubleshooting; the forum software just munges the title to the top page, and there isn't a thing I can do about it).
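If you are not sure where the importer lives, a quick way to locate the binary is sketched below; the /opt/nvidia prefix is a common default install location, not guaranteed on every system:

```shell
# Locate the QdstrmImporter that ships with the Nsight Systems CLI install.
find /opt/nvidia/nsight-systems -name QdstrmImporter 2>/dev/null
# Then convert manually, e.g.:
#   /opt/nvidia/nsight-systems/<version>/host-linux-x64/QdstrmImporter \
#       -i output_name.qdstrm -o output_name
```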

Thank you for your reply. Unfortunately this showed a similar issue, only now I get a little more info on the error (a segmentation fault).

/opt/nvidia/nsight-systems/2025.1.1/host-linux-x64/QdstrmImporter -i output_name.qdstrm -o output_formatted
Processing [====================================59%                            ]Segmentation fault (core dumped)

EDIT: By host, do you mean the same machine I am running the profiling on?
Right now my workflow consists of running nsys on my server, then scp-ing the output files to my local MacBook, where I inspect them with the Nsight Systems GUI. Are you suggesting I run QdstrmImporter on my MacBook?

Host = where you launch from and read the results.
Target/Device = where the application runs and is profiled.

So in this case we are trying to process on the target, and it is failing. Please try it on your Mac.

I tried to convert it on a different machine (an Ubuntu machine) and that worked, thanks.
However, I ran into more issues. Firstly, the output file was only 100 KB, which I found suspicious. Then, when I inspect it with the GUI, it gives me some errors (see image), and the profiler timeline is empty.
I realised that the paranoid level on my machine was set to 3. I am not sure if this error is caused by that. Unfortunately it's not letting me change the paranoid level, since I am on a rented cloud machine.

There is a chart of which data is enabled at each paranoid level at User Guide — nsight-systems 2025.2 documentation (that's a direct link to the CPU support portion, but the forum software munges the text in a way I cannot control). Basically, since we get pretty much all of the CPU information from the Linux perf subsystem, we get nothing there at 3.
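A quick way to check the setting on the target is sketched below; reading the file needs no privileges, while changing it needs root on the host kernel, which a rented container typically cannot get:

```shell
# Read the current perf_event_paranoid level; at 3, unprivileged
# processes cannot use the perf subsystem at all, so nsys CPU
# sampling collects nothing.
cat /proc/sys/kernel/perf_event_paranoid
# Lowering it requires root on the host kernel (not possible from
# inside most rented containers):
#   sudo sysctl -w kernel.perf_event_paranoid=2
```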

But all of your GPU/API trace data should be available. How big was the .qdstrm file that was converted?