I am new to nsys, so I started with a classic profile run to test my two-GPU distributed training model. I first wrote a simple Python file containing part of my calculation, and the report was created successfully. But when I moved on to the full training model, the import failed:
*Generating '/tmp/nsys-report-51a3.qdstrm'*
*[1/6] [11% ] report10.nsys-rep*
*Importer error status: Importation failed.*
*Import Failed with unexpected exception: /dvs/p4/build/sw/devtools/Agora/Rel/QuadD_Main/QuadD/Host/QdstrmImporter/main.cpp(34): Throw in function {anonymous}::Importer::Importer(const boost::filesystem::path&, const boost::filesystem::path&)*
*Dynamic exception type: boost::wrapexcept<QuadDCommon::RuntimeException>*
*std::exception::what: RuntimeException*
*[QuadDCommon::tag_message*] = Status: AnalysisFailed*
*Error {*
* Type: RuntimeError*
* SubError {*
* Type: InvalidArgument*
* Props {*
* Items {*
* Type: OriginalExceptionClass*
* Value: "N5boost10wrapexceptIN11QuadDCommon24InvalidArgumentExceptionEEE"*
* }*
* Items {*
* Type: OriginalFile*
* Value: "/dvs/p4/build/sw/devtools/Agora/Rel/QuadD_Main/QuadD/Host/Analysis/Modules/EventCollection.cpp"*
* }*
* Items {*
* Type: OriginalLine*
* Value: "1055"*
* }*
* Items {*
* Type: OriginalFunction*
* Value: "void QuadDAnalysis::EventCollection::CheckOrder(QuadDAnalysis::EventCollectionHelper::EventContainer&, const QuadDAnalysis::ConstEvent&) const"*
* }*
* Items {*
* Type: ErrorText*
* Value: "Wrong event order has been detected when adding events to the collection:\nnew event ={ StartNs=46074551461 StopNs=46074555849 GlobalId=283992901438016 Event={ TraceProcessEvent=[{ Correlation=13351336 EventClass=0 TextId=69 ReturnValue=0 },] } Type=48 }\nlast event ={ StartNs=46418590268 StopNs=46418605526 GlobalId=283992901438016 Event={ TraceProcessEvent=[{ Correlation=13493697 EventClass=0 TextId=69 ReturnValue=0 },] } Type=48 }"*
* }*
* }*
* }*
*}*
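To make the ErrorText easier to read, here is a minimal Python sketch that pulls the two timestamps out of the message and compares them. The excerpt is shortened from the log above (the `Event={...}` payload and `Type` fields are dropped), and the helper names `parse_events` and `out_of_order_gap_ns` are made up for illustration; they are not part of any nsys tooling.

```python
import re

# Excerpt of the ErrorText from the importer log, with the Event payload omitted.
ERROR_TEXT = (
    "new event ={ StartNs=46074551461 StopNs=46074555849 GlobalId=283992901438016 }\n"
    "last event ={ StartNs=46418590268 StopNs=46418605526 GlobalId=283992901438016 }"
)

# Matches the leading fields of each event description in the message.
EVENT_RE = re.compile(r"(new|last) event =\{ StartNs=(\d+) StopNs=(\d+) GlobalId=(\d+)")


def parse_events(text):
    """Return {'new': {...}, 'last': {...}} with integer nanosecond timestamps."""
    events = {}
    for kind, start, stop, gid in EVENT_RE.findall(text):
        events[kind] = {
            "start_ns": int(start),
            "stop_ns": int(stop),
            "global_id": int(gid),
        }
    return events


def out_of_order_gap_ns(events):
    """Nanoseconds by which the new event starts before the last one.

    A positive value means the new event is older than the event already
    stored, which is the condition the importer complains about.
    """
    return events["last"]["start_ns"] - events["new"]["start_ns"]


events = parse_events(ERROR_TEXT)
gap = out_of_order_gap_ns(events)
print("same GlobalId:", events["new"]["global_id"] == events["last"]["global_id"])
print(f"new event starts {gap} ns ({gap / 1e6:.1f} ms) before the last event")
```

Both events carry the same GlobalId, and the new one starts roughly a third of a second before the one already in the collection, so the importer is seeing events from a single source arrive out of time order.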
A .qdstrm file is created, but it cannot be imported manually in the GUI. When I moved it to a Windows machine with the same version and tried to open it there, I got a different error: "The report was possibly created with a newer version of NVIDIA Nsight Systems. Please upgrade to the latest version and try again."
The CPU test works fine when I run the training entirely on the CPU. So the import fails only when CUDA is involved, regardless of how many CUDA devices are used. I also tried adding --trace=cuda, but it made no difference.
I am running Ubuntu 22.10, but I am using the Ubuntu 22.04 build of 2024.1.1, which worked for the successful cases above. Other information:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03 Driver Version: 535.54.03 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
Hope to hear from you! Thank you so much for any help and suggestions.
Can you tell me what version of Nsys you are using? If it is just the one from the CUDA Toolkit, can you update to the newest version available at developer.nvidia.com/nsight-systems?
Are you running this from the command line or the GUI? I am assuming the command line. Can you give me the command line you used to generate the .qdstrm file?
Hi hwilper, I am using the newest version, 2024.1. I have used both the CLI and the GUI, and both report the same error. The command line runs a training script: nsys profile --stats=true ./Train_Ranks_GPU_01.sh
Actually, I think I found a trick to get rid of this error: when I manually stop the training run, the report is generated successfully. I have no idea why this works, but I hope it helps someone in the future.
It seems to be a problem with the Nsight Systems version. After I rolled back to 2023.2.1.12, the problem disappeared. The version I am using now is 2024.1.1.
That was during the profiling run, correct? Can you try to post-process the .qdstrm file?
Here is the section from the User Guide:
Create .nsys-rep Using QdstrmImporter
The CLI and QdstrmImporter versions must match to convert a .qdstrm file into a .nsys-rep file. This .nsys-rep file can then be opened in the same version or more recent versions of the GUI.
To run QdstrmImporter on the host system, find the QdstrmImporter binary in the Host-x86_64 directory in your installation. QdstrmImporter is available for all host platforms. See options below.
To run QdstrmImporter on the target system, copy the Linux Host-x86_64 directory to the target Linux system or install Nsight Systems for Linux host directly on the target. The Windows or macOS host QdstrmImporter will not work on a Linux Target. See options below.
| Short | Long | Parameter | Description |
|-------|------|-----------|-------------|
| -h | --help | | Help message providing information about available options and their parameters. |
| -v | --version | | Output QdstrmImporter version information. |
| -i | --input-file | filename or path | Import .qdstrm file from this location. |
| -o | --output-file | filename or path | Provide a different file name or path for the resulting .nsys-rep file. Default is the same name and path as the .qdstrm file. |
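As a rough sketch of how the options above fit together, the following Python snippet assembles a QdstrmImporter command line and prints it. The importer path is a placeholder (the real location depends on your installation), the .qdstrm and .nsys-rep names are taken from the log earlier in this thread, and the helper names are hypothetical. Only the `--input-file` and `--output-file` options from the User Guide excerpt are used.

```python
import shlex
import subprocess


def build_importer_command(importer, qdstrm, output=None):
    """Build the argument list for a QdstrmImporter run.

    importer: path to the QdstrmImporter binary (placeholder in this sketch)
    qdstrm:   path to the .qdstrm file to convert
    output:   optional path for the .nsys-rep file; when omitted, the importer
              uses the same name and path as the .qdstrm file
    """
    cmd = [str(importer), "--input-file", str(qdstrm)]
    if output is not None:
        cmd += ["--output-file", str(output)]
    return cmd


def run_importer(importer, qdstrm, output=None):
    """Run the importer and return the CompletedProcess so stderr can be inspected."""
    cmd = build_importer_command(importer, qdstrm, output)
    return subprocess.run(cmd, capture_output=True, text=True, check=False)


# Placeholder binary location; adjust to your own installation.
IMPORTER = "path/to/Host-x86_64/QdstrmImporter"

cmd = build_importer_command(
    IMPORTER,
    "/tmp/nsys-report-51a3.qdstrm",  # file name from the log in this thread
    "report10.nsys-rep",
)
print(shlex.join(cmd))
```

Calling `run_importer(...)` with the same arguments would actually execute the conversion. Remember that the importer version has to match the CLI version that produced the .qdstrm file.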
Hello, what is the output of the nvidia-smi command on the target system? We had a recent bug in CUPTI that caused this kind of out-of-order error. It is fixed in the 2024.2 version of nsys. Could you try the 2024.2 version of nsys, please?
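If you want to check quickly whether an installed version already includes that fix, a small Python sketch like the one below can compare version strings. It assumes the version appears as a dotted number somewhere in the text (for example in the output of `nsys --version`, whose exact format is not shown in this thread), and the function names are hypothetical.

```python
import re

# First nsys release reported in this thread to contain the CUPTI fix.
FIXED_IN = (2024, 2)


def parse_version(text):
    """Extract the first dotted version number from text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def has_cupti_fix(version_text):
    """True if the version is at or above the release said to contain the fix."""
    return parse_version(version_text) >= FIXED_IN


for candidate in ("2023.2.1.12", "2024.1.1", "2024.2"):
    print(candidate, "->", has_cupti_fix(candidate))
```

A result of False only means the version predates the fix. It does not by itself mean the version is affected, since 2023.2.1.12 was reported in this thread to work.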