QdstrmImporter killed

Hi, I have a problem when I use “nsys profile” with a large language model such as BERT.

When profiling finishes, nsys generates a .qdstrm file and should convert it to a .nsys-rep file, but the .nsys-rep file does not seem to be generated.

When I run QdstrmImporter to generate the .nsys-rep file from the .qdstrm file, the process is killed without any explanation.

skku@pnode4:~/deepspeed/BERT/script/output$ /home/skku/cuda/cuda-12.1/nsight-systems-2023.1.2/host-linux-x64/QdstrmImporter myreport.qdstrm
Processing [====9% ]Killed

or

Processing [====9% ]terminate called without an active exception
Aborted (core dumped)

The qdstrm file is very large (74 GB). Could you help me solve this problem?

Thanks.
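
For anyone trying to tell the two failures above apart: the shell prints "Killed" when a process receives SIGKILL (on Linux this is often, though not always, the kernel OOM killer) and "Aborted" when it receives SIGABRT, which is what happens after "terminate called without an active exception". Below is a minimal Python sketch that runs the importer and classifies how it ended. The helper names `classify_exit` and `run_importer` are made up for illustration and are not part of Nsight Systems; the sketch assumes a POSIX system.

```python
import signal
import subprocess


def classify_exit(returncode: int) -> str:
    """Map a subprocess return code to a rough diagnosis (POSIX only)."""
    if returncode == 0:
        return "ok"
    if returncode == -signal.SIGKILL:
        # The shell reports this as "Killed"; on Linux it is often the OOM killer.
        return "killed by SIGKILL (possibly the kernel OOM killer)"
    if returncode == -signal.SIGABRT:
        # The shell reports this as "Aborted"; the process called abort().
        return "aborted by SIGABRT (the process called abort())"
    if returncode < 0:
        # subprocess reports death by signal N as return code -N.
        return f"terminated by signal {-returncode}"
    return f"exited with status {returncode}"


def run_importer(importer: str, qdstrm: str) -> str:
    """Run the importer on a .qdstrm file and describe how it ended."""
    result = subprocess.run([importer, qdstrm])
    return classify_exit(result.returncode)
```

Calling `run_importer` with the QdstrmImporter path and `myreport.qdstrm` from the post above would return one of these strings. If it reports SIGKILL, the kernel log (for example via `dmesg`) is the place to check whether the OOM killer was involved.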

Dynamic exception type: struct boost::wrapexcept
std::exception::what: RuntimeException
[struct QuadDCommon::tag_message * __ptr64] = Status: TargetProfilingFailed
Props {
Items {
Type: DeviceId
Value: “Local (CLI)”
}
}
Error {
Type: RuntimeError
SubError {
Type: ProcessEventsError
Props {
Items {
Type: ErrorText
Value: “C:\dvs\p4\build\sw\devtools\Agora\Rel\QuadD_Main\QuadD\Common\Core\FlatData.h(1692): Throw in function class FlatData::ConstObject<class QuadDCommon::FlatComm::Trace::TraceEvent,struct FlatData::SimpleAllocator> __cdecl FlatData::Deserialize<class QuadDCommon::FlatComm::Trace::TraceEvent,struct FlatData::SimpleAllocator,class FlatData::ConstObject>(const void *,unsigned __int64 &)\nDynamic exception type: struct boost::wrapexcept\nstd::exception::what: OutOfRangeException\n[struct QuadDCommon::tag_message * __ptr64] = Supplied input buffer of size 4194447 is too small.\n”
}
}
}
}
Status: TargetProfilingFailed
Props {
Items {
Type: DeviceId
Value: “Local (CLI)”
}
}
Error {
Type: RuntimeError
SubError {
Type: ProcessEventsError
Props {
Items {
Type: ErrorText
Value: “C:\dvs\p4\build\sw\devtools\Agora\Rel\QuadD_Main\QuadD\Common\Core\FlatData.h(1692): Throw in function class FlatData::ConstObject<class QuadDCommon::FlatComm::Trace::TraceEvent,struct FlatData::SimpleAllocator> __cdecl FlatData::Deserialize<class QuadDCommon::FlatComm::Trace::TraceEvent,struct FlatData::SimpleAllocator,class FlatData::ConstObject>(const void *,unsigned __int64 &)\nDynamic exception type: struct boost::wrapexcept\nstd::exception::what: OutOfRangeException\n[struct QuadDCommon::tag_message * __ptr64] = Supplied input buffer of size 3568400 is too small.\n”
}
}
}
}
Status: TargetProfilingFailed
Props {
Items {
Type: DeviceId
Value: “Local (CLI)”
}
}
Error {
Type: RuntimeError
SubError {
Type: ProcessEventsError
Props {
Items {
Type: ErrorText
Value: “C:\dvs\p4\build\sw\devtools\Agora\Rel\QuadD_Main\QuadD\Common\Core\FlatData.h(1692): Throw in function class FlatData::ConstObject<class QuadDCommon::FlatComm::Files::Event,struct FlatData::SimpleAllocator> __cdecl FlatData::Deserialize<class QuadDCommon::FlatComm::Files::Event,struct FlatData::SimpleAllocator,class FlatData::ConstObject>(const void *,unsigned __int64 &)\nDynamic exception type: struct boost::wrapexcept\nstd::exception::what: OutOfRangeException\n[struct QuadDCommon::tag_message * __ptr64] = Supplied input buffer of size 613224 is too small.\n”
}
}
}
}
Status: AnalysisFailed
Error {
Type: RuntimeError
SubError {
Type: OutOfRange
Props {
Items {
Type: OriginalExceptionClass
Value: “struct boost::wrapexcept”
}
Items {
Type: OriginalFile
Value: “C:\dvs\p4\build\sw\devtools\Agora\Rel\QuadD_Main\QuadD\Common\Core\FlatData.h”
}
Items {
Type: OriginalLine
Value: “1692”
}
Items {
Type: OriginalFunction
Value: “class FlatData::ConstObject<class QuadDCommon::FlatComm::Diagnostics::Event,struct FlatData::SimpleAllocator> __cdecl FlatData::Deserialize<class QuadDCommon::FlatComm::Diagnostics::Event,struct FlatData::SimpleAllocator,class FlatData::ConstObject>(const void *,unsigned __int64 &)”
}
Items {
Type: ErrorText
Value: “Supplied input buffer of size 187 is too small.”
}
}
}
}

This is the error message I see when I open the qdstrm file with the Nsight Systems 2024.4 GUI.

76 GB is really big; we may just be running out of RAM. What was your command line?
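
One rough way to sanity-check the out-of-RAM theory is to compare the size of the .qdstrm file with the memory Linux reports as available. The sketch below reads `MemAvailable` from `/proc/meminfo` and returns the ratio of available RAM to file size. How much memory the importer actually needs per byte of input is not stated anywhere in this thread, so treat the ratio as a coarse indicator only; the function names are hypothetical.

```python
import os


def parse_mem_available_bytes(meminfo_text: str) -> int:
    """Return the MemAvailable value from /proc/meminfo content, in bytes."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # Line format: "MemAvailable:   12345678 kB"
            return int(line.split()[1]) * 1024
    raise ValueError("MemAvailable not found in meminfo text")


def memory_headroom(qdstrm_path: str, meminfo_path: str = "/proc/meminfo") -> float:
    """Ratio of available RAM to the .qdstrm file size.

    A value below 1.0 means the file is larger than the RAM currently
    available. The importer's real memory needs are an unknown here.
    """
    size = os.path.getsize(qdstrm_path)
    if size == 0:
        raise ValueError("the .qdstrm file is empty")
    with open(meminfo_path) as f:
        available = parse_mem_available_bytes(f.read())
    return available / size
```

For example, `memory_headroom("myreport.qdstrm")` on the profiling machine gives a quick first impression before starting a long import.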

Hi. I have kinda the same problem. My nsys profile run failed to create a .nsys-rep file, so I tried to convert the .qdstrm file with QdstrmImporter, but it threw an error:

Processing [===10%                                                             ]
Import Failed with unexpected exception: /dvs/p4/build/sw/devtools/Agora/Rel/QuadD_Main/QuadD/Host/QdstrmImporter/main.cpp(35): Throw in function {anonymous}::Importer::Importer(const boost::filesystem::path&, const boost::filesystem::path&, bool)
Dynamic exception type: boost::wrapexcept<QuadDCommon::RuntimeException>
std::exception::what: RuntimeException
[QuadDCommon::tag_message*] = Status: AnalysisFailed
Error {
  Type: RuntimeError
  SubError {
    Type: OutOfRange
    Props {
      Items {
        Type: OriginalExceptionClass
        Value: "N5boost10wrapexceptIN11QuadDCommon19OutOfRangeExceptionEEE"
      }
      Items {
        Type: OriginalFile
        Value: "/dvs/p4/build/sw/devtools/Agora/Rel/QuadD_Main/QuadD/Common/Core/FlatData.h"
      }
      Items {
        Type: OriginalLine
        Value: "1691"
      }
      Items {
        Type: OriginalFunction
        Value: "ResultObject<Class, Allocator> FlatData::Deserialize(const void*, size_t&) [with Class = QuadDCommon::FlatComm::Diagnostics::Event; Allocator = FlatData::SimpleAllocator; ResultObject = FlatData::ConstObject; size_t = long unsigned int]"
      }
      Items {
        Type: ErrorText
        Value: "Supplied input buffer of size 278 is too small."
      }
    }
  }
}


Aborted

My qdstrm file is 6.3 GB, and I definitely didn’t run out of memory: I have 256 GB of RAM, and btop showed about 50% RAM consumption when the import failed.

Was the QdstrmImporter from the same version of Nsys that generated the .qdstrm file?
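
A quick way to check this, assuming both tools live in install directories named like `nsight-systems-<version>` (as in the importer path shown in the first post), is to pull the version out of each path and compare. This is only a sketch: the function names are made up, and the `nsys` path in the usage note is a hypothetical example, not taken from this thread.

```python
import re
from typing import Optional

# Matches directory names such as "nsight-systems-2023.1.2".
_VERSION_RE = re.compile(r"nsight-systems-(\d+(?:\.\d+)+)")


def nsys_version_from_path(path: str) -> Optional[str]:
    """Extract the version from an install path, or None if it is absent."""
    match = _VERSION_RE.search(path)
    return match.group(1) if match else None


def same_nsys_version(importer_path: str, profiler_path: str) -> bool:
    """True only if both paths carry a version and the versions are equal."""
    importer_version = nsys_version_from_path(importer_path)
    profiler_version = nsys_version_from_path(profiler_path)
    return importer_version is not None and importer_version == profiler_version
```

With the importer path from the first post, `nsys_version_from_path` returns `"2023.1.2"`. Comparing it against a hypothetical profiler path such as `/opt/nvidia/nsight-systems-2024.4.1/target-linux-x64/nsys` makes `same_nsys_version` return `False`, which would flag a mismatch worth ruling out.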