Hi, I have a PyTorch training workflow. When I profile it with nsys (or with dlprof, after adding the extra line: import nvidia_dlprof_pytorch_nvtx as nvtx
and running the training loop within the context torch.autograd.profiler.emit_nvtx()
), I get the following error at the end of profiling:
Creating final output files...
Processing [===============================================================100%]
**** Analysis failed with:
Status: TargetProfilingFailed
Props {
Items {
Type: DeviceId
Value: "Local (CLI)"
}
}
Error {
Type: RuntimeError
SubError {
Type: ProcessEventsError
Props {
Items {
Type: ErrorText
Value: "/build/agent/work/20a3cfcd1c25021d/QuadD/Host/Analysis/EventHandler/PerfEventHandler.cpp(501): Throw in function void QuadDAnalysis::EventHandler::PerfEventHandler::PutCpuEvent(QuadDCommon::CpuId, QuadDAnalysis::EventHandler::PerfEventHandler::EventPtr)\nDynamic exception type: boost::exception_detail::clone_impl<QuadDAnalysis::ChronologicalOrderError>\nstd::exception::what: ChronologicalOrderError\n[QuadDCommon::tag_message*] = Cpu event chronological order was broken.\n"
}
}
}
}
These are the installed versions:
- CUDA: 11.3
- nsys: 2021.3.2.12-9700a21
- dlprof: v1.8.0 built on 2021-12-01 08:22:18 (Build 29839685)
The output SQLite file is also reported as an invalid DLProf database when I profile through dlprof. I get the same errors on two remote systems, one with a V100 and the other with an A100.
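For reference, here is a minimal sketch of the setup described above. It is not my exact code. The script name train.py, the report name, and the helper names nsys_command, dlprof_command and train_with_nvtx are placeholders I made up for illustration, and I am assuming the usual pattern of calling nvtx.init() before entering emit_nvtx(). The model, optimizer, loss and data are whatever the real training script uses.

```python
def nsys_command(script, output="report"):
    # Assumed invocation: plain `nsys profile` with an output name.
    return ["nsys", "profile", "-o", output, "python", script]


def dlprof_command(script):
    # Assumed invocation: DLProf in PyTorch mode wrapping the same script.
    return ["dlprof", "--mode=pytorch", "python", script]


def train_with_nvtx(model, batches, optimizer, loss_fn):
    """Run one pass over `batches` with NVTX ranges emitted for the profiler.

    Imports are local so the command helpers above can be used on a machine
    without PyTorch or the DLProf NVTX plugin installed.
    """
    import torch
    import nvidia_dlprof_pytorch_nvtx as nvtx

    # Assumption: the plugin is initialised once before training starts.
    nvtx.init()

    # Every autograd op inside this context emits an NVTX range.
    with torch.autograd.profiler.emit_nvtx():
        for inputs, targets in batches:
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), targets)
            loss.backward()
            optimizer.step()


if __name__ == "__main__":
    # Print the two command lines used for the profiling runs.
    print(" ".join(nsys_command("train.py")))
    print(" ".join(dlprof_command("train.py")))
```

The failure shows up with either command, once the run finishes and the tool starts creating the final output files.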