Building MLPerf closed/NVIDIA on AGX Orin fails with a DALI constructor mismatch

Hi, I am trying to compile MLPerf Inference closed/NVIDIA on my Jetson AGX Orin, following this thread: https://forums.developer.nvidia.com/t/mlperf-2-1-on-jetson-agx-orin/231915/4
I have CUDA 11.4, cuDNN, and TensorRT 8.4.1 installed, and the board is running the latest JetPack:

# R35 (release), REVISION: 1.0, GCID: 31346300, BOARD: t186ref, EABI: aarch64, DATE: Thu Aug 25 18:41:45 UTC 2022

I also installed DALI with: pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist --upgrade nvidia-dali-cuda110
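To confirm which DALI build pip actually installed, a quick check along these lines may help. This is only a sketch: it assumes the distribution name matches the pip package above (nvidia-dali-cuda110) and relies on Python's standard importlib.metadata module (available in Python 3.8). The helper name installed_version is made up for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution name taken from the pip3 command above.
    print("nvidia-dali-cuda110:", installed_version("nvidia-dali-cuda110"))
```

A result of None would mean pip does not know the package under that name in the interpreter being used.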

When I run make clean && make build -j12 in inference_results_v2.1/closed/NVIDIA, the build fails with:

-- Generating done
-- Build files have been written to: /root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness
make[2]: Entering directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[2]: warning: -j0 forced in submake: resetting jobserver mode.
make[3]: Entering directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[4]: Entering directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[4]: Entering directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[4]: Entering directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[4]: Entering directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[4]: Entering directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[4]: Entering directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[4]: Entering directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
Scanning dependencies of target harness_bert
Scanning dependencies of target harness_dlrm
Scanning dependencies of target harness_rnnt
Scanning dependencies of target harness_3dunet
Scanning dependencies of target harness_triton_unified
Scanning dependencies of target lwis
Scanning dependencies of target harness_triton
make[4]: Leaving directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[4]: Entering directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
[  2%] Building CXX object lwis/CMakeFiles/lwis.dir/src/lwis.cpp.o
make[4]: Leaving directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[4]: Leaving directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[4]: Entering directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[4]: Entering directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[4]: Leaving directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[4]: Entering directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[4]: Leaving directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
[ 12%] Building CXX object CMakeFiles/harness_dlrm.dir/harness_dlrm/dlrm_server.cc.o
[ 12%] Building CXX object CMakeFiles/harness_dlrm.dir/harness_dlrm/main_dlrm.cc.o
[ 12%] Building CXX object CMakeFiles/harness_bert.dir/harness_bert/main_bert.cc.o
[ 12%] Building CXX object CMakeFiles/harness_bert.dir/harness_bert/bert_server.cc.o
[ 12%] Building CXX object CMakeFiles/harness_dlrm.dir/common/logger.cpp.o
[ 14%] Building CUDA object CMakeFiles/harness_dlrm.dir/harness_dlrm/dlrm_kernels.cu.o
make[4]: Entering directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
[ 16%] Building CXX object CMakeFiles/harness_dlrm.dir/harness_dlrm/batch_maker.cpp.o
[ 18%] Building CUDA object CMakeFiles/harness_rnnt.dir/harness_rnnt/rnnt_kernels.cu.o
[ 20%] Building CXX object CMakeFiles/harness_rnnt.dir/harness_rnnt/main_rnnt.cc.o
[ 24%] Building CXX object CMakeFiles/harness_bert.dir/common/logger.cpp.o
[ 24%] Building CXX object CMakeFiles/harness_bert.dir/harness_bert/bert_core_vs.cc.o
[ 28%] Building CXX object CMakeFiles/harness_rnnt.dir/common/logger.cpp.o
[ 28%] Building CXX object CMakeFiles/harness_3dunet.dir/harness_3dunet/main_3dunet.cc.o
[ 30%] Building CXX object CMakeFiles/harness_3dunet.dir/harness_3dunet/lwis_3dunet.cpp.o
[ 32%] Building CXX object CMakeFiles/harness_3dunet.dir/common/logger.cpp.o
[ 34%] Building CUDA object CMakeFiles/harness_3dunet.dir/harness_3dunet/unet3d_sw.cu.o
make[4]: Leaving directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[4]: Leaving directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[4]: Entering directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[4]: Entering directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
[ 36%] Building CXX object CMakeFiles/harness_triton_unified.dir/harness_triton_unified/src/triton_callbacks.cpp.o
[ 38%] Building CXX object CMakeFiles/harness_triton_unified.dir/harness_triton_unified/src/triton_request_pool.cpp.o
[ 40%] Building CXX object CMakeFiles/harness_triton_unified.dir/harness_triton_unified/src/triton_device_gpu.cpp.o
[ 42%] Building CXX object CMakeFiles/harness_triton.dir/harness_triton/src/bert_concurrent_frontend.cpp.o
[ 46%] Building CXX object CMakeFiles/harness_triton_unified.dir/harness_triton_unified/src/triton_device.cpp.o
[ 46%] Building CXX object CMakeFiles/harness_triton.dir/harness_triton/main_server.cc.o
[ 48%] Building CXX object CMakeFiles/harness_triton_unified.dir/harness_triton_unified/src/triton_concurrent_sut.cpp.o
[ 51%] Building CXX object CMakeFiles/harness_triton_unified.dir/harness_triton_unified/src/triton_workload.cpp.o
[ 53%] Building CXX object CMakeFiles/harness_triton.dir/harness_triton/src/dla_triton_frontend_helpers.cpp.o
[ 55%] Building CXX object CMakeFiles/harness_triton.dir/harness_triton/src/dlrm_triton_concurrent_frontend.cpp.o
[ 57%] Building CXX object CMakeFiles/harness_triton.dir/harness_triton/src/dla_concurrent_frontend.cpp.o
[ 59%] Building CXX object CMakeFiles/harness_triton.dir/harness_triton/src/triton_concurrent_frontend.cpp.o
[ 61%] Building CXX object CMakeFiles/harness_triton_unified.dir/harness_triton_unified/src/triton_helpers.cpp.o
[ 63%] Building CXX object CMakeFiles/harness_triton.dir/harness_triton/src/triton_frontend_server.cpp.o
[ 65%] Building CXX object CMakeFiles/harness_triton_unified.dir/harness_triton_unified/src/pinned_memory_pool.cpp.o
[ 67%] Building CXX object CMakeFiles/harness_triton.dir/harness_triton/src/triton_frontend_helpers.cpp.o
[ 71%] Building CXX object CMakeFiles/harness_triton_unified.dir/common/logger.cpp.o
[ 71%] Building CXX object CMakeFiles/harness_triton_unified.dir/harness_triton_unified/main_server.cpp.o
[ 75%] Building CXX object CMakeFiles/harness_triton_unified.dir/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/_deps/repo-core-build/triton-core/_deps/repo-common-build/protobuf/model_config.pb.cc.o
[ 75%] Building CXX object CMakeFiles/harness_triton.dir/harness_triton/src/pinned_memory_pool.cpp.o
[ 77%] Building CXX object CMakeFiles/harness_triton.dir/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/_deps/repo-core-build/triton-core/_deps/repo-common-build/protobuf/model_config.pb.cc.o
[ 79%] Building CXX object CMakeFiles/harness_triton.dir/common/logger.cpp.o
In file included from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/metadata.hpp:28,
                 from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/main_rnnt.cc:56:
/root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/preprocessing.hpp: In constructor 'DaliPipeline::DaliPipeline(int, std::string&, std::shared_ptr<AudioBufferManagement>, size_t, device_type_t, size_t, size_t, size_t)':
/root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/preprocessing.hpp:661:95: error: no matching function for call to 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(size_t&, long unsigned int&)'
  661 |         mScatterGatherH2D = new dali::kernels::ScatterGatherGPU(mSGBlockSize, estimatedNBlocks);
      |                                                                                               ^
In file included from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/preprocessing.hpp:28,
                 from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/metadata.hpp:28,
                 from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/main_rnnt.cc:56:
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:119:12: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(size_t)'
  119 |   explicit ScatterGatherGPU(size_t max_size_per_block) : ScatterGatherBase(max_size_per_block) {}
      |            ^~~~~~~~~~~~~~~~
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:119:12: note:   candidate expects 1 argument, 2 provided
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:117:3: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU()'
  117 |   ScatterGatherGPU() = default;
      |   ^~~~~~~~~~~~~~~~
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:117:3: note:   candidate expects 0 arguments, 2 provided
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(const dali::kernels::ScatterGatherGPU&)'
  113 | class DLL_PUBLIC ScatterGatherGPU : public ScatterGatherBase {
      |                  ^~~~~~~~~~~~~~~~
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note:   candidate expects 1 argument, 2 provided
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(dali::kernels::ScatterGatherGPU&&)'
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note:   candidate expects 1 argument, 2 provided
In file included from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/metadata.hpp:28,
                 from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/main_rnnt.cc:56:
/root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/preprocessing.hpp:662:119: error: no matching function for call to 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(size_t, size_t&)'
  662 |         mScatterGatherD2DData = new dali::kernels::ScatterGatherGPU(mAudioBufManager->getBufLineSize(), mDaliBatchSize);
      |                                                                                                                       ^
In file included from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/preprocessing.hpp:28,
                 from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/metadata.hpp:28,
                 from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/main_rnnt.cc:56:
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:119:12: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(size_t)'
  119 |   explicit ScatterGatherGPU(size_t max_size_per_block) : ScatterGatherBase(max_size_per_block) {}
      |            ^~~~~~~~~~~~~~~~
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:119:12: note:   candidate expects 1 argument, 2 provided
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:117:3: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU()'
  117 |   ScatterGatherGPU() = default;
      |   ^~~~~~~~~~~~~~~~
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:117:3: note:   candidate expects 0 arguments, 2 provided
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(const dali::kernels::ScatterGatherGPU&)'
  113 | class DLL_PUBLIC ScatterGatherGPU : public ScatterGatherBase {
      |                  ^~~~~~~~~~~~~~~~
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note:   candidate expects 1 argument, 2 provided
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(dali::kernels::ScatterGatherGPU&&)'
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note:   candidate expects 1 argument, 2 provided
In file included from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/metadata.hpp:28,
                 from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/main_rnnt.cc:56:
/root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/preprocessing.hpp:663:102: error: no matching function for call to 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(long unsigned int, size_t&)'
  663 |         mScatterGatherD2DSeqLen = new dali::kernels::ScatterGatherGPU(sizeof(int64_t), mDaliBatchSize);
      |                                                                                                      ^
In file included from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/preprocessing.hpp:28,
                 from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/metadata.hpp:28,
                 from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/main_rnnt.cc:56:
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:119:12: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(size_t)'
  119 |   explicit ScatterGatherGPU(size_t max_size_per_block) : ScatterGatherBase(max_size_per_block) {}
      |            ^~~~~~~~~~~~~~~~
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:119:12: note:   candidate expects 1 argument, 2 provided
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:117:3: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU()'
  117 |   ScatterGatherGPU() = default;
      |   ^~~~~~~~~~~~~~~~
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:117:3: note:   candidate expects 0 arguments, 2 provided
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(const dali::kernels::ScatterGatherGPU&)'
  113 | class DLL_PUBLIC ScatterGatherGPU : public ScatterGatherBase {
      |                  ^~~~~~~~~~~~~~~~
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note:   candidate expects 1 argument, 2 provided
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(dali::kernels::ScatterGatherGPU&&)'
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note:   candidate expects 1 argument, 2 provided
/root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/main_rnnt.cc: In constructor 'Stream::Stream(int, std::shared_ptr<qsl::SampleLibrary>, std::shared_ptr<WarmupSampleLibrary>, std::shared_ptr<SyncWorkQueue>, std::shared_ptr<AudioBufferManagement>, const string&, size_t)':
/root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/main_rnnt.cc:2081:87: error: no matching function for call to 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(size_t&, size_t&)'
 2081 |     mScatterGatherD2HData = new dali::kernels::ScatterGatherGPU(sampleSize, mBatchSize);
      |                                                                                       ^
In file included from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/preprocessing.hpp:28,
                 from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/metadata.hpp:28,
                 from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/main_rnnt.cc:56:
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:119:12: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(size_t)'
  119 |   explicit ScatterGatherGPU(size_t max_size_per_block) : ScatterGatherBase(max_size_per_block) {}
      |            ^~~~~~~~~~~~~~~~
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:119:12: note:   candidate expects 1 argument, 2 provided
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:117:3: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU()'
  117 |   ScatterGatherGPU() = default;
      |   ^~~~~~~~~~~~~~~~
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:117:3: note:   candidate expects 0 arguments, 2 provided
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(const dali::kernels::ScatterGatherGPU&)'
  113 | class DLL_PUBLIC ScatterGatherGPU : public ScatterGatherBase {
      |                  ^~~~~~~~~~~~~~~~
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note:   candidate expects 1 argument, 2 provided
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(dali::kernels::ScatterGatherGPU&&)'
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note:   candidate expects 1 argument, 2 provided
/root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/main_rnnt.cc:2082:87: error: no matching function for call to 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(size_t&, size_t&)'
 2082 |     mScatterGatherD2DData = new dali::kernels::ScatterGatherGPU(sampleSize, mBatchSize);
      |                                                                                       ^
In file included from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/preprocessing.hpp:28,
                 from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/metadata.hpp:28,
                 from /root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/main_rnnt.cc:56:
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:119:12: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(size_t)'
  119 |   explicit ScatterGatherGPU(size_t max_size_per_block) : ScatterGatherBase(max_size_per_block) {}
      |            ^~~~~~~~~~~~~~~~
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:119:12: note:   candidate expects 1 argument, 2 provided
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:117:3: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU()'
  117 |   ScatterGatherGPU() = default;
      |   ^~~~~~~~~~~~~~~~
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:117:3: note:   candidate expects 0 arguments, 2 provided
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(const dali::kernels::ScatterGatherGPU&)'
  113 | class DLL_PUBLIC ScatterGatherGPU : public ScatterGatherBase {
      |                  ^~~~~~~~~~~~~~~~
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note:   candidate expects 1 argument, 2 provided
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note: candidate: 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(dali::kernels::ScatterGatherGPU&&)'
/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/dali/kernels/common/scatter_gather.h:113:18: note:   candidate expects 1 argument, 2 provided
/root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_triton_unified/src/triton_device.cpp: In member function 'virtual void triton_frontend::ITritonDevice::Completion(TRITONSERVER_InferenceResponse*, const triton_frontend::ResponseMetaData*)':
/root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_triton_unified/src/triton_device.cpp:385:95: warning: narrowing conversion of 'batch1_output_size' from 'int32_t' {aka 'int'} to 'size_t' {aka 'long unsigned int'} [-Wnarrowing]
  385 |             mlperf::QuerySampleResponse{(response_metadata->m_ResponseId)[i], output0_result, batch1_output_size});
      |                                                                                               ^~~~~~~~~~~~~~~~~~
make[4]: *** [CMakeFiles/harness_rnnt.dir/build.make:82: CMakeFiles/harness_rnnt.dir/harness_rnnt/main_rnnt.cc.o] Error 1
make[4]: Leaving directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[3]: *** [CMakeFiles/Makefile2:208: CMakeFiles/harness_rnnt.dir/all] Error 2
make[3]: *** Waiting for unfinished jobs....
[ 81%] Linking CXX static library liblwis.a
make[4]: Leaving directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
[ 81%] Built target lwis
[ 83%] Linking CXX executable /root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/bin/harness_bert
make[4]: Leaving directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
[ 83%] Built target harness_bert
[ 85%] Linking CXX executable /root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/bin/harness_dlrm
make[4]: Leaving directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
[ 85%] Built target harness_dlrm
[ 87%] Linking CXX executable /root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/bin/harness_3dunet
make[4]: Leaving directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
[ 87%] Built target harness_3dunet
[ 89%] Linking CXX executable /root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/bin/harness_triton
[ 91%] Linking CXX executable /root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/bin/harness_triton_unified
make[4]: Leaving directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
[ 91%] Built target harness_triton
make[4]: Leaving directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
[ 91%] Built target harness_triton_unified
make[3]: Leaving directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[2]: *** [Makefile:103: all] Error 2
make[2]: Leaving directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA/build/harness'
make[1]: *** [Makefile:625: build_harness] Error 2
make[1]: Leaving directory '/root/ssd/library/inference_results_v2.1/closed/NVIDIA'
make: *** [Makefile:513: build] Error 2

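As you can see, all of the errors come from calls to the DALI ScatterGatherGPU constructor, which is called with two arguments while the installed header only accepts zero or one: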
/root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/preprocessing.hpp:661:95: error: no matching function for call to 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(size_t&, long unsigned int&)'
  661 | mScatterGatherH2D = new dali::kernels::ScatterGatherGPU(mSGBlockSize, estimatedNBlocks);

/root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/preprocessing.hpp:662:119: error: no matching function for call to 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(size_t, size_t&)'
  662 | mScatterGatherD2DData = new dali::kernels::ScatterGatherGPU(mAudioBufManager->getBufLineSize(), mDaliBatchSize);

/root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/preprocessing.hpp:663:102: error: no matching function for call to 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(long unsigned int, size_t&)'
  663 | mScatterGatherD2DSeqLen = new dali::kernels::ScatterGatherGPU(sizeof(int64_t), mDaliBatchSize);

/root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/main_rnnt.cc:2081:87: error: no matching function for call to 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(size_t&, size_t&)'
 2081 | mScatterGatherD2HData = new dali::kernels::ScatterGatherGPU(sampleSize, mBatchSize);

/root/ssd/library/inference_results_v2.1/closed/NVIDIA/code/harness/harness_rnnt/main_rnnt.cc:2082:87: error: no matching function for call to 'dali::kernels::ScatterGatherGPU::ScatterGatherGPU(size_t&, size_t&)'
 2082 | mScatterGatherD2DData = new dali::kernels::ScatterGatherGPU(sampleSize, mBatchSize);

I am wondering what this problem could be related to. Do I need to build DALI from source on my Jetson Orin?
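The compiler notes say the installed header only declares zero- and one-argument ScatterGatherGPU constructors, while the harness passes two arguments. One way to double-check what a given scatter_gather.h declares is a small script like the one below. It is a rough sketch, not a real C++ parser: the function name constructor_arities is hypothetical, the regular expression only handles simple parameter lists, and the header path is the one printed in the build log above.

```python
import re
from pathlib import Path
from typing import List


def constructor_arities(header_text: str, class_name: str) -> List[int]:
    """Return the sorted parameter counts of each `class_name(...)` declaration."""
    # Match `ClassName(params)` but skip the destructor `~ClassName()` and
    # longer identifiers that merely end in the class name.
    pattern = re.compile(r"(?<![~\w])" + re.escape(class_name) + r"\s*\(([^()]*)\)")
    arities = []
    for match in pattern.finditer(header_text):
        params = match.group(1).strip()
        # Naive comma split: fine for simple parameter lists, but it
        # miscounts templated types that contain commas.
        arities.append(0 if not params else len(params.split(",")))
    return sorted(arities)


if __name__ == "__main__":
    # Header path copied from the compiler output in this thread.
    header = Path(
        "/usr/local/lib/python3.8/dist-packages/nvidia/dali/include/"
        "dali/kernels/common/scatter_gather.h"
    )
    if header.exists():
        print(constructor_arities(header.read_text(), "ScatterGatherGPU"))
```

If the printed list contains no 2, the header does not provide the two-argument constructor that preprocessing.hpp and main_rnnt.cc call, which would suggest the installed DALI version differs from the one the harness was written against.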

Thanks.

Hi,

Do you want to reproduce the MLPerf 2.1 results listed below:

If yes, please set up Orin with the CUDA-X AI DP software, which can be found at the link below:
(TensorRT 8.5 + CUDA 11.4)

Thanks.

Thanks a lot, it works!
