trtexec could not predict the image properly with the ResNet50.onnx model

Continuing the discussion from Wanted to know about how to make use of --loadInputs=spec option in trtexec:

Hi all, this is with respect to the previous discussion I had, to which I was late to reply and which is now in a closed state.

I tried modifying the trtexec source code to dump the predicted output values, and I succeeded. But the output values I got are the same for all the predicted classes.

I want to give some context. I have run inference with the ResNet50.onnx model for image classification.
This is the command I used:
./trtexec --onnx=…/data/resnet50/ResNet50.onnx --int8 --loadInputs=/home/nagaraj/input_tensor.dat --dumpOutput

The pretrained ONNX file ResNet50.onnx is part of the TensorRT SDK.

The input test file input_tensor.dat contains the image data converted to the .dat format.

As trtexec was not displaying the actual predicted values, I modified the file sampleUtils.h to dump the output as part of the --dumpOutput option.

The model produces an output of shape 1 x 1000.

I tried dumping it, but it displayed the same probability value for every predicted class. I am pasting a few of them below; all the values are present in the attached file.

Prob 0 0.0010 Class 0:
[01/03/2024-07:53:30] [I] Prob 1 0.0010 Class 1:
[01/03/2024-07:53:30] [I] Prob 2 0.0010 Class 2:
[01/03/2024-07:53:30] [I] Prob 3 0.0010 Class 3:
[01/03/2024-07:53:30] [I] Prob 4 0.0010 Class 4:
[01/03/2024-07:53:30] [I] Prob 5 0.0010 Class 5:

As you can see, the value 0.0010 is the same for all the classes (exactly 1/1000, as if the output were a uniform distribution over the 1000 classes), where the classes run from Class 0 to Class 999.

All the values are present in the attached file trt_exec_resnet50_infer_output.txt.

I have modified the API
inline void dumpBuffer() from the file "sampleUtils.h", and I am attaching it as well. You can take the entire dumpBuffer() API, replace the one at your side with its contents, and verify.
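Roughly, the change is of the following shape (a simplified sketch only, not the exact attached code; hostOutput here stands in for whatever host-side buffer dumpBuffer() actually reads):

#include <iomanip>
#include <iostream>

// Sketch: after the 1x1000 softmax output has been copied back to the host,
// print one "Prob i <value> Class i:" line per class, matching the log above.
void dumpClassProbs(const float* hostOutput, int numClasses = 1000)
{
    for (int i = 0; i < numClasses; ++i)
    {
        std::cout << "Prob " << i << " " << std::fixed << std::setprecision(4)
                  << hostOutput[i] << " Class " << i << ":" << std::endl;
    }
}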

Attached files

  1. input_tensor.dat - It contains the image data fed as test input with the --loadInputs option

  2. ResNet50.onnx - It is the model file used for inference. If you want, you can use your own ResNet50.onnx model

  3. trt_exec_resnet50_infer_output.txt - It contains the predicted output values displayed from the function dumpBuffer()

  4. dump_buffer.txt - It contains the dumpBuffer() API which I modified in the file /usr/src/tensorrt/samples/common/sampleUtils.h.
You can replace the dumpBuffer() API at your side with its contents.

Attached is the zip file containing all these files.
trt_exec_issue.zip (91.0 MB)

I request you to verify it at your side. If you feel that I have created a wrong input_tensor.dat file, you can use your own .dat file for verification.

Please provide an early response, as my thesis submission date is nearing.

Thanks and Regards

Nagaraj Trivedi

Dear @trivedi.nagaraj,
I will check and try to repro the issue. Meanwhile, could you check whether TensorRT/samples/sampleINT8API/sampleINT8API.cpp at release/8.6 · NVIDIA/TensorRT · GitHub uses the same model as yours? If so, you can verify whether the input/output to the model in this sample and in your trtexec code match. Please see the verify_output and prepare input methods. Let me know if it helps.

Hi SivaRamaKrishnan, thank you for your reply. I will look into this code. I have run inference with this binary and got the output; thank you for that.
But I want to be clear about the following line, which I got from the logs:
[TRT] Calibrator is not being used. Users must provide dynamic range for all tensors that are not Int32.
Let me know what it means.

Also, I have found the following statement
// Asynchronously enqueue the inference work
if (!context->enqueueV2(buffers.getDeviceBindings().data(), stream, nullptr))
{
return sample::Logger::TestResult::kFAILED;
}

in the method sample::Logger::TestResult SampleINT8API::infer()

I feel this is the right place where we can add a CUDA graph. Can you please help me by providing modified code for this infer() method/API to implement the CUDA graph?

If I get this solution, my work will be complete, because I can already see layer-fusion information in the log messages. If the CUDA graph works with this file, then I have everything I wanted. A rough sketch of what I have in mind is below.
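For reference, this is a rough, untested sketch of what I am imagining, based on the standard CUDA graph capture APIs (cudaStreamBeginCapture / cudaStreamEndCapture / cudaGraphInstantiate / cudaGraphLaunch); please correct me if this is not the right approach:

// Untested sketch: capture one enqueueV2() call into a CUDA graph, then
// replay the instantiated graph for subsequent inferences.
cudaGraph_t graph;
cudaGraphExec_t graphExec;

// Warm-up call outside capture, since some resources are initialized lazily.
context->enqueueV2(buffers.getDeviceBindings().data(), stream, nullptr);
cudaStreamSynchronize(stream);

// Capture the inference work into a graph.
cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal);
if (!context->enqueueV2(buffers.getDeviceBindings().data(), stream, nullptr))
{
    return sample::Logger::TestResult::kFAILED;
}
cudaStreamEndCapture(stream, &graph);
cudaGraphInstantiate(&graphExec, graph, nullptr, nullptr, 0);

// Replay the captured graph instead of calling enqueueV2() again.
cudaGraphLaunch(graphExec, stream);
cudaStreamSynchronize(stream);

// Clean up once inference is done.
cudaGraphExecDestroy(graphExec);
cudaGraphDestroy(graph);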

Thanks and Regards

Nagaraj Trivedi

Hi SivaRamaKrishnan, I have taken the code from the method/API
bool SampleINT8API::verifyOutput(const samplesCommon::BufferManager& buffers) const

from the file /usr/src/tensorrt/samples/sampleINT8API/sampleINT8API.cpp
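For context, the part I adapted is essentially the top-5 selection over the 1x1000 softmax output. A simplified sketch of my adaptation (not the exact sample code; probs and classNames here stand in for the host output buffer and the ImageNet label list) looks like this:

#include <algorithm>
#include <iostream>
#include <numeric>
#include <string>
#include <vector>

// probs: host copy of the 1x1000 softmax output.
// classNames: the 1000 ImageNet label strings, in class order.
void printTop5(const float* probs, const std::vector<std::string>& classNames)
{
    std::vector<int> indices(classNames.size());
    std::iota(indices.begin(), indices.end(), 0); // 0, 1, ..., 999
    // Move the five highest-probability class indices to the front.
    std::partial_sort(indices.begin(), indices.begin() + 5, indices.end(),
                      [&](int a, int b) { return probs[a] > probs[b]; });
    std::cout << "Inference result: Detected:" << std::endl;
    for (int i = 0; i < 5; ++i)
    {
        std::cout << "[" << (i + 1) << "] " << classNames[indices[i]] << std::endl;
    }
}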

Though it displayed the inference results in a human-readable format, trtexec could not infer the image correctly. I provided the input image corresponding to tabby_tiger.dat, but it predicted the following classes:
Inference result: Detected:
[01/06/2024-10:36:08] [I] [1] spotlight
[01/06/2024-10:36:08] [I] [2] wall clock
[01/06/2024-10:36:08] [I] [3] lampshade
[01/06/2024-10:36:08] [I] [4] television
[01/06/2024-10:36:08] [I] [5] radiator

This is the command I executed:
/usr/src/tensorrt/bin$ ./trtexec --onnx=…/data/resnet50/ResNet50.onnx --int8 --loadInputs=tabby_tiger.dat --dumpOutput
I request you to please run inference on the ResNet50.onnx model at your side with your own set of .dat files.
In order for you to see the inference results in a human-readable form, I am attaching the sampleUtils.h file, which you can place directly in the directory
/usr/src/tensorrt/samples/common/
I am also attaching another file that converts an input image to a .dat file. With it you can verify whether I have done the conversion of the image file to the .dat format properly. If you feel that I have not done the conversion properly, you can use your own .dat files.

The attached zip file contains both sampleUtils.h and create_dat.py files.
sampleUtils.zip (5.1 KB)

I request you to perform inference on the ResNet50.onnx that is available as part of the TensorRT SDK, but use the attached sampleUtils.h file. Let me know what results you get.

Thanks and Regards

Nagaraj Trivedi

Hi @trivedi.nagaraj ,
I could run the trtexec binary with tabby_tiger_new.dat and got the output below.

nvidia@tegra-ubuntu:~/siva/sampleUtils$ /usr/src/tensorrt/bin/trtexec --onnx=/usr/src/tensorrt/data/resnet50/ResNet50.onnx --loadInputs=gpu_0/data_0:tabby_tiger_new.dat --dumpOutput
&&&& RUNNING TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=/usr/src/tensorrt/data/resnet50/ResNet50.onnx --loadInputs=gpu_0/data_0:tabby_tiger_new.dat --dumpOutput
[01/06/2024-21:17:35] [I] === Model Options ===
[01/06/2024-21:17:35] [I] Format: ONNX
[01/06/2024-21:17:35] [I] Model: /usr/src/tensorrt/data/resnet50/ResNet50.onnx
[01/06/2024-21:17:35] [I] Output:
[01/06/2024-21:17:35] [I] === Build Options ===
[01/06/2024-21:17:35] [I] Max batch: explicit batch
[01/06/2024-21:17:35] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[01/06/2024-21:17:35] [I] minTiming: 1
[01/06/2024-21:17:35] [I] avgTiming: 8
[01/06/2024-21:17:35] [I] Precision: FP32
[01/06/2024-21:17:35] [I] LayerPrecisions:
[01/06/2024-21:17:35] [I] Calibration:
[01/06/2024-21:17:35] [I] Refit: Disabled
[01/06/2024-21:17:35] [I] Sparsity: Disabled
[01/06/2024-21:17:35] [I] Safe mode: Disabled
[01/06/2024-21:17:35] [I] DirectIO mode: Disabled
[01/06/2024-21:17:35] [I] Restricted mode: Disabled
[01/06/2024-21:17:35] [I] Build only: Disabled
[01/06/2024-21:17:35] [I] Save engine:
[01/06/2024-21:17:35] [I] Load engine:
[01/06/2024-21:17:35] [I] Profiling verbosity: 0
[01/06/2024-21:17:35] [I] Tactic sources: Using default tactic sources
[01/06/2024-21:17:35] [I] timingCacheMode: local
[01/06/2024-21:17:35] [I] timingCacheFile:
[01/06/2024-21:17:35] [I] Heuristic: Disabled
[01/06/2024-21:17:35] [I] Preview Features: Use default preview flags.
[01/06/2024-21:17:35] [I] Input(s)s format: fp32:CHW
[01/06/2024-21:17:35] [I] Output(s)s format: fp32:CHW
[01/06/2024-21:17:35] [I] Input build shapes: model
[01/06/2024-21:17:35] [I] Input calibration shapes: model
[01/06/2024-21:17:35] [I] === System Options ===
[01/06/2024-21:17:35] [I] Device: 0
[01/06/2024-21:17:35] [I] DLACore:
[01/06/2024-21:17:35] [I] Plugins:
[01/06/2024-21:17:35] [I] === Inference Options ===
[01/06/2024-21:17:35] [I] Batch: Explicit
[01/06/2024-21:17:35] [I] Input inference shapes: model
[01/06/2024-21:17:35] [I] Iterations: 10
[01/06/2024-21:17:35] [I] Duration: 3s (+ 200ms warm up)
[01/06/2024-21:17:35] [I] Sleep time: 0ms
[01/06/2024-21:17:35] [I] Idle time: 0ms
[01/06/2024-21:17:35] [I] Streams: 1
[01/06/2024-21:17:35] [I] ExposeDMA: Disabled
[01/06/2024-21:17:35] [I] Data transfers: Enabled
[01/06/2024-21:17:35] [I] Spin-wait: Disabled
[01/06/2024-21:17:35] [I] Multithreading: Disabled
[01/06/2024-21:17:35] [I] CUDA Graph: Disabled
[01/06/2024-21:17:35] [I] Separate profiling: Disabled
[01/06/2024-21:17:35] [I] Time Deserialize: Disabled
[01/06/2024-21:17:35] [I] Time Refit: Disabled
[01/06/2024-21:17:35] [I] NVTX verbosity: 0
[01/06/2024-21:17:35] [I] Persistent Cache Ratio: 0
[01/06/2024-21:17:35] [I] Inputs:
[01/06/2024-21:17:35] [I] gpu_0/data_0<-tabby_tiger_new.dat
[01/06/2024-21:17:35] [I] === Reporting Options ===
[01/06/2024-21:17:35] [I] Verbose: Disabled
[01/06/2024-21:17:35] [I] Averages: 10 inferences
[01/06/2024-21:17:35] [I] Percentiles: 90,95,99
[01/06/2024-21:17:35] [I] Dump refittable layers:Disabled
[01/06/2024-21:17:35] [I] Dump output: Enabled
[01/06/2024-21:17:35] [I] Profile: Disabled
[01/06/2024-21:17:35] [I] Export timing to JSON file:
[01/06/2024-21:17:35] [I] Export output to JSON file:
[01/06/2024-21:17:35] [I] Export profile to JSON file:
[01/06/2024-21:17:35] [I]
[01/06/2024-21:17:35] [I] === Device Information ===
[01/06/2024-21:17:35] [I] Selected Device: Xavier
[01/06/2024-21:17:35] [I] Compute Capability: 7.2
[01/06/2024-21:17:35] [I] SMs: 8
[01/06/2024-21:17:35] [I] Compute Clock Rate: 1.377 GHz
[01/06/2024-21:17:35] [I] Device Global Memory: 31010 MiB
[01/06/2024-21:17:35] [I] Shared Memory per SM: 96 KiB
[01/06/2024-21:17:35] [I] Memory Bus Width: 256 bits (ECC disabled)
[01/06/2024-21:17:35] [I] Memory Clock Rate: 1.377 GHz
[01/06/2024-21:17:35] [I]
[01/06/2024-21:17:35] [I] TensorRT version: 8.5.2
[01/06/2024-21:17:35] [I] [TRT] [MemUsageChange] Init CUDA: CPU +187, GPU +0, now: CPU 216, GPU 7473 (MiB)
[01/06/2024-21:17:37] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +106, GPU +100, now: CPU 344, GPU 7594 (MiB)
[01/06/2024-21:17:37] [I] Start parsing network model
[01/06/2024-21:17:37] [I] [TRT] ----------------------------------------------------------------
[01/06/2024-21:17:37] [I] [TRT] Input filename:   /usr/src/tensorrt/data/resnet50/ResNet50.onnx
[01/06/2024-21:17:37] [I] [TRT] ONNX IR version:  0.0.3
[01/06/2024-21:17:37] [I] [TRT] Opset version:    9
[01/06/2024-21:17:37] [I] [TRT] Producer name:    onnx-caffe2
[01/06/2024-21:17:37] [I] [TRT] Producer version:
[01/06/2024-21:17:37] [I] [TRT] Domain:
[01/06/2024-21:17:37] [I] [TRT] Model version:    0
[01/06/2024-21:17:37] [I] [TRT] Doc string:
[01/06/2024-21:17:37] [I] [TRT] ----------------------------------------------------------------
[01/06/2024-21:17:37] [W] [TRT] onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[01/06/2024-21:17:37] [I] Finish parsing network model
[01/06/2024-21:17:37] [I] [TRT] ---------- Layers Running on DLA ----------
[01/06/2024-21:17:37] [I] [TRT] ---------- Layers Running on GPU ----------
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/conv1_1 + node_of_gpu_0/res_conv1_bn_1 + node_of_gpu_0/res_conv1_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] POOLING: node_of_gpu_0/pool1_1
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_0_branch2a_1 + node_of_gpu_0/res2_0_branch2a_bn_1 + node_of_gpu_0/res2_0_branch2a_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_0_branch2b_1 + node_of_gpu_0/res2_0_branch2b_bn_1 + node_of_gpu_0/res2_0_branch2b_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_0_branch1_1 + node_of_gpu_0/res2_0_branch1_bn_1
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_0_branch2c_1 + node_of_gpu_0/res2_0_branch2c_bn_1 + node_of_gpu_0/res2_0_branch2c_bn_2 + node_of_gpu_0/res2_0_branch2c_bn_3
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_1_branch2a_1 + node_of_gpu_0/res2_1_branch2a_bn_1 + node_of_gpu_0/res2_1_branch2a_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_1_branch2b_1 + node_of_gpu_0/res2_1_branch2b_bn_1 + node_of_gpu_0/res2_1_branch2b_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_1_branch2c_1 + node_of_gpu_0/res2_1_branch2c_bn_1 + node_of_gpu_0/res2_1_branch2c_bn_2 + node_of_gpu_0/res2_1_branch2c_bn_3
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_2_branch2a_1 + node_of_gpu_0/res2_2_branch2a_bn_1 + node_of_gpu_0/res2_2_branch2a_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_2_branch2b_1 + node_of_gpu_0/res2_2_branch2b_bn_1 + node_of_gpu_0/res2_2_branch2b_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_2_branch2c_1 + node_of_gpu_0/res2_2_branch2c_bn_1 + node_of_gpu_0/res2_2_branch2c_bn_2 + node_of_gpu_0/res2_2_branch2c_bn_3
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_0_branch2a_1 + node_of_gpu_0/res3_0_branch2a_bn_1 + node_of_gpu_0/res3_0_branch2a_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_0_branch2b_1 + node_of_gpu_0/res3_0_branch2b_bn_1 + node_of_gpu_0/res3_0_branch2b_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_0_branch1_1 + node_of_gpu_0/res3_0_branch1_bn_1
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_0_branch2c_1 + node_of_gpu_0/res3_0_branch2c_bn_1 + node_of_gpu_0/res3_0_branch2c_bn_2 + node_of_gpu_0/res3_0_branch2c_bn_3
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_1_branch2a_1 + node_of_gpu_0/res3_1_branch2a_bn_1 + node_of_gpu_0/res3_1_branch2a_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_1_branch2b_1 + node_of_gpu_0/res3_1_branch2b_bn_1 + node_of_gpu_0/res3_1_branch2b_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_1_branch2c_1 + node_of_gpu_0/res3_1_branch2c_bn_1 + node_of_gpu_0/res3_1_branch2c_bn_2 + node_of_gpu_0/res3_1_branch2c_bn_3
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_2_branch2a_1 + node_of_gpu_0/res3_2_branch2a_bn_1 + node_of_gpu_0/res3_2_branch2a_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_2_branch2b_1 + node_of_gpu_0/res3_2_branch2b_bn_1 + node_of_gpu_0/res3_2_branch2b_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_2_branch2c_1 + node_of_gpu_0/res3_2_branch2c_bn_1 + node_of_gpu_0/res3_2_branch2c_bn_2 + node_of_gpu_0/res3_2_branch2c_bn_3
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_3_branch2a_1 + node_of_gpu_0/res3_3_branch2a_bn_1 + node_of_gpu_0/res3_3_branch2a_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_3_branch2b_1 + node_of_gpu_0/res3_3_branch2b_bn_1 + node_of_gpu_0/res3_3_branch2b_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_3_branch2c_1 + node_of_gpu_0/res3_3_branch2c_bn_1 + node_of_gpu_0/res3_3_branch2c_bn_2 + node_of_gpu_0/res3_3_branch2c_bn_3
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_0_branch2a_1 + node_of_gpu_0/res4_0_branch2a_bn_1 + node_of_gpu_0/res4_0_branch2a_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_0_branch2b_1 + node_of_gpu_0/res4_0_branch2b_bn_1 + node_of_gpu_0/res4_0_branch2b_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_0_branch1_1 + node_of_gpu_0/res4_0_branch1_bn_1
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_0_branch2c_1 + node_of_gpu_0/res4_0_branch2c_bn_1 + node_of_gpu_0/res4_0_branch2c_bn_2 + node_of_gpu_0/res4_0_branch2c_bn_3
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_1_branch2a_1 + node_of_gpu_0/res4_1_branch2a_bn_1 + node_of_gpu_0/res4_1_branch2a_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_1_branch2b_1 + node_of_gpu_0/res4_1_branch2b_bn_1 + node_of_gpu_0/res4_1_branch2b_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_1_branch2c_1 + node_of_gpu_0/res4_1_branch2c_bn_1 + node_of_gpu_0/res4_1_branch2c_bn_2 + node_of_gpu_0/res4_1_branch2c_bn_3
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_2_branch2a_1 + node_of_gpu_0/res4_2_branch2a_bn_1 + node_of_gpu_0/res4_2_branch2a_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_2_branch2b_1 + node_of_gpu_0/res4_2_branch2b_bn_1 + node_of_gpu_0/res4_2_branch2b_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_2_branch2c_1 + node_of_gpu_0/res4_2_branch2c_bn_1 + node_of_gpu_0/res4_2_branch2c_bn_2 + node_of_gpu_0/res4_2_branch2c_bn_3
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_3_branch2a_1 + node_of_gpu_0/res4_3_branch2a_bn_1 + node_of_gpu_0/res4_3_branch2a_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_3_branch2b_1 + node_of_gpu_0/res4_3_branch2b_bn_1 + node_of_gpu_0/res4_3_branch2b_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_3_branch2c_1 + node_of_gpu_0/res4_3_branch2c_bn_1 + node_of_gpu_0/res4_3_branch2c_bn_2 + node_of_gpu_0/res4_3_branch2c_bn_3
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_4_branch2a_1 + node_of_gpu_0/res4_4_branch2a_bn_1 + node_of_gpu_0/res4_4_branch2a_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_4_branch2b_1 + node_of_gpu_0/res4_4_branch2b_bn_1 + node_of_gpu_0/res4_4_branch2b_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_4_branch2c_1 + node_of_gpu_0/res4_4_branch2c_bn_1 + node_of_gpu_0/res4_4_branch2c_bn_2 + node_of_gpu_0/res4_4_branch2c_bn_3
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_5_branch2a_1 + node_of_gpu_0/res4_5_branch2a_bn_1 + node_of_gpu_0/res4_5_branch2a_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_5_branch2b_1 + node_of_gpu_0/res4_5_branch2b_bn_1 + node_of_gpu_0/res4_5_branch2b_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_5_branch2c_1 + node_of_gpu_0/res4_5_branch2c_bn_1 + node_of_gpu_0/res4_5_branch2c_bn_2 + node_of_gpu_0/res4_5_branch2c_bn_3
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_0_branch2a_1 + node_of_gpu_0/res5_0_branch2a_bn_1 + node_of_gpu_0/res5_0_branch2a_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_0_branch2b_1 + node_of_gpu_0/res5_0_branch2b_bn_1 + node_of_gpu_0/res5_0_branch2b_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_0_branch1_1 + node_of_gpu_0/res5_0_branch1_bn_1
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_0_branch2c_1 + node_of_gpu_0/res5_0_branch2c_bn_1 + node_of_gpu_0/res5_0_branch2c_bn_2 + node_of_gpu_0/res5_0_branch2c_bn_3
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_1_branch2a_1 + node_of_gpu_0/res5_1_branch2a_bn_1 + node_of_gpu_0/res5_1_branch2a_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_1_branch2b_1 + node_of_gpu_0/res5_1_branch2b_bn_1 + node_of_gpu_0/res5_1_branch2b_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_1_branch2c_1 + node_of_gpu_0/res5_1_branch2c_bn_1 + node_of_gpu_0/res5_1_branch2c_bn_2 + node_of_gpu_0/res5_1_branch2c_bn_3
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_2_branch2a_1 + node_of_gpu_0/res5_2_branch2a_bn_1 + node_of_gpu_0/res5_2_branch2a_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_2_branch2b_1 + node_of_gpu_0/res5_2_branch2b_bn_1 + node_of_gpu_0/res5_2_branch2b_bn_2
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_2_branch2c_1 + node_of_gpu_0/res5_2_branch2c_bn_1 + node_of_gpu_0/res5_2_branch2c_bn_2 + node_of_gpu_0/res5_2_branch2c_bn_3
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] POOLING: node_of_gpu_0/pool5_1
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/pred_1
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] SHUFFLE: reshape_after_node_of_gpu_0/pred_1
[01/06/2024-21:17:37] [I] [TRT] [GpuLayer] SOFTMAX: (Unnamed Layer* 180) [Softmax]
[01/06/2024-21:17:39] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +261, GPU +239, now: CPU 794, GPU 8020 (MiB)
[01/06/2024-21:17:39] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +82, GPU +88, now: CPU 876, GPU 8108 (MiB)
[01/06/2024-21:17:39] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[01/06/2024-21:18:06] [I] [TRT] Total Activation Memory: 32563448320
[01/06/2024-21:18:06] [I] [TRT] Detected 1 inputs and 1 output network tensors.
[01/06/2024-21:18:07] [I] [TRT] Total Host Persistent Memory: 134960
[01/06/2024-21:18:07] [I] [TRT] Total Device Persistent Memory: 291328
[01/06/2024-21:18:07] [I] [TRT] Total Scratch Memory: 0
[01/06/2024-21:18:07] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 105 MiB, GPU 4520 MiB
[01/06/2024-21:18:07] [I] [TRT] [BlockAssignment] Started assigning block shifts. This will take 59 steps to complete.
[01/06/2024-21:18:07] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 2.50731ms to assign 3 blocks to 59 nodes requiring 7225344 bytes.
[01/06/2024-21:18:07] [I] [TRT] Total Activation Memory: 7225344
[01/06/2024-21:18:07] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +89, GPU +128, now: CPU 89, GPU 128 (MiB)
[01/06/2024-21:18:07] [I] Engine built in 32.0571 sec.
[01/06/2024-21:18:07] [I] [TRT] Loaded engine size: 108 MiB
[01/06/2024-21:18:07] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +108, now: CPU 0, GPU 108 (MiB)
[01/06/2024-21:18:07] [I] Engine deserialized in 0.0264037 sec.
[01/06/2024-21:18:07] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +7, now: CPU 0, GPU 115 (MiB)
[01/06/2024-21:18:07] [I] Setting persistentCacheLimit to 0 bytes.
[01/06/2024-21:18:07] [I] Using values loaded from tabby_tiger_new.dat for input gpu_0/data_0
[01/06/2024-21:18:07] [I] Created input binding for gpu_0/data_0 with dimensions 1x3x224x224
[01/06/2024-21:18:07] [I] Using random values for output gpu_0/softmax_1
[01/06/2024-21:18:07] [I] Created output binding for gpu_0/softmax_1 with dimensions 1x1000
[01/06/2024-21:18:07] [I] Starting inference
[01/06/2024-21:18:10] [I] Warmup completed 25 queries over 200 ms
[01/06/2024-21:18:10] [I] Timing trace has 368 queries over 3.02551 s
[01/06/2024-21:18:10] [I]
[01/06/2024-21:18:10] [I] === Trace details ===
[01/06/2024-21:18:10] [I] Trace averages of 10 runs:
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.18617 ms - Host latency: 8.24604 ms (enqueue 0.898984 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.17549 ms - Host latency: 8.239 ms (enqueue 0.969962 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.1696 ms - Host latency: 8.22351 ms (enqueue 0.956189 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.18177 ms - Host latency: 8.23889 ms (enqueue 0.94278 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.1959 ms - Host latency: 8.25107 ms (enqueue 0.843054 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.199 ms - Host latency: 8.26624 ms (enqueue 0.873383 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.16205 ms - Host latency: 8.2147 ms (enqueue 0.786859 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.18692 ms - Host latency: 8.23406 ms (enqueue 0.847974 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.19838 ms - Host latency: 8.24911 ms (enqueue 0.793134 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.20027 ms - Host latency: 8.25245 ms (enqueue 0.882941 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.1801 ms - Host latency: 8.2303 ms (enqueue 0.759869 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.17186 ms - Host latency: 8.22761 ms (enqueue 0.798047 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.18611 ms - Host latency: 8.23773 ms (enqueue 0.716687 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.1741 ms - Host latency: 8.21832 ms (enqueue 0.814709 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.17648 ms - Host latency: 8.22267 ms (enqueue 0.80946 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.24866 ms - Host latency: 8.30702 ms (enqueue 0.788538 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.2257 ms - Host latency: 8.28545 ms (enqueue 0.790173 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.19059 ms - Host latency: 8.24176 ms (enqueue 0.774915 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.21903 ms - Host latency: 8.26468 ms (enqueue 0.81355 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.18147 ms - Host latency: 8.22831 ms (enqueue 0.792749 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.18661 ms - Host latency: 8.2334 ms (enqueue 0.860803 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.23063 ms - Host latency: 8.2746 ms (enqueue 0.752869 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.24011 ms - Host latency: 8.28914 ms (enqueue 0.793591 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.17722 ms - Host latency: 8.23274 ms (enqueue 0.794751 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.12122 ms - Host latency: 8.17615 ms (enqueue 0.749243 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.13369 ms - Host latency: 8.1853 ms (enqueue 0.788184 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.20225 ms - Host latency: 8.24851 ms (enqueue 0.759424 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.2021 ms - Host latency: 8.25151 ms (enqueue 0.748193 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.20247 ms - Host latency: 8.24915 ms (enqueue 0.745776 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.22815 ms - Host latency: 8.2773 ms (enqueue 0.811865 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.20508 ms - Host latency: 8.25596 ms (enqueue 0.792871 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.20718 ms - Host latency: 8.25557 ms (enqueue 0.78208 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.22878 ms - Host latency: 8.27707 ms (enqueue 0.7448 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.24929 ms - Host latency: 8.29609 ms (enqueue 0.744263 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.21614 ms - Host latency: 8.26428 ms (enqueue 0.7698 ms)
[01/06/2024-21:18:10] [I] Average on 10 runs - GPU latency: 8.22415 ms - Host latency: 8.26838 ms (enqueue 0.753809 ms)
[01/06/2024-21:18:10] [I]
[01/06/2024-21:18:10] [I] === Performance summary ===
[01/06/2024-21:18:10] [I] Throughput: 121.632 qps
[01/06/2024-21:18:10] [I] Latency: min = 8.08862 ms, max = 8.38208 ms, mean = 8.25002 ms, median = 8.24704 ms, percentile(90%) = 8.30786 ms, percentile(95%) = 8.33105 ms, percentile(99%) = 8.3739 ms
[01/06/2024-21:18:10] [I] Enqueue Time: min = 0.549316 ms, max = 1.31122 ms, mean = 0.805704 ms, median = 0.701355 ms, percentile(90%) = 1.09424 ms, percentile(95%) = 1.15619 ms, percentile(99%) = 1.28442 ms
[01/06/2024-21:18:10] [I] H2D Latency: min = 0.0306396 ms, max = 0.107422 ms, mean = 0.048564 ms, median = 0.0427246 ms, percentile(90%) = 0.0692139 ms, percentile(95%) = 0.0818176 ms, percentile(99%) = 0.0960693 ms
[01/06/2024-21:18:10] [I] GPU Compute Time: min = 8.05054 ms, max = 8.33057 ms, mean = 8.19876 ms, median = 8.1947 ms, percentile(90%) = 8.25244 ms, percentile(95%) = 8.27917 ms, percentile(99%) = 8.32007 ms
[01/06/2024-21:18:10] [I] D2H Latency: min = 0.00146484 ms, max = 0.00512695 ms, mean = 0.00269583 ms, median = 0.00262451 ms, percentile(90%) = 0.00317383 ms, percentile(95%) = 0.00415039 ms, percentile(99%) = 0.00482178 ms
[01/06/2024-21:18:10] [I] Total Host Walltime: 3.02551 s
[01/06/2024-21:18:10] [I] Total GPU Compute Time: 3.01714 s
[01/06/2024-21:18:10] [I] Explanations of the performance metrics are printed in the verbose logs.
[01/06/2024-21:18:10] [I]
[01/06/2024-21:18:10] [I] Output Tensors:
[01/06/2024-21:18:10] [I] gpu_0/softmax_1: (1x1000)
[01/06/2024-21:18:10] [I] 4.31219e-09 3.27232e-08 6.03102e-09 7.18321e-08 8.16481e-09 1.81365e-08 5.16522e-09 8.57955e-08 5.11415e-07 1.60392e-06 6.45662e-08 4.51378e-08 1.54967e-07 3.57323e-08 1.50919e-07 2.02957e-07 7.65038e-08 1.90281e-06 1.74306e-07 2.85584e-08 3.43507e-08 6.44649e-07 1.99142e-07 5.56715e-07 3.7378e-06 3.41553e-08 1.00956e-08 6.22791e-08 2.47408e-08 3.43196e-08 1.43985e-07 5.16512e-08 3.22884e-08 2.4746e-08 1.70658e-07 2.76592e-07 2.26479e-07 2.64818e-07 1.41553e-07 1.57544e-06 2.76394e-08 1.1691e-06 8.99859e-08 2.21168e-07 3.4883e-07 8.93799e-08 2.68247e-07 6.48447e-08 2.82481e-07 2.70294e-08 2.67875e-08 2.96821e-07 9.61721e-07 7.48387e-08 5.23036e-07 5.61955e-08 6.25804e-08 2.33416e-08 1.12503e-07 3.18197e-07 5.57254e-07 2.07291e-08 4.99568e-08 1.10107e-07 1.06207e-07 3.59789e-08 3.21541e-07 1.17124e-06 1.77877e-07 3.40172e-07 4.75466e-08 4.20288e-07 9.99052e-08 4.06495e-08 1.96554e-07 2.68796e-08 1.06138e-06 8.2752e-08 4.29385e-07 8.97167e-07 4.39071e-07 3.73672e-08 1.31639e-06 3.46636e-07 1.13394e-06 4.0564e-07 6.47102e-07 1.07446e-06 1.2565e-06 1.03746e-08 1.33707e-07 1.38764e-08 1.8123e-08 3.42457e-08 1.41741e-08 3.81502e-09 1.96164e-08 2.63373e-08 1.21651e-08 8.68709e-08 6.31099e-08 1.69948e-08 7.01094e-08 5.38871e-08 8.37287e-08 4.00074e-08 2.50307e-06 5.6799e-09 3.3906e-08 1.0181e-07 5.49512e-08 5.5593e-08 3.17932e-07 1.6816e-07 3.77826e-07 1.34098e-08 8.32268e-09 4.16739e-07 5.34258e-08 1.38523e-08 1.24534e-08 2.18278e-08 2.71875e-08 5.56683e-08 5.21856e-07 3.22795e-07 1.49402e-07 1.26212e-09 2.03107e-09 2.55109e-09 1.27947e-08 5.4118e-08 1.63819e-07 1.64926e-07 1.58095e-08 3.3948e-08 5.78881e-08 3.12724e-08 4.36633e-07 3.09818e-08 5.42449e-08 1.85306e-08 3.83025e-08 4.3818e-09 6.02894e-09 1.10333e-07 3.40501e-08 1.84608e-08 8.44784e-09 1.04001e-08 6.81149e-07 2.27807e-07 1.06525e-06 1.47929e-07 9.68131e-08 6.87125e-08 1.15556e-07 9.35603e-08 2.39817e-08 2.31128e-07 5.07244e-08 3.7144e-08 1.44836e-08 2.6576e-07 2.87379e-07 1.75414e-08 9.6001e-09 1.25015e-09 2.60274e-07 4.35331e-08 7.18117e-08 1.66704e-07 7.62806e-08 1.68137e-07 1.1277e-07 4.46677e-08 2.51944e-08 1.3368e-07 3.34865e-07 8.15782e-08 9.76842e-08 1.33335e-07 5.22618e-08 5.91053e-07 6.17981e-08 4.13658e-07 1.49203e-06 4.72318e-07 4.43317e-08 5.23025e-07 7.41547e-08 3.99494e-08 7.95538e-08 1.12667e-06 3.15917e-09 1.98114e-07 1.49298e-07 1.24126e-07 1.64913e-07 2.40972e-07 2.59106e-07 7.38615e-08 7.8473e-08 1.31038e-06 1.66646e-07 1.14513e-07 6.65491e-08 1.61169e-07 4.4621e-07 9.9237e-09 8.12422e-08 3.87257e-07 1.49381e-08 5.202e-08 2.28715e-07 7.87144e-08 1.19345e-07 5.88824e-08 5.04929e-08 1.05746e-07 4.39331e-08 5.89527e-08 7.00098e-08 5.36868e-07 1.29429e-06 6.34282e-08 7.08734e-08 3.25876e-07 2.49778e-08 1.33292e-07 2.13743e-08 5.03121e-08 1.65417e-06 1.3774e-06 1.84975e-07 5.55522e-07 1.33338e-07 2.3248e-08 3.4907e-08 1.14445e-07 7.24208e-08 7.43851e-08 2.9075e-07 5.99565e-07 3.40904e-08 7.90543e-08 3.7215e-07 1.78741e-08 1.167e-06 1.44256e-07 2.30053e-07 4.95958e-07 4.53147e-08 2.10106e-07 6.59327e-08 3.44226e-07 3.3919e-07 2.51628e-08 1.68144e-07 1.12057e-07 2.69027e-07 7.50873e-08 1.3233e-07 5.31296e-08 8.91849e-08 1.24447e-07 4.44719e-08 1.53876e-08 1.87676e-07 5.07761e-08 5.52412e-08 2.49056e-08 8.85184e-08 2.82941e-08 1.31065e-07 3.28248e-08 1.41085e-06 4.26582e-07 1.38685e-06 1.69765e-06 6.41466e-06 0.222164 0.649406 0.000265587 2.77833e-05 0.118299 2.35406e-05 0.00494512 0.000353588 0.000128664 4.27559e-05 7.96535e-06 0.0016923 7.9745e-07 1.81176e-08 2.7363e-08 1.32207e-08 6.3004e-08 
2.14228e-06 7.05765e-06 5.43203e-08 7.3703e-08 6.37532e-08 2.01735e-07 6.12086e-09 1.12721e-07 3.1454e-07 5.10996e-08 2.344e-08 5.03263e-07 3.63736e-07 5.61937e-07 2.68328e-07 3.18183e-07 1.19748e-07 1.60634e-07 3.15427e-06 1.38529e-06 1.23261e-07 8.09803e-08 3.88762e-08 2.09217e-08 1.48431e-07 7.79221e-09 5.48808e-07 2.108e-08 1.58163e-06 9.35706e-08 5.03974e-08 1.09431e-08 1.50153e-07 5.40322e-07 7.59874e-07 4.30646e-07 3.85417e-08 1.20151e-07 2.48362e-08 1.1377e-07 1.29412e-07 7.6831e-09 0.000145024 2.39234e-07 1.18882e-07 1.26118e-07 1.63536e-08 8.83463e-08 3.54368e-08 1.2187e-08 2.03324e-08 1.05163e-08 1.93692e-08 3.95044e-08 7.40076e-08 5.42306e-08 1.98142e-08 1.8415e-07 2.08151e-06 1.24005e-06 1.42602e-06 1.76945e-07 9.70648e-07 7.16566e-07 8.0947e-07 9.00147e-07 8.18857e-09 8.99452e-08 3.16986e-07 7.41196e-08 5.48573e-09 1.22034e-08 4.96608e-07 1.08288e-07 4.89107e-08 6.31607e-07 1.28074e-07 1.02883e-08 1.04286e-08 2.10383e-07 1.10675e-08 9.83445e-09 1.34309e-07 7.04894e-09 1.61061e-08 4.00807e-06 2.61122e-07 1.35763e-07 1.08121e-07 9.20258e-08 2.74398e-08 5.57276e-09 8.3922e-08 4.32769e-07 3.37907e-08 7.07499e-09 3.25664e-08 6.72846e-08 1.46607e-08 1.31611e-08 4.43623e-07 2.61208e-07 5.22855e-08 7.40795e-08 1.15638e-07 3.40772e-09 2.36093e-09 5.5517e-09 4.81737e-08 5.2435e-09 1.48066e-07 2.05642e-06 2.83374e-08 6.31107e-08 6.43683e-06 4.46948e-07 9.72298e-07 6.55227e-08 2.40795e-07 2.57961e-08 3.251e-07 6.85664e-06 5.10334e-07 2.05943e-07 6.87812e-08 6.4403e-09 2.31994e-08 1.278e-08 6.71909e-07 8.00414e-08 5.60545e-06 7.52245e-06 3.30366e-08 3.14636e-08 2.76163e-08 1.84685e-07 2.31083e-06 5.14467e-07 3.52963e-08 1.26886e-08 4.29366e-07 5.01386e-07 5.95346e-07 3.26838e-07 5.24194e-08 8.48795e-07 7.83283e-08 2.44987e-07 1.01231e-07 2.11385e-07 4.59941e-08 2.99574e-08 7.32672e-09 1.31519e-07 8.63965e-07 4.51796e-06 2.55771e-07 2.2358e-06 1.43611e-07 4.68767e-06 5.54229e-07 8.72966e-07 2.3307e-07 6.94223e-08 1.52769e-06 7.60646e-06 5.10959e-07 2.35479e-07 1.43391e-08 1.29262e-08 1.43641e-07 7.22168e-07 4.21727e-07 7.44233e-08 1.44281e-07 1.35441e-07 5.333e-07 5.01447e-07 8.81705e-09 7.42319e-08 3.26115e-05 6.89304e-07 5.16655e-08 1.03141e-07 1.05753e-08 2.02052e-07 4.97318e-08 1.88022e-08 4.88042e-08 1.2219e-06 1.05701e-07 5.96129e-07 1.17827e-06 4.65682e-08 1.07844e-07 4.72346e-08 4.77986e-07 3.65281e-08 1.60299e-07 1.8169e-08 2.05875e-08 3.21797e-07 1.69961e-09 2.22464e-06 4.00111e-07 6.7832e-07 2.27729e-06 5.08218e-08 8.98886e-07 5.11845e-07 2.20215e-06 3.93419e-08 1.27971e-08 1.47525e-09 3.5842e-07 2.9207e-08 9.53878e-06 6.27738e-06 5.2522e-08 1.48063e-07 1.28023e-06 2.25112e-06 2.70763e-07 4.22606e-08 1.8201e-07 4.31097e-07 7.04557e-08 6.30349e-08 2.50493e-07 2.34575e-07 9.92349e-08 2.22665e-06 3.47416e-07 1.49393e-07 5.91125e-07 1.87736e-07 9.69455e-07 3.18923e-08 3.75779e-08 2.48435e-08 2.65921e-08 4.20734e-05 4.29164e-09 1.25327e-07 1.52806e-06 4.51428e-06 7.61211e-08 2.55164e-07 4.47047e-08 2.0341e-09 4.07301e-08 1.17877e-07 1.86437e-08 8.28405e-08 3.22565e-06 3.46128e-06 1.30752e-10 3.44816e-09 5.05364e-08 4.86761e-08 3.2116e-07 4.18953e-07 7.2722e-08 4.18564e-08 8.55961e-07 2.87197e-07 1.56506e-07 5.14765e-08 6.6774e-07 1.41904e-07 2.31405e-07 1.70621e-08 1.36616e-07 1.25413e-08 6.51188e-06 1.9827e-08 2.87298e-07 2.18521e-08 6.62322e-08 2.68944e-07 1.42141e-07 1.83672e-08 1.91614e-07 1.50257e-07 3.74416e-08 1.89776e-08 4.23657e-08 4.79333e-07 1.0926e-07 1.6216e-06 1.14329e-06 1.34294e-07 1.04939e-08 1.07619e-06 1.99505e-07 1.39986e-06 1.07741e-07 6.27208e-08 4.03869e-06 
4.54529e-08 4.6653e-08 6.08309e-07 8.17019e-07 6.28137e-08 2.89075e-07 1.14106e-07 2.50617e-07 7.47386e-07 1.22178e-06 6.25064e-08 0.000401968 1.995e-08 8.0799e-07 1.81799e-05 6.74927e-08 8.15525e-08 1.54894e-07 7.19469e-07 1.00188e-07 3.43813e-07 4.36554e-07 5.30778e-07 4.98392e-06 5.38992e-07 0.000510889 4.49317e-07 5.84905e-08 5.6472e-09 4.88459e-06 1.51236e-08 4.43269e-09 1.34801e-07 1.88732e-07 6.88306e-06 1.25287e-07 4.29881e-07 2.15576e-08 1.68197e-08 5.77935e-08 2.65805e-06 9.51291e-07 1.79138e-06 1.02712e-07 7.92065e-08 6.17835e-08 2.07746e-07 1.19137e-07 8.93553e-09 1.36442e-07 1.69639e-07 3.54148e-07 2.53757e-08 1.60009e-07 9.06202e-08 3.82748e-06 9.70912e-08 1.18824e-08 4.13489e-07 3.10697e-08 4.50127e-08 1.65702e-07 6.16994e-07 2.45009e-07 1.57708e-08 1.47974e-07 2.83974e-08 9.61385e-07 4.57889e-07 2.87782e-08 1.07784e-07 8.28538e-09 5.68251e-07 1.0362e-06 2.63081e-07 6.69241e-08 1.46807e-05 4.25193e-06 2.40326e-08 2.43369e-07 4.72804e-06 6.44753e-08 1.24654e-06 9.60749e-07 2.665e-06 3.06748e-07 4.22162e-08 2.92427e-07 3.25192e-09 2.96528e-08 8.33966e-09 6.63828e-08 1.06968e-07 9.00791e-09 7.35332e-08 1.85577e-07 2.68262e-08 2.46429e-08 2.93231e-07 1.01367e-06 1.20959e-06 5.22922e-08 1.69608e-07 1.36372e-05 2.8331e-07 3.12253e-07 4.90316e-06 3.17488e-07 1.42068e-08 2.129e-08 1.18458e-07 1.58725e-06 1.99419e-07 8.66641e-08 1.70225e-06 6.07794e-08 6.14967e-07 4.31817e-07 2.98631e-07 1.01794e-06 3.27476e-08 9.34612e-08 1.422e-07 3.08467e-06 4.1963e-07 3.22064e-05 4.64545e-06 6.38533e-09 3.44031e-06 3.20264e-09 5.47952e-09 0.000508376 4.04581e-08 1.1234e-07 1.17033e-06 2.57371e-08 1.73591e-06 1.32384e-08 9.25029e-08 4.6047e-07 7.26198e-07 9.6773e-07 4.91101e-08 5.24099e-07 3.4001e-06 2.08103e-06 6.28604e-07 1.2712e-07 2.645e-08 3.99737e-07 3.7949e-06 3.96355e-08 9.1404e-08 1.45296e-05 4.03064e-08 9.10027e-08 1.87741e-05 4.65326e-07 2.73036e-08 1.13201e-06 1.3518e-07 2.4601e-07 7.95868e-08 2.02313e-06 9.92311e-05 8.18532e-09 7.17072e-07 2.94609e-07 1.41133e-07 7.069e-09 9.73874e-08 2.29791e-06 2.35752e-05 1.8678e-05 9.41717e-09 3.27657e-07 8.5668e-07 3.25826e-07 1.63054e-06 4.30985e-08 1.36001e-07 1.16965e-06 2.12713e-08 2.99713e-09 8.14538e-08 2.85041e-06 7.47095e-06 7.25229e-07 3.23639e-08 1.03043e-07 1.84035e-06 1.17752e-07 1.75386e-07 7.34599e-08 1.78589e-08 6.76068e-07 4.76833e-07 1.45566e-05 1.60689e-07 1.28724e-07 1.01615e-06 2.44808e-08 4.90249e-06 2.35065e-08 1.2155e-07 2.73978e-08 4.22545e-08 1.20204e-06 8.49469e-06 0.000110935 1.47411e-08 1.7354e-06 8.90351e-07 4.84894e-08 4.525e-06 5.31328e-09 1.49173e-06 1.845e-08 2.27357e-07 2.98362e-07 1.81945e-08 1.81187e-06 1.81624e-08 8.46259e-09 2.20169e-08 1.4298e-08 2.21175e-07 1.06652e-07 1.09752e-07 1.98087e-07 2.29561e-07 4.17271e-07 1.89741e-08 8.16455e-07 7.98417e-07 5.70521e-09 2.26532e-08 7.90286e-07 4.12424e-06 1.00397e-06 1.771e-06 2.30236e-06 2.27294e-08 5.22434e-06 2.00706e-05 8.904e-07 6.1891e-07 1.74583e-07 1.3251e-05 2.08552e-07 8.7916e-09 2.82986e-08 7.18589e-08 1.65902e-07 7.60535e-07 6.43047e-06 2.04698e-08 5.72088e-08 1.26994e-06 1.1042e-08 1.02745e-07 1.65413e-06 9.2257e-08 1.67806e-08 2.19027e-06 3.1811e-07 8.77542e-08 2.21991e-08 5.99528e-08 2.62875e-07 1.94251e-07 4.18546e-08 4.09085e-07 1.74893e-06 7.53293e-08 2.81697e-06 3.86782e-08 5.46469e-09 8.96562e-08 3.53197e-06 3.09246e-08 4.60778e-08 6.10538e-06 2.47808e-07 2.08653e-07 2.94466e-07 2.35702e-06 2.14683e-08 6.43639e-07 1.06576e-07 3.97759e-08 2.1646e-08 4.0269e-07 1.52702e-07 1.33815e-08 7.36706e-08 8.30909e-08 1.85075e-06 1.07603e-08 9.49883e-07 
2.50373e-06 4.67766e-06 2.00213e-06 1.45547e-08 3.71791e-08 7.28418e-07 2.36864e-07 3.34534e-05 7.90807e-07 1.10756e-07 3.31662e-07 1.49803e-08 2.12927e-07 2.73291e-07 2.73512e-07 9.27414e-08 2.73608e-08 8.97993e-09 9.94292e-09 5.26084e-07 3.28558e-07 4.1904e-05 1.70534e-07 2.15817e-08 4.75436e-07 4.26341e-08 5.15595e-08 1.16952e-07 3.27501e-09 3.05879e-09 2.58485e-08 2.0689e-08 4.9584e-06 3.05773e-08 3.75785e-08 1.18929e-07 1.53957e-09 1.93365e-08 1.62997e-08 1.06609e-07 1.11543e-06 1.63069e-07 6.40267e-07 4.10568e-08 1.128e-08 1.67669e-07 5.65517e-07 7.37952e-08 2.78064e-07 5.35779e-08 3.93165e-07 4.64912e-07 6.59557e-08 8.85602e-07 1.08205e-07 8.13449e-08 3.4025e-07 4.81834e-07 9.92183e-08 1.66807e-07 9.00434e-08 1.8549e-06 1.12568e-08 1.14591e-08 6.0587e-07 1.3412e-08 1.67332e-08 1.64203e-08 1.11066e-08 3.31439e-07 1.3843e-07 9.92739e-07 7.68445e-08 2.95164e-08 7.48974e-06 1.90854e-08 1.8991e-08 3.28791e-08 1.05048e-07 1.91254e-08 5.89584e-07 5.12448e-07 1.78304e-08 6.61151e-09 7.97469e-07 4.15751e-08 1.09356e-08 1.43087e-07 4.54704e-08 1.89151e-08 6.62792e-07 3.76557e-07 1.95258e-08 9.73641e-08 2.4037e-08 1.33256e-08 9.10277e-09 6.11777e-09 5.82621e-08 5.62718e-08 1.01834e-08 7.20226e-07 1.05142e-06
Inference result: Detected:
[01/06/2024-21:18:10] [I] [1]  tiger cat
[01/06/2024-21:18:10] [I] [2]  tabby
[01/06/2024-21:18:10] [I] [3]  Egyptian cat
[01/06/2024-21:18:10] [I] [4]  lynx
[01/06/2024-21:18:10] [I] [5]  tiger

Firstly, I need to clarify a few things:
In general, the trtexec app is used to measure the performance of a model, so by default it feeds random data to all input tensors.
Each model has a different set of pre-processing and post-processing steps. The trtexec binary does not do any pre-processing or post-processing for a model.
When you try to perform inference on an input image with trtexec, you need to make sure the input data fed into the model is already preprocessed.
So for the ResNet50 ONNX model, the preprocessing below (resize to 224x224 → HWC-to-CHW conversion → normalization) is required. Please modify the DAT file preparation as below:

import numpy as np
from PIL import Image

# Load the test image shipped with the TensorRT samples.
img = Image.open('/usr/src/tensorrt/data/resnet50/tabby_tiger_cat.jpg')

# Resize to 224x224, convert HWC -> CHW, cast to float32, and flatten.
# (Image.ANTIALIAS is the resampling filter name in older Pillow versions.)
image_arr = (
    np.asarray(img.resize((224, 224), Image.ANTIALIAS))
    .transpose([2, 0, 1])
    .astype(np.float32)
    .ravel()
)
print(image_arr.size)

# Normalize: scale to [0, 1], then subtract mean 0.45 and divide by std 0.225.
img_norm = (image_arr / 255.0 - 0.45) / 0.225
print(img_norm[0:10])

# Write the preprocessed tensor as raw float32 bytes for --loadInputs.
dat_file_path = 'tabby_tiger_new.dat'
with open(dat_file_path, "wb") as fp:
    img_norm.tofile(fp)
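(Note that in my command I also bind the DAT file to the input tensor by name, i.e. --loadInputs=gpu_0/data_0:tabby_tiger_new.dat, as in the log I posted above.)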

Hi SivaRamaKrishna, thank you for your quick response.

I took the code you gave for properly creating the .dat file and created the .dat file.

But even then I did not get the correct prediction. I request you to please send the ResNet50.onnx model you have at your side, so that we are both in sync on using the correct ResNet50.onnx model.

Here is the inference output I got.
Inference result: Detected:
[01/07/2024-00:54:17] [I] [1] spotlight
[01/07/2024-00:54:17] [I] [2] wall clock
[01/07/2024-00:54:17] [I] [3] lampshade
[01/07/2024-00:54:17] [I] [4] television
[01/07/2024-00:54:17] [I] [5] radiator
[01/07/2024-00:54:17] [I]

Thanks and Regards

Nagaraj Trivedi

Dear @trivedi.nagaraj,
I used the ResNet50 model from /usr/src/tensorrt/data/resnet50/ResNet50.onnx. I used FP32 precision with trtexec. Have you used the same command as mine? I used the JetPack 5.1 release.

Hi SivaRamaKrishna, yes, except for the JetPack version it is the same.
Mine is JetPack 4.6.1.

Thanks and Regards

Nagaraj Trivedi

Dear @trivedi.nagaraj,
This is strange. Could you attach your model or share a Google Drive link to it? Let me verify with the JetPack 5.1 release.

I just modified the dumpOutput function in sampleUtils.h similarly to yours and also created a new DAT file.

Hi SivaRamaKrishna, please find attached the model file. It is in zip format; you can unzip it.
If this doesn't work at your side, then please share your model and I will run inference on my machine.

Thanks and Regards

Nagaraj Trivedi
ResNet50.zip (90.7 MB)

Dear @trivedi.nagaraj,
Both models are the same, and I could get the output.


nvidia@tegra-ubuntu:~/siva$ diff /usr/src/tensorrt/data/resnet50/ResNet50.onnx ResNet50.onnx
nvidia@tegra-ubuntu:~/siva$
nvidia@tegra-ubuntu:~/siva$ cd sampleUtils/
nvidia@tegra-ubuntu:~/siva/sampleUtils$ ls
create_dat.py  sampleUtils.h  tabby_tiger.dat  tabby_tiger_new.dat
nvidia@tegra-ubuntu:~/siva/sampleUtils$ /usr/src/tensorrt/bin/trtexec --onnx=/home/nvidia/siva/ResNet50.onnx --loadInputs=gpu_0/data_0:tabby_tiger_new.dat --dumpOutput
&&&& RUNNING TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=/home/nvidia/siva/ResNet50.onnx --loadInputs=gpu_0/data_0:tabby_tiger_new.dat --dumpOutput
[01/07/2024-09:08:22] [I] === Model Options ===
[01/07/2024-09:08:22] [I] Format: ONNX
[01/07/2024-09:08:22] [I] Model: /home/nvidia/siva/ResNet50.onnx
[01/07/2024-09:08:22] [I] Output:
[01/07/2024-09:08:22] [I] === Build Options ===
[01/07/2024-09:08:22] [I] Max batch: explicit batch
[01/07/2024-09:08:22] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[01/07/2024-09:08:22] [I] minTiming: 1
[01/07/2024-09:08:22] [I] avgTiming: 8
[01/07/2024-09:08:22] [I] Precision: FP32
[01/07/2024-09:08:22] [I] LayerPrecisions:
[01/07/2024-09:08:22] [I] Calibration:
[01/07/2024-09:08:22] [I] Refit: Disabled
[01/07/2024-09:08:22] [I] Sparsity: Disabled
[01/07/2024-09:08:22] [I] Safe mode: Disabled
[01/07/2024-09:08:22] [I] DirectIO mode: Disabled
[01/07/2024-09:08:22] [I] Restricted mode: Disabled
[01/07/2024-09:08:22] [I] Build only: Disabled
[01/07/2024-09:08:22] [I] Save engine:
[01/07/2024-09:08:22] [I] Load engine:
[01/07/2024-09:08:22] [I] Profiling verbosity: 0
[01/07/2024-09:08:22] [I] Tactic sources: Using default tactic sources
[01/07/2024-09:08:22] [I] timingCacheMode: local
[01/07/2024-09:08:22] [I] timingCacheFile:
[01/07/2024-09:08:22] [I] Heuristic: Disabled
[01/07/2024-09:08:22] [I] Preview Features: Use default preview flags.
[01/07/2024-09:08:22] [I] Input(s)s format: fp32:CHW
[01/07/2024-09:08:22] [I] Output(s)s format: fp32:CHW
[01/07/2024-09:08:22] [I] Input build shapes: model
[01/07/2024-09:08:22] [I] Input calibration shapes: model
[01/07/2024-09:08:22] [I] === System Options ===
[01/07/2024-09:08:22] [I] Device: 0
[01/07/2024-09:08:22] [I] DLACore:
[01/07/2024-09:08:22] [I] Plugins:
[01/07/2024-09:08:22] [I] === Inference Options ===
[01/07/2024-09:08:22] [I] Batch: Explicit
[01/07/2024-09:08:22] [I] Input inference shapes: model
[01/07/2024-09:08:22] [I] Iterations: 10
[01/07/2024-09:08:22] [I] Duration: 3s (+ 200ms warm up)
[01/07/2024-09:08:22] [I] Sleep time: 0ms
[01/07/2024-09:08:22] [I] Idle time: 0ms
[01/07/2024-09:08:22] [I] Streams: 1
[01/07/2024-09:08:22] [I] ExposeDMA: Disabled
[01/07/2024-09:08:22] [I] Data transfers: Enabled
[01/07/2024-09:08:22] [I] Spin-wait: Disabled
[01/07/2024-09:08:22] [I] Multithreading: Disabled
[01/07/2024-09:08:22] [I] CUDA Graph: Disabled
[01/07/2024-09:08:22] [I] Separate profiling: Disabled
[01/07/2024-09:08:22] [I] Time Deserialize: Disabled
[01/07/2024-09:08:22] [I] Time Refit: Disabled
[01/07/2024-09:08:22] [I] NVTX verbosity: 0
[01/07/2024-09:08:22] [I] Persistent Cache Ratio: 0
[01/07/2024-09:08:22] [I] Inputs:
[01/07/2024-09:08:22] [I] gpu_0/data_0<-tabby_tiger_new.dat
[01/07/2024-09:08:22] [I] === Reporting Options ===
[01/07/2024-09:08:22] [I] Verbose: Disabled
[01/07/2024-09:08:22] [I] Averages: 10 inferences
[01/07/2024-09:08:22] [I] Percentiles: 90,95,99
[01/07/2024-09:08:22] [I] Dump refittable layers:Disabled
[01/07/2024-09:08:22] [I] Dump output: Enabled
[01/07/2024-09:08:22] [I] Profile: Disabled
[01/07/2024-09:08:22] [I] Export timing to JSON file:
[01/07/2024-09:08:22] [I] Export output to JSON file:
[01/07/2024-09:08:22] [I] Export profile to JSON file:
[01/07/2024-09:08:22] [I]
[01/07/2024-09:08:22] [I] === Device Information ===
[01/07/2024-09:08:22] [I] Selected Device: Xavier
[01/07/2024-09:08:22] [I] Compute Capability: 7.2
[01/07/2024-09:08:22] [I] SMs: 8
[01/07/2024-09:08:22] [I] Compute Clock Rate: 1.377 GHz
[01/07/2024-09:08:22] [I] Device Global Memory: 31010 MiB
[01/07/2024-09:08:22] [I] Shared Memory per SM: 96 KiB
[01/07/2024-09:08:22] [I] Memory Bus Width: 256 bits (ECC disabled)
[01/07/2024-09:08:22] [I] Memory Clock Rate: 1.377 GHz
[01/07/2024-09:08:22] [I]
[01/07/2024-09:08:22] [I] TensorRT version: 8.5.2
[01/07/2024-09:08:22] [I] [TRT] [MemUsageChange] Init CUDA: CPU +187, GPU +0, now: CPU 216, GPU 7479 (MiB)
[01/07/2024-09:08:24] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +106, GPU +100, now: CPU 344, GPU 7594 (MiB)
[01/07/2024-09:08:24] [I] Start parsing network model
[01/07/2024-09:08:24] [I] [TRT] ----------------------------------------------------------------
[01/07/2024-09:08:24] [I] [TRT] Input filename:   /home/nvidia/siva/ResNet50.onnx
[01/07/2024-09:08:24] [I] [TRT] ONNX IR version:  0.0.3
[01/07/2024-09:08:24] [I] [TRT] Opset version:    9
[01/07/2024-09:08:24] [I] [TRT] Producer name:    onnx-caffe2
[01/07/2024-09:08:24] [I] [TRT] Producer version:
[01/07/2024-09:08:24] [I] [TRT] Domain:
[01/07/2024-09:08:24] [I] [TRT] Model version:    0
[01/07/2024-09:08:24] [I] [TRT] Doc string:
[01/07/2024-09:08:24] [I] [TRT] ----------------------------------------------------------------
[01/07/2024-09:08:24] [W] [TRT] onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[01/07/2024-09:08:24] [I] Finish parsing network model
[01/07/2024-09:08:24] [I] [TRT] ---------- Layers Running on DLA ----------
[01/07/2024-09:08:24] [I] [TRT] ---------- Layers Running on GPU ----------
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/conv1_1 + node_of_gpu_0/res_conv1_bn_1 + node_of_gpu_0/res_conv1_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] POOLING: node_of_gpu_0/pool1_1
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_0_branch2a_1 + node_of_gpu_0/res2_0_branch2a_bn_1 + node_of_gpu_0/res2_0_branch2a_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_0_branch2b_1 + node_of_gpu_0/res2_0_branch2b_bn_1 + node_of_gpu_0/res2_0_branch2b_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_0_branch1_1 + node_of_gpu_0/res2_0_branch1_bn_1
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_0_branch2c_1 + node_of_gpu_0/res2_0_branch2c_bn_1 + node_of_gpu_0/res2_0_branch2c_bn_2 + node_of_gpu_0/res2_0_branch2c_bn_3
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_1_branch2a_1 + node_of_gpu_0/res2_1_branch2a_bn_1 + node_of_gpu_0/res2_1_branch2a_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_1_branch2b_1 + node_of_gpu_0/res2_1_branch2b_bn_1 + node_of_gpu_0/res2_1_branch2b_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_1_branch2c_1 + node_of_gpu_0/res2_1_branch2c_bn_1 + node_of_gpu_0/res2_1_branch2c_bn_2 + node_of_gpu_0/res2_1_branch2c_bn_3
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_2_branch2a_1 + node_of_gpu_0/res2_2_branch2a_bn_1 + node_of_gpu_0/res2_2_branch2a_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_2_branch2b_1 + node_of_gpu_0/res2_2_branch2b_bn_1 + node_of_gpu_0/res2_2_branch2b_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res2_2_branch2c_1 + node_of_gpu_0/res2_2_branch2c_bn_1 + node_of_gpu_0/res2_2_branch2c_bn_2 + node_of_gpu_0/res2_2_branch2c_bn_3
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_0_branch2a_1 + node_of_gpu_0/res3_0_branch2a_bn_1 + node_of_gpu_0/res3_0_branch2a_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_0_branch2b_1 + node_of_gpu_0/res3_0_branch2b_bn_1 + node_of_gpu_0/res3_0_branch2b_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_0_branch1_1 + node_of_gpu_0/res3_0_branch1_bn_1
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_0_branch2c_1 + node_of_gpu_0/res3_0_branch2c_bn_1 + node_of_gpu_0/res3_0_branch2c_bn_2 + node_of_gpu_0/res3_0_branch2c_bn_3
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_1_branch2a_1 + node_of_gpu_0/res3_1_branch2a_bn_1 + node_of_gpu_0/res3_1_branch2a_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_1_branch2b_1 + node_of_gpu_0/res3_1_branch2b_bn_1 + node_of_gpu_0/res3_1_branch2b_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_1_branch2c_1 + node_of_gpu_0/res3_1_branch2c_bn_1 + node_of_gpu_0/res3_1_branch2c_bn_2 + node_of_gpu_0/res3_1_branch2c_bn_3
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_2_branch2a_1 + node_of_gpu_0/res3_2_branch2a_bn_1 + node_of_gpu_0/res3_2_branch2a_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_2_branch2b_1 + node_of_gpu_0/res3_2_branch2b_bn_1 + node_of_gpu_0/res3_2_branch2b_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_2_branch2c_1 + node_of_gpu_0/res3_2_branch2c_bn_1 + node_of_gpu_0/res3_2_branch2c_bn_2 + node_of_gpu_0/res3_2_branch2c_bn_3
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_3_branch2a_1 + node_of_gpu_0/res3_3_branch2a_bn_1 + node_of_gpu_0/res3_3_branch2a_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_3_branch2b_1 + node_of_gpu_0/res3_3_branch2b_bn_1 + node_of_gpu_0/res3_3_branch2b_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res3_3_branch2c_1 + node_of_gpu_0/res3_3_branch2c_bn_1 + node_of_gpu_0/res3_3_branch2c_bn_2 + node_of_gpu_0/res3_3_branch2c_bn_3
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_0_branch2a_1 + node_of_gpu_0/res4_0_branch2a_bn_1 + node_of_gpu_0/res4_0_branch2a_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_0_branch2b_1 + node_of_gpu_0/res4_0_branch2b_bn_1 + node_of_gpu_0/res4_0_branch2b_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_0_branch1_1 + node_of_gpu_0/res4_0_branch1_bn_1
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_0_branch2c_1 + node_of_gpu_0/res4_0_branch2c_bn_1 + node_of_gpu_0/res4_0_branch2c_bn_2 + node_of_gpu_0/res4_0_branch2c_bn_3
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_1_branch2a_1 + node_of_gpu_0/res4_1_branch2a_bn_1 + node_of_gpu_0/res4_1_branch2a_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_1_branch2b_1 + node_of_gpu_0/res4_1_branch2b_bn_1 + node_of_gpu_0/res4_1_branch2b_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_1_branch2c_1 + node_of_gpu_0/res4_1_branch2c_bn_1 + node_of_gpu_0/res4_1_branch2c_bn_2 + node_of_gpu_0/res4_1_branch2c_bn_3
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_2_branch2a_1 + node_of_gpu_0/res4_2_branch2a_bn_1 + node_of_gpu_0/res4_2_branch2a_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_2_branch2b_1 + node_of_gpu_0/res4_2_branch2b_bn_1 + node_of_gpu_0/res4_2_branch2b_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_2_branch2c_1 + node_of_gpu_0/res4_2_branch2c_bn_1 + node_of_gpu_0/res4_2_branch2c_bn_2 + node_of_gpu_0/res4_2_branch2c_bn_3
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_3_branch2a_1 + node_of_gpu_0/res4_3_branch2a_bn_1 + node_of_gpu_0/res4_3_branch2a_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_3_branch2b_1 + node_of_gpu_0/res4_3_branch2b_bn_1 + node_of_gpu_0/res4_3_branch2b_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_3_branch2c_1 + node_of_gpu_0/res4_3_branch2c_bn_1 + node_of_gpu_0/res4_3_branch2c_bn_2 + node_of_gpu_0/res4_3_branch2c_bn_3
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_4_branch2a_1 + node_of_gpu_0/res4_4_branch2a_bn_1 + node_of_gpu_0/res4_4_branch2a_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_4_branch2b_1 + node_of_gpu_0/res4_4_branch2b_bn_1 + node_of_gpu_0/res4_4_branch2b_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_4_branch2c_1 + node_of_gpu_0/res4_4_branch2c_bn_1 + node_of_gpu_0/res4_4_branch2c_bn_2 + node_of_gpu_0/res4_4_branch2c_bn_3
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_5_branch2a_1 + node_of_gpu_0/res4_5_branch2a_bn_1 + node_of_gpu_0/res4_5_branch2a_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_5_branch2b_1 + node_of_gpu_0/res4_5_branch2b_bn_1 + node_of_gpu_0/res4_5_branch2b_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res4_5_branch2c_1 + node_of_gpu_0/res4_5_branch2c_bn_1 + node_of_gpu_0/res4_5_branch2c_bn_2 + node_of_gpu_0/res4_5_branch2c_bn_3
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_0_branch2a_1 + node_of_gpu_0/res5_0_branch2a_bn_1 + node_of_gpu_0/res5_0_branch2a_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_0_branch2b_1 + node_of_gpu_0/res5_0_branch2b_bn_1 + node_of_gpu_0/res5_0_branch2b_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_0_branch1_1 + node_of_gpu_0/res5_0_branch1_bn_1
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_0_branch2c_1 + node_of_gpu_0/res5_0_branch2c_bn_1 + node_of_gpu_0/res5_0_branch2c_bn_2 + node_of_gpu_0/res5_0_branch2c_bn_3
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_1_branch2a_1 + node_of_gpu_0/res5_1_branch2a_bn_1 + node_of_gpu_0/res5_1_branch2a_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_1_branch2b_1 + node_of_gpu_0/res5_1_branch2b_bn_1 + node_of_gpu_0/res5_1_branch2b_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_1_branch2c_1 + node_of_gpu_0/res5_1_branch2c_bn_1 + node_of_gpu_0/res5_1_branch2c_bn_2 + node_of_gpu_0/res5_1_branch2c_bn_3
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_2_branch2a_1 + node_of_gpu_0/res5_2_branch2a_bn_1 + node_of_gpu_0/res5_2_branch2a_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_2_branch2b_1 + node_of_gpu_0/res5_2_branch2b_bn_1 + node_of_gpu_0/res5_2_branch2b_bn_2
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/res5_2_branch2c_1 + node_of_gpu_0/res5_2_branch2c_bn_1 + node_of_gpu_0/res5_2_branch2c_bn_2 + node_of_gpu_0/res5_2_branch2c_bn_3
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] POOLING: node_of_gpu_0/pool5_1
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] CONVOLUTION: node_of_gpu_0/pred_1
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] SHUFFLE: reshape_after_node_of_gpu_0/pred_1
[01/07/2024-09:08:24] [I] [TRT] [GpuLayer] SOFTMAX: (Unnamed Layer* 180) [Softmax]
[01/07/2024-09:08:26] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +261, GPU +241, now: CPU 794, GPU 8021 (MiB)
[01/07/2024-09:08:26] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +82, GPU +78, now: CPU 876, GPU 8099 (MiB)
[01/07/2024-09:08:26] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[01/07/2024-09:08:54] [I] [TRT] Total Activation Memory: 32563449344
[01/07/2024-09:08:54] [I] [TRT] Detected 1 inputs and 1 output network tensors.
[01/07/2024-09:08:54] [I] [TRT] Total Host Persistent Memory: 131376
[01/07/2024-09:08:54] [I] [TRT] Total Device Persistent Memory: 291328
[01/07/2024-09:08:54] [I] [TRT] Total Scratch Memory: 0
[01/07/2024-09:08:54] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 105 MiB, GPU 4520 MiB
[01/07/2024-09:08:54] [I] [TRT] [BlockAssignment] Started assigning block shifts. This will take 61 steps to complete.
[01/07/2024-09:08:54] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 2.66409ms to assign 4 blocks to 61 nodes requiring 7225856 bytes.
[01/07/2024-09:08:54] [I] [TRT] Total Activation Memory: 7225856
[01/07/2024-09:08:54] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +89, GPU +128, now: CPU 89, GPU 128 (MiB)
[01/07/2024-09:08:54] [I] Engine built in 32.4063 sec.
[01/07/2024-09:08:54] [I] [TRT] Loaded engine size: 108 MiB
[01/07/2024-09:08:54] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +108, now: CPU 0, GPU 108 (MiB)
[01/07/2024-09:08:54] [I] Engine deserialized in 0.0278916 sec.
[01/07/2024-09:08:54] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +7, now: CPU 0, GPU 115 (MiB)
[01/07/2024-09:08:54] [I] Setting persistentCacheLimit to 0 bytes.
[01/07/2024-09:08:54] [I] Using values loaded from tabby_tiger_new.dat for input gpu_0/data_0
[01/07/2024-09:08:54] [I] Created input binding for gpu_0/data_0 with dimensions 1x3x224x224
[01/07/2024-09:08:54] [I] Using random values for output gpu_0/softmax_1
[01/07/2024-09:08:54] [I] Created output binding for gpu_0/softmax_1 with dimensions 1x1000
[01/07/2024-09:08:54] [I] Starting inference
[01/07/2024-09:08:57] [I] Warmup completed 25 queries over 200 ms
[01/07/2024-09:08:57] [I] Timing trace has 366 queries over 3.02557 s
[01/07/2024-09:08:57] [I]
[01/07/2024-09:08:57] [I] === Trace details ===
[01/07/2024-09:08:57] [I] Trace averages of 10 runs:
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.26744 ms - Host latency: 8.32415 ms (enqueue 0.861551 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.26538 ms - Host latency: 8.32699 ms (enqueue 0.820032 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.2719 ms - Host latency: 8.33449 ms (enqueue 0.803571 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.30131 ms - Host latency: 8.3721 ms (enqueue 0.78364 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.29454 ms - Host latency: 8.37748 ms (enqueue 0.830572 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.29045 ms - Host latency: 8.3752 ms (enqueue 0.767267 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.26724 ms - Host latency: 8.35527 ms (enqueue 0.804541 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.24507 ms - Host latency: 8.31821 ms (enqueue 0.732751 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.24948 ms - Host latency: 8.3255 ms (enqueue 0.843036 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.23842 ms - Host latency: 8.30631 ms (enqueue 0.747308 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.24622 ms - Host latency: 8.31119 ms (enqueue 0.771924 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.26063 ms - Host latency: 8.32003 ms (enqueue 0.825635 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.24962 ms - Host latency: 8.31108 ms (enqueue 0.793933 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.22375 ms - Host latency: 8.30591 ms (enqueue 0.811487 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.19908 ms - Host latency: 8.26205 ms (enqueue 0.800855 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.24227 ms - Host latency: 8.31268 ms (enqueue 0.843237 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.19852 ms - Host latency: 8.27793 ms (enqueue 0.773608 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.24619 ms - Host latency: 8.30687 ms (enqueue 0.781409 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.24457 ms - Host latency: 8.3178 ms (enqueue 0.776697 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.20758 ms - Host latency: 8.26478 ms (enqueue 0.820703 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.2282 ms - Host latency: 8.30696 ms (enqueue 0.799976 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.16896 ms - Host latency: 8.21548 ms (enqueue 0.779773 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.13727 ms - Host latency: 8.1792 ms (enqueue 0.682019 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.19653 ms - Host latency: 8.246 ms (enqueue 0.74751 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.2238 ms - Host latency: 8.27656 ms (enqueue 0.770459 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.24417 ms - Host latency: 8.28811 ms (enqueue 0.743799 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.24888 ms - Host latency: 8.29795 ms (enqueue 0.806372 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.27292 ms - Host latency: 8.32629 ms (enqueue 0.739722 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.2606 ms - Host latency: 8.30967 ms (enqueue 0.788745 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.25369 ms - Host latency: 8.30713 ms (enqueue 0.757227 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.28364 ms - Host latency: 8.33052 ms (enqueue 0.73335 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.28318 ms - Host latency: 8.33523 ms (enqueue 0.691406 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.26958 ms - Host latency: 8.31211 ms (enqueue 0.734375 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.22397 ms - Host latency: 8.26929 ms (enqueue 0.695459 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.23301 ms - Host latency: 8.27856 ms (enqueue 0.683594 ms)
[01/07/2024-09:08:57] [I] Average on 10 runs - GPU latency: 8.22695 ms - Host latency: 8.29902 ms (enqueue 0.739551 ms)
[01/07/2024-09:08:57] [I]
[01/07/2024-09:08:57] [I] === Performance summary ===
[01/07/2024-09:08:57] [I] Throughput: 120.969 qps
[01/07/2024-09:08:57] [I] Latency: min = 8.13953 ms, max = 8.45911 ms, mean = 8.30552 ms, median = 8.30487 ms, percentile(90%) = 8.3735 ms, percentile(95%) = 8.39703 ms, percentile(99%) = 8.43799 ms
[01/07/2024-09:08:57] [I] Enqueue Time: min = 0.52655 ms, max = 1.24701 ms, mean = 0.775497 ms, median = 0.66655 ms, percentile(90%) = 1.06494 ms, percentile(95%) = 1.09521 ms, percentile(99%) = 1.16895 ms
[01/07/2024-09:08:57] [I] H2D Latency: min = 0.0300293 ms, max = 0.240601 ms, mean = 0.0590574 ms, median = 0.0438843 ms, percentile(90%) = 0.103271 ms, percentile(95%) = 0.12146 ms, percentile(99%) = 0.14209 ms
[01/07/2024-09:08:57] [I] GPU Compute Time: min = 8.09607 ms, max = 8.34741 ms, mean = 8.24374 ms, median = 8.24905 ms, percentile(90%) = 8.29913 ms, percentile(95%) = 8.31563 ms, percentile(99%) = 8.33475 ms
[01/07/2024-09:08:57] [I] D2H Latency: min = 0.0012207 ms, max = 0.00549316 ms, mean = 0.00271911 ms, median = 0.00257874 ms, percentile(90%) = 0.00357056 ms, percentile(95%) = 0.00415039 ms, percentile(99%) = 0.00488281 ms
[01/07/2024-09:08:57] [I] Total Host Walltime: 3.02557 s
[01/07/2024-09:08:57] [I] Total GPU Compute Time: 3.01721 s
[01/07/2024-09:08:57] [I] Explanations of the performance metrics are printed in the verbose logs.
[01/07/2024-09:08:57] [I]
[01/07/2024-09:08:57] [I] Output Tensors:
[01/07/2024-09:08:57] [I] gpu_0/softmax_1: (1x1000)
[01/07/2024-09:08:57] [I] 4.31219e-09 3.27232e-08 6.03102e-09 7.18321e-08 8.16481e-09 1.81365e-08 5.16522e-09 8.57955e-08 5.11415e-07 1.60392e-06 6.45662e-08 4.51378e-08 1.54967e-07 3.57323e-08 1.50919e-07 2.02957e-07 7.65038e-08 1.90281e-06 1.74306e-07 2.85584e-08 3.43507e-08 6.44649e-07 1.99142e-07 5.56715e-07 3.7378e-06 3.41553e-08 1.00956e-08 6.22791e-08 2.47408e-08 3.43196e-08 1.43985e-07 5.16512e-08 3.22884e-08 2.4746e-08 1.70658e-07 2.76592e-07 2.26479e-07 2.64818e-07 1.41553e-07 1.57544e-06 2.76394e-08 1.1691e-06 8.99859e-08 2.21168e-07 3.4883e-07 8.93799e-08 2.68247e-07 6.48447e-08 2.82481e-07 2.70294e-08 2.67875e-08 2.96821e-07 9.61721e-07 7.48387e-08 5.23036e-07 5.61955e-08 6.25804e-08 2.33416e-08 1.12503e-07 3.18197e-07 5.57254e-07 2.07291e-08 4.99568e-08 1.10107e-07 1.06207e-07 3.59789e-08 3.21541e-07 1.17124e-06 1.77877e-07 3.40172e-07 4.75466e-08 4.20288e-07 9.99052e-08 4.06495e-08 1.96554e-07 2.68796e-08 1.06138e-06 8.2752e-08 4.29385e-07 8.97167e-07 4.39071e-07 3.73672e-08 1.31639e-06 3.46636e-07 1.13394e-06 4.0564e-07 6.47102e-07 1.07446e-06 1.2565e-06 1.03746e-08 1.33707e-07 1.38764e-08 1.8123e-08 3.42457e-08 1.41741e-08 3.81502e-09 1.96164e-08 2.63373e-08 1.21651e-08 8.68709e-08 6.31099e-08 1.69948e-08 7.01094e-08 5.38871e-08 8.37287e-08 4.00074e-08 2.50307e-06 5.6799e-09 3.3906e-08 1.0181e-07 5.49512e-08 5.5593e-08 3.17932e-07 1.6816e-07 3.77826e-07 1.34098e-08 8.32268e-09 4.16739e-07 5.34258e-08 1.38523e-08 1.24534e-08 2.18278e-08 2.71875e-08 5.56683e-08 5.21856e-07 3.22795e-07 1.49402e-07 1.26212e-09 2.03107e-09 2.55109e-09 1.27947e-08 5.4118e-08 1.63819e-07 1.64926e-07 1.58095e-08 3.3948e-08 5.78881e-08 3.12724e-08 4.36633e-07 3.09818e-08 5.42449e-08 1.85306e-08 3.83025e-08 4.3818e-09 6.02894e-09 1.10333e-07 3.40501e-08 1.84608e-08 8.44784e-09 1.04001e-08 6.81149e-07 2.27807e-07 1.06525e-06 1.47929e-07 9.68131e-08 6.87125e-08 1.15556e-07 9.35603e-08 2.39817e-08 2.31128e-07 5.07244e-08 3.7144e-08 1.44836e-08 2.6576e-07 2.87379e-07 1.75414e-08 9.6001e-09 1.25015e-09 2.60274e-07 4.35331e-08 7.18117e-08 1.66704e-07 7.62806e-08 1.68137e-07 1.1277e-07 4.46677e-08 2.51944e-08 1.3368e-07 3.34865e-07 8.15782e-08 9.76842e-08 1.33335e-07 5.22618e-08 5.91053e-07 6.17981e-08 4.13658e-07 1.49203e-06 4.72318e-07 4.43317e-08 5.23025e-07 7.41547e-08 3.99494e-08 7.95538e-08 1.12667e-06 3.15917e-09 1.98114e-07 1.49298e-07 1.24126e-07 1.64913e-07 2.40972e-07 2.59106e-07 7.38615e-08 7.8473e-08 1.31038e-06 1.66646e-07 1.14513e-07 6.65491e-08 1.61169e-07 4.4621e-07 9.9237e-09 8.12422e-08 3.87257e-07 1.49381e-08 5.202e-08 2.28715e-07 7.87144e-08 1.19345e-07 5.88824e-08 5.04929e-08 1.05746e-07 4.39331e-08 5.89527e-08 7.00098e-08 5.36868e-07 1.29429e-06 6.34282e-08 7.08734e-08 3.25876e-07 2.49778e-08 1.33292e-07 2.13743e-08 5.03121e-08 1.65417e-06 1.3774e-06 1.84975e-07 5.55522e-07 1.33338e-07 2.3248e-08 3.4907e-08 1.14445e-07 7.24208e-08 7.43851e-08 2.9075e-07 5.99565e-07 3.40904e-08 7.90543e-08 3.7215e-07 1.78741e-08 1.167e-06 1.44256e-07 2.30053e-07 4.95958e-07 4.53147e-08 2.10106e-07 6.59327e-08 3.44226e-07 3.3919e-07 2.51628e-08 1.68144e-07 1.12057e-07 2.69027e-07 7.50873e-08 1.3233e-07 5.31296e-08 8.91849e-08 1.24447e-07 4.44719e-08 1.53876e-08 1.87676e-07 5.07761e-08 5.52412e-08 2.49056e-08 8.85184e-08 2.82941e-08 1.31065e-07 3.28248e-08 1.41085e-06 4.26582e-07 1.38685e-06 1.69765e-06 6.41466e-06 0.222164 0.649406 0.000265587 2.77833e-05 0.118299 2.35406e-05 0.00494512 0.000353588 0.000128664 4.27559e-05 7.96535e-06 0.0016923 7.9745e-07 1.81176e-08 2.7363e-08 1.32207e-08 6.3004e-08 
2.14228e-06 7.05765e-06 5.43203e-08 7.3703e-08 6.37532e-08 2.01735e-07 6.12086e-09 1.12721e-07 3.1454e-07 5.10996e-08 2.344e-08 5.03263e-07 3.63736e-07 5.61937e-07 2.68328e-07 3.18183e-07 1.19748e-07 1.60634e-07 3.15427e-06 1.38529e-06 1.23261e-07 8.09803e-08 3.88762e-08 2.09217e-08 1.48431e-07 7.79221e-09 5.48808e-07 2.108e-08 1.58163e-06 9.35706e-08 5.03974e-08 1.09431e-08 1.50153e-07 5.40322e-07 7.59874e-07 4.30646e-07 3.85417e-08 1.20151e-07 2.48362e-08 1.1377e-07 1.29412e-07 7.6831e-09 0.000145024 2.39234e-07 1.18882e-07 1.26118e-07 1.63536e-08 8.83463e-08 3.54368e-08 1.2187e-08 2.03324e-08 1.05163e-08 1.93692e-08 3.95044e-08 7.40076e-08 5.42306e-08 1.98142e-08 1.8415e-07 2.08151e-06 1.24005e-06 1.42602e-06 1.76945e-07 9.70648e-07 7.16566e-07 8.0947e-07 9.00147e-07 8.18857e-09 8.99452e-08 3.16986e-07 7.41196e-08 5.48573e-09 1.22034e-08 4.96608e-07 1.08288e-07 4.89107e-08 6.31607e-07 1.28074e-07 1.02883e-08 1.04286e-08 2.10383e-07 1.10675e-08 9.83445e-09 1.34309e-07 7.04894e-09 1.61061e-08 4.00807e-06 2.61122e-07 1.35763e-07 1.08121e-07 9.20258e-08 2.74398e-08 5.57276e-09 8.3922e-08 4.32769e-07 3.37907e-08 7.07499e-09 3.25664e-08 6.72846e-08 1.46607e-08 1.31611e-08 4.43623e-07 2.61208e-07 5.22855e-08 7.40795e-08 1.15638e-07 3.40772e-09 2.36093e-09 5.5517e-09 4.81737e-08 5.2435e-09 1.48066e-07 2.05642e-06 2.83374e-08 6.31107e-08 6.43683e-06 4.46948e-07 9.72298e-07 6.55227e-08 2.40795e-07 2.57961e-08 3.251e-07 6.85664e-06 5.10334e-07 2.05943e-07 6.87812e-08 6.4403e-09 2.31994e-08 1.278e-08 6.71909e-07 8.00414e-08 5.60545e-06 7.52245e-06 3.30366e-08 3.14636e-08 2.76163e-08 1.84685e-07 2.31083e-06 5.14467e-07 3.52963e-08 1.26886e-08 4.29366e-07 5.01386e-07 5.95346e-07 3.26838e-07 5.24194e-08 8.48795e-07 7.83283e-08 2.44987e-07 1.01231e-07 2.11385e-07 4.59941e-08 2.99574e-08 7.32672e-09 1.31519e-07 8.63965e-07 4.51796e-06 2.55771e-07 2.2358e-06 1.43611e-07 4.68767e-06 5.54229e-07 8.72966e-07 2.3307e-07 6.94223e-08 1.52769e-06 7.60646e-06 5.10959e-07 2.35479e-07 1.43391e-08 1.29262e-08 1.43641e-07 7.22168e-07 4.21727e-07 7.44233e-08 1.44281e-07 1.35441e-07 5.333e-07 5.01447e-07 8.81705e-09 7.42319e-08 3.26115e-05 6.89304e-07 5.16655e-08 1.03141e-07 1.05753e-08 2.02052e-07 4.97318e-08 1.88022e-08 4.88042e-08 1.2219e-06 1.05701e-07 5.96129e-07 1.17827e-06 4.65682e-08 1.07844e-07 4.72346e-08 4.77986e-07 3.65281e-08 1.60299e-07 1.8169e-08 2.05875e-08 3.21797e-07 1.69961e-09 2.22464e-06 4.00111e-07 6.7832e-07 2.27729e-06 5.08218e-08 8.98886e-07 5.11845e-07 2.20215e-06 3.93419e-08 1.27971e-08 1.47525e-09 3.5842e-07 2.9207e-08 9.53878e-06 6.27738e-06 5.2522e-08 1.48063e-07 1.28023e-06 2.25112e-06 2.70763e-07 4.22606e-08 1.8201e-07 4.31097e-07 7.04557e-08 6.30349e-08 2.50493e-07 2.34575e-07 9.92349e-08 2.22665e-06 3.47416e-07 1.49393e-07 5.91125e-07 1.87736e-07 9.69455e-07 3.18923e-08 3.75779e-08 2.48435e-08 2.65921e-08 4.20734e-05 4.29164e-09 1.25327e-07 1.52806e-06 4.51428e-06 7.61211e-08 2.55164e-07 4.47047e-08 2.0341e-09 4.07301e-08 1.17877e-07 1.86437e-08 8.28405e-08 3.22565e-06 3.46128e-06 1.30752e-10 3.44816e-09 5.05364e-08 4.86761e-08 3.2116e-07 4.18953e-07 7.2722e-08 4.18564e-08 8.55961e-07 2.87197e-07 1.56506e-07 5.14765e-08 6.6774e-07 1.41904e-07 2.31405e-07 1.70621e-08 1.36616e-07 1.25413e-08 6.51188e-06 1.9827e-08 2.87298e-07 2.18521e-08 6.62322e-08 2.68944e-07 1.42141e-07 1.83672e-08 1.91614e-07 1.50257e-07 3.74416e-08 1.89776e-08 4.23657e-08 4.79333e-07 1.0926e-07 1.6216e-06 1.14329e-06 1.34294e-07 1.04939e-08 1.07619e-06 1.99505e-07 1.39986e-06 1.07741e-07 6.27208e-08 4.03869e-06 
4.54529e-08 4.6653e-08 6.08309e-07 8.17019e-07 6.28137e-08 2.89075e-07 1.14106e-07 2.50617e-07 7.47386e-07 1.22178e-06 6.25064e-08 0.000401968 1.995e-08 8.0799e-07 1.81799e-05 6.74927e-08 8.15525e-08 1.54894e-07 7.19469e-07 1.00188e-07 3.43813e-07 4.36554e-07 5.30778e-07 4.98392e-06 5.38992e-07 0.000510889 4.49317e-07 5.84905e-08 5.6472e-09 4.88459e-06 1.51236e-08 4.43269e-09 1.34801e-07 1.88732e-07 6.88306e-06 1.25287e-07 4.29881e-07 2.15576e-08 1.68197e-08 5.77935e-08 2.65805e-06 9.51291e-07 1.79138e-06 1.02712e-07 7.92065e-08 6.17835e-08 2.07746e-07 1.19137e-07 8.93553e-09 1.36442e-07 1.69639e-07 3.54148e-07 2.53757e-08 1.60009e-07 9.06202e-08 3.82748e-06 9.70912e-08 1.18824e-08 4.13489e-07 3.10697e-08 4.50127e-08 1.65702e-07 6.16994e-07 2.45009e-07 1.57708e-08 1.47974e-07 2.83974e-08 9.61385e-07 4.57889e-07 2.87782e-08 1.07784e-07 8.28538e-09 5.68251e-07 1.0362e-06 2.63081e-07 6.69241e-08 1.46807e-05 4.25193e-06 2.40326e-08 2.43369e-07 4.72804e-06 6.44753e-08 1.24654e-06 9.60749e-07 2.665e-06 3.06748e-07 4.22162e-08 2.92427e-07 3.25192e-09 2.96528e-08 8.33966e-09 6.63828e-08 1.06968e-07 9.00791e-09 7.35332e-08 1.85577e-07 2.68262e-08 2.46429e-08 2.93231e-07 1.01367e-06 1.20959e-06 5.22922e-08 1.69608e-07 1.36372e-05 2.8331e-07 3.12253e-07 4.90316e-06 3.17488e-07 1.42068e-08 2.129e-08 1.18458e-07 1.58725e-06 1.99419e-07 8.66641e-08 1.70225e-06 6.07794e-08 6.14967e-07 4.31817e-07 2.98631e-07 1.01794e-06 3.27476e-08 9.34612e-08 1.422e-07 3.08467e-06 4.1963e-07 3.22064e-05 4.64545e-06 6.38533e-09 3.44031e-06 3.20264e-09 5.47952e-09 0.000508376 4.04581e-08 1.1234e-07 1.17033e-06 2.57371e-08 1.73591e-06 1.32384e-08 9.25029e-08 4.6047e-07 7.26198e-07 9.6773e-07 4.91101e-08 5.24099e-07 3.4001e-06 2.08103e-06 6.28604e-07 1.2712e-07 2.645e-08 3.99737e-07 3.7949e-06 3.96355e-08 9.1404e-08 1.45296e-05 4.03064e-08 9.10027e-08 1.87741e-05 4.65326e-07 2.73036e-08 1.13201e-06 1.3518e-07 2.4601e-07 7.95868e-08 2.02313e-06 9.92311e-05 8.18532e-09 7.17072e-07 2.94609e-07 1.41133e-07 7.069e-09 9.73874e-08 2.29791e-06 2.35752e-05 1.8678e-05 9.41717e-09 3.27657e-07 8.5668e-07 3.25826e-07 1.63054e-06 4.30985e-08 1.36001e-07 1.16965e-06 2.12713e-08 2.99713e-09 8.14538e-08 2.85041e-06 7.47095e-06 7.25229e-07 3.23639e-08 1.03043e-07 1.84035e-06 1.17752e-07 1.75386e-07 7.34599e-08 1.78589e-08 6.76068e-07 4.76833e-07 1.45566e-05 1.60689e-07 1.28724e-07 1.01615e-06 2.44808e-08 4.90249e-06 2.35065e-08 1.2155e-07 2.73978e-08 4.22545e-08 1.20204e-06 8.49469e-06 0.000110935 1.47411e-08 1.7354e-06 8.90351e-07 4.84894e-08 4.525e-06 5.31328e-09 1.49173e-06 1.845e-08 2.27357e-07 2.98362e-07 1.81945e-08 1.81187e-06 1.81624e-08 8.46259e-09 2.20169e-08 1.4298e-08 2.21175e-07 1.06652e-07 1.09752e-07 1.98087e-07 2.29561e-07 4.17271e-07 1.89741e-08 8.16455e-07 7.98417e-07 5.70521e-09 2.26532e-08 7.90286e-07 4.12424e-06 1.00397e-06 1.771e-06 2.30236e-06 2.27294e-08 5.22434e-06 2.00706e-05 8.904e-07 6.1891e-07 1.74583e-07 1.3251e-05 2.08552e-07 8.7916e-09 2.82986e-08 7.18589e-08 1.65902e-07 7.60535e-07 6.43047e-06 2.04698e-08 5.72088e-08 1.26994e-06 1.1042e-08 1.02745e-07 1.65413e-06 9.2257e-08 1.67806e-08 2.19027e-06 3.1811e-07 8.77542e-08 2.21991e-08 5.99528e-08 2.62875e-07 1.94251e-07 4.18546e-08 4.09085e-07 1.74893e-06 7.53293e-08 2.81697e-06 3.86782e-08 5.46469e-09 8.96562e-08 3.53197e-06 3.09246e-08 4.60778e-08 6.10538e-06 2.47808e-07 2.08653e-07 2.94466e-07 2.35702e-06 2.14683e-08 6.43639e-07 1.06576e-07 3.97759e-08 2.1646e-08 4.0269e-07 1.52702e-07 1.33815e-08 7.36706e-08 8.30909e-08 1.85075e-06 1.07603e-08 9.49883e-07 
2.50373e-06 4.67766e-06 2.00213e-06 1.45547e-08 3.71791e-08 7.28418e-07 2.36864e-07 3.34534e-05 7.90807e-07 1.10756e-07 3.31662e-07 1.49803e-08 2.12927e-07 2.73291e-07 2.73512e-07 9.27414e-08 2.73608e-08 8.97993e-09 9.94292e-09 5.26084e-07 3.28558e-07 4.1904e-05 1.70534e-07 2.15817e-08 4.75436e-07 4.26341e-08 5.15595e-08 1.16952e-07 3.27501e-09 3.05879e-09 2.58485e-08 2.0689e-08 4.9584e-06 3.05773e-08 3.75785e-08 1.18929e-07 1.53957e-09 1.93365e-08 1.62997e-08 1.06609e-07 1.11543e-06 1.63069e-07 6.40267e-07 4.10568e-08 1.128e-08 1.67669e-07 5.65517e-07 7.37952e-08 2.78064e-07 5.35779e-08 3.93165e-07 4.64912e-07 6.59557e-08 8.85602e-07 1.08205e-07 8.13449e-08 3.4025e-07 4.81834e-07 9.92183e-08 1.66807e-07 9.00434e-08 1.8549e-06 1.12568e-08 1.14591e-08 6.0587e-07 1.3412e-08 1.67332e-08 1.64203e-08 1.11066e-08 3.31439e-07 1.3843e-07 9.92739e-07 7.68445e-08 2.95164e-08 7.48974e-06 1.90854e-08 1.8991e-08 3.28791e-08 1.05048e-07 1.91254e-08 5.89584e-07 5.12448e-07 1.78304e-08 6.61151e-09 7.97469e-07 4.15751e-08 1.09356e-08 1.43087e-07 4.54704e-08 1.89151e-08 6.62792e-07 3.76557e-07 1.95258e-08 9.73641e-08 2.4037e-08 1.33256e-08 9.10277e-09 6.11777e-09 5.82621e-08 5.62718e-08 1.01834e-08 7.20226e-07 1.05142e-06
Inference result: Detected:
[01/07/2024-09:08:57] [I] [1]  tiger cat
[01/07/2024-09:08:57] [I] [2]  tabby
[01/07/2024-09:08:57] [I] [3]  Egyptian cat
[01/07/2024-09:08:57] [I] [4]  lynx
[01/07/2024-09:08:57] [I] [5]  tiger
[01/07/2024-09:08:57] [I]
&&&& PASSED TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=/home/nvidia/siva/ResNet50.onnx --loadInputs=gpu_0/data_0:tabby_tiger_new.dat --dumpOutput
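
For anyone reproducing this: the five labels above can be recovered from the dumped 1x1000 softmax vector with a short script. A minimal sketch in Python (the output file name and the labels file are assumptions, not artifacts from this run):

import numpy as np

# Raw float32 softmax output dumped by trtexec (1x1000), flattened.
# Adjust the path to wherever the dump was written (assumed name).
probs = np.fromfile("softmax_output.dat", dtype=np.float32).reshape(1000)

# ImageNet labels, one per line, in class-index order (assumed path).
with open("imagenet_classes.txt") as f:
    labels = [line.strip() for line in f]

# Indices of the five largest probabilities, highest first.
top5 = np.argsort(probs)[::-1][:5]
for rank, idx in enumerate(top5, start=1):
    print(f"[{rank}]  {labels[idx]}  ({probs[idx]:.6f})")

Applied to the vector above, the largest entries (0.649406, 0.222164, 0.118299, ...) correspond to the five classes listed in the log.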

attaching the DAT file
tabby_tiger_new.dat (588 KB)
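
For completeness, this is roughly how such a .dat input can be produced. A minimal sketch, assuming standard ImageNet mean/std preprocessing (the exact preprocessing the model was trained with may differ); note that the working command above binds the tensor by name, --loadInputs=gpu_0/data_0:tabby_tiger_new.dat, and that the file holds the raw tensor bytes (1x3x224x224 float32 is 588 KB, matching the attachment):

import numpy as np
from PIL import Image

# Resize to the network input and scale to [0, 1] (HWC layout);
# the image file name is an assumption.
img = Image.open("tabby_tiger.jpg").convert("RGB").resize((224, 224))
x = np.asarray(img, dtype=np.float32) / 255.0

# ImageNet normalization (mean/std values are an assumption).
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
x = (x - mean) / std

# HWC -> CHW, add batch dimension: 1x3x224x224, then write raw float32.
x = x.transpose(2, 0, 1)[np.newaxis, ...]
x.astype(np.float32).tofile("tabby_tiger_new.dat")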

Hi SivaRamaKrishna, thank you for verifying the model on your side. I have now verified with the .dat file you attached, but the result is the same: it still does not predict properly.

I have very limited time now, and without CUDA graphs and layer fusion I cannot complete my thesis.
I want to check a few options with you:

  1. Does NVIDIA have cloud access on your side that we could use for inference?
  2. Can anyone help me over Google Meet with the installation of JetPack 5.0? I will share my screen.
  3. I live in Bengaluru. Could an engineer in the Bengaluru office help me install the latest JetPack? I am facing errors while installing JetPack 5.0, which is being discussed with you in another issue.

Please let me know soon.

Thanks and Regards

Nagaraj Trivedi

Dear @trivedi.nagaraj,
Is there any update on the suggestion at Please provide a sample file making use of cuda graph - #29 by SivaRamaKrishnaNV?

What is the issue you notice with sdkmanager? I am assuming you are trying to install sdkmanager on the host and not on the target. What is the host OS?

Hi SivaRamaKrishna, yes, I am still facing the issue even with the suggestion you provided:

The apt-get command is not accepting the --fix-broken option.
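
(As a general note, and an assumption about the command being run: --fix-broken is an option to apt-get's install action rather than a standalone command, so the usual invocation is sudo apt-get --fix-broken install, or equivalently sudo apt-get -f install.)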

I am installing it on the target itself. I have a carrier board built around an NVIDIA SoC/SOM (Jetson Xavier GPU).

The OS is
Linux ubuntu 4.9.253-tegra #2 SMP PREEMPT Tue Nov 29 18:32:41 IST 2022 aarch64 aarch64 aarch64 GNU/Linux

If you want, I can share my screen using Google Meet so you can verify it directly.

Thanks and Regards

Nagaraj Trivedi

Dear @trivedi.nagaraj,
sdkmanager has to be installed on the host, not the target. That could be the issue.

Hi SivaRamaKrishna, is there a way to upgrade JetPack 4.6.1 to a 5.x version?

Thanks and Regards

Nagaraj Trivedi

Dear @trivedi.nagaraj,
Upgrading to JetPack 5.1 is not possible from the target. Please use a host machine to flash JetPack using sdkmanager.

Dear @trivedi.nagaraj,
Could you attach the input DAT file generated using the preprocessing on your side? Let me check with your input as well.

Hi SivaRamaKrishna, please find the attached data file.

Thanks and Regards

Nagaraj Trivedi
tabby_tiger_08_01.zip (154.3 KB)

Hi SivaRamaKrishna, otherwise, please send me the trtexec binary you have. I will use it for inference on my machine; that should hopefully solve my problem.

Thanks and Regards

Nagaraj Trivedi