&&&& RUNNING TensorRT.trtexec [TensorRT v8001] # /usr/src/tensorrt/bin/trtexec --loadEngine=./typenet_bs8.onnx_b8_gpu0_fp16.engine --fp16
[02/10/2023-10:16:58] [I] === Model Options ===
[02/10/2023-10:16:58] [I] Format: *
[02/10/2023-10:16:58] [I] Model:
[02/10/2023-10:16:58] [I] Output:
[02/10/2023-10:16:58] [I] === Build Options ===
[02/10/2023-10:16:58] [I] Max batch: 1
[02/10/2023-10:16:58] [I] Workspace: 16 MiB
[02/10/2023-10:16:58] [I] minTiming: 1
[02/10/2023-10:16:58] [I] avgTiming: 8
[02/10/2023-10:16:58] [I] Precision: FP32+FP16
[02/10/2023-10:16:58] [I] Calibration:
[02/10/2023-10:16:58] [I] Refit: Disabled
[02/10/2023-10:16:58] [I] Sparsity: Disabled
[02/10/2023-10:16:58] [I] Safe mode: Disabled
[02/10/2023-10:16:58] [I] Restricted mode: Disabled
[02/10/2023-10:16:58] [I] Save engine:
[02/10/2023-10:16:58] [I] Load engine: ./typenet_bs8.onnx_b8_gpu0_fp16.engine
[02/10/2023-10:16:58] [I] NVTX verbosity: 0
[02/10/2023-10:16:58] [I] Tactic sources: Using default tactic sources
[02/10/2023-10:16:58] [I] timingCacheMode: local
[02/10/2023-10:16:58] [I] timingCacheFile:
[02/10/2023-10:16:58] [I] Input(s)s format: fp32:CHW
[02/10/2023-10:16:58] [I] Output(s)s format: fp32:CHW
[02/10/2023-10:16:58] [I] Input build shapes: model
[02/10/2023-10:16:58] [I] Input calibration shapes: model
[02/10/2023-10:16:58] [I] === System Options ===
[02/10/2023-10:16:58] [I] Device: 0
[02/10/2023-10:16:58] [I] DLACore:
[02/10/2023-10:16:58] [I] Plugins:
[02/10/2023-10:16:58] [I] === Inference Options ===
[02/10/2023-10:16:58] [I] Batch: 1
[02/10/2023-10:16:58] [I] Input inference shapes: model
[02/10/2023-10:16:58] [I] Iterations: 10
[02/10/2023-10:16:58] [I] Duration: 3s (+ 200ms warm up)
[02/10/2023-10:16:58] [I] Sleep time: 0ms
[02/10/2023-10:16:58] [I] Streams: 1
[02/10/2023-10:16:58] [I] ExposeDMA: Disabled
[02/10/2023-10:16:58] [I] Data transfers: Enabled
[02/10/2023-10:16:58] [I] Spin-wait: Disabled
[02/10/2023-10:16:58] [I] Multithreading: Disabled
[02/10/2023-10:16:58] [I] CUDA Graph: Disabled
[02/10/2023-10:16:58] [I] Separate profiling: Disabled
[02/10/2023-10:16:58] [I] Time Deserialize: Disabled
[02/10/2023-10:16:58] [I] Time Refit: Disabled
[02/10/2023-10:16:58] [I] Skip inference: Disabled
[02/10/2023-10:16:58] [I] Inputs:
[02/10/2023-10:16:58] [I] === Reporting Options ===
[02/10/2023-10:16:58] [I] Verbose: Disabled
[02/10/2023-10:16:58] [I] Averages: 10 inferences
[02/10/2023-10:16:58] [I] Percentile: 99
[02/10/2023-10:16:58] [I] Dump refittable layers:Disabled
[02/10/2023-10:16:58] [I] Dump output: Disabled
[02/10/2023-10:16:58] [I] Profile: Disabled
[02/10/2023-10:16:58] [I] Export timing to JSON file:
[02/10/2023-10:16:58] [I] Export output to JSON file:
[02/10/2023-10:16:58] [I] Export profile to JSON file:
[02/10/2023-10:16:58] [I]
[02/10/2023-10:16:58] [I] === Device Information ===
[02/10/2023-10:16:58] [I] Selected Device: Xavier
[02/10/2023-10:16:58] [I] Compute Capability: 7.2
[02/10/2023-10:16:58] [I] SMs: 8
[02/10/2023-10:16:58] [I] Compute Clock Rate: 1.377 GHz
[02/10/2023-10:16:58] [I] Device Global Memory: 31920 MiB
[02/10/2023-10:16:58] [I] Shared Memory per SM: 96 KiB
[02/10/2023-10:16:58] [I] Memory Bus Width: 256 bits (ECC disabled)
[02/10/2023-10:16:58] [I] Memory Clock Rate: 1.377 GHz
[02/10/2023-10:16:58] [I]
[02/10/2023-10:16:58] [I] TensorRT version: 8001
[02/10/2023-10:16:59] [I] [TRT] [MemUsageChange] Init CUDA: CPU +353, GPU +0, now: CPU 393, GPU 21886 (MiB)
[02/10/2023-10:16:59] [I] [TRT] Loaded engine size: 21 MB
[02/10/2023-10:16:59] [I] [TRT] [MemUsageSnapshot] deserializeCudaEngine begin: CPU 393 MiB, GPU 21886 MiB
[02/10/2023-10:17:00] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +227, GPU +289, now: CPU 620, GPU 22197 (MiB)
[02/10/2023-10:17:01] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +307, GPU +396, now: CPU 927, GPU 22593 (MiB)
[02/10/2023-10:17:01] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 927, GPU 22583 (MiB)
[02/10/2023-10:17:01] [I] [TRT] [MemUsageSnapshot] deserializeCudaEngine end: CPU 927 MiB, GPU 22583 MiB
[02/10/2023-10:17:01] [I] Engine loaded in 2.674 sec.
[02/10/2023-10:17:01] [I] [TRT] [MemUsageSnapshot] ExecutionContext creation begin: CPU 905 MiB, GPU 22561 MiB
[02/10/2023-10:17:01] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +1, now: CPU 905, GPU 22562 (MiB)
[02/10/2023-10:17:01] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 905, GPU 22572 (MiB)
[02/10/2023-10:17:01] [I] [TRT] [MemUsageSnapshot] ExecutionContext creation end: CPU 905 MiB, GPU 22611 MiB
[02/10/2023-10:17:01]
[02/10/2023-10:17:01]
[02/10/2023-10:17:01] [I] Created input binding for images with dimensions 1x3x224x224
[02/10/2023-10:17:01]
[02/10/2023-10:17:01]
[02/10/2023-10:17:01] [I] Created output binding for output with dimensions
[02/10/2023-10:17:01] [I] Starting inference
[02/10/2023-10:17:01]
[02/10/2023-10:17:01]
[02/10/2023-10:17:01] &&&& FAILED TensorRT.trtexec [TensorRT v8001] # /usr/src/tensorrt/bin/trtexec --loadEngine=./typenet_bs8.onnx_b8_gpu0_fp16.engine --fp16
[02/10/2023-10:17:01] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 905, GPU 22579 (MiB)