Hi,
here is the result on the host for the same command, trtexec --onnx=rn50crop.onnx:
greenshield@greenshield-Precision-Tower-3620:~/annoted_images/training_cropped_img/rn50fpn$ /usr/src/tensorrt/bin/trtexec --onnx=rn50crop.onnx
&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=rn50crop.onnx
[05/27/2020-09:45:29] [I] === Model Options ===
[05/27/2020-09:45:29] [I] Format: ONNX
[05/27/2020-09:45:29] [I] Model: rn50crop.onnx
[05/27/2020-09:45:29] [I] Output:
[05/27/2020-09:45:29] [I] === Build Options ===
[05/27/2020-09:45:29] [I] Max batch: 1
[05/27/2020-09:45:29] [I] Workspace: 16 MB
[05/27/2020-09:45:29] [I] minTiming: 1
[05/27/2020-09:45:29] [I] avgTiming: 8
[05/27/2020-09:45:29] [I] Precision: FP32
[05/27/2020-09:45:29] [I] Calibration:
[05/27/2020-09:45:29] [I] Safe mode: Disabled
[05/27/2020-09:45:29] [I] Save engine:
[05/27/2020-09:45:29] [I] Load engine:
[05/27/2020-09:45:29] [I] Inputs format: fp32:CHW
[05/27/2020-09:45:29] [I] Outputs format: fp32:CHW
[05/27/2020-09:45:29] [I] Input build shapes: model
[05/27/2020-09:45:29] [I] === System Options ===
[05/27/2020-09:45:29] [I] Device: 0
[05/27/2020-09:45:29] [I] DLACore:
[05/27/2020-09:45:29] [I] Plugins:
[05/27/2020-09:45:29] [I] === Inference Options ===
[05/27/2020-09:45:29] [I] Batch: 1
[05/27/2020-09:45:29] [I] Iterations: 10
[05/27/2020-09:45:29] [I] Duration: 3s (+ 200ms warm up)
[05/27/2020-09:45:29] [I] Sleep time: 0ms
[05/27/2020-09:45:29] [I] Streams: 1
[05/27/2020-09:45:29] [I] ExposeDMA: Disabled
[05/27/2020-09:45:29] [I] Spin-wait: Disabled
[05/27/2020-09:45:29] [I] Multithreading: Disabled
[05/27/2020-09:45:29] [I] CUDA Graph: Disabled
[05/27/2020-09:45:29] [I] Skip inference: Disabled
[05/27/2020-09:45:29] [I] Input inference shapes: model
[05/27/2020-09:45:29] [I] Inputs:
[05/27/2020-09:45:29] [I] === Reporting Options ===
[05/27/2020-09:45:29] [I] Verbose: Disabled
[05/27/2020-09:45:29] [I] Averages: 10 inferences
[05/27/2020-09:45:29] [I] Percentile: 99
[05/27/2020-09:45:29] [I] Dump output: Disabled
[05/27/2020-09:45:29] [I] Profile: Disabled
[05/27/2020-09:45:29] [I] Export timing to JSON file:
[05/27/2020-09:45:29] [I] Export output to JSON file:
[05/27/2020-09:45:29] [I] Export profile to JSON file:
[05/27/2020-09:45:29] [I]
----------------------------------------------------------------
Input filename: rn50crop.onnx
ONNX IR version: 0.0.4
Opset version: 10
Producer name: pytorch
Producer version: 1.3
Domain:
Model version: 0
Doc string:
----------------------------------------------------------------
[05/27/2020-09:45:30] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[... the same INT64 warning is repeated 52 more times ...]
[05/27/2020-09:45:33] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[05/27/2020-09:48:06] [I] [TRT] Detected 1 inputs and 10 output network tensors.
[05/27/2020-09:48:06] [W] [TRT] Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
[05/27/2020-09:48:06] [W] [TRT] Explicit batch network detected and batch size specified, use enqueue without batch size instead.
[... the same explicit-batch warning is repeated 16 more times ...]
[05/27/2020-09:48:10] [I] Warmup completed 1 queries over 200 ms
[05/27/2020-09:48:10] [I] Timing trace has 16 queries over 3.85763 s
[05/27/2020-09:48:10] [I] Trace averages of 10 runs:
[05/27/2020-09:48:10] [I] Average on 10 runs - GPU latency: 226.286 ms - Host latency: 228.955 ms (end to end 453.144 ms)
[05/27/2020-09:48:10] [I] Host latency
[05/27/2020-09:48:10] [I] min: 228.139 ms (end to end 450.754 ms)
[05/27/2020-09:48:10] [I] max: 230.185 ms (end to end 460.878 ms)
[05/27/2020-09:48:10] [I] mean: 229.05 ms (end to end 452.939 ms)
[05/27/2020-09:48:10] [I] median: 228.955 ms (end to end 452.665 ms)
[05/27/2020-09:48:10] [I] percentile: 230.185 ms at 99% (end to end 460.878 ms at 99%)
[05/27/2020-09:48:10] [I] throughput: 4.14763 qps
[05/27/2020-09:48:10] [I] walltime: 3.85763 s
[05/27/2020-09:48:10] [I] GPU Compute
[05/27/2020-09:48:10] [I] min: 225.492 ms
[05/27/2020-09:48:10] [I] max: 227.546 ms
[05/27/2020-09:48:10] [I] mean: 226.39 ms
[05/27/2020-09:48:10] [I] median: 226.316 ms
[05/27/2020-09:48:10] [I] percentile: 227.546 ms at 99%
[05/27/2020-09:48:10] [I] total compute time: 3.62224 s
&&&& PASSED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=rn50crop.onnx
It seems I get better results on the host than on the Jetson… even though it only has a single Quadro 4000 in it.
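For reference, the throughput trtexec reports can be reproduced from the trace figures in the log above (16 queries over a 3.85763 s walltime):

```python
# Recompute trtexec's reported throughput from the host timing trace above.
# Both figures are copied from the log: 16 queries, 3.85763 s walltime.
num_queries = 16
walltime_s = 3.85763

throughput_qps = num_queries / walltime_s
print(f"{throughput_qps:.2f} qps")  # ~4.15 qps, matching the reported 4.14763 qps
```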
Regarding the batch question, I was really asking how to know the maximum batch size we can use on the Jetson. Is it hardware-defined, or can I put pretty much any number to increase speed?
Thanks
[EDIT]:
I tested the following command and here is what I get:
nvidia@nvidia-desktop:~$ /usr/src/tensorrt/bin/trtexec --loadEngine=rn50engine.trt --fp16 --batch=64
&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --loadEngine=rn50engine.trt --fp16 --batch=64
[05/27/2020-10:01:30] [I] === Model Options ===
[05/27/2020-10:01:30] [I] Format: *
[05/27/2020-10:01:30] [I] Model:
[05/27/2020-10:01:30] [I] Output:
[05/27/2020-10:01:30] [I] === Build Options ===
[05/27/2020-10:01:30] [I] Max batch: 64
[05/27/2020-10:01:30] [I] Workspace: 16 MB
[05/27/2020-10:01:30] [I] minTiming: 1
[05/27/2020-10:01:30] [I] avgTiming: 8
[05/27/2020-10:01:30] [I] Precision: FP32+FP16
[05/27/2020-10:01:30] [I] Calibration:
[05/27/2020-10:01:30] [I] Safe mode: Disabled
[05/27/2020-10:01:30] [I] Save engine:
[05/27/2020-10:01:30] [I] Load engine: rn50engine.trt
[05/27/2020-10:01:30] [I] Builder Cache: Enabled
[05/27/2020-10:01:30] [I] NVTX verbosity: 0
[05/27/2020-10:01:30] [I] Inputs format: fp32:CHW
[05/27/2020-10:01:30] [I] Outputs format: fp32:CHW
[05/27/2020-10:01:30] [I] Input build shapes: model
[05/27/2020-10:01:30] [I] Input calibration shapes: model
[05/27/2020-10:01:30] [I] === System Options ===
[05/27/2020-10:01:30] [I] Device: 0
[05/27/2020-10:01:30] [I] DLACore:
[05/27/2020-10:01:30] [I] Plugins:
[05/27/2020-10:01:30] [I] === Inference Options ===
[05/27/2020-10:01:30] [I] Batch: 64
[05/27/2020-10:01:30] [I] Input inference shapes: model
[05/27/2020-10:01:30] [I] Iterations: 10
[05/27/2020-10:01:30] [I] Duration: 3s (+ 200ms warm up)
[05/27/2020-10:01:30] [I] Sleep time: 0ms
[05/27/2020-10:01:30] [I] Streams: 1
[05/27/2020-10:01:30] [I] ExposeDMA: Disabled
[05/27/2020-10:01:30] [I] Spin-wait: Disabled
[05/27/2020-10:01:30] [I] Multithreading: Disabled
[05/27/2020-10:01:30] [I] CUDA Graph: Disabled
[05/27/2020-10:01:30] [I] Skip inference: Disabled
[05/27/2020-10:01:30] [I] Inputs:
[05/27/2020-10:01:30] [I] === Reporting Options ===
[05/27/2020-10:01:30] [I] Verbose: Disabled
[05/27/2020-10:01:30] [I] Averages: 10 inferences
[05/27/2020-10:01:30] [I] Percentile: 99
[05/27/2020-10:01:30] [I] Dump output: Disabled
[05/27/2020-10:01:30] [I] Profile: Disabled
[05/27/2020-10:01:30] [I] Export timing to JSON file:
[05/27/2020-10:01:30] [I] Export output to JSON file:
[05/27/2020-10:01:30] [I] Export profile to JSON file:
[05/27/2020-10:01:30] [I]
[05/27/2020-10:01:34] [W] [TRT] Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
[05/27/2020-10:01:42] [I] Starting inference threads
[05/27/2020-10:01:42] [E] [TRT] Parameter check failed at: engine.cpp::enqueue::387, condition: batchSize > 0 && batchSize <= mEngine.getMaxBatchSize(). Note: Batch size was: 64, but engine max batch size was: 1
[... the same enqueue error is repeated 40 more times ...]
[05/27/2020-10:01:45] [I] Warmup completed 128 queries over 200 ms
[05/27/2020-10:01:45] [I] Timing trace has 2496 queries over 3.16174 s
[05/27/2020-10:01:45] [I] Trace averages of 10 runs:
[05/27/2020-10:01:45] [I] Average on 10 runs - GPU latency: 0.000634766 ms - Host latency: 81.3747 ms (end to end 81.3827 ms)
[05/27/2020-10:01:45] [I] Average on 10 runs - GPU latency: 0.000720215 ms - Host latency: 81.268 ms (end to end 81.2813 ms)
[05/27/2020-10:01:45] [I] Average on 10 runs - GPU latency: 0.000720215 ms - Host latency: 80.7876 ms (end to end 80.8008 ms)
[05/27/2020-10:01:45] [I] Host latency
[05/27/2020-10:01:45] [I] min: 80.1501 ms (end to end 80.1582 ms)
[05/27/2020-10:01:45] [I] max: 82.7965 ms (end to end 82.804 ms)
[05/27/2020-10:01:45] [I] mean: 81.0573 ms (end to end 81.0694 ms)
[05/27/2020-10:01:45] [I] median: 81.134 ms (end to end 81.1419 ms)
[05/27/2020-10:01:45] [I] percentile: 82.7965 ms at 99% (end to end 82.804 ms at 99%)
[05/27/2020-10:01:45] [I] throughput: 789.44 qps
[05/27/2020-10:01:45] [I] walltime: 3.16174 s
[05/27/2020-10:01:45] [I] GPU Compute
[05/27/2020-10:01:45] [I] min: 0.000457764 ms
[05/27/2020-10:01:45] [I] max: 0.0012207 ms
[05/27/2020-10:01:45] [I] mean: 0.000726162 ms
[05/27/2020-10:01:45] [I] median: 0.000610352 ms
[05/27/2020-10:01:45] [I] percentile: 0.0012207 ms at 99%
[05/27/2020-10:01:45] [I] total compute time: 2.83203e-05 s
&&&& PASSED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --loadEngine=rn50engine.trt --fp16 --batch=64
This time I used the .trt engine file and set the batch size to 64 (but with --batch=1 I get the same result as before: about 10 s for 10 runs).
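Note that the enqueue errors in the log say the loaded engine was built with a max batch size of 1, so --batch=64 cannot actually run on it, and the near-zero GPU compute times suggest no inference was executed. A sketch of how the engine could be rebuilt for larger batches with TensorRT 7's trtexec follows; since the ONNX came from PyTorch (explicit batch), this assumes the model is exported with a dynamic batch dimension, and the binding name "input", the 3x224x224 shape, and the output filename are all placeholders to be replaced with the model's actual values:

```shell
# Rebuild the engine with an optimization profile that allows batches up to 64.
# "input" and 3x224x224 are illustrative -- use the real input binding and shape.
/usr/src/tensorrt/bin/trtexec --onnx=rn50crop.onnx \
    --fp16 \
    --minShapes=input:1x3x224x224 \
    --optShapes=input:32x3x224x224 \
    --maxShapes=input:64x3x224x224 \
    --saveEngine=rn50engine_b64.trt

# Then benchmark the rebuilt engine at batch 64:
/usr/src/tensorrt/bin/trtexec --loadEngine=rn50engine_b64.trt \
    --fp16 --shapes=input:64x3x224x224
```

In practice the usable maximum batch is not a fixed hardware constant: it is whatever the engine was built to accept, bounded by the device memory available on the Jetson.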