Hi SunilJB,
I upgraded to JetPack 4.3 and tried TRT 6 with "my_model174.onnx", but it reports an error about an unknown option.
Command:
[code]nvidia@nvidia:~/Downloads$ trtexec --onnx=my_model.onnx --output=idx:174_activation --int8 --batch=1 --device=0
Log:
&&&& RUNNING TensorRT.trtexec # trtexec --onnx=my_model.onnx --output=idx:174_activation --int8 --batch=1 --device=0
[11/20/2019-15:57:41] [E] Unknown option: --output idx:174_activation
=== Model Options ===
--uff=<file> UFF model
--onnx=<file> ONNX model
--model=<file> Caffe model (default = no model, random weights used)
--deploy=<file> Caffe prototxt file
--output=<name>[,<name>]* Output names (it can be specified multiple times); at least one output is required for UFF and Caffe
--uffInput=<name>,X,Y,Z Input blob name and its dimensions (X,Y,Z=C,H,W), it can be specified multiple times; at least one is required for UFF models
--uffNHWC Set if inputs are in the NHWC layout instead of NCHW (use X,Y,Z=H,W,C order in --uffInput)
=== Build Options ===
--maxBatch Set max batch size and build an implicit batch engine (default = 1)
--explicitBatch Use explicit batch sizes when building the engine (default = implicit)
--minShapes=spec Build with dynamic shapes using a profile with the min shapes provided
--optShapes=spec Build with dynamic shapes using a profile with the opt shapes provided
--maxShapes=spec Build with dynamic shapes using a profile with the max shapes provided
Note: if any of min/max/opt is missing, the profile will be completed using the shapes
provided and assuming that opt will be equal to max unless they are both specified;
partially specified shapes are applied starting from the batch size;
dynamic shapes imply explicit batch
Input shapes spec ::= Ishp[","spec]
Ishp ::= name":"shape
shape ::= N[["x"N]*"*"]
--inputIOFormats=spec Type and formats of the input tensors (default = all inputs in fp32:chw)
--outputIOFormats=spec Type and formats of the output tensors (default = all outputs in fp32:chw)
IO Formats: spec ::= IOfmt[","spec]
IOfmt ::= type:fmt
type ::= "fp32"|"fp16"|"int32"|"int8"
fmt ::= ("chw"|"chw2"|"chw4"|"hwc8"|"chw16"|"chw32")["+"fmt]
--workspace=N Set workspace size in megabytes (default = 16)
--minTiming=M Set the minimum number of iterations used in kernel selection (default = 1)
--avgTiming=M Set the number of times averaged in each iteration for kernel selection (default = 8)
--fp16 Enable fp16 mode (default = disabled)
--int8 Run in int8 mode (default = disabled)
--calib=<file> Read INT8 calibration cache file
--safe Only test the functionality available in safety restricted flows
--saveEngine=<file> Save the serialized engine
--loadEngine=<file> Load a serialized engine
=== Inference Options ===
--batch=N Set batch size for implicit batch engines (default = 1)
--shapes=spec Set input shapes for explicit batch and dynamic shapes inputs
Input shapes spec ::= Ishp[","spec]
Ishp ::= name":"shape
shape ::= N[["x"N]*"*"]
--iterations=N Run at least N inference iterations (default = 10)
--warmUp=N Run for N milliseconds to warmup before measuring performance (default = 200)
--duration=N Run performance measurements for at least N seconds wallclock time (default = 10)
--sleepTime=N Delay inference start with a gap of N milliseconds between launch and compute (default = 0)
--streams=N Instantiate N engines to use concurrently (default = 1)
--useSpinWait Actively synchronize on GPU events. This option may decrease synchronization time but increase CPU usage and power (default = false)
--threads Enable multithreading to drive engines with independent threads (default = disabled)
--useCudaGraph Use cuda graph to capture engine execution and then launch inference (default = false)
--buildOnly Skip inference perf measurement (default = disabled)
=== Build and Inference Batch Options ===
When using implicit batch, the max batch size of the engine, if not given,
is set to the inference batch size;
when using explicit batch, if shapes are specified only for inference, they
will be used also as min/opt/max in the build profile; if shapes are
specified only for the build, the opt shapes will be used also for inference;
if both are specified, they must be compatible; and if explicit batch is
enabled but neither is specified, the model must provide complete static
dimensions, including batch size, for all inputs
=== Reporting Options ===
--verbose Use verbose logging (default = false)
--avgRuns=N Report performance measurements averaged over N consecutive iterations (default = 10)
--percentile=P Report performance for the P percentage (0<=P<=100, 0 representing max perf, and 100 representing min perf; (default = 99%)
--dumpOutput Print the output tensor(s) of the last inference iteration (default = disabled)
--dumpProfile Print profile information per layer (default = disabled)
--exportTimes=<file> Write the timing results in a json file (default = disabled)
--exportProfile=<file> Write the profile information per layer in a json file (default = disabled)
=== System Options ===
--device=N Select cuda device N (default = 0)
--useDLACore=N Select DLA core N for layers that support DLA (default = none)
--allowGPUFallback When DLA is enabled, allow GPU fallback for unsupported layers (default = disabled)
--plugins Plugin library (.so) to load (can be specified multiple times)
=== Help ===
--help Print this message
Note: the following options are not fully supported in trtexec: dynamic shapes, multistream/threads, cuda graphs, json logs, and actual data IO
&&&& FAILED TensorRT.trtexec # trtexec --onnx=my_model.onnx --output=idx:174_activation --int8 --batch=1 --device=0
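For reference, the help text above says outputs are required only for UFF and Caffe models, which suggests an ONNX network may carry its own output definitions. A sketch of the same invocation with `--output` simply dropped (this is a guess based on that help text, not a confirmed fix):

```shell
# Same command as above, minus --output, on the assumption that the ONNX
# model already declares its outputs (per the TRT 6 help text above).
CMD="trtexec --onnx=my_model.onnx --int8 --batch=1 --device=0"
# Print rather than execute, since trtexec may not be on PATH here:
echo "$CMD"
```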
If I omit --output, it instead prompts a "Network must have at least one output" error.
Have the options changed in TRT 6? How should I use --output?