Jetson-inference: imageNet.Classify() encountered an error

Hi,

I have generated an ONNX model and want to use it with imagenet-console.py, but I am getting the error below. Could you let me know what the problem is?
If I use googlenet as the model instead of the ONNX model I generated, it works fine.

Error message:-

[TRT] /home/jetbot/shankar/resnet18.onnx loaded
imageNet – loaded 2 class info entries
/home/jetbot/shankar/resnet18.onnx initialized.
Traceback (most recent call last):
File “imagenet-console.py”, line 52, in <module>
class_idx, confidence = net.Classify(img, width, height)
Exception: jetson.inference – imageNet.Classify() encountered an error classifying the image
jetson.utils – freeing CUDA mapped memory
PyTensorNet_Dealloc()

Regards,
Shankar

Hi Shankar, are you able to run your model with the imagenet-console C++ program? (should be same arguments, just without the .py extension)

Perhaps that will print more information about what is failing. Please post its log if you can.

Hi Dusty,

With the C++ imagenet-console, the logs are:

[TRT] binding to output 0 output_0 dims (b=1 c=2 h=1 w=1) size=8
device GPU, /home/jetbot/shankar/resnet18.onnx initialized.
[TRT] /home/jetbot/shankar/resnet18.onnx loaded
imageNet – loaded 2 class info entries
/home/jetbot/shankar/resnet18.onnx initialized.
[image] loaded ‘/home/jetbot/datasets/cat_dog_limited/test/02.jpg’ (620 x 410, 1 channels)
imagenet-console: failed to classify ‘/home/jetbot/datasets/cat_dog_limited/test/02.jpg’ (result=-1)
imagenet-console: shutting down…
imagenet-console: shutdown complete

One more thing: when loading the recognition network, the script uses GoogleNet as the default value. Since I am providing the ONNX model, why does it still report GoogleNet as the network?

net = jetson.inference.imageNet(opt.network, sys.argv)

Regards,
Shankar
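
On the GoogleNet question: the "Network is: googlenet" line is probably just the default value of the script's --network option, not the model that was actually loaded. The log line "imageNet loading network using argv command line params" and the resnet18.onnx path that follows suggest the --model flag in argv is what the bindings use. Below is a minimal sketch of that behaviour using only argparse. The parser here is a hypothetical stand-in for the one in imagenet-console.py, and the use of parse_known_args() is an assumption about how the script tolerates flags it does not define itself.

```python
import argparse

# Hypothetical stand-in for the argument handling in imagenet-console.py;
# the real script may define different or additional options.
parser = argparse.ArgumentParser()
parser.add_argument("file_in", type=str)
parser.add_argument("file_out", type=str, nargs="?", default=None)
parser.add_argument("--network", type=str, default="googlenet")

# Same arguments as the command line shown in this thread
argv = [
    "--model=/home/jetbot/shankar/resnet18.onnx",
    "--input_blob=input_0",
    "--output_blob=output_0",
    "--labels=/home/jetbot/datasets/cat_dog_limited/labels.txt",
    "/home/jetbot/datasets/cat_dog_limited/test/02.jpg",
    "cat.jpg",
]

# parse_known_args() returns flags the parser does not define (such as
# --model) in a separate list instead of raising an error.
opt, unknown = parser.parse_known_args(argv)

print(opt.network)  # googlenet (the default, since --network was not passed)
print(unknown)      # the --model, --input_blob, --output_blob and --labels flags
```

So opt.network stays at its default while the untouched --model flag is still present in the argument list handed to imageNet.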

Complete logs:-

jetbot@jetbot-desktop:~/jetson-inference/python/training/classification$ python3.6 imagenet-console.py --model=/home/jetbot/shankar/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=/home/jetbot/datasets/cat_dog_limited/labels.txt /home/jetbot/datasets/cat_dog_limited/test/02.jpg cat.jpg
jetson.inference.init.py
jetson.inference – initializing Python 3.6 bindings…
jetson.inference – registering module types…
jetson.inference – done registering module types
jetson.inference – done Python 3.6 binding initialization
jetson.utils.init.py
jetson.utils – initializing Python 3.6 bindings…
jetson.utils – registering module functions…
jetson.utils – done registering module functions
jetson.utils – registering module types…
jetson.utils – done registering module types
jetson.utils – done Python 3.6 binding initialization
[image] loaded ‘/home/jetbot/datasets/cat_dog_limited/test/02.jpg’ (620 x 410, 1 channels)
jetson.inference – PyTensorNet_New()
jetson.inference – PyImageNet_Init()
jetson.inference – imageNet loading network using argv command line params
jetson.inference – imageNet.init() argv[0] = ‘imagenet-console.py’
jetson.inference – imageNet.init() argv[1] = ‘--model=/home/jetbot/shankar/resnet18.onnx’
jetson.inference – imageNet.init() argv[2] = ‘--input_blob=input_0’
jetson.inference – imageNet.init() argv[3] = ‘--output_blob=output_0’
jetson.inference – imageNet.init() argv[4] = ‘--labels=/home/jetbot/datasets/cat_dog_limited/labels.txt’
jetson.inference – imageNet.init() argv[5] = ‘/home/jetbot/datasets/cat_dog_limited/test/02.jpg’
jetson.inference – imageNet.init() argv[6] = ‘cat.jpg’

imageNet – loading classification network model from:
– prototxt (null)
– model /home/jetbot/shankar/resnet18.onnx
– class_labels /home/jetbot/datasets/cat_dog_limited/labels.txt
– input_blob ‘input_0’
– output_blob ‘output_0’
– batch_size 1

[TRT] TensorRT version 6.0.1
[TRT] loading NVIDIA plugins…
[TRT] Plugin Creator registration succeeded - GridAnchor_TRT
[TRT] Plugin Creator registration succeeded - GridAnchorRect_TRT
[TRT] Plugin Creator registration succeeded - NMS_TRT
[TRT] Plugin Creator registration succeeded - Reorg_TRT
[TRT] Plugin Creator registration succeeded - Region_TRT
[TRT] Plugin Creator registration succeeded - Clip_TRT
[TRT] Plugin Creator registration succeeded - LReLU_TRT
[TRT] Plugin Creator registration succeeded - PriorBox_TRT
[TRT] Plugin Creator registration succeeded - Normalize_TRT
[TRT] Plugin Creator registration succeeded - RPROI_TRT
[TRT] Plugin Creator registration succeeded - BatchedNMS_TRT
[TRT] Could not register plugin creator: FlattenConcat_TRT in namespace:
[TRT] completed loading NVIDIA plugins.
[TRT] detected model format - ONNX (extension ‘.onnx’)
[TRT] desired precision specified for GPU: FASTEST
[TRT] requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT] native precisions detected for GPU: FP32, FP16
[TRT] selecting fastest native precision for GPU: FP16
[TRT] attempting to open engine cache file /home/jetbot/shankar/resnet18.onnx.1.1.GPU.FP16.engine
[TRT] loading network profile from engine cache… /home/jetbot/shankar/resnet18.onnx.1.1.GPU.FP16.engine
[TRT] device GPU, /home/jetbot/shankar/resnet18.onnx loaded
[TRT] Deserialize required 6357218 microseconds.
[TRT] device GPU, CUDA engine context initialized with 2 bindings
[TRT] binding – index 0
– name ‘input_0’
– type FP32
– in/out INPUT
– # dims 3
– dim #0 3 (SPATIAL)
– dim #1 224 (SPATIAL)
– dim #2 224 (SPATIAL)
[TRT] binding – index 1
– name ‘output_0’
– type FP32
– in/out OUTPUT
– # dims 1
– dim #0 2 (SPATIAL)
[TRT] binding to input 0 input_0 binding index: 0
[TRT] binding to input 0 input_0 dims (b=1 c=3 h=224 w=224) size=602112
[TRT] binding to output 0 output_0 binding index: 1
[TRT] binding to output 0 output_0 dims (b=1 c=2 h=1 w=1) size=8
device GPU, /home/jetbot/shankar/resnet18.onnx initialized.
[TRT] /home/jetbot/shankar/resnet18.onnx loaded
imageNet – loaded 2 class info entries
/home/jetbot/shankar/resnet18.onnx initialized.

Network is: googlenet

Traceback (most recent call last):
File “imagenet-console.py”, line 53, in <module>
class_idx, confidence = net.Classify(img, width, height)
Exception: jetson.inference – imageNet.Classify() encountered an error classifying the image
jetson.utils – freeing CUDA mapped memory
PyTensorNet_Dealloc()
jetbot@jetbot-desktop:~/jetson-inference/python/training/classification$
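
As a quick sanity check on the log above, the binding sizes it reports (size=602112 for input_0 and size=8 for output_0) match FP32 tensors of the listed dimensions. A small sketch of the arithmetic, assuming 4 bytes per FP32 element; the helper name is made up for illustration:

```python
from math import prod

BYTES_PER_FP32 = 4  # assumption: both bindings are FP32, as the log reports


def binding_size_bytes(batch, dims):
    """Size in bytes of an FP32 tensor with the given batch size and dims."""
    return batch * prod(dims) * BYTES_PER_FP32


input_size = binding_size_bytes(1, (3, 224, 224))   # b=1 c=3 h=224 w=224
output_size = binding_size_bytes(1, (2,))            # b=1 c=2 h=1 w=1

print(input_size)   # 602112
print(output_size)  # 8
```

Both numbers agree with the log, so the engine was set up with the expected 3x224x224 input and a 2-class output.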

OK, so from the lack of other error messages during the execution, the network seems to be outputting negative confidence values (they should be positive). If they were positive, the class would be printed out in the log.
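
To illustrate why this would surface as an error, here is a rough Python sketch of an argmax loop that starts from a sentinel class index of -1. It is not the actual code from imageNet.cpp; the starting maximum of -1.0 and the function name are assumptions made for illustration. If no output value exceeds the starting maximum (for example because the values are very negative or NaN), the index stays at -1, which matches the "result=-1" in the C++ log.

```python
def classify(outputs, initial_max=-1.0):
    """Return (class_index, confidence) for a list of output values.

    class_index stays at -1 when no value is greater than initial_max.
    initial_max is a hypothetical starting value, not taken from the library.
    """
    class_index = -1
    class_max = initial_max
    for n, value in enumerate(outputs):
        if value > class_max:
            class_index = n
            class_max = value
    return class_index, class_max


print(classify([0.31, 0.69]))                    # (1, 0.69)
print(classify([-3.2, -1.5]))                    # (-1, -1.0)
print(classify([float("nan"), float("nan")]))    # (-1, -1.0)
```

Comparisons against NaN are always false, so NaN outputs behave the same way as outputs that never beat the starting maximum.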

When you trained the model in PyTorch, what was the accuracy of the model?

Also, try commenting out this line of code, which will result in all the outputs being printed so you can debug them:
https://github.com/dusty-nv/jetson-inference/blob/8846dcb274ff812880ef7bf9d42eb1dd75c9e7b7/c/imageNet.cpp#L548

After commenting out this if statement, run ‘make’ and ‘sudo make install’ again.

Hi Dusty,

I commented out the line you mentioned, rebuilt, and collected the logs, which are shared below.
How can I attach the log file here?

Regarding the accuracy of the model: I think the accuracy will be very low since I ran training for only 5 epochs. How can I get the accuracy of the model?
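
For reference, top-1 accuracy is the fraction of validation images whose predicted class matches the ground-truth label. A minimal sketch in plain Python; the label and prediction lists are made-up values for a two-class cat/dog problem, not results from this model:

```python
def top1_accuracy(predictions, labels):
    """Fraction of samples where the predicted class index equals the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    correct = sum(1 for p, t in zip(predictions, labels) if p == t)
    return correct / len(labels)


# Hypothetical data: 0 = cat, 1 = dog
labels = [0, 0, 1, 1, 1]
predictions = [0, 1, 1, 1, 0]

print(top1_accuracy(predictions, labels))  # 0.6
```

With only two classes, a value near 0.5 would mean the model does no better than guessing.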

The logs:-

jetbot@jetbot-desktop:~/jetson-inference/python/training/classification$ python3.6 imagenet-console.py --model=/home/jetbot/shankar/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=/home/jetbot/datasets/cat_dog_limited/labels.txt /home/jetbot/datasets/cat_dog_limited/test/02.jpg cat.jpg
jetson.inference.init.py
jetson.inference – initializing Python 3.6 bindings…
jetson.inference – registering module types…
jetson.inference – done registering module types
jetson.inference – done Python 3.6 binding initialization
jetson.utils.init.py
jetson.utils – initializing Python 3.6 bindings…
jetson.utils – registering module functions…
jetson.utils – done registering module functions
jetson.utils – registering module types…
jetson.utils – done registering module types
jetson.utils – done Python 3.6 binding initialization
[image] loaded ‘/home/jetbot/datasets/cat_dog_limited/test/02.jpg’ (620 x 410, 1 channels)

Image loaded is: 620 X 410

jetson.inference – PyTensorNet_New()
jetson.inference – PyImageNet_Init()
jetson.inference – imageNet loading network using argv command line params
jetson.inference – imageNet.init() argv[0] = ‘imagenet-console.py’
jetson.inference – imageNet.init() argv[1] = ‘--model=/home/jetbot/shankar/resnet18.onnx’
jetson.inference – imageNet.init() argv[2] = ‘--input_blob=input_0’
jetson.inference – imageNet.init() argv[3] = ‘--output_blob=output_0’
jetson.inference – imageNet.init() argv[4] = ‘--labels=/home/jetbot/datasets/cat_dog_limited/labels.txt’
jetson.inference – imageNet.init() argv[5] = ‘/home/jetbot/datasets/cat_dog_limited/test/02.jpg’
jetson.inference – imageNet.init() argv[6] = ‘cat.jpg’

imageNet – loading classification network model from:
– prototxt (null)
– model /home/jetbot/shankar/resnet18.onnx
– class_labels /home/jetbot/datasets/cat_dog_limited/labels.txt
– input_blob ‘input_0’
– output_blob ‘output_0’
– batch_size 1

[TRT] TensorRT version 6.0.1
[TRT] loading NVIDIA plugins…
[TRT] Plugin Creator registration succeeded - GridAnchor_TRT
[TRT] Plugin Creator registration succeeded - GridAnchorRect_TRT
[TRT] Plugin Creator registration succeeded - NMS_TRT
[TRT] Plugin Creator registration succeeded - Reorg_TRT
[TRT] Plugin Creator registration succeeded - Region_TRT
[TRT] Plugin Creator registration succeeded - Clip_TRT
[TRT] Plugin Creator registration succeeded - LReLU_TRT
[TRT] Plugin Creator registration succeeded - PriorBox_TRT
[TRT] Plugin Creator registration succeeded - Normalize_TRT
[TRT] Plugin Creator registration succeeded - RPROI_TRT
[TRT] Plugin Creator registration succeeded - BatchedNMS_TRT
[TRT] Could not register plugin creator: FlattenConcat_TRT in namespace:
[TRT] completed loading NVIDIA plugins.
[TRT] detected model format - ONNX (extension ‘.onnx’)
[TRT] desired precision specified for GPU: FASTEST
[TRT] requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT] native precisions detected for GPU: FP32, FP16
[TRT] selecting fastest native precision for GPU: FP16
[TRT] attempting to open engine cache file /home/jetbot/shankar/resnet18.onnx.1.1.GPU.FP16.engine
[TRT] cache file not found, profiling network model on device GPU
[TRT] device GPU, loading /usr/bin/ /home/jetbot/shankar/resnet18.onnx

Input filename: /home/jetbot/shankar/resnet18.onnx
ONNX IR version: 0.0.4
Opset version: 9
Producer name: pytorch
Producer version: 1.2
Domain:
Model version: 0
Doc string:

WARNING: ONNX model has a newer ir_version (0.0.4) than this parser was built against (0.0.3).
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (3, 224, 224)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (7, 7), strides: (2, 2), padding: (3, 3), dilations: (1, 1), numOutputs: 64
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (64, 112, 112)
[TRT] 123:Conv → (64, 112, 112)
[TRT] 124:BatchNormalization → (64, 112, 112)
[TRT] 125:Relu → (64, 112, 112)
[TRT] 126:MaxPool → (64, 56, 56)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (64, 56, 56)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (3, 3), strides: (1, 1), padding: (1, 1), dilations: (1, 1), numOutputs: 64
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (64, 56, 56)
[TRT] 127:Conv → (64, 56, 56)
[TRT] 128:BatchNormalization → (64, 56, 56)
[TRT] 129:Relu → (64, 56, 56)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (64, 56, 56)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (3, 3), strides: (1, 1), padding: (1, 1), dilations: (1, 1), numOutputs: 64
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (64, 56, 56)
[TRT] 130:Conv → (64, 56, 56)
[TRT] 131:BatchNormalization → (64, 56, 56)
[TRT] 132:Add → (64, 56, 56)
[TRT] 133:Relu → (64, 56, 56)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (64, 56, 56)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (3, 3), strides: (1, 1), padding: (1, 1), dilations: (1, 1), numOutputs: 64
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (64, 56, 56)
[TRT] 134:Conv → (64, 56, 56)
[TRT] 135:BatchNormalization → (64, 56, 56)
[TRT] 136:Relu → (64, 56, 56)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (64, 56, 56)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (3, 3), strides: (1, 1), padding: (1, 1), dilations: (1, 1), numOutputs: 64
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (64, 56, 56)
[TRT] 137:Conv → (64, 56, 56)
[TRT] 138:BatchNormalization → (64, 56, 56)
[TRT] 139:Add → (64, 56, 56)
[TRT] 140:Relu → (64, 56, 56)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (64, 56, 56)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (3, 3), strides: (2, 2), padding: (1, 1), dilations: (1, 1), numOutputs: 128
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (128, 28, 28)
[TRT] 141:Conv → (128, 28, 28)
[TRT] 142:BatchNormalization → (128, 28, 28)
[TRT] 143:Relu → (128, 28, 28)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (128, 28, 28)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (3, 3), strides: (1, 1), padding: (1, 1), dilations: (1, 1), numOutputs: 128
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (128, 28, 28)
[TRT] 144:Conv → (128, 28, 28)
[TRT] 145:BatchNormalization → (128, 28, 28)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (64, 56, 56)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (1, 1), strides: (2, 2), padding: (0, 0), dilations: (1, 1), numOutputs: 128
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (128, 28, 28)
[TRT] 146:Conv → (128, 28, 28)
[TRT] 147:BatchNormalization → (128, 28, 28)
[TRT] 148:Add → (128, 28, 28)
[TRT] 149:Relu → (128, 28, 28)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (128, 28, 28)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (3, 3), strides: (1, 1), padding: (1, 1), dilations: (1, 1), numOutputs: 128
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (128, 28, 28)
[TRT] 150:Conv → (128, 28, 28)
[TRT] 151:BatchNormalization → (128, 28, 28)
[TRT] 152:Relu → (128, 28, 28)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (128, 28, 28)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (3, 3), strides: (1, 1), padding: (1, 1), dilations: (1, 1), numOutputs: 128
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (128, 28, 28)
[TRT] 153:Conv → (128, 28, 28)
[TRT] 154:BatchNormalization → (128, 28, 28)
[TRT] 155:Add → (128, 28, 28)
[TRT] 156:Relu → (128, 28, 28)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (128, 28, 28)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (3, 3), strides: (2, 2), padding: (1, 1), dilations: (1, 1), numOutputs: 256
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (256, 14, 14)
[TRT] 157:Conv → (256, 14, 14)
[TRT] 158:BatchNormalization → (256, 14, 14)
[TRT] 159:Relu → (256, 14, 14)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (256, 14, 14)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (3, 3), strides: (1, 1), padding: (1, 1), dilations: (1, 1), numOutputs: 256
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (256, 14, 14)
[TRT] 160:Conv → (256, 14, 14)
[TRT] 161:BatchNormalization → (256, 14, 14)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (128, 28, 28)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (1, 1), strides: (2, 2), padding: (0, 0), dilations: (1, 1), numOutputs: 256
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (256, 14, 14)
[TRT] 162:Conv → (256, 14, 14)
[TRT] 163:BatchNormalization → (256, 14, 14)
[TRT] 164:Add → (256, 14, 14)
[TRT] 165:Relu → (256, 14, 14)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (256, 14, 14)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (3, 3), strides: (1, 1), padding: (1, 1), dilations: (1, 1), numOutputs: 256
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (256, 14, 14)
[TRT] 166:Conv → (256, 14, 14)
[TRT] 167:BatchNormalization → (256, 14, 14)
[TRT] 168:Relu → (256, 14, 14)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (256, 14, 14)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (3, 3), strides: (1, 1), padding: (1, 1), dilations: (1, 1), numOutputs: 256
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (256, 14, 14)
[TRT] 169:Conv → (256, 14, 14)
[TRT] 170:BatchNormalization → (256, 14, 14)
[TRT] 171:Add → (256, 14, 14)
[TRT] 172:Relu → (256, 14, 14)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (256, 14, 14)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (3, 3), strides: (2, 2), padding: (1, 1), dilations: (1, 1), numOutputs: 512
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (512, 7, 7)
[TRT] 173:Conv → (512, 7, 7)
[TRT] 174:BatchNormalization → (512, 7, 7)
[TRT] 175:Relu → (512, 7, 7)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (512, 7, 7)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (3, 3), strides: (1, 1), padding: (1, 1), dilations: (1, 1), numOutputs: 512
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (512, 7, 7)
[TRT] 176:Conv → (512, 7, 7)
[TRT] 177:BatchNormalization → (512, 7, 7)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (256, 14, 14)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (1, 1), strides: (2, 2), padding: (0, 0), dilations: (1, 1), numOutputs: 512
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (512, 7, 7)
[TRT] 178:Conv → (512, 7, 7)
[TRT] 179:BatchNormalization → (512, 7, 7)
[TRT] 180:Add → (512, 7, 7)
[TRT] 181:Relu → (512, 7, 7)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (512, 7, 7)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (3, 3), strides: (1, 1), padding: (1, 1), dilations: (1, 1), numOutputs: 512
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (512, 7, 7)
[TRT] 182:Conv → (512, 7, 7)
[TRT] 183:BatchNormalization → (512, 7, 7)
[TRT] 184:Relu → (512, 7, 7)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:773: Convolution input dimensions: (512, 7, 7)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:840: Using kernel: (3, 3), strides: (1, 1), padding: (1, 1), dilations: (1, 1), numOutputs: 512
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:841: Convolution output dimensions: (512, 7, 7)
[TRT] 185:Conv → (512, 7, 7)
[TRT] 186:BatchNormalization → (512, 7, 7)
[TRT] 187:Add → (512, 7, 7)
[TRT] 188:Relu → (512, 7, 7)
[TRT] 189:GlobalAveragePool → (512, 1, 1)
[TRT] 190:Flatten → (512)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:1094: GEMM: A: (512), B: (512, 2), C: (2)
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:1131: Using opA: 2 opB: 0
[TRT] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/parsers/onnxOpenSource/builtin_op_importers.cpp:1132: GEMM: A, after squeezing: (512)
[TRT] 191:Gemm → (2)
[TRT] output_0:Softmax → (2)
[TRT] retrieved Input tensor “input_0”: 3x224x224
[TRT] device GPU, configuring CUDA engine
[TRT] device GPU, building FP16: ON
[TRT] device GPU, building INT8: OFF
[TRT] device GPU, building CUDA engine (this may take a few minutes the first time a network is loaded)
[TRT] Applying generic optimizations to the graph for inference.
[TRT] Original: 73 layers
[TRT] After dead-layer removal: 73 layers
[TRT] Fusing convolution weights from (Unnamed Layer* 0) [Convolution] with scale (Unnamed Layer* 1) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 4) [Convolution] with scale (Unnamed Layer* 5) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 7) [Convolution] with scale (Unnamed Layer* 8) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 11) [Convolution] with scale (Unnamed Layer* 12) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 14) [Convolution] with scale (Unnamed Layer* 15) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 18) [Convolution] with scale (Unnamed Layer* 19) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 21) [Convolution] with scale (Unnamed Layer* 22) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 23) [Convolution] with scale (Unnamed Layer* 24) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 27) [Convolution] with scale (Unnamed Layer* 28) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 30) [Convolution] with scale (Unnamed Layer* 31) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 34) [Convolution] with scale (Unnamed Layer* 35) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 37) [Convolution] with scale (Unnamed Layer* 38) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 39) [Convolution] with scale (Unnamed Layer* 40) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 43) [Convolution] with scale (Unnamed Layer* 44) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 46) [Convolution] with scale (Unnamed Layer* 47) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 50) [Convolution] with scale (Unnamed Layer* 51) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 53) [Convolution] with scale (Unnamed Layer* 54) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 55) [Convolution] with scale (Unnamed Layer* 56) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 59) [Convolution] with scale (Unnamed Layer* 60) [Scale]
[TRT] Fusing convolution weights from (Unnamed Layer* 62) [Convolution] with scale (Unnamed Layer* 63) [Scale]
[TRT] After scale fusion: 53 layers
[TRT] Fusing (Unnamed Layer* 0) [Convolution] with (Unnamed Layer* 2) [Activation]
[TRT] Fusing (Unnamed Layer* 4) [Convolution] with (Unnamed Layer* 6) [Activation]
[TRT] Fusing (Unnamed Layer* 7) [Convolution] with (Unnamed Layer* 9) [ElementWise]
[TRT] Fusing (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] with (Unnamed Layer* 10) [Activation]
[TRT] Fusing (Unnamed Layer* 11) [Convolution] with (Unnamed Layer* 13) [Activation]
[TRT] Fusing (Unnamed Layer* 14) [Convolution] with (Unnamed Layer* 16) [ElementWise]
[TRT] Fusing (Unnamed Layer* 14) [Convolution] + (Unnamed Layer* 16) [ElementWise] with (Unnamed Layer* 17) [Activation]
[TRT] Fusing (Unnamed Layer* 18) [Convolution] with (Unnamed Layer* 20) [Activation]
[TRT] Fusing (Unnamed Layer* 21) [Convolution] with (Unnamed Layer* 25) [ElementWise]
[TRT] Fusing (Unnamed Layer* 21) [Convolution] + (Unnamed Layer* 25) [ElementWise] with (Unnamed Layer* 26) [Activation]
[TRT] Fusing (Unnamed Layer* 27) [Convolution] with (Unnamed Layer* 29) [Activation]
[TRT] Fusing (Unnamed Layer* 30) [Convolution] with (Unnamed Layer* 32) [ElementWise]
[TRT] Fusing (Unnamed Layer* 30) [Convolution] + (Unnamed Layer* 32) [ElementWise] with (Unnamed Layer* 33) [Activation]
[TRT] Fusing (Unnamed Layer* 34) [Convolution] with (Unnamed Layer* 36) [Activation]
[TRT] Fusing (Unnamed Layer* 37) [Convolution] with (Unnamed Layer* 41) [ElementWise]
[TRT] Fusing (Unnamed Layer* 37) [Convolution] + (Unnamed Layer* 41) [ElementWise] with (Unnamed Layer* 42) [Activation]
[TRT] Fusing (Unnamed Layer* 43) [Convolution] with (Unnamed Layer* 45) [Activation]
[TRT] Fusing (Unnamed Layer* 46) [Convolution] with (Unnamed Layer* 48) [ElementWise]
[TRT] Fusing (Unnamed Layer* 46) [Convolution] + (Unnamed Layer* 48) [ElementWise] with (Unnamed Layer* 49) [Activation]
[TRT] Fusing (Unnamed Layer* 50) [Convolution] with (Unnamed Layer* 52) [Activation]
[TRT] Fusing (Unnamed Layer* 53) [Convolution] with (Unnamed Layer* 57) [ElementWise]
[TRT] Fusing (Unnamed Layer* 53) [Convolution] + (Unnamed Layer* 57) [ElementWise] with (Unnamed Layer* 58) [Activation]
[TRT] Fusing (Unnamed Layer* 59) [Convolution] with (Unnamed Layer* 61) [Activation]
[TRT] Fusing (Unnamed Layer* 62) [Convolution] with (Unnamed Layer* 64) [ElementWise]
[TRT] Fusing (Unnamed Layer* 62) [Convolution] + (Unnamed Layer* 64) [ElementWise] with (Unnamed Layer* 65) [Activation]
[TRT] After vertical fusions: 28 layers
[TRT] After final dead-layer removal: 28 layers
[TRT] After tensor merging: 28 layers
[TRT] After concat removal: 28 layers
[TRT] Graph construction and optimization completed in 1.03877 seconds.
[TRT] Constructing optimization profile number 0 out of 1
--------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.36263
[TRT] Tactic: 0 time 0.614843
[TRT] Fastest Tactic: 1002 Time: 0.36263
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 7.5544
[TRT] Tactic: 0 time 0.286016
[TRT] Fastest Tactic: 0 Time: 0.286016
[TRT] *************** Autotuning format combination: Float(1,224,50176,150528) → Float(1,112,12544,802816) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (LegacySASSConvolution)
[TRT] Tactic: 0 time 3.01174
[TRT] Fastest Tactic: 0 Time: 3.01174
[TRT] --------------- Timing Runner: (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (FusedConvActConvolution)
[TRT] Tactic: 1 time 4.9762
[TRT] Tactic: 49 time 4.65906
[TRT] Tactic: 128 time 4.255
[TRT] Fastest Tactic: 128 Time: 4.255
[TRT] --------------- Timing Runner: (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (CaskConvolution)
[TRT] (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1
[TRT] Tactic: 1062367460111450758 time 2.82388
[TRT] (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1
[TRT] Tactic: 4337000649858996379 time 2.63484
[TRT] (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1
[TRT] Tactic: 4501471010995462441 time 4.88021
[TRT] (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1
[TRT] Tactic: 6645123197870846056 time 2.24
[TRT] (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1
[TRT] Tactic: -9137461792520977713 time 4.88344
[TRT] (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1
[TRT] Tactic: -6092040395344634144 time 2.8737
[TRT] Fastest Tactic: 6645123197870846056 Time: 2.24
[TRT] --------------- Timing Runner: (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (CudaConvolution)
[TRT] Tactic: 0 time 5.6726
[TRT] Tactic: 1 time 3.05862
[TRT] Tactic: 2 time 5.14909
[TRT] Fastest Tactic: 1 Time: 3.05862
[TRT] --------------- Timing Runner: (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (CudaDepthwiseConvolution)
[TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 6645123197870846056
[TRT] (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1
[TRT]
[TRT] *************** Autotuning format combination: Half(1,224,50176,150528) → Half(1,112,12544,802816) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (FusedConvActConvolution)
[TRT] FusedConvActConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (CaskConvolution)
[TRT] CaskConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (CudaConvolution)
[TRT] Tactic: 0 time 6.07667
[TRT] Tactic: 1 time 2.94841
[TRT] Tactic: 2 time 4.56203
[TRT] Fastest Tactic: 1 Time: 2.94841
[TRT] --------------- Timing Runner: (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (CudaDepthwiseConvolution)
[TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: CudaConvolution Tactic: 1
[TRT]
[TRT] *************** Autotuning format combination: Half(1,224,50176:2,100352) → Half(1,112,12544:2,401408) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (LegacySASSConvolution)
[TRT] Tactic: 0 time 1.52648
[TRT] Fastest Tactic: 0 Time: 1.52648
[TRT] --------------- Timing Runner: (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (FusedConvActConvolution)
[TRT] FusedConvActConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (CaskConvolution)
[TRT] (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1
[TRT] Tactic: 3564772625446233998 time 2.44872
[TRT] (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1
[TRT] Tactic: 3650389455493082349 time 2.02521
[TRT] (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1
[TRT] Tactic: 7205456024582378848 time 1.54763
[TRT] (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1
[TRT] Tactic: -6490690591794140522 time 1.55997
[TRT] (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1
[TRT] Tactic: -4686027666808657977 time 3.47766
[TRT] (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1
[TRT] Tactic: -3898373634979201110 time 3.43685
[TRT] Fastest Tactic: 7205456024582378848 Time: 1.54763
[TRT] --------------- Timing Runner: (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (CudaConvolution)
[TRT] CudaConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] (CudaDepthwiseConvolution)
[TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: LegacySASSConvolution Tactic: 0
[TRT]
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.731562
[TRT] Tactic: 0 time 1.21159
[TRT] Fastest Tactic: 1002 Time: 0.731562
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 2.4274
[TRT] Tactic: 0 time 0.974584
[TRT] Fastest Tactic: 0 Time: 0.974584
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.72789
[TRT] Tactic: 0 time 1.0319
[TRT] Fastest Tactic: 1002 Time: 0.72789
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 2.86172
[TRT] Tactic: 0 time 0.955443
[TRT] Fastest Tactic: 0 Time: 0.955443
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 3.55315
[TRT] Tactic: 0 time 0.880911
[TRT] Fastest Tactic: 0 Time: 0.880911
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 3.75583
[TRT] Tactic: 0 time 0.870026
[TRT] Fastest Tactic: 0 Time: 0.870026
[TRT] *************** Autotuning format combination: Float(1,112,12544,802816) → Float(1,56,3136,200704) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 3) [Pooling] (Pooling)
[TRT] Tactic: -1 time 0.887812
[TRT] Fastest Tactic: -1 Time: 0.887812
[TRT] --------------- Timing Runner: (Unnamed Layer* 3) [Pooling] (TiledPooling)
[TRT] Tactic: 257 time 1.37336
[TRT] Tactic: 65793 time 1.40943
[TRT] Tactic: 131329 time 1.72
[TRT] Tactic: 196865 time 2.15417
[TRT] Tactic: 262401 time 1.35432
[TRT] Tactic: 327937 time 1.34724
[TRT] Tactic: 393473 time 1.90399
[TRT] Tactic: 459009 time 0.849896
[TRT] Tactic: 524545 time 0.691823
[TRT] Tactic: 590081 time 1.01815
[TRT] Tactic: 655617 time 1.08013
[TRT] Tactic: 721153 time 1.09807
[TRT] Tactic: 786689 time 0.898567
[TRT] Tactic: 852225 time 0.971614
[TRT] Tactic: 917761 time 0.678907
[TRT] Tactic: 983297 time 0.575833
[TRT] Tactic: 1048833 time 0.852318
[TRT] Tactic: 1114369 time 0.863777
[TRT] Tactic: 1179905 time 0.975521
[TRT] Tactic: 1245441 time 0.69948
[TRT] Tactic: 1310977 time 0.805339
[TRT] Tactic: 1376513 time 0.616848
[TRT] Tactic: 1442049 time 0.492318
[TRT] Tactic: 1507585 time 0.725312
[TRT] Tactic: 1573121 time 0.692604
[TRT] Tactic: 1638657 time 0.633984
[TRT] Tactic: 1704193 time 0.564349
[TRT] Tactic: 1769729 time 0.652604
[TRT] Tactic: 1835265 time 0.603359
[TRT] Tactic: 1900801 time 0.462708
[TRT] Tactic: 1966337 time 0.625755
[TRT] Tactic: 2031873 time 0.603619
[TRT] Tactic: 2097409 time 0.594349
[TRT] Tactic: 2162945 time 0.514531
[TRT] Tactic: 2228481 time 0.556823
[TRT] Tactic: 2294017 time 0.618724
[TRT] Tactic: 2359553 time 0.46375
[TRT] Tactic: 2425089 time 0.597317
[TRT] Tactic: 2490625 time 0.591432
[TRT] Tactic: 2556161 time 0.557943
[TRT] Tactic: 2621697 time 0.493464
[TRT] Tactic: 2687233 time 0.539714
[TRT] Tactic: 6947073 time 0.533126
[TRT] Fastest Tactic: 1900801 Time: 0.462708
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: TiledPooling Tactic: 1900801
[TRT]
[TRT] *************** Autotuning format combination: Half(1,112,12544,802816) → Half(1,56,3136,200704) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 3) [Pooling] (Pooling)
[TRT] Tactic: -1 time 0.969557
[TRT] Fastest Tactic: -1 Time: 0.969557
[TRT] --------------- Timing Runner: (Unnamed Layer* 3) [Pooling] (TiledPooling)
[TRT] TiledPooling has no valid tactics for this config, skipping
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: Pooling Tactic: -1
[TRT]
[TRT] *************** Autotuning format combination: Half(1,112,12544:2,401408) → Half(1,56,3136:2,100352) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 3) [Pooling] (Pooling)
[TRT] Tactic: -3 time 0.527031
[TRT] Fastest Tactic: -3 Time: 0.527031
[TRT] --------------- Timing Runner: (Unnamed Layer* 3) [Pooling] (TiledPooling)
[TRT] Tactic: 257 time 0.690339
[TRT] Tactic: 65793 time 0.583542
[TRT] Tactic: 131329 time 1.18289
[TRT] Tactic: 196865 time 0.927135
[TRT] Tactic: 262401 time 0.748204
[TRT] Tactic: 327937 time 0.766355
[TRT] Tactic: 393473 time 0.810964
[TRT] Tactic: 459009 time 0.44987
[TRT] Tactic: 524545 time 0.387448
[TRT] Tactic: 590081 time 0.531667
[TRT] Tactic: 655617 time 0.578593
[TRT] Tactic: 721153 time 0.482682
[TRT] Tactic: 786689 time 0.450286
[TRT] Tactic: 852225 time 0.503489
[TRT] Tactic: 917761 time 0.385312
[TRT] Tactic: 983297 time 0.331927
[TRT] Tactic: 1048833 time 0.424739
[TRT] Tactic: 1114369 time 0.458881
[TRT] Tactic: 1179905 time 0.40987
[TRT] Tactic: 1245441 time 0.372891
[TRT] Tactic: 1310977 time 0.410052
[TRT] Tactic: 1376513 time 0.341328
[TRT] Tactic: 1442049 time 0.290547
[TRT] Tactic: 1507585 time 0.359193
[TRT] Tactic: 1573121 time 0.407344
[TRT] Tactic: 1638657 time 0.352343
[TRT] Tactic: 1704193 time 0.321797
[TRT] Tactic: 1769729 time 0.359114
[TRT] Tactic: 1835265 time 0.345156
[TRT] Tactic: 1900801 time 0.294635
[TRT] Tactic: 1966337 time 0.36276
[TRT] Tactic: 2031873 time 0.373073
[TRT] Tactic: 2097409 time 0.358204
[TRT] Tactic: 2162945 time 0.323619
[TRT] Tactic: 2228481 time 0.355287
[TRT] Tactic: 2294017 time 0.332397
[TRT] Tactic: 2359553 time 0.294609
[TRT] Tactic: 2425089 time 0.365573
[TRT] Tactic: 2490625 time 0.372187
[TRT] Tactic: 2556161 time 0.349948
[TRT] Tactic: 2621697 time 0.313151
[TRT] Tactic: 2687233 time 0.353281
[TRT] Tactic: 6947073 time 0.311875
[TRT] Fastest Tactic: 1442049 Time: 0.290547
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: TiledPooling Tactic: 1442049
[TRT]
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.190859
[TRT] Tactic: 0 time 0.309687
[TRT] Fastest Tactic: 1002 Time: 0.190859
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.622084
[TRT] Tactic: 0 time 0.248933
[TRT] Fastest Tactic: 0 Time: 0.248933
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.192683
[TRT] Tactic: 0 time 0.261953
[TRT] Fastest Tactic: 1002 Time: 0.192683
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.630183
[TRT] Tactic: 0 time 0.244011
[TRT] Fastest Tactic: 0 Time: 0.244011
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.915026
[TRT] Tactic: 0 time 0.22565
[TRT] Fastest Tactic: 0 Time: 0.22565
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.909636
[TRT] Tactic: 0 time 0.222058
[TRT] Fastest Tactic: 0 Time: 0.222058
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.191328
[TRT] Tactic: 0 time 0.309896
[TRT] Fastest Tactic: 1002 Time: 0.191328
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.617787
[TRT] Tactic: 0 time 0.248802
[TRT] Fastest Tactic: 0 Time: 0.248802
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.188151
[TRT] Tactic: 0 time 0.263568
[TRT] Fastest Tactic: 1002 Time: 0.188151
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.630261
[TRT] Tactic: 0 time 0.244323
[TRT] Fastest Tactic: 0 Time: 0.244323
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.909869
[TRT] Tactic: 0 time 0.226589
[TRT] Fastest Tactic: 0 Time: 0.226589
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.91224
[TRT] Tactic: 0 time 0.223047
[TRT] Fastest Tactic: 0 Time: 0.223047
[TRT] *************** Autotuning format combination: Float(1,56,3136,200704) → Float(1,56,3136,200704) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (LegacySASSConvolution)
[TRT] Tactic: 0 time 1.94354
[TRT] Tactic: 1 time 1.29154
[TRT] Fastest Tactic: 1 Time: 1.29154
[TRT] --------------- Timing Runner: (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (FusedConvActConvolution)
[TRT] Tactic: 7 time 2.06359
[TRT] Tactic: 10 time 2.34182
[TRT] Tactic: 14 time 2.03648
[TRT] Tactic: 15 time 2.27479
[TRT] Tactic: 25 time 2.24036
[TRT] Tactic: 26 time 2.69646
[TRT] Tactic: 29 time 2.33823
[TRT] Tactic: 30 time 2.28675
[TRT] Tactic: 33 time 2.32479
[TRT] Tactic: 36 time 2.90206
[TRT] Tactic: 39 time 2.82047
[TRT] Tactic: 41 time 2.37052
[TRT] Tactic: 42 time 5.76255
[TRT] Tactic: 43 time 4.22417
[TRT] Tactic: 45 time 2.13344
[TRT] Tactic: 47 time 2.31451
[TRT] Tactic: 52 time 4.14367
[TRT] Tactic: 54 time 2.09365
[TRT] Tactic: 56 time 4.31859
[TRT] Tactic: 66 time 2.19521
[TRT] Tactic: 76 time 2.06203
[TRT] Tactic: 90 time 1.9668
[TRT] Tactic: 93 time 2.01875
[TRT] Tactic: 98 time 2.30417
[TRT] Tactic: 104 time 2.05341
[TRT] Tactic: 110 time 3.08216
[TRT] Tactic: 119 time 3.09065
[TRT] Tactic: 121 time 2.0118
[TRT] Tactic: 130 time 2.14607
[TRT] Tactic: 134 time 2.50138
[TRT] Tactic: 136 time 2.00005
[TRT] Tactic: 137 time 2.20221
[TRT] Tactic: 139 time 2.08115
[TRT] Tactic: 144 time 2.32385
[TRT] Tactic: 149 time 3.61987
[TRT] Tactic: 151 time 2.48042
[TRT] Tactic: 152 time 2.21216
[TRT] Tactic: 153 time 2.20128
[TRT] Tactic: 156 time 1.95091
[TRT] Tactic: 159 time 2.17299
[TRT] Tactic: 162 time 2.72753
[TRT] Tactic: 164 time 2.19536
[TRT] Fastest Tactic: 156 Time: 1.95091
[TRT] --------------- Timing Runner: (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (CaskConvolution)
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1
[TRT] Tactic: 1062367460111450758 time 2.72703
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v1
[TRT] Tactic: 3827454225649558724 time 1.77786
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1
[TRT] Tactic: 4337000649858996379 time 1.93479
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1
[TRT] Tactic: 4501471010995462441 time 3.67549
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_small_nn_v1
[TRT] Tactic: 5137655947464784826 time 2.20482
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Tactic: 5921334924264294896 time 1.35542
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1
[TRT] Tactic: 6645123197870846056 time 1.88128
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[TRT] Tactic: 7852627285308570038 time 1.8168
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1
[TRT] Tactic: -9137461792520977713 time 4.15448
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1
[TRT] Tactic: -6092040395344634144 time 2.70086
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1
[TRT] Tactic: -3456450830548107839 time 2.55117
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_small_nn_v1
[TRT] Tactic: -410470605513481746 time 3.63945
[TRT] Fastest Tactic: 5921334924264294896 Time: 1.35542
[TRT] --------------- Timing Runner: (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (CudaConvolution)
[TRT] Tactic: 0 time 4.6731
[TRT] Tactic: 1 time 2.17698
[TRT] Tactic: 2 time 4.16143
[TRT] Tactic: 6 time 1.60896
[TRT] Fastest Tactic: 6 Time: 1.60896
[TRT] --------------- Timing Runner: (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (CudaDepthwiseConvolution)
[TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: LegacySASSConvolution Tactic: 1
[TRT]
[TRT] *************** Autotuning format combination: Half(1,56,3136,200704) → Half(1,56,3136,200704) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (FusedConvActConvolution)
[TRT] FusedConvActConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (CaskConvolution)
[TRT] CaskConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (CudaConvolution)
[TRT] Tactic: 0 time 5.00344
[TRT] Tactic: 1 time 2.19349
[TRT] Tactic: 2 time 4.76542
[TRT] Tactic: 6 time 1.93888
[TRT] Fastest Tactic: 6 Time: 1.93888
[TRT] --------------- Timing Runner: (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (CudaDepthwiseConvolution)
[TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: CudaConvolution Tactic: 6
[TRT]
[TRT] *************** Autotuning format combination: Half(1,56,3136:2,100352) → Half(1,56,3136:2,100352) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (LegacySASSConvolution)
[TRT] Tactic: 0 time 0.936588
[TRT] Fastest Tactic: 0 Time: 0.936588
[TRT] --------------- Timing Runner: (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (FusedConvActConvolution)
[TRT] Tactic: 7 time 1.14964
[TRT] Tactic: 10 time 1.3106
[TRT] Tactic: 14 time 1.10474
[TRT] Tactic: 15 time 1.30534
[TRT] Tactic: 25 time 1.29576
[TRT] Tactic: 26 time 6.60273
[TRT] Tactic: 29 time 1.11602
[TRT] Tactic: 30 time 1.20156
[TRT] Tactic: 33 time 1.3074
[TRT] Tactic: 36 time 2.46427
[TRT] Tactic: 39 time 1.59729
[TRT] Tactic: 41 time 1.23677
[TRT] Tactic: 42 time 3.05763
[TRT] Tactic: 43 time 2.74849
[TRT] Tactic: 45 time 1.23508
[TRT] Tactic: 47 time 1.26102
[TRT] Tactic: 52 time 2.75089
[TRT] Tactic: 54 time 1.19901
[TRT] Tactic: 56 time 2.47055
[TRT] Tactic: 66 time 1.21518
[TRT] Tactic: 76 time 1.15654
[TRT] Tactic: 90 time 1.06784
[TRT] Tactic: 93 time 1.17716
[TRT] Tactic: 98 time 1.19901
[TRT] Tactic: 104 time 1.08661
[TRT] Tactic: 110 time 1.41979
[TRT] Tactic: 119 time 1.56526
[TRT] Tactic: 121 time 1.18414
[TRT] Tactic: 130 time 1.25044
[TRT] Tactic: 134 time 2.16703
[TRT] Tactic: 136 time 1.14539
[TRT] Tactic: 137 time 1.23469
[TRT] Tactic: 139 time 1.19724
[TRT] Tactic: 144 time 1.19109
[TRT] Tactic: 149 time 2.15273
[TRT] Tactic: 151 time 1.36023
[TRT] Tactic: 152 time 1.24284
[TRT] Tactic: 153 time 1.21286
[TRT] Tactic: 156 time 1.14242
[TRT] Tactic: 159 time 1.26349
[TRT] Tactic: 162 time 1.46276
[TRT] Tactic: 164 time 1.17411
[TRT] Fastest Tactic: 90 Time: 1.06784
[TRT] --------------- Timing Runner: (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (CaskConvolution)
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1
[TRT] Tactic: 3564772625446233998 time 1.2245
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1
[TRT] Tactic: 3650389455493082349 time 1.26852
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Tactic: 4772821744921268633 time 0.750756
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1
[TRT] Tactic: 5319956359050645452 time 1.11195
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1
[TRT] Tactic: 7205456024582378848 time 0.972708
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1
[TRT] Tactic: -6490690591794140522 time 1.00065
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1
[TRT] Tactic: -4686027666808657977 time 1.92273
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1
[TRT] Tactic: -4212163711445252890 time 1.81034
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1
[TRT] Tactic: -3898373634979201110 time 1.85625
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1
[TRT] Tactic: -2409163523992614473 time 0.952942
[TRT] Fastest Tactic: 4772821744921268633 Time: 0.750756
[TRT] --------------- Timing Runner: (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (CudaConvolution)
[TRT] CudaConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (CudaDepthwiseConvolution)
[TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 4772821744921268633
[TRT] (Unnamed Layer* 4) [Convolution] + (Unnamed Layer* 6) [Activation] (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT]
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.19151
[TRT] Tactic: 0 time 0.310338
[TRT] Fastest Tactic: 1002 Time: 0.19151
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.619219
[TRT] Tactic: 0 time 0.249427
[TRT] Fastest Tactic: 0 Time: 0.249427
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.192708
[TRT] Tactic: 0 time 0.263593
[TRT] Fastest Tactic: 1002 Time: 0.192708
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.620234
[TRT] Tactic: 0 time 0.244792
[TRT] Fastest Tactic: 0 Time: 0.244792
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.909869
[TRT] Tactic: 0 time 0.225911
[TRT] Fastest Tactic: 0 Time: 0.225911
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.910833
[TRT] Tactic: 0 time 0.222708
[TRT] Fastest Tactic: 0 Time: 0.222708
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.191849
[TRT] Tactic: 0 time 0.310313
[TRT] Fastest Tactic: 1002 Time: 0.191849
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.616172
[TRT] Tactic: 0 time 0.248697
[TRT] Fastest Tactic: 0 Time: 0.248697
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.189401
[TRT] Tactic: 0 time 0.262838
[TRT] Fastest Tactic: 1002 Time: 0.189401
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.623334
[TRT] Tactic: 0 time 0.244088
[TRT] Fastest Tactic: 0 Time: 0.244088
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.905495
[TRT] Tactic: 0 time 0.22651
[TRT] Fastest Tactic: 0 Time: 0.22651
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.910182
[TRT] Tactic: 0 time 0.222864
[TRT] Fastest Tactic: 0 Time: 0.222864
[TRT] *************** Autotuning format combination: Float(1,56,3136,200704), Float(1,56,3136,200704) → Float(1,56,3136,200704) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (LegacySASSConvolution)
[TRT] Tactic: 0 time 2.01615
[TRT] Tactic: 1 time 1.33135
[TRT] Fastest Tactic: 1 Time: 1.33135
[TRT] --------------- Timing Runner: (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (CaskConvolution)
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1
[TRT] Tactic: 1062367460111450758 time 2.363
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v1
[TRT] Tactic: 3827454225649558724 time 1.84039
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1
[TRT] Tactic: 4337000649858996379 time 1.9426
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1
[TRT] Tactic: 4501471010995462441 time 4.27633
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_small_nn_v1
[TRT] Tactic: 5137655947464784826 time 1.8501
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Tactic: 5921334924264294896 time 1.38654
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1
[TRT] Tactic: 6645123197870846056 time 2.31081
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[TRT] Tactic: 7852627285308570038 time 2.72331
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1
[TRT] Tactic: -9137461792520977713 time 4.35734
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1
[TRT] Tactic: -6092040395344634144 time 2.45872
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1
[TRT] Tactic: -3456450830548107839 time 2.63297
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_small_nn_v1
[TRT] Tactic: -410470605513481746 time 4.07891
[TRT] Fastest Tactic: 5921334924264294896 Time: 1.38654
[TRT] --------------- Timing Runner: (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (CudaConvolution)
[TRT] Tactic: 0 time 4.85464
[TRT] Tactic: 1 time 2.33365
[TRT] Tactic: 2 time 4.62102
[TRT] Tactic: 6 time 2.35203
[TRT] Fastest Tactic: 1 Time: 2.33365
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: LegacySASSConvolution Tactic: 1
[TRT]
[TRT] *************** Autotuning format combination: Half(1,56,3136,200704), Half(1,56,3136,200704) → Half(1,56,3136,200704) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (CaskConvolution)
[TRT] CaskConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (CudaConvolution)
[TRT] Tactic: 0 time 5.12518
[TRT] Tactic: 1 time 2.31391
[TRT] Tactic: 2 time 4.10901
[TRT] Tactic: 6 time 2.55773
[TRT] Fastest Tactic: 1 Time: 2.31391
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: CudaConvolution Tactic: 1
[TRT]
[TRT] *************** Autotuning format combination: Half(1,56,3136:2,100352), Half(1,56,3136:2,100352) → Half(1,56,3136:2,100352) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (LegacySASSConvolution)
[TRT] Tactic: 0 time 0.947344
[TRT] Fastest Tactic: 0 Time: 0.947344
[TRT] --------------- Timing Runner: (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (CaskConvolution)
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1
[TRT] Tactic: 3564772625446233998 time 1.23398
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1
[TRT] Tactic: 3650389455493082349 time 1.29594
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Tactic: 4772821744921268633 time 0.764844
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1
[TRT] Tactic: 5319956359050645452 time 1.12466
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1
[TRT] Tactic: 7205456024582378848 time 0.9825
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1
[TRT] Tactic: -6490690591794140522 time 0.993281
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1
[TRT] Tactic: -4686027666808657977 time 1.90479
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1
[TRT] Tactic: -4212163711445252890 time 2.10896
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1
[TRT] Tactic: -3898373634979201110 time 1.87378
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1
[TRT] Tactic: -2409163523992614473 time 0.950469
[TRT] Fastest Tactic: 4772821744921268633 Time: 0.764844
[TRT] --------------- Timing Runner: (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (CudaConvolution)
[TRT] CudaConvolution has no valid tactics for this config, skipping
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 4772821744921268633
[TRT] (Unnamed Layer* 7) [Convolution] + (Unnamed Layer* 9) [ElementWise] + (Unnamed Layer* 10) [Activation] (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT]
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.192109
[TRT] Tactic: 0 time 0.309349
[TRT] Fastest Tactic: 1002 Time: 0.192109
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.615156
[TRT] Tactic: 0 time 0.248673
[TRT] Fastest Tactic: 0 Time: 0.248673
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.190781
[TRT] Tactic: 0 time 0.264089
[TRT] Fastest Tactic: 1002 Time: 0.190781
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.649166
[TRT] Tactic: 0 time 0.24461
[TRT] Fastest Tactic: 0 Time: 0.24461
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.907865
[TRT] Tactic: 0 time 0.225625
[TRT] Fastest Tactic: 0 Time: 0.225625
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.911433
[TRT] Tactic: 0 time 0.223672
[TRT] Fastest Tactic: 0 Time: 0.223672
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.19388
[TRT] Tactic: 0 time 0.309792
[TRT] Fastest Tactic: 1002 Time: 0.19388
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.619609
[TRT] Tactic: 0 time 0.248203
[TRT] Fastest Tactic: 0 Time: 0.248203
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.196016
[TRT] Tactic: 0 time 0.262161
[TRT] Fastest Tactic: 1002 Time: 0.196016
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.641641
[TRT] Tactic: 0 time 0.243854
[TRT] Fastest Tactic: 0 Time: 0.243854
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.908542
[TRT] Tactic: 0 time 0.226146
[TRT] Fastest Tactic: 0 Time: 0.226146
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.908985
[TRT] Tactic: 0 time 0.222135
[TRT] Fastest Tactic: 0 Time: 0.222135
[TRT] *************** Autotuning format combination: Float(1,56,3136,200704) → Float(1,56,3136,200704) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 11) [Convolution] + (Unnamed Layer* 13) [Activation] (LegacySASSConvolution)
[TRT] Tactic: 0 time 2.34878
[TRT] Tactic: 1 time 1.29951
[TRT] Fastest Tactic: 1 Time: 1.29951
[TRT] --------------- Timing Runner: (Unnamed Layer* 11) [Convolution] + (Unnamed Layer* 13) [Activation] (FusedConvActConvolution)
[TRT] Tactic: 7 time 2.06341
[TRT] Tactic: 10 time 2.34875
[TRT] Tactic: 14 time 2.03273
[TRT] Tactic: 15 time 2.67888
[TRT] Tactic: 25 time 2.25081
[TRT] Tactic: 26 time 2.76031
[TRT] Tactic: 29 time 1.93945
[TRT] Tactic: 30 time 2.26245
[TRT] Tactic: 33 time 2.69802
[TRT] Tactic: 36 time 2.89164
[TRT] Tactic: 39 time 2.83456
[TRT] Tactic: 41 time 2.30802
[TRT] Tactic: 42 time 5.74539
[TRT] Tactic: 43 time 4.22385
[TRT] Tactic: 45 time 2.41937
[TRT] Tactic: 47 time 2.3131
[TRT] Tactic: 52 time 4.15846
[TRT] Tactic: 54 time 2.10625
[TRT] Tactic: 56 time 4.1531
[TRT] Tactic: 66 time 2.20508
[TRT] Tactic: 76 time 2.05971
[TRT] Tactic: 90 time 2.03971
[TRT] Tactic: 93 time 2.05716
[TRT] Tactic: 98 time 2.27562
[TRT] Tactic: 104 time 2.0668
[TRT] Tactic: 110 time 2.67742
[TRT] Tactic: 119 time 3.08185
[TRT] Tactic: 121 time 2.02573
[TRT] Tactic: 130 time 2.14586
[TRT] Tactic: 134 time 2.5637
[TRT] Tactic: 136 time 2.43234
[TRT] Tactic: 137 time 2.20964
[TRT] Tactic: 139 time 2.11221
[TRT] Tactic: 144 time 2.33901
[TRT] Tactic: 149 time 3.62807
[TRT] Tactic: 151 time 2.44422
[TRT] Tactic: 152 time 2.14505
[TRT] Tactic: 153 time 2.18659
[TRT] Tactic: 156 time

The training logs:-

jetbot@jetbot-desktop:~/jetson-inference/python/training/classification$ python3.6 train.py --model-dir=cat_dog_limited ~/datasets/cat_dog_limited
Use GPU: 0 for training
=> dataset classes: 2 [‘cat’, ‘dog’]
=> using pre-trained model ‘resnet18’
=> reshaped ResNet fully-connected layer with: Linear(in_features=512, out_features=2, bias=True)
Epoch: [0][0/2] Time 516.601 (516.601) Data 66.606 (66.606) Loss 6.9412e-01 (6.9412e-01) Acc@1 50.00 ( 50.00) Acc@5 100.00 (100.00)
Epoch: [0] completed, elapsed time 548.319 seconds
Test: [0/2] Time 49.306 (49.306) Loss 0.0000e+00 (0.0000e+00) Acc@1 100.00 (100.00) Acc@5 100.00 (100.00)

 * Acc@1 50.000 Acc@5 100.000
saved best model to: cat_dog_limited/model_best.pth.tar
Epoch: [1][0/2] Time 20.410 (20.410) Data 16.897 (16.897) Loss 5.6614e+00 (5.6614e+00) Acc@1 62.50 ( 62.50) Acc@5 100.00 (100.00)
Epoch: [1] completed, elapsed time 22.661 seconds
Test: [0/2] Time 5.135 ( 5.135) Loss 1.3482e+05 (1.3482e+05) Acc@1 0.00 ( 0.00) Acc@5 100.00 (100.00)
 * Acc@1 50.000 Acc@5 100.000
saved checkpoint to: cat_dog_limited/checkpoint.pth.tar
Epoch: [2][0/2] Time 3.245 ( 3.245) Data 2.318 ( 2.318) Loss 2.2078e+01 (2.2078e+01) Acc@1 25.00 ( 25.00) Acc@5 100.00 (100.00)
Epoch: [2] completed, elapsed time 4.477 seconds
Test: [0/2] Time 2.568 ( 2.568) Loss 0.0000e+00 (0.0000e+00) Acc@1 100.00 (100.00) Acc@5 100.00 (100.00)
 * Acc@1 50.000 Acc@5 100.000
saved checkpoint to: cat_dog_limited/checkpoint.pth.tar
Epoch: [3][0/2] Time 3.569 ( 3.569) Data 2.285 ( 2.285) Loss 1.3840e+01 (1.3840e+01) Acc@1 37.50 ( 37.50) Acc@5 100.00 (100.00)
Epoch: [3] completed, elapsed time 4.670 seconds
Test: [0/2] Time 4.108 ( 4.108) Loss 1.2493e+09 (1.2493e+09) Acc@1 0.00 ( 0.00) Acc@5 100.00 (100.00)
 * Acc@1 50.000 Acc@5 100.000
saved checkpoint to: cat_dog_limited/checkpoint.pth.tar
Epoch: [4][0/2] Time 22.671 (22.671) Data 22.144 (22.144) Loss 2.5295e+00 (2.5295e+00) Acc@1 75.00 ( 75.00) Acc@5 100.00 (100.00)
Epoch: [4] completed, elapsed time 23.865 seconds
Test: [0/2] Time 1.083 ( 1.083) Loss 3.5332e+07 (3.5332e+07) Acc@1 0.00 ( 0.00) Acc@5 100.00 (100.00)
 * Acc@1 50.000 Acc@5 100.000
saved checkpoint to: cat_dog_limited/checkpoint.pth.tar
jetbot@jetbot-desktop:~/jetson-inference/python/training/classification$

The losses under test are high and it doesn’t classify any of the images correctly (Acc@1 is zero). Perhaps the model needs more training epochs?

Thanks Dusty for the information.

I tried training on a bigger dataset (the cat and dog sample) for 35 epochs. It did not complete and failed with the error below.


jetbot@jetbot-desktop:~/jetson-inference/python/training/classification$ python3.6 train.py --model-dir=cat_dog ~/datasets/cat_dog
Use GPU: 0 for training
=> dataset classes: 2 [‘cat’, ‘dog’]
=> using pre-trained model ‘resnet18’
Downloading: “https://download.pytorch.org/models/resnet18-5c106cde.pth” to /home/jetbot/.cache/torch/checkpoints/resnet18-5c106cde.pth
100.0%
=> reshaped ResNet fully-connected layer with: Linear(in_features=512, out_features=2, bias=True)
Epoch: [0][ 0/489] Time 36.110 (36.110) Data 1.174 ( 1.174) Loss 5.6194e-01 (5.6194e-01) Acc@1 87.50 ( 87.50) Acc@5 100.00 (100.00)
Epoch: [0][ 10/489] Time 0.676 ( 3.880) Data 0.000 ( 0.112) Loss 1.6232e+01 (9.9222e+00) Acc@1 50.00 ( 57.95) Acc@5 100.00 (100.00)
Epoch: [0][ 20/489] Time 0.672 ( 2.352) Data 0.000 ( 0.078) Loss 1.1748e+01 (1.4633e+01) Acc@1 50.00 ( 53.57) Acc@5 100.00 (100.00)
Epoch: [0][ 30/489] Time 0.682 ( 1.817) Data 0.000 ( 0.066) Loss 1.9839e+00 (1.3597e+01) Acc@1 87.50 ( 52.82) Acc@5 100.00 (100.00)
Epoch: [0][ 40/489] Time 0.664 ( 1.993) Data 0.001 ( 0.518) Loss 5.7024e+00 (1.1815e+01) Acc@1 37.50 ( 49.09) Acc@5 100.00 (100.00)
Epoch: [0][ 50/489] Time 0.675 ( 1.736) Data 0.000 ( 0.425) Loss 6.5811e-01 (1.0590e+01) Acc@1 62.50 ( 49.02) Acc@5 100.00 (100.00)
Epoch: [0][ 60/489] Time 0.713 ( 1.565) Data 0.000 ( 0.363) Loss 9.9551e-01 (1.0330e+01) Acc@1 62.50 ( 51.23) Acc@5 100.00 (100.00)
Epoch: [0][ 70/489] Time 0.692 ( 1.441) Data 0.000 ( 0.317) Loss 1.0977e+00 (1.0316e+01) Acc@1 37.50 ( 51.41) Acc@5 100.00 (100.00)
Epoch: [0][ 80/489] Time 0.725 ( 1.347) Data 0.000 ( 0.283) Loss 5.2664e-01 (9.4582e+00) Acc@1 50.00 ( 49.69) Acc@5 100.00 (100.00)
Epoch: [0][ 90/489] Time 0.689 ( 1.274) Data 0.000 ( 0.257) Loss 7.3769e-01 (8.5516e+00) Acc@1 62.50 ( 50.00) Acc@5 100.00 (100.00)
Epoch: [0][100/489] Time 0.667 ( 1.215) Data 0.000 ( 0.235) Loss 6.9083e-01 (7.7836e+00) Acc@1 50.00 ( 50.12) Acc@5 100.00 (100.00)
Epoch: [0][110/489] Time 0.676 ( 1.168) Data 0.000 ( 0.218) Loss 6.9221e-01 (7.1563e+00) Acc@1 62.50 ( 50.56) Acc@5 100.00 (100.00)
Epoch: [0][120/489] Time 0.661 ( 1.128) Data 0.000 ( 0.203) Loss 8.0697e-01 (6.6239e+00) Acc@1 25.00 ( 50.62) Acc@5 100.00 (100.00)
Traceback (most recent call last):
File “train.py”, line 506, in <module>
main()
File “train.py”, line 135, in main
main_worker(args.gpu, ngpus_per_node, args)
File “train.py”, line 277, in main_worker
train(train_loader, model, criterion, optimizer, epoch, num_classes, args)
File “train.py”, line 321, in train
for i, (images, target) in enumerate(train_loader):
File “/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py”, line 819, in __next__
return self._process_data(data)
File “/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py”, line 846, in _process_data
data.reraise()
File “/usr/local/lib/python3.6/dist-packages/torch/_utils.py”, line 369, in reraise
raise self.exc_type(msg)
OSError: Caught OSError in DataLoader worker process 0.
Original Traceback (most recent call last):
File “/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py”, line 178, in _worker_loop
data = fetcher.fetch(index)
File “/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py”, line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File “/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py”, line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File “/usr/local/lib/python3.6/dist-packages/torchvision-0.4.1a0+a263704-py3.6-linux-aarch64.egg/torchvision/datasets/folder.py”, line 138, in __getitem__
sample = self.loader(path)
File “/usr/local/lib/python3.6/dist-packages/torchvision-0.4.1a0+a263704-py3.6-linux-aarch64.egg/torchvision/datasets/folder.py”, line 174, in default_loader
return pil_loader(path)
File “/usr/local/lib/python3.6/dist-packages/torchvision-0.4.1a0+a263704-py3.6-linux-aarch64.egg/torchvision/datasets/folder.py”, line 157, in pil_loader
return img.convert(‘RGB’)
File “/usr/local/lib/python3.6/dist-packages/PIL/Image.py”, line 930, in convert
self.load()
File “/usr/local/lib/python3.6/dist-packages/PIL/ImageFile.py”, line 249, in load
“(%d bytes not processed)” % len(b)
OSError: image file is truncated (23 bytes not processed)

jetbot@jetbot-desktop:~/jetson-inference/python/training/classification$


The pytorch, torchvision, tensorrt and tensorflow versions are:

jetbot@jetbot-desktop:~$ python3.6 -c "import tensorflow; print(tensorflow.__version__)"
2020-01-13 23:20:50.242791: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
1.14.0
jetbot@jetbot-desktop:~$ python3.6 -c "import torch; print(torch.__version__)"
1.2.0a0+8554416
jetbot@jetbot-desktop:~$ python3.6 -c "import torchvision; print(torchvision.__version__)"
0.4.1a0+a263704
jetbot@jetbot-desktop:~$ python3.6 -c "import tensorrt; print(tensorrt.__version__)"
6.0.1.10
jetbot@jetbot-desktop:~$

It appears that one of the images in your copy of the dataset is corrupted. Sometimes I run this script to find images that are corrupt:

https://github.com/dusty-nv/pytorch-segmentation/blob/master/datasets/corrupt_images.py
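
For readers who can't run the linked script, here is a rough stand-in. It is not the linked corrupt_images.py, just a minimal stdlib-only sketch that assumes the dataset is made of JPEG files and flags any file that does not begin with the JPEG start marker or end with the JPEG end marker. The function names are made up for illustration. It is only a heuristic: some valid JPEGs carry trailing bytes after the end marker and would be flagged, and fully decoding each image with PIL (as the training loader in the traceback above does) is the more thorough check.

```python
import os

JPEG_SOI = b"\xff\xd8"  # start-of-image marker, first two bytes of a JPEG
JPEG_EOI = b"\xff\xd9"  # end-of-image marker, normally the last two bytes


def looks_truncated_jpeg(path):
    """Return True if the file lacks the JPEG start or end marker."""
    with open(path, "rb") as f:
        head = f.read(2)
        f.seek(0, os.SEEK_END)
        if f.tell() < 4:
            # Too small to hold both markers.
            return True
        f.seek(-2, os.SEEK_END)
        tail = f.read(2)
    return head != JPEG_SOI or tail != JPEG_EOI


def find_suspect_jpegs(root):
    """Walk `root` recursively and return sorted paths of suspect JPEGs."""
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith((".jpg", ".jpeg")):
                path = os.path.join(dirpath, name)
                if looks_truncated_jpeg(path):
                    suspects.append(path)
    return sorted(suspects)
```

Calling `find_suspect_jpegs(os.path.expanduser("~/datasets/cat_dog"))` returns a list of paths to inspect by hand before deleting anything.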

Hi Dusty,

corrupt_images.py was useful. It detected one corrupt image in the dataset. I removed it from the set and was then able to run the training for 35 epochs. It took about 5 hours to complete.

I then converted the model to ONNX.

When I tested this ONNX model with imagenet-console, it still failed. The logs are below…

jetbot@jetbot-desktop:~/jetson-inference/python/training/classification$ ./imagenet-console --model=/home/jetbot/shankar/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=/home/jetbot/datasets/cat_dog/labels.txt ~/datasets/cat_dog/test/cat/02.jpg

imageNet – loading classification network model from:
– prototxt (null)
– model /home/jetbot/shankar/resnet18.onnx
– class_labels /home/jetbot/datasets/cat_dog/labels.txt
– input_blob ‘input_0’
– output_blob ‘output_0’
– batch_size 1

[TRT] TensorRT version 6.0.1
[TRT] loading NVIDIA plugins…
[TRT] Plugin Creator registration succeeded - GridAnchor_TRT
[TRT] Plugin Creator registration succeeded - GridAnchorRect_TRT
[TRT] Plugin Creator registration succeeded - NMS_TRT
[TRT] Plugin Creator registration succeeded - Reorg_TRT
[TRT] Plugin Creator registration succeeded - Region_TRT
[TRT] Plugin Creator registration succeeded - Clip_TRT
[TRT] Plugin Creator registration succeeded - LReLU_TRT
[TRT] Plugin Creator registration succeeded - PriorBox_TRT
[TRT] Plugin Creator registration succeeded - Normalize_TRT
[TRT] Plugin Creator registration succeeded - RPROI_TRT
[TRT] Plugin Creator registration succeeded - BatchedNMS_TRT
[TRT] Could not register plugin creator: FlattenConcat_TRT in namespace:
[TRT] completed loading NVIDIA plugins.
[TRT] detected model format - ONNX (extension ‘.onnx’)
[TRT] desired precision specified for GPU: FASTEST
[TRT] requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT] native precisions detected for GPU: FP32, FP16
[TRT] selecting fastest native precision for GPU: FP16
[TRT] attempting to open engine cache file /home/jetbot/shankar/resnet18.onnx.1.1.GPU.FP16.engine
[TRT] loading network profile from engine cache… /home/jetbot/shankar/resnet18.onnx.1.1.GPU.FP16.engine
[TRT] device GPU, /home/jetbot/shankar/resnet18.onnx loaded
[TRT] Deserialize required 5869256 microseconds.
[TRT] device GPU, CUDA engine context initialized with 2 bindings
[TRT] binding – index 0
– name ‘input_0’
– type FP32
– in/out INPUT
– # dims 3
– dim #0 3 (SPATIAL)
– dim #1 224 (SPATIAL)
– dim #2 224 (SPATIAL)
[TRT] binding – index 1
– name ‘output_0’
– type FP32
– in/out OUTPUT
– # dims 1
– dim #0 2 (SPATIAL)
[TRT] binding to input 0 input_0 binding index: 0
[TRT] binding to input 0 input_0 dims (b=1 c=3 h=224 w=224) size=602112
[TRT] binding to output 0 output_0 binding index: 1
[TRT] binding to output 0 output_0 dims (b=1 c=2 h=1 w=1) size=8
device GPU, /home/jetbot/shankar/resnet18.onnx initialized.
[TRT] /home/jetbot/shankar/resnet18.onnx loaded
imageNet – loaded 2 class info entries
/home/jetbot/shankar/resnet18.onnx initialized.
[image] loaded ‘/home/jetbot/datasets/cat_dog/test/cat/02.jpg’ (620 x 410, 1 channels)
class 0000 - nan (cat)
class 0001 - nan (dog)
imagenet-console: failed to classify ‘/home/jetbot/datasets/cat_dog/test/cat/02.jpg’ (result=-1)
imagenet-console: shutting down…
imagenet-console: shutdown complete
jetbot@jetbot-desktop:~/jetson-inference/python/training/classification$

It appears that you are getting NaN outputs from your model (see the “class 0000 - nan (cat)” and “class 0001 - nan (dog)” lines in your log).

When you trained it with PyTorch, what was the loss and accuracy of the model?

Can you upload your trained PyTorch model checkpoint that I could try?

I had been copying the resnet18.onnx file to another directory and passing that new path as the input argument to the imagenet-console command.

I tried giving the path “/home/jetbot/jetson-inference/python/training/classification/cat_dog” instead, and it worked. Feeling good that it worked.

The difference is that the model checkpoint is also in this directory.
Is there any dependency on it?

Oh ok, great - glad it is working for you! Thanks for sticking with it there.

There isn’t any runtime dependency on the PyTorch model checkpoint from the imagenet-console or imagenet-camera programs. After the onnx_export.py script is run, the PyTorch checkpoint isn’t used anymore (for inferencing).

The jetson-inference library checks whether a serialized .engine file already exists in the directory of the model you specified (in your case, /home/jetbot/shankar/resnet18.onnx.1.1.GPU.FP16.engine). This way, it doesn’t have to convert the model to TensorRT every time the program loads (which can take a few minutes the first time). My guess is that this serialized .engine file corresponded to a previous version of your ONNX model (i.e., the program was using one of your older models), so when you copied your newer resnet18.onnx model to this directory, imagenet-console kept loading the previous .engine file.

You could try deleting /home/jetbot/shankar/resnet18.onnx.1.1.GPU.FP16.engine and see if it can load/run your most recent resnet18.onnx model properly from that directory.
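
If you copy models around often, a small helper can do this cleanup for you. This is only a sketch under two assumptions: cached engines are named `<model>.onnx.<something>.engine` next to the model (inferred from the one filename in the log above, not from the library source), and an engine whose modification time is older than the ONNX file is stale. The function names are hypothetical and not part of jetson-inference.

```python
import glob
import os


def find_stale_engines(onnx_path):
    """Return cached engine files that are older than the ONNX model."""
    model_mtime = os.path.getmtime(onnx_path)
    # Assumed naming scheme: <model>.onnx.<...>.engine in the same directory.
    pattern = glob.escape(onnx_path) + ".*.engine"
    return sorted(
        path for path in glob.glob(pattern)
        if os.path.getmtime(path) < model_mtime
    )


def remove_stale_engines(onnx_path):
    """Delete stale engine files and return the paths that were removed."""
    stale = find_stale_engines(onnx_path)
    for path in stale:
        os.remove(path)
    return stale
```

Running `remove_stale_engines("/home/jetbot/shankar/resnet18.onnx")` before imagenet-console would make the next launch rebuild the engine from the current model. The timestamp test will miss a stale engine if the model was copied with its original timestamps preserved (for example with `cp -p`), in which case deleting the .engine file by hand is the safer route.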