Object Detection on GPUs in 10 Minutes

Originally published at: https://developer.nvidia.com/blog/object-detection-gpus-10-minutes/

Object detection remains the primary driver for applications such as autonomous driving and intelligent video analytics. Object detection applications require substantial training on vast datasets to achieve high levels of accuracy. NVIDIA GPUs excel at the parallel compute performance required to train large networks and generate models for object detection inference. This post…

Thank you for the wonderful read. Did you test this using the NVIDIA Jetson Nano?

In my case (running it with a ZED Stereolabs camera), the output is as follows:

root@d86527a4fbaa:/mnt# python SSD_Model/detect_objects_webcam.py
WARNING: Logging before flag parsing goes to stderr.
W0721 12:00:52.702791 140176104285952 deprecation_wrapper.py:119] From /usr/lib/python3.5/dist-packages/graphsurgeon/_utils.py:2: The name tf.NodeDef is deprecated. Please use tf.compat.v1.NodeDef instead.

W0721 12:00:52.703287 140176104285952 deprecation_wrapper.py:119] From /usr/lib/python3.5/dist-packages/graphsurgeon/DynamicGraph.py:4: The name tf.GraphDef is deprecated. Please use tf.compat.v1.GraphDef instead.

TensorRT inference engine settings:
* Inference precision - DataType.FLOAT
* Max batch size - 1

Loading cached TensorRT engine from /mnt/SSD_Model/utils/../workspace/engines/FLOAT/engine_bs_1.buf
TRT ENGINE PATH /mnt/SSD_Model/utils/../workspace/engines/FLOAT/engine_bs_1.buf
Running webcam: True
Segmentation fault (core dumped)
root@d86527a4fbaa:/mnt#

Environment: Ubuntu 18.04
Second attempt: ./setup_environment.sh
Setting environment variables for the webcam
non-network local connections being added to access control list
Downloading VOC dataset
--2019-07-21 19:04:05-- http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
Resolving host.robots.ox.ac.uk (host.robots.ox.ac.uk)... 129.67.94.152
Connecting to host.robots.ox.ac.uk (host.robots.ox.ac.uk)|129.67.94.152|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 451020800 (430M) [application/x-tar]
Saving to: ‘VOCtest_06-Nov-2007.tar’

VOCtest_06-Nov-2007 100%[===================>] 430.13M 12.6MB/s in 37s

2019-07-21 19:04:42 (11.6 MB/s) - ‘VOCtest_06-Nov-2007.tar’ saved [451020800/451020800]

Dockerfile has already been built
Starting docker container

=====================
== NVIDIA TensorRT ==
=====================

NVIDIA Release 19.05 (build 6392482)

NVIDIA TensorRT 5.1.5 (c) 2016-2019, NVIDIA CORPORATION. All rights reserved.
Container image (c) 2019, NVIDIA CORPORATION. All rights reserved.

https://developer.nvidia.com/tensorrt

To install Python sample dependencies, run /opt/tensorrt/python/python_setup.sh

root@efe2ba640ec8:/mnt# python SSD_Model/detect_objects_webcam.py
WARNING: Logging before flag parsing goes to stderr.
W0721 12:05:20.935808 140649279698688 deprecation_wrapper.py:119] From /usr/lib/python3.5/dist-packages/graphsurgeon/_utils.py:2: The name tf.NodeDef is deprecated. Please use tf.compat.v1.NodeDef instead.

W0721 12:05:20.937713 140649279698688 deprecation_wrapper.py:119] From /usr/lib/python3.5/dist-packages/graphsurgeon/DynamicGraph.py:4: The name tf.GraphDef is deprecated. Please use tf.compat.v1.GraphDef instead.

Preparing pretrained model
Downloading /mnt/SSD_Model/utils/../workspace/models/ssd_inception_v2_coco_2017_11_17.tar.gz
Download progress [==================================================] 100%
Download complete
Unpacking /mnt/SSD_Model/utils/../workspace/models/ssd_inception_v2_coco_2017_11_17.tar.gz
Extracting complete
Removing /mnt/SSD_Model/utils/../workspace/models/ssd_inception_v2_coco_2017_11_17.tar.gz
Model ready
W0721 12:05:35.382501 140649279698688 deprecation_wrapper.py:119] From /usr/lib/python3.5/dist-packages/graphsurgeon/StaticGraph.py:125: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

WARNING: To create TensorRT plugin nodes, please use the `create_plugin_node` function instead.
NOTE: UFF has been tested with TensorFlow 1.12.0. Other versions are not guaranteed to work
WARNING: The version of TensorFlow installed on this system is not guaranteed to work with UFF.
UFF Version 0.6.3
=== Automatically deduced input nodes ===
[name: "Input"
op: "Placeholder"
attr {
  key: "dtype"
  value {
    type: DT_FLOAT
  }
}
attr {
  key: "shape"
  value {
    shape {
      dim {
        size: 1
      }
      dim {
        size: 3
      }
      dim {
        size: 300
      }
      dim {
        size: 300
      }
    }
  }
}
]
=========================================

Using output node NMS
Converting to UFF graph
Warning: No conversion function registered for layer: NMS_TRT yet.
Converting NMS as custom op: NMS_TRT
W0721 12:05:35.967093 140649279698688 deprecation_wrapper.py:119] From /usr/lib/python3.5/dist-packages/uff/converters/tensorflow/converter.py:179: The name tf.AttrValue is deprecated. Please use tf.compat.v1.AttrValue instead.

Warning: No conversion function registered for layer: FlattenConcat_TRT yet.
Converting concat_box_conf as custom op: FlattenConcat_TRT
Warning: No conversion function registered for layer: GridAnchor_TRT yet.
Converting GridAnchor as custom op: GridAnchor_TRT
Warning: No conversion function registered for layer: FlattenConcat_TRT yet.
Converting concat_box_loc as custom op: FlattenConcat_TRT
No. nodes: 563
UFF Output written to /mnt/SSD_Model/utils/../workspace/models/ssd_inception_v2_coco_2017_11_17/frozen_inference_graph.uff
UFF Text Output written to /mnt/SSD_Model/utils/../workspace/models/ssd_inception_v2_coco_2017_11_17/frozen_inference_graph.pbtxt
TensorRT inference engine settings:
* Inference precision - DataType.FLOAT
* Max batch size - 1

Building TensorRT engine. This may take few minutes.

And it works!
It was probably the cache that needed to be removed.
RESOLVED.
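For anyone hitting the same segfault, clearing the stale engine so it gets rebuilt can be sketched like this (a minimal sketch; the path is taken from the log output above and may differ in your workspace):

```shell
#!/bin/sh
# clear_engine removes a cached TensorRT engine file so the next run
# rebuilds it from scratch, and prints what it did.
clear_engine() {
    if [ -f "$1" ]; then
        rm "$1" && echo "removed $1"
    else
        echo "no cached engine at $1"
    fi
}

# Path taken from the log above; adjust to your own workspace layout.
clear_engine "SSD_Model/workspace/engines/FLOAT/engine_bs_1.buf"
```

On the next invocation of detect_objects_webcam.py, the missing cache forces the "Building TensorRT engine" path instead of loading the old buffer.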
However, is there a way to run it on an NVIDIA Jetson Xavier device, e.g. without the Docker part?

Anytime @disqus_z5z4JTinTV,

I haven't tried running it on the Jetson Nano, but I assume that once you get Docker up and running, the setup should be very similar.

Hey @disqus_n4MSuSh2Pt:disqus ,

Glad to hear you were able to resolve this by clearing the cache.

I haven't tried running on Jetson Xavier without Docker, but to start, I would open the Dockerfile I provided on GitHub and try to install those packages manually. If you can do that successfully, you will have the same environment and should be able to run the code.
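One hedged way to start that process is to pull the install commands out of the Dockerfile so each one can be reviewed and run by hand on the Jetson (the `DOCKERFILE` path is an assumption; point it at the Dockerfile from the repo; multi-line RUN instructions with trailing backslashes are not handled by this sketch):

```shell
#!/bin/sh
# extract_run_steps prints the shell command from every single-line RUN
# instruction in a Dockerfile, so the steps can be reviewed and executed
# manually on a machine where Docker is not used.
extract_run_steps() {
    grep -E '^RUN ' "$1" | sed 's/^RUN //'
}

# DOCKERFILE is an assumption; point it at the Dockerfile from the repo.
DOCKERFILE="${DOCKERFILE:-Dockerfile}"
[ -f "$DOCKERFILE" ] && extract_run_steps "$DOCKERFILE" || true
```

Note that base-image packages (everything implied by the FROM line) still have to be accounted for separately, and Jetson boards need ARM builds of TensorRT and friends via JetPack rather than the x86 packages a desktop Dockerfile installs.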

I pulled and ran the Docker container successfully using ./setup_environment.sh, but when I ran 'python SSD_Model/detect_objects_webcam.py' as instructed, I got the following error message:

root@b54ce065632c:/mnt# python SSD_Model/detect_objects_webcam.py
Traceback (most recent call last):
  File "SSD_Model/detect_objects_webcam.py", line 12, in <module>
    import utils.inference as inference_utils # TRT/TF inference wrappers
  File "/mnt/SSD_Model/utils/inference.py", line 57, in <module>
    import pycuda.autoinit
  File "/usr/local/lib/python3.5/dist-packages/pycuda/autoinit.py", line 9, in <module>
    context = make_default_context()
  File "/usr/local/lib/python3.5/dist-packages/pycuda/tools.py", line 204, in make_default_context
    "on any of the %d detected devices" % ndevices)
RuntimeError: make_default_context() wasn't able to create a context on any of the 1 detected devices

Any clue about this error?

Hey @disqus_7m7lUq2eB8,

Looks like it's having a hard time finding a device where it can create a CUDA context. Perhaps it's not detecting the GPU in your machine? I would run nvidia-smi from within the container to make sure that Docker is seeing your GPU. It should have been detected automatically, but if it was not, you may have to pass it to Docker manually using the --gpus flag. Let me know how that goes.
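That check could be sketched as follows (the image tag is an assumption based on the TensorRT 19.05 banner in the logs above; substitute the image name your setup script actually uses):

```shell
#!/bin/sh
# gpu_visible runs nvidia-smi inside a container to confirm Docker can
# see the GPU. Requires Docker 19.03+ with the NVIDIA Container Toolkit
# installed on the host.
# IMAGE is an assumption taken from the TensorRT 19.05 banner above.
IMAGE="${IMAGE:-nvcr.io/nvidia/tensorrt:19.05-py3}"

gpu_visible() {
    command -v docker >/dev/null 2>&1 || { echo "docker not installed"; return 1; }
    docker run --rm --gpus all "$IMAGE" nvidia-smi
}

# Uncomment on a machine with Docker and an NVIDIA GPU:
# gpu_visible
```

If nvidia-smi prints the driver and GPU table, the container sees the device and the pycuda context error points elsewhere; if it fails, the NVIDIA Container Toolkit or the --gpus flag is the place to look.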

Better to work with CPUs. Look at the OpenVINO framework from Intel. It is more robust and has better support.
On the NVIDIA forums nobody replies and nobody provides support. After 6 months of waiting, I finally decided to move away from NVIDIA to Intel.

Hate you Nvidia. You wasted my entire year with your pathetic drivers and pathetic frameworks.

Will this work on my own custom object detection model?