Building CUDA engine never completes

Greetings…

I’ve been attempting to run through this tutorial with my newly acquired Jetson Nano dev board.
https://github.com/dusty-nv/jetson-inference/blob/master/docs/imagenet-console-2.md

I’m powering the board via the barrel jack connector and made sure nvpmodel was set to mode 0.

When running ./imagenet-console orange_0.jpg output_0.jpg, it gets to the point where it says “building CUDA engine, this could take a few minutes…” and then nothing happens after. It never completes.

I’m not getting a crash or anything like I’ve heard other people complain about, but I’ve let this run for hours and it never actually succeeds. (Hitting Ctrl+C does eventually break out to the console.)

Logs:

imagenet-console
  args (3):  0 [./imagenet-console]  1 [orange_0.jpg]  2 [output_0.jpg]


imageNet -- loading classification network model from:
         -- prototxt     networks/googlenet.prototxt
         -- model        networks/bvlc_googlenet.caffemodel
         -- class_labels networks/ilsvrc12_synset_words.txt
         -- input_blob   'data'
         -- output_blob  'prob'
         -- batch_size   2

[TRT]  TensorRT version 5.0.6
[TRT]  detected model format - caffe  (extension '.caffemodel')
[TRT]  desired precision specified for GPU: FASTEST
[TRT]  requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT]  native precisions detected for GPU:  FP32, FP16
[TRT]  selecting fastest native precision for GPU:  FP16
[TRT]  attempting to open engine cache file networks/bvlc_googlenet.caffemodel.2.1.GPU.FP16.engine
[TRT]  cache file not found, profiling network model on device GPU
[TRT]  device GPU, loading networks/googlenet.prototxt networks/bvlc_googlenet.caffemodel
[TRT]  retrieved Output tensor "prob":  1000x1x1
[TRT]  retrieved Input tensor "data":  3x224x224
[TRT]  device GPU, configuring CUDA engine
[TRT]  device GPU, building FP16:  ON
[TRT]  device GPU, building INT8:  OFF
[TRT]  device GPU, building CUDA engine (this may take a few minutes the first time a network is loaded)

Hmm, it should only take up to 5 minutes or so on Nano to build the CUDA engine for Googlenet. Can you try running “sudo tegrastats” in the background and monitoring the system for activity?
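Since the engine build spends most of its time running timing kernels on the GPU, the GR3D (GPU) load reported by tegrastats should sit well above 0% while it’s genuinely working; CPU activity alone doesn’t tell you much. Here’s a rough sketch of pulling the relevant fields out of a tegrastats line — note the sample line below is an assumption for illustration, and the exact field format varies by JetPack release:

```shell
# Sample tegrastats output line (format is an assumption; it differs across JetPack releases)
line='RAM 1924/3964MB (lfb 4x2MB) CPU [41%@1428,12%@1428,9%@1428,7%@1428] GR3D_FREQ 99%@921'

# Extract memory usage -- running out of RAM during the build can stall the Nano
echo "$line" | grep -o 'RAM [0-9]*/[0-9]*MB'

# Extract GPU load -- near 0% here while the CPU is busy suggests the build is stuck
echo "$line" | grep -o 'GR3D_FREQ [0-9]*%'
```

On the board itself you would run `sudo tegrastats` in a second terminal and watch these fields live while imagenet-console is building the engine.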

If you continue to have the issue, you may want to try re-cloning the repo, or trying a fresh SD card image.

I deleted the cloned folder and re-initialized using the same instructions, and while tegrastats is showing ~40% CPU utilization, it’s still seemingly hanging indefinitely, or at least for extended periods of time.

I’ll try a new SD card image now I guess.

Just ran through this for the first time myself. It took 12m45s to complete “building CUDA engine” — more than a few minutes! The other thing I ran into was that power consumption spiked when the app finally ran, and the Nano crashed. After upgrading to a 5V 3A supply, the app runs OK (after another 12m45s delay). Edit: just noticed that the OP was building GoogleNet. I was building for face detection, but it was still much slower than expected.

OK, yes, the object detection networks can take longer to optimize. Try running this beforehand to make sure you are in 10W mode and with your clocks maximized:

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

Note that the delay to build the CUDA engine only occurs the first time you run a particular model. On subsequent runs, it should only take a couple seconds to load the already-optimized model.
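You can check for the cached engine yourself — the filename below is taken from the log earlier in this thread, so adjust it for whichever network you’re loading. Deleting the file forces a full rebuild on the next run, which is handy if you suspect a corrupted cache; a minimal sketch:

```shell
# Engine cache path as reported in the log above (adjust for your network/precision)
ENGINE=networks/bvlc_googlenet.caffemodel.2.1.GPU.FP16.engine

if [ -f "$ENGINE" ]; then
    echo "cached engine found: $ENGINE (loads in seconds)"
else
    echo "no cached engine; next run will rebuild it (slow first time)"
fi
```

Removing the file with `rm "$ENGINE"` makes the next imagenet-console run re-profile the network from scratch.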