Extending a pre-trained network to identify new objects using TX2 and JetPack 3.2.1

Hello,
I am using the Transfer Learning Toolkit (TLT) with JetPack 4.4 on a Jetson TX2.
I want to run classification only, without detection.
I created the configuration file for classification by following the Transfer Learning Toolkit documentation,
but I don’t get any classifications on the running video. What could be causing the problem?
Should I configure the classifier as the primary or the secondary inference engine?

Thanks

Hi,

Sorry for the late update.

1.

libnvinfer.so.5: cannot open shared object file: No such file or directory

This error indicates that the binary (tlt-converter) is trying to load TensorRT v5.
However, JetPack 4.4 ships with TensorRT v7.1. Please update or recompile the binary for TensorRT 7.1 first.
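
A minimal sketch of how the mismatch described above could be checked: it reads the library name from the loader error and compares its major version with the libnvinfer files found on disk. The helper names are hypothetical, and the library directories are assumptions (typical Ubuntu multiarch locations), so adjust them to the actual install.

```python
import re
from pathlib import Path


def required_soname(error_line):
    """Return the library name at the start of a loader error, or None."""
    match = re.match(r"^\s*(\S+\.so(?:\.\d+)*):", error_line)
    return match.group(1) if match else None


def major_version(soname):
    """Return the major version after '.so.', or None if there is none."""
    match = re.search(r"\.so\.(\d+)", soname)
    return int(match.group(1)) if match else None


def installed_majors(lib_dirs, stem="libnvinfer"):
    """Collect the major versions of <stem>.so.* files in the given directories."""
    majors = set()
    for directory in lib_dirs:
        path = Path(directory)
        if not path.is_dir():
            continue  # skip directories that do not exist on this machine
        for candidate in path.glob(f"{stem}.so.*"):
            version = major_version(candidate.name)
            if version is not None:
                majors.add(version)
    return sorted(majors)


if __name__ == "__main__":
    error = ("libnvinfer.so.5: cannot open shared object file: "
             "No such file or directory")
    needed = major_version(required_soname(error))
    # Assumed library locations; not taken from the thread.
    found = installed_majors(["/usr/lib/aarch64-linux-gnu",
                              "/usr/lib/x86_64-linux-gnu"])
    print(f"binary needs TensorRT major version {needed}, installed: {found}")
    if needed not in found:
        print("Mismatch: use a binary built for the installed TensorRT.")
```

If the required major version is not among the installed ones, the binary was built against a different TensorRT release than the one on the device.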

2.
You can find a TLT sample for DeepStream below:
https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps/blob/master/pgie_yolov3_tlt_config.txt#L52

3.
As for the missing classifications, we are not sure whether the preprocessing parameters are causing the issue.
Since we do have a configuration file for YOLO, would you mind giving the configuration file above a try?

4.
It is possible to configure DeepStream with only a classifier.
Since the pgie is mandatory, you will need to configure your model as the pgie.

The only difference is to set is-classifier=1 in the configuration and add the corresponding output parser if needed.
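
As a rough illustration of point 4, the sketch below writes a classifier-only pgie configuration in the `[property]` key file format used by the DeepStream inference plugin. The key names are assumed from the Gst-nvinfer configuration format and should be checked against the DeepStream release in use. The model path, model key, and label file are hypothetical placeholders, and preprocessing keys are left out because they must match how the model was trained.

```python
import configparser
import io


def build_classifier_pgie_config(model_path, model_key, labels_path,
                                 threshold=0.5):
    """Return the text of a minimal [property] group for a classifier pgie."""
    # interpolation=None keeps '%' characters in keys or paths untouched.
    config = configparser.ConfigParser(interpolation=None)
    config["property"] = {
        "gpu-id": "0",
        "tlt-encoded-model": model_path,   # placeholder path
        "tlt-model-key": model_key,        # placeholder key
        "labelfile-path": labels_path,     # placeholder path
        "batch-size": "1",
        "process-mode": "1",               # 1 = primary (runs on full frames)
        "gie-unique-id": "1",
        "is-classifier": "1",              # treat the network output as classes
        "classifier-threshold": str(threshold),
    }
    buffer = io.StringIO()
    # Write key=value without spaces around '='.
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_classifier_pgie_config("classifier.etlt",
                                       "YOUR_TLT_KEY",
                                       "labels.txt"))
```

The printed text can be saved as the pgie config file and referenced from the main DeepStream application config.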

Thanks.

Hello,

Thanks so much for replying to my questions.
I want to run the Jupyter notebook for the YOLO network using TLT. I previously installed and ran everything on a Linux 18.04 machine without any problems.
I wanted to run the training on another computer, but I am getting the following error. How can I fix this problem? Thanks.

2020-12-07 17:32:08,717 [INFO] iva.yolo.scripts.train: Number of images in the training dataset:	  6778

Epoch 1/80
2020-12-07 17:32:16.681687: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-12-07 17:33:21.366397: E tensorflow/stream_executor/cuda/cuda_blas.cc:429] failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED
2020-12-07 17:33:21.366438: E tensorflow/stream_executor/cuda/cuda_blas.cc:2437] Internal: failed BLAS call, see log for details
Traceback (most recent call last):
  File "/usr/local/bin/tlt-train-g1", line 8, in <module>
    sys.exit(main())
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/magnet_train.py", line 51, in main
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo/scripts/train.py", line 239, in main
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo/scripts/train.py", line 183, in run_experiment
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 1039, in fit
    validation_steps=validation_steps)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training_arrays.py", line 154, in fit_loop
    outs = f(ins)
  File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1472, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
  (0) Internal: Blas xGEMMBatched launch failed : a.shape=[4,3,3], b.shape=[4,3,3], m=3, n=3, k=3, batch_size=4
	 [[{{node CompositeTransform_6/CompositeTransform_5/CompositeTransform_4/CompositeTransform_3/CompositeTransform_2/CompositeTransform_1/CompositeTransform/RandomFlip/MatMul}}]]
	 [[cond_7/GatherNd_1/_4621]]
  (1) Internal: Blas xGEMMBatched launch failed : a.shape=[4,3,3], b.shape=[4,3,3], m=3, n=3, k=3, batch_size=4
	 [[{{node CompositeTransform_6/CompositeTransform_5/CompositeTransform_4/CompositeTransform_3/CompositeTransform_2/CompositeTransform_1/CompositeTransform/RandomFlip/MatMul}}]]
0 successful operations.
0 derived errors ignored.

Hi,

Thanks for replying to the questions.

I am deploying a Greengrass group to an NVIDIA TX2 through AWS IoT, following this repository:
https://github.com/aws-samples/aws-iot-greengrass-deploy-nvidia-deepstream-on-edge

I have changed the DeepStream config file to use a local camera connected to the TX2 instead of an RTSP server.
The config file is attached.

Greengrass runs successfully, but the camera does not start and nothing happens.

How do I fix the problem?

Many thanks,
Farough

source1_usb_dec_infer_resnet_int8.txt (2.84 KB)

Hi, farough.nasab

Since this topic is about inference, could you file a new topic for the RTSP camera issue you posted on Jan 21?
Thanks.

Hi,
Thanks for always replying to emails.

I have trained a YOLOv3 network using NVIDIA TLT on a desktop. In the inference test on the desktop, it detects all the objects. But when I run the network on an NVIDIA Jetson TX2 with the DeepStream application, it detects only some of the objects and misses others. With some config files, it does not detect anything. I always used the YOLOv3 library for drawing boxes around the objects. What is the solution?

Regards,
Farough

Hi,

Did you follow our documentation or example to deploy the YOLOv3 model?

https://docs.nvidia.com/metropolis/TLT/tlt-getting-started-guide/text/deploying_to_deepstream.html

Thanks.

Hello, I followed the tutorial and recreated the dataset. However, I got a NaN loss. How can we retrain correctly?