Detectnet_v2 trained, tao infer can infer, but no results

erence · October 17, 2023, 1:03pm

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc)
-AMD64 RTX2700
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc)
Detectnet_v2
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
Running on Container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
• Training spec file(If have, please share here)
detectnet_train_cfg3.txt (3.6 KB)

• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)
Although training for one class seemed to go well and the loss fell steadily (down to about 0.0007),
detectnet inference cannot detect any target. The inference label files are empty…
What could be the cause? The inference spec is attached below.

detectnet_inference.txt (1.4 KB)

erence · October 17, 2023, 7:42pm

It seems that I have to have a tlt or an etlt model first.

But how can I get a tlt or etlt model from my detectnet_v2 training? It creates hdf5 files.

Retraining them creates a pruned hdf5.

I can export hdf5 to onnx, but tao infer does not accept onnx.

How can I convert hdf5 to tlt?

Thanks in advance

Morganh · October 18, 2023, 8:24am

Refer to https://github.com/NVIDIA/tao_tutorials/blob/main/notebooks/tao_launcher_starter_kit/detectnet_v2/specs/detectnet_v2_inference_kitti_tlt.txt.

erence · October 18, 2023, 1:37pm

TAO inference seems to run normally, but it does not annotate the images and the label files are all empty.
I changed the inference config according to your suggestion; the new version is attached below.
I even lowered the confidence threshold to 0.1, without success.
As mentioned before, the loss fell to about 0.0007 during training. What could be the cause?
Would converting the model to a TensorRT engine file help, or is that pointless while the hdf5 model itself detects nothing?

detectnet_inference.txt (1.8 KB)

Thanks in advance.

Morganh · October 23, 2023, 8:53am

This is a bit confusing. Can you share the command and the full log from your tao inference run? If TAO inference works, it should annotate the images and generate labels.

erence · October 23, 2023, 2:35pm

I pulled the Docker image and started the container as below:

docker run -it --rm --gpus all -v /home/eren/tao:/tao nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5

and ran the following command to train:

detectnet_v2 train --gpus 1 --use_amp -e /tao/detectnet_train_cfg3.txt -r /tao/results

Inference then ran inside the container with the command below, but no bounding boxes were drawn:

detectnet_v2 inference -i /tao/test_images -e /tao/detectnet_inference.txt -m /tao/results_detectnet_before_retrain/model.epoch-120.hdf5 -r /tao/inference_results

As mentioned, inference runs without apparent errors but detects nothing, even though the test images were taken from the training images. The log is below:

2023-10-23 14:28:06.615858: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2023-10-23 14:28:06,655 [TAO Toolkit] [WARNING] tensorflow 40: Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
2023-10-23 14:28:08,065 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
2023-10-23 14:28:08,105 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
2023-10-23 14:28:08,108 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
WARNING:tensorflow:TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
WARNING: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
WARNING: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
WARNING: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
INFO: Log file already exists at /tao/inference_results/status.json
INFO: Starting DetectNet_v2 Inference
INFO: Merging specification from /tao/detectnet_inference.txt
INFO: Overlain images will be saved in the output path.
INFO: Constructing inferencer
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/inferencer/tlt_inferencer.py:96: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING: From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/inferencer/tlt_inferencer.py:96: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/inferencer/tlt_inferencer.py:99: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

WARNING: From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/inferencer/tlt_inferencer.py:99: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

INFO: Loading model from /tao/results_detectnet_before_retrain/model.epoch-120.hdf5:
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING: From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING: From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:245: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

WARNING: From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:245: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

WARNING: From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:133: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.

WARNING: From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:133: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.

WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

WARNING: From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING: From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

WARNING: From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

WARNING: From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

Layer (type)            Output Shape                 Param #
input_1 (InputLayer)    (None, 3, 1920, 1200)        0
model_1 (Model)         [(None, 1, 120, 75), (Non    11197893

Total params: 11,197,893
Trainable params: 11,188,165
Non-trainable params: 9,728

INFO: Initialized model
INFO: Commencing inference
100%|████| 21/21 [00:26<00:00, 1.24s/it]
INFO: Inference complete
INFO: Inference finished successfully.
Execution status: PASS
root@12584098aeef:/tao#
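
The model summary in the log lists an input of (None, 3, 1920, 1200) and a first output of (None, 1, 120, 75). The short sketch below only checks the arithmetic between those two shapes. It assumes a channels-first layout of (batch, channels, height, width), which the log itself does not state, and the function name `grid_stride` is made up for this illustration.

```python
# Shapes copied from the model summary in the log.
# Assumed layout (not stated in the log): (batch, channels, height, width).
input_shape = (None, 3, 1920, 1200)
output_shape = (None, 1, 120, 75)  # first output tensor listed for model_1


def grid_stride(input_shape, output_shape):
    """Return the (vertical, horizontal) downsampling factor from input to output grid."""
    _, _, in_h, in_w = input_shape
    _, _, out_h, out_w = output_shape
    # A fixed-stride detection grid should divide the input exactly on both axes.
    if in_h % out_h or in_w % out_w:
        raise ValueError("output grid does not evenly divide the input")
    return in_h // out_h, in_w // out_w


stride = grid_stride(input_shape, output_shape)
print(stride)  # (16, 16)
```

If the layout assumption holds, the model treats 1920 as the height and 1200 as the width, which may be worth comparing against the image dimensions set in the training spec. The thread does not confirm whether this is related to the empty labels.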

As mentioned, no bounding boxes are drawn on the resulting images, although I ran inference on images taken from the training set. Does that mean I annotated too few photos? I had only 120 pictures :). But the training converged to a low loss. Should I perhaps use a pretrained model for detectnet_v2? I did not, because my images are 1920x1200, seemingly a different input size, so I thought I should train from scratch. I know that is very few photos, but shouldn't it detect at least one target? :)
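
One way to confirm that the label files really are empty, rather than judging from the annotated images alone, is to count the lines in each output file. The sketch below assumes the labels are written as one .txt file per image with one detection per non-blank line (KITTI style); the helper name `summarize_labels` is hypothetical and not part of TAO.

```python
from pathlib import Path


def summarize_labels(label_dir):
    """Count label files and detections in a directory of .txt label files.

    Assumes one text file per image and one detection per non-blank line.
    """
    files = sorted(Path(label_dir).glob("*.txt"))
    counts = {}
    for path in files:
        # Ignore blank lines so a file holding only newlines counts as empty.
        lines = [ln for ln in path.read_text().splitlines() if ln.strip()]
        counts[path.name] = len(lines)
    empty = [name for name, n in counts.items() if n == 0]
    return {
        "files": len(files),
        "empty_files": len(empty),
        "detections": sum(counts.values()),
    }
```

For the run above, `label_dir` would be wherever detectnet_v2 inference wrote its label files under /tao/inference_results; the exact subdirectory name is an assumption and should be checked on disk. If `detections` is 0 for images taken from the training set, the problem likely lies with the model or the spec rather than with the overlay drawing step.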

Morganh · October 23, 2023, 4:01pm

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

What mAP is reported when you run training or retraining?
You can check the training log or the retraining log.

For comparison, I suggest following the official notebook: download the public KITTI dataset, train, and then run inference directly, without pruning or retraining.

system · November 7, 2023, 5:54am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
