Yolov3 not working

To run with multigpu, please change --gpus based on the number of available GPUs in your machine.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:117: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

2021-06-11 08:13:55,018 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:117: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:143: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

2021-06-11 08:13:55,018 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:143: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

WARNING:tensorflow:From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/scripts/train.py:49: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

2021-06-11 08:13:55,462 [WARNING] tensorflow: From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/scripts/train.py:49: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/scripts/train.py:52: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2021-06-11 08:13:55,462 [WARNING] tensorflow: From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/scripts/train.py:52: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

2021-06-11 08:14:02,010 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

2021-06-11 08:14:02,034 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

2021-06-11 08:14:02,086 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:2018: The name tf.image.resize_nearest_neighbor is deprecated. Please use tf.compat.v1.image.resize_nearest_neighbor instead.

2021-06-11 08:14:03,253 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:2018: The name tf.image.resize_nearest_neighbor is deprecated. Please use tf.compat.v1.image.resize_nearest_neighbor instead.

Traceback (most recent call last):
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/scripts/train.py", line 213, in <module>
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/scripts/train.py", line 209, in main
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/scripts/train.py", line 84, in run_experiment
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/builders/model_builder.py", line 181, in build_data_and_model
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/utils/model_io.py", line 48, in load_model
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py", line 417, in load_model
    f = h5dict(filepath, 'r')
  File "/usr/local/lib/python3.6/dist-packages/keras/utils/io_utils.py", line 186, in __init__
    self.data = h5py.File(path, mode=mode)
  File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 312, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
  File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 142, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 78, in h5py.h5f.open
OSError: Unable to open file (unable to open file: name = 'EXPERIMENT_DIR/pretrained_resnet18/tlt_pretrained_object_detection_vresnet18/resnet_18.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
Traceback (most recent call last):
  File "/usr/local/bin/yolo_v3", line 8, in <module>
    sys.exit(main())
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/entrypoint/yolo_v3.py", line 12, in main
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/entrypoint/entrypoint.py", line 296, in launch_job
AssertionError: Process run failed.
print("To resume from checkpoint, please change pretrain_model_path
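
Editor's note: the OSError above reports a relative path whose first component is the literal text EXPERIMENT_DIR, which suggests (but does not prove) that a placeholder in the training spec was never replaced with a real directory. The following is a minimal sketch of a pre-flight check, not part of TLT. The function name `diagnose_model_path` and the ALL_CAPS placeholder heuristic are assumptions made for illustration.

```python
import os
import re

# Heuristic (an assumption): an ALL_CAPS first path component, optionally
# prefixed with "$", is probably a placeholder that was never substituted.
PLACEHOLDER = re.compile(r"^\$?[A-Z][A-Z0-9_]*$")


def diagnose_model_path(path):
    """Return a list of human-readable problems found with a model path."""
    problems = []
    first_component = path.replace("\\", "/").split("/", 1)[0]
    if not os.path.isabs(path):
        problems.append(
            "path is relative; it resolves against the current working directory"
        )
    if PLACEHOLDER.match(first_component):
        problems.append(
            f"'{first_component}' looks like an unsubstituted placeholder"
        )
    if not os.path.isfile(path):
        problems.append("file does not exist at this location")
    return problems


if __name__ == "__main__":
    # Path copied from the OSError message in the log above.
    path = (
        "EXPERIMENT_DIR/pretrained_resnet18/"
        "tlt_pretrained_object_detection_vresnet18/resnet_18.hdf5"
    )
    for problem in diagnose_model_path(path):
        print("-", problem)
```

Running a check like this on the pretrained model path before starting training would surface the problem without waiting for the model builder to fail.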

Hi @bhargavi.sanadhya,

Please provide more details about the issue, along with the repro scripts and model, so that we can assist you better.
Please also share the following environment details:
TensorRT Version :
GPU Type :
Nvidia Driver Version :
CUDA Version :
CUDNN Version :
Operating System + Version :
Python Version (if applicable) :
TensorFlow Version (if applicable) :
PyTorch Version (if applicable) :
Baremetal or Container (if container which image + tag) :

Thank you.

I could not find the TensorRT version directly, so I tried this:
root@ef112f939b52:/workspace# dpkg -l | grep nvinfer
ii libnvinfer-dev 7.2.1-1+cuda11.1 amd64 TensorRT development libraries and headers
ii libnvinfer-plugin-dev 7.2.1-1+cuda11.1 amd64 TensorRT plugin libraries
ii libnvinfer-plugin7 7.2.1-1+cuda11.1 amd64 TensorRT plugin libraries
ii libnvinfer7 7.2.1-1+cuda11.1 amd64 TensorRT runtime libraries
root@ef112f939b52:/workspace# nm -D libnvinfer7 | grep tensorrt_version
nm: 'libnvinfer7': No such file
root@ef112f939b52:/workspace# ^C
root@ef112f939b52:/workspace# dpkg -l | grep TensorRT
ii libnvinfer-dev 7.2.1-1+cuda11.1 amd64 TensorRT development libraries and headers
ii libnvinfer-plugin-dev 7.2.1-1+cuda11.1 amd64 TensorRT plugin libraries
ii libnvinfer-plugin7 7.2.1-1+cuda11.1 amd64 TensorRT plugin libraries
ii libnvinfer7 7.2.1-1+cuda11.1 amd64 TensorRT runtime libraries
ii libnvonnxparsers-dev 7.2.1-1+cuda11.1 amd64 TensorRT ONNX libraries
ii libnvonnxparsers7 7.2.1-1+cuda11.1 amd64 TensorRT ONNX libraries
ii libnvparsers-dev 7.2.1-1+cuda11.1 amd64 TensorRT parsers libraries
ii libnvparsers7 7.2.1-1+cuda11.1 amd64 TensorRT parsers libraries

For GPU Drivers
root@ef112f939b52:/workspace# nvidia-smi --query-gpu=driver_version --format=csv,noheader
460.39
460.39
460.39
460.39

For CUDA version
root@ef112f939b52:/workspace# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0

Operating system version
root@ef112f939b52:/workspace# cat /etc/*release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.5 LTS"
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.5 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

TensorFlow and Python 3
root@ef112f939b52:/workspace# python3 -c 'import tensorflow as tf; print(tf.__version__)'
2021-06-11 11:25:09.070606: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
1.15.4

CUDNN version
root@ef112f939b52:/workspace# cat /usr/include/cudnn.h | grep CUDNN_MAJOR -A 2
root@ef112f939b52:/workspace# dpkg -l | grep cudnn
hi libcudnn8 8.0.4.30-1+cuda11.1 amd64 cuDNN runtime libraries
ii libcudnn8-dev 8.0.4.30-1+cuda11.1 amd64 cuDNN development libraries and headers

TLT image - docker pull nvcr.io/nvidia/tlt-streamanalytics:v3.0-py3

I am still not sure about the TensorRT and cuDNN versions. Is there a command to check them?
Note that I found all of the versions above by running the commands inside the Docker container.
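
Editor's note: the dpkg listings above appear to already answer this. libnvinfer7 is described there as the TensorRT runtime library (7.2.1) and libcudnn8 as the cuDNN runtime library (8.0.4.30). The empty result from grepping cudnn.h is expected with cuDNN 8, where the version macros live in a separate cudnn_version.h header. The following is a minimal sketch that pulls the upstream versions out of `dpkg -l` output. The helper name `parse_dpkg_versions` is hypothetical, and the sample rows are copied from this post.

```python
def parse_dpkg_versions(dpkg_output):
    """Map package name -> upstream version from `dpkg -l` rows."""
    versions = {}
    for line in dpkg_output.splitlines():
        fields = line.split()
        # Package rows start with a state code such as "ii" or "hi";
        # the second letter "i" means the package is installed.
        if len(fields) < 3 or len(fields[0]) < 2 or fields[0][1] != "i":
            continue
        name, full_version = fields[1], fields[2]
        # Drop the Debian revision and CUDA suffix:
        # "7.2.1-1+cuda11.1" becomes "7.2.1".
        versions[name] = full_version.split("-", 1)[0]
    return versions


# Rows copied from the dpkg output shown in this post.
SAMPLE = """\
ii libnvinfer7 7.2.1-1+cuda11.1 amd64 TensorRT runtime libraries
hi libcudnn8 8.0.4.30-1+cuda11.1 amd64 cuDNN runtime libraries
ii libcudnn8-dev 8.0.4.30-1+cuda11.1 amd64 cuDNN development libraries and headers
"""

if __name__ == "__main__":
    found = parse_dpkg_versions(SAMPLE)
    print("TensorRT:", found.get("libnvinfer7"))
    print("cuDNN:", found.get("libcudnn8"))
```

In a live container the same function could be fed the captured output of `dpkg -l` instead of the pasted sample.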

Hi @bhargavi.sanadhya,

Based on the information you've provided, it looks like you are using the TLT container. We recommend posting your query on the TLT forum to get better help.

Thank you.