AssertionError: output_channel must be either 1 or 3. 2021-11-01 11:27:31,565 [INFO] tlt.components.docker_handler.docker_handler: Stopping container

EbinJose · November 1, 2021, 6:21am

TensorRT Version: 7.2.1.6
Quadro RTX 5000 dual GPU
Driver Version: 455.23.05
CUDA Version: 11.1
Ubuntu 18.04
python 3.6
Yolo_v4

nvidia/tao/tao-toolkit-tf:
docker_registry: nvcr.io
docker_tag: v3.21.08-py3

I am training a custom model with TAO YOLOv4 on a single class. I created the dataset in KITTI format and split it into train, test, and val sets as described by the Toolkit. I also added custom anchors and made the changes to the spec file shown below, but when I start training I get this error:

To run with multigpu, please change --gpus based on the number of available GPUs in your machine.
/home/vaaan/.local/lib/python3.7/site-packages/tlt/__init__.py:20: DeprecationWarning:
The nvidia-tlt package will be deprecated soon. Going forward please migrate to using the nvidia-tao package.

warnings.warn(message, DeprecationWarning)
~/.tao_mounts.json wasn’t found. Falling back to obtain mount points and docker configs from ~/.tlt_mounts.json.
Please note that this will be deprecated going forward.
2021-11-01 11:27:24,400 [INFO] root: Registry: [‘nvcr.io’]
2021-11-01 11:27:24,567 [WARNING] tlt.components.docker_handler.docker_handler:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the “user”:“UID:GID” in the
DockerOptions portion of the “/home/vaaan/.tlt_mounts.json” file. You can obtain your
users UID and GID by using the “id -u” and “id -g” commands on the
terminal.
Using TensorFlow backend.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:117: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

2021-11-01 05:57:30,143 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:117: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:143: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

2021-11-01 05:57:30,143 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:143: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py:40: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

2021-11-01 05:57:30,312 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py:40: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py:43: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2021-11-01 05:57:30,313 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py:43: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

Traceback (most recent call last):
File “/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py”, line 110, in
File “/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/utils.py”, line 494, in return_func
File “/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/utils.py”, line 482, in return_func
File “/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py”, line 106, in main
File “/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py”, line 49, in run_experiment
File “/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/utils/spec_loader.py”, line 63, in load_experiment_spec
AssertionError: output_channel must be either 1 or 3.
2021-11-01 11:27:31,565 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

• Training spec file
!cat $LOCAL_SPECS_DIR/yolo_v4_train_resnet18_kitti.txt

random_seed: 42
yolov4_config {
  big_anchor_shape: "[(46.80, 8.32),(68.64, 13.44),(117.87, 22.04)]"
  mid_anchor_shape: "[(8.48, 1.95),(24.27, 3.73),(33.28, 5.76)]"
  small_anchor_shape: "[(3.39, 0.97),(4.75, 1.39),(6.44, 1.39)]"
  box_matching_iou: 0.25
  arch: "resnet"
  nlayers: 18
  arch_conv_blocks: 2
  loss_loc_weight: 0.8
  loss_neg_obj_weights: 100.0
  loss_class_weights: 0.5
  label_smoothing: 0.0
  big_grid_xy_extend: 0.05
  mid_grid_xy_extend: 0.1
  small_grid_xy_extend: 0.2
  freeze_bn: false
  #freeze_blocks: 0
  force_relu: false
}
training_config {
  batch_size_per_gpu: 8
  num_epochs: 80
  enable_qat: false
  checkpoint_interval: 10
  learning_rate {
    soft_start_cosine_annealing_schedule {
      min_learning_rate: 1e-7
      max_learning_rate: 1e-4
      soft_start: 0.3
    }
  }
  regularizer {
    type: L1
    weight: 3e-5
  }
  optimizer {
    adam {
      epsilon: 1e-7
      beta1: 0.9
      beta2: 0.999
      amsgrad: false
    }
  }
  pretrain_model_path: "EXPERIMENT_DIR/pretrained_resnet18/tlt_pretrained_object_detection_vresnet18/resnet_18.hdf5"
}
eval_config {
  average_precision_mode: SAMPLE
  batch_size: 8
  matching_iou_threshold: 0.5
}
nms_config {
  confidence_threshold: 0.001
  clustering_iou_threshold: 0.5
  top_k: 200
}
augmentation_config {
  hue: 0.1
  saturation: 1.5
  exposure: 1.5
  vertical_flip: 0
  horizontal_flip: 0.5
  jitter: 0.3
  output_width: 1248
  output_height: 384
  randomize_input_shape_period: 0
  mosaic_prob: 0.5
  mosaic_min_ratio: 0.2
}
dataset_config {
  data_sources: {
      label_directory_path: "/workspace/tlt-experiments/data/training/label_2"
      image_directory_path: "/workspace/tlt-experiments/data/training/image_2"
  }
  include_difficult_in_training: True
  target_class_mapping {
      key: "pothole"
      value: "pothole"
  }
  validation_data_sources: {
      label_directory_path: "/workspace/tlt-experiments/data/val/label"
      image_directory_path: "/workspace/tlt-experiments/data/val/image"
  }
}


Please help.

Morganh · November 1, 2021, 7:12am

Can you set output_channel in the spec? Refer to https://docs.nvidia.com/tao/tao-toolkit/text/object_detection/yolo_v4.html#creating-a-configuration-file
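
For anyone hitting the same assertion, here is a minimal sketch of the change. It assumes, per the linked YOLOv4 documentation, that output_channel goes in augmentation_config alongside output_width and output_height. The value 3 is an assumption for color (RGB) images; 1 would be the choice for grayscale. All other fields are copied unchanged from the spec posted above.

```
augmentation_config {
  hue: 0.1
  saturation: 1.5
  exposure: 1.5
  vertical_flip: 0
  horizontal_flip: 0.5
  jitter: 0.3
  output_width: 1248
  output_height: 384
  # Missing from the original spec; the spec loader asserts this is 1 or 3.
  # 3 assumes RGB input images, use 1 for grayscale.
  output_channel: 3
  randomize_input_shape_period: 0
  mosaic_prob: 0.5
  mosaic_min_ratio: 0.2
}
```

The posted spec has no output_channel entry at all, so the field presumably falls back to a default of 0, which would explain why the check in spec_loader.py fails before training starts.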

EbinJose · November 1, 2021, 7:57am

Thank you, it worked.

system · November 15, 2021, 7:57am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
