On an AWS g4dn.12xlarge, after adding `visualizer { enabled: True }` to my spec file, calling `tao yolo_v4 train` with `--gpus 4` fails with the following error:
Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py", line 145, in <module>
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/utils.py", line 707, in return_func
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/utils.py", line 695, in return_func
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py", line 141, in main
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py", line 126, in main
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py", line 77, in run_experiment
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/models/yolov4_model.py", line 715, in train
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/utils/fit_generator.py", line 222, in fit_generator
File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 1211, in train_on_batch
class_weight=class_weight)
File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 789, in _standardize_user_data
exception_prefix='target')
File "/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py", line 92, in standardize_input_data
data = [standardize_single_array(x) for x in data]
File "/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py", line 92, in <listcomp>
data = [standardize_single_array(x) for x in data]
File "/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py", line 27, in standardize_single_array
elif x.ndim == 1:
AttributeError: 'tuple' object has no attribute 'ndim'
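As far as I can tell, the last frame comes down to a tuple reaching a check that expects a NumPy array. Here is a minimal sketch in plain Python that produces the same message. It is not TAO code: `standardize_single_array` below is a simplified stand-in for the Keras helper named in the traceback, and `batch_targets` and `try_standardize` are hypothetical names I made up for illustration.

```python
import numpy as np


def standardize_single_array(x):
    # Simplified stand-in for the Keras helper in the traceback (assumption:
    # the real function has extra branches that are omitted here).
    if x is None:
        return None
    elif x.ndim == 1:
        # 1-D arrays are expanded to shape (n, 1)
        return np.expand_dims(x, 1)
    return x


def try_standardize(x):
    # Return "ok" on success, or the AttributeError message on failure.
    try:
        standardize_single_array(x)
        return "ok"
    except AttributeError as exc:
        return str(exc)


# Hypothetical batch: a tuple of arrays where a single ndarray is expected.
batch_targets = (np.zeros((2, 3)), np.zeros((2, 3)))

message = try_standardize(batch_targets)
print(message)
```

This only shows the kind of input that triggers the error. It does not explain why the targets would arrive as a tuple when the visualizer is enabled with multiple GPUs.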
I’m using:
v3.22.05-tf1.15.5-py3
toolkit_version: 3.22.05
published_date: 05/25/2022
Are there any workarounds?