Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/efficientde$
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/efficientde$
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/efficientde$
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/efficientde$
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/efficientde$
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/efficientde$
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/efficientde$
File "/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py", line 417, in load_model
f = h5dict(filepath, 'r')
File "/usr/local/lib/python3.6/dist-packages/keras/utils/io_utils.py", line 186, in __init__
self.data = h5py.File(path, mode=mode)
File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 312, in __init__
fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 142, in make_fid
fid = h5f.open(name, flags, fapl=fapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 78, in h5py.h5f.open
OSError: Unable to open file (file signature not found)
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
Please check if the ngc API key is correct.
Yes, it is correct. I have also downloaded efficientnet_b0 and efficientnet_b2, and both work fine. I am getting this error for efficientnet_b1 only.
Please try downloading efficientnet_b1 again. The downloaded file may be corrupted.
I tried deleting and re-downloading it 3-4 times, but the error persists every time.
Can you share the full training command and the training spec file?
training_config {
train_batch_size: 4
iterations_per_loop: 10
checkpoint_period: 1
num_examples_per_epoch: 56897
num_epochs: 20
#model_name: 'efficientdet-d1'
profile_skip_steps: 100
tf_random_seed: 42
lr_warmup_epoch: 5
lr_warmup_init: 1e-05
learning_rate: 0.001
amp: True
moving_average_decay: 0.9999
l2_weight_decay: 0.0001
l1_weight_decay: 0.0
checkpoint: "/workspace/TAO/efficientdet_pretrained_models/efficientnet_b1.hdf5"
}
dataset_config {
num_classes: 17
image_size: "544,960"
training_file_pattern: "/workspace/TAO/T_1/tfrecords/train-*"
validation_file_pattern: "/workspace/TAO/T_1/tfrecords/val-*"
validation_json_file: "/workspace/val/val_COCO.json"
max_instances_per_image: 100
skip_crowd_during_training: True
}
eval_config {
eval_batch_size: 4
eval_epoch_cycle: 1
eval_after_training: True
eval_samples: 20561
min_score_thresh: 0.4
max_detections_per_image: 100
}
model_config {
model_name: "efficientdet-d1"
min_level: 3
max_level: 7
num_scales: 3
aspect_ratios : '[(1,1), (1.73,0.57), (0.57,1.73)]'
anchor_scale : 4
}
augmentation_config {
rand_hflip: True
random_crop_min_scale: 0.1
random_crop_max_scale: 2.0
}
tao efficientdet train --gpus 1 -e /workspace/TAO/T_1/experiment_spec.txt -d /workspace/TAO/T_1/weights -k key --log_file /workspace/TAO/T_1/log.txt
Please run the commands below and share the results.
tao efficientdet run ls -rlt /workspace/TAO/efficientdet_pretrained_models/*
and
tao efficientdet run md5sum /workspace/TAO/efficientdet_pretrained_models/*
Please check whether your ~/.tao_mounts.json is correct.
In your spec, you set the pretrained model path to "/workspace/TAO/efficientdet_pretrained_models/efficientnet_b1.hdf5".
I just want to see the "ls -rlt" and "md5sum" output for your models.
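To make the ~/.tao_mounts.json check concrete, here is a minimal Python sketch that maps a container path from the spec back to the host path it should come from. It assumes the file has a top-level "Mounts" list whose entries carry "source" (host) and "destination" (container) keys; the helper names `load_mounts` and `host_path_for` are hypothetical and not part of the TAO launcher.

```python
import json
import os


def load_mounts(config_path="~/.tao_mounts.json"):
    """Read the mount list; assumes a top-level "Mounts" key."""
    with open(os.path.expanduser(config_path)) as f:
        return json.load(f).get("Mounts", [])


def host_path_for(container_path, mounts):
    """Map a container path back to its host path, or None if unmapped."""
    for mount in mounts:
        # Assumed entry layout: {"source": host_dir, "destination": container_dir}
        dest = mount["destination"].rstrip("/")
        if container_path == dest or container_path.startswith(dest + "/"):
            return mount["source"].rstrip("/") + container_path[len(dest):]
    return None


if __name__ == "__main__":
    checkpoint = "/workspace/TAO/efficientdet_pretrained_models/efficientnet_b1.hdf5"
    print(host_path_for(checkpoint, load_mounts()))
```

If this prints None, or a host path that does not hold the downloaded file, the container is probably not seeing the model you expect.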
ls -rlt /workspace/TAO/efficientdet_pretrained_models/*
-rw------- 1 root root 0 Mar 2 05:07 /workspace/TAO/efficientdet_pretrained_models/efficientnet_b0.hdf5
-rw------- 1 root root 64864720 Mar 7 04:33 /workspace/TAO/efficientdet_pretrained_models/efficientnet_b2.hdf5
-rw------- 1 root root 55160336 Mar 7 13:25 /workspace/TAO/efficientdet_pretrained_models/efficientnet_b1.hdf5
-rw------- 1 root root 89213464 Mar 7 15:43 /workspace/TAO/efficientdet_pretrained_models/efficientnet_b3.hdf5
md5sum /workspace/TAO/efficientdet_pretrained_models/*
d41d8cd98f00b204e9800998ecf8427e /workspace/TAO/efficientdet_pretrained_models/efficientnet_b0.hdf5
49e8b63a6a15c28666e06f028352ea89 /workspace/TAO/efficientdet_pretrained_models/efficientnet_b1.hdf5
483b0fee8386f263807269dd2a0d3b86 /workspace/TAO/efficientdet_pretrained_models/efficientnet_b2.hdf5
6e9eed904dde1a148f10deec358c8314 /workspace/TAO/efficientdet_pretrained_models/efficientnet_b3.hdf5
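The same checks can be scripted. The sketch below reports the size and MD5 of each file and whether it starts with the 8-byte signature of a plain HDF5 file, which is what h5py looks for when it reports "file signature not found". It is only a rough diagnostic under stated assumptions: it checks offset 0 only (HDF5 also allows the signature after a user block), and `inspect_weights` is a hypothetical helper name. Note that a zero-byte file always hashes to d41d8cd98f00b204e9800998ecf8427e, the value listed for efficientnet_b0.hdf5.

```python
import hashlib
import os
import sys

# First 8 bytes of a plain HDF5 file (format signature at offset 0).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def inspect_weights(path):
    """Return size, MD5, and an HDF5-signature flag for one file."""
    size = os.path.getsize(path)
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read the header once, then hash the rest in 1 MiB chunks.
        head = f.read(len(HDF5_SIGNATURE))
        md5.update(head)
        for chunk in iter(lambda: f.read(1 << 20), b""):
            md5.update(chunk)
    return {
        "size": size,
        "md5": md5.hexdigest(),
        "has_hdf5_signature": head == HDF5_SIGNATURE,
    }


if __name__ == "__main__":
    # Usage: python check_weights.py /path/to/*.hdf5
    for path in sys.argv[1:]:
        print(path, inspect_weights(path))
```

A file that is empty or lacks the signature would not open with h5py.File as plain HDF5. A file that has the signature can still be damaged further in, so comparing the MD5 against a known-good copy remains the stronger check.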
I cannot reproduce your error. The efficientnet_b1.hdf5 works fine.
Please double check on your side.