• Hardware: A6000
• Network Type: re_identification
• TAO version: 5.0.0
• Training spec file:
model:
  backbone: resnet_50
  last_stride: 1
  pretrain_choice: imagenet
  pretrained_model_path: "/workspace/tao-experiments/models/resnet50_market1501_aicity156.tlt"
  input_channels: 3
  input_width: 128
  input_height: 256
  neck: bnneck
  feat_dim: 256
  neck_feat: after
  metric_loss_type: triplet
  with_center_loss: False
  with_flip_feature: False
  label_smooth: True
dataset:
  train_dataset_dir: "/workspace/tao-experiments/data/bounding_box_train"
  test_dataset_dir: "/workspace/tao-experiments/data/bounding_box_test"
  query_dataset_dir: "/workspace/tao-experiments/data/query"
  num_classes: 60
  batch_size: 64
  val_batch_size: 128
  num_workers: 1
  pixel_mean: [0.485, 0.456, 0.406]
  pixel_std: [0.226, 0.226, 0.226]
  padding: 10
  prob: 0.5
  re_prob: 0.5
  sampler: softmax_triplet
  num_instances: 4
re_ranking:
  re_ranking: True
  k1: 20
  k2: 6
  lambda_value: 0.3
train:
  optim:
    name: Adam
    lr_monitor: val_loss
    steps: [10, 20]
    gamma: 0.1
    bias_lr_factor: 1
    weight_decay: 0.0005
    weight_decay_bias: 0.0005
    warmup_factor: 0.01
    warmup_iters: 10
    warmup_method: linear
    base_lr: 0.00035
    momentum: 0.9
    center_loss_weight: 0.0005
    center_lr: 0.5
    triplet_loss_margin: 0.3
  num_epochs: 30
  checkpoint_interval: 10
• How to reproduce the issue?
tao model re_identification export -e any.yaml
Hi,
I used TAO Toolkit 5.0.0 to retrain the re_identification model.
tao model re_identification train -e /workspace/tao-experiments/specs/experiment_spec_file.yaml -r /workspace/tao-experiments/results -k nvidia_tao
The volume mapping with the ~/.tao_mounts.json file also works fine. The training completed successfully and I have a custom model.tlt file in my results. I want to export this model to an ONNX file so I can run it in my DeepStream pipeline's SGIE, just like I do with the deployable version of the model.
However, evaluate, export and inference all fail due to missing file errors.
Traceback (most recent call last):
File "</usr/local/lib/python3.8/dist-packages/nvidia_tao_pytorch/cv/re_identification/scripts/export.py>", line 3, in <module>
File "<frozen cv.re_identification.scripts.export>", line 150, in <module>
File "<frozen core.hydra.hydra_runner>", line 107, in wrapper
File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py", line 389, in _run_hydra
_run_app(
File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py", line 452, in _run_app
run_and_report(
File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py", line 296, in run_and_report
raise ex
File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py", line 213, in run_and_report
return func()
File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py", line 453, in <lambda>
lambda: hydra.run(
File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/hydra.py", line 132, in run
_ = ret.return_value
File "/usr/local/lib/python3.8/dist-packages/hydra/core/utils.py", line 260, in return_value
raise self._return_value
File "/usr/local/lib/python3.8/dist-packages/hydra/core/utils.py", line 186, in run_job
ret.return_value = task_function(task_cfg)
File "<frozen cv.re_identification.scripts.export>", line 71, in main
File "<frozen cv.re_identification.scripts.export>", line 58, in main
File "<frozen core.utilities>", line 68, in update_results_dir
File "/usr/lib/python3.8/posixpath.py", line 76, in join
a = os.fspath(a)
Since nvidia_tao_pytorch is encrypted, I cannot determine exactly what causes this. However, it looks like the Hydra schema used to validate the export config is not found. This can also be reproduced by simply calling
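The last frames of the traceback (posixpath.py, line 76, `a = os.fspath(a)`) match `os.path.join` being called with `None` as its first path component, which would point to a results/output directory never being set before the join. My guess that this comes from the export config is an assumption, since the sources are encrypted, but the failure mode itself is easy to reproduce:

```python
import os

# os.path.join calls os.fspath on each component; passing None, as a
# missing results_dir would, raises the same TypeError seen at the
# bottom of the export traceback.
try:
    os.path.join(None, "export")
except TypeError as err:
    print(err)  # expected str, bytes or os.PathLike object, not NoneType
```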
tao model re_identification export -e any.yaml
The error occurs as soon as an existing spec file is found; the spec file can even be empty, and the other mandatory parameters can all be dropped.
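For completeness, a populated export spec fails the same way as an empty one. The layout below is my best reading of the usual TAO 5.0 PyTorch config structure (the key names `results_dir`, `export.checkpoint`, and `export.onnx_file` are assumptions from the documentation, and the paths are placeholders for my setup):

```yaml
results_dir: "/workspace/tao-experiments/results"
export:
  checkpoint: "/workspace/tao-experiments/results/train/model.tlt"
  onnx_file: "/workspace/tao-experiments/results/export/model.onnx"
```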
error.log (3.8 KB)