I trained an SSD ResNet10 model using the “ssd.ipynb” notebook on Google Colab and obtained a file named “ssd_resnet10_epoch_010.tlt”. However, I encountered some issues when attempting to execute the “!tao ssd export” command in the same notebook.
General information:
• Google Colab
• Hardware: T4
• Model: SSD ResNet10
• Training spec file
ssd_retrain_resnet10_kitti.txt (1.9 KB)
Command:
!tao ssd export -m $EXPERIMENT_DIR/experiment_dir_retrain/1280/weights/ssd_resnet10_epoch_010.tlt \
-o $EXPERIMENT_DIR/experiment_dir_etlt/1280/ssd_resnet10_epoch_10_fp32.etlt \
-e $SPECS_DIR/1280/ssd_retrain_resnet10_kitti.txt \
-k $KEY \
--data_type fp32 \
--gen_ds_config
Error:
2023-06-05 23:09:38.176465: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1086] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-06-05 23:09:38.176725: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1086] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-06-05 23:09:38.176905: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1351] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 13642 MB memory) → physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5)
Traceback (most recent call last):
  File "</usr/local/lib/python3.6/dist-packages/iva/ssd/scripts/export.py>", line 3, in
  File "", line 17, in
  File "", line 302, in launch_export
  File "", line 284, in run_export
  File "", line 372, in export
  File "", line 141, in save_etlt_file
  File "", line 218, in node_process
  File "/usr/local/lib/python3.6/dist-packages/graphsurgeon/DynamicGraph.py", line 330, in remove
    remove_names = set(_get_node_names(nodes))
  File "/usr/local/lib/python3.6/dist-packages/graphsurgeon/_utils.py", line 85, in _get_node_names
    return [node.name for node in nodes]
  File "/usr/local/lib/python3.6/dist-packages/graphsurgeon/_utils.py", line 85, in
    return [node.name for node in nodes]
AttributeError: 'str' object has no attribute 'name'
Telemetry data couldn’t be sent, but the command ran successfully.
[WARNING]: <urlopen error [Errno -2] Name or service not known>
Execution status: FAIL
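To illustrate what the last frames of the traceback seem to be doing, here is a minimal sketch that reproduces the same kind of AttributeError without graphsurgeon or TAO. The function get_node_names only mirrors the list comprehension shown at _utils.py line 85. The node names "conv1" and "NMS" are hypothetical, and I am assuming (not confirming) that the exporter ends up passing plain strings where node objects are expected.

```python
from types import SimpleNamespace


def get_node_names(nodes):
    # Same pattern as the comprehension in the traceback:
    # every element is expected to expose a .name attribute.
    return [node.name for node in nodes]


# Node-like objects with a .name attribute work as expected.
# The names "conv1" and "NMS" are hypothetical placeholders.
node_objects = [SimpleNamespace(name="conv1"), SimpleNamespace(name="NMS")]
print(get_node_names(node_objects))

# Plain strings have no .name attribute, so the same call fails
# with an AttributeError like the one in the export log.
try:
    get_node_names(["conv1", "NMS"])
except AttributeError as err:
    print("AttributeError:", err)
```

If this reading is right, the failure happens because something upstream in the export step hands over node names instead of node objects, rather than inside graphsurgeon itself.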
Previously, I trained other models with the yolo_v4.ipynb and yolo_v4_tiny.ipynb notebooks, and everything worked as expected when exporting to .etlt.