Converting Tfrecords for kitti trainval dataset

Hello, I get an error when I try to run the tfrecord conversion command.

I am using tlt 0.1.4, docker_tag: v3.0-py3.

Spec:
kitti_config {
  root_directory_path: "/workspace/tlt-experiments/data/training"
  image_dir_name: "image_2"
  label_dir_name: "label_2"
  image_extension: ".jpg"
  partition_mode: "random"
  num_partitions: 2
  val_split: 14
  num_shards: 4
}
image_directory_path: "/workspace/tlt-experiments/data/training"

!tlt detectnet_v2 dataset_convert
-d $LOCAL_SPECS_DIR/detectnet_v2_tfrecords_kitti_trainval.txt
-o $DATA_DOWNLOAD_DIR/tfrecords/kitti_trainval/kitti_trainval

I get this error.

Converting Tfrecords for kitti trainval dataset
2021-06-16 20:31:50,344 [INFO] root: Registry: ['nvcr.io']
Error: No such container: f516eb619ce39f9d8f96c5a8d9b0975631cb6ea2a0a72e296bc56f92bfcb1fbd
2021-06-16 20:31:58,410 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
Traceback (most recent call last):
File "/home/mzamanov/tlt/lib/python3.6/site-packages/docker/api/client.py", line 259, in _raise_for_status
response.raise_for_status()
File "/home/mzamanov/tlt/lib/python3.6/site-packages/requests/models.py", line 941, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.41/containers/f516eb619ce39f9d8f96c5a8d9b0975631cb6ea2a0a72e296bc56f92bfcb1fbd/stop

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/mzamanov/tlt/bin/tlt", line 8, in <module>
sys.exit(main())
File "/home/mzamanov/tlt/lib/python3.6/site-packages/tlt/entrypoint/entrypoint.py", line 114, in main
args[1:]
File "/home/mzamanov/tlt/lib/python3.6/site-packages/tlt/components/instance_handler/local_instance.py", line 278, in launch_command
docker_handler.run_container(command)
File "/home/mzamanov/tlt/lib/python3.6/site-packages/tlt/components/docker_handler/docker_handler.py", line 299, in run_container
self.stop_container()
File "/home/mzamanov/tlt/lib/python3.6/site-packages/tlt/components/docker_handler/docker_handler.py", line 306, in stop_container
self._container.stop()
File "/home/mzamanov/tlt/lib/python3.6/site-packages/docker/models/containers.py", line 436, in stop
return self.client.api.stop(self.id, **kwargs)
File "/home/mzamanov/tlt/lib/python3.6/site-packages/docker/utils/decorators.py", line 19, in wrapped
return f(self, resource_id, *args, **kwargs)
File "/home/mzamanov/tlt/lib/python3.6/site-packages/docker/api/container.py", line 1167, in stop
self._raise_for_status(res)
File "/home/mzamanov/tlt/lib/python3.6/site-packages/docker/api/client.py", line 261, in _raise_for_status
raise create_api_error_from_http_exception(e)
File "/home/mzamanov/tlt/lib/python3.6/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
raise cls(e, response=response, explanation=explanation)
docker.errors.NotFound: 404 Client Error: Not Found ("No such container: f516eb619ce39f9d8f96c5a8d9b0975631cb6ea2a0a72e296bc56f92bfcb1fbd")

Refer to TLT mask rcnn error: Tlt.components.docker_handler.docker_handler: Stopping container - #11 by wilson.veloz

Where did you install the tlt launcher? On a Jetson device?

Yes, on a Jetson Nano.

Please note that the tlt command should run on a host PC, not on Jetson devices.
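A quick way to check which kind of machine you are on is to print the CPU architecture (a minimal sketch; the expected values below are typical for these platforms, not guaranteed by TLT):

```shell
# Print the machine architecture:
#   "x86_64"  -> a typical host PC, where the tlt launcher is supported
#   "aarch64" -> Jetson devices, where the launcher's containers will not run
uname -m
```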

After training, when you deploy the model and run inference, that can be done on either a host PC or a Jetson device.

Since Google Colab does not support Docker, can you suggest other cloud platforms we can use to train a model?

See the NVIDIA TAO documentation.
You can train a model on AWS or GCP.