Directly exporting/inferencing TAO CV models using the TAO API


Hi,

I am exploring ways to run inference directly with pretrained TAO models.

Specifically, I am using the notebook provided via NGC CLI (tao-getting-started_v5.2.0/notebooks/tao_api_starter_kit/api/object_detection.ipynb) against a TAO API deployment hosted on AWS. I use the lpdnet:unpruned_v1.0 PTM and attach an inference_dataset to the created model (via a POST request).

However, upon exporting via the following POST request,

actions = ["export"]
# original: 
# data = json.dumps({"job":parent,"actions":actions,"parent_id":model_id,"parent_job_type":"model"})
# actual: 
data = json.dumps({"actions":actions, "parent_id":model_id, "parent_job_type":"model"})
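For illustration, the two payload variants above can be built side by side like this (a minimal sketch; the build_export_payload helper is mine, not part of the notebook, and the job-chaining behavior is inferred from the starter-kit notebook's convention of passing the previous job's id as "job"):

```python
import json

def build_export_payload(model_id, parent_job=None):
    """Build the job-spec payload for the export action.

    The starter-kit notebook chains actions by passing the previous
    job's id as "job"; it is optional here so both variants can be built.
    """
    payload = {"actions": ["export"],
               "parent_id": model_id,
               "parent_job_type": "model"}
    if parent_job is not None:
        payload["job"] = parent_job
    return json.dumps(payload)

# original (chained to a parent job, e.g. a finished train job):
with_parent = build_export_payload("model-id", parent_job="train-job-id")

# actual (no "job" key, as in the failing request above):
without_parent = build_export_payload("model-id")
```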

the following error occurred:

{
  "action": "export",
  "created_on": "2024-02-08T03:21:15.860367",
  "id": "38d9****-****-**c6-****-18ba134****6",
  "last_modified": "2024-02-08T03:21:16.301992",
  "parent_id": null,
  "result": {
    "detailed_status": {
      "message": "Error due to unmet dependencies"
    }
  },
  "status": "Error"
}

I’m also exploring how to export a TensorRT engine so I can host the pretrained model on a Triton Inference Server; any general guidance on this would be appreciated.

Thanks.

For the TAO API, you can use kubectl commands to investigate the error further, such as kubectl describe and kubectl logs.
You can also double-check by running the notebook without the TAO API, for example tao_tutorials/notebooks/tao_launcher_starter_kit/detectnet_v2 at main · NVIDIA/tao_tutorials · GitHub.

Hi, @kyang31
Could you share your notebook file as well? Thanks.

Hi kyang31,

For a quick export without the TAO API, you could do this:

Nvidia/Luis

Hi @kyang31
May I know why you commented out "job": parent?

Hi @kyang31

I cannot reproduce the error with the default notebook (tao_tutorials/notebooks/tao_api_starter_kit/api/object_detection.ipynb at main · NVIDIA/tao_tutorials · GitHub).

Then I ran a second experiment.
Since you mention that you are using the lpdnet pretrained model, I tried it as well and changed the key according to https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/lpdnet.

pretrained_map = {"detectnet_v2" : "detectnet_v2:resnet18",

to

pretrained_map = {"detectnet_v2" : "lpdnet:unpruned_v1.0",

Also, need to change

encode_key = "tlt_encode"

to

encode_key = "nvidia_tlt"

The export also runs successfully.
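Put together, the two notebook edits for LPDNet look like this (a sketch; both variables come from the starter-kit notebook, and only the detectnet_v2 entry of the map is shown):

```python
# Point the detectnet_v2 entry at the LPDNet pretrained model from NGC
# (the notebook's map has entries for other networks, omitted here)
pretrained_map = {"detectnet_v2": "lpdnet:unpruned_v1.0"}

# LPDNet is published with the key "nvidia_tlt" rather than the
# default "tlt_encode"; with a wrong key the pretrained weights cannot
# be decrypted, so training and the downstream export fail.
encode_key = "nvidia_tlt"
```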

So, I am afraid your training did not complete due to a wrong key.

BTW, to debug any cell, as mentioned previously, you can run the commands below in a terminal to check what is running in the pod:
$ kubectl get pods
$ kubectl describe pod <pod id>
$ kubectl logs -f <pod id>