Thanks for your help. That got me further, but I have now run into another error:
fatal: [127.0.0.1]: FAILED! => {"changed": true, "cmd": "helm upgrade --install --reset-values --cleanup-on-fail --create-namespace --namespace default --atomic --wait tao-api https://helm.ngc.nvidia.com/nvidia/tao/charts/tao-toolkit-api-5.5.0.tgz --values /tmp/tao-toolkit-api-helm-values.yml --username='$oauthtoken' --password=<my-token>", "delta": "0:05:03.056649", "end": "2024-11-14 09:02:57.118460", "msg": "non-zero return code", "rc": 1, "start": "2024-11-14 08:57:54.061811", "stderr": "W1114 08:57:56.750101 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead\nW1114 08:57:56.750126 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead\nW1114 08:57:56.750185 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead\nW1114 08:57:56.750195 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead\nW1114 08:57:56.750226 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead\nW1114 08:57:56.750238 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead\nW1114 08:57:56.750322 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead\nW1114 08:57:56.750330 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead\nW1114 08:57:56.750444 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead\nW1114 08:57:56.750527 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead\nError: release tao-api 
failed, and has been uninstalled due to atomic being set: context deadline exceeded", "stderr_lines": ["W1114 08:57:56.750101 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead", "W1114 08:57:56.750126 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead", "W1114 08:57:56.750185 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead", "W1114 08:57:56.750195 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead", "W1114 08:57:56.750226 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead", "W1114 08:57:56.750238 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead", "W1114 08:57:56.750322 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead", "W1114 08:57:56.750330 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead", "W1114 08:57:56.750444 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead", "W1114 08:57:56.750527 1852114 warnings.go:70] annotation \"kubernetes.io/ingress.class\" is deprecated, please use 'spec.ingressClassName' instead", "Error: release tao-api failed, and has been uninstalled due to atomic being set: context deadline exceeded"], "stdout": "Release \"tao-api\" does not exist. Installing it now.", "stdout_lines": ["Release \"tao-api\" does not exist. Installing it now."]}
Here is my deploy.yml:
name: 'AI-Training-PC'
spec:
  cns:
    enable_mig: no
    mig_profile: all-disabled
    mig_strategy: single
    gpu_driver_version: "535.161.08"
    # will override existing drivers if present
    install_driver: false
  tao:
    ngc_api_key: <my-key>
    ngc_email: <my-email>
    chart: https://helm.ngc.nvidia.com/nvidia/tao/charts/tao-toolkit-api-5.5.0.tgz
    chart_values: |
      ---
      cluster_name: tao-api-demo