TAO Toolkit API Deployment on Kubernetes → MongoDB Authentication Failed

Subject: TAO Toolkit API 6.0.0 Deployment on Kubernetes → MongoDB Authentication Failed

Hardware:

  • 2 × Quadro RTX 6000

  • Driver: 525.105.17

  • CUDA: 12.0

Network Type:
N/A – This issue occurs during TAO Toolkit API deployment (not related to training network).

TLT Version:
docker_tag: 6.0.0-pyt
(from Helm values.yaml → nvcr.io/nvidia/tao/tao-toolkit:6.0.0-pyt)

Training spec file:
N/A – issue occurs before any training jobs can be submitted.


How to Reproduce:

  1. Followed official instructions for TAO Toolkit API deployment on bare-metal Kubernetes:
    TAO Toolkit API Deployment – Bare Metal Setup

  2. Installed Helm chart (tao-toolkit-api-6.0.0-multi-node.tgz) with backend=local-k8s and hostPlatform=local.

  3. MongoDB StatefulSet comes up healthy (3 replicas running).

  4. TAO API app/workflow pods fail to initialize due to MongoDB authentication errors.


Cluster / Environment Info:

  • Kubernetes Client: v1.34.1, Server: v1.34.0

  • OS: Ubuntu 18.04.6 LTS (Linux ESIND-S2600WFT 5.4.0-150-generic)


Pods Status:

kubectl get pods
mongodb-0                  1/1   Running
mongodb-1                  1/1   Running
mongodb-2                  1/1   Running
tao-api-app-pod-55d97f8f5b-4vwcm     0/1   Init:0/2           1397 restarts
tao-api-app-pod-649885dd69-4zkhb     0/1   Init:0/2           1398 restarts
tao-api-workflow-pod-7bb574b857-9vgm6 0/1   CrashLoopBackOff   2158 restarts


Error Logs from tao-api-app-pod (mongodb-init container):

2025-10-02 05:18:44,197 - handlers.mongo_handler - ERROR - Exception in __init__: name 'mongo_client' is not defined
2025-10-02 05:19:44,250 - __main__ - ERROR - Error initializing replicaset! Authentication failed., full error: {'ok': 0.0, 'errmsg': 'Authentication failed.', 'code': 18, 'codeName': 'AuthenticationFailed'}
2025-10-02 05:20:44,285 - __main__ - ERROR - Error initializing replicaset! Authentication failed., full error: {'ok': 0.0, 'errmsg': 'Authentication failed.', 'code': 18, 'codeName': 'AuthenticationFailed'}

Error Logs from tao-api-workflow-pod:

2025-10-02 05:21:05,410 - nvidia_tao_core.microservices.handlers.mongo_handler - ERROR - Exception in __init__: Authentication failed., full error: {'ok': 0.0, 'errmsg': 'Authentication failed.', 'code': 18, 'codeName': 'AuthenticationFailed'}
2025-10-02 05:21:35,416 - nvidia_tao_core.microservices.handlers.mongo_handler - ERROR - Exception in __init__: Authentication failed., full error: {'ok': 0.0, 'errmsg': 'Authentication failed.', 'code': 18, 'codeName': 'AuthenticationFailed'}


Helm Values (relevant snippets):

backend: local-k8s
hostPlatform: local
mongoOperatorEnabled: false
mongoDesiredReplicas: 3
imageMongo: mongo


My Questions:

  1. When mongoOperatorEnabled=false, do I need to manually configure MongoDB users/replica set authentication for TAO Toolkit API?

  2. Is there a specific secret (username/password) the TAO API expects for Mongo connection?

  3. Or should the Helm chart handle Mongo initialization out-of-the-box?

Any guidance or working example values.yaml for local-k8s deployment without Mongo operator would be greatly appreciated.

Yes, you must configure it manually. Refer to tao_tutorials/setup/tao-docker-compose/docker-compose.yml on the main branch of the NVIDIA/tao_tutorials GitHub repository. In that docker-compose.yml file, the mongodb service explicitly sets these environment variables:

  • MONGO_INITDB_ROOT_USERNAME: default-user
  • MONGO_INITDB_ROOT_PASSWORD: ${MONGOSECRET:-mongosecret}

This shows that even in a Docker Compose deployment, you need to manually specify the initial root username and password for MongoDB. Similarly, when using the Helm chart in Kubernetes with mongoOperatorEnabled=false, the chart will not automatically create users or enable authentication. You must manually provide these environment variables to the MongoDB pod in your values.yaml to ensure the MongoDB instance initializes with an authenticated user.
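
Once the MongoDB pods are initialized with a root user, every client has to present the same credentials. As a rough illustration of what that means on the client side, here is a minimal Python sketch that builds a connection URI from a username and password using only the standard library. The host names and the replica set name in the usage example are hypothetical placeholders, not values taken from the TAO Helm chart; authSource=admin is used on the assumption that the root user is the one created from the MONGO_INITDB_ROOT_* variables.

```python
from urllib.parse import quote_plus


def build_mongo_uri(username, password, hosts, replica_set=None, auth_source="admin"):
    """Build a mongodb:// URI with percent-encoded credentials.

    Special characters such as '@' or ':' in the username or password
    must be percent-encoded, otherwise the URI is ambiguous.
    """
    creds = f"{quote_plus(username)}:{quote_plus(password)}"
    uri = f"mongodb://{creds}@{','.join(hosts)}/?authSource={auth_source}"
    if replica_set:
        uri += f"&replicaSet={replica_set}"
    return uri


if __name__ == "__main__":
    # Hypothetical host names and replica set name, for illustration only.
    print(build_mongo_uri(
        "default-user",
        "mongosecret",
        ["mongodb-0.example:27017", "mongodb-1.example:27017"],
        replica_set="rs0",
    ))
```

If the username or password used here differs from what the MongoDB container was initialized with, the server answers with code 18 (AuthenticationFailed), which matches the errors in the logs above.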

When mongoOperatorEnabled=false, the Helm chart’s role is equivalent to kubectl apply -f <xxx.yaml>. It is only responsible for deploying the MongoDB container, but it is not responsible for the initialization logic inside the container.

The actual initialization is handled by the official MongoDB image’s entrypoint script. This script checks for the presence of MONGO_INITDB_ROOT_USERNAME and MONGO_INITDB_ROOT_PASSWORD environment variables. If they exist, it creates the user and starts the mongod process with authentication enabled. If they don’t exist, it starts without authentication. Therefore, the initialization is not handled “automatically” by the Helm chart; it is triggered by you when you provide the correct environment variables.
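
The decision described above can be modeled in a few lines of Python. This is only a sketch of the logic as explained in this reply, not the image's actual entrypoint script (which is a shell script). The function name mongo_auth_mode and the "misconfigured" result for the case where only one of the two variables is set are assumptions made for illustration.

```python
def mongo_auth_mode(env):
    """Model how the two MONGO_INITDB_ROOT_* variables decide the auth mode.

    env is a mapping of environment variable names to values,
    for example os.environ or a plain dict.
    """
    user = env.get("MONGO_INITDB_ROOT_USERNAME")
    password = env.get("MONGO_INITDB_ROOT_PASSWORD")
    if user and password:
        # Both present: the root user is created and auth is enabled.
        return "auth"
    if user or password:
        # Assumption for this sketch: a half-set pair is treated as an error.
        return "misconfigured"
    # Neither present: mongod starts without authentication.
    return "noauth"


if __name__ == "__main__":
    print(mongo_auth_mode({
        "MONGO_INITDB_ROOT_USERNAME": "default-user",
        "MONGO_INITDB_ROOT_PASSWORD": "mongosecret",
    }))
    print(mongo_auth_mode({}))
```

A check like this can be run against the environment of the MongoDB container spec to confirm that both variables are actually reaching the pod before looking further into the TAO API pods.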

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.
