Unable to configure gpt-4o with VSS instead of VILA using OpenAI Azure API key

I have LaunchPad access with 8 NVIDIA H100 NVL GPUs. I am able to deploy and use VSS with NVILA, but I am unable to use the gpt-4o model with VSS in place of NVILA. I followed the instructions in the NVIDIA documentation (Configure the VLM, Video Search and Summarization Agent). Attaching images of the overrides file that I modified to use gpt-4o through an OpenAI Azure API key.

Below are the commands:
OPENAI_API_KEY='XXXXXXXXXXXXXXX'

NGC_API_KEY='nvapi-XXXXXXXXXXXXXXXXXXX'

kubectl create secret docker-registry ngc-docker-reg-secret --docker-server=nvcr.io --docker-username='$oauthtoken' --docker-password=$NGC_API_KEY

kubectl create secret generic graph-db-creds-secret --from-literal=username=neo4j --from-literal=password=password

kubectl create secret generic openai-api-key-secret --from-literal=OPENAI_API_KEY=$OPENAI_API_KEY

helm fetch https://helm.ngc.nvidia.com/nvidia/blueprint/charts/nvidia-blueprint-vss-2.2.0.tgz --username='$oauthtoken' --password=$NGC_API_KEY

helm install vss-blueprint nvidia-blueprint-vss-2.2.0.tgz --set global.ngcImagePullSecretName=ngc-docker-reg-secret -f overrides_gpt.yaml
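Before running the `helm install`, it can help to confirm that every secret the chart references actually exists in the target namespace. A minimal sketch (the namespace name `vsstest` is an assumption; the secret list is taken from the commands above plus `ngc-api-key-secret`, which the blueprint chart also mounts):

```shell
# Sanity check: confirm each secret the chart references exists
# in the target namespace before installing.
NAMESPACE=vsstest  # assumed namespace; replace with your own
for s in ngc-docker-reg-secret graph-db-creds-secret openai-api-key-secret ngc-api-key-secret; do
  if kubectl get secret "$s" -n "$NAMESPACE" >/dev/null 2>&1; then
    echo "ok:      $s"
  else
    echo "MISSING: $s"
  fi
done
```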

I am getting an error while deploying; attached image for reference:

Content of overrides file:


vss:
  applicationSpecs:
    vss-deployment:
      containers:
        vss:
          image:
            repository: nvcr.io/nvidia/blueprint/vss-engine
            tag: 2.2.0 # Update to override with custom VSS image
          env:
            - name: VLM_MODEL_TO_USE
              value: openai-compat
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: openai-api-key-secret
                  key: OPENAI_API_KEY
            - name: DISABLE_GUARDRAILS
              value: "false" # "true" to disable guardrails.
            - name: TRT_LLM_MODE
              value: ""  # int4_awq (default), int8 or fp16. (for VILA only)
            - name: VLM_BATCH_SIZE
              value: ""  # Default is determined based on GPU memory. (for VILA only)
            - name: VIA_VLM_OPENAI_MODEL_DEPLOYMENT_NAME
              value: "gpt-4o"  # Set to use a VLM exposed as a REST API with OpenAI compatible API (e.g. gpt-4o)
            - name: VIA_VLM_ENDPOINT
              value: "https://usncoai0kua.openai.azure.com"  # Default OpenAI API. Override to use a custom API
            - name: VIA_VLM_API_KEY
              value: "XXXXXXXXXXXXXXXXXXXXXXX"  # API key to set when calling VIA_VLM_ENDPOINT
            - name: OPENAI_API_VERSION
              value: "2024-05-01-preview"
            - name: AZURE_OPENAI_API_VERSION
              value: "2024-05-01-preview"
            - name: AZURE_OPENAI_ENDPOINT
              value: "https://usncoai0kua.openai.azure.com"

  resources:
    limits:
      nvidia.com/gpu: 2   # Set to 8 for 2 x 8H100 node deployment
  # nodeSelector:
  #   kubernetes.io/hostname: <node-1>

nim-llm:
  resources:
    limits:
      nvidia.com/gpu: 4
  # nodeSelector:
  #   kubernetes.io/hostname: <node-2>

nemo-embedding:
  resources:
    limits:
      nvidia.com/gpu: 1  # Set to 2 for 2 x 8H100 node deployment
  # nodeSelector:
  #   kubernetes.io/hostname: <node-2>

nemo-rerank:
  resources:
    limits:
      nvidia.com/gpu: 1  # Set to 2 for 2 x 8H100 node deployment
  # nodeSelector:
  #   kubernetes.io/hostname: <node-2>
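As a quick sanity check on the Azure values in the overrides above: the Azure OpenAI chat-completions URL is built from the endpoint, the deployment name, and the API version. A minimal sketch that only assembles the URL from those same values without sending any request (the endpoint and deployment are the ones from the overrides file):

```python
# Assemble the Azure OpenAI chat-completions URL from the values
# used in the overrides file, to verify they combine as expected.
def azure_chat_url(endpoint: str, deployment: str, api_version: str) -> str:
    # Azure OpenAI REST path: /openai/deployments/<deployment>/chat/completions
    return (f"{endpoint.rstrip('/')}/openai/deployments/"
            f"{deployment}/chat/completions?api-version={api_version}")

url = azure_chat_url(
    endpoint="https://usncoai0kua.openai.azure.com",
    deployment="gpt-4o",
    api_version="2024-05-01-preview",
)
print(url)
```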



vss_deploy_logs.txt (8.3 KB)
vss_blueprint_logs.txt (3.6 KB)
rerank_logs.txt (3.3 KB)
embed_logs.txt (3.3 KB)

Could you refer to our FAQ and attach the detailed log information?

Attached are detailed logs for all pods that are not running.

  Warning  FailedMount  2m10s (x2177 over 3d1h)  kubelet  MountVolume.SetUp failed for volume "secret-ngc-api-key-volume" : secret "ngc-api-key-secret" not found

Have you created the ngc-api-key-secret secret?

Yes, I have created and used the NGC API key for deploying VSS with NVILA. I am able to deploy that successfully, but when I change the model to gpt-4o, the deployment fails.

kubectl get secrets

Could you check if your helm chart and the secret are deployed in the same namespace?

Yes, they are in the same namespace.
secrets


Pods:

From the image you attached, there is no ngc-api-key-secret in your vsstest namespace.
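For reference, the missing secret can be created from the same NGC key used earlier. A sketch of the generic-secret form (the key name `NGC_API_KEY` follows the blueprint documentation, and the `vsstest` namespace is taken from this thread; verify both against your chart version):

```shell
# Create the generic secret that the chart mounts as
# "secret-ngc-api-key-volume"; $NGC_API_KEY was exported earlier.
kubectl create secret generic ngc-api-key-secret \
  --from-literal=NGC_API_KEY=$NGC_API_KEY -n vsstest
```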

Thanks for pointing that out. It solved the deployment error. Now I am able to access the UI, but when I try to use the summarization API I get the error below:

Attached logs for your reference.
summary_logs.txt (761.7 KB)

Response body for summarization:

{
  "id": "418fe290-a491-4c24-b32e-afcf08e3ee5f",
  "prompt": "Write a concise and clear dense caption for the provided video",
  "model": "gpt-4o",
  "api_type": "internal",
  "response_format": {
    "type": "text"
  },
  "stream": false,
  "stream_options": {
    "include_usage": false
  },
  "max_tokens": 512,
  "temperature": 0.2,
  "top_p": 1,
  "top_k": 100,
  "seed": 10,
  "chunk_duration": 60,
  "chunk_overlap_duration": 10,
  "summary_duration": 60,
  "media_info": {
    "type": "offset",
    "start_offset": 0,
    "end_offset": 4000000000
  },
  "user": "user-123",
  "caption_summarization_prompt": "Prompt for caption summarization",
  "summary_aggregation_prompt": "Prompt for summary aggregation",
  "graph_rag_prompt_yaml": "",
  "tools": [],
  "summarize": true,
  "enable_chat": true,
  "num_frames_per_chunk": 10,
  "vlm_input_width": 10,
  "vlm_input_height": 10,
  "summarize_batch_size": 5,
  "rag_type": "graph-rag",
  "rag_top_k": 5,
  "rag_batch_size": 5
}

Hi @ina.khandelwal, could you try with a sample video packaged inside the VSS container? We have tried it and it works well.

  • Summarize the bridge video using the Gradio UI (to get the file added to the backend)
  • From the Swagger UI, execute the GET /files API to get the asset ID of the bridge file
  • Then use the summarize API from the Swagger UI. Change the “id” to the asset ID of the bridge file and remove “api_type” before running the summarize API.


This is the request body I tried after removing api_type. The status is 200, but I am not getting any content in the response:

{
  "id": "fb796833-74f0-4fe5-b288-4b2af5fb0e10",
  "prompt": "Write a concise and clear dense caption for the provided warehouse video",
  "model": "gpt-4o",
  "response_format": {
    "type": "text"
  },
  "stream": false,
  "stream_options": {
    "include_usage": false
  },
  "max_tokens": 512,
  "temperature": 0.2,
  "top_p": 1,
  "top_k": 100,
  "seed": 10,
  "chunk_duration": 60,
  "chunk_overlap_duration": 10,
  "summary_duration": 60,
  "media_info": {
    "type": "offset",
    "start_offset": 0,
    "end_offset": 4000000000
  },
  "user": "user-123",
  "caption_summarization_prompt": "Prompt for caption summarization",
  "summary_aggregation_prompt": "Prompt for summary aggregation",
  "graph_rag_prompt_yaml": "",
  "tools": [],
  "summarize": true,
  "enable_chat": false,
  "num_frames_per_chunk": 10,
  "vlm_input_width": 10,
  "vlm_input_height": 10,
  "summarize_batch_size": 5,
  "rag_type": "graph-rag",
  "rag_top_k": 5,
  "rag_batch_size": 5
}

Output:

I tried to use the same parameters as yours but get the error below in that case:

Did you use the “bridge video” from our samples?

The UI is working fine, but I want to use the APIs, and I get that error when using the summarization API. It should work for other videos as well; even for the default videos the API is not working.

After you switched the VLM to gpt-4o, does the UI still work?
We suggest that you first load the “bridge video” using the Gradio UI to obtain the file ID. Then you can use the API to get the summarization.

Do you mean that even using our “bridge video” is not working?

The UI is working fine for all videos, but when I try to use the API for summarization on any video, default or external, it does not return the content.