- Hardware Platform (GPU model and numbers): NVIDIA H100 80GB HBM3 × 8
- System Memory: 2.0Ti total, 1.7Ti available
- Ubuntu Version: Ubuntu 22.04.4 LTS
- NVIDIA GPU Driver Version (valid for GPU only): NVIDIA-SMI 550.90.07
- Issue Type: Questions
Hello NVIDIA team,

I am currently trying to use OpenAI’s GPT-4o model instead of the default llm-svc in the VSS Blueprint Helm Chart. To achieve this, I configured the overrides.yaml file as follows.

Configuration (overrides.yaml):
```yaml
nim-llm:
  env:
    - name: NVIDIA_VISIBLE_DEVICES
      value: "0,1,2,3"
  resources:
    limits:
      nvidia.com/gpu: 0  # no limit
vss:
  applicationSpecs:
    vss-deployment:
      containers:
        vss:
          startupProbe:
            failureThreshold: 360
          env:
            - name: VLM_MODEL_TO_USE
              value: openai-compat
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: openai-api-key-secret
                  key: OPENAI_API_KEY
            - name: OPENAI_API_KEY_NAME
              value: OPENAI_API_KEY
            - name: MODEL_PATH
              value: "ngc:nim/nvidia/vila-1.5-40b:vila-yi-34b-siglip-stage3_1003_video_v8"
            - name: NVIDIA_VISIBLE_DEVICES
              value: "4,5,6,7"
            - name: ASSET_STORAGE_DIR  # custom upload directory
              value: "/tmp/custom-asset-dir"
            - name: EXAMPLE_STREAMS_DIR  # custom example directory
              value: "/tmp/custom-example-streams-dir"
          resources:
            limits:
              nvidia.com/gpu: 0
  extraPodVolumes:
    - name: custom-asset-dir
      hostPath:
        path: /home/nvadmin/Workspace/blueprint/video_uploads  # custom upload directory on host
    - name: custom-example-streams-dir
      hostPath:
        path: /home/nvadmin/Workspace/blueprint/video_examples  # custom example directory on host
  extraPodVolumeMounts:
    - name: custom-asset-dir
      mountPath: /tmp/custom-asset-dir
    - name: custom-example-streams-dir
      mountPath: /tmp/custom-example-streams-dir
  configs:
    ca_rag_config.yaml:
      chat:
        embedding:
          base_url: http://nemo-embedding-embedding-deployment-embedding-service:8000/v1
        llm:
          base_url: https://api.openai.com/v1
          model: gpt-4o
        reranker:
          base_url: http://nemo-rerank-ranking-deployment-ranking-service:8000/v1
      summarization:
        embedding:
          base_url: http://nemo-embedding-embedding-deployment-embedding-service:8000/v1
        llm:
          base_url: https://api.openai.com/v1
          model: gpt-4o
    guardrails_config.yaml:
      models:
        - engine: nim
          model: gpt-4o
          parameters:
            base_url: https://api.openai.com/v1
          type: main
        - engine: nim_patch
          model: nvidia/llama-3.2-nv-embedqa-1b-v2
          parameters:
            base_url: http://nemo-embedding-embedding-deployment-embedding-service:8000/v1
          type: embeddings
nemo-embedding:
  applicationSpecs:
    embedding-deployment:
      containers:
        embedding-container:
          env:
            - name: NGC_API_KEY
              valueFrom:
                secretKeyRef:
                  key: NGC_API_KEY
                  name: ngc-api-key-secret
            - name: NVIDIA_VISIBLE_DEVICES
              value: '4'
          resources:
            limits:
              nvidia.com/gpu: 0
nemo-rerank:
  applicationSpecs:
    ranking-deployment:
      containers:
        ranking-container:
          env:
            - name: NGC_API_KEY
              valueFrom:
                secretKeyRef:
                  key: NGC_API_KEY
                  name: ngc-api-key-secret
            - name: NVIDIA_VISIBLE_DEVICES
              value: '4'
          resources:
            limits:
              nvidia.com/gpu: 0
```
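For completeness, the openai-api-key-secret referenced by the secretKeyRef above is a plain Kubernetes Secret. A minimal sketch of an equivalent manifest (the key value here is a placeholder, not the real key):

```yaml
# Sketch of the Secret referenced above; metadata.name and the stringData
# key must match the secretKeyRef in overrides.yaml exactly.
apiVersion: v1
kind: Secret
metadata:
  name: openai-api-key-secret
type: Opaque
stringData:
  OPENAI_API_KEY: "sk-..."  # placeholder value
```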
After deploying with this configuration, I encountered the following error in the vss-deployment pod:
```
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/via/via-engine/via_server.py", line 1154, in run
    self._stream_handler = ViaStreamHandler(self._args)
  File "/opt/nvidia/via/via-engine/via_stream_handler.py", line 422, in __init__
    response = asyncio.run(
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/dist-packages/nemoguardrails/rails/llm/llmrails.py", line 688, in generate_async
    new_events = await self.runtime.generate_events(
  File "/usr/local/lib/python3.10/dist-packages/nemoguardrails/colang/v1_0/runtime/runtime.py", line 167, in generate_events
    next_events = await self._process_start_action(events)
  File "/usr/local/lib/python3.10/dist-packages/nemoguardrails/colang/v1_0/runtime/runtime.py", line 363, in _process_start_action
    result, status = await self.action_dispatcher.execute_action(
  File "/usr/local/lib/python3.10/dist-packages/nemoguardrails/actions/action_dispatcher.py", line 253, in execute_action
    raise e
  File "/usr/local/lib/python3.10/dist-packages/nemoguardrails/actions/action_dispatcher.py", line 214, in execute_action
    result = await result
  File "/usr/local/lib/python3.10/dist-packages/nemoguardrails/library/self_check/input_check/actions.py", line 71, in self_check_input
    response = await llm_call(llm, prompt, stop=stop)
  File "/usr/local/lib/python3.10/dist-packages/nemoguardrails/actions/llm/utils.py", line 96, in llm_call
    raise LLMCallException(e)
nemoguardrails.actions.llm.utils.LLMCallException: LLM Call Exception: [###] {'message': 'Incorrect API key provided: **********. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}
{'error': {'message': 'Incorrect API key provided: **********. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/via/via-engine/via_server.py", line 2481, in <module>
    server.run()
  File "/tmp/via/via-engine/via_server.py", line 1156, in run
    raise ViaException(f"Failed to load VIA stream handler - {str(ex)}")
via_exception.ViaException: ViaException - code: InternalServerError message: Failed to load VIA stream handler - LLM Call Exception: [###] {'message': 'Incorrect API key provided: **********. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}
{'error': {'message': 'Incorrect API key provided: **********. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}
```
However, when I manually executed the following script inside the vss-vss-deployment pod, it worked correctly, which suggests that the API key itself is not the issue:
```python
import openai  # the client reads OPENAI_API_KEY from the environment

models = openai.models.list()
print(models)
```
Output (truncated):

```
SyncPage[Model](data=[Model(id='omni-moderation-2024-09-26', created=1732734466, object='model', owned_by='system'), Model(id='gpt-4o-mini-audio-preview-2024-12-17', created=1734115920, object='model', owned_by='system'), Model(id='dall-e-3', created=1698785189, object='model', owned_by='system'), Model(id='dall-e-2', created=1698798177, object='model', owned_by='system'), Model(id='gpt-4o-audio-preview-2024-10-01', created=1727389042, object='model', owned_by='system'), Model(id='o1', created=1734375816, object='model', owned_by='system'), Model(id='gpt-4o-audio-preview', created=1727460443, object='model', owned_by='system'), Model(id='gpt-4o-mini-realtime-preview-2024-12-17', created=1734112601, object='model', owned_by='system'), Model(id='o1-2024-12-17', created
....
```
The script successfully retrieved the available models from OpenAI, indicating that authentication is working in this environment.
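Since the call that fails in the traceback is a chat completion issued by the guardrails self_check_input action rather than a model listing, a closer reproduction of that path would be something like the following sketch (same OPENAI_API_KEY from the pod environment, gpt-4o as configured above):

```python
from openai import OpenAI

# Sketch: reproduce the failing call path from the traceback -- a chat
# completion against gpt-4o -- instead of only listing models. The client
# picks up OPENAI_API_KEY from the environment, as in the deployment.
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=5,
)
print(response.choices[0].message.content)
```

If this also succeeds with the same key, the mismatch is presumably in how the guardrails configuration resolves its API key rather than in the key itself.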
Question:
How can I properly configure the VSS Blueprint to use OpenAI’s GPT-4o instead of llm-svc without encountering the API key error? Are there any additional environment variables or configurations I should modify?
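In case a variable-name mismatch is the culprit, here is a quick sketch (nothing VSS-specific, just standard os.environ) of how I can list the key-related variables that are actually set inside the vss container:

```python
import os

# Sketch: print API-key-related environment variables inside the vss
# container, masking values, to rule out a variable-name mismatch.
for name, value in sorted(os.environ.items()):
    if "API_KEY" in name or "OPENAI" in name:
        print(name, "=", (value[:4] + "...") if value else "<empty>")
```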
Any guidance would be greatly appreciated.
Thank you!