Having an issue running the NeMo Guardrails Docker image

I tried to run the NeMo Guardrails Docker image, nvcr.io/nvidia/nemo-microservices/guardrails:25.11. I mounted my config.yml file into the container, and I have a self-hosted LLM deployed through the NVIDIA NIM framework.

This is my //config-store/nemoguard/config.yml

models:
  - type: main
    engine: nim
    parameters:
      base_url: https://<masked>/v1
      model_name: Meta-Llama-3.3-70B-Instruct

  - type: content_safety
    engine: nim
    parameters:
      base_url: https://<masked>/v1
      model_name: Qwen2.5-0.5B-Instruct

  - type: topic_control
    engine: nim
    parameters:
      base_url: https://<masked>/v1
      model_name: Qwen2.5-0.5B-Instruct

rails:
  input:
    parallel: true
    flows:
      - content safety check input $model=content_safety
      - topic safety check input $model=topic_control
  output:
    parallel: false
    flows:
      - content safety check output $model=content_safety
prompts:
  - task: content_safety_check_input $model=content_safety
    content: |
      Task: Check if there is unsafe content in the user or agent messages in the below conversation in accordance with our safety policy with the below categories. Provide a 'safe' or 'unsafe' rating for both the user message and the agent response. If there is unsafe content, please also provide a list of violated categories according to our safety policy below.

      <BEGIN UNSAFE CONTENT CATEGORIES>
      S1: Violence.
      S2: Sexual.
      S3: Criminal Planning/Confessions.
      S4: Guns and Illegal Weapons.
      S5: Controlled/Regulated Substances.
      S6: Suicide and Self Harm.
      S7: Sexual (minor).
      S8: Hate/Identity Hate.
      S9: PII/Privacy.
      S10: Harassment.
      S11: Threat.
      S12: Profanity.
      S13: Needs Caution.
      S14: Other.
      S15: Manipulation.
      S16: Fraud/Deception.
      S17: Malware.
      S18: High Risk Gov Decision Making.
      S19: Political/Misinformation/Conspiracy.
      S20: Copyright/Trademark/Plagiarism.
      S21: Unauthorized Advice.
      S22: Illegal Activity.
      S23: Immoral/Unethical.
      <END UNSAFE CONTENT CATEGORIES>

      <BEGIN CONVERSATION>

      user: {{ user_input }}

      <END CONVERSATION>

      Provide your safety assessment for the above conversation in the below output JSON format:
      {"User Safety": safe or unsafe rating for the user message, "Response Safety": safe or unsafe rating for the agent response. Omit if no agent response present. "Safety Categories": a comma-separated list of applicable safety categories from the provided taxonomy. Omit if all safe.}

      Do not include anything other than the output JSON in your response.
      Output JSON:
    output_parser: nemoguard_parse_prompt_safety
    max_tokens: 50

  - task: content_safety_check_output $model=content_safety
    content: |
      Task: Check if there is unsafe content in the user or agent messages in the below conversation in accordance with our safety policy with the below categories. Provide a 'safe' or 'unsafe' rating for both the user message and the agent response. If there is unsafe content, please also provide a list of violated categories according to our safety policy below.

      <BEGIN UNSAFE CONTENT CATEGORIES>
      S1: Violence.
      S2: Sexual.
      S3: Criminal Planning/Confessions.
      S4: Guns and Illegal Weapons.
      S5: Controlled/Regulated Substances.
      S6: Suicide and Self Harm.
      S7: Sexual (minor).
      S8: Hate/Identity Hate.
      S9: PII/Privacy.
      S10: Harassment.
      S11: Threat.
      S12: Profanity.
      S13: Needs Caution.
      S14: Other.
      S15: Manipulation.
      S16: Fraud/Deception.
      S17: Malware.
      S18: High Risk Gov Decision Making.
      S19: Political/Misinformation/Conspiracy.
      S20: Copyright/Trademark/Plagiarism.
      S21: Unauthorized Advice.
      S22: Illegal Activity.
      S23: Immoral/Unethical.
      <END UNSAFE CONTENT CATEGORIES>

      <BEGIN CONVERSATION>

      user: {{ user_input }}

      response: agent: {{ bot_response }}

      <END CONVERSATION>

      Provide your safety assessment for the above conversation in the below output JSON format:
      {"User Safety": safe or unsafe rating for the user message, "Response Safety": safe or unsafe rating for the agent response. Omit if no agent response present. "Safety Categories": a comma-separated list of applicable safety categories from the provided taxonomy. Omit if all safe.}

      Do not include anything other than the output JSON in your response.
      Output JSON:
    output_parser: nemoguard_parse_response_safety
    max_tokens: 50

  - task: topic_safety_check_input $model=topic_control
    content: |

This is the script that I run:

docker run --rm -it \
  -v /<masked path>/config-store:/config-store \
  -e CONFIG_STORE_PATH="/config-store" \
  -e NVIDIA_API_KEY=$NGC_API_KEY \
  -e DEFAULT_CONFIG_ID="default" \
  -e DEFAULT_LLM_PROVIDER="nim" \
  -e NEMO_GUARDRAILS_SERVER_ENABLE_CORS="False" \
  -e NEMO_GUARDRAILS_SERVER_ALLOWED_ORIGINS="*" \
  -u $(id -u) \
  -p 7331:7331 \
  $IMG_NAME

But I get an error when I call the http://localhost:7331/v1/guardrail/chat/completions endpoint.
This is the code I run, followed by the server log:

import json

import requests

# Set the base URL of the guardrails service
guardrails_base_url = "http://localhost:7331"

# Set the API endpoint and parameters
endpoint = f"{guardrails_base_url}/v1/guardrail/chat/completions"
params = {
    "model": "Meta-Llama-3.3-70B-Instruct",
    "messages": [{"role": "user", "content": "You are stupid"}],
    "guardrails": {"config_id": "nemoguard"},
    "temperature": 0.0,
    "top_p": 1,
}

# Make the API request
response = requests.post(
    endpoint, headers={"Content-Type": "application/json"}, data=json.dumps(params)
)

# Print the response
print(response.json())
INFO:     172.16.2.1:38880 - "POST /v1/guardrail/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/app/.venv/lib/python3.11/site-packages/nemoguardrails/llm/models/langchain_initializer.py", line 161, in init_langchain_model
    result = try_initialization_method(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/nemoguardrails/llm/models/langchain_initializer.py", line 106, in try_initialization_method
    result = initializer.execute(
             ^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/nemoguardrails/llm/models/langchain_initializer.py", line 74, in execute
    return self.init_method(model_name, provider_name, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/nemoguardrails/llm/models/langchain_initializer.py", line 253, in _init_text_completion_model
    provider_cls = _get_text_completion_provider(provider_name)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/nemoguardrails/llm/providers/providers.py", line 185, in _get_text_completion_provider
    raise RuntimeError(f"Could not find LLM provider '{provider_name}'")
RuntimeError: Could not find LLM provider 'nimchat'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/.venv/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/app/.venv/lib/python3.11/site-packages/starlette/applications.py", line 113, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 182, in __call__
    with recv_stream, send_stream, collapse_excgroups():
  File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/app/.venv/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
    raise exc
  File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 184, in __call__
    response = await self.dispatch_func(request, call_next)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/guardrails/middlewares.py", line 25, in capture_trace_id
    response = await call_next(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 159, in call_next
    raise app_exc
  File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 144, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 182, in __call__
    with recv_stream, send_stream, collapse_excgroups():
  File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/app/.venv/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
    raise exc
  File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 184, in __call__
    response = await self.dispatch_func(request, call_next)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/guardrails/middlewares.py", line 43, in add_request_id_header
    response = await call_next(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 159, in call_next
    raise app_exc
  File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 144, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 182, in __call__
    with recv_stream, send_stream, collapse_excgroups():
  File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/app/.venv/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
    raise exc
  File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 184, in __call__
    response = await self.dispatch_func(request, call_next)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/guardrails/middlewares.py", line 62, in log_requests
    raise e
  File "/app/.venv/lib/python3.11/site-packages/guardrails/middlewares.py", line 56, in log_requests
    response = await call_next(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 159, in call_next
    raise app_exc
  File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 144, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "/app/.venv/lib/python3.11/site-packages/opentelemetry/instrumentation/asgi/__init__.py", line 735, in __call__
    await self.app(scope, otel_receive, otel_send)
  File "/app/.venv/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 63, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/app/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "/app/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/app/.venv/lib/python3.11/site-packages/starlette/routing.py", line 716, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/app/.venv/lib/python3.11/site-packages/starlette/routing.py", line 736, in app
    await route.handle(scope, receive, send)
  File "/app/.venv/lib/python3.11/site-packages/starlette/routing.py", line 290, in handle
    await self.app(scope, receive, send)
  File "/app/.venv/lib/python3.11/site-packages/starlette/routing.py", line 78, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/app/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "/app/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/app/.venv/lib/python3.11/site-packages/starlette/routing.py", line 75, in app
    response = await f(request)
               ^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/fastapi/routing.py", line 302, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/fastapi/routing.py", line 213, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/guardrails/apis/v1/routers/openai/chat.py", line 102, in chat_completion
    result = await request_handler.handle_request()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/guardrails/handlers/completion.py", line 102, in handle_request
    await self.instantiate_llm_rails(config_ids, token)
  File "/app/.venv/lib/python3.11/site-packages/guardrails/handlers/completion.py", line 338, in instantiate_llm_rails
    self.llm_rails = await self.rails_service.get_rails(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/guardrails/services/rails/service.py", line 45, in get_rails
    llm_rails = LLMRails(config=rails_config)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/nemoguardrails/rails/llm/llmrails.py", line 256, in __init__
    self._init_llms()
  File "/app/.venv/lib/python3.11/site-packages/nemoguardrails/rails/llm/llmrails.py", line 446, in _init_llms
    self.llm = init_llm_model(
               ^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/nemoguardrails/llm/models/initializer.py", line 51, in init_llm_model
    return init_langchain_model(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/nemoguardrails/llm/models/langchain_initializer.py", line 194, in init_langchain_model
    raise ModelInitializationError(base) from last_exception
nemoguardrails.llm.models.langchain_initializer.ModelInitializationError: Failed to initialize model 'Meta-Llama-3.3-70B-Instruct' with provider 'nimchat' in 'chat' mode: Could not find LLM provider 'nimchat'

I think the error is caused by the langchain-nvidia-ai-endpoints library missing from the image. Is that what leads to the error? If so, can you fix it in the next release of the NeMo Guardrails Docker image?
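
One way to test this hypothesis would be to check, from a Python shell inside the container, whether the package can be found. This is only a minimal sketch: it assumes the import name is langchain_nvidia_ai_endpoints (derived from the PyPI package name), and the helper is_installed is a hypothetical name, not part of NeMo Guardrails.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Import name assumed from the PyPI package langchain-nvidia-ai-endpoints.
print(is_installed("langchain_nvidia_ai_endpoints"))
```

A result of False would support the missing-library theory; True would suggest the cause lies elsewhere, for example in how the configuration is loaded.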

Hello. We have stopped supporting YAML files as configuration in the config-store. Please follow the docs and add any new configuration with the curl command instead. The config you posted above is correct, so you can add it as it is.

Here is the link to follow: Creating a guardrails configuration
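
For anyone who prefers Python to curl, registering the config through the API might look roughly like the sketch below. The /v1/guardrail/configs path, the name/namespace/data payload fields, and the "default" namespace are assumptions that should be verified against the linked docs; build_config_request is a hypothetical helper name. The request is only built here, not sent.

```python
import json
import urllib.request

# Assumed values: verify the endpoint path and payload shape against the
# "Creating a guardrails configuration" docs before relying on this.
GUARDRAILS_BASE_URL = "http://localhost:7331"
CONFIGS_PATH = "/v1/guardrail/configs"  # assumption, not confirmed in this thread


def build_config_request(name, config, namespace="default"):
    """Build (but do not send) a POST request that registers a guardrails config."""
    payload = {"name": name, "namespace": namespace, "data": config}
    return urllib.request.Request(
        url=GUARDRAILS_BASE_URL + CONFIGS_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Abbreviated form of the config from the question, as a Python dict.
# The remaining models, rails, and prompts would be added in full.
config = {
    "models": [
        {
            "type": "main",
            "engine": "nim",
            "parameters": {
                "base_url": "https://<masked>/v1",
                "model_name": "Meta-Llama-3.3-70B-Instruct",
            },
        },
    ],
}

request = build_config_request("nemoguard", config)
# To actually send it: urllib.request.urlopen(request)
```

The config name used here matches the config_id ("nemoguard") passed in the chat completions request above, so that request would resolve to this config once it is registered.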