Hosted Integrate /v1/responses returns 404 across multiple models while /v1/models and /v1/chat/completions work

I ran a hosted compatibility repro against https://integrate.api.nvidia.com/v1 and collected the artifacts in a repro repo.

This question is narrower than “Codex support” or “NIM support” in general: I’m trying to pin down the current hosted Integrate contract for /v1/responses.
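
For concreteness, this is the shape of the call in question. It is a minimal sketch, assuming an OpenAI-style Responses payload and an API key in a hypothetical NVIDIA_API_KEY environment variable; the exact request bodies are in the repro artifacts.

```python
# Minimal /v1/responses probe against hosted Integrate.
# Assumptions: OpenAI-style Responses body, key in NVIDIA_API_KEY (name illustrative).
import os
import requests

resp = requests.post(
    "https://integrate.api.nvidia.com/v1/responses",
    headers={"Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}"},
    json={"model": "meta/llama-3.3-70b-instruct", "input": "ping"},
)
print(resp.status_code)  # 404 in the repro
print(resp.text)         # "404 page not found" in the repro
```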

Exact results:

  • direct GET /v1/models returned 200
  • direct POST /v1/chat/completions returned 200
  • direct POST /v1/responses returned 404 with a plain “404 page not found” body
  • the same /v1/responses 404 also appeared in a widened six-model matrix on the same hosted Integrate surface, spanning NVIDIA, Meta, and Mistral instruct models (see the matrix probe sketch after this list):
    • nvidia/nemotron-3-super-120b-a12b
    • nvidia/llama-3.3-nemotron-super-49b-v1
    • nvidia/llama-3.1-nemotron-70b-instruct
    • nvidia/nemotron-mini-4b-instruct
    • meta/llama-3.3-70b-instruct
    • mistralai/mistral-large
  • NVIDIA’s current LLM NIM release notes describe experimental Responses API support, which is why I’m asking whether this hosted result is expected
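
For reference, here is a sketch of the widened matrix probe, under the same assumptions as above (hypothetical NVIDIA_API_KEY, minimal OpenAI-style payloads); the actual per-model request/response captures live in the repro artifacts.

```python
# Sketch of the six-model matrix probe: for each model, POST the same
# minimal bodies to /v1/chat/completions (200 in the repro) and
# /v1/responses (404 in the repro). Payload shapes are assumptions,
# not necessarily the exact repro requests.
import os
import requests

BASE = "https://integrate.api.nvidia.com/v1"
HEADERS = {"Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}"}

MODELS = [
    "nvidia/nemotron-3-super-120b-a12b",
    "nvidia/llama-3.3-nemotron-super-49b-v1",
    "nvidia/llama-3.1-nemotron-70b-instruct",
    "nvidia/nemotron-mini-4b-instruct",
    "meta/llama-3.3-70b-instruct",
    "mistralai/mistral-large",
]

# /v1/models takes no model id, so check it once up front.
print("GET /v1/models ->", requests.get(f"{BASE}/models", headers=HEADERS).status_code)

for model in MODELS:
    chat = requests.post(
        f"{BASE}/chat/completions",
        headers=HEADERS,
        json={
            "model": model,
            "messages": [{"role": "user", "content": "ping"}],
            "max_tokens": 8,
        },
    )
    responses = requests.post(
        f"{BASE}/responses",
        headers=HEADERS,
        json={"model": model, "input": "ping"},
    )
    print(f"{model}: chat/completions={chat.status_code} responses={responses.status_code}")
```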

The narrow questions are:

  • is /v1/responses expected to work today on the hosted Integrate surface?
  • if yes, is it limited to a different subset of models, accounts, or endpoint variants than the ones tested here?

If it helps, I can also provide the exact per-model request/response captures from the repro repo artifacts.