The model output contains the literal control token <|return|> inside the returned message content.
The returned completion object does not include a reasoning_content field, even though reasoning_effort is set.
When reasoning_effort is provided, the response should include a reasoning_content field (or a documented equivalent) containing the model's reasoning. If reasoning output is unavailable, the documentation should say so explicitly.
Internal control tokens such as <|return|> should not appear in user-visible output.
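from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key="",  # key omitted here
)

completion = client.chat.completions.create(
    model="openai/gpt-oss-120b",
    reasoning_effort="medium",
    messages=[{"role": "user", "content": "who are you?"}],
    temperature=0.6,
    top_p=0.9,
    max_tokens=4096,
    response_format={"type": "json_object"},
)

# Inspect the two fields this report is about.
message = completion.choices[0].message
print(message.content)  # observed: contains the literal <|return|>
print(getattr(message, "reasoning_content", None))  # observed: field is missing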
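
The two symptoms described above can be checked programmatically. The following is a minimal sketch, not part of the original report: `check_message` and `CONTROL_TOKEN` are hypothetical names, the regular expression assumes control tokens look like `<|name|>`, and a `SimpleNamespace` stands in for the real message object so the snippet runs without network access.

```python
import re
from types import SimpleNamespace

# Assumption: control tokens have the shape <|name|>, e.g. <|return|>.
CONTROL_TOKEN = re.compile(r"<\|[a-z_]+\|>")


def check_message(message):
    """Report leaked control tokens and whether reasoning_content is present."""
    content = message.content or ""
    leaked = CONTROL_TOKEN.findall(content)
    # reasoning_content is the field this report expects; it may be absent.
    reasoning = getattr(message, "reasoning_content", None)
    return {"leaked_tokens": leaked, "has_reasoning": reasoning is not None}


# Stand-in for completion.choices[0].message, mimicking the reported behavior.
fake_message = SimpleNamespace(content='{"answer": "I am an assistant."}<|return|>')
report = check_message(fake_message)
print(report)
```

With a real response, `check_message(completion.choices[0].message)` should return an empty `leaked_tokens` list and `has_reasoning` set to `True` once both issues are resolved.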