Hi,
The configuration should be similar to standard vLLM usage.
As @cailuyu suggested, please check whether --max-model-len helps with your requirement.
Thanks.
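For reference, a minimal sketch of passing --max-model-len when launching the server; the model name and the length value here are just placeholders, not something from this thread:

```shell
# Cap the context window to reduce KV-cache memory usage.
# Model name and 4096 are example values; adjust for your setup.
vllm serve meta-llama/Llama-3.1-8B-Instruct --max-model-len 4096
```

Lowering --max-model-len is a common way to fit a model when the full context length would exceed available GPU memory.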
Hello,
I was working with the vLLM repository.
If you build vLLM now, it works fine on Thor.
Yes, without the backend setting it will be OK!
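A minimal sketch of building vLLM from source, which is what the reply above refers to; this follows the project's standard from-source install and assumes a working CUDA toolchain on the device:

```shell
# Clone the vLLM repository and install it in editable mode.
# An editable install compiles the extensions against the local CUDA setup.
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -e .
```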
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.