Hi,
The configuration should be similar to standard vLLM usage.
As @cailuyu suggested, please check whether --max-model-len helps with your requirement.
Thanks.
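For reference, a minimal sketch of passing --max-model-len when launching the server; the model name and the length value here are just placeholders, not something from this thread:

```shell
# Cap the context window to reduce KV-cache memory usage.
# Model name and 4096 are example values; adjust for your setup.
vllm serve meta-llama/Llama-3.1-8B-Instruct --max-model-len 4096
```

Lowering --max-model-len is a common way to fit a model when the full context length would exceed available GPU memory.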
Hello,
I was working with the vLLM repository.
If you build vLLM now, it works fine on Thor.
Yes, without the backend setting it will be OK!
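A minimal sketch of building vLLM from source, which is what the reply above refers to; this follows the project's standard from-source install and assumes a working CUDA toolchain on the device:

```shell
# Clone the vLLM repository and install it in editable mode.
# An editable install compiles the extensions against the local CUDA setup.
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -e .
```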
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.