This should make life easier:
“llama.cpp server now ships with router mode, which lets you dynamically load, unload, and switch between multiple models without restarting.”
Yep. llama-swap is still more flexible, but if you don’t need it to launch vLLM/SGLang/etc., the built-in functionality is great.
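For anyone curious what switching looks like from the client side: a minimal sketch, assuming the router sits behind the server’s usual OpenAI-compatible chat endpoint and picks the model from the request’s `model` field (the model names and the payload-building helper here are illustrative, not from the llama.cpp docs):

```python
import json

def chat_payload(model: str, prompt: str) -> str:
    # Build an OpenAI-style chat completion request body. With router
    # mode, the server is expected to load/switch to the named model
    # on demand, so "switching models" is just changing this field.
    payload = {
        "model": model,  # assumed: a model name registered with the server
        "messages": [{"role": "user", "content": prompt}],
    }
    return json.dumps(payload)

# Two requests that would target two different models on the same server:
req_a = chat_payload("llama-3.1-8b-instruct", "Hello")
req_b = chat_payload("qwen2.5-7b-instruct", "Hello")
```

No restart, no second server process; the per-request `model` field does the routing.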