New router mode in llama.cpp

This should make life easier:

“llama.cpp server now ships with router mode, which lets you dynamically load, unload, and switch between multiple models without restarting.”

Yep. llama-swap is still more flexible, but if you don’t need it to launch vLLM/SGLang/etc., the built-in functionality is great.
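For the curious, here is a minimal sketch of what per-request switching could look like, assuming router mode keys off the `model` field of the OpenAI-compatible API that llama-server already exposes. The model names, port, and endpoint behavior here are illustrative assumptions, not confirmed router-mode specifics:

```shell
# Assumption: the router picks (and loads if needed) the model named in
# the request body, using the OpenAI-compatible chat completions endpoint.
# Model names and port are placeholders; adjust to your setup.

# Ask one model...
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen2.5-7b-instruct", "messages": [{"role": "user", "content": "Hello"}]}'

# ...then another, without restarting the server.
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama-3.1-8b-instruct", "messages": [{"role": "user", "content": "Hello"}]}'
```

With llama-swap you get the same request shape, so switching between the two is mostly a matter of which process owns the port.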
