Sure. It might take a few hours until support pops up across all inference servers.
I already tried vLLM with the latest Transformers v5.5.0 (which is required), but I failed:
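Roughly the shape of such an attempt via vLLM's offline Python API; the model id below is a placeholder, since it isn't named here:

```python
# Minimal sketch of trying a new model through vLLM's Python API.
# The model id is a placeholder, not the actual model from this thread;
# vLLM uses whatever Transformers version is installed in the environment,
# so the required release has to be installed first.
from vllm import LLM, SamplingParams

llm = LLM(model="org/new-model")  # hypothetical model id
params = SamplingParams(max_tokens=64)

outputs = llm.generate(["Hello"], params)
print(outputs[0].outputs[0].text)
```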
llama.cpp has already added support: