Noob with Noob questions...again!

Hello bright minds!

I have a rather dumb question. I have noticed that if I leave my Spark up and running, say with a model loaded onto it with vLLM, and I disconnect all my windows/sessions, I can come back in, say, 20-30 minutes, log in, and see that the dashboard still has the model loaded (RAM usage is high). But if I let it simmer, say, overnight, I always come back to a system where vLLM has stopped.

So is there an idle timeout built into vLLM or the Spark that says, if the model hasn’t done anything in X time, kill the process to conserve power? I do apologize for the truly dumb question, but alas, it’s still an issue I have!

Also, both my boss and others told me to use tmux so that I could have persistent SSH sessions. I thought that meant if I started up, say, vLLM in it and signed out of my laptop, I could come back, log into the Spark, type tmux, and it would pop back up in whatever state it was in. Clearly it doesn’t work that way: I tried it and found myself staring at a blank window. Is the tmux terminal meant to be left open? Is that the trick?

Sorry guys, I’m trying and I AM learning…slowly!

Thanks in advance for your help, and for your patience with what is probably a giant eye-roll to most.

Cheers.

Pipe your logs to a file and see if the process is crashing. You can use the -d flag on docker or the launch_cluster.sh script to launch detached. You can also use screen to launch the process and not have it killed if you log out.
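
To make the log suggestion concrete, here is a minimal sketch, not a definitive setup. The run_logged helper and the $HOME/vllm.log path are made-up names for illustration, and the vllm serve line is only a placeholder for whatever launch command you actually use.

```shell
# Hypothetical log location; override by exporting LOG_FILE first.
LOG_FILE="${LOG_FILE:-$HOME/vllm.log}"

# run_logged is an illustrative helper, not part of vLLM or the Spark tooling.
# It starts the given command in the background, ignores the hangup signal
# sent at logout (nohup), and appends stdout and stderr to $LOG_FILE.
run_logged() {
    nohup "$@" >> "$LOG_FILE" 2>&1 &
    echo "$!"   # print the PID of the background process
}

# Example use (commented out; substitute your real launch command):
# run_logged vllm serve <model>
# tail -n 50 "$LOG_FILE"    # after a stop, the last lines usually show why
```

If the process dies overnight, the tail of that file should show whether it crashed (for example an out-of-memory error) or was stopped from outside. If you launch through Docker with -d instead, docker logs <container> gives you the same information.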

Logs would be helpful here. For tmux, make sure to look up how to reattach to an existing session: Tmux cheat sheet
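
On the blank window: typing plain tmux always starts a brand-new session, so the one running vLLM is probably still there in the background and just not attached. A rough sketch of the reattach workflow follows. The session name vllm and the ensure_session helper are only illustrative choices.

```shell
SESSION="${SESSION:-vllm}"   # hypothetical session name

# ensure_session is an illustrative helper: it creates a detached session
# with the given name only if one does not already exist.
ensure_session() {
    tmux has-session -t "$1" 2>/dev/null || tmux new-session -d -s "$1"
}

# Typical interactive use (commented out so the block is safe to source):
# ensure_session "$SESSION"
# tmux attach -t "$SESSION"   # reattach to the existing session
# tmux ls                     # list sessions if you forgot the name
# Inside tmux, press Ctrl-b then d to detach; whatever is running keeps going.
```

The terminal does not need to stay open. Detach (or just lose the SSH connection), log back in later, and run tmux attach -t vllm to get the same screen back.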

Thanks gents! I will give this a shot and see how it goes. Fighting another battle right now: I’m trying to load a DeepSeek 14B model in Eugr’s vLLM configuration, but it’s god-awfully slow. I know it’s user error, so I’m trying to figure it out.