Hi @sanujen.20,
Here are some troubleshooting steps:
- Stop all JPS services
Stop any JPS foundation services you launched with systemctl, such as redis, monitoring, or ingress.
For example, to stop redis:
sudo systemctl stop jetson-redis
- Stop all docker containers
If any of the workflows are running, such as ai-nvr, zero shot detection, or the VLM service, bring them down with their respective docker compose down commands.
Check for any other running docker containers with:
docker ps
Then stop any running docker containers:
docker stop <container_name>
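If several containers are still running, stopping them one at a time gets tedious. Here is a minimal sketch that stops everything `docker ps` reports as running. The function name `stop_running_containers` is just an illustration, and it assumes your user can run docker without sudo (otherwise prefix the docker calls with sudo).

```shell
# Stop every running container in one pass (hypothetical helper, not part of JPS).
stop_running_containers() {
    # "docker ps -q" prints only the IDs of running containers, one per line.
    ids=$(docker ps -q)
    if [ -z "$ids" ]; then
        echo "No running containers."
        return 0
    fi
    # Word splitting on $ids is intentional: each ID becomes its own argument.
    docker stop $ids
}
```

Call `stop_running_containers` and then run docker ps again to confirm the list is empty.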
- Delete the VLM model folder
If the VLM model was only partially downloaded or failed to finish quantization, you will need to delete the model folder so that the next time the VLM service is launched, it redownloads and optimizes the model.
It is easiest to delete the whole VLM folder. This will delete any downloaded VLM models.
sudo rm -rf /data/vlm
Be careful here: delete only the vlm folder inside /data, not the entire /data folder.
- Restart the Jetson
- Verify there is sufficient disk space
When the Jetson comes back up, run:
df /data -h
This prints the disk usage for the filesystem that holds /data. Verify there is sufficient space for the VLM model. VILA 2.7B requires 7.1 GB.
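If you would rather have the shell do the comparison than read the df output by eye, something like the sketch below could work. The function name `has_free_space` and the 7300 MB threshold (7.1 GB rounded up) are my own choices for illustration, not anything from the JPS docs.

```shell
# has_free_space PATH REQUIRED_MB
# Succeeds if the filesystem holding PATH has at least REQUIRED_MB free.
has_free_space() {
    path=$1
    required_mb=$2
    # "df -Pk" prints POSIX-format output in 1024-byte blocks;
    # column 4 of the second line is the available space.
    avail_kb=$(df -Pk "$path" | awk 'NR==2 {print $4}')
    avail_mb=$((avail_kb / 1024))
    echo "Available on $path: ${avail_mb} MB (need ${required_mb} MB)"
    [ "$avail_mb" -ge "$required_mb" ]
}
```

For example, `has_free_space /data 7300 || echo "Free up space before launching the VLM service"`.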
- Launch the VLM service again, configured for VILA 2.7B
Follow the documentation to launch the necessary foundation services and the VLM service again.
- Monitor the memory usage and the VLM status
Monitor the memory usage with top or jtop.
Check the VLM status at the health endpoint http://0.0.0.0:5015/v1/health
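Since the model download and optimization can take a while, it may be handy to poll the health endpoint instead of refreshing it by hand. This is a rough sketch only: the function name `wait_for_health` is made up, and it assumes the endpoint answers with a successful HTTP status once the service is ready and fails (or refuses the connection) before that.

```shell
# wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
# Polls URL until curl gets a successful response or the attempts run out.
wait_for_health() {
    url=$1
    attempts=${2:-30}
    delay=${3:-10}
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f: fail on HTTP errors (>= 400); -s: silent; -o /dev/null: discard the body.
        if curl -fs -o /dev/null "$url"; then
            echo "Healthy after $i attempt(s)."
            return 0
        fi
        echo "Attempt $i/$attempts: not ready yet."
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

For example, `wait_for_health http://0.0.0.0:5015/v1/health 60 10` checks every 10 seconds for up to 10 minutes. Keep top or jtop open in another terminal while it runs.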
If this does not work, try using “Efficient-Large-Model/VILA1.5-3b” and see whether you get different results.
Thanks,
Sammy Ochoa