Dear all,
I have been working for several days to get Stable Diffusion (A1111 or Forge) running on my 4GB Jetson Nano.
I am here to share what I was able to achieve, and I am looking for your help to overcome what I hope is the final step!
The closest solution I found is:
- Flash your SD card with the Ubuntu 20.04 image provided by Qengineering, tutorial and material here: GitHub - Qengineering/Jetson-Nano-Ubuntu-20-image: Jetson Nano with Ubuntu 20.04 image
- Install pyenv and set Python 3.10.14 as the global version, source: How to install 'pyenv' Python version manager on Ubuntu 20.04 (N.B.: Python 3.10 is required for SD) - the full command sequence is summarized after this list
- Build Torch 1.13 and Torchvision 0.14.0 on the Jetson with Python 3.10.14 (I can provide my wheels if needed; the build takes about 12 h in total), tutorial in the “installation from scratch” section: Install PyTorch on Jetson Nano - Q-engineering
- git clone the Stable Diffusion repository
- In the requirements, change the line “pytorch_lightning==1.9.4” to “pytorch_lightning==1.8.6”, source: python - Cannot create weak reference to 'Weakcallableproxy' object in Pytorch Module - Stack Overflow
- Create the venv using “python -m venv venv”
- You have to preload the library with: “export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libGLdispatch.so.0”
- Then install your torch and torchvision wheels with “pip install torch-1.13.0a0+git7c98e70-cp310-cp310-linux_aarch64.whl” and “pip install torchvision-0.14.0a0-cp310-cp310-linux_aarch64.whl”
- Launch the SD installation with “./webui.sh”
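For reference, here is the same sequence condensed into commands. The repo URL assumes the A1111 webui, the wheel filenames are the ones from my own build, and the exact name of the requirements file may differ in your checkout, so treat this as a sketch rather than a copy-paste script:

```bash
# Python 3.10.14 via pyenv (required by SD)
pyenv install 3.10.14
pyenv global 3.10.14

# Clone the webui (A1111 shown here; Forge works the same way) and patch the requirements
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui
# In my checkout the pin lives in requirements_versions.txt; adjust if yours differs
sed -i 's/pytorch_lightning==1.9.4/pytorch_lightning==1.8.6/' requirements_versions.txt

# Create the venv and preload libGLdispatch before installing anything
python -m venv venv
source venv/bin/activate
export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libGLdispatch.so.0

# Install the locally built wheels (filenames from my 12 h build; yours may differ)
pip install torch-1.13.0a0+git7c98e70-cp310-cp310-linux_aarch64.whl
pip install torchvision-0.14.0a0-cp310-cp310-linux_aarch64.whl

# Launch the SD installation / webui
./webui.sh
```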
With the previous steps, the Stable Diffusion installation proceeds smoothly, and it can use torch with CUDA enabled (which, as I understand it, is mandatory to exploit the full potential of the Jetson).
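For what it's worth, a minimal check like the following (just the standard torch query, nothing specific to this setup) is what shows me CUDA is enabled from inside the venv:

```bash
# Standard check that the installed torch wheel sees the GPU
source venv/bin/activate
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```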
The problem appears when the SD installation prepares the model (in this case the pruned SD 1.5 checkpoint: v1-5-pruned-emaonly.ckpt · runwayml/stable-diffusion-v1-5 at main), in particular during the “loading weights” phase. After a freeze, the bash process is killed, I suppose due to an Out Of Memory (OOM) issue, but I don't know how to diagnose it properly.
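My assumption is that the kernel log right after the kill would confirm whether the OOM killer is responsible (standard commands below, nothing Jetson-specific apart from tegrastats), but I am not sure what else I should be looking at:

```bash
# Check whether the kernel OOM killer terminated the process
dmesg | grep -iE "out of memory|oom-killer|killed process"

# Watch memory usage live from a second terminal while the weights are loading
free -h
sudo tegrastats
```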
I have tried adding the arguments recommended for low-memory configurations or for optimizing SD on NVIDIA GPUs (--lowvram, --cuda-stream, --pin-shared-memory), but in any case it still looks like an OOM issue.
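In case it matters, I added them via the COMMANDLINE_ARGS variable in webui-user.sh (as far as I understand, --cuda-stream and --pin-shared-memory are Forge-specific flags):

```bash
# webui-user.sh (excerpt): arguments tried for the low-memory runs
export COMMANDLINE_ARGS="--lowvram --cuda-stream --pin-shared-memory"
```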
Can you think of anything that might make it work?
Thanks a lot for your support,
Kind regards