Does Stable Diffusion run on Jetson Orin Nano?

Hello, I am trying to install Stable Diffusion on a Jetson Orin Nano but have not succeeded yet.
Does it work on the Jetson Orin Nano? Has anyone managed to run it?

Please refer to Stable Diffusion on Jetson AGX Orin and Xavier - Jetson & Embedded Systems / Jetson Projects - NVIDIA Developer Forums

I actually followed those instructions, but still did not succeed.
I tried enlarging memory with swap and it still failed.
So I was wondering whether the Jetson Orin Nano behaves the same as the Jetson AGX Orin/Xavier. Is the Jetson Orin Nano supposed to be able to run Stable Diffusion?

@seiazetsu you might want to try the stable-diffusion-webui, I was able to get that running on Orin Nano. Perhaps it uses different models/etc., because the performance is different (faster) than what was on that GitHub.

I assumed Stable Diffusion WebUI only accepts input through its GUI.
What I would like to do is generate images from a list of prompts in CSV or JSON.
Can we use the stable_diffusion library together with the WebUI?

@seiazetsu I haven’t yet run standalone scripts that use the lower-level libraries directly (although I intend to soon), but I assume they work given that the webui also uses them and it works. Through the webui, I’ve been using the default model (stable-diffusion-1.5-ema-pruned), so perhaps with that configuration you’ll be able to run it?
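As for scripted prompts: the webui can also be driven headlessly through its REST API when launched with the `--api` flag (endpoints under `/sdapi/v1`). A hedged sketch of batching prompts from a CSV that way — untested on Jetson, and the payload fields shown are only a minimal subset:

```python
# Hedged sketch: drive stable-diffusion-webui headlessly via its REST API.
# Assumes the webui was launched with --api (endpoints under /sdapi/v1).
import base64
import csv
import json
import urllib.request

def build_payload(prompt, steps=25, width=512, height=512):
    """Minimal txt2img request body; the real API accepts many more fields."""
    return {"prompt": prompt, "steps": steps, "width": width, "height": height}

def txt2img(prompt, url="http://127.0.0.1:7860"):
    """POST one prompt and return the first generated image as PNG bytes."""
    req = urllib.request.Request(
        url + "/sdapi/v1/txt2img",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    # the API returns generated images as base64 strings
    return base64.b64decode(result["images"][0])

def batch_from_csv(path):
    """Read a 'prompt' column from a CSV and save one PNG per row."""
    with open(path, newline="") as f:
        for i, row in enumerate(csv.DictReader(f)):
            with open(f"out_{i:04d}.png", "wb") as out:
                out.write(txt2img(row["prompt"]))
```

A JSON prompt list would work the same way, just swapping `csv.DictReader` for `json.load`.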

Typically I mount extra swap, disable ZRAM, and disable the desktop GUI:
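The sequence looks roughly like this (a sketch; the swap size and path are examples, and `nvzramconfig` is the ZRAM service on JetPack):

```shell
# Disable ZRAM (compressed swap held in RAM) -- on JetPack it is the nvzramconfig service
sudo systemctl disable nvzramconfig

# Allocate a swap file (ideally on NVMe rather than the SD card) and enable it
sudo fallocate -l 16G /mnt/16GB.swap
sudo chmod 600 /mnt/16GB.swap
sudo mkswap /mnt/16GB.swap
sudo swapon /mnt/16GB.swap
# add to /etc/fstab to persist across reboots:
#   /mnt/16GB.swap  none  swap  sw  0  0

# Boot to the console instead of the desktop GUI to free up RAM
sudo systemctl set-default multi-user.target
sudo reboot
```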

Next I will try the stable-diffusion library and script directly on Jetson Orin Nano.

Thank you very much. I haven’t succeeded yet.
I would be glad if you could tell me the result of your trial.


Hi @seiazetsu, strangely I haven’t been able to get stable-diffusion/ to work on Orin Nano either, it runs out of memory despite trying various arguments. I saw that on AGX Orin, it takes at least 12GB of memory.

Since stable-diffusion-webui works, my guess is that project has been optimized to be more memory efficient - at some point, I will have to look it over for the differences. I’ve also been meaning to try this TensorRT version: diffusers/examples/community/ at main · huggingface/diffusers · GitHub


@seiazetsu I was able to get it running on Orin Nano 8GB using this more memory-optimized fork:

(using the optimizedSD/ script from that)

Thank you very much for trying. Great to hear you succeeded.
I am following your references. After installing several libraries and making a couple of file modifications, I got this error:

AssertionError: Torch not compiled with CUDA enabled

I searched for this error, and some articles say it is caused by the versions of torch and torchvision. What versions are you using?

@seiazetsu I installed the stable-diffusion stuff on top of l4t-ml container, which has PyTorch installed from here:

The PyTorch wheels from that topic were built with CUDA enabled.
I would recommend using a container for experimenting with this stuff to keep your environment clean.
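A quick way to confirm which build of PyTorch you actually have is to check `torch.version.cuda` (a small check script; nothing Jetson-specific assumed):

```python
# Quick sanity check: was this PyTorch wheel built with CUDA support?
def cuda_build_report() -> str:
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if torch.version.cuda is None:
        # a CPU-only wheel raises "Torch not compiled with CUDA enabled"
        # as soon as anything touches the GPU
        return f"torch {torch.__version__}: CPU-only build"
    return (f"torch {torch.__version__}: built for CUDA {torch.version.cuda}, "
            f"device available: {torch.cuda.is_available()}")

print(cuda_build_report())
```

If this reports a CPU-only build, reinstalling from the Jetson wheels (or starting from the l4t-ml container) is the fix.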

Thank you very much. I finally succeeded.
However, it took about 3 hours, which was much longer than I expected…

Hmm, it took my Orin Nano 2 minutes to generate a 512x512 image with 25 steps, PLMS sampling, using the sd-v1-4.ckpt model. Do you have your swap mounted on NVME? Are you sure it’s using GPU? (you should be able to keep an eye on that with tegrastats or jtop)

According to jtop, it doesn’t use GPU (please refer to the attached screenshot).

What I did:

  • Downloaded the l4t-ml container and ran it
  • Manually installed some libraries (because these were not installed in the container):
pip install omegaconf pillow tqdm einops transformers taming-transformers scipy kornia -e .
  • Replaced with this file
    taming-transformers/taming/modules/vqvae/ at master · CompVis/taming-transformers · GitHub
  • downloaded sd-v1-4.ckpt as model.ckpt
  • in ~/stable-diffusion/optimizedSD/, changed
    from pytorch_lightning.utilities.distributed import rank_zero_only
    to
    from pytorch_lightning.utilities.rank_zero import rank_zero_only
  • executed the command
    python3 optimizedSD/ --prompt "Cyberpunk style image of a Tesla car reflection in rain" --H 256 --W 256 --seed 27 --n_iter 2 --n_samples 5 --ddim_steps 50

I checked whether the GPU is available, and it is.

I would be glad if you could tell me what is wrong or missing.
I would appreciate it if you could show step by step how you ran it successfully.

@seiazetsu I built the latest l4t-ml container from GitHub - dusty-nv/jetson-containers: Machine Learning Containers for NVIDIA Jetson and JetPack-L4T, and then built this Dockerfile on top of it:

ARG BASE_IMAGE=l4t-ml:r35.3.1
FROM ${BASE_IMAGE}

WORKDIR /diffusion

# stable-diffusion-webui

RUN git clone --branch ${STABLE_DIFFUSION_WEBUI_VERSION} && \
    cd stable-diffusion-webui && \
    git clone extensions-builtin/stable-diffusion-webui-tensorrt && \
    python3 -c 'from modules import launch_utils; launch_utils.prepare_environment()'
# re-install OpenCV (stable-diffusion-webui installs a conflicting version)
# partially initialized module 'cv2' has no attribute 'gapi_wip_gst_GStreamerPipeline' (most likely due to a circular import)
ARG OPENCV_DEB=OpenCV-4.5.0-aarch64.tar.gz

COPY docker/containers/scripts/ /tmp/
RUN cd /tmp && ./ ${OPENCV_URL} ${OPENCV_DEB}

# stable-diffusion
RUN pip3 install --no-cache-dir --verbose clip kornia taming-transformers invisible-watermark einops
RUN wget -O /usr/local/lib/python3.8/dist-packages/taming/modules/vqvae/
RUN git clone

ENV PYTHONPATH=${PYTHONPATH}:/diffusion/stable-diffusion

# small-stable-diffusion
RUN git clone small-stable-diffusion
# import tensorflow => ImportError: cannot import name '_message' from 'google.protobuf.pyext'
RUN pip3 install --force-reinstall protobuf==3.20.3
# model cache directories
ENV TRANSFORMERS_CACHE=/diffusion/data/huggingface
ENV NEMO_CACHE_DIR=/diffusion/data/nemo
ENV TORCH_HOME=/diffusion/data/torch
ENV DATA=/diffusion/data

RUN env

This is just for my testing at this time, and may require some experimentation on your end to get fully working, as you have found.
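For completeness, building and running a Dockerfile like the one above typically looks something like this on Jetson (a sketch; the image tag and host mount path are placeholders):

```shell
# Build on the Jetson itself so the aarch64 base image resolves
sudo docker build -t stable-diffusion:r35.3.1 .

# --runtime nvidia exposes the GPU inside the container;
# mount a host directory over /diffusion/data so model caches persist
sudo docker run -it --rm --runtime nvidia --network host \
    -v $(pwd)/data:/diffusion/data \
    stable-diffusion:r35.3.1
```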

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.