Unable to build stable-diffusion-webui

I have an AGX Orin 64GB developer kit machine.

I am attempting to build/install stable-diffusion, and followed the instructions here: Stable Diffusion - NVIDIA Jetson AI Lab

I ran:

git clone https://github.com/dusty-nv/jetson-containers
bash jetson-containers/install.sh

and these succeeded.

The last, deceptively simple, command

jetson-containers run $(autotag stable-diffusion-webui)

failed.

Now, because this command results in about 3 hours of downloading and compiling (apparently inside of docker containers) and more downloading and more compiling, I have very little understanding of just where it failed.

Also, that it prints out compiler warnings in red makes it even less clear. (Was there no consideration that this process might ever fail?)

It appears that the errors (there were many – fail fast anyone?) that eventually derailed things were all C++ compiler errors (I can say none of them were warnings).

What was it in the process of building? I’m not rightly sure. It seems that there was a Python script involved, that in turn invoked cmake with the arguments --build /opt/onnxruntime/build/Linux/Release --config Release -- -j12
So I guess onnxruntime?

It returned exit code 2, so if anyone knows what that means, let me know.

The basic tenor of the compiler errors is something like:

/opt/onnxruntime/onnxruntime/core/providers/cpu/math/element_wise_ops.cc: In member function ‘onnxruntime::common::Status onnxruntime::Min_6<T>::Compute(onnxruntime::OpKernelContext*) const [with T = float]’:
/opt/onnxruntime/onnxruntime/core/providers/cpu/math/element_wise_ops.cc:708:56: error: no matching function for call to ‘Eigen::ArrayWrapper<Eigen::Map<Eigen::Matrix<float, -1, 1>, 0, Eigen::Stride<0, 0> > >::min<Eigen::PropagateNaN>(Eigen::ArrayWrapper<Eigen::Map<const Eigen::Matrix<float, -1, 1>, 0, Eigen::Stride<0, 0> > >)’

These are details to which I would not expect to be exposed.
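One way to narrow down where a long build log first goes wrong is to pull out only the GCC-style `file:line:col: error:` lines. The following is a minimal sketch, not part of jetson-containers; the function name `first_errors` is hypothetical, the sample text is abbreviated from the messages quoted above, and it assumes the log lines carry no extra prefix.

```python
import re

# GCC-style diagnostics look like "path/file.cc:708:56: error: message".
ERROR_RE = re.compile(
    r"^(?P<file>[^\s:]+):(?P<line>\d+):(?P<col>\d+): error: (?P<msg>.*)$"
)

def first_errors(log_text, limit=5):
    """Return up to `limit` (file, line, message) tuples, in log order."""
    found = []
    for raw in log_text.splitlines():
        match = ERROR_RE.match(raw.strip())
        if match:
            found.append((match["file"], int(match["line"]), match["msg"]))
            if len(found) == limit:
                break
    return found

# Abbreviated sample; in practice, read the attached log file instead.
sample = """\
/opt/onnxruntime/onnxruntime/core/providers/cpu/math/element_wise_ops.cc: In member function ...
/opt/onnxruntime/onnxruntime/core/providers/cpu/math/element_wise_ops.cc:708:56: error: no matching function for call to ...
gmake: *** [Makefile:146: all] Error 2
"""

for file, line, msg in first_errors(sample):
    print(f"{file}:{line}: {msg}")
```

The earliest error is usually the most informative one, since later errors in a parallel build (`-j12`) are often repeats of the same root cause in other translation units.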

Because it was building Docker containers (why are they not already pre-built?), here is the output of docker images so you have an idea of how far it got:

REPOSITORY                       TAG                       IMAGE ID       CREATED         SIZE
<none>                           <none>                    31b45dbc2d9a   40 hours ago    18GB
stable-diffusion-webui           r36.4.0-tensorrt          4e0a103e0ae3   40 hours ago    18GB
stable-diffusion-webui           r36.4.0-opencv            9240c676737d   40 hours ago    14.9GB
stable-diffusion-webui           r36.4.0-pycuda            ff52f2d7ad0a   42 hours ago    10.1GB
stable-diffusion-webui           r36.4.0-xformers          92231cb73572   42 hours ago    10.1GB
stable-diffusion-webui           r36.4.0-transformers      b7d174f61cb5   42 hours ago    10.1GB
stable-diffusion-webui           r36.4.0-rust              303a52fabec7   42 hours ago    9.6GB
stable-diffusion-webui           r36.4.0-huggingface_hub   bea8e3bbef69   42 hours ago    8.09GB
stable-diffusion-webui           r36.4.0-torchvision       7f81cb6ad021   42 hours ago    8.08GB
stable-diffusion-webui           r36.4.0-pytorch_2.5       2c76ab09bd02   42 hours ago    8.05GB
stable-diffusion-webui           r36.4.0-onnx              bac4167a6f7c   42 hours ago    7.13GB
stable-diffusion-webui           r36.4.0-cmake             564369fc18ff   42 hours ago    7.1GB
stable-diffusion-webui           r36.4.0-numpy             c4d4f71d19f0   42 hours ago    7.04GB
stable-diffusion-webui           r36.4.0-python            5edc6ba09a2f   42 hours ago    6.98GB
stable-diffusion-webui           r36.4.0-cudnn_9.4         d47514ce3bd0   42 hours ago    6.91GB
stable-diffusion-webui           r36.4.0-cuda_12.6         4ead91dcf5fd   42 hours ago    5.91GB
stable-diffusion-webui           r36.4.0-pip_cache_cu126   652b1fc7a9d5   43 hours ago    723MB
stable-diffusion-webui           r36.4.0-build-essential   b5b17627d26e   43 hours ago    723MB
ubuntu                           22.04                     981912c48e9a   3 months ago    69.2MB

Presumably it failed while trying to build container id 31b45dbc2d9a?

I have attached the (as far as I can tell) relevant portion of the stdout.
stable-diffusion-webui-log.txt (392.8 KB)

If more information is needed, please let me know and I will provide it.

Thanks!

Hi,
Here are some suggestions for the common issues:

1. Performance

Please run the commands below before benchmarking a deep learning use case:

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

2. Installation

Installation guide of deep learning frameworks on Jetson:

3. Tutorial

Startup deep learning tutorial:

4. Report issue

If these suggestions don’t help and you want to report an issue to us, please share the model, the command/steps, and the customized app (if any) so that we can reproduce it locally.

Thanks!

Thank you for those suggestions, however they do not solve the problem I describe.

Hi,

Thanks for the feedback.
We will give it a try and provide more info to you later.

Thanks.

Thank you. I hope this can be figured out.

Hi,

It looks like you are facing the same issue as in the link below:

According to a comment there, checking out the v1.18.2 ONNXRuntime branch seems to fix the issue.
Could you give it a try?

This can be done with a change similar to the one below:

Thanks.

Hi,
I am not sure where or how I should do this. It is building the runtime (apparently) inside a Docker container, is it not? I don’t know which container, or what script needs to be modified.

Thanks.

Ok… I replied too fast. I found the file referenced in your second link (not in a Docker image or container), but I still do not understand what I should do with it.

Hi,

After cloning the repository, please edit [jetson-containers]/packages/ml/onnxruntime/config.py:

Change line 39 from

onnxruntime('1.21', requires=['>=36', '>=cu124'], default=False, branch='main'),

into

onnxruntime('1.18.2', requires=['>=36', '>=cu124'], default=False, branch='main'),

Then rebuild the container with the commands below:

$ bash jetson-containers/install.sh
$ jetson-containers run $(autotag stable-diffusion-webui)
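The edit described above can also be scripted. This is a rough sketch only: `pin_version` is a hypothetical helper, and it assumes the entry in config.py still reads exactly as quoted in this post (line numbers and contents may differ in other checkouts).

```python
def pin_version(config_text, old="1.21", new="1.18.2"):
    """Swap the version string in the onnxruntime(...) entry that names `old`."""
    target = f"onnxruntime('{old}',"
    if target not in config_text:
        # Fail loudly instead of silently leaving the file unchanged.
        raise ValueError(f"no entry starting with {target!r}; the file may have changed")
    return config_text.replace(target, f"onnxruntime('{new}',", 1)

# The line quoted above, used here as stand-in input.
line = "onnxruntime('1.21', requires=['>=36', '>=cu124'], default=False, branch='main'),"
print(pin_version(line))
```

To apply it to the real file, read packages/ml/onnxruntime/config.py with `pathlib.Path.read_text()`, pass the text through `pin_version`, and write the result back with `write_text()`.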

Thanks.

Thank you for those instructions.

I made those changes and attempted to rebuild, however it failed, in apparently the same or similar fashion:

gmake: *** [Makefile:146: all] Error 2
Traceback (most recent call last):
  File "/opt/onnxruntime/tools/ci_build/build.py", line 2998, in <module>
    sys.exit(main())
  File "/opt/onnxruntime/tools/ci_build/build.py", line 2888, in main
    build_targets(args, cmake_path, build_dir, configs, num_parallel_jobs, args.target)
  File "/opt/onnxruntime/tools/ci_build/build.py", line 1739, in build_targets
    run_subprocess(cmd_args, env=env)
  File "/opt/onnxruntime/tools/ci_build/build.py", line 867, in run_subprocess
    return run(*args, cwd=cwd, capture_stdout=capture_stdout, shell=shell, env=my_env)
  File "/opt/onnxruntime/tools/python/util/run.py", line 49, in run
    completed_process = subprocess.run(
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/usr/local/bin/cmake', '--build', '/opt/onnxruntime/build/Linux/Release', '--config', 'Release', '--', '-j12']' returned non-zero exit status 2.
The command '/bin/sh -c /tmp/onnxruntime/install.sh || /tmp/onnxruntime/build.sh' returned a non-zero code: 1
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/media/user/SSD/repos/jetson-containers/jetson_containers/tag.py", line 58, in <module>
    image = find_container(args.packages[0], prefer_sources=args.prefer, disable_sources=args.disable, user=args.user, quiet=args.quiet)
  File "/media/user/SSD/repos/jetson-containers/jetson_containers/container.py", line 537, in find_container
    return build_container('', package) #, simulate=True)
  File "/media/user/SSD/repos/jetson-containers/jetson_containers/container.py", line 147, in build_container
    status = subprocess.run(cmd.replace(_NEWLINE_, ' '), executable='/bin/bash', shell=True, check=True)  
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'DOCKER_BUILDKIT=0 docker build --network=host --tag stable-diffusion-webui:r36.4.0-onnxruntime --file /media/user/SSD/repos/jetson-containers/packages/ml/onnxruntime/Dockerfile --build-arg BASE_IMAGE=stable-diffusion-webui:r36.4.0-tensorrt --build-arg ONNXRUNTIME_VERSION="1.20.0" --build-arg ONNXRUNTIME_BRANCH="v1.20.0" --build-arg ONNXRUNTIME_FLAGS="--allow_running_as_root" /media/user/SSD/repos/jetson-containers/packages/ml/onnxruntime 2>&1 | tee /media/user/SSD/repos/jetson-containers/logs/20250113_134441/build/stable-diffusion-webui_r36.4.0-onnxruntime.txt; exit ${PIPESTATUS[0]}' returned non-zero exit status 1.
-- Error:  return code 1
V4L2_DEVICES: 
### DISPLAY environmental variable is already set: ":10.0"
localuser:root being added to access control list
+ docker run --runtime nvidia -it --rm --network host --shm-size=8g --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /var/run/dbus:/var/run/dbus --volume /var/run/avahi-daemon/socket:/var/run/avahi-daemon/socket --volume /var/run/docker.sock:/var/run/docker.sock --volume /media/user/SSD/repos/jetson-containers/data:/data -v /etc/localtime:/etc/localtime:ro -v /etc/timezone:/etc/timezone:ro --device /dev/snd -e PULSE_SERVER=unix:/run/user/1000/pulse/native -v /run/user/1000/pulse:/run/user/1000/pulse --device /dev/bus/usb -e DISPLAY=:10.0 -v /tmp/.X11-unix/:/tmp/.X11-unix -v /tmp/.docker.xauth:/tmp/.docker.xauth -e XAUTHORITY=/tmp/.docker.xauth --device /dev/i2c-0 --device /dev/i2c-1 --device /dev/i2c-2 --device /dev/i2c-3 --device /dev/i2c-4 --device /dev/i2c-5 --device /dev/i2c-6 --device /dev/i2c-7 --device /dev/i2c-8 --device /dev/i2c-9 --name jetson_container_20250113_162512
"docker run" requires at least 1 argument.
See 'docker run --help'.

Usage:  docker run [OPTIONS] IMAGE [COMMAND] [ARG...]

Create and run a new container from an image
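The failing `docker build` command in the traceback records which ONNX Runtime version was actually requested. A small sketch for extracting the `--build-arg` values follows; `build_args` is a hypothetical helper and the command string is abbreviated from the log above.

```python
import re

# Matches --build-arg NAME=value and --build-arg NAME="quoted value".
BUILD_ARG_RE = re.compile(r'--build-arg\s+(\w+)=("([^"]*)"|\S+)')

def build_args(command):
    """Map each --build-arg NAME=VALUE in a docker build command to its value."""
    args = {}
    for name, raw, quoted in BUILD_ARG_RE.findall(command):
        args[name] = quoted if raw.startswith('"') else raw
    return args

# Abbreviated from the CalledProcessError message in the log.
command = (
    'DOCKER_BUILDKIT=0 docker build --network=host '
    '--tag stable-diffusion-webui:r36.4.0-onnxruntime '
    '--build-arg BASE_IMAGE=stable-diffusion-webui:r36.4.0-tensorrt '
    '--build-arg ONNXRUNTIME_VERSION="1.20.0" '
    '--build-arg ONNXRUNTIME_BRANCH="v1.20.0" '
    '--build-arg ONNXRUNTIME_FLAGS="--allow_running_as_root"'
)

for name, value in build_args(command).items():
    print(f"{name} = {value}")
```

In this log the build requested version 1.20.0 from branch v1.20.0, not 1.18.2, which suggests (but does not prove) that the edited config.py entry was not the one selected for this build. It also shows the stage that failed was `onnxruntime`, built on top of the `tensorrt` image.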

I don’t know if the following is related, but it doesn’t seem correct:

Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.zeros(0).cuda()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 310, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
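The path in that traceback (/home/user/.local/lib/python3.10/site-packages) indicates a per-user pip install on the host rather than the PyTorch built inside the containers, and the message itself means that copy is a CPU-only build. A minimal sketch for checking which torch a given Python environment picks up; `describe_torch` is a hypothetical helper name.

```python
import importlib.util

def describe_torch():
    """Report where torch is imported from and whether it was built with CUDA."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch
    return {
        "installed": True,
        "path": torch.__file__,            # shows which install is being used
        "version": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }

print(describe_torch())
```

Running the same check on the host and inside a container such as the `pytorch_2.5` stage listed earlier should make it clear whether the two environments are using different torch installs.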


I have to ask: Has anyone, after upgrading to JetPack 6.1, gotten stable-diffusion to work? Do we know that it does, in fact, work with the latest version? (Is 6.1 still the latest version?)

Thanks.

Hi,

Thanks for your patience.

We do see similar onnxruntime failures when building the stable-diffusion-webui container.
However, the related stable-diffusion container works as expected.

Is that container an option for you?

$ jetson-containers build stable-diffusion
...
Samples finished in 1.90 minutes and exported to /data/images/stable-diffusion/a_photograph_of_an_astronaut_riding_a_horse
 Seeds used = 42
	Command being timed: "python3 optimizedSD/optimized_txt2img.py --sampler plms --seed 42 --n_samples 1 --n_iter 1 --ddim_steps 25 --outdir /data/images/stable-diffusion --ckpt /data/models/stable-diffusion/sd-v1.4.ckpt --prompt a photograph of an astronaut riding a horse"
	User time (seconds): 66.48
	System time (seconds): 33.68
	Percent of CPU this job got: 80%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 2:04.31
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 8945716
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 1337
	Minor (reclaiming a frame) page faults: 1493025
	Voluntary context switches: 11149
	Involuntary context switches: 2966
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0
-- Done building container stable-diffusion:r36.4.0

Thanks.

I do not believe I have tried building anything other than the webui container (but I have tried so many things I can’t be sure). I have just started the jetson-containers build stable-diffusion command. I will report back on its success/failure.

Thanks!

It is not clear to me whether it worked or not. I feel like it did not work properly. Is this bad:

memory capacity:  64348992 KB
running scripts/txt2img.py
Traceback (most recent call last):
  File "/opt/stable-diffusion/scripts/txt2img.py", line 2, in <module>
    import cv2
  File "/usr/local/lib/python3.10/dist-packages/cv2/__init__.py", line 181, in <module>
    bootstrap()
  File "/usr/local/lib/python3.10/dist-packages/cv2/__init__.py", line 153, in bootstrap
    native_module = importlib.import_module("cv2")
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: libtesseract.so.4: cannot open shared object file: No such file or directory
Command exited with non-zero status 1

This was near the end. All of the other output looked ok, but there’s no way for me to know.
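The ImportError means the dynamic loader could not find libtesseract.so.4 when cv2 was imported. A quick sketch for confirming that from inside the container, independently of OpenCV; `can_load` is a hypothetical helper name.

```python
import ctypes

def can_load(library_name):
    """Return (True, None) if the dynamic loader finds the library, else (False, reason)."""
    try:
        ctypes.CDLL(library_name)
    except OSError as exc:
        return False, str(exc)
    return True, None

ok, reason = can_load("libtesseract.so.4")
print("libtesseract.so.4 loadable:", ok, reason or "")
```

If this reports the library as missing, the problem is in the container's system packages rather than in the stable-diffusion scripts themselves.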

The very next stuff in the log is:

python3 scripts/txt2img.py --plms --n_samples 1 --n_iter 1 --ddim_steps 25 --outdir /data/images/stable-diffusion --ckpt /data/models/stable-diffusion/sd-v1.4.ckpt --prompt a photograph of an astronaut riding a horse

Where is /data/images/stable-diffusion? Inside a container?

I just looked in the only container that was built today, and there is no directory /data/images
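The `docker run` line in the earlier webui log mounts a host directory at /data, so the path may live on the host rather than in an image. A sketch for listing the mounts from such a command; `volume_map` is a hypothetical helper, the command is abbreviated from that log, and it is an assumption that the stable-diffusion container is started with the same mounts.

```python
import re

# Captures the HOST:CONTAINER[:MODE] spec after each -v / --volume flag.
VOLUME_RE = re.compile(r"(?<!\S)(?:--volume|-v)\s+(\S+)")

def volume_map(command):
    """Map container paths to host paths for each -v/--volume flag."""
    mounts = {}
    for spec in VOLUME_RE.findall(command):
        parts = spec.split(":")
        if len(parts) >= 2:
            host, container = parts[0], parts[1]
            mounts[container] = host
    return mounts

# Abbreviated from the "+ docker run ..." line in the earlier log.
command = (
    "docker run --runtime nvidia -it --rm --network host --shm-size=8g "
    "--volume /tmp/argus_socket:/tmp/argus_socket "
    "--volume /media/user/SSD/repos/jetson-containers/data:/data "
    "-v /etc/localtime:/etc/localtime:ro"
)

print(volume_map(command)["/data"])
```

Under that assumption, /data/images/stable-diffusion would correspond to the images/stable-diffusion directory under jetson-containers/data on the host.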

Hi,

Did you hit the libtesseract.so.4 error while running the build command?
The data folder is created in one of the build steps. Were there any other errors during the build process?

In our testing, the command below runs as expected:

python3 optimizedSD/optimized_txt2img.py --sampler plms --seed 42 --n_samples 1 --n_iter 1 --ddim_steps 25 --outdir /data/images/stable-diffusion --ckpt /data/models/stable-diffusion/sd-v1.4.ckpt --prompt "a photograph of an astronaut riding a horse"

Thanks.