CUDA Error / Ubuntu / Ampere / 3090 - Constant CUDA error: an illegal instruction was encountered

ivan.larsen1 · December 21, 2025, 2:42am

HELP! I have been beating up my keyboard over this for two days without success.

I have an Ampere RTX 3090. nvidia-smi reports: NVIDIA-SMI 580.95.05, Driver Version: 580.95.05, CUDA Version: 13.0.

Distributor ID: Ubuntu
Description: Ubuntu 24.04.3 LTS
Release: 24.04
Codename: noble

My code is very simple:

import torch
from diffusers import ZImagePipeline

# 1. Load the pipeline
# Use bfloat16 for optimal performance on supported GPUs
pipe = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=False,
)
pipe.to("cuda")

# [Optional] Attention Backend
# Diffusers uses SDPA by default. Switch to Flash Attention for better efficiency if supported:
# pipe.transformer.set_attention_backend("flash")    # Enable Flash-Attention-2
# pipe.transformer.set_attention_backend("_flash_3") # Enable Flash-Attention-3

# [Optional] Model Compilation
# Compiling the DiT model accelerates inference, but the first run will take longer to compile.
# pipe.transformer.compile()

# [Optional] CPU Offloading
# Enable CPU offloading for memory-constrained devices.
# pipe.enable_model_cpu_offload()

prompt = "Young Chinese woman in red Hanfu, intricate embroidery. Impeccable makeup, red floral forehead pattern. Elaborate high bun, golden phoenix headdress, red flowers, beads. Holds round folding fan with lady, trees, bird. Neon lightning-bolt lamp (⚡️), bright yellow glow, above extended left palm. Soft-lit outdoor night background, silhouetted tiered pagoda (西安大雁塔), blurred colorful distant lights."

# 2. Generate Image
image = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    num_inference_steps=9,  # This actually results in 8 DiT forwards
    guidance_scale=0.0,     # Guidance should be 0 for the Turbo models
    generator=torch.Generator("cuda").manual_seed(42),
).images[0]

image.save("example.png")

Here's the URL:  https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

AcceleratorError: CUDA error: an illegal instruction was encountered
Search for 'cudaErrorIllegalInstruction' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
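The error text itself suggests CUDA_LAUNCH_BLOCKING=1, which makes kernel launches synchronous so the Python stack trace points at the call that actually failed. A minimal sketch of setting it from inside the script follows; the helper name enable_blocking_launches is made up for illustration, and the variable has to be in place before CUDA is initialized, which is simplest to guarantee by setting it before importing torch.

```python
import os
import sys


def enable_blocking_launches():
    """Set CUDA_LAUNCH_BLOCKING=1 for this process (hypothetical helper).

    Returns True if torch had not been imported yet, which is the safe case:
    the variable is read when CUDA initializes, so it must be set before that.
    """
    early_enough = "torch" not in sys.modules
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
    return early_enough


if not enable_blocking_launches():
    print("warning: torch was already imported; restart and set the variable first")

# The original script (import torch, load the pipeline, call pipe(...))
# would follow here unchanged.
```

The same effect is available from the shell by prefixing the command with CUDA_LAUNCH_BLOCKING=1. Either way this only improves where the error is reported; it does not change whether the kernel fails.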

Curefab · December 22, 2025, 1:47pm

Since you are calling Torch, the error is probably in CUDA code invoked by the Torch library.

Is your Torch version compatible with your RTX 3090 (compute capability 8.6)? Is the GPU memory large enough?

Does it find the correct GPU? Is it compatible with your SDK (13.0) and your driver version?

Can you run other CUDA programs?

Can you configure the target GPU type (compute capability 8.6)?

ivan.larsen1 · December 22, 2025, 3:29pm

I tried different models (mostly from Stability AI), but after working through a few things via Anthropic, it reported that there is a known bug with 3090/Ampere cards and the CLIP/T5 encoders. Stress tests, memory tests, etc. all passed. I'm not sure how to handle it, because most models want either the newer OpenAI CLIP or T5 encoders.

linzheng0428 · December 27, 2025, 8:10am

On Ampere (3090), a constant illegal instruction is very rarely caused by kernel logic.
In practice it almost always comes from a binary / PTX compatibility mismatch — e.g. SASS targeting a different SM, or PTX JIT running on a driver that doesn’t fully support the generated PTX.

This shows up a lot with containers, cached builds, or mixed compute_XX / sm_YY flags.
I’d first verify what SASS is actually executed and whether the driver is JIT’ing PTX at runtime.
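One way to start on that check from the PyTorch side is to compare the architectures the installed build was compiled for against each GPU's compute capability. The sketch below is only a first pass under some assumptions: the helper names (parse_sm, has_compatible_sass, ptx_targets) are invented for illustration, it looks only at PyTorch's own kernels (not at extensions such as Flash Attention), and it treats SASS as usable when it targets the same major version with an equal or lower minor version.

```python
import re


def parse_sm(arch):
    """Return (major, minor) for a plain 'sm_XY' entry, else None."""
    m = re.fullmatch(r"sm_(\d+)(\d)", arch)
    return (int(m.group(1)), int(m.group(2))) if m else None


def has_compatible_sass(capability, arch_list):
    """True if some sm_XY entry has the same major and minor <= the device's."""
    major, minor = capability
    for arch in arch_list:
        sm = parse_sm(arch)
        if sm and sm[0] == major and sm[1] <= minor:
            return True
    return False


def ptx_targets(arch_list):
    """Entries shipped as PTX, which the driver would have to JIT-compile."""
    return [a for a in arch_list if a.startswith("compute_")]


def main():
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
        return
    print("torch", torch.__version__, "| built for CUDA", torch.version.cuda)
    if not torch.cuda.is_available():
        print("no CUDA device visible to PyTorch")
        return
    archs = torch.cuda.get_arch_list()
    print("architectures in this build:", archs)
    print("PTX entries (JIT candidates):", ptx_targets(archs))
    for i in range(torch.cuda.device_count()):
        cap = torch.cuda.get_device_capability(i)
        name = torch.cuda.get_device_name(i)
        ok = has_compatible_sass(cap, archs)
        print(f"GPU {i}: {name} sm_{cap[0]}{cap[1]} compatible SASS: {ok}")


if __name__ == "__main__":
    main()
```

For an RTX 3090 (sm_86), a build listing sm_80 or sm_86 should not need PTX JIT for PyTorch's own kernels. If no compatible entry shows up, a mismatched wheel or a stale cached build becomes the more likely suspect.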

ivan.larsen1 · December 27, 2025, 9:51pm

I genuinely don’t know where to start with what you replied. What are the first five commands I can run on Linux to answer your questions?

Curefab · December 27, 2025, 11:00pm

This is more likely an error within PyTorch, or at least in its installation, than in how you are calling it.

ivan.larsen1 · December 27, 2025, 11:24pm

What would be helpful to this community in order to debug PyTorch? Again, I genuinely don’t know where to start with this particular one, as my EVGA 3060 and EVGA 3090 are working without issues.

Curefab · December 27, 2025, 11:43pm

So you have two RTX 3090 GPUs on the same system, one is working, one is showing this error?

ivan.larsen1 · December 28, 2025, 12:04am

Yes, I have a Zotac 3090, an EVGA 3090 and an EVGA 3060. The EVGA cards are working with no issues.
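With several cards in one machine, it may help to run the same small workload on each GPU in its own process, so that a failure on one card cannot poison the CUDA context used for the others. The following is a sketch under assumptions: the bf16 matmul in PROBE is only a stand-in workload and may not trigger the same kernel as the diffusers pipeline, and the names PROBE, probe_env and run_probe are invented for this example.

```python
import os
import subprocess
import sys

# Stand-in workload (assumption): a bf16 matmul on the only visible GPU.
PROBE = """
import torch
x = torch.randn(2048, 2048, device="cuda", dtype=torch.bfloat16)
y = (x @ x).float().sum().item()
torch.cuda.synchronize()
print(torch.cuda.get_device_name(0), "bf16 matmul ok")
"""


def probe_env(index):
    """Environment that exposes exactly one GPU to the child process."""
    env = dict(os.environ)
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"   # match nvidia-smi numbering
    env["CUDA_VISIBLE_DEVICES"] = str(index)
    env["CUDA_LAUNCH_BLOCKING"] = "1"         # report errors at the failing call
    return env


def run_probe(index, script=PROBE):
    """Run the script in a fresh interpreter; return (exit code, output)."""
    proc = subprocess.run(
        [sys.executable, "-c", script],
        env=probe_env(index),
        capture_output=True,
        text=True,
    )
    return proc.returncode, (proc.stdout + proc.stderr).strip()


if __name__ == "__main__":
    # GPU indices are taken from the command line, e.g. "0 1 2".
    for arg in sys.argv[1:]:
        if arg.isdigit():
            code, output = run_probe(int(arg))
            status = "PASS" if code == 0 else "FAIL"
            last = output.splitlines()[-1] if output else ""
            print(f"GPU {arg}: {status} {last}")
```

Saved under a name such as probe_gpus.py (hypothetical), it would be run as "python probe_gpus.py 0 1 2". If only the Zotac card fails while the EVGA 3090 passes with the same driver and the same PyTorch install, the software stack becomes a less likely cause than that specific card or its slot, power, or clocks.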
