CUDA Error / Ubuntu / Ampere / 3090 - Constant CUDA error: an illegal instruction was encountered

HELP! I’ve been beating up a keyboard over this for two days without success.

I have an Ampere 3090. nvidia-smi reports: NVIDIA-SMI 580.95.05, Driver Version: 580.95.05, CUDA Version: 13.0. lsb_release -a reports:

Distributor ID: Ubuntu

Description: Ubuntu 24.04.3 LTS

Release: 24.04

Codename: noble

My code is simple, simple:

import torch
from diffusers import ZImagePipeline

# 1. Load the pipeline
# Use bfloat16 for optimal performance on supported GPUs
pipe = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=False,
)
pipe.to("cuda")

# [Optional] Attention Backend
# Diffusers uses SDPA by default. Switch to Flash Attention for better efficiency if supported:
# pipe.transformer.set_attention_backend("flash")    # Enable Flash-Attention-2
# pipe.transformer.set_attention_backend("_flash_3") # Enable Flash-Attention-3

# [Optional] Model Compilation
# Compiling the DiT model accelerates inference, but the first run will take longer to compile.
# pipe.transformer.compile()

# [Optional] CPU Offloading
# Enable CPU offloading for memory-constrained devices.
# pipe.enable_model_cpu_offload()

prompt = "Young Chinese woman in red Hanfu, intricate embroidery. Impeccable makeup, red floral forehead pattern. Elaborate high bun, golden phoenix headdress, red flowers, beads. Holds round folding fan with lady, trees, bird. Neon lightning-bolt lamp (⚡️), bright yellow glow, above extended left palm. Soft-lit outdoor night background, silhouetted tiered pagoda (西安大雁塔), blurred colorful distant lights."

# 2. Generate Image
image = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    num_inference_steps=9,  # This actually results in 8 DiT forwards
    guidance_scale=0.0,     # Guidance should be 0 for the Turbo models
    generator=torch.Generator("cuda").manual_seed(42),
).images[0]

image.save("example.png")

Here's the URL: https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

It fails every time with:

AcceleratorError: CUDA error: an illegal instruction was encountered
Search for `cudaErrorIllegalInstruction` in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
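As the message itself suggests, a good first step is to make kernel launches synchronous so the traceback points at the kernel that actually faulted. A minimal sketch; the variable must be set before torch initializes CUDA:

import os

# CUDA_LAUNCH_BLOCKING must be set before the first CUDA context is
# created, i.e. before any torch.cuda call in the process.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch
# ...then run the same pipeline code as above; the stack trace should
# now stop at the op that raised the illegal instruction.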


Since you are calling into Torch, the error is probably inside the torch library’s CUDA code rather than in your own script.

Is your torch version compatible with your RTX 3090 (compute capability 8.6)? Is the memory large enough?

Does it find the correct GPU? Is it compatible with your SDK (13.0) and your driver version?

Can you run other CUDA programs?

Can you configure the target GPU type (compute capability 8.6)?
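Most of these questions can be answered from Python with standard torch introspection calls. A quick sketch:

import torch

print(torch.__version__)                    # installed torch build
print(torch.version.cuda)                   # CUDA version torch was built against
print(torch.cuda.is_available())            # can torch see the driver at all?
print(torch.cuda.get_device_name(0))        # which GPU torch selected
print(torch.cuda.get_device_capability(0))  # should print (8, 6) for a 3090
print(torch.cuda.get_arch_list())           # SASS/PTX targets baked into the build
props = torch.cuda.get_device_properties(0)
print(props.total_memory / 1024**3, "GiB")  # total VRAM on that device

If the device capability isn’t (8, 6), or 'sm_86' is missing from the arch list, the torch build and the card don’t match.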

I tried different models (mostly Stability AI ones), but after working through a few things via Anthropic, it found that there was a known bug with 3090/Ampere cards and the CLIP/T5 text encoders. Stress tests, memory, etc. all worked fine. I’m not sure how to handle it, because most models want either the newer OpenAI CLIP or the T5 encoders.
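One way to test the text-encoder hypothesis is to run the encoder in float32 while keeping the rest of the pipeline in bfloat16. This is only a sketch under assumptions: it presumes the pipeline exposes its encoder under the usual diffusers text_encoder attribute, which should be verified against pipe.components first.

import torch
from diffusers import ZImagePipeline

pipe = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",
    torch_dtype=torch.bfloat16,
)
# See what components this pipeline actually has before poking at it.
print(pipe.components.keys())

# Assumed attribute name; most diffusers pipelines call it `text_encoder`.
pipe.text_encoder.to(dtype=torch.float32)
pipe.to("cuda")
# If generation now succeeds, the bf16 encoder path is the culprit;
# if it still faults, the encoder theory is probably wrong.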

On Ampere (3090), a consistently reproducible illegal instruction is very rarely caused by kernel logic.
In practice it almost always comes from a binary / PTX compatibility mismatch, e.g. SASS targeting a different SM, or the PTX JIT running on a driver that doesn’t fully support the generated PTX.

This shows up a lot with containers, cached builds, or mixed compute_XX / sm_YY flags.
I’d first verify what SASS is actually executed and whether the driver is JIT’ing PTX at runtime.
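From Python you can at least check the first half of that: whether the installed torch build ships sm_86 SASS at all, and whether a trivial bf16 kernel survives a forced synchronization. A minimal sketch:

import torch

# Compiled SASS/PTX targets in this torch build; if 'sm_86' is absent,
# the 3090 is relying on the driver JIT'ing PTX (or failing to).
print(torch.cuda.get_arch_list())

# Smoke-test a bf16 kernel; synchronize() forces any asynchronous
# illegal-instruction error to surface at this exact line.
x = torch.randn(1024, 1024, device="cuda", dtype=torch.bfloat16)
y = x @ x
torch.cuda.synchronize()
print("bf16 matmul OK:", tuple(y.shape))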


I genuinely don’t know where to start with your reply. What are the first five Unix commands I can use to answer your questions?

The error is more likely within PyTorch itself, or at least within the installation, than in how you are calling it.

What would be helpful to this community in order to debug the PyTorch installation? Again, I genuinely don’t know where to start with this particular one, as my EVGA 3060 and EVGA 3090 are working without issues.

So you have two RTX 3090 GPUs on the same system, one is working, one is showing this error?

Yes, I have a Zotac 3090, an EVGA 3090, and an EVGA 3060. The EVGAs are working with no issues.