I tried to run the FP4 Llama-3.1-8B model that NVIDIA provides on Hugging Face on a DGX Spark. I got the following error when I ran the script given on the Llama-3.1-8B-FP4 Hugging Face page:
triton.runtime.errors.PTXASError: PTXAS error: Internal Triton PTX codegen error
ptxas stderr:
ptxas-blackwell fatal : Value 'sm_121a' is not defined for option 'gpu-name'
Could you please help me figure this out? Thank you.
Try setting the arch list and pointing Triton at the system CUDA toolkit’s ptxas:
export TORCH_CUDA_ARCH_LIST=12.1a # DGX Spark: 12.0, 12.1f, 12.1a
export TRITON_PTXAS_PATH=/usr/local/cuda/bin/ptxas
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
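If the error persists, it can help to confirm which ptxas Triton is actually invoking: the "Value 'sm_121a' is not defined" message usually means an older ptxas (e.g. one bundled with the Triton wheel) was picked up instead of a CUDA 13 one that knows the DGX Spark target. A minimal check, assuming the toolkit lives under /usr/local/cuda (adjust the path for your install):

```shell
# Sketch: verify the ptxas that TRITON_PTXAS_PATH points at exists and is
# new enough to know sm_121a. The /usr/local/cuda path is an assumption.
PTXAS="${TRITON_PTXAS_PATH:-/usr/local/cuda/bin/ptxas}"
if [ -x "$PTXAS" ]; then
  "$PTXAS" --version   # should report release 13.0 or newer
else
  echo "ptxas not found at $PTXAS"
fi
```

If it prints an older release (or "not found"), fix TRITON_PTXAS_PATH before re-running the script.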
I’m trying to run it in a vLLM Docker container, and setting the TORCH_CUDA_ARCH_LIST=12.1a env var does not help.
Is there anything else needed to fix this error? I’m using the vllm/vllm-openai:nightly image.
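For reference, exporting the variables on the host shell has no effect inside the container; they have to be passed into the container environment. A sketch of a launch command (untested config fragment; the model name and --gpus flag are assumptions, and TRITON_PTXAS_PATH only helps if the image actually ships a CUDA 13 ptxas at that path):

```shell
# Sketch: pass the env vars into the container with -e instead of exporting
# them on the host. Model repo name is an assumption based on this thread.
docker run --gpus all \
  -e TORCH_CUDA_ARCH_LIST=12.1a \
  -e TRITON_PTXAS_PATH=/usr/local/cuda/bin/ptxas \
  vllm/vllm-openai:nightly \
  --model nvidia/Llama-3.1-8B-FP4
```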