Run Triton kernels on Jetson AGX Orin

I am trying to install and run Triton on my Jetson AGX Orin, but I ran into these two errors:

  • Failed to launch Triton kernels, likely due to missing CUDA toolkit; falling back to a slower median kernel implementation

And when I tried to install Triton:

  • Could not find a version that satisfies the requirement triton (from versions: none)

Can anyone help me?

Could you share the full error with us?
It looks like the kernel can still be executed and simply falls back to another supported operator?

You can find a Triton server for JetPack 5 below:


This is the error “/whisper/venv/lib/python3.8/site-packages/whisper/ UserWarning: Failed to launch Triton kernels, likely due to missing CUDA toolkit; falling back to a slower median kernel implementation.”

How do you install the Triton package?
Are you using the package shared above?


I did not install Triton, because “pip install triton” fails with:

  • Could not find a version that satisfies the requirement triton (from versions: none)

Also, I tried to install it using all the commands and methods suggested in the official documentation. Here is the link

This is the error shown when I run “pip install -e .” using the “From source” method:

error: subprocess-exited-with-error

  × Getting requirements to build editable did not run successfully.
  │ exit code: 1
  ╰─> [28 lines of output]
      Traceback (most recent call last):
        File "/home/mauro/whisper/venv/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/", line 353, in <module>
          main()
        File "/home/mauro/whisper/venv/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/home/mauro/whisper/venv/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/", line 132, in get_requires_for_build_editable
          return hook(config_settings)
        File "/tmp/pip-build-env-ffd_amvg/overlay/lib/python3.8/site-packages/setuptools/", line 450, in get_requires_for_build_editable
          return self.get_requires_for_build_wheel(config_settings)
        File "/tmp/pip-build-env-ffd_amvg/overlay/lib/python3.8/site-packages/setuptools/", line 341, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
        File "/tmp/pip-build-env-ffd_amvg/overlay/lib/python3.8/site-packages/setuptools/", line 323, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-ffd_amvg/overlay/lib/python3.8/site-packages/setuptools/", line 487, in run_setup
          super(_BuildMetaLegacyBackend,
        File "/tmp/pip-build-env-ffd_amvg/overlay/lib/python3.8/site-packages/setuptools/", line 338, in run_setup
          exec(code, locals())
        File "<string>", line 237, in <module>
        File "<string>", line 121, in download_and_copy_ptxas
        File "/usr/lib/python3.8/", line 415, in check_output
          return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
        File "/usr/lib/python3.8/", line 493, in run
          with Popen(*popenargs, **kwargs) as process:
        File "/usr/lib/python3.8/", line 858, in __init__
          self._execute_child(args, executable, preexec_fn, close_fds,
        File "/usr/lib/python3.8/", line 1704, in _execute_child
          raise child_exception_type(errno_num, err_msg, err_filename)
      OSError: [Errno 8] Exec format error: '/home/mauro/whisper2/triton/python/triton/third_party/cuda/bin/ptxas'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
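
Editor's note: "Exec format error" (errno 8) is what the operating system reports when it cannot execute a file, most often because the binary was built for a different CPU architecture. A plausible reading of the traceback is that the build step download_and_copy_ptxas fetched a ptxas binary that is not an aarch64 build, but that is an assumption, not something the log proves. The sketch below shows one way to check by reading the architecture field from a binary's ELF header. The helper names elf_machine and matches_host are made up for this example.

```python
import platform
import struct

# Subset of ELF e_machine codes (values from the ELF specification).
ELF_MACHINES = {0x03: "x86", 0x3E: "x86_64", 0x28: "arm", 0xB7: "aarch64"}


def elf_machine(path):
    """Return the architecture named in an ELF file's header, or None."""
    with open(path, "rb") as f:
        header = f.read(20)
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None  # not an ELF binary, or the file is truncated
    # Byte 5 (EI_DATA) gives the byte order: 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18.
    (code,) = struct.unpack_from(endian + "H", header, 18)
    return ELF_MACHINES.get(code, hex(code))


def matches_host(path):
    """True if the binary's architecture name equals platform.machine()."""
    return elf_machine(path) == platform.machine()
```

Calling elf_machine() on the ptxas path from the OSError above and comparing the result with platform.machine() (which reports "aarch64" on a Jetson) would show whether the two disagree.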


Just want to clarify first.

/whisper/venv/lib/python3.8/site-packages/whisper/ UserWarning: Failed to launch Triton kernels, likely due to missing CUDA toolkit; falling back to a slower median kernel implementation.

Is the error above shown when you try to install the package?



That warning appears when I run Whisper with the CLI option “--word_timestamps True”.
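
Editor's note: the wording of the warning ("falling back to a slower median kernel implementation") suggests a try-the-fast-path, fall-back-on-failure pattern, which would explain why Whisper keeps running without Triton. The sketch below illustrates that general pattern only. It is not Whisper's code: the function names, the fast_impl parameter and the edge-padding choice are assumptions made for the example.

```python
import statistics
import warnings


def median_filter_reference(values, width):
    """Plain-Python sliding median for a non-empty list and an odd width.

    Edge values are repeated as padding so the output has the input's length.
    """
    pad = width // 2
    padded = [values[0]] * pad + list(values) + [values[-1]] * pad
    return [statistics.median(padded[i:i + width]) for i in range(len(values))]


def median_filter(values, width, fast_impl=None):
    """Use fast_impl when given; warn and fall back if it raises RuntimeError."""
    if fast_impl is not None:
        try:
            return fast_impl(values, width)
        except RuntimeError:
            warnings.warn("fast kernel failed; falling back to the slower implementation")
    return median_filter_reference(values, width)
```

With this structure the result is the same either way; only the speed differs, so a warning of this kind does not stop the program.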



So you are running Whisper without installing Triton first?
Is Triton listed as a dependency of Whisper?

We will check the Triton installation issue and update here later.

Yes, I ran Whisper without installing Triton first, because Triton is listed as a dependency and should be installed together with Whisper.

Here below you can find an installation example in Google Colab:

Collecting git+
Cloning to /tmp/pip-req-build-d8pwz7zk
Running command git clone --filter=blob:none --quiet GitHub - openai/whisper: Robust Speech Recognition via Large-Scale Weak Supervision /tmp/pip-req-build-d8pwz7zk
Resolved GitHub - openai/whisper: Robust Speech Recognition via Large-Scale Weak Supervision to commit 248b6cb124225dd263bb9bd32d060b6517e067f8
Installing build dependencies … done
Getting requirements to build wheel … done
Preparing metadata (pyproject.toml) … done
Requirement already satisfied: triton==2.0.0 in /usr/local/lib/python3.10/dist-packages (from openai-whisper==20230314) (2.0.0)

I think the problem could be the Python version: the Jetson AGX Orin ships with Python 3.8 by default.
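
Editor's note: "(from versions: none)" generally means pip found no release file compatible with the running interpreter and platform. A wheel's filename encodes both a Python tag and a platform tag, so the CPU architecture (aarch64 on a Jetson) can rule a wheel out even when the Python version matches. The sketch below is a deliberately simplified compatibility check. The helper names are made up, the wheel filename is only an illustrative example, and pip's real tag matching is more involved.

```python
def wheel_tags(filename):
    """Split a wheel filename into (python tags, abi tags, platform tags)."""
    stem = filename[: -len(".whl")]
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    # Each field may hold several tags joined by dots.
    return python_tag.split("."), abi_tag.split("."), platform_tag.split(".")


def may_install(filename, py_tag, machine):
    """Simplified check: Python tag listed and platform is 'any' or ends in machine."""
    pythons, _, platforms = wheel_tags(filename)
    python_ok = py_tag in pythons
    platform_ok = any(p == "any" or p.endswith("_" + machine) for p in platforms)
    return python_ok and platform_ok


# Illustrative filename (an assumption, not taken from this thread).
example = "triton-2.0.0-1-cp38-cp38-manylinux2014_x86_64.manylinux_2_17_x86_64.whl"
print(may_install(example, "cp38", "x86_64"))   # True
print(may_install(example, "cp38", "aarch64"))  # False
```

Under these assumptions, Python 3.8 itself would not be the obstacle for such a wheel; the architecture would be.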

Just want to confirm again.
Which JetPack version do you use? Is it JetPack 5.1.1?



Yes, I am using JetPack 5.1.1.


We have confirmed that the Triton server can work normally on Orin+JetPack 5.1.1.
Could you give it a try to see if it helps the whisper issue?

Install dependencies

$ sudo apt-get update
$ sudo apt-get install -y --no-install-recommends \
            software-properties-common \
            autoconf \
            automake \
            build-essential \
            git \
            bc \
            g++-8 \
            gcc-8 \
            clang-8 \
            lld-8 \
            curl \
            jq \
            libb64-dev \
            libre2-dev \
            libssl-dev \
            libtool \
            libboost-dev \
            rapidjson-dev \
            patchelf \
            pkg-config \
            libopenblas-dev \
            libarchive-dev \
            zlib1g-dev \
            python3 \
            python3-dev \
            python3-pip \
            libb64-0d \
            libre2-5 \
            libssl1.1
$ pip3 install --upgrade wheel setuptools cython
$ pip3 install --upgrade flake8 flatbuffers expecttest xmlrunner hypothesis aiohttp pyyaml scipy ninja typing_extensions protobuf grpcio-tools numpy attrdict pillow

Install PyTorch

$ pip3 install --upgrade

Install the Triton inference server

$ wget
$ sudo mkdir /opt/tritonserver
$ sudo tar zxvf tritonserver2.33.0-jetpack5.1.tgz -C /opt/tritonserver/

Download model

$ git clone --depth 1
$ mkdir model_repository ; cp -r server/docs/examples/model_repository/simple model_repository


$ /opt/tritonserver/bin/tritonserver --model-repository=./model_repository --backend-directory=/opt/tritonserver/backends --backend-config=tensorflow,version=2
$ /opt/tritonserver/clients/bin/perf_analyzer -m simple
*** Measurement Settings ***
  Batch size: 1
  Service Kind: Triton
  Using "time_windows" mode for stabilization
  Measurement window: 5000 msec
  Using synchronous calls for inference
  Stabilizing using average latency

Request concurrency: 1
    Request count: 21281
    Throughput: 1181.81 infer/sec
    Avg latency: 844 usec (standard deviation 1163 usec)
    p50 latency: 827 usec
    p90 latency: 896 usec
    p95 latency: 933 usec
    p99 latency: 1024 usec
    Avg HTTP time: 836 usec (send/recv 114 usec + response wait 722 usec)
    Inference count: 21283
    Execution count: 21283
    Successful request count: 21283
    Avg request latency: 449 usec (overhead 53 usec + queue 37 usec + compute input 37 usec + compute infer 297 usec + compute output 24 usec)

Inferences/Second vs. Client Average Batch Latency
Concurrency: 1, throughput: 1181.81 infer/sec, latency 844 usec
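
Editor's note: besides perf_analyzer, a quick way to confirm that the server started above is accepting requests is to query its HTTP readiness endpoint. The sketch below assumes the server is listening on its default HTTP port 8000 on localhost; the function name triton_server_ready is made up for this example.

```python
import http.client
import urllib.error
import urllib.request


def triton_server_ready(host="localhost", port=8000, timeout=2.0):
    """Return True if GET /v2/health/ready answers with HTTP 200."""
    url = f"http://{host}:{port}/v2/health/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a non-2xx answer all count as "not ready".
        return False
```

Running triton_server_ready() in a second terminal while tritonserver is up should return True; it returns False if nothing is listening on that port.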


Hi! Thank you so much! I will try this solution after I flash the device again. The device currently has at least two problems:
- The Wi-Fi module can’t connect to any network.
- The Ethernet connection drops suddenly.

After flashing the device I will test the Wi-Fi and Ethernet connections and then try your solution.

I think there is some confusion in this thread between NVIDIA’s Triton Inference Server and OpenAI’s Triton (a language and compiler for writing GPU kernels), which is where the error message maurofirmani originally posted comes from.

As far as I can tell, these are two completely separate things.
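
Editor's note: one way to see which of the two is present on a machine is to check for each separately. The sketch below looks for an importable Python module named triton (the package name that “pip install triton” refers to) and for the server binary at the path used in the install steps earlier in this thread. The function name and dictionary keys are made up, and an importable module named triton is only a hint, not proof, of which project it belongs to.

```python
import importlib.util
import shutil


def detect_tritons(server_binary="/opt/tritonserver/bin/tritonserver"):
    """Report separately on the Python `triton` module and the server binary."""
    return {
        # Python package that `pip install triton` would provide.
        "python_triton_module": importlib.util.find_spec("triton") is not None,
        # Executable unpacked by the tritonserver tarball in the steps above.
        "inference_server_binary": shutil.which(server_binary) is not None,
    }
```

On the setup described in this thread, the server binary can be present while the Python module is missing, which matches the Whisper warning still appearing.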