Advancing GPU Programming with the CUDA Tile IR Backend for OpenAI Triton

Originally published at: Advancing GPU Programming with the CUDA Tile IR Backend for OpenAI Triton | NVIDIA Technical Blog

NVIDIA CUDA Tile is a GPU programming model that targets portability for NVIDIA Tensor Cores while unlocking peak GPU performance. One key advantage of CUDA Tile is that you can build your own domain-specific language (DSL) on top of it. This post shares the work NVIDIA is doing to integrate CUDA Tile as a backend for…