Numba: High-Performance Python with CUDA Acceleration

Originally published at:

Looking for more? Check out the hands-on DLI training course: Fundamentals of Accelerated Computing with CUDA Python [Note, this post was originally published September 19, 2013. It was updated on September 19, 2017.] Python is a high-productivity dynamic programming language that is widely used in science, engineering, and data analytics applications. There are a number…

Could you please indicate which modules you include in the fractal example (the mandel kernel)? I tried with

from numpy import uint32
from numbapro import cuda

but I am still missing something
Thank you

Check the IPython notebook here:

But to answer your question, you need:

import numpy as np
from pylab import imshow, show
from timeit import default_timer as timer
from numbapro import cuda
from numba import *
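For anyone trying to follow along without the notebook, here is a minimal CPU-only sketch of the escape-time iteration a Mandelbrot kernel of this shape computes (names and signature are illustrative, not necessarily identical to the notebook's):

```python
def mandel(x, y, max_iters):
    """Return the iteration count at which z = z*z + c escapes
    the radius-2 circle, or max_iters if it never escapes."""
    c = complex(x, y)
    z = 0.0j
    for i in range(max_iters):
        z = z * z + c
        # |z|^2 >= 4 means |z| >= 2: the point has escaped
        if (z.real * z.real + z.imag * z.imag) >= 4:
            return i
    return max_iters
```

The same function body can then be compiled for the GPU with Numba's CUDA decorators, which is what the post's example does.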

Hi Mark,
thank you! Now it's working, provided that I change the "ThreadIdx" attributes to "threadIdx", that is, with a lower-case "t".

That's a typo, which I have now fixed -- thanks!

Guess it works only with Python 2.7?

Looking at the tutorial, the supported versions are:
Python 2.7, 3.3-3.6
NumPy 1.8 and later

The notebook is now updated as well -- it no longer imports from numbapro, just numba. See the notebook.

It works with Python 3.x if you add parentheses to the print statements.
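That is because print became a function in Python 3, so the parenthesized form is required there (and also works in Python 2):

```python
# Python 2: `print x` is a statement.
# Python 3: print is a function, so parentheses are required.
x = 42
print(x)          # valid in both Python 2 and 3
print("x =", x)
```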

amazing tutorial thank you! Numba is amazing.

Is anyone writing a book on Python 3.6 CUDA programming? I did an Amazon and Google search but I couldn't find anything. If you want people to use Python 3.6 and CUDA, someone should write a book. Even a PDF would be OK.

Please help me.

Is there a way to synchronize Numba.CUDA threads on incoming data in the GPU? #GPUdirect

Is there any book and/or tutorial for Python + CUDA?

There is an NVIDIA DLI course:

One question: is it possible to access and use the RT cores on a Turing from Numba? I need to compute intersections for my science application and I prefer to use the hardware approach.

It’s not possible to directly access them using Numba (or CUDA in general). They can be used via OptiX, the DirectX Ray Tracing (DXR) API, or the VKRay extensions to Vulkan. Each of these is described on the NVIDIA RTX Ray Tracing | NVIDIA Developer site.

The Numba CUDA target is extensible (see Extending the Numba CUDA Target, a part of the Numba for CUDA Programmers tutorial). I’m not aware of an extension that provides access to OptiX functions, but it would be possible to build a Numba extension that provides access to the OptiX APIs.