Combining results from multiple CUDA threads into a Python array

Hi, I am just getting started with CUDA programming using Numba in Python. How can I store the results obtained from multiple CUDA threads in a single Python list on the host?

Numba CUDA doesn't support Python lists for communication to or from a Numba CUDA kernel. The supported data containers are scalars, tuples, and NumPy arrays.

You would want to put the results in a Numba CUDA device array, then transfer it to the host. At that point it will be a NumPy array. You can then convert the NumPy array to a list using methods that have nothing to do with CUDA or Numba CUDA.

When I try to use numpy.append() inside a CUDA kernel, I get an error. Is the function supported?

The link I provided shows what is supported. numpy.append() performs array creation:

append : ndarray

A copy of arr with values appended to axis. Note that append does not occur in-place: a new array is allocated and filled.

which is not supported:

Unsupported numpy features:

array creation APIs.
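
To see the "new array" behaviour on the host, here is a small NumPy-only sketch (nothing CUDA-specific; the variable names are just for illustration):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.append(a, 4)     # allocates and returns a new array

print(a.shape, b.shape)         # a keeps its original length
print(np.shares_memory(a, b))   # False: b has its own storage
```

Because every call allocates a fresh array, there is no in-place "grow" operation for a kernel to perform.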

Does that mean that I have to know the size of the array before passing it to the kernel?

Yes, probably. This is a somewhat common question in CUDA (here is one example, there are others) and in Numba CUDA.

Ok. Thanks.