I assume you are referring to this:
That isn’t really predicated on the size of the array in question.
You’re not doing anything wrong, in my opinion. Your definition here:
data = numpy.ones(2048)
is a host-based numpy array. As you have demonstrated, you can certainly use that as “input” to CUDA device code, in numba CUDA. I always considered that a “feature” (i.e. a convenience that is nice to have) in numba CUDA.
In CUDA C++, traditionally, such a thing doesn’t work. It’s illegal. You must explicitly copy the data to and from the GPU. Naturally, this copying takes time; it’s not “free”.
What numba CUDA does under the hood for you is copy that numpy array to a device memory array that is accessible to CUDA device code (kernel code). On the other end (after the kernel is finished) numba also copies that data back to the original numpy array on the host. This makes all your host code work as one might typically expect, if they were a python programmer and generally unaware of the asynchronous nature of GPU computing.
However, it does incur that copy cost. People who don’t understand any of this might assume that the cost to copy data is “free”, and then wonder why a trivial kernel takes so long to execute (in numba CUDA). It’s taking a long time to execute, because in numba CUDA, when using host numpy arrays for input/output, the kernel launch is burdened with the additional cost of data copying.
That is the reason for the “performance warning”: to make it really obvious to people who are unaware of the mechanism that what they are doing is not “free”.
Numba didn’t always warn this way. Somewhere along the way, perhaps somebody got tired of people posting questions asking why their kernels are taking so long to execute. As a result of trying to address that, we now have questions asking what the warning means. So you can’t win, really. But maybe it’s better to be really obvious and verbose about everything.
I’m sure one method to eliminate this warning is to manually/explicitly do the copying yourself.
The general necessity to copy data to and from a CUDA GPU is covered in the first few sections of this online training series, although that has CUDA C++ in view. You can find numba CUDA online tutorials with a bit of searching. Here is one.
Also, when posting code here on this forum, it’s preferred that you format it correctly. One simple method to do that is when you are editing your post (click the pencil icon below your post), select the code, then click the </> button at the top of the edit window, then save your changes. I’ve done it for you this time.