I need a way to create CUDA linear arrays on the device without knowing the size of the array in advance.
One possibility is to create the linear array in host code and pass it to the kernel through the parameter list, as follows:
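A minimal sketch of that approach might look like the following. It is written against the CUDA runtime API for brevity rather than the driver API used from C#, and the kernel name `scaleKernel`, the array size, and the scaling operation are hypothetical placeholders, not the original code.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: the linear array arrives as an ordinary pointer
// parameter, together with its length.
__global__ void scaleKernel(float* data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

int main()
{
    int n = 1000;                        // size only known at run time (assumed value)
    std::vector<float> host(n, 1.0f);

    // Allocate the linear array on the device from host code.
    float* d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemcpy(d_data, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // The device pointer is handed over in the kernel parameter list,
    // so it has to be set up again for every launch.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    scaleKernel<<<blocks, threads>>>(d_data, n, 2.0f);

    cudaMemcpy(host.data(), d_data, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_data);

    printf("first element after kernel: %f\n", host[0]);
    return 0;
}
```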
Since I have to call the kernel more than 25000 times, I assume that the time needed to set up the kernel with the cuParamSetx
methods and the other necessary calls is too slow, and I currently have a tremendous bottleneck doing it this way.
Since I’m developing in C#, I have no direct link between host and device code, so I need something like this:
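As a rough illustration of the idea, the sketch below keeps the array pointer in a `__device__` symbol that is set once, so the kernel no longer needs it as a parameter. It is written in CUDA C with the runtime API, where host and device code live in one file; the symbol names `g_data` and `g_size` and the kernel `scaleKernel` are hypothetical, and this is only one way such a setup could look.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Device-side symbols: NULL / 0 until the host overwrites them.
__device__ float* g_data = NULL;
__device__ int    g_size = 0;

// Hypothetical kernel: reads the array through the global symbol
// instead of receiving the pointer in its parameter list.
__global__ void scaleKernel(float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (g_data != NULL && i < g_size) {
        g_data[i] *= factor;
    }
}

int main()
{
    int n = 1000;                        // size only known at run time (assumed value)
    std::vector<float> host(n, 1.0f);

    float* d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemcpy(d_data, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // Store the device address and the length in the symbols once;
    // later launches reuse them without passing the pointer again.
    cudaMemcpyToSymbol(g_data, &d_data, sizeof(d_data));
    cudaMemcpyToSymbol(g_size, &n, sizeof(n));

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    scaleKernel<<<blocks, threads>>>(2.0f);

    cudaMemcpy(host.data(), d_data, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_data);

    printf("first element after kernel: %f\n", host[0]);
    return 0;
}
```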
Since I’m in C#, the symbol pointer cannot be changed, and it stays NULL as defined in the kernel code.
What I need is to change the symbol address.
Do you see a way to do this?