For an application I’m writing I need to generate an image on the GPU, then use it as a texture in the next kernel. For maximum speed it is recommended to use CUDA arrays for this.
With the current API, it seems I have to allocate a CUDA array and then copy the image into it with cudaMemcpyToArray, even though the image is already in GPU memory.
Why isn’t it possible to get the address of an array directly so that I can write to it from a kernel? That shouldn’t be a problem unless a kernel writes to and reads from the same texture, and it would save a redundant copy in my case.
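For reference, here is a rough sketch of the workflow I am describing, in case it makes the redundant copy clearer. The kernel name `generateImage`, the 256x256 size, and the float pixel format are placeholders for illustration, not my actual code. I have used cudaMemcpy2DToArray here, the 2D sibling of cudaMemcpyToArray, since the latter is marked deprecated in recent toolkits.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical producer kernel: fills a linear device buffer with a gradient.
__global__ void generateImage(float* out, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height)
        out[y * width + x] = static_cast<float>(x + y);
}

int main()
{
    const int width = 256, height = 256;  // assumed image size

    // Step 1: linear device memory that the kernel can write to.
    float* d_image = nullptr;
    cudaMalloc(&d_image, width * height * sizeof(float));

    // Step 2: a CUDA array that a texture can later be bound to.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t d_array = nullptr;
    cudaMallocArray(&d_array, &desc, width, height);

    // Step 3: generate the image on the GPU.
    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
    generateImage<<<grid, block>>>(d_image, width, height);

    // Step 4: the device-to-device copy I would like to avoid.
    cudaMemcpy2DToArray(d_array, 0, 0,
                        d_image, width * sizeof(float),  // source and its pitch
                        width * sizeof(float), height,   // row bytes and row count
                        cudaMemcpyDeviceToDevice);

    // The next kernel would sample d_array through a texture here.

    cudaError_t err = cudaDeviceSynchronize();
    printf("status: %s\n", cudaGetErrorString(err));

    cudaFreeArray(d_array);
    cudaFree(d_image);
    return 0;
}
```

Step 4 is the part in question: the data never leaves the GPU, but it still has to be moved from the linear buffer into the array's own layout before it can be used as a texture.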