In my program, every thread must read two different float values from an array, at random indices.
I tried different memory types for the same problem: I stored the array in global memory, which was quite fast, then in shared memory, which was a bit faster, and finally in constant memory, which was really slow. Of course, all three kernels produce the same result.
With constant memory, the kernel launch takes roughly ten times as long as with global memory. Is this what you would expect for random access to constant memory, or is my code broken? I thought that, in principle, the runtime should be better with constant memory, because I only read from it…
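To make the comparison concrete, here is a minimal sketch of the kind of setup described above. It is an illustration under assumptions, not the original code: the array size `N`, the kernel names `readGlobal` and `readConstant`, and the helper `randomIndex` (a simple hash standing in for the real index computation) are all hypothetical. The shared memory variant is left out to keep it short.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 1024  // assumed array size (hypothetical); 4 KB fits in constant memory

__constant__ float c_data[N];  // copy of the array in constant memory

// Hypothetical stand-in for the real index computation: one LCG step per call.
__device__ unsigned int randomIndex(unsigned int seed) {
    seed = seed * 1664525u + 1013904223u;
    return (seed >> 8) % N;
}

// Picks two distinct indices per thread; the offset is in [1, N-1], so j != i.
__device__ void pickIndices(unsigned int tid, unsigned int* i, unsigned int* j) {
    *i = randomIndex(tid);
    *j = (*i + 1u + randomIndex(tid + 12345u) % (N - 1)) % N;
}

// Variant 1: the array lives in global memory.
__global__ void readGlobal(const float* data, float* out, int n) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;
    unsigned int i, j;
    pickIndices(tid, &i, &j);
    out[tid] = data[i] + data[j];
}

// Variant 2: same access pattern, but reading from constant memory.
// Threads of one warp generally hit different addresses here.
__global__ void readConstant(float* out, int n) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;
    unsigned int i, j;
    pickIndices(tid, &i, &j);
    out[tid] = c_data[i] + c_data[j];
}

int main() {
    const int n = 1 << 20;          // number of threads (assumed)
    const int block = 256;
    const int grid = (n + block - 1) / block;

    float h_data[N];
    for (int k = 0; k < N; ++k) h_data[k] = (float)k;

    float *d_data, *d_out;
    cudaMalloc(&d_data, N * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemcpy(d_data, h_data, sizeof(h_data), cudaMemcpyHostToDevice);
    cudaMemcpyToSymbol(c_data, h_data, sizeof(h_data));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    float msGlobal = 0.0f, msConstant = 0.0f;

    // Time the global memory kernel.
    cudaEventRecord(start);
    readGlobal<<<grid, block>>>(d_data, d_out, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    cudaEventElapsedTime(&msGlobal, start, stop);

    // Time the constant memory kernel.
    cudaEventRecord(start);
    readConstant<<<grid, block>>>(d_out, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    cudaEventElapsedTime(&msConstant, start, stop);

    printf("global: %.3f ms, constant: %.3f ms\n", msGlobal, msConstant);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    cudaFree(d_out);
    return 0;
}
```

The timings use CUDA events around each launch, so they measure kernel execution only and not the host-to-device copies. Error checking is omitted for brevity, and the first launch may include one-time startup overhead, so a warm-up launch before timing would give fairer numbers.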
Thanks for any help,