Hi,
I understand that accessing constant memory (which is cached) is faster than accessing global memory.
But my kernel runs at the same speed whether I place the array in constant or global memory. The two cases are shown below:
Constant memory:
kernel void test(constant int *a) {
…
int n = rand(); // rand() generates an integer; n differs from work-item to work-item
int tmp = a[n];
…
}
Global memory:
kernel void test(global int *a) {
…
int n = rand(); // rand() generates an integer; n differs from work-item to work-item
int tmp = a[n];
…
}
Therefore, the work-items will generally not access the same address of array a.
Both cases run at the same speed (kernel time).
According to the NVIDIA guide, if the work-items access different addresses in constant memory, the accesses are serialized.
With this access pattern, is there then no difference between the two cases?
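To make my understanding of that serialization concrete, here is a rough host-side sketch in plain C (not kernel code). It assumes a simplified model in which one group of work-items executing together needs one constant-memory pass per distinct address it reads. The function name constant_read_passes is my own, and real hardware (caching, group width) may behave differently.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model (an assumption, not a hardware specification):
 * a group of work-items reading constant memory needs one pass per
 * DISTINCT index in the group, because each pass can only broadcast
 * a single address to all work-items that asked for it. */
static int constant_read_passes(const int *idx, int n)
{
    assert(idx != NULL && n >= 0);

    int passes = 0;
    for (int i = 0; i < n; ++i) {
        /* Check whether an earlier work-item already requested idx[i]. */
        int seen = 0;
        for (int j = 0; j < i; ++j) {
            if (idx[j] == idx[i]) {
                seen = 1;
                break;
            }
        }
        if (!seen) {
            ++passes; /* a new address costs one more serialized pass */
        }
    }
    return passes;
}
```

Under this model, a group where every work-item reads the same index costs 1 pass, while a group of 32 work-items with 32 different random indices costs 32 passes, which is the case in my kernel.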
Thanks