Why doesn't constant memory work in my example application?

I really can’t explain this.
My application works just fine when I use only global memory, but when I try to use constant memory, it behaves as if constant memory were filled with zeroes.
What I’m trying to do in my example is move data_x and data_y into constant memory to speed up my app.
Thank you for your time.

my_k-means_map_tid_reduce_atomic_no_thrust_constant.cu (7.53 KB)

a.txt (190 Bytes)

b.txt (192 Bytes)

Edit: I actually found the problem just now. For some weird reason, the pointer to constant memory screwed everything up.
I changed d_data_x to c_data_x INSIDE the kernel, NOT in the argument list, and it worked, but to my disappointment it was slower than global memory (probably due to the Fermi cache).
Can anybody explain to me why the pointer thing is happening?

The compiler needs to be able to figure out at compile time which kind of memory a pointer points to. This is possible if you set it inside the kernel, but not if you pass it in as an argument.
Did you get a “Cannot tell what pointer points to, assuming global memory space” warning?
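To illustrate the point about naming the constant array inside the kernel, here is a minimal sketch, not the code from the attached file. The array size N, the kernel sum_xy, and the helper run_constant_demo are made up for the example; only the c_data_x / c_data_y naming follows the post above. The kernel references the `__constant__` symbols directly, and the host fills them with cudaMemcpyToSymbol.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 256  // hypothetical point count, well within the 64 KB constant space

// Constant-memory arrays (names follow the c_data_x mentioned in the thread).
__constant__ float c_data_x[N];
__constant__ float c_data_y[N];

// The kernel names the __constant__ symbols directly instead of
// receiving them through pointer arguments.
__global__ void sum_xy(float *d_out)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < N)
        d_out[tid] = c_data_x[tid] + c_data_y[tid];
}

// Hypothetical helper: returns true if every output matches the host-side sum.
bool run_constant_demo()
{
    float h_x[N], h_y[N], h_out[N];
    for (int i = 0; i < N; ++i) { h_x[i] = (float)i; h_y[i] = 2.0f * i; }

    // Fill constant memory by symbol, not through a plain device pointer.
    if (cudaMemcpyToSymbol(c_data_x, h_x, sizeof(h_x)) != cudaSuccess) return false;
    if (cudaMemcpyToSymbol(c_data_y, h_y, sizeof(h_y)) != cudaSuccess) return false;

    float *d_out = NULL;
    if (cudaMalloc((void **)&d_out, sizeof(h_out)) != cudaSuccess) return false;

    sum_xy<<<(N + 127) / 128, 128>>>(d_out);

    // The blocking copy also waits for the kernel to finish.
    bool ok = cudaMemcpy(h_out, d_out, sizeof(h_out),
                         cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(d_out);

    for (int i = 0; ok && i < N; ++i)
        ok = (h_out[i] == h_x[i] + h_y[i]);
    return ok;
}

int main()
{
    printf("%s\n", run_constant_demo() ? "constant memory OK" : "mismatch");
    return 0;
}
```

If you do need a real device pointer to a constant array on the host side, cudaGetSymbolAddress can provide one, but the simplest route is to let the kernel use the symbol by name as above.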

Constant cache is best if all threads of a warp read the same address. If different addresses are read, the accesses are serialized and take as many cycles as there are distinct addresses, so it can get quite slow. In that case it is better to use a texture.
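The two access patterns described above can be sketched as follows. This is only an illustration under assumed names (c_table, read_uniform, read_scattered, and run_pattern_demo are all hypothetical), not a benchmark. In the first kernel every thread of a warp reads the same constant address on each iteration; in the second, each thread reads a different address on each iteration, which is roughly what happens when every thread fetches its own data point by thread index.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 256        // hypothetical table size
#define THREADS 128  // hypothetical block size

__constant__ int c_table[N];

// Uniform pattern: on every iteration, all threads of a warp read
// the same element, so the constant cache can broadcast it.
__global__ void read_uniform(int *d_out)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int sum = 0;
    for (int k = 0; k < N; ++k)
        sum += c_table[k];
    d_out[tid] = sum;
}

// Scattered pattern: on every iteration, each thread of a warp reads
// a different element, so the constant-cache accesses are serialized.
__global__ void read_scattered(int *d_out)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int sum = 0;
    for (int k = 0; k < N; ++k)
        sum += c_table[(tid + k) % N];
    d_out[tid] = sum;
}

// Launches one kernel and checks that every thread produced `expected`.
static bool check_kernel(void (*kernel)(int *), int expected)
{
    int h_out[THREADS];
    int *d_out = NULL;
    if (cudaMalloc((void **)&d_out, sizeof(h_out)) != cudaSuccess) return false;

    kernel<<<1, THREADS>>>(d_out);

    bool ok = cudaMemcpy(h_out, d_out, sizeof(h_out),
                         cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(d_out);

    for (int i = 0; ok && i < THREADS; ++i)
        ok = (h_out[i] == expected);
    return ok;
}

// Hypothetical helper: both kernels sum the whole table, only the
// order of the reads differs, so both must give the same total.
bool run_pattern_demo()
{
    int h_table[N];
    int expected = 0;
    for (int i = 0; i < N; ++i) { h_table[i] = i; expected += i; }

    if (cudaMemcpyToSymbol(c_table, h_table, sizeof(h_table)) != cudaSuccess)
        return false;

    return check_kernel(read_uniform, expected) &&
           check_kernel(read_scattered, expected);
}

int main()
{
    printf("%s\n", run_pattern_demo() ? "both patterns OK" : "mismatch");
    return 0;
}
```

Both kernels return the same sums; only the read pattern differs. Timing each launch (for example with CUDA events) on your own card would show how much the serialization costs in practice.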

No, I didn’t get any warnings.

I know what constant cache is good for, but in any case shouldn’t it be at least as good as global memory? (That’s why I said it is likely because of the L1 cache that the constant memory implementation was slower.)

Yes, I assume you were perfectly right about the cache, as uncached global memory would certainly be slower. However, for the reason cited above, constant cache can be slower than cached global memory.