Double in C1060 - shared variable splitting

Hi,

I am using a double shared variable in my kernel, and according to the programming guide (page 61) I would have bank conflicts when I access data from it.

The solution they state is to split the variable into hi and lo variables. I was not able to fully understand how that would work.

Also, I am wondering if there is a better way to handle this problem. My current SP code is already a little messy when it comes to shared memory access, and I don't want to make it messier with this hi and lo thing.

Thanks all…
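
For reference, here is a minimal sketch of the hi/lo split being asked about. On compute capability 1.x devices such as the C1060, shared memory has 16 banks that are each 32 bits wide, so when the threads of a half-warp access consecutive doubles, thread k and thread k+8 land in the same bank (a 2-way conflict). Storing the low and high 32-bit halves in two separate int arrays gives every thread a stride of one 32-bit word instead. The kernel name `roundtrip`, the host helper `run_roundtrip`, the block size `BLOCK`, and the neighbour-read pattern are all made up for illustration; only the intrinsics `__double2loint`, `__double2hiint`, and `__hiloint2double` come from CUDA itself. On a C1060 this assumes compiling with `-arch=sm_13` so that doubles are not demoted to float.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK 64  // hypothetical block size chosen for this sketch

// Each thread stores one double in shared memory as two 32-bit halves,
// then reads its neighbour's value back and reassembles it.
__global__ void roundtrip(const double *in, double *out, int n)
{
    __shared__ int shared_lo[BLOCK];  // low 32 bits of each double
    __shared__ int shared_hi[BLOCK];  // high 32 bits of each double

    int tid = threadIdx.x;
    int gid = blockIdx.x * blockDim.x + tid;

    double v = (gid < n) ? in[gid] : 0.0;

    // Split: each array is indexed with a stride of one 32-bit word.
    shared_lo[tid] = __double2loint(v);
    shared_hi[tid] = __double2hiint(v);
    __syncthreads();

    // Recombine the neighbouring thread's value from its two halves.
    int src = (tid + 1) % blockDim.x;
    double w = __hiloint2double(shared_hi[src], shared_lo[src]);

    if (gid < n) out[gid] = w;
}

// Hypothetical host-side check: returns the number of mismatched elements.
int run_roundtrip()
{
    const int n = 2 * BLOCK;
    double h_in[n], h_out[n];
    for (int i = 0; i < n; ++i) h_in[i] = 1.0 / (i + 3);

    double *d_in = 0, *d_out = 0;
    cudaMalloc(&d_in, n * sizeof(double));
    cudaMalloc(&d_out, n * sizeof(double));
    cudaMemcpy(d_in, h_in, n * sizeof(double), cudaMemcpyHostToDevice);

    roundtrip<<<n / BLOCK, BLOCK>>>(d_in, d_out, n);
    cudaMemcpy(h_out, d_out, n * sizeof(double), cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_out);

    int bad = 0;
    for (int i = 0; i < n; ++i) {
        int base = (i / BLOCK) * BLOCK;            // first element of this block
        int src  = base + (i % BLOCK + 1) % BLOCK; // neighbour within the block
        if (h_out[i] != h_in[src]) ++bad;
    }
    return bad;
}
```

Calling `run_roundtrip()` from a `main` should return 0 if the split and recombine steps preserve every bit of the original doubles. Whether the extra conversions pay off in a real kernel depends on how much time is actually lost to the conflicts, so it is worth measuring before restructuring existing code.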

Then don't make it messier… Shared memory bank conflicts are the last thing to look at for speedup.

Thanks Sarnath…

Why do you say that shared memory conflicts are the last thing to consider? I am getting a performance of 45 GFLOPS. It is a test N-body algorithm, which I have written to learn CUDA with double precision. The card is a Tesla C1060 (max DP throughput of ~80 GFLOPS).

Pardon my lack of know-how, but I am not from a computer science background.

Thanks,