warp serialize

Hi everyone,
I am optimizing my kernel with the CUDA Visual Profiler, and it reports a high “warp serialize” count. According to the profiler documentation, this counter only reflects access conflicts in either constant or shared memory. However, I never explicitly declare either kind of memory in my kernel. What’s the reason?
My card is a GTX 260. The kernel solves a cubic equation and contains plenty of branches. Register usage is also high, at 57 registers per thread. Is there anything wrong?
