Data transfer from FBO to PBO

I am running into a performance problem when transferring data from a framebuffer object (FBO) to a pixel buffer object (PBO) in an OpenGL-based CUDA application. I render a frame into the FBO, copy it to the PBO, and then use the data in CUDA. The FBO-to-PBO copy is much slower than I expected: it takes longer than rendering the entire scene into the FBO. Does anyone know what could cause this delay? The frame size is 64x64. I am using the latest NVIDIA driver, and I have tried the same program on both a GTX 260 and a GTX 460.