Surface reference faster than Surface object


I recently changed my algorithm from using a surface reference to using a surface object, and I noticed that the program runs slower.

Here is a simple example:
I fill a 3D float array [400 x 400 x 400] with a constant value.

Surface reference API : 9.068928 ms

surface<void, cudaSurfaceType3D> s_volumeSurf;
surf3Dwrite(value, s_volumeSurf, px*sizeof(float), py, pz, cudaBoundaryModeTrap);

Surface object API : 14.960256 ms

cudaSurfaceObject_t l_volSurfObj;
surf3Dwrite(value, l_volSurfObj, px*sizeof(float), py, pz, cudaBoundaryModeTrap);
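
For anyone who wants to reproduce the surface object case, here is a minimal self-contained sketch. It is not the exact benchmark code: the kernel name fillKernel, the 8x8x8 block size, the fill value and the event-based timing are all assumptions made for illustration, and error checking is left out for brevity.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical kernel: writes a constant into every voxel through a surface object.
__global__ void fillKernel(cudaSurfaceObject_t surf, int nx, int ny, int nz, float value)
{
    int px = blockIdx.x * blockDim.x + threadIdx.x;
    int py = blockIdx.y * blockDim.y + threadIdx.y;
    int pz = blockIdx.z * blockDim.z + threadIdx.z;
    if (px < nx && py < ny && pz < nz) {
        // The x coordinate is given in bytes, hence the sizeof(float) factor.
        surf3Dwrite(value, surf, px * sizeof(float), py, pz, cudaBoundaryModeTrap);
    }
}

int main()
{
    const int n = 400;  // 400 x 400 x 400 volume, as in the example above

    // 3D CUDA array that can be bound to a surface (load/store flag required).
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t volume;
    cudaMalloc3DArray(&volume, &desc, make_cudaExtent(n, n, n), cudaArraySurfaceLoadStore);

    // Create the surface object on top of the array.
    cudaResourceDesc res;
    memset(&res, 0, sizeof(res));
    res.resType = cudaResourceTypeArray;
    res.res.array.array = volume;
    cudaSurfaceObject_t l_volSurfObj = 0;
    cudaCreateSurfaceObject(&l_volSurfObj, &res);

    dim3 block(8, 8, 8);  // assumed block size
    dim3 grid((n + block.x - 1) / block.x,
              (n + block.y - 1) / block.y,
              (n + block.z - 1) / block.z);

    // Time a single kernel launch with CUDA events.
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    fillKernel<<<grid, block>>>(l_volSurfObj, n, n, n, 1.0f);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("Surface object API : %f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaDestroySurfaceObject(l_volSurfObj);
    cudaFreeArray(volume);
    return 0;
}
```

The surface reference variant would use the same kernel body, with the file-scope s_volumeSurf bound to the array via cudaBindSurfaceToArray instead of passing the object as a kernel argument.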

This is on a GTX 680 (compute capability 3.0) with CUDA 5.0.

Does anyone have an idea what could explain the difference?