Timing memcpy

Hello everybody, I have a very basic question. I'm trying to measure the time the GPU needs to copy from device to host and from host to device, and I don't know how to do it. I tried using cutCreateTimer, but I don't think that's correct, because it always gives me the same result no matter the size of the image. Could you help me?

Check out the bandwidthTest example in the CUDA SDK; it uses CUDA events to time the memcpy operations accurately.
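
In case it helps, here is a minimal sketch of event-based timing. It is not the bandwidthTest source, just the same general idea: record one event before the copies, one after, and ask the runtime for the elapsed time between them. The helper name timeMemcpyMs, the CHECK macro, the buffer sizes, and the repeat count are all made up for illustration, and I'm assuming pinned host memory (cudaMallocHost); your image buffers may be ordinary pageable memory, which would give different numbers.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical convenience macro: abort with a message if a CUDA call fails.
#define CHECK(call)                                                      \
    do {                                                                 \
        cudaError_t err_ = (call);                                       \
        if (err_ != cudaSuccess) {                                       \
            std::fprintf(stderr, "CUDA error: %s (%s:%d)\n",             \
                         cudaGetErrorString(err_), __FILE__, __LINE__);  \
            std::exit(EXIT_FAILURE);                                     \
        }                                                                \
    } while (0)

// Hypothetical helper: average time in milliseconds of one copy of `bytes`
// in the given direction (cudaMemcpyHostToDevice or cudaMemcpyDeviceToHost).
float timeMemcpyMs(size_t bytes, cudaMemcpyKind kind, int repeats = 10)
{
    void *hostBuf = nullptr;
    void *devBuf = nullptr;
    CHECK(cudaMallocHost(&hostBuf, bytes));  // pinned host memory (assumption)
    CHECK(cudaMalloc(&devBuf, bytes));
    CHECK(cudaMemset(devBuf, 0, bytes));     // give the device buffer defined contents

    const bool toDevice = (kind == cudaMemcpyHostToDevice);
    void *dst = toDevice ? devBuf : hostBuf;
    const void *src = toDevice ? hostBuf : devBuf;

    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    // One untimed warm-up copy so one-time setup costs are not measured.
    CHECK(cudaMemcpy(dst, src, bytes, kind));

    CHECK(cudaEventRecord(start, 0));
    for (int i = 0; i < repeats; ++i) {
        CHECK(cudaMemcpy(dst, src, bytes, kind));
    }
    CHECK(cudaEventRecord(stop, 0));
    CHECK(cudaEventSynchronize(stop));       // wait until the stop event has completed

    float totalMs = 0.0f;
    CHECK(cudaEventElapsedTime(&totalMs, start, stop));

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFree(devBuf));
    CHECK(cudaFreeHost(hostBuf));

    return totalMs / repeats;
}

int main()
{
    // Illustrative sizes only; substitute the byte size of your image.
    const size_t sizes[] = { 1u << 20, 16u << 20, 64u << 20 };
    for (size_t bytes : sizes) {
        float h2d = timeMemcpyMs(bytes, cudaMemcpyHostToDevice);
        float d2h = timeMemcpyMs(bytes, cudaMemcpyDeviceToHost);
        // bytes / (ms * 1e6) converts bytes per millisecond to GB/s.
        std::printf("%8zu bytes: H2D %.3f ms (%.2f GB/s), D2H %.3f ms (%.2f GB/s)\n",
                    bytes, h2d, bytes / (h2d * 1.0e6), d2h, bytes / (d2h * 1.0e6));
    }
    return 0;
}
```

Build it with something like nvcc -o memcpy_timing memcpy_timing.cu. The reported times should grow with the buffer size; if they don't, that usually means the timer is not actually bracketing the copy. Averaging over several repeats is just a design choice to smooth out noise on small transfers.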