What's the difference between cudaMemcpy and the standard memcpy function?

I timed both, and the normal memcpy function seems to be a little bit faster. So what's the point of using cudaMemcpy?