memcopy time taken

I am copying about 250 MB from the CPU into global memory on a GeForce 8800 GT card. The copy takes around 610 ms, while the rest of the code takes 22 ms. Is something wrong?

Also, how much overhead is incurred on the first call to memcpy?

There is a significant driver initialization overhead to the first CUDA function call.
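
One way to check how much of the 610 ms is initialization is to make a throwaway CUDA call first and then time the copy on its own. The following is only a rough sketch under some assumptions: the 250 MB buffer size is taken from the question, `cudaFree(0)` is used as the warm-up call (a common idiom, not something the original poster's code is known to do), and the `CHECK` macro and `timeCopyMs` helper are made-up names for illustration.

```cuda
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical helper: time one host-to-device copy with CUDA events (ms).
static float timeCopyMs(void* dst, const void* src, size_t bytes) {
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));
    CHECK(cudaEventRecord(start, 0));
    CHECK(cudaMemcpy(dst, src, bytes, cudaMemcpyHostToDevice));
    CHECK(cudaEventRecord(stop, 0));
    CHECK(cudaEventSynchronize(stop));  // wait until the copy has finished
    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    return ms;
}

int main() {
    const size_t bytes = 250u * 1024u * 1024u;  // assumed: 250 MB, as in the question

    // Warm-up: the first runtime call pays for driver/context initialization.
    // It is timed with a host clock because no CUDA context exists yet.
    auto t0 = std::chrono::steady_clock::now();
    CHECK(cudaFree(0));
    auto t1 = std::chrono::steady_clock::now();
    double initMs = std::chrono::duration<double, std::milli>(t1 - t0).count();

    void* hostBuf = malloc(bytes);
    if (hostBuf == NULL) {
        fprintf(stderr, "host allocation failed\n");
        return EXIT_FAILURE;
    }
    memset(hostBuf, 1, bytes);  // touch the pages so they are really allocated

    void* devBuf = NULL;
    CHECK(cudaMalloc(&devBuf, bytes));

    // The copy is now measured without the one-time initialization cost.
    float copyMs = timeCopyMs(devBuf, hostBuf, bytes);

    printf("first CUDA call (init): %.1f ms\n", initMs);
    printf("250 MB host-to-device copy: %.1f ms (%.1f MB/s)\n",
           copyMs, (bytes / (1024.0 * 1024.0)) / (copyMs / 1000.0));

    CHECK(cudaFree(devBuf));
    free(hostBuf);
    return 0;
}
```

If the copy time reported here is much smaller than 610 ms, the difference was most likely the first-call initialization rather than the transfer itself. For reference, 250 MB in 610 ms works out to roughly 410 MB/s.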