A strange timing problem with thrust::copy()

My program runs several iterations for each video frame, and in each iteration it calls thrust::copy() to copy data from host to device. I developed the program with Microsoft VS2013. In Debug mode, the time taken by thrust::copy() in each iteration (one row per frame) is about:
0.5, 0.5, 0.5, 0.5, 0.5 (ms)
0.5, 0.5, 0.5, 0.5, 0.5 (ms)
0.5, 0.5, 0.5, 0.5, 0.5 (ms)
0.5, 0.5, 0.5, 0.5, 0.5 (ms)
0.5, 0.5, 0.5, 0.5, 0.5 (ms) …
But in Release mode the timings look strange:
0.2, 0.1, 0.1, 0.1, 0.1 (ms)
40, 0.1, 0.1, 0.1, 0.1 (ms)
60, 0.1, 0.1, 0.1, 0.1 (ms)
60, 0.1, 0.1, 0.1, 0.1 (ms)
60, 0.1, 0.1, 0.1, 0.1 (ms) …
The first frame looks correct. From the second frame on, however, the first iteration of each frame takes far longer (40 to 60 ms) than the others (0.1 ms). This is very strange because the data copied in every iteration has the same size. Is there any difference between Debug and Release mode that could explain this? Thanks a lot!
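For reference, here is a minimal sketch of how per-iteration timings like the ones above could be collected. It is not my actual code: `time_copies`, `copy_once` and `sync` are hypothetical names made up for this illustration. In the real program `copy_once` would wrap the thrust::copy() call, and `sync` would be a device synchronization such as cudaDeviceSynchronize(), on the assumption that GPU work queued earlier should be kept out of the measured interval. Without that, a blocking copy may be charged for time the GPU spends finishing earlier work.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not from the original program): times `iterations`
// calls of `copy_once` for each of `frames` frames and returns the elapsed
// time of every call in milliseconds, indexed as result[frame][iteration].
//
// Assumption: `sync` stands in for a device synchronization such as
// cudaDeviceSynchronize(). By default it does nothing.
std::vector<std::vector<double>> time_copies(
    std::size_t frames,
    std::size_t iterations,
    const std::function<void()>& copy_once,
    const std::function<void()>& sync = [] {})
{
    using clock = std::chrono::steady_clock;
    std::vector<std::vector<double>> ms(
        frames, std::vector<double>(iterations, 0.0));

    for (std::size_t f = 0; f < frames; ++f) {
        for (std::size_t i = 0; i < iterations; ++i) {
            sync();                           // let earlier work finish first
            const auto start = clock::now();
            copy_once();                      // the operation being measured
            sync();                           // wait until the copy is done
            const auto stop = clock::now();
            ms[f][i] = std::chrono::duration<double, std::milli>(
                           stop - start).count();
        }
    }
    return ms;
}
```

Called as `time_copies(5, 5, copy_fn, sync_fn)`, this would give one row of five timings per frame, in the same layout as the tables above.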

Could anyone help me with this problem?

Got it! It’s because my GPU is too weak! OMG!