cudaMemcpy ....

I have the following code to process packets on the GPU using CUDA. But when I measure the time taken to copy the packets from the CPU to the device, process them, and copy them back to the CPU, it shows 0 msec. I have no idea what is wrong with my code. Here is part of my code. What is wrong with my code? Any help…

In many cases, the quickest way to discover why your program is ‘misbehaving’ is to step through it with the debugger.
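
If stepping through the program does not explain the 0 msec reading, it may also help to cross-check the measurement itself. The original code is not shown here, so the following is only a minimal sketch under assumptions: `processPackets`, `timeRoundTripMs`, the `CHECK` macro, and the buffer size are all hypothetical stand-ins, not the poster's code. It times the copy to the device, the kernel, and the copy back with CUDA events, and checks every runtime call, since a call that fails returns almost immediately and can make the elapsed time look like zero.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a readable message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical stand-in for the packet-processing kernel: flips every byte.
__global__ void processPackets(unsigned char *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] ^= 0xFF;
}

// Returns the time in milliseconds for host-to-device copy, kernel,
// and device-to-host copy of n bytes, measured with CUDA events.
float timeRoundTripMs(int n)
{
    unsigned char *h = (unsigned char *)malloc(n);
    assert(h != nullptr);
    for (int i = 0; i < n; ++i) h[i] = 0x00;

    unsigned char *d = nullptr;
    CHECK(cudaMalloc(&d, n));

    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    CHECK(cudaEventRecord(start));
    CHECK(cudaMemcpy(d, h, n, cudaMemcpyHostToDevice));
    processPackets<<<(n + 255) / 256, 256>>>(d, n);
    CHECK(cudaGetLastError());            // catches launch failures
    CHECK(cudaMemcpy(h, d, n, cudaMemcpyDeviceToHost));
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));    // wait until 'stop' has been reached

    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));

    assert(h[0] == 0xFF);                 // the kernel really ran on the data

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFree(d));
    free(h);
    return ms;
}

int main()
{
    float ms = timeRoundTripMs(1 << 20);  // 1 MiB of dummy packet data
    printf("copy + kernel + copy: %.3f ms\n", ms);
    return 0;
}
```

`cudaEventElapsedTime` reports a fractional `float` in milliseconds, so a short run is not rounded down to 0 the way it can be when the result is stored in an integer or taken from a coarse host timer. If the events report a plausible time while the original measurement still shows 0, the timing code is the more likely culprit than the copies themselves.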