Reading benchmark

/* Reading benchmark */
__global__ void PcreMatch(int* pcreTbl, char* packet, PacketInfo* pkInfo, bool* pcreRes)
{
    unsigned char ch = 0;
    int matched = false;
    int st = 0, size = 0;
    int *stBase = 0;
    char *base = 0;

    base = packet;
    size = 64;

    for (int i = 0; i < size; i++) {
        ch = (unsigned char) base[i];
    }
    pcreRes[threadIdx.x + blockIdx.x * blockDim.x] = matched;
}

This is a simple read-test program, but it only gives me about 1.5 Gbps when I use 256 KB of data.
When I measured the copy with the same 256 KB of data, it ran at about 20 Gbps. :(
Why is the kernel so slow? Is it because of non-coalesced memory reads?
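
For comparison, here is a minimal sketch of a coalesced read benchmark timed with CUDA events. It is not the kernel above: `ReadCoalesced`, `sink`, and the launch configuration (64 blocks of 256 threads) are all hypothetical names and values chosen for illustration. Two things differ from the kernel in the post. First, consecutive threads read consecutive 4-byte words, which is the access pattern that coalesces. Second, the loaded values are accumulated and written out, because in the posted kernel `ch` is never used and the compiler is free to drop the loads altogether. Error checking is omitted to keep the sketch short.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: neighbouring threads read neighbouring 4-byte words.
// The values are summed so the compiler cannot discard the loads.
__global__ void ReadCoalesced(const unsigned int* data, size_t numWords,
                              unsigned int* sink)
{
    size_t idx    = threadIdx.x + (size_t)blockIdx.x * blockDim.x;
    size_t stride = (size_t)blockDim.x * gridDim.x;
    unsigned int acc = 0;

    // Grid-stride loop over the whole buffer.
    for (size_t i = idx; i < numWords; i += stride)
        acc += data[i];

    // One result per thread; this write keeps the reads alive.
    sink[idx] = acc;
}

int main()
{
    const size_t bytes    = 256 * 1024;                 // 256 KB, as in the post
    const size_t numWords = bytes / sizeof(unsigned int);
    const int threads = 256, blocks = 64;               // assumed launch configuration

    unsigned int *dData = nullptr, *dSink = nullptr;
    cudaMalloc(&dData, bytes);
    cudaMalloc(&dSink, (size_t)threads * blocks * sizeof(unsigned int));
    cudaMemset(dData, 1, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Warm-up launch so one-time setup cost is not included in the timing.
    ReadCoalesced<<<blocks, threads>>>(dData, numWords, dSink);

    cudaEventRecord(start);
    ReadCoalesced<<<blocks, threads>>>(dData, numWords, dSink);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    // Bits read per second, expressed in Gbit/s.
    double gbps = (bytes * 8.0) / (ms * 1e-3) / 1e9;
    printf("%.3f ms, %.2f Gbit/s\n", ms, gbps);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(dData);
    cudaFree(dSink);
    return 0;
}
```

This should build with something like `nvcc -O2 read_bench.cu -o read_bench` (the file name is arbitrary). Keep in mind that 256 KB is a very small buffer, so the fixed cost of launching the kernel may account for much of the measured time. Repeating the measurement with a larger buffer would help separate that overhead from the actual memory throughput.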