page-locked memory: alignment? reason: inconsistent results for memcopy

Duh! Yes, that was the problem (and the solution). Binding the process to a specific CPU (cpu0 in my case) gives a constant 3200 MB/s host->dev.
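
For anyone finding this thread later, here is a minimal sketch of one way to do the binding from inside the program. It assumes Linux and glibc's `sched_setaffinity`; the helper name `pin_to_cpu` is made up for illustration, and this is not necessarily how the binding was done for the numbers above.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   // exposes the CPU_* macros in <sched.h>; g++ usually defines it already
#endif
#include <sched.h>

// Hypothetical helper: restrict the calling thread to a single CPU core (Linux only).
// Threads created afterwards inherit the mask, so call it early in main().
// Returns 0 on success, -1 on failure.
int pin_to_cpu(int cpu)
{
    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;                      // does not fit in a fixed-size cpu_set_t

    cpu_set_t set;
    CPU_ZERO(&set);                     // start with an empty CPU mask
    CPU_SET(cpu, &set);                 // allow only the requested core

    // pid 0 means "the calling thread"
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Calling `pin_to_cpu(0)` at the very start of `main()`, before allocating the page-locked buffer and running the timed copies, should keep everything on the same core for the whole measurement. If you would rather not touch the code, launching the program through `taskset -c 0` should have the same effect.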

Thanks a lot!

Hrrm, that’s the problem with having too many resources (and not knowing how to use them properly).