We do have test data showing a PCIe Gen2 x4 link reaching up to 12 Gbps (i.e. 1.5 GB/s) in an environment where the endpoint's DMA is reading from or writing to TX1's system memory.
In the case of flash-based storage devices, the time taken to read from or write to the flash itself may also play some role in the measured throughput.
Coming to this specific device, which showed more than 1 GB/s on other systems but around 800 MB/s on TX1, we need to consider the clocks at which both systems are running, whether the test environment is the same in both cases, and so on.
There will be a small gap between the performance measured on an x86 system and on TX1 because of TX1's lack of I/O coherency, but achieving around 1.5 GB/s is very much possible.
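
To put these numbers in context, here is a rough sketch of the arithmetic in Python. It assumes the standard PCIe Gen2 figures (5 GT/s per lane, 8b/10b encoding), ignores TLP/DLLP protocol overhead, and treats 800 MB/s as 0.8 GB/s; the function names are only illustrative.

```python
# Standard PCIe Gen2 signalling figures (assumed, not measured on TX1).
GEN2_GT_PER_S = 5.0            # giga-transfers per second, per lane
ENCODING_EFFICIENCY = 8 / 10   # 8b/10b line encoding

def link_ceiling_gbps(lanes: int) -> float:
    """Payload ceiling in Gbit/s after line encoding, before protocol overhead."""
    return GEN2_GT_PER_S * ENCODING_EFFICIENCY * lanes

def gbps_to_gbytes_per_s(gbps: float) -> float:
    """Convert Gbit/s to GB/s (8 bits per byte)."""
    return gbps / 8

def utilisation(measured_gbytes_per_s: float, lanes: int) -> float:
    """Fraction of the encoded link ceiling that a measured rate represents."""
    return measured_gbytes_per_s / gbps_to_gbytes_per_s(link_ceiling_gbps(lanes))

if __name__ == "__main__":
    ceiling = gbps_to_gbytes_per_s(link_ceiling_gbps(4))
    print(f"Gen2 x4 ceiling:        {ceiling:.2f} GB/s")
    print(f"12 Gbps in GB/s:        {gbps_to_gbytes_per_s(12):.2f} GB/s")
    print(f"1.5 GB/s utilisation:   {utilisation(1.5, 4):.0%}")
    print(f"0.8 GB/s utilisation:   {utilisation(0.8, 4):.0%}")
```

Under these assumptions the 1.5 GB/s figure is about three quarters of the encoded x4 ceiling, while 800 MB/s is well below it, which suggests the link itself is not the limit in that case.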