Poor performance when accessing iomem


Following the topic "Memcpy performance on Jetson AGX ORIN", I ran the same test, and my results are quite similar to the ones reported there:

d@d-desktop:~$ sudo ./copybenchmark
copying 1953 MB
time = 0.160565  622.800720 millions of uints/sec
[memcpy] time = 0.030113  3320.824846 millions of uints/sec

I noticed that the above test copies between two memory regions allocated with kmalloc.

However, when I memcpy from kmalloc memory to virtual memory obtained by mapping an iomem region with ioremap_wc, the speed is much lower:

[ 7505.369562] Mapped BAR Phys Addr Base: 0x0000002761000000
[ 7505.369569] Mapped size: 16777216
[ 7505.369589] io_base: 0xffff800037000000
Data transferred 2097152 Bytes, time consume 1726975 ns
Transfer speed 1158 MBytes/s
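
To put the two measurements in the same unit, here is a small sketch of the arithmetic. It assumes that "MBytes/s" in the log means 2^20 bytes per second and that a uint in the benchmark is 4 bytes; the helper names are hypothetical and are not taken from the driver or the benchmark.

```c
#include <stdint.h>

/* Throughput in MiB/s (2^20 bytes per second) from a byte count and a
 * duration in nanoseconds. Hypothetical helper for illustration only. */
double mib_per_s(uint64_t bytes, uint64_t ns)
{
    double seconds = (double)ns / 1e9;
    return (double)bytes / seconds / (1024.0 * 1024.0);
}

/* Convert the benchmark's "millions of uints/sec" figure to MiB/s.
 * Assumption: one uint is 4 bytes. */
double muints_to_mib_per_s(double millions_of_uints_per_s)
{
    return millions_of_uints_per_s * 1e6 * 4.0 / (1024.0 * 1024.0);
}
```

With the numbers above, mib_per_s(2097152, 1726975) gives about 1158, which matches the logged speed, and muints_to_mib_per_s(3320.824846) gives about 12668. Under these assumptions the copy into the ioremap_wc mapping is roughly an order of magnitude slower than the plain memcpy result.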

Is this the real performance to expect when accessing iomem? If not, how can I fix it to reach the ideal speed?

Best regards.

Do you mean that vst1q_u32() performs worse than memcpy()? We have confirmed that this is expected:
Memcpy performance on Jetson AGX ORIN - #8 by DaneLLL

Hi DaneLLL,

No, I mean that memcpy shows poor performance when copying between normal memory (kmalloc) and virtual memory mapped to iomem by ioremap.