Pinned Memory Performance

Hi,
Can someone explain why the bandwidth for writing to pinned memory (the CPU writing to host memory) is almost 2 times lower than for writing to an unpinned region of memory?
I’m asking because the gain from writing to pinned memory, transferring from pinned memory, and overlapping the transfer with computation becomes negligible compared with simply transferring to the GPU from “normal” (pageable) memory.

I measured the bandwidth for writing to pinned memory versus writing to “normal” memory using SSE streaming stores (_mm_stream_ps), with NUMA awareness, varying the size of the region written from 32 MB to 1 GB.
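
For reference, here is a minimal sketch of the kind of write test described above. It is not the exact benchmark code. The helper names stream_write_gbps and alloc_pageable are made up for illustration, and it assumes an x86 CPU with SSE and a POSIX system. It only covers the pageable case so that it builds without the CUDA toolkit. For the pinned case the buffer would come from cudaHostAlloc (or cudaMallocHost) and be passed to the same timing function.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>
#include <xmmintrin.h>  /* _mm_set1_ps, _mm_stream_ps, _mm_sfence */

/* Hypothetical helper: fill `bytes` bytes at `buf` with non-temporal
   (streaming) stores and return the achieved write bandwidth in GB/s.
   `buf` must be 16-byte aligned and `bytes` a multiple of 16. */
static double stream_write_gbps(float *buf, size_t bytes)
{
    assert(((uintptr_t)buf & 15u) == 0 && bytes % 16 == 0);

    const __m128 v = _mm_set1_ps(1.0f);
    const size_t n = bytes / sizeof(float);
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < n; i += 4)
        _mm_stream_ps(buf + i, v);   /* 16-byte store that bypasses the cache */
    _mm_sfence();                    /* wait for the streaming stores to drain */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
    return (double)bytes / secs / 1e9;
}

/* Hypothetical helper: allocate a pageable ("normal") buffer aligned to
   64 bytes, or return NULL on failure. Release it with free(). */
static float *alloc_pageable(size_t bytes)
{
    void *p = NULL;
    return posix_memalign(&p, 64, bytes) == 0 ? (float *)p : NULL;
}
```

One thing to watch with a test like this: the first pass over a freshly allocated pageable buffer also pays for page faults, while a pinned buffer is already resident, so it is worth doing one untimed warm-up pass and timing the second one.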

Thanks