GPU-CPU memory copying is slower in WSL?!

It is a little slower, roughly on the order of a few milliseconds.
Also, caching and memory copying in the Python libraries seem to do worse than on Windows.
I tested PyCUDA, PyOpenCL, CuPy, and Numba with big arrays.
I don't know whether these Python libraries double-cache or do something else.
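
A rough sketch of how a device-to-host copy like this could be timed, for anyone who wants to compare WSL and native Windows on their own machine. This is not the exact benchmark behind the numbers above. The helper names `median_ms` and `gpu_d2h_ms` and the 64 MB array size are assumptions chosen for illustration, and CuPy stands in for any of the four libraries.

```python
import statistics
import time


def median_ms(fn, repeats=20, warmup=3):
    """Return the median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up runs absorb one-time costs (context creation, caches)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)


def gpu_d2h_ms(n_bytes=64 * 1024 * 1024):
    """Time device-to-host copies with CuPy.

    Returns None when NumPy/CuPy or a usable GPU is not available.
    """
    try:
        import numpy as np
        import cupy as cp

        host = np.zeros(n_bytes, dtype=np.uint8)
        dev = cp.asarray(host)  # host -> device, done once outside the timing

        def d2h():
            cp.asnumpy(dev)  # device -> host copy into a new NumPy array

        return median_ms(d2h)
    except Exception:
        # Assumption: any failure here means "no GPU stack", so report None.
        return None
```

Running `gpu_d2h_ms()` once under WSL and once under native Windows on the same machine gives two medians to compare. The median is used instead of the mean so that a single slow outlier does not dominate a difference of only a few milliseconds.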
Now I write my own kernels and run them from PyCUDA.
Writing them in C is faster, but it takes a lot of initialization and garbage collection.
The old way.
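
For reference, a minimal sketch of what running a hand-written kernel from PyCUDA can look like. The kernel `scale`, the wrapper `run_scale_demo`, and the sizes are made up for illustration and are not the kernels mentioned above. It assumes NumPy, PyCUDA, the CUDA toolkit (for `nvcc`), and a visible GPU; without them the function simply returns None.

```python
KERNEL_SRC = """
__global__ void scale(float *dst, const float *src, float k, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) dst[i] = k * src[i];
}
"""


def run_scale_demo(n=1 << 20):
    """Multiply an array by 2 on the GPU and check the result on the CPU.

    Returns True/False for the check, or None if the GPU stack is unusable.
    """
    try:
        import numpy as np
        import pycuda.autoinit  # noqa: F401  creates a context on the first GPU
        import pycuda.driver as drv
        from pycuda.compiler import SourceModule

        mod = SourceModule(KERNEL_SRC)      # compiles the CUDA C source with nvcc
        scale = mod.get_function("scale")

        src = np.random.rand(n).astype(np.float32)
        dst = np.empty_like(src)

        threads = 256
        blocks = (n + threads - 1) // threads
        # drv.In / drv.Out handle the host<->device copies around the launch.
        scale(drv.Out(dst), drv.In(src), np.float32(2.0), np.int32(n),
              block=(threads, 1, 1), grid=(blocks, 1))

        return bool(np.allclose(dst, 2.0 * src))
    except Exception:
        # Assumption: treat any failure (no GPU, no nvcc, no PyCUDA) as "unavailable".
        return None
```

The kernel itself is still plain CUDA C. PyCUDA takes over context creation, compilation, and freeing device memory, which is most of the initialization and cleanup work that a pure C host program has to do by hand.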