gdr_copy_to_bar Failing Due to CPU Affinity

Hello, I use a Tesla P40 for real-time control applications. Our system uses several different CPUs, all reading from and writing to various shared memory locations. One of these scripts reads from shared memory and then passes the data to the GPU using a gdr_copy_to_bar() call. Recently this copy has started to incur large delays, which cause the entire system to fail via a timeout.
I have been reading the GDRCopy documentation here: https://github.com/NVIDIA/gdrcopy/blob/master/README.md and, given the small size of the buffer being passed, I believe I am experiencing a "NUMA effect", where the affinity of the CPU moving the memory does not match the affinity of the CPU hosting/driving the GPU.
My question is: how do I check which CPU is hosting my GPU, and is there any way to set/request that the GPU run from a certain affinity if that CPU affinity is already taken by another system?

Thank you,
Alexander Battey

Try these:
nvidia-smi topo -m
hwloc-ls
numactl
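
If you would rather do the check from inside the process, here is a rough Python sketch of the same idea. It assumes Linux, where sysfs exposes `numa_node` and `local_cpulist` for each PCI device, and it assumes the GPU's PCI address comes from something like `nvidia-smi --query-gpu=pci.bus_id --format=csv`. The helper names (`normalize_bdf`, `parse_cpulist`, `gpu_local_cpus`, `pin_to_gpu_node`) are made up for illustration and are not part of GDRCopy or any NVIDIA tool.

```python
import os
from pathlib import Path


def normalize_bdf(bus_id):
    """Convert an nvidia-smi style bus id (e.g. '00000000:3B:00.0') to the
    form usually seen in sysfs ('0000:3b:00.0').
    Assumption: the sysfs PCI domain is four hex digits on this machine."""
    domain, rest = bus_id.strip().split(":", 1)
    return f"{domain[-4:]}:{rest}".lower()


def parse_cpulist(text):
    """Expand a Linux cpulist string such as '0-3,8,10-11' into sorted CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def gpu_local_cpus(bus_id):
    """Return (numa_node, cpus) for the PCI device, as reported by sysfs.
    numa_node may be -1 if the platform does not report locality."""
    dev = Path("/sys/bus/pci/devices") / normalize_bdf(bus_id)
    numa_node = int((dev / "numa_node").read_text())
    cpus = parse_cpulist((dev / "local_cpulist").read_text())
    return numa_node, cpus


def pin_to_gpu_node(bus_id):
    """Restrict the calling process to the CPUs that are local to the GPU."""
    numa_node, cpus = gpu_local_cpus(bus_id)
    os.sched_setaffinity(0, cpus)  # 0 means the calling process
    return numa_node, cpus
```

Calling `pin_to_gpu_node("00000000:3B:00.0")` (a placeholder address, substitute your own) before the copy loop would keep the copying process on the GPU's local CPUs. This only sets CPU affinity; the source buffer should ideally live on the same node too, which you can request at launch with `numactl --cpunodebind=N --membind=N ./your_app`, where N is the node printed above. If another process already occupies those cores, the scheduler will still share them, so you may have to move that process elsewhere.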