cudaD3D9MapResources slow?

I’m running a CUDA kernel on a Direct3D 9 texture. I want to avoid repeated calls to cudaD3D9MapResources() because it takes ~2 ms on my machine.

I only call cudaD3D9RegisterResource() once, but is it correct that I have to call cudaD3D9MapResources() every time the kernel runs?

I tried simply caching the mapped pointer and pitch between runs, but couldn’t get it to work.

Is there a way to do this without taking the hit of mapping on every run?