Multiple PBOs in emulation mode?

Hi, I have some code that copies from an FBO into several PBOs, registers/maps the PBOs, invokes a kernel, unregisters/unmaps the PBOs, and later performs some GL calls.
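For reference, the sequence is roughly the following. This is a simplified sketch, not my actual code: the names (`runOnce`, `readOnlyKernel`, `NUM_PBOS`, `CHECK_CUDA`, `checkGL`), the RGBA8 pixel format, the one-color-attachment-per-PBO layout, and the use of GLEW are all placeholders or assumptions. It assumes a current GL context and PBOs already allocated with glBufferData, and it unmaps each buffer before unregistering it.

```cuda
// Hypothetical sketch of the FBO -> PBO -> CUDA sequence (placeholder names).
#include <GL/glew.h>
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>
#include <cstdio>

#define NUM_PBOS 3  // placeholder count

// Print any CUDA error together with the call site.
#define CHECK_CUDA(call)                                           \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess)                                   \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
                    cudaGetErrorString(err_));                     \
    } while (0)

// Print any pending GL error, tagged with the step that just ran.
static void checkGL(const char *where)
{
    GLenum e = glGetError();
    if (e != GL_NO_ERROR)
        fprintf(stderr, "GL error 0x%x after %s\n", e, where);
}

// Body left empty on purpose: the crash reproduces even with no kernel code.
__global__ void readOnlyKernel(const unsigned char *a,
                               const unsigned char *b,
                               const unsigned char *c, int n)
{
}

void runOnce(GLuint fbo, const GLuint pbo[NUM_PBOS], int width, int height)
{
    // 1. Copy from the FBO into each PBO (assumes one attachment per PBO).
    glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fbo);
    for (int i = 0; i < NUM_PBOS; ++i) {
        glReadBuffer(GL_COLOR_ATTACHMENT0_EXT + i);
        glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[i]);
        glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, 0);
    }
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
    glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, 0);
    checkGL("FBO readback");

    // 2. Register and map every PBO with CUDA.
    unsigned char *dev[NUM_PBOS];
    for (int i = 0; i < NUM_PBOS; ++i) {
        CHECK_CUDA(cudaGLRegisterBufferObject(pbo[i]));
        CHECK_CUDA(cudaGLMapBufferObject((void **)&dev[i], pbo[i]));
    }

    // 3. Launch the kernel, which only reads from the mapped buffers.
    int n = width * height * 4;
    readOnlyKernel<<<(n + 255) / 256, 256>>>(dev[0], dev[1], dev[2], n);
    CHECK_CUDA(cudaThreadSynchronize());

    // 4. Unmap, then unregister, every PBO before GL touches them again.
    for (int i = 0; i < NUM_PBOS; ++i) {
        CHECK_CUDA(cudaGLUnmapBufferObject(pbo[i]));
        CHECK_CUDA(cudaGLUnregisterBufferObject(pbo[i]));
    }
    checkGL("CUDA interop");
}
```

With checks like these after each step, I would expect to see which call first reports an error before the later GL calls crash.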

The kernel only reads data from the mapped PBOs. When I run the program in emulation mode, the kernel executes correctly, but subsequent GL calls cause a segfault in the GL driver (_nv000130gl). In device mode, the kernel executes correctly and subsequent GL calls do not produce errors.

I still get the segfault if I comment out all of the code in the kernel. It does not occur if I comment out the CUDA/PBO-related calls in the main program.

Any suggestions on how I can debug this further?

Thanks-
Abe