Got CUDA internal error

Our code failed to compile, with the following error message:

nvlink fatal : Internal error: overlapping offsets allocated to objects

We are using CUDA 6.5.
CUDA 7.0 is not installed, since we do not have root access on the HPC server.

What could be causing this? Is it a bug in CUDA? How should we fix it?

It’s likely a compiler bug. Is it a large project with many device-linked objects? I think it might be similar to the issue that you reported previously.

If you’re not able to try CUDA 7 or CUDA 7.5 RC, I don’t have any suggestions unless you can provide a reproducing sample, as you did back then.