The story is:
CUDA code is compiled into a static library in Visual Studio. This static library is linked into a DLL, which is loaded and unloaded by a bigger application.
When the application exits, it unloads this DLL, then calls onexit(), and crashes inside that call (invalid memory access at 0x5 …)!
Without the CUDA code (it is enough to remove the kernel, i.e. the .cu file), nothing crashes.