Strange behaviour when linking with CUBLAS

I have a program that mixes my own kernel calls with CUBLAS calls (CUDA 2.0 beta). It works perfectly in emulation mode but doesn’t work when running on a GT200 device.
I investigated the problem and noticed that even if I don’t call any CUBLAS functions, merely linking with CUBLAS sometimes causes a segfault or a wrong result, as if my kernels were never launched.

Has anyone else noticed this problem?

I’ve noticed that it takes much longer for the GTX 280 to warm up if the application is linked to CUBLAS, even when no CUBLAS function is called in the program. Maybe this works as designed, I don’t know.


Thanks for your response, but I inserted some ‘sleep(20)’ calls at the beginning or after device init, and the problem is still there.


The slow CUBLAS loading is a known bug in CUDA 2.0 beta.