Program runs forever with CUDA 3.1 (RedHat 5.4)
CUDA 3.1 on RedHat 5.4, gcc 4.1.2, Tesla C1060

I have code that runs perfectly with CUDA 3.0 on RedHat Enterprise 5.4 with gcc 4.1.2 (HP Z800, Tesla C1060, CUDA driver 256.53). After upgrading the system to CUDA 3.1, the compiled program runs forever and never terminates. I tested on another machine, running Ubuntu 9.04 with gcc 4.3.3, NVIDIA driver 256.53, and a Tesla C1060, and it works normally there.
I have no idea what could be causing the problem on the RedHat machine.
Could somebody give me a hint? If you need any other information to help figure out the problem, please let me know.
Thank you,

Tuan

[EDIT]: I just tested with the SDK sample code, and it seems to wait forever when it tries to allocate memory on the device. I don’t know why.
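
A minimal sketch of the kind of test that could narrow down where the hang occurs. This is not the SDK sample itself: the file name `alloc_test.cu`, the `check` helper, and the 1 MB allocation size are assumptions made for illustration. It uses `cudaFree(0)` to force context creation before the first real allocation, so the last line printed should indicate whether the program stalls during driver/context initialization or in `cudaMalloc` itself.

```cuda
// alloc_test.cu (hypothetical file name); build with: nvcc -o alloc_test alloc_test.cu
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: report the outcome of one runtime call.
// Output is flushed so the last completed step is visible if the next call hangs.
static bool check(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    std::printf("%s ok\n", what);
    std::fflush(stdout);
    return true;
}

int main() {
    int count = 0;
    if (!check(cudaGetDeviceCount(&count), "cudaGetDeviceCount")) return 1;
    std::printf("devices: %d\n", count);
    std::fflush(stdout);
    if (count == 0) return 1;

    if (!check(cudaSetDevice(0), "cudaSetDevice(0)")) return 1;

    // cudaFree(0) is a common idiom to trigger context creation
    // without allocating anything.
    if (!check(cudaFree(0), "context creation via cudaFree(0)")) return 1;

    // Arbitrary 1 MB allocation (size chosen only for illustration).
    void *buf = NULL;
    if (!check(cudaMalloc(&buf, 1 << 20), "cudaMalloc(1 MB)")) return 1;
    if (!check(cudaFree(buf), "cudaFree")) return 1;

    return 0;
}
```

If nothing is printed after `cudaSetDevice(0) ok`, the stall would be in context creation rather than in the allocation, which would point toward the driver or toolkit installation rather than the application code.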
