Problem with CUDA release 4.1, different behavior when using LLVM compile
Hi all,
When I build and run my program with CUDA release 4.1 using the default (LLVM-based) compiler, the output differs from what I get with the open64 compiler, whether in 4.1 or in earlier releases.
I stripped my large program down as far as I could and ended up with the simple program attached. It may not seem to make much sense, but its results differ between the open64 and LLVM nvcc compilers.
If I simplify it any further, the difference in output disappears.
Basically, this small example does some calculations and fills an array in global memory. It uses only 1 block and 1 thread.
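For anyone reading without the attachment, here is a minimal sketch of the shape of program described above. It is not the attached code: the function `compute`, the array length `N`, and the tolerance are hypothetical stand-ins chosen only for illustration. The kernel is launched with 1 block and 1 thread, fills an array in global memory, and the host then checks each element against a CPU reference computed with the same function.

```cuda
#include <cstdio>
#include <cmath>
#include <cuda_runtime.h>

#define N 16  // hypothetical array length, not taken from the attachment

// Hypothetical stand-in for "some calculation". Marked __host__ __device__
// so the same source can produce both the GPU result and a CPU reference.
__host__ __device__ float compute(int i)
{
    float x = 0.5f * i + 1.0f;
    return x * x - 3.0f * x;
}

// Intended for a <<<1, 1>>> launch: the single thread fills the whole array.
__global__ void fill(float *out, int n)
{
    for (int i = 0; i < n; ++i)
        out[i] = compute(i);
}

int main()
{
    float host[N];
    float *dev = NULL;

    cudaError_t err = cudaMalloc((void **)&dev, N * sizeof(float));
    if (err == cudaSuccess) {
        fill<<<1, 1>>>(dev, N);          // 1 block, 1 thread
        err = cudaGetLastError();        // catches launch errors
    }
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();   // catches errors raised during execution
    if (err == cudaSuccess)
        err = cudaMemcpy(host, dev, sizeof(host), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Compare against the CPU reference with a small tolerance, since the
    // device compiler may fuse multiply-adds where the host compiler does not.
    int mismatches = 0;
    for (int i = 0; i < N; ++i) {
        float ref = compute(i);
        if (fabsf(host[i] - ref) > 1e-5f * fabsf(ref) + 1e-6f) {
            printf("out[%d]: gpu=%g cpu=%g\n", i, host[i], ref);
            ++mismatches;
        }
    }
    printf("%d mismatch(es)\n", mismatches);
    return mismatches != 0;
}
```

Building the same file once with each compiler and comparing the printed output (or the PTX produced by `nvcc -ptx`) is one way to narrow down where the two diverge. I believe nvcc in release 4.1 accepts an `-open64` option to select the older compiler, but treat that flag as an assumption and check it against the toolkit's release notes.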
I hope that you or a compiler expert can help me find the root cause of this problem. If it is caused by a flaw in my program, knowing that would help me fix it and avoid it in the future.
The platform info is as follows:
OS: CentOS release 5.5
CPU: Intel Xeon X5660
GPU: M2070
CUDA toolkit: release 4.1
I deeply appreciate your attention.
Best Regards,
Susan
code_with_problem_41_llvm.tar.gz (1K)