Works with sm_20 but fails with sm_13

lattice · April 15, 2012, 4:17am

I started using CUDA three months ago, and this is my first post here. Please bear with me if my question is very naive.

But I am really puzzled. I have a Monte Carlo simulation code. It is not complicated, but it uses everything a numerical calculation needs: complex-number operations, matrix multiplications, and random number generation. At the end of every iteration, a measurement is taken across all threads, the results are transferred back to the CPU, and the summation is done there.
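A minimal sketch of that last step (per-thread measurement, copy back, sum on the CPU), not the actual code. The kernel name `measure`, the helper `measure_and_sum`, the `CUDA_CHECK` macro, and the choice of |z|^2 as the measured quantity are all hypothetical, and it reads global memory directly instead of through a texture. It also checks every runtime call, so a failed launch shows up as an error instead of as garbage in the results.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a readable message if a runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error %s at %s:%d\n",               \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical stand-in for the real measurement: each thread
// records |z|^2 for its own float2 entry.
__global__ void measure(const float2* field, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float2 z = field[i];
        out[i] = z.x * z.x + z.y * z.y;
    }
}

// Run one measurement on the device, copy the per-thread values
// back, and sum them on the host.
double measure_and_sum(const std::vector<float2>& h_field)
{
    const int n = static_cast<int>(h_field.size());
    if (n == 0) return 0.0;

    float2* d_field = NULL;
    float*  d_out   = NULL;
    CUDA_CHECK(cudaMalloc((void**)&d_field, n * sizeof(float2)));
    CUDA_CHECK(cudaMalloc((void**)&d_out,   n * sizeof(float)));
    CUDA_CHECK(cudaMemcpy(d_field, &h_field[0], n * sizeof(float2),
                          cudaMemcpyHostToDevice));

    const int threads = 128;
    const int blocks  = (n + threads - 1) / threads;
    measure<<<blocks, threads>>>(d_field, d_out, n);
    CUDA_CHECK(cudaGetLastError());        // launch/configuration errors
    CUDA_CHECK(cudaDeviceSynchronize());   // errors raised while the kernel ran

    std::vector<float> h_out(n);
    CUDA_CHECK(cudaMemcpy(&h_out[0], d_out, n * sizeof(float),
                          cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d_field));
    CUDA_CHECK(cudaFree(d_out));

    double sum = 0.0;                      // accumulate in double on the host
    for (int i = 0; i < n; ++i) sum += h_out[i];
    return sum;
}
```

Calling `measure_and_sum` once per iteration and checking the result with `isnan` would show at which iteration the first NaN appears.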

I am using a C2050. My code works perfectly when I compile with -arch=sm_20: I get the results that theory predicts, so I am sure the algorithm and arithmetic are correct. However, when I compile with -arch=sm_13, I get nonsense answers. After just a few iterations, I even get NaNs! I then tried cuda-gdb, compiling with the -g -G flags. With those flags it works and gives the right answers.

Why? I suspect it is related to memory allocation and access. I use textures to read global memory, and my data are arrays of matrices of float2. But I really don't know what the differences are between sm_13 and sm_20 regarding memory. Please point me in the right direction for what I should look into.

Thanks!
