Strange behaviour with more kernels

pototschnig · July 8, 2007, 2:06pm

Hello,

I tried to put 2 kernels into one file and declared at the top:

__shared__ float4 smem1[512];
__shared__ float4 smem2[512];

and then

__global__ void func1() {
… // something with smem1
}

__global__ void func2() {
… // something with smem2
}

I got some weird results back from func1 (although func2 was never executed).

I commented out func2 and smem2, and then func1 worked properly. I tried to figure out what caused the problem, but after uncommenting func2 and smem2 it still worked, so I have no idea what caused it.

Any ideas?

regards
Pototschnig

afathy · July 8, 2007, 4:31pm

It is a memory problem: both arrays start at the same memory address!
See section 4.2.2.3 (Variable Type Qualifiers, __shared__) on pages 19-20 of CUDA_Programming_Guide_0.8.pdf for details on how to define arrays in shared memory without this problem.

Regards,
Ahmed

pototschnig · July 8, 2007, 4:41pm

Yes, but this shouldn’t affect anything, because the kernels don’t run concurrently.

Or did I miss something?

prkipfer · July 8, 2007, 4:44pm

This is not correct. Please read section 4.2.2.3 carefully again. Shared arrays with a specified dimension are always disjoint. You only need to calculate offsets for extern __shared__ arrays (where the size is specified at kernel invocation).
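To illustrate the extern case, here is a minimal sketch of carving two buffers out of one dynamically sized shared array by offset. The kernel name two_buffers, the host helper run_two_buffers, and the size of 64 elements are hypothetical and chosen only for the example; this is not code from the original post.

```cuda
#include <cuda_runtime.h>

// Dynamically sized shared memory: the size is given at launch time, and all
// extern __shared__ arrays start at the same address, so sub-buffers must be
// placed by hand using offsets.
extern __shared__ float4 pool[];

// Hypothetical kernel: uses two non-overlapping buffers inside pool.
__global__ void two_buffers(float *out, int n)
{
    float4 *a = pool;       // elements [0, n)
    float4 *b = pool + n;   // elements [n, 2n), offset computed manually

    int i = threadIdx.x;
    if (i < n) {
        a[i] = make_float4(1.0f, 0.0f, 0.0f, 0.0f);
        b[i] = make_float4(2.0f, 0.0f, 0.0f, 0.0f);
    }
    __syncthreads();
    if (i < n) {
        out[i] = a[i].x + b[i].x;   // 3.0f only if a and b do not overlap
    }
}

// Hypothetical host helper: returns true if every element came back as 3.0f.
bool run_two_buffers()
{
    const int n = 64;
    float h_out[n];
    float *d_out = nullptr;
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) return false;

    // Third launch parameter: bytes of dynamic shared memory for both buffers.
    size_t shared_bytes = 2 * n * sizeof(float4);
    two_buffers<<<1, n, shared_bytes>>>(d_out, n);

    cudaError_t err = cudaMemcpy(h_out, d_out, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < n; ++i) {
        if (h_out[i] != 3.0f) return false;
    }
    return true;
}
```

With statically sized arrays, as in the original question, none of this offset bookkeeping should be needed.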

Peter

afathy · July 8, 2007, 4:51pm

I don’t know. What you say is correct, but maybe there is a memory initialization or something like that which runs in parallel with the kernel and corrupts your calculations!

Let’s first try the correct allocation, and if the problem is still there, then we can think further!

afathy · July 8, 2007, 5:31pm

Sorry, I didn’t notice that our case is static allocation!!

I tried the example with 2 statically allocated arrays and 2 kernels, and I called only one kernel… everything was OK and no data corruption happened!!!
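A test along these lines might look like the sketch below. It is only an assumed reconstruction, not the program that was actually run: the array type and size (256 floats), the kernel bodies, and the helper run_func1_only are hypothetical.

```cuda
#include <cuda_runtime.h>

// Two statically sized shared arrays declared at file scope, as in the
// original question (sizes and element type reduced for the sketch).
__shared__ float smem1[256];
__shared__ float smem2[256];

// Kernel that is actually launched: uses only smem1.
__global__ void func1(float *out)
{
    int i = threadIdx.x;
    smem1[i] = (float)i;
    __syncthreads();
    out[i] = smem1[255 - i];   // read a value written by another thread
}

// Kernel that is compiled but never launched: uses only smem2.
__global__ void func2(float *out)
{
    int i = threadIdx.x;
    smem2[i] = -1.0f;
    __syncthreads();
    out[i] = smem2[i];
}

// Hypothetical host helper: launches func1 only and checks its output.
bool run_func1_only()
{
    const int n = 256;
    float h_out[n];
    float *d_out = nullptr;
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) return false;

    func1<<<1, n>>>(d_out);

    cudaError_t err = cudaMemcpy(h_out, d_out, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < n; ++i) {
        if (h_out[i] != (float)(255 - i)) return false;   // corruption check
    }
    return true;
}
```

If run_func1_only returns true, the presence of func2 and smem2 in the same file did not disturb func1 in this setup.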

Please check your main program; surely there is something wrong in your code!

pototschnig · July 8, 2007, 5:51pm

I finally found the problem … perhaps a CUDA bug?

The declaration at the top was wrong … this works perfectly:
__shared__ char array[512*sizeof(float4)];

__global__ void func1() {
float4* ptr = (float4*) array;
… // something with ptr
}

__global__ void func2() {
float4* ptr = (float4*) array;
… // something with ptr
}

Why does this work and my first attempt not?
