Newbie question: compute capability <1.1 or == 1.3 ? overlapping kernel execution with device/hos

AlexH · January 16, 2009, 7:44pm

Hi there,

Sorry if this question seems stupid, but I am new to CUDA. I tried
the examples in the SDK and I am a bit confused:

Here is what ./deviceQuery tells me:

//////////////////////////////////////////////////////////////////////////////////////////////////////////////
There is 1 device supporting CUDA

Device 0: “GeForce GTX 280”
Major revision number: 1
Minor revision number: 3
Total amount of global memory: 1073479680 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.30 GHz
Concurrent copy and execution: Yes

Test PASSED

Press ENTER to exit…
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

OK, that looks fine.
But then I run ./simpleStreams:

//////////////////////////////////////////////////////////////////////////////
./simpleStreams
memcopy: 22.09
kernel: 18.53
non-streamed: 39.16 (40.62 expected)
8 streams: 34.93 (21.29 expected with compute capability 1.1 or later)

Test PASSED

Press ENTER to exit…

According to the result (34.93), I should have compute capability < 1.1.
But ./deviceQuery tells me 1.3.
Any ideas?

Thanks in advance.
A.

netllama · January 16, 2009, 7:48pm

I think you’re misinterpreting the output. It said “1.1 or later”. 1.3 is later/larger than 1.1.

AlexH · January 16, 2009, 8:52pm

Hi there,
thanks for your reply. I know I have 1.3, which is later than 1.1.

The result tells me:
…
8 streams: 34.93 (21.29 expected with compute capability 1.1 or later)

That means the measured value is 34.93, but if I were using 1.1 or later (which I am), I should get 21.29.
Right?

netllama · January 16, 2009, 8:53pm

No. It means that at least 21.29 is expected with 1.1 or later. Hence the test PASSED. There’s nothing wrong with any of the output that you posted.
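
For reference, the bracketed figures in the posted output are consistent with a simple timing model: 22.09 + 18.53 = 40.62 for the non-streamed case, and 18.53 + 22.09 / 8 ≈ 21.29 for 8 streams. The sketch below reproduces that arithmetic. It is an assumption inferred from the numbers, not a quote of the sample's source, and the function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical helpers: these names do not come from the SDK sample. */

/* No overlap: the copy and the kernel run back to back. */
double expected_serial(double memcopy_time, double kernel_time)
{
    return memcopy_time + kernel_time;
}

/* Ideal overlap (assumed model): with the work split across nstreams
 * streams, all but one stream's share of the copy is hidden behind
 * kernel execution. */
double expected_streamed(double memcopy_time, double kernel_time, int nstreams)
{
    assert(nstreams > 0);
    return kernel_time + memcopy_time / nstreams;
}
```

Calling expected_serial(22.09, 18.53) and expected_streamed(22.09, 18.53, 8) with the values from the post gives the two bracketed figures. Under this reading, 21.29 is the time an ideally overlapped run would take, and the measured 34.93 falls between that and the non-overlapped estimate of 40.62.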

AlexH · January 16, 2009, 9:08pm

Ah, thank you very much. But maybe they should write it
this way: (>= 21.29 expected …
But thanks again.

Oh, I have another question: maybe you could help me with that, too:

In the example “simpleStreams” in the SDK there is
a line in the kernel:
…
g_data[idx] += *factor; // non-coalesced on purpose, to burn time

Why is that memory access non-coalesced?

Thanks in advance.
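
One possible reading of that comment, not confirmed anywhere in this thread: g_data[idx] is the usual coalesced pattern when idx is the global thread index, so the remark may refer to *factor, which every thread reads from the same global address. Under the strict rule documented for compute capability 1.0 and 1.1 (thread k of a half-warp must touch word k of an aligned segment), such a read does not coalesce; later hardware, including compute capability 1.3, relaxes this. The host-side C sketch below only models the byte offsets involved. The helper names, the 4-byte element size, and the idx formula are assumptions, since the thread does not show the rest of the kernel.

```c
#include <assert.h>
#include <stddef.h>

enum { HALF_WARP = 16 };  /* threads considered together on 1.x devices */

/* Byte offset touched by g_data[idx], assuming 4-byte elements and
 * idx = blockIdx.x * blockDim.x + threadIdx.x (both are assumptions). */
size_t offset_g_data(int block_idx, int block_dim, int thread_idx)
{
    assert(block_idx >= 0 && block_dim > 0 && thread_idx >= 0);
    size_t idx = (size_t)block_idx * (size_t)block_dim + (size_t)thread_idx;
    return idx * sizeof(int);
}

/* Byte offset touched by *factor: the same location for every thread. */
size_t offset_factor(int thread_idx)
{
    (void)thread_idx;  /* the address does not depend on the thread */
    return 0;
}

/* Strict 1.0/1.1-style check: returns 1 only if thread k touches word k
 * of a segment aligned to HALF_WARP * sizeof(int) bytes, else 0. */
int is_sequential_aligned(const size_t offsets[HALF_WARP])
{
    if (offsets[0] % (HALF_WARP * sizeof(int)) != 0)
        return 0;
    for (int k = 1; k < HALF_WARP; ++k)
        if (offsets[k] != offsets[0] + (size_t)k * sizeof(int))
            return 0;
    return 1;
}
```

Filling one array with offset_g_data(b, n, k) and another with offset_factor(k) for k = 0..15 and passing each to is_sequential_aligned shows the difference: the first pattern passes the strict check when n is a multiple of 16, the second does not.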
