According to NVIDIA, the Tesla C2050 GPU has 448 cores. But the following output shows only 112 cores:
#lib/gpu/nvc_get_devices
Device 0: “Tesla C2050”
Revision number: 2.0
Total amount of global memory: 2.62 GB
Number of multiprocessors: 14
Number of cores: 112
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Clock rate: 1.15 GHz
Concurrent copy and execution: Yes
Is there any difference between a CUDA core and a normal core (see the core count in the output above)?
Also, there are no sample programs inside the CUDA Toolkit. Does anybody have programs that can compare GPU and CPU performance?
That is just a problem with the deviceQuery example. The previous generations of cards had 8 cores per multiprocessor, whereas the GF100 cards have 32 cores per multiprocessor. Because the CUDA API doesn't (yet) report the core count, only the multiprocessor count, that code has a fixed constant of 8 cores per multiprocessor. So it reports 14 * 8 = 112 instead of the correct 14 * 32 = 448. There is nothing wrong with your card and you have not misunderstood the specifications.
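To make the arithmetic concrete, here is a minimal sketch of how a query tool could derive the core count from the multiprocessor count and the device revision (compute capability). The table and the function name `total_cores` are hypothetical and only cover the two cases discussed in this thread (8 cores per multiprocessor for 1.x devices, 32 for 2.0 devices); this is not the actual deviceQuery or nvc_get_devices code.

```python
# Hypothetical lookup: cores per multiprocessor, keyed by the major
# revision number. Only the two generations mentioned in this thread
# are listed; other devices would need their own entries.
CORES_PER_MULTIPROCESSOR = {
    1: 8,   # pre-Fermi cards (revision 1.x)
    2: 32,  # GF100 cards such as the Tesla C2050 (revision 2.0)
}


def total_cores(multiprocessors, major_revision):
    """Return multiprocessors * cores-per-multiprocessor.

    Raises KeyError for an unknown revision instead of silently
    assuming 8, which is the mistake that produces 112 for a C2050.
    """
    return multiprocessors * CORES_PER_MULTIPROCESSOR[major_revision]


if __name__ == "__main__":
    # Tesla C2050: 14 multiprocessors, revision 2.0
    print("Correct count:  ", total_cores(14, 2))
    # What a tool with a hard-coded constant of 8 would report
    print("Hard-coded to 8:", total_cores(14, 1))
```

Failing on an unknown revision rather than falling back to a default is a design choice here: it makes an outdated table visible instead of producing a plausible but wrong number.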
The OS is RHEL 5.3, 64-bit, but the CUDA toolkit installer is built for RHEL 5.4, 64-bit (cudatoolkit_3.1_linux_64_rhel5.4.run). Could this version mismatch be causing the failure?
Has anybody successfully run this application on a GPU?
I had the same issue with a couple of GTX 465 cards (reporting 88 instead of 352 cores). When I upgraded from the 2.3 Toolkit and SDK to the 3.1 Toolkit and SDK and reran deviceQuery, it got it right.
I guess the latest toolkit knows how many cores per multiprocessor this device has.