pgaccelinfo prints wrong number of CUDA cores

Dear PGI Users,

The new Kepler-based NVIDIA GTX 680 has 8 x 192 = 1,536 CUDA cores, but pgaccelinfo reports only 256, so I cannot make full use of the GTX 680 with the PGI Accelerator compilers.
Is there any workaround?

  1. pgaccelinfo prints as below.
    $ pgaccelinfo
    CUDA Driver Version: 4020
    NVRM version: NVIDIA UNIX x86_64 Kernel Module 295.49 Mon Apr 30 23:46:33 PDT 2012

    Device Number: 0
    Device Name: GeForce GTX 680
    Device Revision Number: 3.0
    Global Memory Size: 2146762752
    Number of Multiprocessors: 8
    Number of Cores: 256
    Concurrent Copy and Execution: Yes
    Total Constant Memory: 65536
    Total Shared Memory per Block: 49152
    Registers per Block: 65536
    Warp Size: 32
    Maximum Threads per Block: 1024
    Maximum Block Dimensions: 1024, 1024, 64
    Maximum Grid Dimensions: 2147483647 x 65535 x 65535
    Maximum Memory Pitch: 2147483647B
    Texture Alignment: 512B
    Clock Rate: 705 MHz
    Execution Timeout: Yes
    Integrated Device: No
    Can Map Host Memory: Yes
    Compute Mode: default
    Concurrent Kernels: Yes
    ECC Enabled: No
    Memory Clock Rate: 3004 MHz
    Memory Bus Width: 256 bits
    L2 Cache Size: 524288 bytes
    Max Threads Per SMP: 2048
    Async Engines: 1
    Unified Addressing: Yes
    Initialization time: 9122 microseconds
    Current free memory: 2034229248
    Upload time (4MB): 2930 microseconds (1724 ms pinned)
    Download time: 3801 microseconds (1845 ms pinned)
    Upload bandwidth: 1431 MB/sec (2432 MB/sec pinned)
    Download bandwidth: 1103 MB/sec (2273 MB/sec pinned)

  2. CUDA example prints as below:
    ./deviceQueryDrv
    [deviceQueryDrv] starting...

    CUDA Device Query (Driver API) statically linked version
    There is 1 device supporting CUDA

    Device 0: "GeForce GTX 680"
    CUDA Driver Version: 4.2
    CUDA Capability Major/Minor version number: 3.0
    Total amount of global memory: 2047 MBytes (2146762752 bytes)
    ( 8) Multiprocessors x (192) CUDA Cores/MP: 1536 CUDA Cores
    GPU Clock rate: 706 MHz (0.71 GHz)
    Memory Clock rate: 3004 Mhz
    Memory Bus Width: 256-bit
    L2 Cache Size: 524288 bytes
    Max Texture Dimension Sizes 1D=(65536) 2D=(65536,65536) 3D=(4096,4096,4096)
    Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048
    Total amount of constant memory: 65536 bytes
    Total amount of shared memory per block: 49152 bytes
    Total number of registers available per block: 65536
    Warp size: 32
    Maximum number of threads per multiprocessor: 2048
    Maximum number of threads per block: 1024
    Maximum sizes of each dimension of a block: 1024 x 1024 x 64
    Maximum sizes of each dimension of a grid: 2147483647 x 65535 x 65535
    Texture alignment: 512 bytes
    Maximum memory pitch: 2147483647 bytes
    Concurrent copy and execution: Yes with 1 copy engine(s)
    Run time limit on kernels: Yes
    Integrated GPU sharing Host Memory: No
    Support host page-locked memory mapping: Yes
    Concurrent kernel execution: Yes
    Alignment requirement for Surfaces: Yes
    Device has ECC support enabled: No
    Device is using TCC driver mode: No
    Device supports Unified Addressing (UVA): Yes
    Device PCI Bus ID / PCI location ID: 1 / 0
    Compute Mode:

    [deviceQueryDrv] test results...
    PASSED



Any comments are welcome.

Best regards,
Jin-Soo Kim


kjs2000@konkuk.ac.kr

Here is the reply from PGI support:

"Kepler boards will not be supported until our 12.6 release in June."
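
For reference, the two core counts can be reconciled by hand. The total is the number of multiprocessors times the cores per multiprocessor, and the latter depends on the compute capability: 192 for 3.0 (Kepler) and 32 for 2.0 (Fermi). The 256 that pgaccelinfo prints equals 8 x 32, so it looks as if the Fermi value is being applied to the Kepler board. That is only a guess based on the arithmetic, not something PGI has confirmed. Below is a minimal Python sketch of the calculation; the table and the function name `cuda_cores` are made up for illustration and are not part of any PGI or NVIDIA tool.

```python
# CUDA cores per multiprocessor, keyed by compute capability (major, minor).
# Only architectures up to Kepler 3.0 are listed; the table is illustrative.
CORES_PER_SM = {
    (1, 0): 8, (1, 1): 8, (1, 2): 8, (1, 3): 8,  # Tesla generation
    (2, 0): 32,   # Fermi
    (2, 1): 48,   # Fermi (later chips)
    (3, 0): 192,  # Kepler, e.g. GeForce GTX 680
}


def cuda_cores(multiprocessors, major, minor):
    """Return the total CUDA core count, or None if the capability is unknown."""
    per_sm = CORES_PER_SM.get((major, minor))
    if per_sm is None:
        return None
    return multiprocessors * per_sm


# GTX 680: 8 multiprocessors, compute capability 3.0
print(cuda_cores(8, 3, 0))  # 1536, matches deviceQueryDrv

# Same multiprocessor count with the Fermi 2.0 value (assumed fallback)
print(cuda_cores(8, 2, 0))  # 256, matches pgaccelinfo
```

The multiprocessor count (8) and the revision number (3.0) are both reported correctly by pgaccelinfo, so only the per-multiprocessor factor appears to differ between the two tools.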