I have had a rough start with CUDA and the PGI Fortran compiler. I fully intend to persevere, but I can't get past the problem that cufinfo reports that my card has no global memory, i.e.,
Device Number: 0
Device Name: Device Emulation (CPU)
Total Global Memory: 0.000 Gbytes <---- this line
sharedMemPerBlock: 16384 bytes
warpSize: 1 <---- is this correct by the way? shouldn't it be 32?
maxThreadsDim: 512 x 512 x 64
maxGridSize: 65535 x 65535 x 1
ClockRate: 1.350 GHz
Total Const Memory: 65536 bytes
Compute Capability Revision: 9999.9999
TextureAlignment: 256 bytes
The above was run on a MacBook Pro equipped with a GeForce 8600M GT. As a result, the matmult example returns an error when allocating device memory for the matrices.
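For what it's worth, a compute capability of 9999.9999 is what the CUDA runtime reports when it falls back to device emulation, i.e. when it cannot find a usable GPU/driver combination. You can check for this explicitly; a minimal sketch in CUDA Fortran (hypothetical program name, assuming the PGI `cudafor` module is available) might look like:

```fortran
! Sketch: detect device-emulation mode by querying device properties.
! Compile with: pgfortran -Mcuda checkdev.cuf
program checkdev
  use cudafor
  implicit none
  type(cudaDeviceProp) :: prop
  integer :: istat

  istat = cudaGetDeviceProperties(prop, 0)
  if (prop%major == 9999) then
    print *, 'Device emulation mode: no usable GPU was found.'
  else
    print *, 'Real device, compute capability ', prop%major, '.', prop%minor
  end if
end program checkdev
```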
Many thanks for your help!
Following up on my last thread, here is what deviceQuery from the CUDA SDK reports:
CUDA Device Query (Runtime API) version (CUDART static linking)
There is 1 device supporting CUDA
Device 0: "GeForce 8600M GT"
CUDA Driver Version: 3.0
CUDA Runtime Version: 3.0
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 1
Total amount of global memory: 134021120 bytes
Number of multiprocessors: 4
Number of cores: 32
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 256 bytes
Clock rate: 0.94 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Support host page-locked memory mapping: No
Compute mode: Default (multiple host threads can use this device simultaneously)
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 53331, CUDA Runtime Version = 3.0, NumDevs = 1, Device = GeForce 8600M GT
However, even though deviceQuery reports that my GPU does have memory, and I managed to run some toy codes written in C, I still cannot allocate any variable on the device from Fortran.
Please post your code so we can see how you are allocating data in device memory.
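For comparison, a minimal device allocation in CUDA Fortran looks roughly like the sketch below (hypothetical names and sizes; assumes the PGI `cudafor` module):

```fortran
! Sketch: allocate an array in device memory and report success/failure.
! Compile with: pgfortran -Mcuda devalloc.cuf
program devalloc
  use cudafor
  implicit none
  real, device, allocatable :: a_d(:)   ! device-resident array
  integer :: istat

  allocate(a_d(1024), stat=istat)
  if (istat /= 0) then
    print *, 'device allocation failed, stat = ', istat
  else
    print *, 'device allocation succeeded'
    deallocate(a_d)
  end if
end program devalloc
```

If even a bare-bones program like this fails on your machine, the problem is in the toolchain setup rather than in your application code.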
Hi Sohrab Kehtari,
I was able to reproduce the issue here on a MacBook Pro. It appears that the CUDA 2.3 libraries we ship with the compilers are incompatible with NVIDIA's CUDA 3.0 Mac OS driver. To fix it, either rename or remove the "/opt/pgi/osx86/2010/cuda/2.3" directory, and compile with "-ta=nvidia,cuda3.0" when using the PGI Accelerator model or "-Mcuda=cuda3.0" when using CUDA Fortran.
Note that the incompatibility seems to occur only with devices of compute capability 1.1.
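Concretely, the workaround amounts to something like the following (paths as shipped with the 2010 PGI release; adjust the version directory and file names to match your installation):

```shell
# Move the bundled CUDA 2.3 libraries out of the way so the
# compiler cannot pick them up.
sudo mv /opt/pgi/osx86/2010/cuda/2.3 /opt/pgi/osx86/2010/cuda/2.3.disabled

# Then build against the CUDA 3.0 toolkit instead:
pgfortran -Mcuda=cuda3.0 matmul.cuf -o matmul      # CUDA Fortran
pgfortran -ta=nvidia,cuda3.0 matmul.f90 -o matmul  # PGI Accelerator model
```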
Hope this helps,
Many thanks Mat, this was very helpful and did fix the problem.
Mat, this raises the question: when will PGI begin shipping CUDA 3.0, or 3.1?
We started shipping CUDA 3.0 with the 10.4 release. Future CUDA versions will be added after NVIDIA officially releases them (i.e., not Beta) and once we have validated them.