Tesla C2075 Total global memory shows to be only 1.3 gigabytes

flipordie · April 30, 2013, 10:59am

Good day,

As mentioned in the topic, the global memory of my Tesla C2075 is displayed as 1.3 gigabytes, which is far less than it should be according to the product information.
Product information can be found here NVIDIA DGX Station A100 | NVIDIA

My setup:
Tesla C2075 driver version 311.50, Release date 2013.04.17
CUDA 5.0

The code snippet that I used for displaying global memory:

void PrintDeviceProperties(cudaDeviceProp devProp)
{
    FILE *deviceProperties = fopen("DeviceProperties.txt", "a+");
    fprintf(deviceProperties, "Major revision number: %d\n", devProp.major);
    fprintf(deviceProperties, "Minor revision number: %d\n", devProp.minor);
    fprintf(deviceProperties, "Name: %s\n", devProp.name);
    fprintf(deviceProperties, "Total global memory: %u\n", devProp.totalGlobalMem);
    fprintf(deviceProperties, "Total shared memory per block: %u\n", devProp.sharedMemPerBlock);
    fprintf(deviceProperties, "Total registers per block: %d\n", devProp.regsPerBlock);
    fprintf(deviceProperties, "Warp size: %d\n", devProp.warpSize);
    fprintf(deviceProperties, "Maximum memory pitch: %u\n", devProp.memPitch);
    fprintf(deviceProperties, "Maximum threads per block: %d\n", devProp.maxThreadsPerBlock);
    for (int i = 0; i < 3; ++i)
        fprintf(deviceProperties, "Maximum dimension %d of block: %d\n", i, devProp.maxThreadsDim[i]);
    for (int i = 0; i < 3; ++i)
        fprintf(deviceProperties, "Maximum dimension %d of grid: %d\n", i, devProp.maxGridSize[i]);
    fprintf(deviceProperties, "Clock rate: %d\n", devProp.clockRate);
    fprintf(deviceProperties, "Total constant memory: %u\n", devProp.totalConstMem);
    fprintf(deviceProperties, "Texture alignment: %u\n", devProp.textureAlignment);
    fprintf(deviceProperties, "Concurrent copy and execution: %s\n", (devProp.deviceOverlap ? "Yes" : "No"));
    fprintf(deviceProperties, "Number of multiprocessors: %d\n", devProp.multiProcessorCount);
    fprintf(deviceProperties, "Kernel execution timeout: %s\n",
            (devProp.kernelExecTimeoutEnabled ? "Yes" : "No"));
    fclose(deviceProperties);
}

And the result is as follows:

Major revision number: 2
Minor revision number: 0
Name: Tesla C2075
Total global memory: 1341849600
Total shared memory per block: 49152
Total registers per block: 32768
Warp size: 32
Maximum memory pitch: 2147483647
Maximum threads per block: 1024
Maximum dimension 0 of block: 1024
Maximum dimension 1 of block: 1024
Maximum dimension 2 of block: 64
Maximum dimension 0 of grid: 65535
Maximum dimension 1 of grid: 65535
Maximum dimension 2 of grid: 65535
Clock rate: 1147000
Total constant memory: 65536
Texture alignment: 512
Concurrent copy and execution: Yes
Number of multiprocessors: 14
Kernel execution timeout: No

Any help is appreciated!

Best regards,
Jonne

vacaloca · April 30, 2013, 11:24am

Post nvidia-smi output. I’m going to assume ECC support is on, although even with ECC on, total available memory should be more than that. Regardless, turn ECC off via nvidia-smi and reboot, and see if it changes.

As per Wiki: “With ECC on, a portion of the dedicated memory is used for ECC bits, so the available user memory is reduced by 12.5%. (e.g. 3 GB total memory yields 2.625 GB of user available memory.)”
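The 12.5% figure in that quote can be checked with a few lines of C. This is only a sketch of the arithmetic: the function name ecc_usable_mib is made up for illustration, and it assumes ECC reserves exactly one eighth of the board memory.

```c
#include <stdint.h>

/* Usable memory when ECC reserves one eighth (12.5%) of the total.
   Sizes are in MiB; the result is exact when total_mib is divisible by 8.
   Hypothetical helper, not part of any NVIDIA API. */
uint64_t ecc_usable_mib(uint64_t total_mib)
{
    return total_mib - total_mib / 8;
}
```

Calling ecc_usable_mib(3072) gives 2688 MiB, which is the 2.625 GB from the quote, and a 6 GB board (6144 MiB) would come out at 5376 MiB. Neither is anywhere near 1.3 GB, so ECC alone does not explain the reported number.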

Not sure if you’re running Windows/Linux, but also check NVIDIA control panel/nvidia-settings and see what is being reported for total GPU memory there. If after setting ECC off memory usage is reported to be less than what it should be, try the card on another system and see if memory is reported correctly on the alternate system. It could just be some odd issue with an old/outdated BIOS on your test system – I’d recommend doing that as well.

Finally, is this a secondhand card? How many memory chips on the board and what are their model numbers? You can do a visual inspection to make sure someone didn’t just de-solder the chips to use them for something else. This scenario is a bit more far-fetched, but I mentioned it regardless.

mfatica · April 30, 2013, 6:34pm

You are using the wrong format in the printf.
The amount of memory needs a %lu formatting, not a %u
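To illustrate why the specifier matters, here is a minimal C sketch. Passing a 64-bit size_t to a %u conversion is undefined behaviour, but on little-endian x86-64 it commonly ends up printing only the low 32 bits of the value. The helper name low32 is hypothetical, and the 64-bit input used in the note below is an assumed figure picked because its low 32 bits match the number in the first post; it is not a measured value.

```c
#include <stdint.h>

/* Keep only the low 32 bits of a 64-bit value. This mimics what a %u
   conversion typically prints when it is handed a 64-bit argument
   (an assumption about common x86-64 behaviour, not a guarantee). */
uint32_t low32(uint64_t value)
{
    return (uint32_t)(value & 0xFFFFFFFFu);
}
```

With the assumed input 5636816896, low32 returns 1341849600, exactly the "Total global memory" line from the first post. Values below 2^32 pass through unchanged, which is why the smaller properties look fine.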

flipordie · May 14, 2013, 7:23am

==============NVSMI LOG==============

Timestamp : Mon May 13 11:10:15 2013
Driver Version : 307.83

Attached GPUs : 2
GPU 0000:02:00.0
Product Name : Tesla C2075
Display Mode : Disabled
Persistence Mode : N/A
Driver Model
Current : TCC
Pending : TCC
Serial Number : 0322712006318
GPU UUID : GPU-d013094c-ff46-65b8-0c83-62f13e1a975d
VBIOS Version : 70.10.46.00.05
Inforom Version
Image Version : N/A
OEM Object : 1.1
ECC Object : 2.0
Power Management Object : 4.0
GPU Operation Mode
Current : N/A
Pending : N/A
PCI
Bus : 0x02
Device : 0x00
Domain : 0x0000
Device Id : 0x109610DE
Bus Id : 0000:02:00.0
Sub System Id : 0x091010DE
GPU Link Info
PCIe Generation
Max : 2
Current : 2
Link Width
Max : 16x
Current : 16x
Fan Speed : 30 %
Performance State : P0
Clocks Throttle Reasons : N/A
Memory Usage
Total : 5375 MB
Used : 171 MB
Free : 5204 MB
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 0 %
Ecc Mode
Current : Enabled
Pending : Enabled
ECC Errors
Volatile
Single Bit
Device Memory : 0
Register File : 0
L1 Cache : 0
L2 Cache : 0
Texture Memory : N/A
Total : 0
Double Bit
Device Memory : 0
Register File : 0
L1 Cache : 0
L2 Cache : 0
Texture Memory : N/A
Total : 0
Aggregate
Single Bit
Device Memory : 0
Register File : 0
L1 Cache : 0
L2 Cache : 0
Texture Memory : N/A
Total : 0
Double Bit
Device Memory : 0
Register File : 0
L1 Cache : 0
L2 Cache : 0
Texture Memory : N/A
Total : 0
Temperature
Gpu : 52 C
Power Readings
Power Management : Supported
Power Draw : 78.12 W
Power Limit : 225.00 W
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 573 MHz
SM : 1147 MHz
Memory : 1566 MHz
Applications Clocks
Graphics : N/A
Memory : N/A
Max Clocks
Graphics : 573 MHz
SM : 1147 MHz
Memory : 1566 MHz
Compute Processes
Process ID : 6120
Name : D:\testi\Release\DataAnalyzer.exe
Used GPU Memory : 160 MB

ECC disabled

flipordie · May 14, 2013, 7:25am

Hello again,

Thank you for the suggestions.

The earlier post is my nvidia-smi output when ECC is enabled.

mfatica wrote: "The amount of memory needs a %lu formatting, not a %u"
Changing the format specifier as suggested did not have any effect on the outcome.

An addition to the system information: the machine runs Windows 7 64-bit.

After changing ECC to disabled, the reported memory increased from 1.3 GB to 2.1 GB. Maybe I should detach the Tesla card from this system and try it in another one. However, it seems quite odd that the nvidia-smi output displays the correct amount of global memory. The card is not a secondhand one. Any suggestions or advice would be more than welcome :).

Best regards,
Jonne

mfatica · May 14, 2013, 9:04pm

On Windows the right format is %llu

flipordie · May 15, 2013, 9:54am

Thank you for noticing! However, it does not have any effect on the outcome in this very case.

-Jonne

vacaloca · May 15, 2013, 1:31pm

Does the deviceQuery sample in the CUDA toolkit give you the correct results? (It should.) Check the source and compare the syntax to what you are using.

Source from CUDA 5.0 SDK on Windows:

    sprintf(msg, "  Total amount of global memory:                 %.0f MBytes (%llu bytes)\n",
            (float)deviceProp.totalGlobalMem/1048576.0f, (unsigned long long) deviceProp.totalGlobalMem);
    printf("%s", msg);

Try an explicit unsigned long long cast.
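A stripped-down version of that idea, as a sketch only: the function name format_total_bytes is made up, it takes a plain size_t instead of a cudaDeviceProp so it compiles without the CUDA headers, and it assumes a C99 library that provides snprintf.

```c
#include <stdio.h>
#include <stddef.h>

/* Write a size_t byte count into buf as decimal text.
   The explicit cast makes the argument type match %llu regardless of
   how wide size_t or long happen to be on the platform.
   Returns the number of characters snprintf wanted to write. */
int format_total_bytes(char *buf, size_t buf_len, size_t total_bytes)
{
    return snprintf(buf, buf_len, "%llu", (unsigned long long)total_bytes);
}
```

In the original function this would amount to passing (unsigned long long)devProp.totalGlobalMem together with %llu, and doing the same for the other size_t members such as sharedMemPerBlock, memPitch, totalConstMem and textureAlignment.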

njuffa · May 16, 2013, 1:04am

The cudaDeviceProp struct has this member:

size_t totalGlobalMem

The standard-compliant format specifier for type size_t is %zu. As far as I am aware, older versions of MSVC do not support this, and one must use %Iu (note: upper case ‘i’, not ‘l’) instead. Since on a 64-bit system, size_t is an unsigned 64-bit type, printing with %llu should work as well.

Try printing just the totalGlobalMem by itself, because if any of the other format specifiers in the printf are incorrect, printf may grab the data from the wrong location.
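If the printed value really was cut down to 32 bits, the true size differs from it by a multiple of 2^32 bytes, and that can be compared against the nvidia-smi log. The sketch below does the arithmetic; the names undo_wrap and to_mib are hypothetical, and it assumes that the "MB" in the nvidia-smi output means 2^20 bytes rounded down.

```c
#include <stdint.h>

/* Add back `wraps` multiples of 2^32 to a value that was truncated
   to 32 bits. Hypothetical helper for checking the numbers by hand. */
uint64_t undo_wrap(uint32_t printed, unsigned wraps)
{
    return (uint64_t)printed + ((uint64_t)wraps << 32);
}

/* Whole mebibytes (2^20 bytes), rounded down. */
uint64_t to_mib(uint64_t bytes)
{
    return bytes >> 20;
}
```

to_mib(undo_wrap(1341849600u, 1)) evaluates to 5375, the same figure nvidia-smi reports as Total with ECC enabled. Likewise, taking the "2.1 GB" seen with ECC disabled to be exactly 2147483648 bytes (an assumption, since only the rounded figure was posted), one wrap gives 6144 MiB, i.e. 6 GiB. That is consistent with a truncated 64-bit value rather than with missing memory, though it does not prove it.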