TX2 Memory Bus Width

This has been bothering me for a while, and I'm not sure I understand it correctly. When I run deviceQuery on my TX2, the Memory Bus Width is reported as 64-bit, but according to the technical specifications of the TX2 module it is supposed to be 128-bit.

Have I misunderstood what these two values represent?

Thanks in advance!

Below is a screenshot of the deviceQuery results on my TX2 device.

https://github.com/ShreyasSkandan/nvidia-tx1-tx2/blob/master/imgs/tx2devicequery.png

That’s exactly what I am seeing, and I’d like to understand the reason. My code is not getting the expected speedup on the TX2 compared to the TX1.

Below is a snippet of the deviceQuery output:

Device 0: “GP10B”
CUDA Driver Version / Runtime Version 8.5 / 8.0
CUDA Capability Major/Minor version number: 6.2
Total amount of global memory: 7854 MBytes (8235577344 bytes)
( 2) Multiprocessors, (128) CUDA Cores/MP: 256 CUDA Cores
GPU Max Clock rate: 1301 MHz (1.30 GHz)
Memory Clock rate: 13 Mhz
Memory Bus Width: 64-bit
L2 Cache Size: 524288 bytes

Note the memory bus width.

-Robby

Yes, there definitely seems to be something wrong with that device value. I just tested my desktop graphics card (NVIDIA GTX 1070) and got 256-bit, as expected. This is indeed quite strange.

Actually, the reported memory clock rate of 13 MHz is weird too. Something is not right.
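A quick sanity check shows why those two fields look wrong together. The sketch below estimates a theoretical peak bandwidth from a memory clock and a bus width. The function name `peak_bandwidth_gb_s` is made up for illustration, and the factor of 2 transfers per clock is an assumption (the usual double-data-rate convention), not something taken from the deviceQuery output.

```python
def peak_bandwidth_gb_s(mem_clock_mhz, bus_width_bits, transfers_per_clock=2):
    """Rough theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    transfers_per_clock=2 assumes double-data-rate memory; this is an
    assumption, not a value reported by deviceQuery.
    """
    bytes_per_transfer = bus_width_bits / 8
    return mem_clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9


# Values exactly as reported by deviceQuery in this thread.
reported = peak_bandwidth_gb_s(mem_clock_mhz=13, bus_width_bits=64)
print(f"Peak implied by reported fields: {reported:.3f} GB/s")
```

Under these assumptions the reported fields imply a peak of roughly 0.2 GB/s, which is far too low to be plausible for this board, so the reported values themselves are the likely suspect.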

When I ran bandwidthTest, the results were strange and inconsistent.

One test showed HtoD bandwidth at only about 1/3 of DtoH. The next test showed the opposite: HtoD bandwidth at about 2X that of DtoH. Yet another test showed HtoD and DtoH at the same bandwidth, but well below the best figures from the other two runs. The three runs are shown below.

I understand that Samples are not meant for performance measurement, but the results I am getting are not even in the same ballpark.

Test 1:

Device 0: GP10B
Quick Mode

Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 5277.9

Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 15735.8

Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 26376.7

Result = PASS

Test 2:

Device 0: GP10B
Quick Mode

Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 12789.0

Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 5844.9

Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 19148.8

Result = PASS

Test 3:

Device 0: GP10B
Quick Mode

Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 7702.8

Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 7726.5

Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 27703.2

Result = PASS
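To put numbers on the inconsistency, here is a small sketch that takes the figures from the three runs above and computes the HtoD/DtoH ratio per run plus the max/min spread per direction. The numbers are copied from the bandwidthTest output; the names `runs`, `htod_over_dtoh` and `spread` are just illustrative.

```python
# Bandwidth figures (MB/s) copied from the three bandwidthTest runs above.
runs = {
    "Test 1": {"HtoD": 5277.9, "DtoH": 15735.8, "DtoD": 26376.7},
    "Test 2": {"HtoD": 12789.0, "DtoH": 5844.9, "DtoD": 19148.8},
    "Test 3": {"HtoD": 7702.8, "DtoH": 7726.5, "DtoD": 27703.2},
}


def htod_over_dtoh(run):
    """Ratio of host-to-device to device-to-host bandwidth for one run."""
    return run["HtoD"] / run["DtoH"]


def spread(direction):
    """Max/min ratio of one transfer direction across all runs."""
    values = [run[direction] for run in runs.values()]
    return max(values) / min(values)


for name, run in runs.items():
    print(f"{name}: HtoD/DtoH = {htod_over_dtoh(run):.2f}")

for direction in ("HtoD", "DtoH", "DtoD"):
    print(f"{direction}: max/min across runs = {spread(direction):.2f}")
```

The host/device directions swing by more than 2X between runs, while device-to-device varies much less.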

Very interesting.

@Moderators : Any thoughts on the bus width and rate anomaly?

Anyone have any thoughts regarding this?

Hi singularity7 & Robby,

Thanks for pointing this out.
The two fields have been corrected as shown below, and the fix will be included in the next release.

Memory Clock rate: 1600 Mhz
Memory Bus Width: 128-bit
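
For anyone who wants to relate the corrected fields to the bandwidthTest numbers earlier in the thread, here is a rough sketch. It assumes 2 transfers per clock (double data rate) and treats the MB/s figures as 1e6 bytes per second; both are assumptions, and the helper names are hypothetical.

```python
def peak_gb_s(mem_clock_mhz, bus_width_bits, transfers_per_clock=2):
    """Theoretical peak in GB/s; transfers_per_clock=2 is an assumed DDR factor."""
    return mem_clock_mhz * 1e6 * transfers_per_clock * (bus_width_bits / 8) / 1e9


def fraction_of_peak(measured_mb_s, peak):
    """Measured bandwidth as a fraction of peak, assuming 1 MB = 1e6 bytes."""
    return (measured_mb_s / 1e3) / peak


# Corrected fields from the reply above.
corrected_peak = peak_gb_s(mem_clock_mhz=1600, bus_width_bits=128)

# Best device-to-device figure from the three runs posted in this thread.
best_dtod_mb_s = 27703.2

print(f"Peak from corrected fields: {corrected_peak:.1f} GB/s")
print(f"Best DtoD run vs. peak: {fraction_of_peak(best_dtod_mb_s, corrected_peak):.0%}")
```

With these assumptions the corrected fields give a peak of about 51 GB/s, and the best device-to-device run lands at a bit over half of that, which is at least in a believable range, unlike the originally reported values.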