GTX 980 Ti

Sounds good, but I cannot find a vendor with these in stock. Seems like a good value.

http://www.anandtech.com/show/9306/the-nvidia-geforce-gtx-980-ti-review

Any benchmarks out there?

They seem to be 3~8% slower (depending on the parameters) than a TITAN X in my compute-bound application, and 6GB memory is ample for most tasks. Considering they’re 35% cheaper than TITAN Xes, I’d say they’re very competitive FP32 CUDA cards.
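For anyone who wants to combine those two figures, here is a minimal sketch of the value comparison, taking the TITAN X as 1.0 for both speed and price. The function name is made up for illustration, and the inputs are only the 3~8% and 35% quoted above for one compute-bound application, so treat the result as a rough guide.

```python
def relative_value(slowdown, discount):
    """Performance per unit of price, relative to a baseline card at 1.0 / 1.0."""
    relative_speed = 1.0 - slowdown   # e.g. 8% slower  -> 0.92
    relative_price = 1.0 - discount   # e.g. 35% cheaper -> 0.65
    return relative_speed / relative_price


# The two ends of the "3~8% slower" range, both at 35% cheaper.
for slowdown in (0.03, 0.08):
    value = relative_value(slowdown, 0.35)
    print(f"{slowdown:.0%} slower, 35% cheaper: {value:.2f}x performance per dollar")
```

Under those numbers the 980 Ti lands at roughly 1.4x to 1.5x the performance per dollar of the TITAN X.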

Thanks for the info.

Are the clock speeds of the two the same?

For example, I think the GTX Titan Black generally ran at a lower clock speed than the GTX 780 Ti.

This will also depend on the manufacturer of that particular GPU model, as a company like EVGA will add an ACX cooler and then overclock by about 10%.

Also, what about memory-bound applications? Any noticeable difference?

Tricky question! I actually bought three Gigabyte GTX 980 Tis (reference cards) and a Gigabyte GTX Titan X (also a reference card). Their actual core clocks all differ under full load (no OC, default voltage), presumably due to different ASIC quality, despite the typical boost clock being the same at 1076 MHz. Among the 980 Tis, the ones with higher ASIC quality reach higher actual clocks, but they all fall between 1189 and 1215 MHz. The Titan X runs at 1177 MHz at full load. So no, they do not run at the same clock speed by default. The peak GFLOPS at default clocks are: Titan X @ 6866, 980 Tis @ 6478 to 6569 (again, higher ASIC quality ones have higher GFLOPS).

I also did a slight OC to force them all to run at 1220 MHz; the peak GFLOPS were: Titan X @ 7073, 980 Tis @ 6561 to 6583.

All of my 980 Tis can be overclocked by +300 MHz, to between 1501 and 1527 MHz, at default voltage, raising the GFLOPS to 8077 to 8203, compared with the Titan X at 1477 MHz giving 8609 GFLOPS.
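As a sanity check on the measured numbers above, here is a rough sketch comparing them against the theoretical FP32 peak, computed as 2 FLOPs (one FMA) per core per cycle. The 128 cores per SM figure is the Maxwell SM width and is not stated in this thread, and the function name is made up for illustration. Only the data points where the thread gives an unambiguous clock are included.

```python
CORES_PER_SM = 128  # Maxwell SM width; an outside figure, not from the thread


def peak_gflops(sm_count, clock_mhz):
    """Theoretical FP32 peak in GFLOPS: one FMA (2 FLOPs) per core per cycle."""
    return 2 * sm_count * CORES_PER_SM * clock_mhz / 1000.0


# (label, SM count, core clock in MHz, measured peak GFLOPS) as reported above
runs = [
    ("Titan X, default clock", 24, 1177, 6866),
    ("Titan X, forced 1220 MHz", 24, 1220, 7073),
    ("Titan X, overclocked", 24, 1477, 8609),
    ("980 Ti, forced 1220 MHz (low)", 22, 1220, 6561),
    ("980 Ti, forced 1220 MHz (high)", 22, 1220, 6583),
]

for label, sms, clock, measured in runs:
    peak = peak_gflops(sms, clock)
    print(f"{label:32s} peak {peak:7.0f}  measured {measured:5d}  "
          f"({measured / peak:.1%} of peak)")
```

With these inputs every measurement comes out at about 94 to 96% of the theoretical peak, which suggests the reported GFLOPS track clock and SM count closely on both cards.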

I haven’t tested any memory-bound applications as I don’t have any on hand, but I would guess there wouldn’t be much difference, as they are of the same architecture and the advertised bandwidth is the same. Is there any free CUDA memory bandwidth benchmark software I can use?

Another thing I’ve noticed is that algorithms that I’ve optimized for the Titan X need their parameters re-tuned to run at maximum speed on the 980 Ti, and vice versa. This struck me as odd, as I was under the impression they’re very similar GPUs. On some parameter sets I even see the Titan X performing worse than the 980 Ti. I first thought it was due to SM workload balancing issues, as the Titan X has 24 SMs and the 980 Ti has 22, but that wasn’t the case. Profiling shows the biggest difference across cards seems to be warp issue efficiency in my program. I’m still nailing down the exact reason.

How about the CUDA sample code bandwidthTest?

The device-to-device copy reported number should be a reasonable proxy for relative comparison of different GPUs.

(Some GPUs may need to have “boost clocks enabled” for comparison purposes. I don’t think that’s the case with these GPUs.)

They all clock @ 7010 MHz, and the D to D transfer rates are around (±0.2%) 249,500 MB/s for all four of my cards. I think it’s safe to say the memory performance of the Titan X and the 980 Ti is the same, provided Nvidia doesn’t pull another 970.

Interestingly, CUDA-Z seems to show very different DtoD transfer rates; I’ll take it as the result of a different implementation.
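To put the 7010 MHz memory clock and the D to D figure in context, here is a small sketch of the theoretical bandwidth calculation. The 384-bit bus width is the GM200 figure and is not stated in this thread, and treating bandwidthTest's MB as 10^6 bytes is an assumption; if the tool counts 2^20 bytes per MB, the measured fraction would be a few points higher.

```python
BUS_WIDTH_BITS = 384  # GM200 memory bus width; an outside figure, not from the thread


def theoretical_bandwidth_gbs(effective_clock_mhz, bus_width_bits=BUS_WIDTH_BITS):
    """Peak DRAM bandwidth in GB/s (10^9 bytes/s) from the effective data rate."""
    bytes_per_transfer = bus_width_bits / 8
    return effective_clock_mhz * 1e6 * bytes_per_transfer / 1e9


peak = theoretical_bandwidth_gbs(7010)
measured = 249_500 / 1000  # assumes MB means 10^6 bytes (see note above)
print(f"theoretical peak: {peak:.2f} GB/s")
print(f"device-to-device copy: {measured / peak:.0%} of peak")
```

The theoretical figure works out to 336.48 GB/s, which matches the 336.480 GB/s header printed in the benchmark output later in this thread, and the D to D copy sits at roughly three quarters of that under the stated assumption.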


Thank you for the detailed information.

I have an application which needs slightly higher performance than what I was achieving on a not-overclocked reference GTX Titan X.

Trying to figure out whether I should get a factory superclocked Titan X or the GTX 980 Ti.

As for a good memory bandwidth test, an engineer who frequents this board, Jimmy Pettersson, submitted an excellent CUDA memory bandwidth test, which I use to gauge “best-case-scenario” bandwidth.

Here is a link to the source code, which (if you are interested) you can just copy, paste, compile, and run:

http://pastebin.com/qjkZZaPA

My output on my Titan X was this:

GeForce GTX TITAN X @ 336.480 GB/s

 N           [GB/s]    [perc]    [usec]    test
 1048576     156.60    46.54     26.8      Pass
 2097152     198.52    59.00     42.3      Pass
 4194304     227.90    67.73     73.6      Pass
 8388608     255.37    75.90     131.4     Pass
 16777216    269.79    80.18     248.7     Pass
 33554432    277.85    82.58     483.1     Pass
 67108864    282.06    83.83     951.7     Pass
 134217728   284.33    84.50     1888.2    Pass

 Non-base 2 tests!

 N           [GB/s]    [perc]    [usec]    test
 14680102    269.23    80.01     218.1     Pass
 14680119    269.36    80.05     218.0     Pass
 18875600    266.36    79.16     283.5     Pass
 7434886     155.41    46.19     191.4     Pass
 13324075    240.67    71.53     221.4     Pass
 15764213    252.76    75.12     249.5     Pass
 1850154     60.04     17.84     123.3     Pass
 4991241     139.61    41.49     143.0     Pass
Press any key to continue . . .

The champ in this test is still the GTX 780 Ti, which was about 3% higher than the above numbers. The code was geared towards Kepler, so maybe there are Maxwell optimizations to be had.
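For anyone reading the output above, the columns appear to be related by simple arithmetic. The sketch below recomputes [GB/s] and [perc] from N and [usec] for a few rows of the Titan X run. The 4 bytes per element figure is an assumption inferred from the numbers themselves, not taken from the benchmark's source code, and the function name is made up for illustration.

```python
THEORETICAL_GBS = 336.480  # from the header line of the output
BYTES_PER_ELEMENT = 4      # assumption: inferred from the numbers, not from the source


def bandwidth_gbs(n, usec):
    """Bandwidth in GB/s if n elements of BYTES_PER_ELEMENT move in usec microseconds."""
    return n * BYTES_PER_ELEMENT / (usec * 1e-6) / 1e9


# (N, usec, reported GB/s) rows copied from the Titan X output above
rows = [
    (1048576, 26.8, 156.60),
    (134217728, 1888.2, 284.33),
    (14680102, 218.1, 269.23),
    (1850154, 123.3, 60.04),
]

for n, usec, reported in rows:
    gbs = bandwidth_gbs(n, usec)
    print(f"N={n:>10}  recomputed {gbs:7.2f} GB/s  reported {reported:7.2f} GB/s  "
          f"{gbs / THEORETICAL_GBS:6.2%} of peak")
```

The recomputed values agree with the reported ones to within the rounding of the [usec] column, so [perc] reads as the fraction of the 336.480 GB/s theoretical figure actually achieved.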

Here are sample results on my four cards (factory clocks and voltage). Judging from what I see across several runs on the same card, I would say the memory bandwidth differences between the cards below are mostly run-to-run fluctuation.

GeForce GTX TITAN X @ 336.480 GB/s

 N           [GB/s]    [perc]    [usec]    test
 1048576     148.24    44.06     28.3      Pass
 2097152     192.36    57.17     43.6      Pass
 4194304     221.20    65.74     75.8      Pass
 8388608     252.56    75.06     132.9     Pass
 16777216    267.58    79.52     250.8     Pass
 33554432    275.92    82.00     486.4     Pass
 67108864    280.58    83.39     956.7     Pass
 134217728   282.34    83.91     1901.5    Pass

 Non-base 2 tests!

 N           [GB/s]    [perc]    [usec]    test
 14680102    268.26    79.72     218.9     Pass
 14680119    268.79    79.88     218.5     Pass
 18875600    265.64    78.95     284.2     Pass
 7434886     156.25    46.44     190.3     Pass
 13324075    240.50    71.48     221.6     Pass
 15764213    252.23    74.96     250.0     Pass
 1850154     60.86     18.09     121.6     Pass
 4991241     141.14    41.95     141.5     Pass
Press any key to continue . . .

GeForce GTX 980 Ti @ 336.480 GB/s

 N           [GB/s]    [perc]    [usec]    test
 1048576     135.25    40.19     31.0      Pass
 2097152     183.76    54.61     45.6      Pass
 4194304     220.88    65.64     76.0      Pass
 8388608     247.68    73.61     135.5     Pass
 16777216    263.54    78.32     254.6     Pass
 33554432    273.81    81.38     490.2     Pass
 67108864    280.07    83.23     958.5     Pass
 134217728   282.91    84.08     1897.7    Pass

 Non-base 2 tests!

 N           [GB/s]    [perc]    [usec]    test
 14680102    267.88    79.61     219.2     Pass
 14680119    268.09    79.68     219.0     Pass
 18875600    265.39    78.87     284.5     Pass
 7434886     156.28    46.45     190.3     Pass
 13324075    239.71    71.24     222.3     Pass
 15764213    252.25    74.97     250.0     Pass
 1850154     60.76     18.06     121.8     Pass
 4991241     140.38    41.72     142.2     Pass
Press any key to continue . . .

GeForce GTX 980 Ti @ 336.480 GB/s

 N           [GB/s]    [perc]    [usec]    test
 1048576     123.11    36.59     34.1      Pass
 2097152     170.48    50.67     49.2      Pass
 4194304     211.95    62.99     79.2      Pass
 8388608     241.84    71.87     138.7     Pass
 16777216    259.46    77.11     258.6     Pass
 33554432    272.92    81.11     491.8     Pass
 67108864    279.12    82.95     961.7     Pass
 134217728   282.44    83.94     1900.8    Pass

 Non-base 2 tests!

 N           [GB/s]    [perc]    [usec]    test
 14680102    265.83    79.00     220.9     Pass
 14680119    265.75    78.98     221.0     Pass
 18875600    263.62    78.35     286.4     Pass
 7434886     156.29    46.45     190.3     Pass
 13324075    237.27    70.52     224.6     Pass
 15764213    250.34    74.40     251.9     Pass
 1850154     60.20     17.89     122.9     Pass
 4991241     139.18    41.36     143.4     Pass
Press any key to continue . . .

GeForce GTX 980 Ti @ 336.480 GB/s

 N           [GB/s]    [perc]    [usec]    test
 1048576     138.05    41.03     30.4      Pass
 2097152     182.31    54.18     46.0      Pass
 4194304     221.25    65.75     75.8      Pass
 8388608     248.17    73.75     135.2     Pass
 16777216    261.92    77.84     256.2     Pass
 33554432    273.67    81.33     490.4     Pass
 67108864    280.10    83.24     958.4     Pass
 134217728   282.96    84.09     1897.3    Pass

 Non-base 2 tests!

 N           [GB/s]    [perc]    [usec]    test
 14680102    266.57    79.22     220.3     Pass
 14680119    268.35    79.75     218.8     Pass
 18875600    265.53    78.91     284.4     Pass
 7434886     158.88    47.22     187.2     Pass
 13324075    240.36    71.43     221.7     Pass
 15764213    252.70    75.10     249.5     Pass
 1850154     61.33     18.23     120.7     Pass
 4991241     141.36    42.01     141.2     Pass
Press any key to continue . . .
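
To put a number on how close the four cards are, here is a quick sketch that takes the GB/s value at the largest test size (N = 134217728) from each run above and computes the relative spread. The "#1" to "#3" labels are just the order in which the 980 Ti runs appear in this post, and the function name is made up for illustration.

```python
from statistics import mean

# GB/s at N = 134217728 for each card, copied from the output above
largest_n_gbs = {
    "TITAN X": 282.34,
    "980 Ti #1": 282.91,
    "980 Ti #2": 282.44,
    "980 Ti #3": 282.96,
}


def relative_spread(values):
    """(max - min) / mean, as a fraction."""
    values = list(values)
    return (max(values) - min(values)) / mean(values)


spread = relative_spread(largest_n_gbs.values())
print(f"spread across cards at the largest N: {spread:.2%}")
```

At that size the spread is only a fraction of a percent, which is consistent with the differences being mostly fluctuation.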