Anyone out there using the Maxwell Tesla M40 or M60?

info:

[url]Clean Technology Innovations for Oil Sands and Heavy Oil Production - Acceleware Ltd.[/url]

The M40 seems similar to the GTX Titan X, but with less global memory.
The M60 appears to be two downclocked GTX 980 GPUs in one PCIe slot.
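
If anyone gets their hands on one, a quick way to put the raw specs side by side with a Titan X is a device-property query (essentially what the deviceQuery sample in the CUDA toolkit prints); a minimal sketch:

[code]
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        // clockRate and memoryClockRate are reported in kHz
        printf("Device %d: %s (sm_%d%d)\n", dev, prop.name, prop.major, prop.minor);
        printf("  SMs: %d, core clock: %.0f MHz\n",
               prop.multiProcessorCount, prop.clockRate / 1000.0);
        printf("  global memory: %.1f GiB\n",
               prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
        printf("  memory clock: %.0f MHz, bus width: %d bits\n",
               prop.memoryClockRate / 1000.0, prop.memoryBusWidth);
    }
    return 0;
}
[/code]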

Does anyone have real-life reports about the performance of either GPU vs. the GTX Titan?

Also interested in this one. I’m not aware that they are for sale yet, only announced. Availability was supposed to be towards the end of the year, which has come and gone.

I was also never sure they could live outside of the GRID environment, but I was under the assumption that they could.

As with all other Maxwell cards, they do not appear to be real Teslas: no double precision and no ECC-protected registers and caches. We basically have not seen a real Tesla product that can be used in any industry requiring ECC protection since the K40 (the K80 can’t be used in a workstation).
Not good.
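
For what it is worth, you can at least check what the driver reports for a given board. Here is a minimal sketch using the CUDA runtime; note that this only tells you whether device-memory ECC is currently enabled, and says nothing about register or cache protection:

[code]
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        int ecc = 0;
        // 1 if ECC is currently enabled on this device, 0 if disabled or unsupported
        cudaDeviceGetAttribute(&ecc, cudaDevAttrEccEnabled, dev);
        printf("%s: ECC %s\n", prop.name, ecc ? "enabled" : "disabled or unsupported");
    }
    return 0;
}
[/code]

On boards that do support it, ECC can be toggled with nvidia-smi (nvidia-smi -e 0/1 followed by a reboot, if I remember correctly), at the cost of some usable memory and bandwidth.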

Like any market-driven company, NVIDIA can be expected to focus new product development on the most lucrative markets. After all, they would like to recoup their investment in all that software that is provided free of charge and make a tidy profit on top of that.

Right now, the hot market in compute seems to be deep learning, and judging by announcements (Facebook, [url]http://nvidianews.nvidia.com/news/nvidia-gpus-power-facebook-s-new-deep-learning-machine[/url]) and news reports (e.g. Google, [url]http://www.wired.com/2015/11/googles-open-source-ai-tensorflow-signals-fast-changing-hardware-world/[/url]), that could be a significant market for NVIDIA, consisting of large customers with deep pockets. I was amazed at the quick succession of cuDNN versions; that does not happen unless you put serious engineering resources behind it. Clearly the new M40 is targeted at that market.

As for Teslas in workstations: I would suspect that the portion of that market that was not cannibalized by high-end GeForce consumer cards early on subsequently lost many additional customers to the GTX Titan “pro-sumer” line. As a consequence, workstation solutions may not be viable for Tesla going forward. That is pure speculation on my part, of course.

Other than the video-out capability, what is the difference between the Quadro M6000 and the Tesla M40?

The Quadro M6000 seems to have ECC, but it is implied that its ECC is somehow different from the ECC configuration of a GPU such as the Tesla K40?

[url]http://www.anandtech.com/show/9096/nvidia-announces-quadro-m6000-quadro-vca-2015[/url]

So is there a Maxwell GPU with full ECC available?

ECC on Quadro products is only for the memory, not registers or cache.

I doubt we will see a Maxwell Tesla product like we had for Kepler. NVIDIA is already on the hook to deliver Pascal compute units for future supercomputers, so I assume that Pascal will be a return to form for NVIDIA, with non-abysmal DP performance and full ECC.

I’m hoping for it anyway.

It will be interesting to see how much of the Pascal architecture will “trickle down”. Given that the Pascal accelerators used in supercomputers will sport a new high-speed interconnect (NVLINK) and innovative high-bandwidth memory, I doubt these are units that will also plug into a PC. I would think the K80 is an initial indication of the direction these accelerators will go.

For the PC market, a “lesser” Pascal variant based on PCIe gen3 and the recently specified GDDR5x seems more likely. Will there be enough demand for those to warrant creating the additional SKUs? One can hope.

It is kind of ironic that at one time a high-performance compute accelerator was considered a niche product if it did not offer substantial DP performance, yet much of the accelerated compute market today is driven by applications that use mostly single-precision computation with a smattering of double precision (and that is not just the case for deep learning).
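
To make that concrete, here is a rough micro-benchmark sketch (my own, not an official benchmark) that times the same dependent FMA chain in float and in double. On a Maxwell GeForce/Titan part the double run should be dramatically slower (the architectural DP:SP ratio is 1/32), whereas a Kepler Tesla such as the K40 is rated at 1/3:

[code]
#include <cstdio>
#include <cuda_runtime.h>

// Each thread executes a long dependent chain of fused multiply-adds.
// The template is instantiated for float and double, so the two runs
// differ only in the arithmetic type.
template <typename T>
__global__ void fma_chain(T *out, T seed, int iters) {
    T x = seed + static_cast<T>(threadIdx.x);
    T a = static_cast<T>(1.000001);
    T b = static_cast<T>(0.000001);
    for (int i = 0; i < iters; ++i) {
        x = x * a + b;   // one FMA per iteration
    }
    // Store the result so the compiler cannot eliminate the loop.
    out[blockIdx.x * blockDim.x + threadIdx.x] = x;
}

template <typename T>
float time_kernel(int blocks, int threads, int iters) {
    T *d_out = 0;
    cudaMalloc((void **)&d_out, sizeof(T) * blocks * threads);
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    fma_chain<T><<<blocks, threads>>>(d_out, static_cast<T>(1.0), iters); // warm-up
    cudaEventRecord(start);
    fma_chain<T><<<blocks, threads>>>(d_out, static_cast<T>(1.0), iters);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_out);
    return ms;
}

int main() {
    const int blocks = 1024, threads = 256, iters = 1 << 16;
    float ms_f = time_kernel<float>(blocks, threads, iters);
    float ms_d = time_kernel<double>(blocks, threads, iters);
    double flops = 2.0 * blocks * threads * (double)iters;  // each FMA counts as 2 flops
    printf("float : %8.2f ms, %7.1f GFLOP/s\n", ms_f, flops / (ms_f * 1e6));
    printf("double: %8.2f ms, %7.1f GFLOP/s\n", ms_d, flops / (ms_d * 1e6));
    return 0;
}
[/code]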

Here is a list of vendors:

[url]Page Not Found | NVIDIA[/url]

For the GTX line, I noticed that EVGA offers an extended 10-year warranty on the GTX Titan X for an additional $60, which is good value. The Tesla line offers a standard 3-year warranty, I believe (correct me if I am wrong), with 10-year support available as well.

HBM (version 1) is offered on current AMD gaming cards. I’d be rather sad if NVIDIA didn’t do the same.

NVLINK is a good point. Since Pascal will be used for the gaming cards as well, we will evidently see a PCIe-enabled Pascal card. The question is whether we will see a compute-oriented, PCIe-enabled Pascal card.

If we don’t, then this will have been quite the bait and switch, except that there is nothing to switch to but the competition.

At GTC, NVIDIA stated that Pascal will have three times the bandwidth of Maxwell (e.g. 336 GB/sec in Titan X), which would put memory throughput at a solid 1 TB/s. The memory capacity was stated as 2.7x Maxwell (e.g. 12 GB for Titan X), which would mean 32 GB of on-board memory. This will be a different beast than the HBM currently used with some AMD GPUs, which is slower and significantly smaller. I think it likely that such a configuration will initially be limited to server products where one can tightly control the environment.
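
Theoretical peak aside, a quick way to see what a given board actually delivers is timing large device-to-device copies (the bandwidthTest sample in the CUDA toolkit does this more carefully); a rough sketch:

[code]
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 256 * 1024 * 1024;   // 256 MiB per buffer
    const int reps = 20;
    char *src = 0, *dst = 0;
    cudaMalloc((void **)&src, bytes);
    cudaMalloc((void **)&dst, bytes);
    cudaMemset(src, 0, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice);   // warm-up
    cudaEventRecord(start);
    for (int i = 0; i < reps; ++i)
        cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    // Each copy reads and writes 'bytes', hence the factor of 2.
    printf("device-to-device bandwidth: %.1f GB/s\n",
           2.0 * bytes * reps / (ms * 1e6));

    cudaFree(src);
    cudaFree(dst);
    return 0;
}
[/code]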

I have not been involved in hardware development for 15 years now, so this is all speculation on my part, of course, and I would be happy to be proven wrong.

As for “switching to the competition”, I do not see significant competition to NVIDIA accelerators for the foreseeable future. What people tend to overlook is that a huge component of accelerated computing is the software ecosystem. I also do not see any “bait and switch”. Anybody who reads marketing slides should be aware that stated capabilities refer to top-of-the line configurations. When I read that x86 broke the 1 TFLOP (double precision) barrier in HPC Linpack for a dual socket machine, that does not mean I can expect that level of performance in a workstation.

FTFY: Since Pascal will be used for the gaming cards as well as x86 servers, we will evidently see a PCIe-enabled Pascal card.

NVLink on x86 won’t happen, except for on-board inter-GPU communication (and perhaps at some point in the future across multiple boards, I guess). I suspect that ARM64 platforms are not ready for NVLink either, unless of course NVIDIA has secretly been working on what seemed to have been canceled after the IBM deal: Denver cores + Pascal in an HPC-oriented setup. I highly doubt this will happen, though, especially not within the next year. That leaves Power8+ as the only platform with NVLink support, which means it will be niche, and a server for it will cost as much as a car (or two).

Clearly NVIDIA is working on high-end products based on ARM64 and Pascal, as evidenced by the announcement of the “Drive PX2” product last week: [url]http://blogs.nvidia.com/blog/2016/01/04/automotive-nvidia-drive-px-2/[/url]. Whether this uses NVLINK is not clear, but given that NVIDIA makes both the CPUs and the GPU for this, it presumably could.

I have not seen a published specification of NVLINK, so it is unclear how flexible this interconnect is, and what kind of power consumption one should expect, for example.

True, for some definition of “high end” (not really the HPC definition in this case) :) I did not mention the PX 2 because it’s not a shipping product, just an announcement for the sake of the splash (and there are strong indications that the Pascal chips on the PX 2 are not even real - a bit of a “wood screw” Fermi stunt?). Hence, there is little to conclude from the prototype board Jen-Hsun Huang claimed to be Parker+Pascal, but I’d be surprised if those MXM-looking modules were actually plugged into an NVLink port.

So, they could be working on something, but I have very strong doubts that it will happen this year, perhaps not even next given the amount of resources NVIDIA has and the focus on recent contracts, partners, bandwagons, etc.

What would be really awesome is a chip/board based on Parker scaled up to 150-200 W. That has far less chance of happening, though.

What do you mean by flexibility?

Maybe I should have said scalability instead of flexibility. PCIe can be configured anywhere from an x1 link to an x16 (or even x32?) link, and from gen1 to gen3 speeds. That allows it to cover a huge spectrum of power and performance tradeoffs.
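
If you want to see what link a given board actually negotiated, NVML exposes it (assuming nvml.h and the nvidia-ml library from the CUDA toolkit / deployment kit are available on your system); a rough sketch:

[code]
// Build (assumption about install paths): nvcc pcie_link.cu -lnvidia-ml
#include <cstdio>
#include <nvml.h>

int main() {
    if (nvmlInit() != NVML_SUCCESS) {
        printf("failed to initialize NVML\n");
        return 1;
    }
    unsigned int count = 0;
    nvmlDeviceGetCount(&count);
    for (unsigned int i = 0; i < count; ++i) {
        nvmlDevice_t dev;
        char name[NVML_DEVICE_NAME_BUFFER_SIZE];
        unsigned int gen = 0, width = 0, maxGen = 0, maxWidth = 0;
        nvmlDeviceGetHandleByIndex(i, &dev);
        nvmlDeviceGetName(dev, name, NVML_DEVICE_NAME_BUFFER_SIZE);
        nvmlDeviceGetCurrPcieLinkGeneration(dev, &gen);
        nvmlDeviceGetCurrPcieLinkWidth(dev, &width);
        nvmlDeviceGetMaxPcieLinkGeneration(dev, &maxGen);
        nvmlDeviceGetMaxPcieLinkWidth(dev, &maxWidth);
        printf("GPU %u (%s): running gen%u x%u, capable of gen%u x%u\n",
               i, name, gen, width, maxGen, maxWidth);
    }
    nvmlShutdown();
    return 0;
}
[/code]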

It is not clear to me whether NVLINK is designed purely as a high-performance interconnect (in which case such an interface presumably draws quite a bit of power), or whether it is intended to eventually replace PCIe across the entire non-x86 CPU landscape, in particular including all of NVIDIA’s ARM-based CPUs.

As for the PX2, I read all the reports and picked my words carefully by using the term “announcement” rather than “introduction”, where the latter would imply an actual product.

I don’t know a lot about the design targets of NVLINK, but given that I have not heard mobile mentioned so far, that will likely not be a target of v1. It could be that it’s simply not designed with low power in mind, but I guess (and hope) that such directions can be explored in later versions.

I doubt it will be as flexible as PCIe in terms of the number of links; I’ve seen claims of 20-200 GB/s, but I’m not sure whether that whole range was for v1, which may top out at 80 GB/s (at 20 GB/s per link, that would be a maximum of 4 links, or 10 links for the full 200 GB/s?).

The M40 and M60 are both available from these guys –

Michael Chen
Exxact Corporation
mchen@exxactcorp.com