Besides the Titan and Tesla lines, which GPUs support Hyper-Q?

Some new work necessitates Hyper-Q for multiple concurrent kernels, and I wanted to know whether the GTX 780 supports it. My understanding is that the GTX Titan and the Tesla line support it, and they are compute capability 3.5, as is the GTX 780.

Also, is the GTX 780 the cheapest non-mobile GPU with compute capability 3.5?
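One quick way to check your own card is to query the compute capability at runtime. A minimal sketch using the CUDA runtime API (Hyper-Q is a hardware feature of compute capability 3.5+ Kepler parts, so the check below is just a capability test, not an official feature query):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int d = 0; d < count; ++d) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, d);
        printf("Device %d: %s, compute capability %d.%d\n",
               d, prop.name, prop.major, prop.minor);
        // Hyper-Q requires CC 3.5 or newer (GK110/GK208 and later).
        bool hyperQ = (prop.major > 3) || (prop.major == 3 && prop.minor >= 5);
        printf("  Hyper-Q capable: %s\n", hyperQ ? "yes" : "no");
    }
    return 0;
}
```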

Some GT 640 and GT 630 models are compute capability 3.5 as well. What they have in common is that they are equipped with no more than 2 GB of GDDR5 memory.

Models with 4 GB DDR3 are currently guaranteed to be either Fermi or Compute 3.0 parts…

Take a look at this thread:

https://devtalk.nvidia.com/default/topic/599056/concurrent-kernel-and-events-on-kepler/

The interesting thing is that while the $45 GK208 GT 635 SM_35 card I bought recently (and have driving my displays) runs the simpleHyperQ example just fine on Linux, the kernels do not run concurrently in Windows. I tried setting the same CUDA_DEVICE_MAX_CONNECTIONS environment variable in Windows, but the behaviour was the same.
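For anyone else experimenting with this: CUDA_DEVICE_MAX_CONNECTIONS has to be set before the CUDA context is created, so setting it from inside the process only works if it happens before the first runtime API call. A hedged sketch (setenv is POSIX; on Windows you would use _putenv_s instead):

```cuda
#include <cstdlib>
#include <cuda_runtime.h>

int main() {
    // 32 is the documented maximum number of device connections;
    // the default is 8. Must be set before any CUDA runtime call.
    setenv("CUDA_DEVICE_MAX_CONNECTIONS", "32", 1);
    cudaFree(0);  // force context creation after the variable is set
    // ... create streams and launch kernels as usual ...
    return 0;
}
```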

Nice, I just ordered 4 of these poor GT635 OEM cards into brutal cryptocoin mining slavery.

Excellent! I’ve found that it’s nice to have a GK110 and a GK208 in the same machine.

Also, the GT635 might be a decent proxy for the forthcoming Tegra K1 (although with twice as many SMXs).

So, assuming the operating system is Ubuntu 13.04, will the GTX 780 support Hyper-Q?

If so, how many concurrent kernels?

I went back to an old AnandTech GTX Titan review where I knew I had posted a comment about simpleHyperQ. I ran the simpleHyperQ example in Windows (WDDM) when I had the GTX Titan and found that you are able to run up to 8 streams concurrently. This is also supported by this post: https://devtalk.nvidia.com/default/topic/544138/cuda-programming-and-performance/hyper-q-and-openmp-on-single-gtx-titan-gpu/post/3810998/#3810998

However, in the other thread I also tested the GTX Titan in Linux and was able to run up to 32 streams concurrently:
https://devtalk.nvidia.com/default/topic/599056/cuda-programming-and-performance/concurrent-kernel-and-events-on-kepler/post/3957491/#3957491

Further, the GT 635 GK208 card I tested under Linux is able to do 16 streams concurrently.
Edit: This same card is able to do 16 streams in Windows 7 x64. Perhaps my Windows x64 setup had some driver issues.

If I had to take a guess, I would think that the GTX 780 would support 32 concurrent streams in Linux, and possibly one of: 8, 16, or 32 concurrent streams in Windows. Would be great if someone could confirm those suspicions.
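For anyone who wants to probe the limit on their own card, here is a hedged sketch in the spirit of the simpleHyperQ sample (not the sample itself): launch one long-spinning kernel per stream and time the batch. If the kernels overlap, the total time stays close to one kernel's time; if they serialize, it grows roughly linearly with the stream count.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void spin(long long cycles) {
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

int main() {
    const int nStreams = 16;  // vary this to probe the concurrency limit
    cudaStream_t streams[nStreams];
    for (int i = 0; i < nStreams; ++i)
        cudaStreamCreate(&streams[i]);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);
    for (int i = 0; i < nStreams; ++i)
        spin<<<1, 1, 0, streams[i]>>>(10000000LL);  // one block per stream
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.f;
    cudaEventElapsedTime(&ms, start, stop);
    // Fully concurrent: elapsed ~= one kernel's runtime.
    // Fully serialized: elapsed ~= nStreams x one kernel's runtime.
    printf("elapsed: %.1f ms for %d streams\n", ms, nStreams);

    for (int i = 0; i < nStreams; ++i)
        cudaStreamDestroy(streams[i]);
    return 0;
}
```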

The GK208 under WDDM (Win7/x64) seems to work for me up to 16. No?

And the K20c with the TCC driver on Win7/x64 works all the way up to 32:

One more thought: it makes sense to support a max of only 16 streams on a single SMX, since Kepler supports only 16 resident blocks per SMX. I know the GK208 has two SMXs, but perhaps a 1:1 ratio of streams to total device blocks would be overkill on this chip. The GK208 is already ludicrously well-featured. :)

Perhaps this means that the Tegra K1 with its single SMX will support 16 streams? Now that would be awesome.

I tried the same GT 635 card under Windows 7 x64 (see post below), and I see the 16 concurrent streams. Either my driver setup under Win 8 x64 was flawed or some other issue (a driver bug?) was showing up. Here are my results on Windows 8 x64 with the TCC driver for the Quadro K6000:

Also, I didn’t see CUDA 6.0 out yet on the registered-developer site… how do you like it?

Here is a GT 635 on another Win7/x64 machine with the 332.21 driver. It seems to work fine:

Very strange… I re-checked the same GT 635 card using the same hardware under a Windows 7 x64 setup with the same 332.21 driver and I can now see the 16 concurrent streams on the GT 635 card. Perhaps my previous setup had some driver issues or the Win 8 x64 drivers have a bug. Here’s how it comes up now:

A question related to concurrent kernels launched using Hyper-Q:

It stands to reason that you would want each concurrent kernel in the launch set to operate (read and write) in its own exclusive memory space, but what if a set of concurrent kernels all want to read from the same global memory space?
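As far as I know, concurrent reads from the same global buffer need no synchronization; only the writes have to stay disjoint. A hedged sketch (the kernel name and scaling factors are made up for illustration): several kernels in different streams all read one shared input buffer while each writes its own private output buffer.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(const float* __restrict__ in, float* out,
                      float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * factor;  // shared read, private write
}

int main() {
    const int n = 1 << 20, nStreams = 4;
    float* in;
    float* out[nStreams];
    cudaMalloc(&in, n * sizeof(float));
    cudaMemset(in, 0, n * sizeof(float));

    cudaStream_t s[nStreams];
    for (int k = 0; k < nStreams; ++k) {
        cudaMalloc(&out[k], n * sizeof(float));
        cudaStreamCreate(&s[k]);
        // Every kernel reads `in`; each writes only its own out[k],
        // so no inter-stream synchronization is required.
        scale<<<(n + 255) / 256, 256, 0, s[k]>>>(in, out[k], k + 1.0f, n);
    }
    cudaDeviceSynchronize();

    for (int k = 0; k < nStreams; ++k) {
        cudaStreamDestroy(s[k]);
        cudaFree(out[k]);
    }
    cudaFree(in);
    return 0;
}
```

Marking the shared input `const __restrict__` also gives the compiler a chance to route those loads through the read-only data cache on CC 3.5 parts.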