What can't you do in CUDA that you'd like? Requests for the future

peastman · April 14, 2009, 10:41pm

I’m also having trouble distinguishing between what features I want and what I want to accomplish. To the extent that CUDA is Turing complete, there’s nothing I can’t do in it. All requests ultimately come down to either “Make it easier to do” or “Make it faster.”

That said, there are many algorithms that don’t map well to CUDA. There are several reasons for this.

1. The algorithm may be inherently serial, so that it can’t be split into many parallel threads.
2. The algorithm may be parallelizable, but not to the extent needed to get good performance with CUDA (thousands of parallel threads).
3. The algorithm may require global communication between threads: it can’t be split into independent blocks of threads with no interaction between blocks.
4. Work may be generated in increments that are too small to efficiently use the GPU (e.g. a server application where lots of small jobs are received asynchronously).
5. The algorithm’s memory access patterns may map poorly to the GPU’s memory architecture (e.g. lots of random reads and writes that can’t be coalesced).
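
To make the memory access point concrete, here is a minimal sketch contrasting a coalesced copy with a gather through an index array. The kernel names and the `runGather` host helper are made up for illustration, and error checking is left out to keep it short.

```cuda
#include <cuda_runtime.h>

// Consecutive threads touch consecutive addresses, so the loads and
// stores of a warp can be combined into a few wide memory transactions.
__global__ void copyCoalesced(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];
}

// Each thread reads through an index array. If idx is effectively
// random, the loads of a warp scatter across memory and cannot be
// coalesced, even though the stores to out[i] still can.
__global__ void copyGather(const float* in, const int* idx, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[idx[i]];
}

// Hypothetical host helper: runs copyGather on n elements and copies
// the result back. Assumes every idx[i] is in [0, n). No error checking.
void runGather(const float* hostIn, const int* hostIdx, float* hostOut, int n) {
    float *dIn, *dOut;
    int *dIdx;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dOut, n * sizeof(float));
    cudaMalloc(&dIdx, n * sizeof(int));
    cudaMemcpy(dIn, hostIn, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dIdx, hostIdx, n * sizeof(int), cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    copyGather<<<blocks, threads>>>(dIn, dIdx, dOut, n);

    // This copy waits for the kernel to finish before reading dOut.
    cudaMemcpy(hostOut, dOut, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    cudaFree(dIdx);
}
```

Both kernels move the same amount of data. Any difference in run time comes from how the addresses are distributed within each warp, which is why the gather version is the one to profile when an algorithm's reads are data dependent.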

Reasons 2, 3, and 5 cause the most problems for me. I can suggest lots of specific features to address them. Examples include:

A lightweight global thread barrier for synchronization between blocks.
Atomic operations for types other than ints (especially floats).
The ability to run multiple kernels at once, with each one using a subset of the SMs.
Reducing the cost of global memory access.
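
As an illustration of the float atomics item, one common workaround on hardware without a native float `atomicAdd` is a compare-and-swap loop over the value's raw 32 bits. The sketch below assumes a device that supports the integer `atomicCAS` on global memory. The names `atomicAddFloat`, `sumKernel`, and `sumOnDevice` are hypothetical, and error checking is omitted.

```cuda
#include <cuda_runtime.h>

// Emulated float atomic add (hypothetical helper, not a built-in).
// Reinterprets the float as an int and retries atomicCAS until no
// other thread has changed the value between our read and our write.
__device__ float atomicAddFloat(float* address, float value) {
    int* bits = reinterpret_cast<int*>(address);
    int old = *bits;
    int assumed;
    do {
        assumed = old;
        float updated = __int_as_float(assumed) + value;
        old = atomicCAS(bits, assumed, __float_as_int(updated));
    } while (assumed != old);  // compare as ints, so NaN cannot spin forever
    return __int_as_float(old);  // value before this thread's add
}

// Every thread adds one element into a single global accumulator.
__global__ void sumKernel(const float* in, float* total, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomicAddFloat(total, in[i]);
}

// Hypothetical host helper: sums n floats on the device.
float sumOnDevice(const float* hostIn, int n) {
    float *dIn, *dTotal;
    float result = 0.0f;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dTotal, sizeof(float));
    cudaMemcpy(dIn, hostIn, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(dTotal, 0, sizeof(float));  // all-zero bits is 0.0f

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    sumKernel<<<blocks, threads>>>(dIn, dTotal, n);

    cudaMemcpy(&result, dTotal, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dTotal);
    return result;
}
```

This works, but every thread contends for the same address, so the loop serializes under load. It shows why a hardware float atomic would help rather than serving as a replacement for one. Note too that the order of the additions is not fixed, so results can differ in the last bits from run to run when the inputs are not exactly representable sums.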

But ultimately what it comes down to is that some of the computations I need to do can’t be done efficiently on the GPU, and I’ll take any change that makes them more efficient.

Peter
