GPU processing with DAW

Hello,

I’m quite interested in the potential benefits that utilizing a GPU could hold for audio engineering and DSP. I found a presentation on NVIDIA’s website explaining some of the possibilities of using a GPU to increase the performance of audio processing. The presentation I am referring to can be found here:

I took a look at NVIDIA’s CUDA toolkits, which seem like a good place to start.
My idea is to integrate their C libraries with the MaxMSP software, whose SDK is documented here:
https://cycling74.com/sdk/max-sdk-7.3.3/html/chapter_anatomy.html

Perhaps I can take a shot at creating a basic low-pass filter that is processed by the GPU. I believe the primary constraint will be the ability to process in real time.

I would like to know whether this is in any way realizable, and whether there are any current developments or libraries available in this field.

CUDA is suitable for soft real-time processing with modest demands. After all, GPUs were initially designed as accelerators for soft real-time processing applications called computer games.

The speed-of-light lower limit on time granularity is set by the fact that one can launch at most about 200,000 CUDA kernels per second, corresponding to a minimum kernel launch overhead of about 5 microseconds.
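To put that number in the context of an audio callback, here is a rough back-of-the-envelope sketch. The launch rate is the figure quoted above; the buffer size (64 samples) and sample rate (48 kHz) are assumptions picked purely for illustration, and the function names are hypothetical. It only bounds the number of launches that fit in one buffer period and ignores compute time and data transfers entirely.

```python
# Figure from the post above: roughly 200,000 kernel launches per second.
LAUNCHES_PER_SECOND = 200_000

# Minimum overhead per launch, in microseconds (1e6 / 200,000 = 5.0).
LAUNCH_OVERHEAD_US = 1e6 / LAUNCHES_PER_SECOND


def buffer_period_us(buffer_size, sample_rate):
    """Time available to process one audio buffer, in microseconds."""
    return buffer_size / sample_rate * 1e6


def launch_budget(buffer_size, sample_rate):
    """Upper bound on kernel launches per buffer period.

    Counts launch overhead only; real kernels also need time to run
    and to move data, so the practical budget is smaller.
    """
    return int(buffer_period_us(buffer_size, sample_rate) // LAUNCH_OVERHEAD_US)


if __name__ == "__main__":
    # Assumed example settings: 64-sample buffer at 48 kHz.
    period = buffer_period_us(64, 48_000)
    print(f"buffer period: {period:.1f} us")
    print(f"launch budget: {launch_budget(64, 48_000)} launches")
```

With these assumed settings the buffer period is about 1.3 milliseconds, so launch overhead alone is a small fraction of the budget for a handful of kernels, but it adds up quickly if a processing chain is split into many small launches.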

If you search for “audio processing” and “CUDA” you can find examples of how people have experimented with the use of CUDA for convolution and FFT-based audio-processing, as well as machine-learning based audio processing (e.g. voice separation), sometimes in the context of existing audio-processing frameworks.

The fact that there is no widespread GPU acceleration provided by existing audio-processing frameworks would appear to be a good indication that GPUs don’t provide clear-cut advantages over CPU-only approaches for most kinds of audio processing.

While this is outside my area of expertise, I would think that the overhead of shipping data back and forth between CPU and GPU makes use of the GPU unattractive in many cases: in the time needed to ship data around one might as well perform the necessary computations (not that many) on the CPU. Various audio-processing algorithms may also expose insufficient inherent parallelism for modern GPUs, for which parallelism on the order of 10,000 threads is desirable.
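As a small illustration of the parallelism point, consider two simple low-pass filters written as plain CPU code. This is only a sketch under my own assumptions: the filter choices, coefficient, and function names are hypothetical and not taken from any particular framework. The one-pole recursive filter needs the previous output before it can compute the next one, so its samples cannot simply be handed to independent threads. The moving-average filter computes every output from input samples alone, which is the kind of structure that maps naturally onto many threads.

```python
def one_pole_lowpass(x, a):
    """Recursive low-pass: y[n] = y[n-1] + a * (x[n] - y[n-1]).

    Each output depends on the previous output, so the loop is
    inherently sequential along the time axis.
    """
    y = []
    prev = 0.0  # assumed initial state
    for sample in x:
        prev = prev + a * (sample - prev)
        y.append(prev)
    return y


def moving_average_lowpass(x, taps):
    """Non-recursive low-pass: average of the last `taps` input samples.

    Samples before the start of the signal are treated as zero.
    Each output depends only on inputs, so every index n could be
    computed independently (for example, one GPU thread per output).
    """
    def output_at(n):
        window = x[max(0, n - taps + 1): n + 1]
        return sum(window) / taps

    return [output_at(n) for n in range(len(x))]
```

Even in the independent case, a single channel with a short buffer yields only a few dozen to a few hundred work items per callback, far short of the roughly 10,000 threads mentioned above, unless many channels or long convolutions are processed together.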

The situation may be somewhat different with NVIDIA’s integrated products (Jetson TX2, etc.), where CPU and GPU share the same physical memory.