CPU faster than CUDA

matthewaustin233 · September 5, 2020, 3:02am

Hello all.

I’m starting to look into parallel processing with CUDA. I’ve been doing most of my programming in 3D modelling software up to this point, but really want access to the CUDA library to speed things up. I’ve installed it and I am getting the unexpected result that the example file runs faster without the parallel version. For example, the attached code returns:
VectorAdd took for 0.7805259227752686 seconds
VectorAdd took for 1.8527252674102783 seconds

I know this is probably a fairly dumb question, but any help would be appreciated. The GPU is a Quadro P4000.

import numpy as np
import time

from numba import vectorize, cuda

def VectorAddCPU(a, b):
    return a + b

@vectorize(['float32(float32, float32)'], target='cuda')
def VectorAddGPU(a, b):
    return a + b

def main():
    N = 320000000

    A = np.ones(N, dtype=np.float32)
    B = np.ones(N, dtype=np.float32)

    # Time the NumPy addition on the CPU.
    start = time.time()
    C = VectorAddCPU(A, B)
    vector_add_time = time.time() - start
    print("VectorAdd took for %s seconds" % vector_add_time)

    # Time the GPU ufunc; this includes copying A and B to the device and C back.
    start = time.time()
    C = VectorAddGPU(A, B)
    vector_add_time = time.time() - start
    print("VectorAdd took for %s seconds" % vector_add_time)

if __name__=='__main__':
    main()

Marco13 · September 6, 2020, 5:32pm

Short answer:

GPUs are fast for computations. But a vector addition mainly consists of reading and writing data (with a cheap addition in-between). Instead of doing a trivial a+b, try doing something like cos(a)*sin(a)*cos(b)*sin(b) (this doesn’t make sense - it’s just to create an “artificial” computation workload). The GPU will most likely be faster then.

Long answer: The introductory part of what I wrote here.
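
The artificial workload suggested in the short answer could be tried with something like the sketch below. The names `heavy`, `make_ufunc` and `compare` are made up for illustration, the array size is arbitrary, and the result depends on the hardware, so treat it as an experiment to run rather than a prediction.

```python
import math
import time

import numpy as np


def heavy(a, b):
    # Artificial workload from the post: four transcendental calls per element
    # instead of a single addition.
    return math.cos(a) * math.sin(a) * math.cos(b) * math.sin(b)


def make_ufunc(target):
    # Numba is imported here so that heavy() can be checked without it.
    # target is 'cpu' or 'cuda', as in the original example.
    from numba import vectorize
    return vectorize(['float32(float32, float32)'], target=target)(heavy)


def compare(n=10_000_000):
    # Hypothetical helper: times the same scalar function on both targets.
    a = np.random.rand(n).astype(np.float32)
    b = np.random.rand(n).astype(np.float32)
    for target in ('cpu', 'cuda'):
        f = make_ufunc(target)
        f(a[:16], b[:16])  # small warm-up call before timing
        start = time.perf_counter()
        f(a, b)
        print("%s: %.3f seconds" % (target, time.perf_counter() - start))
```

Calling `compare()` on a machine with a CUDA-capable GPU prints one timing per target. The GPU timing still includes the host-to-device and device-to-host copies, so the point is only that more arithmetic per element gives the GPU more work to offset that cost.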

njuffa · September 6, 2020, 6:39pm

GPUs are not only fast for computation. They are also, in many cases, fast for data movement.

Issues of “CPU faster than GPU” in my experience boil down to these major categories:

(1) Comparing high(er)-end CPU to low(er)-end GPU. The performance spectrum differs by a factor of ten on both sides of the computing universe. The average speedup on the GPU across many use cases for equally well-optimized code on high-end equipment is around 5x, with a typical range of 2x to 10x.

(2) Too much data movement between CPU and GPU (host and device). The CPU might have a memory bandwidth of 100 GB/sec and the GPU one of 500 GB/sec, but they are currently most often connected by a PCIe gen3 link with at most 12 GB/sec per direction.

(3) Memory bandwidth limited computation on small datasets. If the use case makes good use of CPU caches with their extremely high bandwidth, the memory bandwidth advantage of the GPU is nullified.

(4) Use cases that are a poor fit for present GPU architectures: Code with lots of data-dependent branches or “random” memory access patterns. Latency-dependent computations. As GPU architectures have become more flexible with every new generation, the universe of “unsuitable” use cases continues to shrink, so when in doubt, try it out.

In this case I would guess we are looking at an instance of either (2) or (3).
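
To put rough numbers on category (2) for the code in the first post, here is a back-of-envelope estimate. It assumes that the two input arrays are copied to the GPU and the result is copied back on every call, that the copies run one after another at the 12 GB/sec upper bound quoted above, and that the GPU reaches the illustrative 500 GB/sec figure. Real transfers will usually be slower than this.

```python
N = 320_000_000            # elements per array, from the original post
BYTES_PER_FLOAT32 = 4
PCIE_GB_PER_S = 12.0       # per direction, upper bound quoted above
GPU_GB_PER_S = 500.0       # illustrative GPU memory bandwidth quoted above

array_gb = N * BYTES_PER_FLOAT32 / 1e9           # size of one array in GB

to_device_s = 2 * array_gb / PCIE_GB_PER_S       # copy A and B to the GPU
to_host_s = array_gb / PCIE_GB_PER_S             # copy C back to the host
transfer_s = to_device_s + to_host_s

# On the GPU the kernel reads A and B and writes C: three arrays of traffic.
kernel_s = 3 * array_gb / GPU_GB_PER_S

print("one array:        %.2f GB" % array_gb)
print("PCIe transfers:   %.2f s (best case)" % transfer_s)
print("GPU memory time:  %.4f s (best case)" % kernel_s)
```

Under these assumptions each array is 1.28 GB and the transfers alone take at least about 0.32 seconds, while the addition itself would need under 0.01 seconds of GPU memory time. The copy over PCIe, not the computation, dominates the GPU measurement.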
