I’m starting to look into parallel processing with CUDA. Up to this point I have done most of my programming in 3D modelling software, but I really want access to the CUDA library to speed things up. I’ve installed it, but I am getting the unexpected result that the example file runs faster on the CPU than on the GPU. For example, the attached code returns the following (CPU version first, then GPU version):
VectorAdd took for 0.7805259227752686 seconds
VectorAdd took for 1.8527252674102783 seconds
I know this is probably a fairly dumb question, but any help would be appreciated. The GPU is a Quadro P4000.
    import numpy as np
    import time
    from numba import vectorize, cuda

    # Plain NumPy addition, runs on the CPU.
    def VectorAddCPU(a, b):
        return a + b

    # Element-wise addition compiled by Numba as a CUDA ufunc.
    @vectorize(['float32(float32, float32)'], target='cuda')
    def VectorAddGPU(a, b):
        return a + b

    def main():
        N = 320000000
        A = np.ones(N, dtype=np.float32)
        B = np.ones(N, dtype=np.float32)

        # Time the CPU version.
        start = time.time()
        C = VectorAddCPU(A, B)
        vector_add_time = time.time() - start
        # "%s" (not "% s") so the word "seconds" is printed in full.
        print("VectorAdd took for %s seconds" % vector_add_time)

        # Time the GPU version.
        start = time.time()
        C = VectorAddGPU(A, B)
        vector_add_time = time.time() - start
        print("VectorAdd took for %s seconds" % vector_add_time)

    if __name__ == '__main__':
        main()
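Because each version above is timed with a single call, any one-time cost is included in the measurement. For the GPU version this probably includes copying A and B to the device and copying the result back. Below is a minimal sketch of a slightly more careful timing helper. The name `time_call` and its `warmup` and `repeats` parameters are hypothetical and not part of Numba or NumPy. It runs the function once untimed, then reports the best of a few timed runs. It does not remove the host-to-device copies, so it only separates first-call overhead from steady-state time.

```python
import time


def time_call(fn, *args, warmup=1, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls of fn(*args).

    Hypothetical helper: `warmup` untimed calls run first so that one-time
    start-up cost is not counted in the reported time.
    """
    for _ in range(warmup):
        fn(*args)

    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without a GPU.
    elapsed = time_call(sum, range(1_000_000))
    print("sum took %s seconds" % elapsed)
```

In the script above, this could be called as `time_call(VectorAddCPU, A, B)` and `time_call(VectorAddGPU, A, B)`, assuming both functions are defined as in the original code.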