Python programming - Performance P4000

Hi Gang,

Sorry, I am new to this area, but I am looking for some advice on what hardware to purchase. I work in finance, and our Quants team currently does a lot of programming in Python. Some of their code/jobs take 40 hours to complete. That’s on hardware with 32 GB of RAM and a Xeon E5-1603 processor.

I have done some reading on utilizing the GPU and wondered if you could enlighten me and help manage my expectations. Is it really worthwhile?
We are looking at an NVIDIA Quadro P4000.
Any advice would be beneficial.

A GPU won’t speed up ordinary Python code.

Your developers would need to use some sort of CUDA-enabled technology for Python to take advantage of CUDA GPUs.
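To give a rough idea of what "CUDA-enabled" Python can look like, here is a minimal sketch using CuPy, a GPU array library with a NumPy-like interface. It assumes CuPy may or may not be installed and falls back to NumPy on the CPU if no usable GPU is found. The function name, the parameters, and the Monte Carlo workload are all hypothetical stand-ins, not your team's actual jobs.

```python
import math

import numpy as np

# Assumption: CuPy is optional. If it is missing, or no CUDA device is
# usable, the same code runs on the CPU through NumPy instead.
try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device
    xp = cp
except Exception:
    xp = np


def simulate_terminal_prices(n_paths, s0=100.0, mu=0.05, sigma=0.2, t=1.0, seed=0):
    """Mean terminal price under geometric Brownian motion (illustrative only)."""
    xp.random.seed(seed)
    z = xp.random.standard_normal(n_paths)  # one normal draw per path
    drift = (mu - 0.5 * sigma ** 2) * t
    diffusion = sigma * math.sqrt(t) * z
    terminal = s0 * xp.exp(drift + diffusion)  # computed on GPU if xp is CuPy
    return float(terminal.mean())


mean_price = simulate_terminal_prices(200_000)
print("array backend:", xp.__name__)
print("mean terminal price:", mean_price)
```

The point is that only array-style code like this moves to the GPU easily. Plain Python loops over individual values would need to be rewritten first.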

Thanks for your reply,

Can we use the CUDA toolkit for that?


We currently use Anaconda, and as far as I can tell, that is compatible with CUDA.

I haven’t programmed in Python in years, but that could be your performance problem right there. Does Python make good use of multithreading, and use AVX2 SIMD effectively? To me, high-performance computing on CPUs involves C++ (or even just C) and the latest Intel compiler, possibly augmented by hand vectorizing critical code with SIMD intrinsics.
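As a rough illustration of the question above, the sketch below compares an interpreted Python loop with the same computation handed to NumPy, whose compiled routines can use SIMD where the build supports it. The data and function names are hypothetical, and the timings will vary by machine, so treat this as a way to check your own code rather than a benchmark.

```python
import time

import numpy as np


def sum_of_squares_loop(values):
    """Interpreted loop: one Python-level multiply and add per element."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_numpy(values):
    """Same result, computed inside NumPy's compiled code."""
    return float(np.dot(values, values))


data = np.random.default_rng(0).standard_normal(200_000)

start = time.perf_counter()
loop_result = sum_of_squares_loop(data)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
numpy_result = sum_of_squares_numpy(data)
numpy_seconds = time.perf_counter() - start

print(f"loop:  {loop_result:.6f} in {loop_seconds:.4f} s")
print(f"numpy: {numpy_result:.6f} in {numpy_seconds:.4f} s")
```

If most of the 40 hours is spent in loops like the first function, rewriting them in array form may help on the existing CPU before any new hardware is bought.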

I would assume a Xeon E5-1603v4? That would be a quad-core processor running at 2.8 GHz. Not exactly a speed demon. Quad-core Xeon processors currently reach as high as 4.1 GHz base, 4.5 GHz boost clock. 32 GB of RAM isn’t much, but it is unclear whether your computations would benefit from more RAM or maybe faster RAM (multi-channel DDR4).

Would adding a Quadro help? Maybe. Would it be the most cost-effective way to boost performance? Maybe. It all depends on what exactly you are computing and how effectively Anaconda would let you use the GPU. I also note that a Quadro P4000 is a mid-range GPU, not high-end.

What does profiling the application tell you about the bottlenecks in the app(s)? With many unknowns at present, it would probably be best to do an exploratory prototype project using a consumer-grade GPU (cheaper than equivalent Quadro, and likely sufficient for your needs) with Anaconda, and see whether that provides cost-effective acceleration.
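If the team has not profiled yet, a minimal sketch with Python's built-in cProfile module might look like the following. The functions `slow_step`, `fast_step`, and `run_job` are hypothetical stand-ins for a real job. The report lists where the time goes, which is what you need to know before deciding whether a GPU could help.

```python
import cProfile
import io
import pstats


def slow_step(n):
    """Hypothetical hot spot: sums squares with an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_step(n):
    """Same sum of squares via the closed-form formula."""
    return (n - 1) * n * (2 * n - 1) // 6


def run_job(n=200_000):
    """Hypothetical job made of one slow and one fast step."""
    return slow_step(n), fast_step(n)


profiler = cProfile.Profile()
profiler.enable()
result = run_job()
profiler.disable()

# Sort by cumulative time and keep the top entries for a short report.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

For a whole script, running `python -m cProfile -s cumulative your_script.py` gives a similar report without code changes (here `your_script.py` is a placeholder name).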

You might also want to consider / explore the use of cloud provider instances as a cost-effective way to do your computations, rather than buying hardware.