We have processes that use both the CPU and an NVIDIA GPU (CUDA). For simplicity, assume the process is one large function, function(X), called once per input. Right now it goes like this:
```python
inputs = [X1, X2, X3, ...]
for X in inputs:
    output = function(X)
```
We need to parallelize this so that function(X1), function(X2), function(X3), and so on run simultaneously, instead of function(X2) waiting for function(X1) to finish as it does now in the for loop.
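A minimal sketch of what this could look like with the standard library's concurrent.futures, assuming function and each input are picklable (the names function and inputs just mirror the pseudocode above):

```python
# Hypothetical sketch: fan the inputs out across a process pool.
# Assumes `function` and each input are picklable.
from concurrent.futures import ProcessPoolExecutor

def function(x):
    return x  # placeholder for the real text/image processing

if __name__ == "__main__":
    inputs = ["X1", "X2", "X3"]
    # max_workers defaults to os.cpu_count(), so it scales with the machine
    with ProcessPoolExecutor() as pool:
        outputs = list(pool.map(function, inputs))
```

pool.map preserves input order; if results should be consumed as they finish instead, submit plus as_completed works too.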
function(X) processes both text and images, and it will evolve over time. Currently no AI is applied to the images; AI runs only on the text.
The key is using the machine's full computing power (RAM/GPU) each time. Inputs will scale, and adjusting parallelization parameters by hand for every run is not sustainable.
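One way to avoid retuning by hand is to size the pool from the machine's resources at startup. A rough sketch, where psutil is a third-party assumption and EST_RAM_PER_TASK is a made-up per-task memory estimate:

```python
# Hypothetical sketch: derive the worker count from CPU count and
# available RAM instead of hard-coding it. `psutil` is third-party;
# EST_RAM_PER_TASK is an assumed per-task memory budget.
import os
import psutil

EST_RAM_PER_TASK = 2 * 1024**3  # assume ~2 GiB per task; tune for your workload

def worker_count():
    by_cpu = os.cpu_count() or 1
    by_ram = max(1, psutil.virtual_memory().available // EST_RAM_PER_TASK)
    return min(by_cpu, by_ram)
```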
System Specs:
- Ubuntu/Linux
- Python 3.8
- CUDA + TensorFlow are being used (see the GPU-sharing sketch after this list)
- It is not a certainty that CUDA will always be used
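On the CUDA + TensorFlow point: by default TensorFlow can claim nearly all GPU memory in each process, which breaks GPU sharing across parallel workers. A sketch of how each worker could opt into on-demand allocation using the public tf.config API (limit_gpu_memory is a name I made up):

```python
# Hypothetical sketch: let TensorFlow allocate GPU memory on demand so
# several worker processes can share one CUDA device. Call this before
# any TF ops run in the worker.
import tensorflow as tf

def limit_gpu_memory():
    for gpu in tf.config.list_physical_devices("GPU"):
        tf.config.experimental.set_memory_growth(gpu, True)
```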
I don't want to rely on a cloud-based, event-driven service for this; I want to do it natively on the OS. The approaches below are being considered, but advice would be appreciated:
* Multiprocessing option: *Multiprocessing vs. Threading in Python: What you need to know*
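For reference, both pool types in concurrent.futures share one interface, so the multiprocessing-vs-threading decision can be a one-line swap; a sketch (run_parallel and cpu_bound are names I made up):

```python
# Hypothetical sketch: swap between processes and threads behind one
# interface. Threads suit I/O-bound work (the GIL serializes Python
# bytecode); processes suit CPU-bound work like image processing.
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def run_parallel(fn, items, cpu_bound=True):
    Pool = ProcessPoolExecutor if cpu_bound else ThreadPoolExecutor
    with Pool() as pool:
        return list(pool.map(fn, items))
```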