CUDA alternative to MATLAB lsqnonlin

I have a MATLAB script that calls the lsqnonlin function many times, with each call independent of the others (think of a simulation). Unfortunately, the function is GPU-incompatible. I am not sharing code because only the mentioned function is relevant to the question.

Minimizing a sum of squared residuals (what lsqnonlin does) is a pretty standard problem from a mathematical point of view, and I am surprised that I cannot find such a function among the CUDA libraries. For example, cuSOLVER supports only linear algebra, but I (and many others) need nonlinear operations.
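To make the problem concrete, here is a minimal sketch of the kind of computation involved: many independent small nonlinear least-squares fits, each solved with a basic Levenberg-Marquardt iteration (one of the algorithms lsqnonlin offers). Everything in it is an assumption for illustration: the toy model y = a*exp(b*t), the function names, and the damping schedule are hypothetical, and this is plain CPU Python, not GPU code and not MATLAB's implementation. The point is only that each fit is independent of the others, which is what one would hope to map onto GPU threads or blocks.

```python
import math

def residuals(p, t, y):
    """Residuals r_i = a*exp(b*t_i) - y_i for the (hypothetical) toy model."""
    a, b = p
    return [a * math.exp(b * ti) - yi for ti, yi in zip(t, y)]

def jacobian(p, t):
    """Rows (dr_i/da, dr_i/db) of the Jacobian of the residuals."""
    a, b = p
    return [(math.exp(b * ti), a * ti * math.exp(b * ti)) for ti in t]

def levenberg_marquardt(p0, t, y, max_iter=200, step_tol=1e-12):
    """Minimize sum(r_i^2) over p = (a, b) with a basic damped Gauss-Newton loop."""
    p = list(p0)
    lam = 1e-3  # damping parameter, adapted after every trial step
    r = residuals(p, t, y)
    cost = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        J = jacobian(p, t)
        # Normal equations: A = J^T J (2x2, symmetric), g = J^T r.
        a11 = sum(j0 * j0 for j0, _ in J)
        a12 = sum(j0 * j1 for j0, j1 in J)
        a22 = sum(j1 * j1 for _, j1 in J)
        g1 = sum(j0 * ri for (j0, _), ri in zip(J, r))
        g2 = sum(j1 * ri for (_, j1), ri in zip(J, r))
        # Solve (A + lam*diag(A)) s = -g in closed form for the 2x2 case.
        d11, d22 = a11 * (1.0 + lam), a22 * (1.0 + lam)
        det = d11 * d22 - a12 * a12
        if det == 0.0:
            break
        s1 = (-g1 * d22 + g2 * a12) / det
        s2 = (-g2 * d11 + g1 * a12) / det
        if abs(s1) + abs(s2) < step_tol:
            break  # step is negligible: treat as converged
        trial = [p[0] + s1, p[1] + s2]
        try:
            r_new = residuals(trial, t, y)
            new_cost = sum(ri * ri for ri in r_new)
        except OverflowError:
            new_cost = math.inf  # treat an overflowing trial as a rejected step
        if new_cost < cost:
            p, r, cost = trial, r_new, new_cost
            lam = max(lam * 0.1, 1e-12)  # accepted: move toward Gauss-Newton
        else:
            lam *= 10.0  # rejected: move toward small gradient-descent steps
    return p, cost

def fit_batch(problems, p0=(1.0, 0.0)):
    """Solve each (t, y) problem independently; this loop is the parallel part."""
    return [levenberg_marquardt(p0, t, y) for t, y in problems]

# Synthetic, noise-free data for three independent problems.
t = [i / 10 for i in range(11)]
true_params = [(2.0, -1.5), (0.5, 1.0), (3.0, -0.2)]
problems = [(t, [a * math.exp(b * ti) for ti in t]) for a, b in true_params]
fits = fit_batch(problems)
for (a, b), (p, cost) in zip(true_params, fits):
    print(f"true=({a}, {b})  fitted=({p[0]:.6f}, {p[1]:.6f})  cost={cost:.3e}")
```

The loop in `fit_batch` has no dependencies between iterations, so in principle each problem could be handed to its own thread or block. The data-dependent accept/reject branching and the varying iteration counts per problem are the parts that would not map as neatly.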

So I kindly ask for advice on a GPU-compatible function that does nonlinear minimization. Am I wrong, and is there such a function in the CUDA libraries after all? If not, could you please recommend one, either as a third-party CUDA library or in another language such as Python, R, etc.? Thank you very much in advance.

The fact that MATLAB’s lsqnonlin does not support GPU arrays suggests that this functionality is either (1) a difficult target for massive parallelism or (2) not in particularly high demand, or both. I certainly have never come across it, nor am I aware of a drop-in replacement in Python running on CPUs (which does not mean such a thing doesn’t exist, as it is impossible to keep track of everything happening in the Python universe).

I am not aware of any NVIDIA-provided library that implements the functionality. As the development of CUDA-based libraries is largely driven by customer demand, you could always file a bug report with NVIDIA in the form of a feature request. It will help with prioritization if you provide ample motivation why and where this is needed.

I see that GitHub has the following project (last worked on in 2019), which you might want to check out as a potential starting point for your work:

This indeed looks like a suitable starting point to research.
Thank you very much!

Nice find @njuffa. Too bad it’s GPL, kind of a non-starter for me.