Help needed with direct sparse linear equations solver (SuperLU)

I am currently developing process modelling, simulation and optimization software, and I have started porting the SuperLU direct sparse linear equation solver to CUDA. It works in certain cases, but there are a few issues with synchronization between threads. I am looking for people who are good at CUDA programming and interested in helping to resolve these issues, so I guess this is a good place to ask. More information and the source can be found at the DAE Tools project's SuperLU_CUDA port.

Cheers

You seem to be working on the assumption that the issues you are having are resolvable (or at least resolvable in a way that gives useful performance). I suspect that they are not.

Indeed, the issue with synchronization between threads in a warp could be unsolvable. The one with synchronization between blocks might be. Perhaps.