Dynamic functions; Expression tree or other alternatives?


For the project I’m currently working on, I would like to be able to perform multiple different calculations without prior knowledge of what to calculate. The formula the kernel should execute can contain the basic arithmetic operators, the input variables are always the same, i.e. x[n] and y[n], and each thread performs one single calculation.
So at one time a kernel needs to execute [font=“Courier New”]output[i] = x[i] / (2 - y[i])[/font], while the next time it could be [font=“Courier New”]output[i] = 15.5 + y[i] - (x[i] + 0.5)[/font].

These calculations will come from another program (a statistical package) which I have linked to CUDA, and now I’m wondering whether this can be implemented. I thought the most obvious way would be to build an expression tree (like this one), but that might be quite some work to evaluate without recursion, since I’m far from being a programming expert…
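To make the recursion concern concrete, here is a rough host-side sketch of such a tree (the node layout and names are my own assumptions, not anything CUDA-specific). Recursive evaluation like this is trivial on the CPU, but a kernel would have to flatten it into an iterative form:

```cpp
#include <cassert>

// Hypothetical expression-tree node: a leaf holds either a constant or
// an input variable ('x'/'y'); an inner node holds a binary operator.
struct Node {
    char op;        // '+', '-', '*', '/' for inner nodes, 0 for leaves
    float value;    // constant value, used when var == 0
    char var;       // 'x' or 'y' for variable leaves, 0 otherwise
    Node *left, *right;
};

// Recursive evaluation: natural on the CPU, awkward in a kernel, which
// is why flattening the tree to a linear opcode list is attractive.
float eval(const Node* n, float x, float y) {
    if (!n->op)
        return n->var == 'x' ? x : n->var == 'y' ? y : n->value;
    float l = eval(n->left, x, y);
    float r = eval(n->right, x, y);
    switch (n->op) {
        case '+': return l + r;
        case '-': return l - r;
        case '*': return l * r;
        default:  return l / r;
    }
}
```

For example, the tree for [font=“Courier New”]x / (2 - y)[/font] is a ‘/’ node whose right child is a ‘-’ node over the constant 2 and the variable y.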

The other option I thought about was using simple regular expressions to alter a kernel template and include it at runtime via the driver API (compiling to a cubin file?). Although I have read that this is possible, I don’t have a clue yet how to actually make it work (the regular expressions themselves are no problem).
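A minimal sketch of what I mean by the template idea (the placeholder name and kernel signature are just assumptions): substitute the formula string into a kernel source template, which would then be written to a .cu file, compiled offline to a cubin (e.g. nvcc -cubin), and loaded with the driver API’s cuModuleLoad/cuModuleGetFunction:

```cpp
#include <cassert>
#include <string>

// Hypothetical kernel template; EXPR is the placeholder that the
// formula coming from the statistical package gets substituted into.
const std::string kernelTemplate =
    "extern \"C\" __global__ void calc(float* output, const float* x,\n"
    "                                  const float* y, int n) {\n"
    "    int i = blockIdx.x * blockDim.x + threadIdx.x;\n"
    "    if (i < n) output[i] = EXPR;\n"
    "}\n";

// Substitute the formula for the placeholder. A real implementation
// would then compile the resulting source to a cubin and load it at
// runtime through the driver API.
std::string buildKernel(const std::string& expr) {
    std::string src = kernelTemplate;
    src.replace(src.find("EXPR"), 4, expr);
    return src;
}
```

So [font=“Courier New”]buildKernel("x[i] / (2.0f - y[i])")[/font] would yield a complete, compilable kernel for that one formula.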

Any help or advice is highly appreciated!

  • Marcel.

Just my first thought: parse the expression on the CPU and reorder the computation into a stack-based evaluation, like reverse Polish notation.
So you’d change [font=“Courier New”]x/(2-y)[/font] into the operations

push x
push 2
push y
do -
do /

Such a simple list can then be evaluated by the GPU without recursion or difficult parsing. If you know the stack has limited depth you can keep it in registers; even an arbitrary-depth stack would work OK using shared memory, though that may limit your thread count.
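The stack evaluation above can be sketched like this (host-side C++ here, but the loop body is exactly what each GPU thread would run over the instruction array; opcode names and the fixed stack depth are my own choices):

```cpp
#include <cassert>
#include <vector>

// Opcode list produced on the CPU from the parsed formula. PUSH_X and
// PUSH_Y read the per-thread inputs, PUSH_CONST carries a literal, and
// the operator opcodes pop two values and push the result.
enum Op { PUSH_X, PUSH_Y, PUSH_CONST, ADD, SUB, MUL, DIV };
struct Instr { Op op; float value; };

// Stack-based evaluation: no recursion, no parsing on the device.
float evalRpn(const std::vector<Instr>& prog, float x, float y) {
    float stack[16];  // fixed depth; in a kernel this could live in registers or shared memory
    int top = 0;
    for (const Instr& ins : prog) {
        switch (ins.op) {
            case PUSH_X:     stack[top++] = x;         break;
            case PUSH_Y:     stack[top++] = y;         break;
            case PUSH_CONST: stack[top++] = ins.value; break;
            default: {
                float b = stack[--top];  // pop right operand first
                float a = stack[--top];
                switch (ins.op) {
                    case ADD: stack[top++] = a + b; break;
                    case SUB: stack[top++] = a - b; break;
                    case MUL: stack[top++] = a * b; break;
                    default:  stack[top++] = a / b; break;
                }
            }
        }
    }
    return stack[0];
}
```

With the program {push x, push 2, push y, -, /} from above, this computes x/(2-y). The instruction array is the same for every thread, so it could sit in constant memory while each thread applies it to its own x[i] and y[i].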