.NET, task-parallel OpenCL question


Please excuse any ignorance on my part – I’ve read through a lot of this and other forums and I still have some questions. I’d appreciate any thoughtful tips on my situation.

I’m an artificial intelligence researcher, and we use some non-mainstream algorithms under development in our lab. I’ve implemented thread-parallel code in .NET and I can make use of multiple processors that way, but I haven’t done any OpenCL work. I’d like to know if the platform is capable of doing what I’d like, or if there is another recommendation.

We are doing computations that operate on data represented in memory as a collection of graphs and lists of rather complicated data structures. There are lots of pointers between elements in our data. We aren’t doing matrix manipulation, and I don’t think some convoluted mapping of our data into a matrix form is possible or reasonable at all. I’d like to do general data reads on the objects in this graph in order to compute some output values.

I want each parallel task to be doing a different computation, but involving the same input data.

Can I perform this kind of general computation across multiple GPU cores using OpenCL? My data currently is in .NET; do I have to do all the computation in C99, or is there a way to do some .NET operations right on the cores? (That sounds a bit crazy, I realize… just checking.) I really need to do all sorts of pointer lookups to find stuff in my data, and do some elementary loops, generate some random numbers, lots of if statements, etc. to compute the outputs.

My synchronization requirements are rather small. I think I can get by with just doing data reads, and recording all the output data in a separate place for each thread. Ideally, I’d be able to put some locks on the input data and do limited synchronization, but I think that’s asking too much. I’m happy to just make the shared data read-only and deal with synchronization off-line.

To be clear, see the pseudocode below of my dream-algorithm. Can anybody help me figure out if this can work on OpenCL or otherwise? “data” is a big collection of objects, etc, as I said, and each “MyTask” is a general bunch of fairly complicated code. There might be 20 or 30 different kinds of “MyTask” functions to be called, and each one has a bunch of input parameters in addition to needing to read arbitrary elements of the big “data” object.

Thanks so much!



// C# pseudocode…

MyHugeComplicatedDataStructure data;
bool problemSolved;

do {
    // thread-safe collection (System.Collections.Concurrent), since tasks add results in parallel
    var results = new ConcurrentBag<ComputationResult>();
    List<MyTask> bigTaskList = GenerateComputationTasks(data);

    // something like this… I know the OpenCL setup is different, obviously
    Parallel.ForEach(bigTaskList, task => {
        // the elements of "data" can be considered read-only if necessary,
        // although synchronization and modifiability of data would be even more awesome
        ComputationResult result = task.runComputation(data);
        results.Add(result);
    });

    ApplyComputationResultsToData(results, data); // we won’t need to do this part if there is synchronization of “data” provided above.
    problemSolved = DidWeFinishYet(data);
} while (!problemSolved);


Well, huge, pointer-filled data structures are generally a bad idea for OpenCL. Can you flatten them to something pointerless?
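To make "pointerless" concrete, here is a minimal sketch in plain C of one common way to flatten a graph: replace each pointer with an integer index into flat arrays, so the whole structure can be copied into device buffers as-is. The `Node` and `FlatGraph` types and the `flatten` and `neighbour_sum` helpers are hypothetical stand-ins, not anything from the original data structures, and the sketch assumes each node already knows its own position (`id`) in the node array.

```c
#include <stdlib.h>

/* Hypothetical pointer-based node, standing in for the real structures. */
typedef struct Node {
    float value;
    int n_edges;
    struct Node **edges;  /* pointers to neighbouring nodes */
    int id;               /* this node's position in the node array (assumed pre-assigned) */
} Node;

/* Pointer-free form: the edges of node i are
   edge_dst[edge_start[i]] .. edge_dst[edge_start[i + 1] - 1]. */
typedef struct {
    int n_nodes;
    float *value;      /* n_nodes entries */
    int *edge_start;   /* n_nodes + 1 entries */
    int *edge_dst;     /* one entry per edge: index of the target node */
} FlatGraph;

/* Copy a pointer-based graph into index arrays. Error checks on malloc
   are omitted to keep the sketch short. */
FlatGraph flatten(const Node *nodes, int n_nodes) {
    FlatGraph g;
    int total = 0;
    for (int i = 0; i < n_nodes; i++)
        total += nodes[i].n_edges;

    g.n_nodes = n_nodes;
    g.value = malloc((size_t)n_nodes * sizeof *g.value);
    g.edge_start = malloc((size_t)(n_nodes + 1) * sizeof *g.edge_start);
    g.edge_dst = malloc((size_t)(total > 0 ? total : 1) * sizeof *g.edge_dst);

    int k = 0;
    for (int i = 0; i < n_nodes; i++) {
        g.value[i] = nodes[i].value;
        g.edge_start[i] = k;
        for (int e = 0; e < nodes[i].n_edges; e++)
            g.edge_dst[k++] = nodes[i].edges[e]->id;  /* pointer becomes index */
    }
    g.edge_start[n_nodes] = k;
    return g;
}

/* Example read-only traversal using indices only: sum the values of
   node i's neighbours. Kernel code would follow the same access pattern. */
float neighbour_sum(const FlatGraph *g, int i) {
    float s = 0.0f;
    for (int k = g->edge_start[i]; k < g->edge_start[i + 1]; k++)
        s += g->value[g->edge_dst[k]];
    return s;
}
```

The three arrays contain no addresses, so they stay valid after being copied to another memory space. Whether the real structures can be reduced to a handful of arrays like this is exactly the open question.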

Also, 20-30 tasks? That’s not nearly enough for this programming model; you’d be much better off just using .NET threads, unless you can split each MyTask into dozens or hundreds of threads.
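For what it's worth, a rough sketch of how "different computation per task, same read-only input" could be laid out if it were split into many small units: one descriptor per unit of work, a switch on the task kind, and one private output slot per unit so no locking is needed. This is plain C with a serial loop standing in for the parallel launch; in an OpenCL kernel the index would come from `get_global_id(0)` instead. `TaskDesc`, the two task kinds, `run_task` and `run_all` are all made up for illustration, and each task is assumed to cover at least one element.

```c
#include <stddef.h>

/* Hypothetical task kinds; the real code would have 20 or 30 of these. */
enum { TASK_SUM = 0, TASK_MAX = 1 };

/* One small unit of work: which computation to run and on which
   slice of the shared, read-only input (count is assumed >= 1). */
typedef struct {
    int kind;
    int first;
    int count;
} TaskDesc;

/* Body of a single unit of work. It only reads `tasks` and `data`
   and writes only to its own slot out[gid]. */
void run_task(size_t gid, const TaskDesc *tasks, const float *data, float *out) {
    TaskDesc t = tasks[gid];
    float r = 0.0f;
    switch (t.kind) {
    case TASK_SUM:
        for (int i = 0; i < t.count; i++)
            r += data[t.first + i];
        break;
    case TASK_MAX:
        r = data[t.first];
        for (int i = 1; i < t.count; i++)
            if (data[t.first + i] > r)
                r = data[t.first + i];
        break;
    }
    out[gid] = r;
}

/* Serial stand-in for the parallel launch: every index is independent,
   so the iterations could run in any order or all at once. */
void run_all(size_t n_tasks, const TaskDesc *tasks, const float *data, float *out) {
    for (size_t gid = 0; gid < n_tasks; gid++)
        run_task(gid, tasks, data, out);
}
```

One caveat on this design: on a GPU, neighbouring threads that take different branches of the switch tend to run one branch after the other rather than at the same time, so sorting the task list by kind before launching is usually worth considering.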