Questions about parallel algorithms and data structures for relational data

Hi all,

My goal is to do parallel computing on relational data. My environment and definitions are:

  • Environment: CUDA 8.0, Pascal device, 64-bit Windows or Linux
  • Data sets: A, B, C, D. Relations: a->bs (each a has n b's, n >= 1), b->cs, c...

The CPU algorithm looks like this:

var target
for a in A do
  for b in a->B do
    for c in b->C do
      doSomething with target

On the CPU, you can set up the data structures like this:

struct a {
  Akey key;
  std::vector<b> Bs;        // the b objects themselves
  std::vector<Bkey> keyBs;  // or only the keys of the b objects
};
typedef std::unordered_map<Akey, a> A;
typedef std::unordered_map<Bkey, b> B;
B bMap;

for a in A do
  for bkey in a.keyBs do
    auto b = bMap.at(bkey)
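
To make the pseudocode above concrete, here is a minimal runnable sketch of the key-based lookup for the first two levels (a and b). The int keys, the value field, and the function name sumOverRelation are assumptions made up for illustration, and summing simply stands in for "doSomething with target".

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins: the real key and payload types are not given above.
using Akey = int;
using Bkey = int;

struct b {
    Bkey key;
    int value;  // placeholder payload
};

struct a {
    Akey key;
    std::vector<Bkey> keyBs;  // keys of the b's that belong to this a
};

using A = std::unordered_map<Akey, a>;
using B = std::unordered_map<Bkey, b>;

// Visits every b reachable from every a and folds its value into target.
// Summing stands in for "doSomething with target".
long long sumOverRelation(const A& aMap, const B& bMap) {
    long long target = 0;
    for (const auto& entry : aMap) {
        for (Bkey bkey : entry.second.keyBs) {
            auto it = bMap.find(bkey);
            assert(it != bMap.end());  // every stored key must resolve
            target += it->second.value;
        }
    }
    return target;
}
```

Every inner iteration does a hash lookup, which is the part I expect to map poorly onto the GPU.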

I think this could be accelerated with CUDA. My questions:

  1. What is the best-practice data structure on the CUDA architecture? As far as I understand, everything is a flat array of memory, with no map, hash, vector…
  2. I'd like to use CUDA Dynamic Parallelism, which looks like a way to accelerate a multi-level for loop. Is there a better practice?
  3. Can OpenACC deal with std data structures such as unordered_map, map, and vector?
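
To show what I mean by "a flat array of memory" in question 1, here is a rough host-side sketch of how one level of the relation (a->bs) might be packed into two contiguous arrays. The names FlatRelation, offsets, children, flatten, and childCount are made up for illustration, and plain int indices are assumed in place of the real keys. I am not sure this is the recommended layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flat layout for one level of the relation (a -> bs).
// The children of parent i are children[offsets[i]] .. children[offsets[i + 1] - 1].
struct FlatRelation {
    std::vector<int> offsets;   // size = number of parents + 1
    std::vector<int> children;  // all child indices, grouped by parent
};

// Packs a nested vector-of-vectors into two contiguous arrays.
FlatRelation flatten(const std::vector<std::vector<int>>& nested) {
    FlatRelation flat;
    flat.offsets.reserve(nested.size() + 1);
    flat.offsets.push_back(0);
    for (const auto& kids : nested) {
        flat.children.insert(flat.children.end(), kids.begin(), kids.end());
        flat.offsets.push_back(static_cast<int>(flat.children.size()));
    }
    assert(flat.offsets.size() == nested.size() + 1);
    return flat;
}

// Number of children of one parent, read from the offsets array alone.
int childCount(const FlatRelation& flat, std::size_t parent) {
    assert(parent + 1 < flat.offsets.size());
    return flat.offsets[parent + 1] - flat.offsets[parent];
}
```

For example, flatten({{0, 1}, {2}, {3, 4, 5}}) gives offsets 0, 2, 3, 6 and six children. Each level (a->bs, b->cs) would get its own pair of arrays, and the contents of both vectors could then be copied to the device with cudaMemcpy.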

I would be grateful for your help and advice.