How to map FEM elements and nodes onto CUDA blocks and threads

I want to map a finite element method (FEM) computation properly onto the CUDA architecture. I have elements, each with some nodes. I want to map each element to a block, and each of that element's nodes to a thread of that block.
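To make the idea concrete, here is a minimal sketch of the mapping I have in mind. It is only an illustration under my own assumptions: 4-node elements, a flat connectivity array `conn`, and a placeholder per-element computation (summing the nodal values) standing in for the real element routine. The names `elementKernel`, `sumPerElement`, and `NODES_PER_ELEM` are made up for this example, and error checking of the CUDA calls is left out.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <vector>

// Assumption for this sketch: every element has 4 nodes (e.g. a quadrilateral).
#define NODES_PER_ELEM 4

// One block per element, one thread per local node of that element.
__global__ void elementKernel(const int* conn, const float* nodeValues,
                              float* elemSums)
{
    __shared__ float local[NODES_PER_ELEM];

    int e = blockIdx.x;   // element index  <-> block
    int a = threadIdx.x;  // local node index <-> thread

    // Each thread gathers the value of its own node into shared memory.
    int globalNode = conn[e * NODES_PER_ELEM + a];
    local[a] = nodeValues[globalNode];
    __syncthreads();

    // Placeholder element computation: thread 0 sums the nodal values.
    if (a == 0) {
        float sum = 0.0f;
        for (int i = 0; i < NODES_PER_ELEM; ++i) {
            sum += local[i];
        }
        elemSums[e] = sum;
    }
}

// Hypothetical host helper: copies the data to the GPU, launches one block
// per element, and returns one value per element. No CUDA error checking.
std::vector<float> sumPerElement(const std::vector<int>& conn,
                                 const std::vector<float>& nodeValues)
{
    assert(conn.size() % NODES_PER_ELEM == 0);
    int numElems = static_cast<int>(conn.size() / NODES_PER_ELEM);
    std::vector<float> result(numElems);
    if (numElems == 0) {
        return result;
    }

    int* dConn = nullptr;
    float* dVals = nullptr;
    float* dSums = nullptr;
    cudaMalloc(&dConn, conn.size() * sizeof(int));
    cudaMalloc(&dVals, nodeValues.size() * sizeof(float));
    cudaMalloc(&dSums, numElems * sizeof(float));

    cudaMemcpy(dConn, conn.data(), conn.size() * sizeof(int),
               cudaMemcpyHostToDevice);
    cudaMemcpy(dVals, nodeValues.data(), nodeValues.size() * sizeof(float),
               cudaMemcpyHostToDevice);

    // Grid size = number of elements, block size = nodes per element.
    elementKernel<<<numElems, NODES_PER_ELEM>>>(dConn, dVals, dSums);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(result.data(), dSums, numElems * sizeof(float),
               cudaMemcpyDeviceToHost);

    cudaFree(dConn);
    cudaFree(dVals);
    cudaFree(dSums);
    return result;
}
```

For example, for two quadrilaterals sharing an edge, I would call `sumPerElement({0, 1, 4, 3, 1, 2, 5, 4}, nodeValues)` with six nodal values and get back one number per element. One thing I am unsure about: with only a few nodes per element, each block is much smaller than a warp of 32 threads, so this direct mapping may leave most of the hardware idle.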

How can I implement this computing pattern on the GPU with CUDA? Any help would be appreciated.

