Programming using the Tegra K1 processor (Jetson TK1)

Hello. I would like to ask about programming the Tegra K1 processor, as I want to schedule tasks across the 4 ARM cores and the GPU in parallel.
Can I use 1 ARM core for sending data to and receiving data from the GPU while the remaining 3 ARM cores process other jobs, without any effect on the GPU operations?
Please help me with this, or suggest another way to get the best heterogeneous scheduling.

There is a sub-forum dedicated to TK1 here: You may get better / faster answers there.

Yes, thanks. I know, and I have already asked my question there, but my question is about the programming and the processor more than the kit itself.

Scheduling tasks on 4 ARM cores has nothing to do with “CUDA programming and performance”, the title of this forum.

There are many CPU multithreading technologies to do what you want. I would use pthreads or OpenMP to spin up multiple threads. These threads should occupy multiple ARM cores. Designate one of those threads to interact with the CUDA GPU.
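As a rough illustration of that layout, here is a minimal C++ sketch, assuming a Linux system with pthreads. The names `pin_to_core`, `gpu_step`, `cpu_step`, `Counts` and `run_workers` are made up for this example, and `gpu_step` is only a placeholder: in a real program that is where the CUDA calls (copies, kernel launches, synchronization) would go. Pinning threads to cores is an optional extra, not something the approach requires; the OS scheduler will normally spread the threads across cores on its own.

```cpp
// Sketch: four threads, one reserved for GPU traffic, three for CPU jobs.
// Build (assumption): g++ -O2 -pthread sketch.cpp
#include <pthread.h>
#include <sched.h>
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Best-effort pinning of the calling thread to one core (Linux only).
// Failure is ignored so the sketch still runs on machines with fewer cores.
static void pin_to_core(int core) {
#ifdef __linux__
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    (void)pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
#else
    (void)core;
#endif
}

// Hypothetical placeholder for the GPU side. In a real program the host-to-
// device copy, kernel launch and device-to-host copy would be issued here.
static void gpu_step(std::atomic<int>& done) { done.fetch_add(1); }

// Hypothetical placeholder for one unit of CPU-only work.
static void cpu_step(std::atomic<int>& done) { done.fetch_add(1); }

struct Counts {
    int gpu_steps;
    int cpu_steps;
};

// Runs one GPU-handling thread and three CPU worker threads, each for
// `iterations` steps, and returns how many steps each group completed.
Counts run_workers(int iterations) {
    std::atomic<int> gpu_done{0};
    std::atomic<int> cpu_done{0};
    std::vector<std::thread> threads;

    // Thread 0: the only thread that talks to the GPU.
    threads.emplace_back([&] {
        pin_to_core(0);
        for (int i = 0; i < iterations; ++i) gpu_step(gpu_done);
    });

    // Threads 1..3: CPU-only jobs, nominally on the remaining cores.
    for (int core = 1; core < 4; ++core) {
        threads.emplace_back([&, core] {
            pin_to_core(core);
            for (int i = 0; i < iterations; ++i) cpu_step(cpu_done);
        });
    }

    // Wait for all four threads before reading the counters.
    for (auto& t : threads) t.join();
    return {gpu_done.load(), cpu_done.load()};
}
```

Calling `run_workers(100)` from `main` runs 100 placeholder GPU steps on the designated thread and 100 CPU steps on each of the other three. Keeping every GPU call on one thread means the other threads never block waiting on the device.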

txbob, your answer is what I wanted from this forum. Could you explain further how to keep the GPU from affecting the performance of the threads on the ARM cores when all the threads (including the one for the GPU) work in one loop and must finish within a predetermined time? And how do I synchronize the threads?
If you can, please give me an example with code.