How to copy to memory when implementing depth to space operation

Linux distro and version
Ubuntu 16.04

GPU type
Nvidia 1080 TI

NVIDIA driver version
384.111

CUDA version
V9.0.176

CUDNN version
7.0.5

TensorFlow version
1.10

TensorRT version
4.0.16

Hello. I need to implement TensorFlow's depth-to-space operation in TensorRT.
To do this, I have to access the inputs parameter of the enqueue method using pointer arithmetic. My question is: how do I copy the elements I need from the input to the output? I have tried simply accessing the output parameter through pointers, but I get a segmentation fault.
Here is my code:
for (int currentRow = 0; currentRow < inputHeight; currentRow++) {
    for (int depthModifier = 0; depthModifier < mBlockSize; depthModifier = depthModifier + 1) {
        for (int currentColumn = 0; currentColumn < inputWidth; currentColumn++) {
            for (int currentDepth = mBlockSize * (depthModifier);
                 currentDepth <= mBlockSize * (depthModifier + 1) - 1; currentDepth++) {
                int pointerIndex =
                    inputDepth * inputWidth * currentRow + inputDepth * currentColumn + currentDepth;
                float value = *(inputTemp + pointerIndex);
            }
        }
    }
}

I want to assign the value at inputTemp + pointerIndex to the output.
What is the best way to do this?
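
For reference, the element mapping I am after can be sketched on plain host memory as shown here. This is only a minimal sketch under several assumptions: NHWC layout, a single image, and the channel ordering that tf.depth_to_space uses for NHWC data. The function name depthToSpaceNHWC and its parameters are hypothetical and not part of TensorRT. As far as I understand, the buffers passed to enqueue live in device memory, so this loop would not work on them directly; the same index arithmetic would have to run in a CUDA kernel or on a host copy of the data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference for depth-to-space (NHWC, one image).
// Assumes the input channel index decomposes as
//   d = (blockRow * blockSize + blockCol) * outDepth + c,
// which matches tf.depth_to_space for NHWC data.
std::vector<float> depthToSpaceNHWC(const std::vector<float>& input,
                                    int inputHeight, int inputWidth,
                                    int inputDepth, int blockSize) {
    assert(blockSize > 0);
    assert(inputDepth % (blockSize * blockSize) == 0);
    assert(input.size() ==
           static_cast<std::size_t>(inputHeight) * inputWidth * inputDepth);

    const int outDepth = inputDepth / (blockSize * blockSize);
    const int outWidth = inputWidth * blockSize;

    // Output has the same number of elements, only rearranged.
    std::vector<float> output(input.size());

    for (int h = 0; h < inputHeight; ++h) {
        for (int w = 0; w < inputWidth; ++w) {
            for (int d = 0; d < inputDepth; ++d) {
                // Split the input channel into block offsets and output channel.
                const int c = d % outDepth;
                const int blockCol = (d / outDepth) % blockSize;
                const int blockRow = d / (outDepth * blockSize);

                const int outRow = h * blockSize + blockRow;
                const int outCol = w * blockSize + blockCol;

                const std::size_t inIdx =
                    (static_cast<std::size_t>(h) * inputWidth + w) * inputDepth + d;
                const std::size_t outIdx =
                    (static_cast<std::size_t>(outRow) * outWidth + outCol) * outDepth + c;

                // This is the assignment missing from my loop above.
                output[outIdx] = input[inIdx];
            }
        }
    }
    return output;
}
```

Calling depthToSpaceNHWC on a 2 x 2 x 4 input with a block size of 2 gives a 4 x 4 x 1 output, which makes it easy to compare the result by hand against what TensorFlow produces for the same data.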

Thank you for any assistance.