Uses CUDA to process data

Hi all,
I am not a software guy, so I am seeking guidance. I have a server sending a stream of data over 40 GbE. I have a GPU server hosting a Tesla GPU that uses CUDA to process the data (right now copied from host memory).
The GPU server also has a 40 GbE NIC that supports GPUDirect. I would like to process the data coming from the remote server. Can I accomplish everything I need from within CUDA, or do I need something that runs separately to RDMA the data from the NIC to GPU memory? I am rather confused…

Greetings,
You can integrate the transfers into your project. Here is a nice project to check out for transfers: NVIDIA/gdrcopy on GitHub, a fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology.
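
For reference, here is roughly what the host-staged path you describe (data lands in host memory, then gets copied to the GPU) looks like in plain CUDA. This is only a minimal sketch under my own assumptions: receive_chunk() is a hypothetical stand-in for whatever delivers data from the NIC, process() is a dummy kernel, and the names and sizes are made up for illustration. As far as I understand it, with GPUDirect RDMA the NIC would write straight into the cudaMalloc'd buffer, so the host-to-device copy would go away, but that part is set up through the NIC's own RDMA software stack rather than through CUDA calls.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Dummy processing kernel (illustrative only): doubles every sample.
__global__ void process(float *data, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Hypothetical stand-in for the network receive: fills the host buffer
// with a known pattern so the result can be checked afterwards.
static void receive_chunk(float *dst, size_t n) {
    for (size_t i = 0; i < n; ++i) dst[i] = (float)(i % 1024);
}

// Stages one chunk of n floats (n > 0) through pinned host memory, runs the
// kernel on the GPU, copies the result back, and returns true if every
// value came back doubled.
bool stage_and_process(size_t n) {
    const size_t bytes = n * sizeof(float);
    float *h_buf = nullptr;   // pinned (page-locked) host staging buffer
    float *d_buf = nullptr;   // device buffer the kernel works on
    cudaStream_t stream = nullptr;

    cudaError_t err = cudaMallocHost((void **)&h_buf, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_buf, bytes);
    if (err == cudaSuccess) err = cudaStreamCreate(&stream);

    bool ok = (err == cudaSuccess);
    if (ok) {
        receive_chunk(h_buf, n);  // "NIC to host memory" step

        // Host to GPU copy: the step GPUDirect RDMA is meant to eliminate.
        cudaMemcpyAsync(d_buf, h_buf, bytes, cudaMemcpyHostToDevice, stream);

        const unsigned threads = 256;
        const unsigned blocks = (unsigned)((n + threads - 1) / threads);
        process<<<blocks, threads, 0, stream>>>(d_buf, n);

        // Copy back only so this sketch can verify the result on the host.
        cudaMemcpyAsync(h_buf, d_buf, bytes, cudaMemcpyDeviceToHost, stream);

        ok = (cudaStreamSynchronize(stream) == cudaSuccess) &&
             (cudaGetLastError() == cudaSuccess);
        for (size_t i = 0; ok && i < n; ++i)
            ok = (h_buf[i] == 2.0f * (float)(i % 1024));
    }

    if (stream) cudaStreamDestroy(stream);
    if (d_buf) cudaFree(d_buf);
    if (h_buf) cudaFreeHost(h_buf);
    return ok;
}
```

Calling stage_and_process(1 << 20) from a small main() compiled with nvcc should return true on a machine with a working CUDA device. The pinned buffer is what lets cudaMemcpyAsync actually run asynchronously on the stream.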

If you are talking about system design in terms of hardware, you do not need anything else as far as I know.

But then again, I am not in a position to give an official answer :) Take everything with a grain of salt.

I hope this helps.