We are an academic research group considering DALI to accelerate data ingestion for data-parallel training across multiple compute nodes. To that end, we plan to extend DALI’s caching layer with remote P2P transfers, so that decoding and transformations already performed on one node are not duplicated on another.
Could the developer community briefly explain how the existing caching layer is organized in the source code? What would be the best way to hook into the cache and intercept puts/gets? Is there a unified way to do this, so that we can isolate the remote communication in a self-contained module? Your thoughts on this are much appreciated.