I am using an NVIDIA BlueField-2.
I read the host's data from the DPU and measured the execution time of the DOCA DMA operation.
The DOCA version is 2.5.0107.
I used the provided DOCA sample (dma_copy_dpu).
    /* Submit DMA task */
    result = doca_task_submit(task);
    if (result != DOCA_SUCCESS) {
        DOCA_LOG_ERR("Failed to submit DMA task: %s", doca_error_get_descr(result));
        doca_task_free(task);
        goto destroy_dst_buf;
    }

    resources.run_main_loop = true;

    /* Wait for all tasks to be completed and context stopped */
    clock_gettime(CLOCK_MONOTONIC, &start);
    while (resources.run_main_loop) {
        if (doca_pe_progress(state->pe) == 0)
            continue;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
I timed the task with clock_gettime around the polling loop, as shown above.
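The snippet above does not show how the two samples are turned into a latency figure. A minimal sketch of one way to do it, assuming start and end are struct timespec values filled by clock_gettime(CLOCK_MONOTONIC, ...); the helper name timespec_diff_ns is hypothetical and not part of the DOCA sample:

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper (not part of DOCA): elapsed time from start to end
 * in nanoseconds. The seconds and nanoseconds fields are subtracted
 * separately, so a negative tv_nsec difference is handled correctly. */
static int64_t timespec_diff_ns(struct timespec start, struct timespec end)
{
    return (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL
         + (int64_t)(end.tv_nsec - start.tv_nsec);
}
```

Dividing the result by 1e6 gives milliseconds, so 16815498 ns corresponds to about 16.82 ms.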
The result is shown below.
[07:47:17:915830][3127561][DOCA][INF][dma_copy_dpu_main.c:63][main] Starting the sample
[07:47:17:973569][3127561][DOCA][INF][dma_common.c:314][dma_state_changed_callback] DMA context is running
[07:47:17:975177][3127561][DOCA][INF][dma_common.c:245][dma_memcpy_completed_callback] DMA task was completed successfully
[07:47:17:991954][3127561][DOCA][INF][dma_common.c:303][dma_state_changed_callback] DMA context has been stopped
[07:47:17:991989][3127561][DOCA][INF][dma_copy_dpu_sample.c:310][dma_copy_dpu] Remote DMA copy was done Successfully
[07:47:17:992007][3127561][DOCA][INF][dma_copy_dpu_sample.c:312][dma_copy_dpu] Memory content: This is a sample piece of text
[07:47:17:992027][3127561][DOCA][INF][dma_copy_dpu_sample.c:313][dma_copy_dpu] DMA Copy Latency: 16815498.00 nanoseconds
[07:47:17:992059][3127561][DOCA][INF][dma_copy_dpu_sample.c:314][dma_copy_dpu] DMA Copy Latency: 16.82 milliseconds
[07:47:17:992080][3127561][DOCA][INF][dma_copy_dpu_sample.c:321][dma_copy_dpu] Host sample can be closed, DMA copy ended
[07:47:17:993201][3127561][DOCA][INF][dma_copy_dpu_main.c:99][main] Sample finished successfully
As you can see, the measured latency is 16.82 ms.
Why is it so high?
Am I timing the wrong part of the DMA operation?
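As a rough cross-check, the timestamps in the log output can be compared with the measured value. This is only a sketch: it assumes the DOCA log prefix has the form HH:MM:SS:UUUUUU with the last field in microseconds, and the helper name log_ts_to_us is hypothetical:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: convert a log timestamp such as "07:47:17:975177"
 * to microseconds since midnight. Assumes the last field is microseconds.
 * Returns -1 if the string does not match the expected format. */
static int64_t log_ts_to_us(const char *ts)
{
    int h, m, s;
    long us;

    if (sscanf(ts, "%d:%d:%d:%ld", &h, &m, &s, &us) != 4)
        return -1;
    return (((int64_t)h * 60 + m) * 60 + s) * 1000000LL + us;
}
```

Under that assumption, the gap between the "DMA task was completed successfully" line (07:47:17:975177) and the "DMA context has been stopped" line (07:47:17:991954) is 16777 us, which is close to the measured 16.82 ms. The gap between "DMA context is running" and the task completion line is 1608 us. This may mean that the timed loop also covers stopping the context, as its comment suggests.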
In DOCA 1.x, the measured latency was in the microsecond (us) range.
Please suggest a solution.