Memory allocated for workspace

Hi community,
I want to analyze memory usage during CNN training, which consists of memory for the weights, feature maps, input batch, and workspace.
Apart from the workspace, each of these can be calculated easily from parameters such as the number of weights and the output shape of each layer.
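
A minimal sketch of that calculation for the non-workspace parts. It assumes float32 storage and plain 2D convolution layers with square kernels. The names `conv_output_shape` and `estimate_memory` and the layer tuple format are hypothetical, chosen for illustration only, and the estimate ignores gradients, optimizer state, and framework overhead.

```python
from math import prod

BYTES_PER_FLOAT32 = 4  # assumption: every tensor is stored as float32


def conv_output_shape(in_shape, out_channels, kernel, stride=1, padding=0):
    """Output shape (C, H, W) of a 2D convolution with a square kernel."""
    _, h, w = in_shape
    out_h = (h + 2 * padding - kernel) // stride + 1
    out_w = (w + 2 * padding - kernel) // stride + 1
    return (out_channels, out_h, out_w)


def estimate_memory(batch_size, input_shape, conv_layers):
    """Return bytes used by the input batch, the weights, and the feature maps.

    conv_layers is a list of (out_channels, kernel, stride, padding) tuples
    (a hypothetical format used only in this sketch).
    """
    input_bytes = batch_size * prod(input_shape) * BYTES_PER_FLOAT32
    weight_count = 0
    feature_map_count = 0
    shape = input_shape
    for out_channels, kernel, stride, padding in conv_layers:
        in_channels = shape[0]
        # Weights: one kernel x kernel filter per (input, output) channel pair,
        # plus one bias per output channel.
        weight_count += out_channels * in_channels * kernel * kernel + out_channels
        shape = conv_output_shape(shape, out_channels, kernel, stride, padding)
        # Feature maps: one output tensor per sample in the batch.
        feature_map_count += batch_size * prod(shape)
    return {
        "input": input_bytes,
        "weights": weight_count * BYTES_PER_FLOAT32,
        "feature_maps": feature_map_count * BYTES_PER_FLOAT32,
    }


# Example: two convolution layers on a batch of eight 3x32x32 images.
layers = [(16, 3, 1, 1), (32, 3, 2, 1)]
usage = estimate_memory(batch_size=8, input_shape=(3, 32, 32), conv_layers=layers)
for name, size in usage.items():
    print(f"{name}: {size} bytes")
```

The workspace is deliberately missing from this estimate, since it cannot be derived from the layer shapes alone.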

As a result, I wonder how cuDNN allocates temporary memory (i.e. the workspace) when running the forward and backward passes of a CNN.
I have read the cuDNN documentation. It seems that functions such as cudnnGetConvolutionForwardWorkspaceSize report the workspace size I am asking about, and that the size depends on which algorithm is chosen.

If I knew how these APIs are implemented, I would understand the policy for workspace memory allocation, and my question would be answered.
However, since cuDNN is a closed-source library, I don't know how the APIs are implemented. Does anyone have any clue how to resolve this? Thank you in advance.