I am trying to estimate how restricting register usage affects the achieved occupancy of an application. While running my experiments, I tried to limit the number of registers for the cdpBezierTessellation application from the NVIDIA samples and got a link error.
Flag added to nvcc: -maxrregcount 16
Error: nvlink error : entry function '_Z21computeBezierLinesCDPP10BezierLinei' with max regcount of 16 calls function 'cudaMalloc' with regcount of 18
I don't understand why this is happening. Can anyone help me with this?
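
For context, this is roughly the estimate I am after: a minimal host-side sketch of register-limited theoretical occupancy. The `SmLimits` struct, the function names, and the per-SM numbers (65536 registers, 64 resident warps, warp size 32, allocation unit of 256 registers per warp) are my own assumptions for illustration and are not taken from the sample or from any specific GPU. It also ignores block size and shared memory limits, so it only gives an upper bound and not the achieved occupancy a profiler would report.

```cpp
#include <algorithm>

// Hypothetical per-SM limits. These values are assumptions for illustration;
// the real numbers depend on the target GPU architecture.
struct SmLimits {
    int regsPerSm     = 65536; // 32-bit registers available per SM (assumed)
    int maxWarpsPerSm = 64;    // maximum resident warps per SM (assumed)
    int warpSize      = 32;    // threads per warp
    int regAllocUnit  = 256;   // per-warp register allocation granularity (assumed)
};

// Number of warps that fit on one SM when registers are the only limiting resource.
int registerLimitedWarps(int regsPerThread, const SmLimits& sm = SmLimits{}) {
    int regsPerWarp = regsPerThread * sm.warpSize;
    // Round the per-warp register demand up to the allocation granularity.
    regsPerWarp = (regsPerWarp + sm.regAllocUnit - 1) / sm.regAllocUnit * sm.regAllocUnit;
    return std::min(sm.maxWarpsPerSm, sm.regsPerSm / regsPerWarp);
}

// Register-limited theoretical occupancy as a fraction of the maximum resident warps.
double theoreticalOccupancy(int regsPerThread, const SmLimits& sm = SmLimits{}) {
    return static_cast<double>(registerLimitedWarps(regsPerThread, sm)) / sm.maxWarpsPerSm;
}
```

Under these assumed limits, calling `theoreticalOccupancy` with 16 and with 18 registers per thread gives the same result, because in both cases the register file stops being the limiting resource.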