What exactly happened when you tried to set the application clocks?
I just tried --applications-clocks (-ac) on a Quadro P2000 under Windows 10 with the normal WDDM driver, and nvidia-smi set the application clocks without complaint. Note that by default nvidia-smi requires administrative permissions to set clocks, so you may need to run nvidia-smi -acp UNRESTRICTED from an administrator command prompt first. Make sure to specify valid clock values, a list of which you can retrieve with nvidia-smi -q -d SUPPORTED_CLOCKS. For example, the Quadro P2000 seems to support only one memory clock value: 3504 MHz.
When I use nvidia-smi -i 0 -ac 3504,1721 the response from nvidia-smi is:
Applications clocks set to "(MEM 3504, SM 1721)" for GPU 00000000:17:00.0.
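If you want to script the check-then-set sequence, something like the following Python sketch could work. It is only a sketch under assumptions: nvidia-smi is on the PATH, and the query --query-supported-clocks=memory,graphics with --format=csv,noheader,nounits prints one "memory, graphics" pair in MHz per line. The helper names are hypothetical, not part of any NVIDIA tool.

```python
import subprocess


def parse_supported_clocks(csv_text):
    """Parse 'memory, graphics' lines (MHz) into a set of (mem, sm) tuples."""
    pairs = set()
    for line in csv_text.splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        try:
            pairs.add((int(fields[0]), int(fields[1])))
        except ValueError:
            continue  # skip non-numeric lines, e.g. a header
    return pairs


def build_set_clocks_command(gpu_index, mem_mhz, sm_mhz, supported):
    """Return the nvidia-smi argument list, or raise if the pair is not listed."""
    if (mem_mhz, sm_mhz) not in supported:
        raise ValueError(f"({mem_mhz}, {sm_mhz}) is not a supported clock pair")
    return ["nvidia-smi", "-i", str(gpu_index), "-ac", f"{mem_mhz},{sm_mhz}"]


def query_supported_clocks(gpu_index):
    """Ask nvidia-smi for the supported pairs (output format is an assumption)."""
    result = subprocess.run(
        ["nvidia-smi", "-i", str(gpu_index),
         "--query-supported-clocks=memory,graphics",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True)
    return parse_supported_clocks(result.stdout)
```

To use it, pass the result of query_supported_clocks(0) as the supported argument of build_set_clocks_command(0, 3504, 1721, supported) and hand the returned list to subprocess.run from a prompt with sufficient permissions. Rejecting unlisted pairs up front simply mirrors the advice above to use only values reported as supported.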
I can’t exactly make sense of your observations. When I run with very short kernels, sufficiently spaced apart, my GPU seems to operate at about half the highest available clock rate. Whether this is for real or an artifact of how the GPU-Z utility samples the clocks I cannot say. Even if the clocks never increase to full work frequency on account of the short-running kernel, I would not expect this to cut performance by a factor of 5, as you state you observe.
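One way to cross-check the GPU-Z reading would be to sample the clocks through nvidia-smi itself while your application runs. The sketch below is untested on your setup and rests on assumptions: that --query-gpu=clocks.sm,clocks.max.sm with --format=csv,noheader,nounits prints "current, max" in MHz on your GPU, and that the hypothetical helper names suit your script. Sampling this way is still coarse, so very short kernels may fall between samples.

```python
import subprocess


def sm_clock_fraction(csv_line):
    """Return current SM clock divided by maximum SM clock.

    Expects a 'current, max' line in MHz; a value such as '[N/A]'
    raises ValueError.
    """
    current, maximum = (int(f.strip()) for f in csv_line.split(","))
    return current / maximum


def sample_sm_clock_fraction(gpu_index=0):
    """Take one sample from nvidia-smi (query format is an assumption)."""
    out = subprocess.run(
        ["nvidia-smi", "-i", str(gpu_index),
         "--query-gpu=clocks.sm,clocks.max.sm",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True).stdout
    return sm_clock_fraction(out.strip().splitlines()[0])
```

Calling sample_sm_clock_fraction() in a loop while the kernels run would show whether the SM clock really sits near half of the maximum, independently of GPU-Z.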