What's the importance of the core clock for CUDA performance?

As the application is executed on the shader cores, what’s the importance of the core clock for CUDA performance?
Can I downclock it to lower the temperature without losing performance?

If your application is bandwidth bound, you can downclock without losing performance up to the point where it becomes compute bound.

If your application is compute bound to begin with, you could benefit from downclocking the memory instead, but I’m not sure that it’s possible.
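
To make the bandwidth-bound versus compute-bound reasoning concrete, here is a rough roofline-style estimate in Python. It is only a sketch under simplifying assumptions: the function names, the GPU figures (256 cores, 128 GB/s, 1500 MHz) and the 2 FLOPs per core per cycle are all hypothetical, and the model ignores latency, caches and anything else that also depends on clock speed. Measure on real hardware before trusting the number.

```python
def peak_gflops(cores, clock_mhz, flops_per_core_per_cycle=2):
    """Peak arithmetic throughput in GFLOP/s for a given clock (simplified model)."""
    return cores * flops_per_core_per_cycle * clock_mhz / 1000.0


def memory_limited_gflops(flops_per_byte, mem_bandwidth_gbs):
    """Throughput ceiling imposed by memory bandwidth for a kernel's arithmetic intensity."""
    return flops_per_byte * mem_bandwidth_gbs


def limiting_factor(flops_per_byte, mem_bandwidth_gbs, cores, clock_mhz):
    """Return 'bandwidth' or 'compute', whichever ceiling is lower in this model."""
    mem = memory_limited_gflops(flops_per_byte, mem_bandwidth_gbs)
    return "bandwidth" if mem < peak_gflops(cores, clock_mhz) else "compute"


def min_clock_mhz(flops_per_byte, mem_bandwidth_gbs, cores, flops_per_core_per_cycle=2):
    """Lowest clock at which arithmetic throughput still matches the memory ceiling."""
    mem = memory_limited_gflops(flops_per_byte, mem_bandwidth_gbs)
    return mem / (cores * flops_per_core_per_cycle) * 1000.0


# Hypothetical GPU: 256 cores, 128 GB/s memory bandwidth, 1500 MHz stock clock.
# Hypothetical kernel: 1 FLOP per byte moved to or from memory.
print(limiting_factor(1.0, 128, 256, 1500))  # which ceiling applies at stock clock
print(min_clock_mhz(1.0, 128, 256))          # clock below which compute becomes the limit
```

In this model, a kernel that is bandwidth bound at the stock clock could be downclocked to roughly the value returned by `min_clock_mhz` before performance starts to drop. A kernel that is already compute bound loses throughput in proportion to any clock reduction.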