What is "Kernel auto-tuning"? I need more information.

Continuing the discussion from "What is `Kernel Auto-Tuning` and `Multi-Stream Execution`?":

I still don't understand kernel auto-tuning, and that topic didn't answer my question.
I need more information about it, ideally with a detailed example attached.