The main use of the constant cache is to take load off the other data paths, so that they keep their full bandwidth for everything else. It is a good fit for coefficients and similar data that are either small overall, or of which only a small subset is used for an extended time (so that the working set fits into the constant cache), and where the accessed data item is the same for all lanes (threads of a warp).
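To illustrate that access pattern, here is a minimal sketch and not a definitive implementation. It assumes a small coefficient table placed in `__constant__` memory and a kernel in which every thread reads the same coefficient in each loop iteration, while the per-thread data goes through ordinary global memory. The names `c_coeffs`, `kNumCoeffs`, `eval_poly` and `run_poly_demo` are made up for this example, as are the coefficient values and sizes.

```cuda
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical table size; small enough to fit comfortably in constant memory.
constexpr int kNumCoeffs = 4;

// Coefficients live in constant memory, so reads are served by the constant cache.
__constant__ float c_coeffs[kNumCoeffs];

// Every thread evaluates the same polynomial at its own x (Horner's scheme).
__global__ void eval_poly(const float* x, float* y, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float xi  = x[i];                      // per-thread data: global memory path
    float acc = c_coeffs[kNumCoeffs - 1];
    for (int k = kNumCoeffs - 2; k >= 0; --k) {
        // k does not depend on the thread index, so all lanes of a warp
        // read the same address c_coeffs[k] in this iteration.
        acc = acc * xi + c_coeffs[k];
    }
    y[i] = acc;
}

// Runs the kernel and checks the result against a host computation.
bool run_poly_demo()
{
    const int n = 1024;
    const float h_coeffs[kNumCoeffs] = {1.0f, 2.0f, 3.0f, 4.0f}; // 1 + 2x + 3x^2 + 4x^3
    std::vector<float> h_x(n), h_y(n);
    for (int i = 0; i < n; ++i) h_x[i] = 0.001f * i;

    float *d_x = nullptr, *d_y = nullptr;
    if (cudaMalloc(&d_x, n * sizeof(float)) != cudaSuccess) return false;
    if (cudaMalloc(&d_y, n * sizeof(float)) != cudaSuccess) { cudaFree(d_x); return false; }

    // Upload the coefficients to the __constant__ symbol, then the inputs.
    cudaError_t err = cudaMemcpyToSymbol(c_coeffs, h_coeffs, sizeof(h_coeffs));
    if (err == cudaSuccess)
        err = cudaMemcpy(d_x, h_x.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        eval_poly<<<(n + 255) / 256, 256>>>(d_x, d_y, n);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(h_y.data(), d_y, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_x);
    cudaFree(d_y);
    if (err != cudaSuccess) return false;

    // Compare with the same Horner evaluation on the host (loose tolerance,
    // since the device may contract multiply-add into FMA).
    for (int i = 0; i < n; ++i) {
        float ref = ((4.0f * h_x[i] + 3.0f) * h_x[i] + 2.0f) * h_x[i] + 1.0f;
        if (std::fabs(h_y[i] - ref) > 1e-4f * std::fabs(ref)) return false;
    }
    return true;
}
```

Calling `run_poly_demo()` from `main` is enough to try it out. If the index into `c_coeffs` depended on `threadIdx.x` instead, the lanes of a warp would ask for different addresses and the constant cache would have to serve them one after another, which is the case where ordinary global or shared memory is usually the better choice.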