Scaling a Matrix with the Texture Unit on the GPU

Is it possible to scale an FP64 matrix using the texture unit on the GPU, in parallel with CUDA core and tensor core operations?


What scaling operation are you referring to?

The texture unit on a GPU can perform certain kinds of multiplication when doing “texture linear filtering”, but these (AFAIK) cannot be used to scale a matrix (i.e. compute sA, where s is a more-or-less arbitrary scalar and A is a matrix).

The texture unit in a GPU can fetch 8-byte texels, but it is not able to perform any of the usual texture linear filtering on them.

Even if you could do this somehow (&), the texture linear filtering engine has only about 9 bits of resolution, so you’d have to consider carefully the use case for applying this to an FP64 type.
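To put a number on that resolution limit: the CUDA Programming Guide states that the linear-filtering weight is stored in 9-bit fixed point with 8 fractional bits, i.e. quantized in steps of 1/256. A quick host-side sketch (not actual texture hardware) of what that quantization does to a scale factor under the interleaved scheme described in the footnote below, assuming s in [1, 10]:

```python
def quantize_weight(f, frac_bits=8):
    # Round the interpolation weight to the nearest representable
    # 1/256 step, mimicking the filtering unit's fixed-point format.
    return round(f * (1 << frac_bits)) / (1 << frac_bits)

s = 3.7                                   # desired scale factor
f = (s - 1.0) / 9.0                       # ideal sample offset in [0, 1]
s_eff = 1.0 + 9.0 * quantize_weight(f)    # scale factor actually applied
print(s, s_eff)                           # 3.7 vs roughly 3.707
```

So even in the best case, the applied scale factor can be off by a few parts in a thousand, which is what makes this questionable for FP64 data.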

(&) In the 32-bit float case, I suspect it might be possible to have an interleaved realization of the A matrix, such that the range of possible scaling is represented by interleaved points. Suppose we have a 1D matrix A like so:

0.2 0.4 0.6 0.8

It might be possible to create an interleaved version of A, where the interleaved values represent the maximum of the scaling range (let’s assume a maximum multiplier of 10 for s, i.e. s is in the range 1 to 10):

0.2 2.0 0.4 4.0 0.6 6.0 0.8 8.0

You could then use the linear interpolator (perhaps) to scale the A matrix at the point of texture fetch, by providing an offset to the sample point that varies from 0 to 1, representing a multiplier from 1 to 10 (in this example). With an offset f, the filtered result is (1-f)·a + f·10a = a·(1 + 9f), so choosing f = (s-1)/9 yields a·s.

This would still be potentially “coarse” scaling, due to the limited precision with which the scaling factor can be represented.
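The interleaving idea above can be sketched as a host-side simulation (ordinary Python arithmetic standing in for the texture hardware; `fetch_scaled` is a hypothetical helper, and s is assumed to lie in [1, 10] as in the example):

```python
A = [0.2, 0.4, 0.6, 0.8]
S_MAX = 10.0

# Interleave each element with its maximally-scaled version,
# as in the example: 0.2 2.0 0.4 4.0 0.6 6.0 0.8 8.0
interleaved = []
for a in A:
    interleaved += [a, S_MAX * a]

def fetch_scaled(i, s):
    """Emulate a linear-filtered fetch of element i with scale s."""
    f = (s - 1.0) / (S_MAX - 1.0)      # sample offset in [0, 1]
    lo, hi = interleaved[2 * i], interleaved[2 * i + 1]
    return (1.0 - f) * lo + f * hi     # lerp = a * (1 + 9f) = a * s

print(fetch_scaled(1, 5.0))            # 0.4 scaled by 5, i.e. about 2.0
```

On real hardware the offset f would additionally be quantized to the filtering unit’s fixed-point resolution, which is the source of the coarseness noted above.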