There are many SPs on the device. Is there any difference between them?
For example, is one part dedicated to texture-related tasks, another to 8-bit data, another to 16-bit data, and so on? Or are all SPs the same, all generic?

There are many (more than 2) types of functional units in an SM. Each type of functional unit is designed to handle certain types of instructions. For example, the LSU (Load/Store Unit) is a functional unit that handles, you guessed it, load and store instructions.

An SP is a functional unit that handles single-precision floating-point add, multiply, and multiply-add instructions.

Given that definition, there is no functional difference between SPs. There are certainly functional differences between functional units of different types (e.g. between an LSU and an SP).

By the same reasoning, there are presumably other functional units (you may also see these referred to as “pipes” in profiler terminology) that handle instructions targeting, for example, double-precision floating point, 16-bit floating point, or integer ops.

Some of this is evident from reading any of the architecture whitepapers.
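
To make the distinction between instruction types a bit more concrete, here is a minimal sketch (not taken from any whitepaper). The same multiply-add chain is instantiated for `float`, `double`, and `int`, so each kernel should be dominated by a different kind of arithmetic instruction, while the final store is a store instruction of the kind an LSU handles. The names `mul_add_chain` and `run_and_check` are hypothetical, made up for this illustration. Which functional unit actually executes each instruction is an assumption that depends on the GPU architecture.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: one chain of multiply-adds per thread.
// T = float  -> single-precision multiply-add (the work an SP does)
// T = double -> double-precision multiply-add (presumably a different unit)
// T = int    -> integer multiply-add (unit depends on the architecture)
template <typename T>
__global__ void mul_add_chain(T *out, T a, T b, int iters) {
    T acc = static_cast<T>(threadIdx.x);
    for (int i = 0; i < iters; ++i) {
        acc = acc * a + b;
    }
    // Global store: a store instruction rather than an arithmetic one.
    out[blockIdx.x * blockDim.x + threadIdx.x] = acc;
}

// Hypothetical host helper: launches one block and checks the result.
// With a = 1 and b = 1, every thread should end up with threadIdx.x + iters,
// which is exactly representable in all three types.
template <typename T>
bool run_and_check(int iters) {
    const int threads = 64;
    T *d_out = nullptr;
    if (cudaMalloc(&d_out, threads * sizeof(T)) != cudaSuccess) return false;

    mul_add_chain<T><<<1, threads>>>(d_out, T(1), T(1), iters);

    T h_out[threads];
    // cudaMemcpy waits for the kernel to finish and reports launch errors.
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < threads; ++i) {
        if (h_out[i] != T(i + iters)) return false;
    }
    return true;
}

int main() {
    printf("float : %s\n", run_and_check<float>(4) ? "ok" : "failed");
    printf("double: %s\n", run_and_check<double>(4) ? "ok" : "failed");
    printf("int   : %s\n", run_and_check<int>(4) ? "ok" : "failed");
    return 0;
}
```

If you compile this with `nvcc` and dump the machine code with `cuobjdump -sass`, you would typically expect to see different opcodes in the three instantiations (for example something like `FFMA` for `float` and `DFMA` for `double`), though the exact instructions depend on the compiler version and target architecture. The profiler can then show which pipes those instructions were issued to on your particular GPU.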

