What is the current standard for hardware interpolation in texture memory? Is it single or double precision, and what is the order of interpolation possible? Any links to a reference on this would be welcome – I’ve had trouble finding any recent information.

Check the Programming Guide; it describes the details of texture interpolation in one of the appendices. The hardware uses fixed-point computation for the interpolation weights, with a 1.8 format, if I recall correctly. [Later:] See appendix F:

That is not a standard of any kind, that is just what NVIDIA GPUs do, and it has not changed in a decade. Whether the granularity of interpolation is sufficient for your use case is up to you to find out. There have been multiple questions here in the past about “broken” interpolation which all turned out to be due to an insufficient understanding of the consequences of using low-precision interpolation.

Have I understood correctly that, while the interpolation weight α (the fractional distance between grid points) is stored in 9-bit fixed-point format, the actual calculation is done in single-precision floating point?

So if tex(x) = (1-α)T[i] + αT[i+1]

it doesn’t convert my values of T[i] down to 9-bit before multiplying by α?

In my case I need to calculate the values halfway between grid points quite accurately, but nowhere else.

I would suggest running some targeted experiments to get a better understanding of how the granularity of the fixed-point representation is going to affect your data. The most common effect that prompts people to post questions about “broken” interpolation is that there does not seem to be any interpolation at all, due to the coarse granularity.
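
As a starting point for such an experiment, here is a minimal host-side sketch that models what a 1.8 fixed-point weight would do to a 1-D linear interpolation. The names quantize_weight and lerp_quantized are made up for illustration. Two things are assumptions rather than documented hardware behavior: that the weight is rounded to nearest, and that only the weight (not the data values) is quantized. Treat it as a model to compare against real texture fetches, not as a description of the hardware.

```cpp
#include <math.h>

// Expands to CUDA qualifiers under nvcc and to nothing under a plain C++ compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical helper: snap a weight in [0, 1] to the nearest multiple of 1/256,
// i.e. 8 fractional bits. Round-to-nearest is an assumption.
HOST_DEVICE inline float quantize_weight(float alpha)
{
    return floorf(alpha * 256.0f + 0.5f) / 256.0f;
}

// Hypothetical helper: (1-alpha)*a + alpha*b with a quantized weight.
// Assumption: a and b stay at full single precision; only alpha is quantized.
HOST_DEVICE inline float lerp_quantized(float a, float b, float alpha)
{
    float q = quantize_weight(alpha);
    return (1.0f - q) * a + q * b;
}
```

Under this model, lerp_quantized(0.0f, 1.0f, 0.001f) returns 0, which is the "no interpolation at all" symptom: the weight is smaller than half a step of 1/256 and collapses to zero. A weight of exactly 0.5 is 128/256, so it survives quantization unchanged.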

GPUs offer copious single-precision throughput, so where the accuracy of the interpolation is even just potentially an issue, I always recommend approaching the problem from the other direction: first try manual interpolation using single-precision computation, as it may already be fast enough. For a 1-D interpolation, only two fused multiply-adds are needed:

// Computes (1-t)*a + t*b as t*b + (a - t*a), using two fused multiply-adds.
__host__ __device__ __forceinline__ float lerp (float a, float b, float t)
{
    return fmaf (t, b, fmaf (-t, a, a));
}
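
To show how that helper might be used, here is a minimal sketch of manual 1-D interpolation over a plain float array. The function interp1d, its clamping at the ends, and the convention that an integer x lands exactly on a grid point (no 0.5 texel offset, unlike texture fetches) are assumptions for illustration. lerp_fma has the same body as the lerp function above, renamed so the block stands alone.

```cpp
#include <math.h>

// Expands to CUDA qualifiers under nvcc and to nothing under a plain C++ compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Full single-precision linear interpolation: t*b + (a - t*a), two FMAs.
HOST_DEVICE inline float lerp_fma(float a, float b, float t)
{
    return fmaf(t, b, fmaf(-t, a, a));
}

// Hypothetical helper: interpolate table T (n >= 2 entries) at fractional
// index x. Coordinates outside [0, n-1] are clamped to the end points.
HOST_DEVICE inline float interp1d(const float *T, int n, float x)
{
    x = fminf(fmaxf(x, 0.0f), (float)(n - 1));  // clamp the coordinate
    int i = (int)floorf(x);                     // left grid point
    if (i > n - 2) i = n - 2;                   // keep T[i + 1] in bounds
    float t = x - (float)i;                     // weight in [0, 1]
    return lerp_fma(T[i], T[i + 1], t);
}
```

Inside a kernel, each thread would call interp1d with its own coordinate. Timing this against the texture path on real data is the quickest way to find out whether the manual version is already fast enough.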