Hi David, thank you for your answer.
Ok, so I misunderstood you.
The launch index is a uint2 (X,Y pixel coordinate), and I store it for each rendered curve primitive. Then I calculate the difference between the current and the previous launch XY as signed integers, since the difference between two pixel coordinates is always an integer. The flow vector itself is then stored as a float2, of course.
Doing the subtraction in integers, instead of converting 4 values to float and doing 2 float subtractions, should be faster, shouldn’t it?
And for screen pixels, which already come in as uint2, I don’t see any case where they would require a float type.
I’m not calculating any hitpoints, I simply use the final screen pixel of a primitive, where the curve was rendered (ignoring race conditions during writing the XY for that primitive).
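To make the computation concrete, here is a simplified host-side sketch, not my actual device code. UInt2 and Float2 are stand-ins for CUDA's uint2 and float2, and pixelFlow, updatePrimitiveFlow and lastPixel are hypothetical names used only for this illustration. It also ignores the race condition mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-ins for CUDA's uint2 / float2 so the sketch compiles on the host.
struct UInt2  { std::uint32_t x, y; };
struct Float2 { float x, y; };

// Flow = current pixel minus previous pixel.
// The coordinates are cast to signed BEFORE subtracting; subtracting the
// unsigned values directly would wrap around whenever the primitive moves
// towards smaller X or Y.
inline Float2 pixelFlow(UInt2 current, UInt2 previous)
{
    const std::int32_t dx = static_cast<std::int32_t>(current.x)
                          - static_cast<std::int32_t>(previous.x);
    const std::int32_t dy = static_cast<std::int32_t>(current.y)
                          - static_cast<std::int32_t>(previous.y);
    // Only the final result is converted to float (2 conversions, not 4).
    return Float2{ static_cast<float>(dx), static_cast<float>(dy) };
}

// Hypothetical per-primitive update: computes the flow against the pixel
// stored in the previous frame, then remembers the current launch index
// for the next frame.
inline Float2 updatePrimitiveFlow(std::vector<UInt2>& lastPixel,
                                  std::size_t primIdx,
                                  UInt2 launchIndex)
{
    assert(primIdx < lastPixel.size());
    const Float2 flow = pixelFlow(launchIndex, lastPixel[primIdx]);
    lastPixel[primIdx] = launchIndex;
    return flow;
}
```

Calling pixelFlow with current (480, 461) and previous (473, 469) yields a flow of (7, -8), which is the same current-minus-previous convention as in the output further down.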
For the texcoord I use the curve parameter. I cannot simply calculate that for t-1, because the curve input was built from a streamOut buffer, which itself was generated from a simulated animation. So it’s not the same curve at all; only its topology is guaranteed to remain identical.
I did not understand how the 3x3 matrix basis you described works, so yes, please explain that.
Here is an example output:
pixel [4054] coord: 480, 461 prev: 473, 469 flow:7.000000, -8.000000
pixel [4055] coord: 480, 459 prev: 476, 461 flow:4.000000, -2.000000
pixel [4058] coord: 488, 427 prev: 486, 427 flow:2.000000, 0.000000
pixel [5608] coord: 550, 419 prev: 550, 419 flow:0.000000, 0.000000
pixel [5609] coord: 557, 405 prev: 557, 405 flow:0.000000, 0.000000
pixel [5830] coord: 641, 104 prev: 636, 103 flow:5.000000, 1.000000
Judging from the temporally denoised visual output, the pixel-difference flow vectors described above seem to work, even though they are somehow applied to all pixels of a curve primitive (if it covers more than one pixel).
Visually, the output does not seem to have artifacts with the test colors I have used so far.