Possible different method of frame generation

I was thinking of a new method of frame generation.

The method is to render at full resolution at 5 to 10 FPS, render the in-between frames at a lower resolution (for example, 360p), and then have DLSS use the full-resolution renders as a reference when applying sharpening, supersampling and whatnot to the in-between frames.

So, for example, at 120 FPS 1080p with full-resolution frames at 10 FPS: frame 1 is a full-resolution render and frames 2-12 are rendered at 360p. A variant of DLSS then uses frame 1 as a reference to upsample frames 2-12, compensating for how each of those frames differs from frame 1 and restoring sharpness and detail as best it can. The same pattern then repeats for the rest of the frames in the second.
This could make the lower-resolution frames look similar to the full-resolution frames.

With a higher full-resolution rate such as 20 FPS (still at 120 FPS total), there would be 5 in-between frames per full frame; with full resolution at 1 FPS, there would be 119 in-between frames. I’m pretty sure I've explained it more or less correctly.
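To make the frame counts above concrete, here is a minimal Python sketch of the arithmetic. The function names `in_between_frames` and `frame_schedule` are made up for illustration, and the sketch assumes the total FPS is an exact multiple of the full-resolution FPS.

```python
def in_between_frames(total_fps: int, full_res_fps: int) -> int:
    """Number of low-resolution frames that follow each full-resolution frame."""
    if full_res_fps <= 0 or total_fps % full_res_fps != 0:
        # Assumption of this sketch: the groups divide the second evenly.
        raise ValueError("total_fps must be a positive multiple of full_res_fps")
    return total_fps // full_res_fps - 1


def frame_schedule(total_fps: int, full_res_fps: int) -> list[str]:
    """Label every frame in one second as 'full' or 'low'."""
    group_size = in_between_frames(total_fps, full_res_fps) + 1
    return ["full" if i % group_size == 0 else "low" for i in range(total_fps)]


if __name__ == "__main__":
    for full_fps in (1, 10, 20):
        print(full_fps, "full FPS ->", in_between_frames(120, full_fps), "in-between frames")
```

At 120 FPS total this gives 11 in-between frames for 10 full-resolution FPS (frames 2-12 in the example), 5 for 20 FPS, and 119 for 1 FPS.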

It should work when the scene is still, but when the camera moves it will probably have issues at the edges of the frame in the direction the camera is moving. The same will happen if something comes into frame from the sides, or appears abruptly, during the in-between frames. It also depends on how fast the camera moves: the faster it moves, the more pixelation issues there will be. This is more likely to occur the lower the full-resolution FPS is and, to some extent, the lower the resolution of the in-between frames.
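A rough way to see why a lower full-resolution FPS makes the edge problem worse is to estimate how much of the frame has no matching content in the last full-resolution frame. The sketch below assumes a very simplified model, a steady horizontal pan measured in pixels per output frame, and the name `uncovered_fraction` is hypothetical.

```python
def uncovered_fraction(frame_width_px: int,
                       pan_px_per_frame: float,
                       frames_since_full: int) -> float:
    """Fraction of the frame width that has scrolled into view since the
    last full-resolution frame, under a steady horizontal pan.

    Simplified model (an assumption, not how DLSS works): the newly
    revealed strip at the leading edge has no full-resolution reference.
    """
    shift_px = abs(pan_px_per_frame) * frames_since_full
    return min(1.0, shift_px / frame_width_px)


if __name__ == "__main__":
    # 1080p output, panning 16 px per frame.
    # Last in-between frame of a group: 11 frames after the reference
    # at 10 full FPS, but only 5 frames after it at 20 full FPS.
    print(uncovered_fraction(1920, 16, 11))
    print(uncovered_fraction(1920, 16, 5))
```

Under this model the unreferenced strip grows linearly with both the pan speed and the number of in-between frames, which matches the expectation that faster camera movement and fewer full-resolution frames cause more artifacts.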

Recommended adjustment options. Full-resolution FPS: 1, 5, 10, or 30 FPS, or whatever best matches the intended total locked FPS.
In-between frames: frame rate, and resolution options ranging from 144p to anything around half of full resolution.
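To compare these options, it helps to count how many pixels are actually rendered per second. The following is a minimal sketch that assumes a 16:9 aspect ratio and counts rendered pixels only, ignoring the cost of the upscaling pass; `pixels_per_second` and `relative_cost` are illustrative names.

```python
def resolution_16_9(height: int) -> tuple[int, int]:
    """Width and height for a given vertical resolution, assuming 16:9."""
    return height * 16 // 9, height


def pixels_per_second(total_fps: int, full_res_fps: int,
                      full_height: int, low_height: int) -> int:
    """Pixels rendered per second with a mix of full and low-resolution frames."""
    full_w, full_h = resolution_16_9(full_height)
    low_w, low_h = resolution_16_9(low_height)
    low_fps = total_fps - full_res_fps  # every non-full frame is low resolution
    return full_res_fps * full_w * full_h + low_fps * low_w * low_h


def relative_cost(total_fps: int, full_res_fps: int,
                  full_height: int, low_height: int) -> float:
    """Rendered pixels relative to rendering every frame at full resolution."""
    mixed = pixels_per_second(total_fps, full_res_fps, full_height, low_height)
    all_full = pixels_per_second(total_fps, total_fps, full_height, low_height)
    return mixed / all_full


if __name__ == "__main__":
    print(pixels_per_second(120, 10, 1080, 360))
    print(relative_cost(120, 10, 1080, 360))
```

For the 120 FPS example with 10 full-resolution frames at 1080p and 360p in between, this counts 46,080,000 pixels per second against 248,832,000 for full 1080p at 120 FPS, roughly 19% of the rendered pixels.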

Hello @jayman1243 and welcome to the NVIDIA developer forums.

The DLSS super resolution models used are based on generic training on huge amounts of data from all kinds of gaming content. They are not specifically trained on the single game they are used in. In that sense your approach of using full-size reference frames to apply DLSS would not work. You would instead need to do some form of real-time fine-tuning of the model, which of course would be too costly in terms of performance.

That leaves regular computer vision algorithms to create the interpolated, up-scaled frames. That again is expensive and, as experiments have shown, not as good quality-wise as the normal DLSS-SR algorithm.

Actual frame generation, meaning creating a frame directly from the reference frame, is a different story. The models are again based on stable diffusion, GANs or similar and are trained off-line, but inference takes the upscaled (with DLSS) reference frame and motion vectors and creates new content based on them at that resolution. In a sense this would be your case of 60 FPS at 360p, upscaled and frame-generated to 120 FPS at 1080p, although 360p is a really low value.

The VP of applied research gave a nice explanation of the DLSS 3 features that covers this quite well.

Please keep these ideas coming; it is great to see such interest and innovative thinking!