Using DLSSG without IDXGISwapChain::Present()

mbucchia · March 24, 2023, 5:35am

Hello,

I was pleased to see the DLSSG SDK released, however it looks like Nvidia only released it via Streamline and did not include the source for the DLSSG plugin.

Reading through the SL documentation:

Streamline/ProgrammingGuideDLSS_G.md at main · NVIDIAGameWorks/Streamline (github.com)

DLSS-G intercepts IDXGISwapChain::Present and when using Vulkan vkQueuePresentKHR and vkAcquireNextImageKHR calls and executes them asynchronously.

While this is neat, for my application, I would like to explore using DLSSG without going through IDXGISwapChain::Present(), so I can control other processing and how the interpolated frame is presented.

I was hoping to be able to use DLSSG in a way where I submit all the required bits (color buffer, depth buffer, MV, etc) and can just “evaluate features” and either wait or retrieve the interpolated image as a texture.

I suspect this might be possible if using the DLSSG SDK directly, but it does not look like this is available. I tried looking into some of the strings/symbols in the DLL, an I can see things like DLSSG.OutputInterpolated, which I suspect would be an NGX_Parameter where one could specify a destination texture for the interpolated output…

I also suspect there might be a way to Detour IDXGISwapChain::Present() and somehow readback from the swapchain the interpolated texture before it’s been presented… but this sounds a little tedious to do.

Perhaps I am missing something fundamental here. but otherwise do you have plans to make this scenario possible (meaning more friendly) and/or release a lower level DLSSG SDK?

Thank you!

MarkusHoHo · March 27, 2023, 12:54pm

Hi @mbucchia, great to hear that you are interested in DLSS with Frame Generation!

You are correct, for the time being the Frame Generation part of DLSS is only part of Streamline.

But there are plans to add this as part of the DLSS plugin for Unreal as well as possibly integrate it fully with our Path-Tracing SDK.

But I cannot share any time-line or further details on this at this time.

Thanks!

mbucchia · March 28, 2023, 7:43am

Thank you for the reply @MarkusHoHo, however this does not answer my question at all. Will I be able to use DLSS Frame Generation without being forced to use an IDXGISwapChain? You mentioned Unreal and Path-Tracing, neither of which will solve my problem. I am looking for the traditional SDK, ie the counterpart of NVIDIA/DLSS (github.com) but for DLSSG.

Thank you.

MarkusHoHo · March 28, 2023, 9:26am

Hi again.

Right now Streamline is the only way to use DLSS Frame Generation.
The Path-Tracing SDK is an accumulation of the most relevant DL and ray-Tracing SDKs NVIDIA has to offer, including DLSS. That means once DLSS Frame Generation is released apart from Streamline, you will find it there as well.

At what time that will happen as well as how the implementation details will look like (e.g. whether you will still need to go through IDXGISwapChain or not) is something I cannot tell you more about at this time.

Please be a little bit more patient and watch our technical blog or the DLSS pages for further details and announcements.

Thanks!

mbucchia · August 16, 2023, 4:07am

Hi @MarkusHoHo,

Any update on this please? DLSSG was announce nearly a year ago now, and still no proper SDK :(

Skimming though the DLSS super resolution SDK, some of the definitions are present (eg: NVSDK_NGX_Feature_FrameGeneration). Can we please just have the header? I’ll take it even if you provide 0 documentation at this point… My scenario does not allow the use of Streamline and the hacked DXGISwapchains.

Thanks.

MarkusHoHo · August 16, 2023, 1:50pm

Hi @mbucchia,

I am sorry, but there is no change to report, DLSS Frame-Generation is tied to Streamline if you want to use a standalone SDK. Also there is no plan to release source for DLSS-FG.

I revisited your original post and will send that as a question to the DLSS team (specifically if it is possible to access the interpolated frame without going through SwapChain), if and when I get an answer I will post back here.

Thanks!

MarkusHoHo · August 17, 2023, 11:36am

Quick update, I received confirmation that usage of DLSS-FG without SwapChain::Present() is not possible today.

Sorry!

mbucchia · August 21, 2023, 8:00am

Thanks for checking.

So I’ve spend the last 3 days disassembling and Detouring your DLLs with Flight Simulator 2020, and I’ll post my findings here for anyone else who might come across this. This really should be part of your SDK, I’m not sure why you do not want to publish this information. With the code below, I am able to use frame generation at a low-level, without Streamline, which allows me to conduct experimental research on scenarios that Nvidia has not acknowledged and hopefully help advance the usage of the technology forward.

PS: Admittedly I am having some blurriness issues right now, still working through them, suspecting bad motion vectors scaling or something else dumb on my side. But I can clearly see motion interpolation happening.

Step 1)

Integrate the DLSS Super Resolution SDK, app needs to include nvsdk_ngx.h and link the corresponding lib file.
Copy the nvngx_dlss.dll from the Super Resolution SDK
Copy the nvngx_dlssg.dll from the Streamline SDK
Invoke NVSDK_NGX_D3D12_Init_with_ProjectID() to initialize DLSS and DLSSG. The presence of the nvngx_dlssg.dll in the app path will magically enable usage of DLSSG.

Step 2)

Obtain an NVSDK_NGX_Parameter* by calling NVSDK_NGX_D3D12_GetCapabilityParameters()
You will use this container to set various (undocumented) parameters for initializing and evaluating the DLSSG feature. See the snippet below for the name of the various parameters and how to set them. Look at the Streamline SDK documentation to make sense of some of them.

void interpolate(graphics::IGraphicsTexture* color,
                 XrRect2Di colorRect,
                 XrPosef cameraPose,
                 XrFovf cameraFov,
                 XrPosef prevCameraPose,
                 graphics::IGraphicsTexture* depth,
                 XrRect2Di depthRect,
                 xr::math::NearFar nearFar,
                 graphics::IGraphicsTexture* motion,
                 XrRect2Di motionRect,
                 graphics::IGraphicsTexture* outputInterpolated,
                 graphics::IGraphicsTexture* outputReal,
                 bool reset = false) {
    D3D12ReusableCommandList commandList = getCommandList();

    if (!m_dlssHandle) {
        // Common properties. We know that our usage of DLSSG is bound to a given swapchain and we never
        // redimension our swapchains or switch format, therefore we don't need to monitor for changes.
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, NVSDK_NGX_Parameter_CreationNodeMask, 1);
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, NVSDK_NGX_Parameter_VisibilityNodeMask, 1);
        NVSDK_NGX_Parameter_SetI(
            m_ngxParameters, NVSDK_NGX_Parameter_Width, outputInterpolated->getInfo().width);
        NVSDK_NGX_Parameter_SetI(
            m_ngxParameters, NVSDK_NGX_Parameter_Height, outputInterpolated->getInfo().height);
        NVSDK_NGX_Parameter_SetUI(
            m_ngxParameters, "DLSSG.BackbufferFormat", (unsigned int)outputInterpolated->getInfo().format);

        // These are reverse-engineered from MSFS flatscreen mode.
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "Enable.OFA", 1);
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.EnableInterp", 1);
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DynamicResolution", 0);
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.InternalHeight", 0);
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.InternalWidth", 0);

        // Optional(?) signaling callbacks.
        // DLSSG.SyncWaitCallback{Data}
        // DLSSG.SyncSignalCallback{Data}
        // DLSSG.QueueSubmitCallback{Data}
        // DLSSG.SyncSignalOnlyCallback{Data}
        // DLSSG.SyncWaitOnlyCallback{Data}
        // DLSSG.SyncFlushCallback{Data}

        CHECK_NGXCMD(NVSDK_NGX_D3D12_CreateFeature(
            commandList.commandList.Get(), NVSDK_NGX_Feature_FrameGeneration, m_ngxParameters, &m_dlssHandle));
    }

    // Per DLSS documentation, section 3.3
    // Motion vectors should be DXGI_FORMAT_R16G16_FLOAT, but somehow DLSSG is not complaining upon getting
    // DXGI_FORMAT_R16G16B16A16_FLOAT. Yay~

    // See DLSS documentation, section 3.4
    D3D12_RESOURCE_BARRIER barrier[5]{};
    barrier[0].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barrier[0].Transition.pResource = color->getNativeTexture<graphics::D3D12>();
    barrier[0].Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
    barrier[0].Transition.StateAfter = D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE;
    barrier[0].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barrier[1].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barrier[1].Transition.pResource = depth->getNativeTexture<graphics::D3D12>();
    barrier[1].Transition.StateBefore = D3D12_RESOURCE_STATE_DEPTH_WRITE;
    barrier[1].Transition.StateAfter = D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE;
    barrier[1].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barrier[2].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barrier[2].Transition.pResource = motion->getNativeTexture<graphics::D3D12>();
    barrier[2].Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
    barrier[2].Transition.StateAfter = D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE;
    barrier[2].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barrier[3].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barrier[3].Transition.pResource = outputInterpolated->getNativeTexture<graphics::D3D12>();
    barrier[3].Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
    barrier[3].Transition.StateAfter = D3D12_RESOURCE_STATE_UNORDERED_ACCESS;
    barrier[3].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barrier[4].Transition.pResource = outputReal->getNativeTexture<graphics::D3D12>();
    barrier[4].Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
    barrier[4].Transition.StateAfter = D3D12_RESOURCE_STATE_UNORDERED_ACCESS;
    barrier[4].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    commandList.commandList->ResourceBarrier((UINT)std::size(barrier), barrier);

    // The feature seems to submit/reuse the command list, which means we must specify the ID3D12CommandQueue
    // and ID3D12CommandAllocator respectively.
    NVSDK_NGX_Parameter_SetVoidPointer(
        m_ngxParameters, "DLSSG.CmdQueue", m_device->getNativeContext<graphics::D3D12>());
    NVSDK_NGX_Parameter_SetVoidPointer(m_ngxParameters, "DLSSG.CmdAlloc", commandList.allocator.Get());

    // The input seems to be synchronized on the CPU via a fence signaled immediately after invoking the
    // feature.
    ResetEvent(m_fenceEventForFeature.get());
    NVSDK_NGX_Parameter_SetVoidPointer(m_ngxParameters, "DLSSG.FenceEvent", m_fenceEventForFeature.get());

    // Input buffers.
    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.HUDLess", color->getNativeTexture<graphics::D3D12>());
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.HUDLessSubrectBaseX", colorRect.offset.x);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.HUDLessSubrectBaseY", colorRect.offset.y);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.HUDLessSubrectWidth", colorRect.extent.width);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.HUDLessSubrectHeight", colorRect.extent.height);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.ColorBuffersHDR", 0);
    // MSFS flatscreen mode also sets this one to the same as HUDLess.
    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.Backbuffer", color->getNativeTexture<graphics::D3D12>());

    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.MVecs", motion->getNativeTexture<graphics::D3D12>());
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MVecsSubrectBaseX", motionRect.offset.x);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MVecsSubrectBaseY", motionRect.offset.y);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MVecsSubrectWidth", motionRect.extent.width);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MVecsSubrectHeight", motionRect.extent.height);
#if 0
    NVSDK_NGX_Parameter_SetF(
        m_ngxParameters, "DLSSG.MvecScaleX", (float)colorRect.extent.width / motionRect.extent.width);
    NVSDK_NGX_Parameter_SetF(
        m_ngxParameters, "DLSSG.MvecScaleY", (float)colorRect.extent.height / motionRect.extent.height);
#else
    // TODO: This is what MSFS flatscreen mode does, this looks incorrect though?
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.MvecScaleX", (float)motionRect.extent.width);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.MvecScaleY", (float)motionRect.extent.height);
#endif
    // TODO: Set run_low_res_mvec_pass?
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MvecDilated", 0);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.CameraMotionIncluded", 1);

    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.Depth", depth->getNativeTexture<graphics::D3D12>());
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DepthSubrectBaseX", depthRect.offset.x);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DepthSubrectBaseY", depthRect.offset.y);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DepthSubrectWidth", depthRect.extent.width);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DepthSubrectHeight", depthRect.extent.height);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DepthInverted", nearFar.Near > nearFar.Far);
    // This seems to be needed if you want additional HUD to be blitted after the generation.
    NVSDK_NGX_Parameter_SetD3d12Resource(m_ngxParameters, "DLSSG.UI", nullptr);

    // Output buffers.
    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.OutputInterpolated", outputInterpolated->getNativeTexture<graphics::D3D12>());
    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.OutputReal", outputReal->getNativeTexture<graphics::D3D12>());

    // Camera properties
    const auto camera = xr::math::LoadXrPose(cameraPose);
    // Use our modified version of the Constants container so we can reuse Streamline SDK helpers such
    // as recalculateCameraMatrices().
    sl::Constants cameraConstants(camera, xr::math::LoadXrPose(prevCameraPose));
    recalculateCameraMatrices(cameraConstants);
    NVSDK_NGX_Parameter_SetVoidPointer(
        m_ngxParameters, "DLSSG.CameraViewToClip", &cameraConstants.cameraViewToClip);
    NVSDK_NGX_Parameter_SetVoidPointer(
        m_ngxParameters, "DLSSG.ClipToCameraView", &cameraConstants.clipToCameraView);
    NVSDK_NGX_Parameter_SetVoidPointer(
        m_ngxParameters, "DLSSG.ClipToPrevClip", &cameraConstants.clipToPrevClip);
    NVSDK_NGX_Parameter_SetVoidPointer(
        m_ngxParameters, "DLSSG.PrevClipToClip", &cameraConstants.prevClipToClip);

    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraPosX", cameraConstants.cameraPos.x);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraPosY", cameraConstants.cameraPos.y);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraPosZ", cameraConstants.cameraPos.z);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraRightX", cameraConstants.cameraRight.x);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraRightY", cameraConstants.cameraRight.y);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraRightZ", cameraConstants.cameraRight.z);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraUpX", cameraConstants.cameraUp.x);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraUpY", cameraConstants.cameraUp.y);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraUpZ", cameraConstants.cameraUp.z);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraForwardX", cameraConstants.cameraFwd.x);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraForwardY", cameraConstants.cameraFwd.y);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraForwardZ", cameraConstants.cameraFwd.z);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraNear", nearFar.Near);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraFar", nearFar.Far);
    NVSDK_NGX_Parameter_SetF(
        m_ngxParameters, "DLSSG.CameraFOV", std::abs(cameraFov.angleLeft) + std::abs(cameraFov.angleRight));
    NVSDK_NGX_Parameter_SetF(
        m_ngxParameters, "DLSSG.CameraAspectRatio", (float)colorRect.extent.width / colorRect.extent.height);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.OrthoProjection", 0);

    // Optional per SL documentation 2.10
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraPinholeOffsetX", 0);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraPinholeOffsetY", 0);

    // TODO: Our layer does not control/know about these.
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.JitterOffsetX", 0);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.JitterOffsetY", 0);

    // Miscellaneous. All values are reverse-engineered from MSFS flatscreen mode.
    NVSDK_NGX_Parameter_SetI(m_ngxParameters, "DLSSG.NumFrames", 1);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.NotRenderingGameFrames", 0);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MultiFrameCount", 1);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MultiFrameIndex", 1);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DynamicResolution", 0);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.InternalHeight", 0);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.InternalWidth", 0);
    // Whatever this one does, it looks really important w.r.t command list management, and without it the
    // feature evaluation will cause an invalid D3D12 operation.
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.IsRecording", 1);

    // Upon frame discontinuity, we flush the generator.
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.Reset", reset);

    CHECK_NGXCMD(
        NVSDK_NGX_D3D12_EvaluateFeature(commandList.commandList.Get(), m_dlssHandle, m_ngxParameters, nullptr));

    std::swap(barrier[0].Transition.StateBefore, barrier[0].Transition.StateAfter);
    std::swap(barrier[1].Transition.StateBefore, barrier[1].Transition.StateAfter);
    std::swap(barrier[2].Transition.StateBefore, barrier[2].Transition.StateAfter);
    std::swap(barrier[3].Transition.StateBefore, barrier[3].Transition.StateAfter);
    std::swap(barrier[4].Transition.StateBefore, barrier[4].Transition.StateAfter);
    commandList.commandList->ResourceBarrier((UINT)std::size(barrier), barrier);

    submitCommandList(std::move(commandList));

    // Signal the input. I'm not really sure why this is synchronized on the CPU via an event.
    // This is how MSFS flatscreen mode seems to do it.
    m_device->getNativeContext<graphics::D3D12>()->Signal(m_inputFence.Get(), ++m_inputFenceValue);
    m_inputFence->SetEventOnCompletion(m_inputFenceValue, m_fenceEventForFeature.get());
}

I will probably clean this up eventually and make a small GitHub repository with a header file for all the parameter names and helper functions. Something someone can drop into your DLSS Super Resolution SDK and just use DLSS Frame Generation from there.

It took many hours of API tracing and debugger to get to where I am, here is an example showing the tedious task of instrumenting Flight Simulator 2020 at runtime to extract the parameter names and using disassembly and debuggers too try to make sense of them, especially the command list flow and how DLSSG is submitting then reusing it on the go. I am also still not quite sure about the FenceEvent waited on the CPU(?) by DLSSG, in particular the performance implications of that.

I don’t think developers should have to go through this pain to use the technology, just because they have requirements that diverge from what Streamline offers. Streamline is great, it really is, but it’s not a one-size-fits-all solution. Please consider that.

I will talk to you soon once I start having more questions (and we can pretend that I am using Streamline SDK ;) for my application).

belakampis1 · June 13, 2024, 11:22pm

Hi, did you ever make a github repo with this work?

I need to do the same for Vulkan.

Having to use Streamline just for FrameGen when I don’t need it / use it for anything else DLSS-related (DLSS-SR and DLSS-RR), or even NRD (which Nvidia started implementing but then never finished, and now are in the process of removing it entirely), seems way overkill to me.

I either want to use Streamline for everything, or nothing, not some strange combination of features and dependencies. This whole SL library is ill-conceived and badly maintained and actually made things harder to debug as it’s just another layer of interposers and wrappers around the actual DLSS libraries and DLL function calls.

mbucchia · June 14, 2024, 6:28am

I did not. My project was VR related and didn’t go anywhere due to the really bad performance I observed in stereo and at the resolutions I wanted.
While the approach above did yield “frame generation”, there was also a noticeable amount of ghosting which I never managed to get rid off - probably one of the properties not set correctly or a scale factor off. Without proper documentation of the low-level API, this sort of issue would take too long to investigate.

I believe FSR3 would be a better solution, since their code is actually truly open AFAICT.

belakampis1 · June 14, 2024, 6:43pm

I hear ya. I may end up using FSR framegen instead anyway, but Nvidia has some dedicated optical flow hardware in their 4000 series (or at least optimized compute stack), and AMD still several steps behind everything Nvidia does, in terms of quality and features.

As much as people complain about framegen, I think it’s the future, including of VR. It makes no sense trying to render above 120 FPS especially. But we aren’t there yet. Anyway, lots of people have issues with ghosting / smearing with AppSW (and SSW on Virtual Desktop using Qualcomm’s optical flow support), but if there’s a choice between 72 FPS and 60 FPS upscaled to 120 FPS at 120 Hz, I’d much rather 60 FPS extrapolated to 120 personally.

I think you probably just had a bug in your motion vector scale or something. Framegen is quite mature at this point and Nvidia keeps updating it for 2D games, there’s absolutely nothing special about increasing animation framerate in VR vs 2D. If anything, doing it in VR makes even more sense because VR is far more demanding on GPUs and game engines.

Still, I think you’re right, I’ll go with AMD’s FSR framegen instead to begin with and see how that goes.

And since Nvidia abandoned NRD integration in Streamline, I have zero interest in using it now. What a pointless wrapper library if you can’t use it as a one-stop shop. SL wasted more of my time debugging why NRD didn’t work than directly integrating NRD did (which was a pain originally). And it doesn’t support DLSS-RR yet either, which is another frankly scandalous thing given how many AAA games shipped with it already and us poor indie VR studios get the cold shoulder.

mbucchia · June 14, 2024, 10:02pm

DLSS and the temporal techniques are only going to be as good as the motion vectors provided. The Optical Flow engine for estimation is still at 4x4 block resolution and the runtimes require significant downscaling of the input (the NVOF implementation I wrote for WMR required to downscale inputs to 1280x1280 in order to compute motion vectors within a 5ms time budget on a 4000 series). So that is motion vectors that will be 1/64th of your full resolution (assuming a 2560x2560 per eye resolution).
These techniques are meant to be used with app-provided motion vectors, not estimated ones. At lower motion vectors granularity, you are going to get the same issues as the ones you mentioned for SSW, that is “waving” because the motion propagation happens on 8x8 blocks.

Reminder: DLSSG is frame interpolation. I don’t know if I would say frame interpolation is the future of VR, especially as we’re seeing a move towards high(er)-latency transport (wireless or even cloud).

Interpolation as opposed to extrapolation makes you wait for 1 extra app frame (so two headset frame periods when you are doubling your framerate).

In addition, the performance I observed with DLSSG would add another full frame of latency. Add your own optical flow estimation that’s another full frame. We’re now at +3 or +4 headset frames latency, vs 1 frame for a traditional extrapolation-based technique (ASW). On a wired headset this is fine, but on a wireless or cloud solution with added round-trip time, whether it’s acceptable is yet to be confirmed. Plus most people who want 120 Hz, they want it for the lower latency. Adding +4 frames of latency isn’t going to work.

From the measurements I made for VR, DLSSG would barely allow me to upscale framerate from 45 to 90 FPS at a resolution of 2200x2200 per eye. Any resolution higher, the computational time of DLSSG would make me miss 90 Hz deadlines. This (stereo) resolution isn’t OK for 2024. I never got any response from Nvidia: Frame Generation performance (execution time) - Gaming and Visualization Technologies / DLSS - NVIDIA Developer Forums

deniz343 · August 12, 2024, 4:57pm

I had an idea to use Streamline solely as a Proxy DLL and piggyback sl.common’s slPresentHook to be able to get both the real and interpolated frames. I haven’t been successful in saving or using the back buffer in any way possible but my preliminary evaluation indicated that even with this method, ::Present doesn’t seem to be called between real frames. I mean for 30 FPS rendering with FG, I expected 16ms gaps between calls to ::Present. That wasn’t the case so I’m not really sure how DLSSG actually presents the interpolated frame.

With FSR3 that was exactly the case. So theoretically, it’s possible to intercept FSR3 interpolated frames without messing with FSR3 library. But it’s open source anyway so this doesn’t have any advantage for FSR3 alone.

I was trying to compare FSR3 and DLSSG frame-by-frame using over-the-counter quality metrics but without access to interpolated frames, I don’t know how that would be possible.

Open to anny suggestions but I just wanted to share my experiences.

Topic		Replies	Views
DLSS 3 and Frame Generation on Linux Linux	44	10402	April 4, 2025
Unreal Engine 5.5 <> DLSS 3.7.20 compilation error DLSS dlss , unreal-engine	90	10586	April 8, 2025
Nvstreammux (new) plugin is broken in DS 6.2 release DeepStream SDK	22	1427	May 22, 2023
Enabling camera on Jetson TX1 board Jetson TX1	73	49447	March 12, 2018
DLSS in Unreal Engine 5.1 and VR DLSS unreal-5 , unreal-engine	73	10711	January 9, 2025
Nvargus-daemon crashing during recording Jetson TX2 camera , gstreamer	18	202	December 17, 2024
Deepstream-test5 for usb camera fail DeepStream SDK	16	1195	September 13, 2023
Deepstream-app custom CSI v4l2src Camera frame drop DeepStream SDK	6	1264	October 12, 2021
Poor multithreading performance compared to DX12 Vulkan	17	5432	September 29, 2020
[REOPEN] Nvstreamdemux does not copy obj_meta parent structure to src pad DeepStream SDK deepstream	39	1461	February 26, 2024

Using DLSSG without IDXGISwapChain::Present()

Related topics