Using DLSSG without IDXGISwapChain::Present()

Hello,

I was pleased to see the DLSSG SDK released, however it looks like Nvidia only released it via Streamline and did not include the source for the DLSSG plugin.

Reading through the SL documentation:

Streamline/ProgrammingGuideDLSS_G.md at main · NVIDIAGameWorks/Streamline (github.com)

DLSS-G intercepts IDXGISwapChain::Present and when using Vulkan vkQueuePresentKHR and vkAcquireNextImageKHR calls and executes them asynchronously.

While this is neat, for my application, I would like to explore using DLSSG without going through IDXGISwapChain::Present(), so I can control other processing and how the interpolated frame is presented.

I was hoping to be able to use DLSSG in a way where I submit all the required bits (color buffer, depth buffer, MV, etc) and can just “evaluate features” and either wait or retrieve the interpolated image as a texture.

I suspect this might be possible if using the DLSSG SDK directly, but it does not look like this is available. I tried looking into some of the strings/symbols in the DLL, an I can see things like DLSSG.OutputInterpolated, which I suspect would be an NGX_Parameter where one could specify a destination texture for the interpolated output…

I also suspect there might be a way to Detour IDXGISwapChain::Present() and somehow readback from the swapchain the interpolated texture before it’s been presented… but this sounds a little tedious to do.

Perhaps I am missing something fundamental here. but otherwise do you have plans to make this scenario possible (meaning more friendly) and/or release a lower level DLSSG SDK?

Thank you!

Hi @mbucchia, great to hear that you are interested in DLSS with Frame Generation!

You are correct, for the time being the Frame Generation part of DLSS is only part of Streamline.

But there are plans to add this as part of the DLSS plugin for Unreal as well as possibly integrate it fully with our Path-Tracing SDK.

But I cannot share any time-line or further details on this at this time.

Thanks!

Thank you for the reply @MarkusHoHo, however this does not answer my question at all. Will I be able to use DLSS Frame Generation without being forced to use an IDXGISwapChain? You mentioned Unreal and Path-Tracing, neither of which will solve my problem. I am looking for the traditional SDK, ie the counterpart of NVIDIA/DLSS (github.com) but for DLSSG.

Thank you.

Hi again.

Right now Streamline is the only way to use DLSS Frame Generation.
The Path-Tracing SDK is an accumulation of the most relevant DL and ray-Tracing SDKs NVIDIA has to offer, including DLSS. That means once DLSS Frame Generation is released apart from Streamline, you will find it there as well.

At what time that will happen as well as how the implementation details will look like (e.g. whether you will still need to go through IDXGISwapChain or not) is something I cannot tell you more about at this time.

Please be a little bit more patient and watch our technical blog or the DLSS pages for further details and announcements.

Thanks!

Hi @MarkusHoHo,

Any update on this please? DLSSG was announce nearly a year ago now, and still no proper SDK :(

Skimming though the DLSS super resolution SDK, some of the definitions are present (eg: NVSDK_NGX_Feature_FrameGeneration). Can we please just have the header? I’ll take it even if you provide 0 documentation at this point… My scenario does not allow the use of Streamline and the hacked DXGISwapchains.

Thanks.

Hi @mbucchia,

I am sorry, but there is no change to report, DLSS Frame-Generation is tied to Streamline if you want to use a standalone SDK. Also there is no plan to release source for DLSS-FG.

I revisited your original post and will send that as a question to the DLSS team (specifically if it is possible to access the interpolated frame without going through SwapChain), if and when I get an answer I will post back here.

Thanks!

Quick update, I received confirmation that usage of DLSS-FG without SwapChain::Present() is not possible today.

Sorry!

Thanks for checking.

So I’ve spend the last 3 days disassembling and Detouring your DLLs with Flight Simulator 2020, and I’ll post my findings here for anyone else who might come across this. This really should be part of your SDK, I’m not sure why you do not want to publish this information. With the code below, I am able to use frame generation at a low-level, without Streamline, which allows me to conduct experimental research on scenarios that Nvidia has not acknowledged and hopefully help advance the usage of the technology forward.

PS: Admittedly I am having some blurriness issues right now, still working through them, suspecting bad motion vectors scaling or something else dumb on my side. But I can clearly see motion interpolation happening.

Step 1)

  • Integrate the DLSS Super Resolution SDK, app needs to include nvsdk_ngx.h and link the corresponding lib file.
  • Copy the nvngx_dlss.dll from the Super Resolution SDK
  • Copy the nvngx_dlssg.dll from the Streamline SDK
  • Invoke NVSDK_NGX_D3D12_Init_with_ProjectID() to initialize DLSS and DLSSG. The presence of the nvngx_dlssg.dll in the app path will magically enable usage of DLSSG.

Step 2)

  • Obtain an NVSDK_NGX_Parameter* by calling NVSDK_NGX_D3D12_GetCapabilityParameters()
  • You will use this container to set various (undocumented) parameters for initializing and evaluating the DLSSG feature. See the snippet below for the name of the various parameters and how to set them. Look at the Streamline SDK documentation to make sense of some of them.
void interpolate(graphics::IGraphicsTexture* color,
                 XrRect2Di colorRect,
                 XrPosef cameraPose,
                 XrFovf cameraFov,
                 XrPosef prevCameraPose,
                 graphics::IGraphicsTexture* depth,
                 XrRect2Di depthRect,
                 xr::math::NearFar nearFar,
                 graphics::IGraphicsTexture* motion,
                 XrRect2Di motionRect,
                 graphics::IGraphicsTexture* outputInterpolated,
                 graphics::IGraphicsTexture* outputReal,
                 bool reset = false) {
    D3D12ReusableCommandList commandList = getCommandList();

    if (!m_dlssHandle) {
        // Common properties. We know that our usage of DLSSG is bound to a given swapchain and we never
        // redimension our swapchains or switch format, therefore we don't need to monitor for changes.
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, NVSDK_NGX_Parameter_CreationNodeMask, 1);
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, NVSDK_NGX_Parameter_VisibilityNodeMask, 1);
        NVSDK_NGX_Parameter_SetI(
            m_ngxParameters, NVSDK_NGX_Parameter_Width, outputInterpolated->getInfo().width);
        NVSDK_NGX_Parameter_SetI(
            m_ngxParameters, NVSDK_NGX_Parameter_Height, outputInterpolated->getInfo().height);
        NVSDK_NGX_Parameter_SetUI(
            m_ngxParameters, "DLSSG.BackbufferFormat", (unsigned int)outputInterpolated->getInfo().format);

        // These are reverse-engineered from MSFS flatscreen mode.
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "Enable.OFA", 1);
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.EnableInterp", 1);
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DynamicResolution", 0);
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.InternalHeight", 0);
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.InternalWidth", 0);

        // Optional(?) signaling callbacks.
        // DLSSG.SyncWaitCallback{Data}
        // DLSSG.SyncSignalCallback{Data}
        // DLSSG.QueueSubmitCallback{Data}
        // DLSSG.SyncSignalOnlyCallback{Data}
        // DLSSG.SyncWaitOnlyCallback{Data}
        // DLSSG.SyncFlushCallback{Data}

        CHECK_NGXCMD(NVSDK_NGX_D3D12_CreateFeature(
            commandList.commandList.Get(), NVSDK_NGX_Feature_FrameGeneration, m_ngxParameters, &m_dlssHandle));
    }

    // Per DLSS documentation, section 3.3
    // Motion vectors should be DXGI_FORMAT_R16G16_FLOAT, but somehow DLSSG is not complaining upon getting
    // DXGI_FORMAT_R16G16B16A16_FLOAT. Yay~

    // See DLSS documentation, section 3.4
    D3D12_RESOURCE_BARRIER barrier[5]{};
    barrier[0].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barrier[0].Transition.pResource = color->getNativeTexture<graphics::D3D12>();
    barrier[0].Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
    barrier[0].Transition.StateAfter = D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE;
    barrier[0].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barrier[1].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barrier[1].Transition.pResource = depth->getNativeTexture<graphics::D3D12>();
    barrier[1].Transition.StateBefore = D3D12_RESOURCE_STATE_DEPTH_WRITE;
    barrier[1].Transition.StateAfter = D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE;
    barrier[1].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barrier[2].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barrier[2].Transition.pResource = motion->getNativeTexture<graphics::D3D12>();
    barrier[2].Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
    barrier[2].Transition.StateAfter = D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE;
    barrier[2].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barrier[3].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barrier[3].Transition.pResource = outputInterpolated->getNativeTexture<graphics::D3D12>();
    barrier[3].Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
    barrier[3].Transition.StateAfter = D3D12_RESOURCE_STATE_UNORDERED_ACCESS;
    barrier[3].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barrier[4].Transition.pResource = outputReal->getNativeTexture<graphics::D3D12>();
    barrier[4].Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
    barrier[4].Transition.StateAfter = D3D12_RESOURCE_STATE_UNORDERED_ACCESS;
    barrier[4].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    commandList.commandList->ResourceBarrier((UINT)std::size(barrier), barrier);

    // The feature seems to submit/reuse the command list, which means we must specify the ID3D12CommandQueue
    // and ID3D12CommandAllocator respectively.
    NVSDK_NGX_Parameter_SetVoidPointer(
        m_ngxParameters, "DLSSG.CmdQueue", m_device->getNativeContext<graphics::D3D12>());
    NVSDK_NGX_Parameter_SetVoidPointer(m_ngxParameters, "DLSSG.CmdAlloc", commandList.allocator.Get());

    // The input seems to be synchronized on the CPU via a fence signaled immediately after invoking the
    // feature.
    ResetEvent(m_fenceEventForFeature.get());
    NVSDK_NGX_Parameter_SetVoidPointer(m_ngxParameters, "DLSSG.FenceEvent", m_fenceEventForFeature.get());

    // Input buffers.
    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.HUDLess", color->getNativeTexture<graphics::D3D12>());
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.HUDLessSubrectBaseX", colorRect.offset.x);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.HUDLessSubrectBaseY", colorRect.offset.y);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.HUDLessSubrectWidth", colorRect.extent.width);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.HUDLessSubrectHeight", colorRect.extent.height);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.ColorBuffersHDR", 0);
    // MSFS flatscreen mode also sets this one to the same as HUDLess.
    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.Backbuffer", color->getNativeTexture<graphics::D3D12>());

    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.MVecs", motion->getNativeTexture<graphics::D3D12>());
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MVecsSubrectBaseX", motionRect.offset.x);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MVecsSubrectBaseY", motionRect.offset.y);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MVecsSubrectWidth", motionRect.extent.width);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MVecsSubrectHeight", motionRect.extent.height);
#if 0
    NVSDK_NGX_Parameter_SetF(
        m_ngxParameters, "DLSSG.MvecScaleX", (float)colorRect.extent.width / motionRect.extent.width);
    NVSDK_NGX_Parameter_SetF(
        m_ngxParameters, "DLSSG.MvecScaleY", (float)colorRect.extent.height / motionRect.extent.height);
#else
    // TODO: This is what MSFS flatscreen mode does, this looks incorrect though?
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.MvecScaleX", (float)motionRect.extent.width);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.MvecScaleY", (float)motionRect.extent.height);
#endif
    // TODO: Set run_low_res_mvec_pass?
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MvecDilated", 0);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.CameraMotionIncluded", 1);

    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.Depth", depth->getNativeTexture<graphics::D3D12>());
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DepthSubrectBaseX", depthRect.offset.x);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DepthSubrectBaseY", depthRect.offset.y);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DepthSubrectWidth", depthRect.extent.width);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DepthSubrectHeight", depthRect.extent.height);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DepthInverted", nearFar.Near > nearFar.Far);
    // This seems to be needed if you want additional HUD to be blitted after the generation.
    NVSDK_NGX_Parameter_SetD3d12Resource(m_ngxParameters, "DLSSG.UI", nullptr);

    // Output buffers.
    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.OutputInterpolated", outputInterpolated->getNativeTexture<graphics::D3D12>());
    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.OutputReal", outputReal->getNativeTexture<graphics::D3D12>());

    // Camera properties
    const auto camera = xr::math::LoadXrPose(cameraPose);
    // Use our modified version of the Constants container so we can reuse Streamline SDK helpers such
    // as recalculateCameraMatrices().
    sl::Constants cameraConstants(camera, xr::math::LoadXrPose(prevCameraPose));
    recalculateCameraMatrices(cameraConstants);
    NVSDK_NGX_Parameter_SetVoidPointer(
        m_ngxParameters, "DLSSG.CameraViewToClip", &cameraConstants.cameraViewToClip);
    NVSDK_NGX_Parameter_SetVoidPointer(
        m_ngxParameters, "DLSSG.ClipToCameraView", &cameraConstants.clipToCameraView);
    NVSDK_NGX_Parameter_SetVoidPointer(
        m_ngxParameters, "DLSSG.ClipToPrevClip", &cameraConstants.clipToPrevClip);
    NVSDK_NGX_Parameter_SetVoidPointer(
        m_ngxParameters, "DLSSG.PrevClipToClip", &cameraConstants.prevClipToClip);

    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraPosX", cameraConstants.cameraPos.x);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraPosY", cameraConstants.cameraPos.y);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraPosZ", cameraConstants.cameraPos.z);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraRightX", cameraConstants.cameraRight.x);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraRightY", cameraConstants.cameraRight.y);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraRightZ", cameraConstants.cameraRight.z);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraUpX", cameraConstants.cameraUp.x);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraUpY", cameraConstants.cameraUp.y);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraUpZ", cameraConstants.cameraUp.z);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraForwardX", cameraConstants.cameraFwd.x);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraForwardY", cameraConstants.cameraFwd.y);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraForwardZ", cameraConstants.cameraFwd.z);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraNear", nearFar.Near);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraFar", nearFar.Far);
    NVSDK_NGX_Parameter_SetF(
        m_ngxParameters, "DLSSG.CameraFOV", std::abs(cameraFov.angleLeft) + std::abs(cameraFov.angleRight));
    NVSDK_NGX_Parameter_SetF(
        m_ngxParameters, "DLSSG.CameraAspectRatio", (float)colorRect.extent.width / colorRect.extent.height);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.OrthoProjection", 0);

    // Optional per SL documentation 2.10
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraPinholeOffsetX", 0);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraPinholeOffsetY", 0);

    // TODO: Our layer does not control/know about these.
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.JitterOffsetX", 0);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.JitterOffsetY", 0);

    // Miscellaneous. All values are reverse-engineered from MSFS flatscreen mode.
    NVSDK_NGX_Parameter_SetI(m_ngxParameters, "DLSSG.NumFrames", 1);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.NotRenderingGameFrames", 0);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MultiFrameCount", 1);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MultiFrameIndex", 1);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DynamicResolution", 0);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.InternalHeight", 0);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.InternalWidth", 0);
    // Whatever this one does, it looks really important w.r.t command list management, and without it the
    // feature evaluation will cause an invalid D3D12 operation.
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.IsRecording", 1);

    // Upon frame discontinuity, we flush the generator.
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.Reset", reset);

    CHECK_NGXCMD(
        NVSDK_NGX_D3D12_EvaluateFeature(commandList.commandList.Get(), m_dlssHandle, m_ngxParameters, nullptr));

    std::swap(barrier[0].Transition.StateBefore, barrier[0].Transition.StateAfter);
    std::swap(barrier[1].Transition.StateBefore, barrier[1].Transition.StateAfter);
    std::swap(barrier[2].Transition.StateBefore, barrier[2].Transition.StateAfter);
    std::swap(barrier[3].Transition.StateBefore, barrier[3].Transition.StateAfter);
    std::swap(barrier[4].Transition.StateBefore, barrier[4].Transition.StateAfter);
    commandList.commandList->ResourceBarrier((UINT)std::size(barrier), barrier);

    submitCommandList(std::move(commandList));

    // Signal the input. I'm not really sure why this is synchronized on the CPU via an event.
    // This is how MSFS flatscreen mode seems to do it.
    m_device->getNativeContext<graphics::D3D12>()->Signal(m_inputFence.Get(), ++m_inputFenceValue);
    m_inputFence->SetEventOnCompletion(m_inputFenceValue, m_fenceEventForFeature.get());
}

I will probably clean this up eventually and make a small GitHub repository with a header file for all the parameter names and helper functions. Something someone can drop into your DLSS Super Resolution SDK and just use DLSS Frame Generation from there.

It took many hours of API tracing and debugger to get to where I am, here is an example showing the tedious task of instrumenting Flight Simulator 2020 at runtime to extract the parameter names and using disassembly and debuggers too try to make sense of them, especially the command list flow and how DLSSG is submitting then reusing it on the go. I am also still not quite sure about the FenceEvent waited on the CPU(?) by DLSSG, in particular the performance implications of that.

I don’t think developers should have to go through this pain to use the technology, just because they have requirements that diverge from what Streamline offers. Streamline is great, it really is, but it’s not a one-size-fits-all solution. Please consider that.

I will talk to you soon once I start having more questions (and we can pretend that I am using Streamline SDK ;) for my application).

1 Like