Using DLSSG without IDXGISwapChain::Present()

Thanks for checking.

So I’ve spent the last 3 days disassembling and Detouring your DLLs with Flight Simulator 2020, and I’ll post my findings here for anyone else who might come across this. This really should be part of your SDK; I’m not sure why you do not want to publish this information. With the code below, I am able to use frame generation at a low level, without Streamline, which allows me to conduct experimental research on scenarios that Nvidia has not acknowledged and hopefully help advance the usage of the technology.

PS: Admittedly I am having some blurriness issues right now and am still working through them; I suspect bad motion vector scaling or something else dumb on my side. But I can clearly see motion interpolation happening.

Step 1)

  • Integrate the DLSS Super Resolution SDK: the app needs to include nvsdk_ngx.h and link the corresponding lib file.
  • Copy the nvngx_dlss.dll from the Super Resolution SDK
  • Copy the nvngx_dlssg.dll from the Streamline SDK
  • Invoke NVSDK_NGX_D3D12_Init_with_ProjectID() to initialize DLSS and DLSSG. The presence of the nvngx_dlssg.dll in the app path will magically enable usage of DLSSG.
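Since Step 1 depends on both DLLs sitting in the app path, it can be handy to check for them before initializing NGX. Here is a minimal sketch of such a check. The helper name findMissingNgxDlls and the idea of probing with std::ifstream are my own assumptions, not something from the SDK; only the two file names come from the steps above.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of any NVIDIA SDK): returns the names of the
// NGX DLLs from Step 1 that cannot be opened inside appDir.
std::vector<std::string> findMissingNgxDlls(const std::string& appDir) {
    assert(!appDir.empty());
    // File names as given in Step 1.
    const char* const required[] = {"nvngx_dlss.dll", "nvngx_dlssg.dll"};
    std::vector<std::string> missing;
    for (const char* name : required) {
        // Opening the file for reading is a portable existence check.
        std::ifstream probe(appDir + "/" + name, std::ios::binary);
        if (!probe.is_open()) {
            missing.push_back(name);
        }
    }
    return missing;
}
```

If the returned list is non-empty, the app can log the missing names and skip frame generation instead of failing later at feature creation.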

Step 2)

  • Obtain an NVSDK_NGX_Parameter* by calling NVSDK_NGX_D3D12_GetCapabilityParameters()
  • You will use this container to set various (undocumented) parameters for initializing and evaluating the DLSSG feature. See the snippet below for the names of the various parameters and how to set them. Look at the Streamline SDK documentation to make sense of some of them.

void interpolate(graphics::IGraphicsTexture* color,
                 XrRect2Di colorRect,
                 XrPosef cameraPose,
                 XrFovf cameraFov,
                 XrPosef prevCameraPose,
                 graphics::IGraphicsTexture* depth,
                 XrRect2Di depthRect,
                 xr::math::NearFar nearFar,
                 graphics::IGraphicsTexture* motion,
                 XrRect2Di motionRect,
                 graphics::IGraphicsTexture* outputInterpolated,
                 graphics::IGraphicsTexture* outputReal,
                 bool reset = false) {
    D3D12ReusableCommandList commandList = getCommandList();

    if (!m_dlssHandle) {
        // Common properties. We know that our usage of DLSSG is bound to a given swapchain and we never
        // redimension our swapchains or switch format, therefore we don't need to monitor for changes.
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, NVSDK_NGX_Parameter_CreationNodeMask, 1);
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, NVSDK_NGX_Parameter_VisibilityNodeMask, 1);
        NVSDK_NGX_Parameter_SetI(
            m_ngxParameters, NVSDK_NGX_Parameter_Width, outputInterpolated->getInfo().width);
        NVSDK_NGX_Parameter_SetI(
            m_ngxParameters, NVSDK_NGX_Parameter_Height, outputInterpolated->getInfo().height);
        NVSDK_NGX_Parameter_SetUI(
            m_ngxParameters, "DLSSG.BackbufferFormat", (unsigned int)outputInterpolated->getInfo().format);

        // These are reverse-engineered from MSFS flatscreen mode.
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "Enable.OFA", 1);
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.EnableInterp", 1);
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DynamicResolution", 0);
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.InternalHeight", 0);
        NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.InternalWidth", 0);

        // Optional(?) signaling callbacks.
        // DLSSG.SyncWaitCallback{Data}
        // DLSSG.SyncSignalCallback{Data}
        // DLSSG.QueueSubmitCallback{Data}
        // DLSSG.SyncSignalOnlyCallback{Data}
        // DLSSG.SyncWaitOnlyCallback{Data}
        // DLSSG.SyncFlushCallback{Data}

        CHECK_NGXCMD(NVSDK_NGX_D3D12_CreateFeature(
            commandList.commandList.Get(), NVSDK_NGX_Feature_FrameGeneration, m_ngxParameters, &m_dlssHandle));
    }

    // Per DLSS documentation, section 3.3
    // Motion vectors should be DXGI_FORMAT_R16G16_FLOAT, but somehow DLSSG is not complaining upon getting
    // DXGI_FORMAT_R16G16B16A16_FLOAT. Yay~

    // See DLSS documentation, section 3.4
    D3D12_RESOURCE_BARRIER barrier[5]{};
    barrier[0].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barrier[0].Transition.pResource = color->getNativeTexture<graphics::D3D12>();
    barrier[0].Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
    barrier[0].Transition.StateAfter = D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE;
    barrier[0].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barrier[1].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barrier[1].Transition.pResource = depth->getNativeTexture<graphics::D3D12>();
    barrier[1].Transition.StateBefore = D3D12_RESOURCE_STATE_DEPTH_WRITE;
    barrier[1].Transition.StateAfter = D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE;
    barrier[1].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barrier[2].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barrier[2].Transition.pResource = motion->getNativeTexture<graphics::D3D12>();
    barrier[2].Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
    barrier[2].Transition.StateAfter = D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE;
    barrier[2].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barrier[3].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barrier[3].Transition.pResource = outputInterpolated->getNativeTexture<graphics::D3D12>();
    barrier[3].Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
    barrier[3].Transition.StateAfter = D3D12_RESOURCE_STATE_UNORDERED_ACCESS;
    barrier[3].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barrier[4].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barrier[4].Transition.pResource = outputReal->getNativeTexture<graphics::D3D12>();
    barrier[4].Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
    barrier[4].Transition.StateAfter = D3D12_RESOURCE_STATE_UNORDERED_ACCESS;
    barrier[4].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    commandList.commandList->ResourceBarrier((UINT)std::size(barrier), barrier);

    // The feature seems to submit/reuse the command list, which means we must specify the ID3D12CommandQueue
    // and ID3D12CommandAllocator respectively.
    NVSDK_NGX_Parameter_SetVoidPointer(
        m_ngxParameters, "DLSSG.CmdQueue", m_device->getNativeContext<graphics::D3D12>());
    NVSDK_NGX_Parameter_SetVoidPointer(m_ngxParameters, "DLSSG.CmdAlloc", commandList.allocator.Get());

    // The input seems to be synchronized on the CPU via a fence signaled immediately after invoking the
    // feature.
    ResetEvent(m_fenceEventForFeature.get());
    NVSDK_NGX_Parameter_SetVoidPointer(m_ngxParameters, "DLSSG.FenceEvent", m_fenceEventForFeature.get());

    // Input buffers.
    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.HUDLess", color->getNativeTexture<graphics::D3D12>());
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.HUDLessSubrectBaseX", colorRect.offset.x);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.HUDLessSubrectBaseY", colorRect.offset.y);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.HUDLessSubrectWidth", colorRect.extent.width);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.HUDLessSubrectHeight", colorRect.extent.height);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.ColorBuffersHDR", 0);
    // MSFS flatscreen mode also sets this one to the same as HUDLess.
    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.Backbuffer", color->getNativeTexture<graphics::D3D12>());

    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.MVecs", motion->getNativeTexture<graphics::D3D12>());
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MVecsSubrectBaseX", motionRect.offset.x);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MVecsSubrectBaseY", motionRect.offset.y);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MVecsSubrectWidth", motionRect.extent.width);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MVecsSubrectHeight", motionRect.extent.height);
#if 0
    NVSDK_NGX_Parameter_SetF(
        m_ngxParameters, "DLSSG.MvecScaleX", (float)colorRect.extent.width / motionRect.extent.width);
    NVSDK_NGX_Parameter_SetF(
        m_ngxParameters, "DLSSG.MvecScaleY", (float)colorRect.extent.height / motionRect.extent.height);
#else
    // TODO: This is what MSFS flatscreen mode does, this looks incorrect though?
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.MvecScaleX", (float)motionRect.extent.width);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.MvecScaleY", (float)motionRect.extent.height);
#endif
    // TODO: Set run_low_res_mvec_pass?
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MvecDilated", 0);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.CameraMotionIncluded", 1);

    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.Depth", depth->getNativeTexture<graphics::D3D12>());
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DepthSubrectBaseX", depthRect.offset.x);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DepthSubrectBaseY", depthRect.offset.y);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DepthSubrectWidth", depthRect.extent.width);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DepthSubrectHeight", depthRect.extent.height);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DepthInverted", nearFar.Near > nearFar.Far);
    // This seems to be needed if you want additional HUD to be blitted after the generation.
    NVSDK_NGX_Parameter_SetD3d12Resource(m_ngxParameters, "DLSSG.UI", nullptr);

    // Output buffers.
    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.OutputInterpolated", outputInterpolated->getNativeTexture<graphics::D3D12>());
    NVSDK_NGX_Parameter_SetD3d12Resource(
        m_ngxParameters, "DLSSG.OutputReal", outputReal->getNativeTexture<graphics::D3D12>());

    // Camera properties
    const auto camera = xr::math::LoadXrPose(cameraPose);
    // Use our modified version of the Constants container so we can reuse Streamline SDK helpers such
    // as recalculateCameraMatrices().
    sl::Constants cameraConstants(camera, xr::math::LoadXrPose(prevCameraPose));
    recalculateCameraMatrices(cameraConstants);
    NVSDK_NGX_Parameter_SetVoidPointer(
        m_ngxParameters, "DLSSG.CameraViewToClip", &cameraConstants.cameraViewToClip);
    NVSDK_NGX_Parameter_SetVoidPointer(
        m_ngxParameters, "DLSSG.ClipToCameraView", &cameraConstants.clipToCameraView);
    NVSDK_NGX_Parameter_SetVoidPointer(
        m_ngxParameters, "DLSSG.ClipToPrevClip", &cameraConstants.clipToPrevClip);
    NVSDK_NGX_Parameter_SetVoidPointer(
        m_ngxParameters, "DLSSG.PrevClipToClip", &cameraConstants.prevClipToClip);

    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraPosX", cameraConstants.cameraPos.x);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraPosY", cameraConstants.cameraPos.y);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraPosZ", cameraConstants.cameraPos.z);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraRightX", cameraConstants.cameraRight.x);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraRightY", cameraConstants.cameraRight.y);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraRightZ", cameraConstants.cameraRight.z);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraUpX", cameraConstants.cameraUp.x);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraUpY", cameraConstants.cameraUp.y);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraUpZ", cameraConstants.cameraUp.z);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraForwardX", cameraConstants.cameraFwd.x);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraForwardY", cameraConstants.cameraFwd.y);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraForwardZ", cameraConstants.cameraFwd.z);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraNear", nearFar.Near);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraFar", nearFar.Far);
    NVSDK_NGX_Parameter_SetF(
        m_ngxParameters, "DLSSG.CameraFOV", std::abs(cameraFov.angleLeft) + std::abs(cameraFov.angleRight));
    NVSDK_NGX_Parameter_SetF(
        m_ngxParameters, "DLSSG.CameraAspectRatio", (float)colorRect.extent.width / colorRect.extent.height);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.OrthoProjection", 0);

    // Optional per SL documentation 2.10
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraPinholeOffsetX", 0);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.CameraPinholeOffsetY", 0);

    // TODO: Our layer does not control/know about these.
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.JitterOffsetX", 0);
    NVSDK_NGX_Parameter_SetF(m_ngxParameters, "DLSSG.JitterOffsetY", 0);

    // Miscellaneous. All values are reverse-engineered from MSFS flatscreen mode.
    NVSDK_NGX_Parameter_SetI(m_ngxParameters, "DLSSG.NumFrames", 1);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.NotRenderingGameFrames", 0);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MultiFrameCount", 1);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.MultiFrameIndex", 1);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.DynamicResolution", 0);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.InternalHeight", 0);
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.InternalWidth", 0);
    // Whatever this one does, it looks really important w.r.t command list management, and without it the
    // feature evaluation will cause an invalid D3D12 operation.
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.IsRecording", 1);

    // Upon frame discontinuity, we flush the generator.
    NVSDK_NGX_Parameter_SetUI(m_ngxParameters, "DLSSG.Reset", reset);

    CHECK_NGXCMD(
        NVSDK_NGX_D3D12_EvaluateFeature(commandList.commandList.Get(), m_dlssHandle, m_ngxParameters, nullptr));

    std::swap(barrier[0].Transition.StateBefore, barrier[0].Transition.StateAfter);
    std::swap(barrier[1].Transition.StateBefore, barrier[1].Transition.StateAfter);
    std::swap(barrier[2].Transition.StateBefore, barrier[2].Transition.StateAfter);
    std::swap(barrier[3].Transition.StateBefore, barrier[3].Transition.StateAfter);
    std::swap(barrier[4].Transition.StateBefore, barrier[4].Transition.StateAfter);
    commandList.commandList->ResourceBarrier((UINT)std::size(barrier), barrier);

    submitCommandList(std::move(commandList));

    // Signal the input. I'm not really sure why this is synchronized on the CPU via an event.
    // This is how MSFS flatscreen mode seems to do it.
    m_device->getNativeContext<graphics::D3D12>()->Signal(m_inputFence.Get(), ++m_inputFenceValue);
    m_inputFence->SetEventOnCompletion(m_inputFenceValue, m_fenceEventForFeature.get());
}
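The snippet above has two candidate values for DLSSG.MvecScaleX/Y behind an #if, and I do not know which one DLSSG actually expects. A small sketch like the one below makes it possible to flip between them at runtime while debugging. The Extent2D struct is a stand-in for XrExtent2Di, and computeMvecScale and MvecScaleMode are hypothetical names of my own.

```cpp
#include <cassert>

// Stand-in for XrExtent2Di so the sketch compiles without OpenXR headers.
struct Extent2D {
    int width;
    int height;
};

struct MvecScale {
    float x;
    float y;
};

// The two candidates from the snippet; which one is right is still unknown.
enum class MvecScaleMode {
    ColorOverMotion, // the "#if 0" branch: color extent / motion extent
    MotionExtent     // what MSFS flatscreen mode appears to pass
};

// Hypothetical helper: computes the values to pass as DLSSG.MvecScaleX/Y.
MvecScale computeMvecScale(Extent2D color, Extent2D motion, MvecScaleMode mode) {
    assert(motion.width > 0 && motion.height > 0);
    if (mode == MvecScaleMode::ColorOverMotion) {
        return {static_cast<float>(color.width) / static_cast<float>(motion.width),
                static_cast<float>(color.height) / static_cast<float>(motion.height)};
    }
    return {static_cast<float>(motion.width), static_cast<float>(motion.height)};
}
```

For a 3840x2160 color rect and a 1920x1080 motion rect, the first mode gives (2, 2) and the second gives (1920, 1080), which shows how far apart the two interpretations are.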

I will probably clean this up eventually and make a small GitHub repository with a header file for all the parameter names and helper functions. Something someone can drop into your DLSS Super Resolution SDK and just use DLSS Frame Generation from there.
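To give an idea of what such a header could look like, here is a rough sketch covering a subset of the parameter names from the snippet above. The namespace dlssg_params, the constant names and the subrectNames helper are hypothetical; only the string values are taken from the code in this post.

```cpp
#include <cassert>
#include <cstring>
#include <string>

namespace dlssg_params { // hypothetical namespace, not an NVIDIA one

// Creation-time parameters.
constexpr const char* kBackbufferFormat = "DLSSG.BackbufferFormat";
constexpr const char* kEnableOfa = "Enable.OFA";
constexpr const char* kEnableInterp = "DLSSG.EnableInterp";

// Per-frame resources.
constexpr const char* kHudless = "DLSSG.HUDLess";
constexpr const char* kBackbuffer = "DLSSG.Backbuffer";
constexpr const char* kMvecs = "DLSSG.MVecs";
constexpr const char* kDepth = "DLSSG.Depth";
constexpr const char* kOutputInterpolated = "DLSSG.OutputInterpolated";
constexpr const char* kOutputReal = "DLSSG.OutputReal";

// Command list management and frame state.
constexpr const char* kCmdQueue = "DLSSG.CmdQueue";
constexpr const char* kCmdAlloc = "DLSSG.CmdAlloc";
constexpr const char* kIsRecording = "DLSSG.IsRecording";
constexpr const char* kReset = "DLSSG.Reset";

// The four sub-rectangle names follow one pattern per resource.
struct SubrectNames {
    std::string baseX;
    std::string baseY;
    std::string width;
    std::string height;
};

// resource is "HUDLess", "MVecs" or "Depth", as seen in the snippet.
inline SubrectNames subrectNames(const std::string& resource) {
    const std::string prefix = "DLSSG." + resource + "Subrect";
    return {prefix + "BaseX", prefix + "BaseY", prefix + "Width", prefix + "Height"};
}

} // namespace dlssg_params
```

The constants can be passed directly to the NVSDK_NGX_Parameter_Set* calls, and the std::string results via c_str().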

It took many hours of API tracing and debugging to get to where I am. Here is an example showing the tedious task of instrumenting Flight Simulator 2020 at runtime to extract the parameter names, then using disassembly and debuggers to try to make sense of them, especially the command list flow and how DLSSG is submitting then reusing it on the go. I am also still not quite sure about the FenceEvent waited on the CPU(?) by DLSSG, in particular the performance implications of that.

I don’t think developers should have to go through this pain to use the technology, just because they have requirements that diverge from what Streamline offers. Streamline is great, it really is, but it’s not a one-size-fits-all solution. Please consider that.

I will talk to you soon once I start having more questions (and we can pretend that I am using Streamline SDK ;) for my application).
