cuEGLStreamProducerConnect returns error 801 on 525.53 driver

It seems that the new beta driver is causing the cuEGLStreamProducerConnect CUDA function to return the error 'operation not supported' (801). I’ve not yet upgraded to the 525 driver to test this myself.

The issue is happening with the nvidia-vaapi-driver, on this line, which is odd because that code has been working fine up to now.

For reference, the nvidia-vaapi-driver issue is here.

Regards
elFarto

As reported in the nvidia-vaapi-driver issue, this is still not fixed in 525.60.11. Are there plans for a fix?

I’ve retested this with 525.60.11, and I’m still getting the same issue. I’ve tried a few things, but nothing will allow cuEGLStreamProducerConnect to succeed.

Interestingly, if I remove the call to eglStreamImageConsumerConnectNV prior to the cuEGLStreamProducerConnect call, it correctly errors, saying there is no consumer connected.

@amrits Is this a known issue?

Thanks & Regards
elFarto

I saw some discussion in here about the direct backend. I’m unsure whether it would help or is even related, as my understanding is not great, but I would like to help: cuEGLStreamProducerConnect returns error 801 on 525 series · Issue #414 · NVIDIA/open-gpu-kernel-modules · GitHub

No, the direct backend is a potential workaround for the issue, as it does not use EGLStreams. It doesn’t solve the underlying problem, which is that EGLStream with CUDA has been broken since the 525 release.

I have filed bug 3893338 internally for tracking purposes.
Can someone share precise repro steps so that I can try to reproduce the issue locally on my setup?

I’ve whipped up a quick test case which reproduces the issue on my machine:

//compile with gcc test.c -lEGL -lcuda
#include <stdio.h>
#include <cuda.h>
#include <cudaEGL.h>
#include <EGL/egl.h>

static PFNEGLSTREAMIMAGECONSUMERCONNECTNVPROC eglStreamImageConsumerConnectNV;
static PFNEGLCREATESTREAMKHRPROC eglCreateStreamKHR;

void __checkCudaErrors(CUresult err, const char *file, const int line) {
    if (CUDA_SUCCESS != err) {
        const char *errStr = NULL, *errName = NULL;
        cuGetErrorName(err, &errName);
        cuGetErrorString(err, &errStr);
        printf("cuda error '%s' (%d):'%s' at file <%s>, line %i.\n", errName, err, errStr, file, line);
    }
}
#define CHECK_CUDA_RESULT(err)  __checkCudaErrors(err, __FILE__, __LINE__)
#define LOG printf

int main(){
    LOG("Reconnecting to stream\n");

    CUcontext cudaContext;

    CHECK_CUDA_RESULT(cuInit(0));
    CHECK_CUDA_RESULT(cuCtxCreate(&cudaContext, CU_CTX_SCHED_BLOCKING_SYNC, 0));

    eglStreamImageConsumerConnectNV = (PFNEGLSTREAMIMAGECONSUMERCONNECTNVPROC) eglGetProcAddress("eglStreamImageConsumerConnectNV");
    eglCreateStreamKHR = (PFNEGLCREATESTREAMKHRPROC) eglGetProcAddress("eglCreateStreamKHR");

    EGLDisplay eglDisplay = eglGetDisplay(NULL);
    eglInitialize(eglDisplay, NULL, NULL);

    EGLint stream_attrib_list[] = { EGL_SUPPORT_REUSE_NV, EGL_FALSE, EGL_NONE };
    EGLStreamKHR eglStream = eglCreateStreamKHR(eglDisplay, stream_attrib_list);
    if (eglStream == EGL_NO_STREAM_KHR) {
        LOG("Unable to create EGLStream\n");
        return -1;
    }

    if (!eglStreamImageConsumerConnectNV(eglDisplay, eglStream, 0, 0, NULL)) {
        LOG("Unable to connect EGLImage stream consumer\n");
        return -1;
    }
    CUeglStreamConnection cuStreamConnection;
    CHECK_CUDA_RESULT(cuEGLStreamProducerConnect(&cuStreamConnection, eglStream, 1024, 1024));
    return 1;
}

It produces this output on my machine:

> gcc -o test12 test12.c -lEGL -lcuda && ./test12
Reconnecting to stream
cuda error 'CUDA_ERROR_NOT_SUPPORTED' (801):'operation not supported' at file <test12.c>, line 47.

Regards
elFarto

@elFarto
Thanks for sharing the sample code. I am able to reproduce the issue locally.
We will investigate it and post an update.

525.78.01 driver released and the issue is still there.

We have root-caused the issue, and the fix will be incorporated in a future driver release.

By future release, do you mean 525.85.05?

greetings

530.30.02 beta driver released and the issue isn’t fixed.

I can confirm that driver version 525.105.17 does NOT have the fix incorporated: it still has the same bug.

One of the worst mistakes I’ve made was buying a GPU from NVIDIA, specifically the 3070. My desktop runs atrociously while the GPU is idling, and the CPU is at 100% usage. I’ve tried different distros and settings, but the problem remains. Even my laptop with an Intel iGPU runs better than my PC with a 3070. This thread really shows how NVIDIA treats its customers. This problem has persisted for literally months. In December, they acknowledged the problem, but it took them until late January to say that the fix would be incorporated in a future driver release, and the problem still remains. They clearly don’t care about their customers at all. And don’t get me started on how the VRAM on this GPU limits all the recent games.

This issue is still present in 535.43.02.

@amrits Is there any more information on which future driver release this fix will be in?

@elFarto
Unfortunately, I am not able to comment on the exact release, but I have followed up with the team about it.

Still not in 535.86.05…

Seems fixed in 535.95?

> vainfo
Trying display: wayland
vainfo: VA-API version: 1.19 (libva 2.19.0)
vainfo: Driver version: VA-API NVDEC driver [egl backend]
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileVP8Version0_3          : VAEntrypointVLD
      VAProfileHEVCMain10             : VAEntrypointVLD
      VAProfileHEVCMain12             : VAEntrypointVLD

But testing your snippet:

./test12
Reconnecting to stream
libEGL warning: egl: failed to create dri2 screen
Unable to create EGLStream

on Wayland

greetings
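
The "Unable to create EGLStream" result above may not be the 801 bug at all. The libEGL warning suggests (this is an assumption, not something confirmed in the thread) that the default display on Wayland is not being served by an EGL implementation that exposes the stream extensions. A minimal sketch of a check the test case could run before creating the stream is below. has_extension is a hypothetical helper, not part of the original snippet; it only does whole-token matching on a space-separated extension string such as the one returned by eglQueryString(eglDisplay, EGL_EXTENSIONS).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the original test case): returns true if
 * `name` appears as a whole space-separated token in `ext_list`. */
bool has_extension(const char *ext_list, const char *name) {
    assert(name != NULL && name[0] != '\0');
    if (ext_list == NULL) {
        return false; /* e.g. the extension query itself failed */
    }
    size_t len = strlen(name);
    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        /* Accept the match only if it is bounded by spaces or string ends. */
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token) {
            return true;
        }
        p += len; /* partial match such as EGL_KHR_stream_fifo; keep looking */
    }
    return false;
}
```

In the test case it could be called after eglInitialize as has_extension(eglQueryString(eglDisplay, EGL_EXTENSIONS), "EGL_KHR_stream"), and likewise for "EGL_NV_stream_consumer_eglimage", which is the extension that defines eglStreamImageConsumerConnectNV. Whole-token matching matters here because a plain strstr for EGL_KHR_stream would also match EGL_KHR_stream_fifo.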

Still not fixed in 535.113.01…

Do you know if this is included in driver 545? I am waiting on Manjaro to package it in order to test.