movit library crash when used in kdenlive

We are able to reproduce this issue. We’ll keep you posted.

Config: Ubuntu 17.04 (Unity desktop) + GP102 [GeForce GTX 1080 Ti] + driver 384.69 + kdenlive application

Repro steps:

  1. Install kdenlive from the Kdenlive download page. Launch kdenlive and add the video SIGGRAPH 2017_ NVIDIA News Highlights.mp4 to it.
  2. Add at least 3 GPU effects and play the video.
  3. Once the video is playing, scroll around on the timeline, press the spacebar quickly to pause the video, then quickly press play again. Pressing the spacebar multiple times while scrolling around the timeline triggers the crash.

root@test-Precision-Tower-7910:~# dpkg -l | grep -i kdenlive
ii kdenlive 4:17.08.3+git201711050049~ubuntu17.04.1 amd64 non-linear video editor
ii kdenlive-data 4:17.08.3+git201711050049~ubuntu17.04.1 all non-linear video editor (data files)
root@test-Precision-Tower-7910:~# dpkg -l | grep -i movit
ii libmovit5:amd64 1.4.0-1 amd64 GPU video filter library
root@test-Precision-Tower-7910:~#
kdenlive_crashed.txt (954 Bytes)

I can confirm the problem still exists. I don’t have to add any GPU effects, just hold down the space bar and it aborts after a few seconds.

All compiled from source yesterday (except Qt)

Qt 5.6
kdenlive v18.03
MLT 6.7.0
movit v36
gl driver: nvidia 384.11

I have dug a bit deeper, and I believe the context management is what causes the fault. I tested my theory by modifying this behavior and now it doesn’t abort anymore.

Background:
kdenlive uses multiple OpenGL contexts.

  1. The main thread long-lived context from QML
  2. A shared context child of the main one (m_shareContext)
  3. A shared context child of m_shareContext, created on demand via a callback from MLT (createThread()) and stored in a RenderThread object. The callback comes from another thread (not the main thread) that is managed by MLT.

The problem seems to be (3). Many contexts are created and destroyed here. Almost every time play/pause is activated, MLT asks for a new thread, a context is created, some rendering happens in that thread, and then the thread is destroyed along with the context. At some stage the context state gets corrupted, which leads to the abort due to failing GL calls.

If I prevent the creation of new contexts in (3) by simply reusing m_shareContext instead, the crashes stop. This has the limitation that MLT must not be allowed to create more than one render thread.
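To make the workaround concrete, here is a minimal sketch of the "reuse one long-lived context" idea. It is modelled without Qt or OpenGL so that it compiles on its own: FakeGlContext, SharedContextProvider and distinct_contexts_over_cycles are hypothetical names used only for illustration and are not taken from the kdenlive source. In the real application the context would be m_shareContext and acquire() would sit in the createThread() callback.

```cpp
#include <mutex>
#include <set>

// Stand-in for a GL context (assumption: in kdenlive this role is played
// by the shared context m_shareContext).
struct FakeGlContext {
    int id;
};

// Hands the same long-lived context to every render-thread request and
// refuses a second request while one render thread still holds it.
class SharedContextProvider {
public:
    explicit SharedContextProvider(int id) : context_{id} {}

    // Called when a render thread is requested. Returns nullptr if the
    // context is already in use (the "only one render thread" limitation).
    FakeGlContext* acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (in_use_) {
            return nullptr;
        }
        in_use_ = true;
        return &context_;
    }

    // Called when the render thread ends. The context is kept alive,
    // so nothing is created or destroyed per play/pause cycle.
    void release() {
        std::lock_guard<std::mutex> lock(mutex_);
        in_use_ = false;
    }

private:
    std::mutex mutex_;
    FakeGlContext context_;
    bool in_use_ = false;
};

// Simulates repeated play/pause cycles and returns how many distinct
// contexts were handed out over all cycles.
int distinct_contexts_over_cycles(SharedContextProvider& provider, int cycles) {
    std::set<int> seen;
    for (int i = 0; i < cycles; ++i) {
        FakeGlContext* ctx = provider.acquire();
        if (ctx != nullptr) {
            seen.insert(ctx->id);
            provider.release();
        }
    }
    return static_cast<int>(seen.size());
}
```

However many play/pause cycles are simulated, only one context is ever handed out, and a second concurrent request is rejected rather than given a fresh context.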

If I understand your explanation correctly, that implies this is a Kdenlive bug?
At this point we have no evidence of an NVIDIA driver bug, and according to Sesse no evidence of a Movit bug either.
Bug 375713 (“Kdenlive crashes if GPU processing (Movit library) is enabled”) doesn’t seem to have seen much traffic, unfortunately.

It does not seem to be a kdenlive bug or a movit bug, but rather a consequence of what the application is doing. The repeated construction and destruction of many shared contexts by different threads appears to corrupt the context, leading to the abort.

If you have any specific element that indicates that this is a driver bug, please share it. MLT may also be at fault here (it’s been in the past); and we don’t see evidence of a driver bug.

There is no indication that kdenlive is doing something wrong with context management. MLT/movit doesn’t do anything regarding creation/deletion of contexts, but rather relies on the host application (kdenlive).

I believe the problem is that the shared state in the parent context is corrupted by the destruction of child contexts. I assume this is because the texture IDs are lost in the parent context. How could I prove this was a driver bug?

It looks like the driver is throwing an INVALID_OPERATION error. It doesn’t generally do that as a result of a bug within it, but as a result of a misuse of OpenGL by the application. Enable KHR_debug output in the application (or use Apitrace or similar) to get an extended debug message explaining what exactly led to the INVALID_OPERATION error. If that reason appears bogus after debugging the application, that would indicate a possible driver bug.
Most likely here the application is getting confused about its context management and is either making current the wrong context, or mismanaging resources in shared contexts, or making the same context current in two threads at once, or anything of the sort.
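One way to check the "same context current in two threads at once" hypothesis from the application side is to record every make-current and done-current call. The sketch below is only a bookkeeping aid under stated assumptions: CurrentContextTracker and its methods are hypothetical names, it does not call OpenGL or Qt itself, and it has to be invoked by hand next to the application's real make-current and done-current calls.

```cpp
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Records which thread has which context current and reports attempts
// to make one context current on a second thread at the same time.
class CurrentContextTracker {
public:
    // Call right before the real make-current call. `context` is any
    // stable identifier for the context, for example its address.
    // Returns false (and records a violation) if another thread still
    // has this context current.
    bool noteMakeCurrent(const void* context,
                         std::thread::id thread = std::this_thread::get_id()) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto owner = owner_.find(context);
        if (owner != owner_.end() && owner->second != thread) {
            violations_.push_back("context made current on a second thread");
            return false;
        }
        // Making a context current implicitly releases the context that
        // was previously current on this thread.
        auto previous = current_.find(thread);
        if (previous != current_.end()) {
            owner_.erase(previous->second);
        }
        current_[thread] = context;
        owner_[context] = thread;
        return true;
    }

    // Call right after the real done-current call on `thread`.
    void noteDoneCurrent(std::thread::id thread = std::this_thread::get_id()) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = current_.find(thread);
        if (it == current_.end()) {
            return;
        }
        owner_.erase(it->second);
        current_.erase(it);
    }

    // Returns a copy of all violations recorded so far.
    std::vector<std::string> violations() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return violations_;
    }

private:
    mutable std::mutex mutex_;
    std::map<const void*, std::thread::id> owner_;    // context -> thread
    std::map<std::thread::id, const void*> current_;  // thread -> context
    std::vector<std::string> violations_;
};
```

If violations() stays empty during a session that ends in the abort, this particular hypothesis becomes less likely, and the KHR_debug message is the next thing to look at.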