Mutex Error "Mutex must be unlocked only by thread that has already acquired lock"

Hi,

We’re upgrading from PhysX 3.2 to 3.4. We’re getting a failure in debug builds in the Linux on Windows environment, which is effectively Ubuntu.

The error is “Mutex must be unlocked only by thread that has already acquired lock”, which seems quite strange since the mutex unlock call comes from ScopedLock::~ScopedLock, which would indicate that the mutex was locked by the same thread that constructed the ScopedLock.

Callstack:

#3  0x00007ffffb9cda45 in physx::shdfnd::MutexImpl::unlock (this=0x7fffdc0019c0) at ./../../../../PxShared/src/foundation/src/unix/PsUnixMutex.cpp:108
#4  0x00007ffffb9c64b3 in physx::shdfnd::MutexT<physx::shdfnd::Allocator>::unlock (this=0x7fffdc001480) at ./../../../../PxShared/src/foundation/include/PsMutex.h:152
#5  0x00007ffffb9c633b in physx::shdfnd::MutexT<physx::shdfnd::Allocator>::ScopedLock::~ScopedLock (this=0x7ffff63afb90, __in_chrg=<optimized out>) at ./../../../../PxShared/src/foundation/include/PsMutex.h:104
#6  0x00007ffffb9c5fed in physx::shdfnd::NamedAllocator::allocate (this=0x7fffdc0182e8, size=14720, filename=0x7ffffb444d98 "./../../../../PxShared/src/foundation/include/PsArray.h", line=605) at ./../../../../PxShared/src/foundation/src/PsAllocator.cpp:100
#7  0x00007ffffb377f3f in physx::shdfnd::Array<physx::IG::Edge, physx::shdfnd::NamedAllocator>::allocate (this=0x7fffdc0182e8, size=920) at ./../../../../PxShared/src/foundation/include/PsArray.h:605
#8  0x00007ffffb377640 in physx::shdfnd::Array<physx::IG::Edge, physx::shdfnd::NamedAllocator>::recreate (this=0x7fffdc0182e8, capacity=920) at ./../../../../PxShared/src/foundation/include/PsArray.h:781
#9  0x00007ffffb376db9 in physx::shdfnd::Array<physx::IG::Edge, physx::shdfnd::NamedAllocator>::grow (this=0x7fffdc0182e8, capacity=920) at ./../../../../PxShared/src/foundation/include/PsArray.h:692
#10 0x00007ffffb376122 in physx::shdfnd::Array<physx::IG::Edge, physx::shdfnd::NamedAllocator>::reserve (this=0x7fffdc0182e8, capacity=920) at ./../../../../PxShared/src/foundation/include/PsArray.h:531
#11 0x00007ffffb36b08c in physx::IG::IslandSim::addConnection (this=0x7fffdc0182b0, nodeHandle1=..., nodeHandle2=..., edgeType=physx::IG::Edge::eCONTACT_MANAGER, handle=459) at ./../../LowLevel/software/src/PxsIslandSim.cpp:195
#12 0x00007ffffb36c359 in physx::IG::IslandSim::addContactManager (this=0x7fffdc0182b0, nodeHandle1=..., nodeHandle2=..., handle=459) at ./../../LowLevel/software/src/PxsIslandSim.cpp:389
#13 0x00007ffffb3629a7 in physx::IG::SimpleIslandManager::setEdgeConnected (this=0x7fffdc018200, edgeIndex=459) at ./../../LowLevel/software/src/PxsSimpleIslandManager.cpp:317
#14 0x00007ffffb2f4c55 in physx::Sc::Scene::setEdgesConnected (this=0x7fffdc00e440) at ./../../SimulationController/src/ScScene.cpp:2637
#15 0x00007ffffb13fda3 in physx::Cm::DelegateTask<physx::Sc::Scene, &physx::Sc::Scene::setEdgesConnected>::runInternal (this=0x7fffdc00fef8) at ./../../Common/src/CmTask.h:100
#16 0x00007ffffb129711 in physx::Cm::Task::run (this=0x7fffdc00fef8) at ./../../Common/src/CmTask.h:66
#17 0x00007ffffcb0b48a in physx::Ext::DefaultCpuDispatcher::runTask (this=0x7fffdc00add0, task=...) at ./../../PhysXExtensions/src/ExtDefaultCpuDispatcher.h:96
#18 0x00007ffffcbea76a in physx::Ext::CpuWorkerThread::execute (this=0x7fffdc00bf48) at ./../../PhysXExtensions/src/ExtCpuWorkerThread.cpp:96
#19 0x00007ffffb9cf815 in physx::shdfnd::(anonymous namespace)::PxThreadStart (arg=0x7fffdc00d080) at ./../../../../PxShared/src/foundation/src/unix/PsUnixThread.cpp:133
#20 0x00007ffffe938184 in start_thread (arg=0x7ffff63b0700) at pthread_create.c:312
#21 0x00007ffffe12dffd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This is using CpuDispatcher: PxDefaultCpuDispatcherCreate() with num cores = 2. If I use only 1 core, the problem doesn’t happen.

Have any similar issues been reported or fixed in a newer version of PhysX? We’re using the PhysX SDK 3.4.1 September 2017.

Regards,
Morten Skaaning

Sorry for the late response. Did you get anywhere with this issue? We haven’t had any reports like this and we do have quite a lot of testing on Linux. I’m not sure about Linux-on-windows though.

The error suggests that the ID of the thread calling unlock doesn’t match the ID of the thread that locked the mutex. Given that the lock/unlock pair is in a scoped lock that should execute on a single thread, and that this code works on other platforms, a real ownership mismatch seems unlikely. Is it possible that Linux-on-Windows doesn’t provide consistent or correct thread IDs?
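
To make the check being discussed concrete, here is a minimal sketch of an owner-checked mutex. It is an illustration only, not the PhysX source: the names OwnerCheckedMutex and unlockFromOtherThread are hypothetical, and it uses std::mutex and std::thread::id rather than whatever the foundation library does internally. The idea is just that lock() records the calling thread's ID and unlock() refuses if the caller's ID differs.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical illustration, not PhysX code: records the owning thread on
// lock() and verifies it on unlock().
class OwnerCheckedMutex {
public:
    void lock() {
        mMutex.lock();
        // Once we hold the mutex, nobody else may be recorded as owner.
        assert(mOwner.load() == std::thread::id());
        mOwner.store(std::this_thread::get_id());
    }

    // Returns false, leaving the mutex locked, if the caller is not the owner.
    bool unlock() {
        if (mOwner.load() != std::this_thread::get_id())
            return false;
        mOwner.store(std::thread::id()); // default id means "no owner"
        mMutex.unlock();
        return true;
    }

private:
    std::mutex mMutex;
    std::atomic<std::thread::id> mOwner{std::thread::id()};
};

// Attempts the unlock from a second thread and reports whether it was allowed.
bool unlockFromOtherThread(OwnerCheckedMutex& m) {
    bool allowed = true;
    std::thread t([&] { allowed = m.unlock(); });
    t.join();
    return allowed;
}
```

With a scoped lock, lock() and unlock() run on the same thread, so a check like this should only fail if the recorded owner was overwritten or if the thread ID reported to the same thread changes between the two calls.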

Hi Kier,

Thank you for the reply.

I tried some heavier debugging of the posix locks themselves and observed the weird behavior that the mutex threadId and usage count changed before the mutex was released.
It seems nonsensical, so I’ll try upgrading my linux-on-windows environment before pursuing this further.

Regards,
Morten
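
For anyone wanting to sanity-check the environment independently of PhysX, a small standalone probe along these lines might help. This is a sketch under assumptions: EnvCheck, probeThread and checkMutexOwnership are hypothetical names, and it only uses standard POSIX calls (an error-checking mutex via PTHREAD_MUTEX_ERRORCHECK, pthread_self, pthread_equal). On a conforming implementation, unlocking an error-checking mutex from a thread that does not own it returns EPERM, and the ID returned by pthread_create equals what pthread_self reports inside the new thread.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>

// Results of the probe (hypothetical helper type).
struct EnvCheck {
    int foreignUnlock;  // unlock result from a non-owning thread
    int ownerUnlock;    // unlock result from the owning thread
    bool idsConsistent; // pthread_create ID matches pthread_self in the thread
};

struct Probe {
    pthread_mutex_t* mutex;
    int unlockResult;
    pthread_t self;
};

// Runs on the worker: records its own ID and tries to unlock a mutex it does not own.
static void* probeThread(void* arg) {
    Probe* p = static_cast<Probe*>(arg);
    p->self = pthread_self();
    p->unlockResult = pthread_mutex_unlock(p->mutex);
    return nullptr;
}

EnvCheck checkMutexOwnership() {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_t mutex;
    pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&mutex); // the calling thread is now the owner

    Probe probe{&mutex, 0, pthread_self()};
    pthread_t worker;
    int rc = pthread_create(&worker, nullptr, probeThread, &probe);
    assert(rc == 0);
    (void)rc;
    pthread_join(worker, nullptr);

    EnvCheck result;
    result.foreignUnlock = probe.unlockResult;
    result.idsConsistent = pthread_equal(worker, probe.self) != 0;
    result.ownerUnlock = pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);
    return result;
}
```

Calling checkMutexOwnership() in a loop from a small test program and comparing the three fields against the expected values (EPERM, 0, true) would give a rough indication of whether the pthread layer itself behaves consistently, without PhysX in the picture.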