PhysX 3.2.3 Possible bug - Random invalid pointer after deleting Actor (trigger)

Hi.

I may have found a bug (or an undocumented misfeature) in PhysX.

I'm using a filter shader, callbacks and trigger shapes on
Windows 7 64-bit, VS 2012, with the debug libraries, in debug mode.
I guess it's not necessary to show code, because this seems to be a common error.

My PxBoxController touches more than one trigger shape (each of which is deleted on touch).
(Collecting items, for example.)
I know that it's not possible (and a very bad idea) to delete an actor while the simulation is running, so
I buffer each call to onTrigger / onContact in a list. Outside the simulation, I process
this list and mark the object which contains the actor as deletable.
On the next call of my object manager, the object and every reference to it are deleted,
which happens after the whole PhysX update. (Fixed timestep, mostly 1 - 3 calls for updating PhysX.)
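The deferred deletion described above can be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: GameObject, ObjectManager, bufferTouch and collectGarbage are hypothetical names, and the physics engine is left out entirely so the sketch compiles on its own. In a real application the object's destructor would presumably also release the actor it owns.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical game object that would own a physics actor in a real application.
struct GameObject {
    explicit GameObject(int objectId) : id(objectId) {}
    int  id;
    bool deletable = false;  // set after the simulation step, never during it
};

class ObjectManager {
public:
    GameObject* create(int id) {
        objects_.push_back(std::unique_ptr<GameObject>(new GameObject(id)));
        return objects_.back().get();
    }

    // Intended to be called from a physics callback: records the id only.
    void bufferTouch(int id) { pendingTouches_.push_back(id); }

    // Called once all physics substeps of the frame have finished.
    void processBufferedTouches() {
        for (int id : pendingTouches_)
            for (auto& obj : objects_)
                if (obj->id == id) obj->deletable = true;
        pendingTouches_.clear();
    }

    // Destroys every object marked deletable; returns how many were removed.
    std::size_t collectGarbage() {
        const std::size_t before = objects_.size();
        objects_.erase(
            std::remove_if(objects_.begin(), objects_.end(),
                           [](const std::unique_ptr<GameObject>& o) { return o->deletable; }),
            objects_.end());
        return before - objects_.size();
    }

    std::size_t size() const { return objects_.size(); }

private:
    std::vector<std::unique_ptr<GameObject>> objects_;
    std::vector<int> pendingTouches_;
};
```

The point of the split is ordering: nothing is destroyed until both the simulation and the event processing for the frame are over.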

My problem is that sometimes, in the simulation phase, PhysX decides to set
the trigger/contact data (mostly otherShape, but it can be any field) to an invalid pointer like
0xfeeefeee. This happens a lot when PxPairFlag::eNOTIFY_TOUCH_LOST is set. But often there
are only invalid pointers, or values which can't be read or validated.
(The whole PxTriggerPair* data is sometimes invalid!)


{ Maybe it is easier to understand what happens if you think of a stack of 3 boxes.
Delete the box in the middle: now the other boxes receive PxPairFlag::eNOTIFY_TOUCH_LOST
with the (almost) deleted middle box. It works on the first update; on the next one, it crashes.
}

It's possible to check against these bit patterns, but there are more patterns, like
0xabababab or 0xcdcdcdcd, and even values which don't have a special meaning but are mostly readable, like
0xcdcd200f or similar.
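For debugging only, the patterns mentioned above can be matched against the well-known fill values written by the MSVC debug runtime and the Windows debug heap. The sketch below is an assumption about how such a diagnostic could look; describeFillPattern and looksLikeFillPattern are hypothetical helpers, and a match (or a miss) says nothing reliable about whether a pointer is valid.

```cpp
#include <cassert>
#include <cstdint>

// Returns a short description if the value equals a well-known debug fill
// pattern on Windows/MSVC, otherwise nullptr. Diagnostic aid only.
inline const char* describeFillPattern(std::uint32_t value) {
    switch (value) {
        case 0xCDCDCDCDu: return "uninitialized heap memory (debug CRT)";
        case 0xDDDDDDDDu: return "freed heap memory (debug CRT)";
        case 0xFDFDFDFDu: return "guard bytes around a heap block (debug CRT)";
        case 0xFEEEFEEEu: return "freed memory (Windows debug heap)";
        case 0xABABABABu: return "guard bytes after a heap block (Windows debug heap)";
        case 0xCCCCCCCCu: return "uninitialized stack memory (runtime checks)";
        default:          return nullptr;
    }
}

// Compares only the low 32 bits of the address, so the same check can be
// used in 32-bit and 64-bit builds.
inline bool looksLikeFillPattern(const void* p) {
    const std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    const std::uint32_t  low  = static_cast<std::uint32_t>(bits & 0xFFFFFFFFu);
    return describeFillPattern(low) != nullptr;
}
```

A partially overwritten value such as 0xcdcd200f is not recognized by this check, which is exactly why pattern matching cannot serve as a validity test.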

Of course, it's possible to catch these invalid pointers when you try to read them, using
a try/catch block (with /EHa enabled in VS 2012).
But … using a try/catch block on invalid pointers … is a very, very bad idea, I think.

I don't know why PhysX should tell me: Hey, there was a deleted actor, here was the valid address -
but I changed it to non-NULL so you think it's probably still valid.

When I call mScene->flush(), it's non-deterministic when it crashes, and I don't get any error message from
PhysX, only from VS 2012, which says that I tried to read an invalid address.

And yes, as I said, it's possible to catch that access violation, but … it shouldn't be there!

So, why does PhysX give me invalid pointers instead of NULL pointers?

Not sure if this will help you or not, but you can check PxTriggerPairFlags (or PxContactPairHeaderFlags for contacts) for deleted shapes/actors. If these flags are set, then that shape or actor pointer may be invalid.

For triggers:

void MyScene::onTrigger( PxTriggerPair * pairs, PxU32 count )
{
  for( PxU32 i = 0; i < count; ++i )
  {
    // Skip pairs that involve a deleted shape; its pointer must not be dereferenced.
    if( pairs[i].flags & (PxTriggerPairFlag::eDELETED_SHAPE_TRIGGER | PxTriggerPairFlag::eDELETED_SHAPE_OTHER) )
      continue;
    ...
  }
}

For regular contacts:

void MyScene::onContact(const PxContactPairHeader & pairHeader, const PxContactPair * pairs, PxU32 nbPairs)
{
  // Ignore the report if either actor was deleted; its pointer may be invalid.
  if( pairHeader.flags & (PxContactPairHeaderFlag::eDELETED_ACTOR_0 | PxContactPairHeaderFlag::eDELETED_ACTOR_1) )
    return;
  ...
}

Hi,

thank you for your response.

That won't help me, because I forgot to say that I already check against these flags. But it doesn't help when you can't check against these flags (values) at all, because the whole *pairs or &pairHeader is sometimes (as I said), for some reason, invalid.

The only possible and safe way to check against invalid pointers (or data) is to use a try/catch block with the /EHa compiler option enabled in VS, which makes this code unportable to other operating systems.
(And of course, as I mentioned in the previous post, it's not the best way.)

I had already read it, but here is some information about its “special” dangling pointer behaviour.
… That's very interesting, because when someone deletes a PxActor, they should delete the
userData too…

Another weird thing is that I never get pairs[i].flags != 0, so it's impossible to check
against the flags. That is really bad.

/**
\brief Descriptor for a trigger pair.

An array of these structs gets passed to the PxSimulationEventCallback::onTrigger() report.

\note The shape pointers might reference deleted shapes.
This will be the case if #PxPairFlag::eNOTIFY_TOUCH_LOST
events were requested for the pair and one of the involved shapes gets deleted.
Check the #flags member to see whether that is the case.
Do not dereference a pointer to a deleted shape.
The pointer to a deleted shape is only provided such that user data structures which
might depend on the pointer value can be updated.

@see PxSimulationEventCallback.onTrigger()
*/
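The last sentence of that note suggests using the stale pointer purely as a lookup key. A minimal sketch of that idea, assuming a hypothetical ShapeRegistry that maps shape addresses to game-side data (here just a name), could look like this. The pointer is only compared by value and never read through.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical registry keyed by shape address. The address is treated as an
// opaque value, so entries can be removed even if the shape no longer exists.
class ShapeRegistry {
public:
    void add(const void* shape, const std::string& itemName) {
        items_[shape] = itemName;
    }

    // Removes the entry for this address; returns false if it was unknown.
    bool forget(const void* shape) { return items_.erase(shape) > 0; }

    // Returns the stored name, or nullptr if the address is not registered.
    const std::string* find(const void* shape) const {
        const auto it = items_.find(shape);
        return it == items_.end() ? nullptr : &it->second;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::unordered_map<const void*, std::string> items_;
};
```

With such a table, a callback that receives a pair flagged as deleted could call forget() with the reported pointer and skip everything else for that pair.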

Hi,

I also used the PxController::invalidateCache() method, which improves the behaviour,
but it doesn't fix it. It is also only a dirty quick fix (better known as a workaround), with
other undesired behaviour of the PxController.

This, or maybe a very similar bug, is a known bug inside the PxController.

Quote from the release notes for 3.2.2 (this did not change in 3.2.3):
Character Controller:
Fixed a bug where releasing a shape of a touching actor and then releasing the character controller would crash. Releasing shapes of actors in touch may still lead to crashes in other situations. PxController::invalidateCache() can be used to work around these situations.

(From 3.2 to 3.2.2 NVIDIA fixed a few bugs… here is the main bug:)

Quote from the release notes for 3.2:
Character Controller:
Releasing shapes of actors that are in touch with a character controller may lead to crashes. Releasing whole actors doesn’t lead to the same problems. PxController::invalidateCache() can be used to work around these issues.

But … I didn't release shapes from an actor, I deleted the whole actor…

MikePhysX, what do you think about it? Do you have any ideas?

Since the incident with the old forum … most users didn't come back, so their knowledge is lost.

One other question: why don't I get flags inside the onTrigger callback?
(pairs[i].flags is always 0)

Does anyone have an idea?

Hi,

Hm. Nobody answers my questions. Even the admins won't answer me.

Here is a little eye candy:
Quote from PhysXGuide:

Invalidating internal caches
The character controller library caches the geometry around each character, in order to speed up collision queries. In PhysX 3.3 and above, those caches should be automatically invalidated when a cached object gets updated. However it is also possible to manually flush those caches using the following function: void PxController::invalidateCache();

So with this function the PxController updates its internal cache and sends flags / events when an actor is deleted while the controller touches it…
I can't use this function, because it doesn't work for many trigger shapes and the whole scene stops for
a few milliseconds. It seems to me that this function resets everything related to the PxController, and PhysX immediately needs an update for it, so other collisions are ignored.
After the next simulation step it checks again for collisions which it already had.
(Maybe there is a way to buffer this, but I won't use it because it causes more trouble.)

Okay. I've found another way to partially solve my problem:
As I said in the posts above, I didn't get any PxTriggerPairFlags / PxPairFlags. Never.
→ I've solved it using a deletion queue which deletes the desired actor(s) while the
simulation is running. I don't like this approach, but it is the only way to get the flags fired.
Removing actors in the simulation phase? This fix brings
other issues: all deletions of actors have to happen while the simulation is running. All.

The scene works 98% of the time. Sometimes I get non-deterministic crashes with, guess what:
invalid pointers with bit patterns, maybe even with random values.
But the scariest thing is: there are no flags raised, ever, when it crashes.
(Impossible to check against a 0 flag.)
This behaviour can only be detected by checking against the bit patterns … which brings us back to the first post.

Hm. Maybe this awesome bug will be fixed in PhysX 3.3.
Or something went completely wrong here.

I’m sorry for the puzzling problem.

Would you mind providing us a repro? We have unit tests for these cases. Thanks much.

Hey, finally a reply :)

What do you mean by a repro? What do you want: the application, a code snippet, a PVD dump?
Since the null pointer exception is random and thus non-deterministic, you will get a lot of overhead
data, because I have to create lots of cubes (maybe 200) which are created and almost immediately deleted. The PxController has 3 trigger shapes, but only one will send the event that the box touched a trigger shape and delete it (buffered, while the simulation is running).

Please describe what you want.

Thank you for your response.

We’ll need the actual code to repro on our end. It could be a simple cpp file or snippet saying creating 200 cubes here, this is where I delete them, these are trigger shapes created here, CCTs here, onTrigger does this, etc, as long as we can follow to actually repro the crash.

Hi,
okay, I will code an example app which shows the null pointer exceptions.
But it will be the first version of that error.
That means that the PxActors are deleted outside the simulation phase, like it should be.
(And not the workaround, where I delete the PxActors during the simulation phase
to get the “on deleted shape” flags so I could check against them.)
So in the example there is no way to check against the “on deleted shape” flag.

→ This will be a code snippet thrown together from my application.
→ I will use Ogre3D as the renderer, so there are a few DLLs which make the zip bigger.
(And you need to port it to OpenGL or download the free Ogre3D SDK 1.9rc to compile it.)
→ I want to send you the whole VS 2012 solution so you can compile it directly.
(After you set the paths to Ogre3D, boost and OIS, of course.)
→ I will add comments so you don't have to waste time studying the code :)

I will code it when I have enough time for it.
Should I send it via pm?

Hi,
Hm. I will try to code it in the PhysX Samples, because everybody can see what happens there and you don't have to study other code. Plus, I don't have to release my code snippets :)
I guess it's easier for everyone, and everybody can compile it.

Hi,

I've probably solved the issue with a small change in my code.
In short:
I have to handle all onTrigger / onContact updates at the moment they are triggered, while the simulation is running.
→ I set different kinds of flags there, so I have to buffer those and apply them after the simulation is done.
This needs a refactoring of my code, because it was designed to do all of this outside the simulation phase.

But when I change it back to the “buffered” mode, which stores the onTrigger / onContact events
(to process them outside the simulation phase), the pointers turn back into random values / bit patterns.
Hm. So every onTrigger / onContact event must be handled immediately while the callback is active, but no write changes may be made. (Of course!)
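This observation would be consistent with the pair array being valid only for the duration of the callback, although that is an assumption and not confirmed in this thread. Under that assumption, one way to keep the processing outside the simulation phase is to copy the needed values inside the callback instead of storing the pair pointer. The sketch below uses hypothetical stand-in types (FakeShape, FakeTriggerPair, TriggerRecord, TriggerCollector) rather than the real PhysX classes, and assumes that each shape's userData points to an int id.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the engine types, so the sketch builds without PhysX.
struct FakeShape { void* userData; };
struct FakeTriggerPair {
    FakeShape*    triggerShape;
    FakeShape*    otherShape;
    std::uint32_t status;
    std::uint32_t flags;   // non-zero here means: a shape of this pair was deleted
};

// Plain values only, so a record stays usable after the callback has returned.
struct TriggerRecord {
    int           triggerId;
    int           otherId;
    std::uint32_t status;
};

class TriggerCollector {
public:
    // Same shape as an onTrigger(pairs, count) callback: read now, act later.
    void onTrigger(const FakeTriggerPair* pairs, std::uint32_t count) {
        for (std::uint32_t i = 0; i < count; ++i) {
            const FakeTriggerPair& p = pairs[i];
            if (p.flags != 0)
                continue;  // deleted shape: never dereference its pointer
            records_.push_back(TriggerRecord{
                *static_cast<int*>(p.triggerShape->userData),
                *static_cast<int*>(p.otherShape->userData),
                p.status});
        }
    }

    // Hands the copied records to the caller and leaves the collector empty.
    std::vector<TriggerRecord> take() {
        std::vector<TriggerRecord> out;
        out.swap(records_);
        return out;
    }

private:
    std::vector<TriggerRecord> records_;
};
```

After the simulation step, take() returns records that no longer refer to engine memory, so deletions and other write operations can be driven from them safely.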

More good news: PhysX 3.2.4 has been released. I will take a close look at it; maybe some
related bugs have been fixed.

Yay :D

Small update:

PhysX 3.2.4 has the same issue I described in the first post, but when I change my code as described in this post, everything works like in 3.2.3. With that, it seems to be (probably) fixed.