Run several models on cropped images of people

Hi,
I’m trying to detect whether people on a construction site are wearing a face mask, a helmet, and protective glasses (there are several people in each image). I already have a mask/no-mask model working, but now I have to detect these other accessories. What approach would you recommend, given that all of the accessories are worn on the face?

  1. Build one model that detects whether each item is present or absent. (This might get complicated when a person has or lacks more than one accessory. Multi-label?)
  2. Build a model for each accessory and run them all on every frame. (Will this be efficient on a Jetson Xavier NX? Is there any documentation on doing this?)
  3. Someone recommended finding the person (maybe with PeopleNet, or just the face), cropping that part of the image, and running an accessory detection model on the crop. (Can this be done in DeepStream?)
  4. Any other approach you can recommend would be appreciated!

I’m working with Python on a Jetson Xavier NX with DeepStream 5.0.1.

Hi,

You can train a multi-label detector directly.
The output will be several detected bounding boxes, each with its corresponding label (e.g. mask, glasses, …).
The training process is the same as for any detector, except that each annotated box also carries class information.
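
Since there are several people in each image, the per-box labels still have to be matched to a person before you can say who is missing what. Below is a minimal sketch of one way to do that in plain Python, not something taken from DeepStream itself. It assumes the detector (or a separate person detector such as PeopleNet) also gives you a "person" box, and the class names "mask", "helmet" and "glasses" are placeholders for whatever labels you train with. An accessory is assigned to a person when the center of its box falls inside the person box.

```python
from dataclasses import dataclass

# Hypothetical class names; use the labels your own model was trained with.
REQUIRED = {"mask", "helmet", "glasses"}


@dataclass(frozen=True)
class Detection:
    label: str
    left: float
    top: float
    right: float
    bottom: float


def center(det):
    """Return the (x, y) center of a detection's bounding box."""
    return ((det.left + det.right) / 2, (det.top + det.bottom) / 2)


def contains(box, point):
    """True if the point lies inside (or on the edge of) the box."""
    x, y = point
    return box.left <= x <= box.right and box.top <= y <= box.bottom


def missing_gear(detections):
    """Return a list of (person, set of missing accessory labels) pairs."""
    people = [d for d in detections if d.label == "person"]
    gear = [d for d in detections if d.label in REQUIRED]
    report = []
    for person in people:
        # An accessory counts as worn if its center is inside the person box.
        worn = {g.label for g in gear if contains(person, center(g))}
        report.append((person, REQUIRED - worn))
    return report


if __name__ == "__main__":
    frame = [
        Detection("person", 0, 0, 100, 200),
        Detection("mask", 40, 30, 60, 50),
        Detection("helmet", 30, 0, 70, 25),
        Detection("person", 200, 0, 300, 200),
        Detection("glasses", 240, 30, 260, 40),
    ]
    for person, missing in missing_gear(frame):
        print(person.left, sorted(missing))
```

Note that when person boxes overlap heavily, one accessory center can fall inside two boxes, so a crowded scene may need a stricter rule (for example, matching against a face or head box instead of the full body).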

The primary detector in the DeepStream samples is used in a similar way.

Thanks.

Thanks! That would be a great approach.
However, I’ve never trained a multi-label detector before (labeling images, training the model, configuring DeepStream…).
Is there a tutorial or documentation on this subject for the Xavier NX?
Thanks in advance!

Hi,

We recommend training on a desktop GPU rather than on the Jetson itself.
You can use our Transfer Learning Toolkit (TLT) to train a multi-label detector:
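
On the labeling step: as far as I recall, the TLT object detection networks read KITTI-format label files, one text file per image with one line per box, but please check the TLT documentation for the network you pick. The sketch below shows what such a line could look like. The helper name `kitti_label_line` and the class name "helmet" are made up for illustration, and it assumes the fields that are not needed for 2D detection (truncation, occlusion, alpha and the 3D values) can simply be left at zero.

```python
def kitti_label_line(class_name, left, top, right, bottom):
    """Format one 2D bounding box as a 15-field KITTI label line.

    Field order: type, truncated, occluded, alpha, bbox (left, top, right,
    bottom), 3D dimensions (3 values), 3D location (3 values), rotation_y.
    Everything except the type and the bbox is written as zero here.
    """
    if " " in class_name:
        raise ValueError("class name must not contain spaces")
    if right <= left or bottom <= top:
        raise ValueError("bounding box must have positive width and height")
    unused_3d = " ".join(["0.00"] * 7)  # dimensions, location, rotation_y
    return (
        f"{class_name} 0.00 0 0.00 "
        f"{left:.2f} {top:.2f} {right:.2f} {bottom:.2f} "
        f"{unused_3d}"
    )


if __name__ == "__main__":
    # One line per labeled box in the image's .txt file.
    print(kitti_label_line("helmet", 10, 20, 110, 70))
```

Each class you want to detect (mask, helmet, glasses) simply appears as a different first field, which is all the "class information" mentioned earlier amounts to at labeling time.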

Thanks.