Help request regarding implementing a Morpheus-based Sensitive Information Detection workflow on local machine to demo utility in redacting free text

Hi there,

Before going into details, here is the idea behind my query.

I am wondering whether a Morpheus-based workflow could be adapted so that it detects not only, e.g., an ‘anomaly’ in web traffic data but also sensitive information in plain text files that needs to be redacted prior to publication.

Thanks very much for any help or recommendations.


Thanks for posting this question - I am not part of the Morpheus engineering team - but since today is an NVIDIA holiday I thought I would jump in.
This is some text explaining one of the Morpheus models for detecting phishing: “Traditional methods for detecting phishing emails rely on URL-only detection, complex lookups against known attacks, and following suspicious links in a sandbox environment. A better way of protecting the environment from phishing attacks would be to analyze the entire raw body of the email, including syntax and semantics of the text, and the structure of the email, in addition to the words and links used. This wasn’t previously possible, due to compute limits and the lack of generic tools to enable natural language processing (NLP) models to be deployed in cybersecurity environments seamlessly.”
Your proposal would be essentially this, but the model would have to be trained to determine what is sensitive information. So I would imagine the answer to your question is yes, the workflow supports this.
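To make the redaction step concrete, here is a minimal Python sketch. It is not Morpheus code and does not use the Morpheus API; it only assumes that some detector returns labelled character spans for a piece of text. The regex-based `detect_spans` is a hypothetical stand-in for a trained NLP model, and the names `detect_spans`, `redact`, and the `EMAIL`/`PHONE` labels are made up for illustration.

```python
import re
from typing import Callable, List, Tuple

# (start, end, label) character span flagged as sensitive
Span = Tuple[int, int, str]

# Hypothetical stand-in for a trained model: two simple regex patterns.
# A real workflow would replace this with model inference.
_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def detect_spans(text: str) -> List[Span]:
    """Return sensitive spans sorted by start offset."""
    spans = []
    for label, pattern in _PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


def redact(text: str, detector: Callable[[str], List[Span]] = detect_spans) -> str:
    """Replace every detected span with a [LABEL] placeholder."""
    pieces = []
    cursor = 0
    for start, end, label in detector(text):
        if start < cursor:
            # Skip spans that overlap one already redacted.
            continue
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


if __name__ == "__main__":
    print(redact("Contact jane.doe@example.com or +44 131 496 0000."))
```

Because `redact` only depends on the span format, the regex detector could later be swapped for a function that wraps a trained model without changing the redaction logic.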
If you make progress on this - please come back and let us know more, and I would even suggest you consider presenting your work at a future GTC.
Thanks!

Thanks very much @nadeem for pointing me in the right direction. Indeed, this is what I am after. I will try to make some progress and connect the dots between the approaches you mentioned and some of the technologies (incl. BERT models) presented at an Edinburgh-based NLP conference. I will let you know if I make any reasonable progress in the foreseeable future, or post further, more specific questions.