Dynamic Scale Weighting Through Multiscale Speaker Diarization

Originally published at: https://developer.nvidia.com/blog/dynamic-scale-weighting-through-multiscale-speaker-diarization/

MSDD is a neural model that can be trained on a two-speaker dataset, yet it enables overlap-aware speaker diarization for a flexible number of speakers.