Originally published at: https://developer.nvidia.com/blog/dynamic-scale-weighting-through-multiscale-speaker-diarization/
MSDD is a neural model that can be trained on a two-speaker dataset, yet it enables overlap-aware speaker diarization for a flexible number of speakers.