Conference article

Speaker Verification Experiments for Adults and Children Using Shared Embedding Spaces

Tuomas Kaseva

Hemant Kumar Kathania

Aku Rouhe

Mikko Kurimo

Published in: Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), May 31-June 2, 2021.

Linköping Electronic Conference Proceedings 178:9, p. 86-93

NEALT Proceedings Series 45:9, p. 86-93

Published: 2021-05-21

ISBN: 978-91-7929-614-8

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

For children, the system trained on a large corpus of adult speakers performed worse than a system trained on a much smaller corpus of children’s speech. This is due to the acoustic mismatch between training and testing data. To capture more acoustic variability, we trained a shared system with mixed data from adults and children. The shared system yields the best equal error rate (EER) for children with no degradation for adults. Thus, a single system trained with mixed data is applicable to speaker verification for both adults and children.
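
The abstract compares systems by EER (equal error rate), the operating point at which the false acceptance rate and the false rejection rate coincide. As a minimal sketch of how that metric can be approximated from verification scores, assuming higher scores mean "same speaker", the Python function below sweeps every observed score as a threshold. The function name `equal_error_rate` and the toy scores are hypothetical and are not taken from the article, which does not describe its scoring tooling here.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by trying every observed score as a threshold.

    Assumption: a trial is accepted when its score is >= the threshold.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    best_gap, eer = float("inf"), 1.0
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: a same-speaker trial scored below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # False acceptance: a different-speaker trial scored at or above it.
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Toy scores for illustration only.
    same_speaker = [0.9, 0.8, 0.7, 0.4]
    different_speaker = [0.1, 0.2, 0.3, 0.6]
    print(equal_error_rate(same_speaker, different_speaker))
```

With a finite trial list the two error rates rarely cross exactly, so this sketch reports the midpoint at the closest threshold; evaluation toolkits often interpolate instead.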

Keywords

additive margin softmax loss, NetVLAD aggregation, recurrent neural network, speaker verification for children
