New Datasets to Democratize Speech Recognition Technology
thegradientpub.substack.com
Presenting The People’s Speech, a massive English-language dataset of audio transcriptions, and the Multilingual Spoken Words Corpus (MSWC), a 50-language, 6,000-hour dataset of individual words