
For English-only data, consider Mozilla Common Voice. The full corpus contains 7,335 validated hours of speech across 60 languages, and its English portion can be used on its own. The recordings come with demographic metadata such as age and sex, so the data covers a variety of English speakers[1].
Another excellent option is the People’s Speech Dataset, which is described as among the world's largest English speech recognition corpora available under CC-BY-SA and CC-BY 4.0 licenses. It comprises over 30,000 hours of transcribed English speech from diverse speakers[2].