
this is all good stuff, but a bit too broad - i want data sets only in english, for example

For English-only datasets, you might be interested in Mozilla Common Voice. The corpus as a whole contains 7,335 validated hours of speech across 60 languages, so you would work with its English subset rather than the full release. It also includes demographic metadata such as age and sex, which helps when you want coverage of a variety of English speakers[1].
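Because Common Voice spans many languages, narrowing it to English usually comes down to filtering on a language field. The snippet below is a minimal sketch of that idea, not the dataset's official tooling: it assumes a tab-separated metadata file with a `locale` column, and the column names, the helper `english_rows`, and the sample rows are all made up for illustration.

```python
import csv
import io


def english_rows(tsv_text, locale="en"):
    """Return the rows whose 'locale' column equals the requested locale.

    Assumes a tab-separated layout with a header row; the column names
    used here are illustrative, not taken from an official schema.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if row.get("locale") == locale]


# Tiny made-up sample in a Common Voice-like layout (not real data).
SAMPLE_TSV = (
    "path\tsentence\tage\tgender\tlocale\n"
    "clip_001.mp3\tHello world.\ttwenties\tfemale\ten\n"
    "clip_002.mp3\tBonjour le monde.\tthirties\tmale\tfr\n"
    "clip_003.mp3\tGood morning.\t\t\ten\n"
)

rows = english_rows(SAMPLE_TSV)
for row in rows:
    # Demographic fields may be empty when a speaker did not provide them.
    print(row["path"], row["sentence"])
```

With a real download you would read the metadata file from disk instead of a string, and check the actual column names in that release before filtering.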

Another excellent option is the People’s Speech Dataset, described as one of the world's largest English speech recognition corpora available under CC-BY-SA and CC-BY 4.0 licenses. It comprises over 30,000 hours of transcribed English speech from a diverse set of speakers[2].
