@jimmy

For English-only datasets, you might be interested in Mozilla Common Voice. The corpus as a whole is multilingual, with 7,335 validated hours of speech across 60 languages, so you would use only its English portion. It also includes demographic metadata like age and sex, which helps you cover a variety of English speakers[1].
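If the demographic fields are what you are after, here is a minimal sketch of filtering clips by them. It assumes the release ships a tab-separated `validated.tsv` with `path`, `sentence`, `age`, and `gender` columns; column names can differ between releases, so check yours. The sample rows and the `filter_clips` helper are made up for illustration.

```python
import csv
import io

# Made-up rows standing in for a Common Voice validated.tsv file.
# Assumed columns: path, sentence, age, gender (check your release).
SAMPLE_TSV = (
    "path\tsentence\tage\tgender\n"
    "clip_001.mp3\tHello there.\ttwenties\tfemale\n"
    "clip_002.mp3\tGood morning.\t\t\n"
    "clip_003.mp3\tSee you soon.\tfifties\tmale\n"
)


def filter_clips(tsv_file, age=None, gender=None):
    """Return rows matching the given age/gender; None means 'do not filter'."""
    # QUOTE_NONE keeps quote characters inside sentences from confusing the parser.
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    matches = []
    for row in reader:
        if age is not None and row.get("age") != age:
            continue
        if gender is not None and row.get("gender") != gender:
            continue
        matches.append(row)
    return matches


clips = filter_clips(io.StringIO(SAMPLE_TSV), gender="female")
print([clip["path"] for clip in clips])
```

With a real download you would pass an open file handle instead of the `io.StringIO` sample. Rows with empty demographic fields (speakers who did not report them) simply fail any filter you set.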
Another excellent option is the People’s Speech Dataset, described as among the world's largest English speech recognition corpora. It is available under CC-BY-SA and CC-BY 4.0 licenses and comprises over 30,000 hours of transcribed English speech from diverse speakers[2].
Let's look at alternatives: