The development of ultra-realistic human voices is upon us, and Mahmoud Felfel's Play.ht is leading the next generation of text-to-voice models. In this episode, we discuss the challenges and opportunities of automating a more human voice, as well as concerns about deepfakes and user safety.
Timestamps for E10: Mahmoud Felfel of Play.ht
(0:00) Preview of Mahmoud on this episode
(0:55) Sponsor: Omneky.com
(1:45) Nathan clones his voice using Play.ht
(6:11) Why Mahmoud started Play.ht and the problem they tried to solve
(13:08) The job to be done for Play.ht & how they’re thinking about APIs and models
(24:45) Mahmoud breaks down the architecture of Play.ht
(29:30) How the use cases have evolved
(30:00) New markets and opportunities with creators
(37:00) Are we all about to become prompt engineers/directors?
(44:50) Roadmap to other languages beyond English
(48:00) Managing the compute
(52:00) If AI-generated voices become a commodity, what will happen?
(55:00) Why bigger companies are late adopters of AI tools
(56:30) The long-term moat of Play.ht and other applications
(1:00:00) Controversial voice-cloning and potential for societal abuse
(1:10:32) Commonly abused voices
(1:12:36) Rapid fire questions