Turning Podcasts into AI Insights: The Making of a 20VC-GPT Bot

A DALL·E 3 generated image.

After yesterday’s OpenAI developer event, investor Harry Stebbings tweeted, “Holy shit. Can you imagine a GPT where you could ask any question and it uses advice from 3,000 20VC episodes to answer your questions from the best VCs in the world.” That possibility is, in fact, already a reality. Here’s how I quickly spun up a working prototype. Demo videos below.

Some background: I built something similar six months ago with a different dataset. However, that process took several hours over a few evenings. Today, you can make custom bots in minutes. Let’s walk through the broad steps I took for the 20VC bot.

First, I downloaded a sample of 60 episodes from the 20VC podcast. I then used AssemblyAI’s API to transcribe the MP3s in a big batch. You could also use Whisper, which is cheaper and perhaps even faster. I went for AssemblyAI instead because of familiarity and a need to prototype quickly.
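For anyone who wants to try the transcription step, here is a minimal sketch of what a batch run could look like. It is not the exact script I used. The helper names (`assemblyai_transcribe`, `transcribe_batch`), the output folder layout, and the `YOUR_API_KEY` placeholder are assumptions for illustration; the only AssemblyAI calls used are the SDK's `Transcriber().transcribe()` and the transcript's `.text` attribute.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable


def assemblyai_transcribe(mp3_path: str) -> str:
    """Transcribe one file with the AssemblyAI Python SDK (needs an API key)."""
    # Imported lazily so the batch helper below also works without the SDK.
    import assemblyai as aai

    aai.settings.api_key = "YOUR_API_KEY"  # placeholder, not a real key
    transcript = aai.Transcriber().transcribe(mp3_path)
    return transcript.text


def transcribe_batch(
    mp3_paths: Iterable[str],
    transcribe: Callable[[str], str],
    out_dir: str,
) -> Dict[str, str]:
    """Write one .txt transcript per MP3 and return {mp3 path: transcript path}."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = {}
    for mp3 in mp3_paths:
        target = out / (Path(mp3).stem + ".txt")
        # Skip episodes that already have a transcript, so re-runs are cheap.
        if not target.exists():
            target.write_text(transcribe(mp3), encoding="utf-8")
        written[mp3] = str(target)
    return written
```

Calling `transcribe_batch(paths, assemblyai_transcribe, "transcripts")` would process a folder of episodes. Passing the transcriber in as a function means you could swap in Whisper, or a fake for testing, without touching the loop.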

The next step was to convert these transcripts into a database that GPT-4 could use. For that, I used Retool, a platform that lets you drag and drop files into a database of embeddings (numeric representations of text that a language model can search by meaning). Retool also provides chatbot interfaces you can use right off the shelf. And voilà! I had a bot that could query 60 episodes of 20VC for knowledge and advice.
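If you are curious what happens under the hood of an embeddings database, here is a toy sketch of the general idea: split transcripts into chunks, turn each chunk into a vector, and pull the chunks closest to the question into the prompt. This is not how Retool is implemented (I don't know its internals). The `embed` function below is a stand-in that counts words, where a real system would call an embedding model, and all function names are made up for illustration.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def chunk_text(text: str, max_words: int = 200) -> List[str]:
    """Split a transcript into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real system would use a model)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(question: str, chunks: List[str], k: int = 3) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the question, best first."""
    q = embed(question)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


def build_prompt(question: str, chunks: List[str], k: int = 3) -> str:
    """Assemble a prompt that grounds the model in the retrieved excerpts."""
    context = "\n---\n".join(c for _, c in top_chunks(question, chunks, k))
    return f"Answer using only these podcast excerpts:\n{context}\n\nQuestion: {question}"
```

The string returned by `build_prompt` is what you would send to the language model. Word counts only match exact words, which is why real setups use learned embeddings that also catch paraphrases.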

To create a full version of the 20VC bot, you would need all 3,000 episodes and a reasonable budget for transcription and large language model services. The process would rack up a bill in the hundreds (maybe thousands) of dollars, but that’s small change for a media or investment business.
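To sanity-check a budget like that, you can do the arithmetic yourself. The sketch below is a rough calculator, not a quote. Only the 3,000-episode count comes from this post. Every other number in the example call (episode length, price per audio hour, words per hour, tokens per word, embedding price) is a placeholder assumption you should replace with your own figures and your providers' current pricing. It also leaves out what you pay per question once the bot is live.

```python
def estimate_cost(
    episodes: int,
    avg_hours_per_episode: float,
    transcription_usd_per_hour: float,
    words_per_hour: float,
    tokens_per_word: float,
    embedding_usd_per_million_tokens: float,
) -> dict:
    """Rough one-off cost of transcribing and embedding a podcast archive."""
    audio_hours = episodes * avg_hours_per_episode
    transcription_usd = audio_hours * transcription_usd_per_hour
    # Token count is approximated from speaking rate; it is not measured.
    tokens = audio_hours * words_per_hour * tokens_per_word
    embedding_usd = tokens / 1_000_000 * embedding_usd_per_million_tokens
    return {
        "audio_hours": audio_hours,
        "transcription_usd": transcription_usd,
        "embedding_usd": embedding_usd,
        "total_usd": transcription_usd + embedding_usd,
    }


# Placeholder inputs for illustration only; these are NOT real prices.
example = estimate_cost(
    episodes=3000,                         # from the post
    avg_hours_per_episode=1.0,             # assumption
    transcription_usd_per_hour=0.40,       # assumption
    words_per_hour=9000,                   # assumption
    tokens_per_word=1.3,                   # assumption (rule of thumb)
    embedding_usd_per_million_tokens=0.10, # assumption
)
print(example)
```

With inputs like these, the total is driven almost entirely by the transcription term, so that is the price worth comparing across providers first.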

To Harry and his team, I hope this demo shows what’s possible. Even without the upcoming OpenAI feature that lets anyone create their own GPT, you can already build custom GPTs, and with impressive speed.

Happy building all!