OpenAI just released GPT-3 as an API for a closed beta and has committed to public release by the end of 2020. So why bother with its predecessor, GPT-2?
The current closed beta limits GPT-3 access to Silicon Valley insiders, well-known internet bloggers, and published researchers. Given the manner of GPT-3's release, OpenAI will probably never release its core code. GPT-2 is a different story: OpenAI has open sourced the full model.
How much of what you learn about GPT-2 will apply to GPT-3?
Almost everything. The architecture for GPT-3 is a scaled-up GPT-2. All the underlying design theory remains the same.
Why should you continue reading?
Over the past three months, I have read everything I could find about GPT-2. In this series I will give you a comprehensive overview, pointing you to the best existing resources and creating new ones where necessary. All you have to do is read, think until your brain hurts, take a break, and repeat. I will use Google Colab notebooks so you can learn by running the code.
Articles in the series:
Everything GPT-2: 0. Intro
Everything GPT-2: 1. Architecture Overview
Everything GPT-2: 2. Architecture In-Depth
Everything GPT-2: 3. Tools
Everything GPT-2: 4. Data Preparation
Everything GPT-2: 5. Fine-Tuning
Everything GPT-2: 6. Optimizations
Everything GPT-2: 7. Production
All resources for the articles in this series are centralized in this Google Drive folder.
(aside) OpenAI does not seem so open.
Indeed. They just sold Microsoft exclusive rights to GPT-3. Let’s go down the rabbit hole: who has funded OpenAI to the tune of over 1 billion dollars? Microsoft. To balance my bias, here is a perspective from Vox. Something seems amiss, but the study of nepotism, big-business concerns, and tech monopolies will have to wait for another blog series.