Deploy GPT-2 as an Autoscaling Web API
Prerequisites:
sudo apt-get install python3-pip
and
pip3 install cortex
from the WSL Bash window, because Ubuntu 20.04 …

GPT-2 is good; now you will make it fast.
At this point you have a fine-tuned 774M variant of GPT-2, but it has some problems: it is formatted for TensorFlow 1.x, which has been deprecated, and generating text with it is slow.
To fix the formatting problems, we will use a conversion script from Hugging Face. The result will be a fine-tuned model formatted for PyTorch 1.x (PyTorch's version numbering started at 0.x). I could have converted it to TensorFlow 2.x instead, but ONNX has better support for PyTorch. …
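Roughly, the conversion boils down to the sketch below. The paths, the 774M hyperparameters, and the import location of load_tf_weights_in_gpt2 are my assumptions, tied to a transformers release from around this time, so treat it as an illustration of the idea rather than the exact Hugging Face script.

# Sketch: load a fine-tuned TF 1.x GPT-2 checkpoint into PyTorch and save it.
from transformers import GPT2Config, GPT2LMHeadModel, load_tf_weights_in_gpt2

config = GPT2Config(n_embd=1280, n_layer=36, n_head=20)    # 774M ("large") hyperparameters
model = GPT2LMHeadModel(config)
load_tf_weights_in_gpt2(model, config, "checkpoint/run1")  # assumed path to the TF 1.x run (needs tensorflow installed to read it)
model.save_pretrained("gpt2-774M-pytorch")                 # writes pytorch_model.bin and config.json

The saved PyTorch weights are what can later be exported to ONNX for the faster generation this article is after.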
Specialize GPT-2 for enhanced performance on any text
This article is part of a series on GPT-2. It’s best if you start at the beginning. The links are located at the bottom of the page.
What is fine-tuning?
GPT-2 was trained on 40 gigabytes of text spanning many subjects. It is very good at generating text, but it can be improved by training it further on text specific to its application. This process, a form of transfer learning, is called fine-tuning.
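To make that concrete, here is a minimal sketch of what fine-tuning looks like with the gpt-2-simple library; the library choice, file name, and step count are illustrative assumptions, not something this article prescribes.

# Illustrative: continue training pretrained GPT-2 on application-specific text.
import gpt_2_simple as gpt2

gpt2.download_gpt2(model_name="774M")   # fetch the pretrained weights
sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              dataset="corpus.txt",     # hypothetical file of domain-specific text
              model_name="774M",
              steps=1000)               # training continues from the pretrained weights
gpt2.generate(sess)                     # sample text in the style of the new corpus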
Prior to running either tutorial, see this article for setup. The best way to go through this article is interactively:
If you think your data is clean, you haven’t looked at it hard enough.
In the next tutorial, you will fine-tune (train) GPT-2 on any topic you want, using a single large text file or a folder containing many text files. In the example, I will work with a large selection of Pulitzer Prize-winning novels. You can select any text you like as long as there is a lot of it and it is very clean. …
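If your text is spread across a folder of files rather than one big file, a simple way to merge it into a single training corpus is sketched below; the folder and file names are hypothetical.

# Illustrative helper: merge every .txt file in a folder into one corpus file.
from pathlib import Path

with Path("corpus.txt").open("w", encoding="utf-8") as out:   # hypothetical output file
    for path in sorted(Path("novels").glob("*.txt")):         # hypothetical input folder
        text = path.read_text(encoding="utf-8", errors="ignore")
        out.write(text.strip() + "\n\n")                      # blank line between books

Any cleaning (stripping boilerplate, page numbers, and encoding junk) still has to happen before or inside this loop.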
Open your toolbox, fill it up…
In this article, I will talk about the resources you will use and direct you to tutorials. Some of the tutorials were built by others; some I built for this article series.
Initial Setup:
Feel the burn …
The existing resources for GPT-2’s architecture are very good, but they are written for experienced scientists and developers. This article is a conceptual roadmap to make GPT-2 more accessible to technically minded people who have not had formal schooling in Natural Language Processing (NLP). It contains the best resources I discovered while learning about GPT-2.
Prerequisites:
Prepare for brain melt in 3, 2, 1 …
This article is intended to inform your intuition rather than to go through every point in depth.
In different resources, GPT-2 refers to different things. Some resources mean the whole pipeline that takes in some words and gives you some more words. For this article and the next, I will use GPT-2 to mean the part that takes in some words and generates a single word piece (token). My reasoning will become clearer as you learn more. …
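As a concrete illustration of that definition, the sketch below (using the Hugging Face transformers library, which is my assumption and not part of this article) treats GPT-2 as a function that predicts one token, and builds a passage by calling it in a loop.

# Illustrative: GPT-2 proper predicts a single next token;
# generating a passage is just repeating that step.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

ids = tokenizer.encode("The meaning of life is", return_tensors="pt")
for _ in range(20):                         # add 20 tokens, one per forward pass
    with torch.no_grad():
        logits = model(ids)[0]              # scores for every vocabulary token at each position
    next_id = logits[0, -1].argmax()        # greedy choice of the next token
    ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(ids[0].tolist()))

Sampling strategies (temperature, top-k, top-p) only change how the next token is picked from those scores; the single-token model underneath is the same.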
Why GPT-2?
What is the Generative Pre-trained Transformer 2 (GPT-2)?
GPT-2 is a state-of-the-art machine learning architecture for Natural Language Processing (NLP) released by OpenAI in 2019.
OpenAI just released GPT-3 as an API for a closed beta and has committed to public release by the end of 2020. So why bother with its predecessor, GPT-2?
The current closed beta prevents everyone but Silicon Valley insiders, well-known internet bloggers, and published researchers from accessing GPT-3. Based on the manner of release for GPT-3, OpenAI will probably never release the core code. …