Efficient, pythonic code nicely packaged for consumption.

Python and many of its packages have incredible official documentation, but it can be hard to get through it. Meet Python Quick Tips! For elegant, generalizable solutions to data science problems, see Python Deep Tips (forthcoming).

Why take my word for it?
Don’t. Explore for yourself. Each tip has links to documentation and each block of code runs in a Jupyter Notebook.

Python Resources

  1. W3Schools. W3Schools documentation and tutorials are very light. They are a great primer to get your mind around a topic before trying out the official documentation.
  2. Real Python. Real…


One idea, three ways of looking at it.

No single learning method works for everyone. This is especially true for tough concepts, such as Python decorators and Python regular expressions.

Basic methods that always help learning and retention:

  1. Try, take a break, try again. Your brain is working on problems in the background. I have found that taking a break and exercising helps me a lot.
  2. Try, sleep on it, try again. When you sleep your brain reinforces pathways that have recently formed.
  3. The more senses you engage while learning the better. As such, each…



Deploy GPT-2 as an Autoscaling Web API

Prerequisites:

  1. AWS — A free account with an administrator user set up for programmatic access. When you set up the account, you will create a root user. Use the root user to create an administrator with programmatic access, and use that administrator for Cortex and for your own work. Set up at least one empty S3 bucket.
  2. Docker — Installed.
  3. Cortex — Installed per their website, with the basic Cortex tutorials completed. If you are running Windows, check out their Windows installation guide for running Cortex using WSL (Windows Subsystem for Linux) version 2. Note: If you install Ubuntu…


GPT-2 is good; now you will make it fast.

At this point you have a fine-tuned 774M variant of GPT-2, but it has two problems: it is formatted for TensorFlow 1.x, which has been deprecated, and generating text with it is slow.

To fix the formatting problems, we will use a script from Hugging Face. The result will be a fine-tuned model formatted for PyTorch 1.x (PyTorch started with 0.x). I could have converted it to TensorFlow 2.x, but ONNX has better support for PyTorch. …



Specialize GPT-2 for enhanced performance on any text

This article is part of a series on GPT-2. It’s best if you start at the beginning. The links are located at the bottom of the page.

What is fine-tuning?
GPT-2 was trained on 40 gigabytes of text ranging across many subjects. It is very good at generating text, but it can be improved by training it further on text specific to its application. This process is called fine-tuning, a form of transfer learning.

Prior to running either tutorial, see this article for setup. The best way to go through this article is interactively:

  1. Finetune with GPT-2…



If you think your data is clean, you haven’t looked at it hard enough.

This article is part of a series on GPT-2. It’s best if you start at the beginning. The links are located at the bottom of the page.

In the next tutorial, you will fine-tune (train) GPT-2 on any topic you want, using a single large text file or a folder containing many text files. In the example, I will work with a large selection of Pulitzer Prize-winning novels. You can select any text you would like as long as there is a…
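
If your text starts out as a folder of files and you would rather train on a single file, the merge can be sketched in a few lines of standard-library Python. This is a minimal sketch, not the series' own preparation script: the function name `combine_text_files`, the `.txt` filter, and the blank-line separator are all assumptions made for illustration.

```python
from pathlib import Path


def combine_text_files(folder, output_file, separator="\n\n"):
    """Concatenate every .txt file in `folder` into one training file.

    Hypothetical helper for illustration; returns the number of files merged.
    """
    folder = Path(folder)
    output_file = Path(output_file)

    # Sort for a reproducible order, and skip the output file itself
    # in case it is written into the same folder.
    sources = sorted(
        p for p in folder.glob("*.txt")
        if p.resolve() != output_file.resolve()
    )

    # Strip leading and trailing whitespace so the separator is the only gap.
    texts = [p.read_text(encoding="utf-8").strip() for p in sources]

    output_file.write_text(separator.join(texts) + "\n", encoding="utf-8")
    return len(sources)
```

Calling `combine_text_files("novels", "novels/combined.txt")` (both paths are placeholders) would write one file with the novels separated by blank lines. Reading each file fully into memory is fine for a few hundred megabytes of text; a much larger corpus would call for streaming the files instead.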



Open your toolbox, fill it up…

This article is part of a series on GPT-2. It’s best if you start at the beginning. The links are located at the bottom of the page.

In this article, I will talk about the resources you will use and direct you to tutorials. Some of the tutorials are built by others, some I built for this article series.

Initial Setup:

  1. Unless you have 15 gigabytes of free storage on your Google Drive, you should set up another Drive account to prevent storage issues.
  2. Go to this Google Drive folder. Download the folder “Everything GPT-2…


Feel the burn …

This article is part of a series on GPT-2. It’s best if you start at the beginning. The links are located at the bottom of the page.

The existing resources on GPT-2’s architecture are very good, but they are written for experienced scientists and developers. This article is a concept roadmap that makes GPT-2 more accessible to technically minded people who have not had formal schooling in Natural Language Processing (NLP). It contains the best resources I discovered while learning about GPT-2.

Prerequisites:

  1. Linear Algebra — specifically matrix multiplication, vectors, and projections from one space onto…
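
As a quick self-check on the linear algebra prerequisite, the operations it names can be written out in plain Python. This is only an illustrative sketch: the helper names `dot`, `matvec`, and `project` are made up for this example, and real models use optimized array libraries rather than lists.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector.

    A matrix with 2 rows and 3 columns maps a 3-dimensional
    vector to a 2-dimensional one.
    """
    return [dot(row, vector) for row in matrix]


def project(u, v):
    """Project vector u onto the line spanned by vector v."""
    scale = dot(u, v) / dot(v, v)
    return [scale * x for x in v]


# Map a 3-dimensional vector into a 2-dimensional space.
print(matvec([[1, 0, 0], [0, 1, 0]], [5, 6, 7]))  # [5, 6]

# Keep only the part of (3, 4) that points along (1, 0).
print(project([3, 4], [1, 0]))  # [3.0, 0.0]
```

If both printed results make sense to you without running the code, you likely have enough of this prerequisite to follow along.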



Prepare for brain melt in 3, 2, 1 …

This article is part of a series on GPT-2. It’s best if you start at the beginning. The links are located at the bottom of the page.

This article is intended to inform your intuition rather than to cover every point in depth.

In different resources, GPT-2 refers to different things. Some resources will mean the whole thing that takes in some words and gives you some more words. For this article and the next, I will refer to GPT-2 as the parts that take in some words and generate a…



Why GPT-2?

What is Generative Pre-trained Transformer (GPT)-2?
GPT-2 is a state-of-the-art machine learning architecture for Natural Language Processing (NLP) released by OpenAI in 2019.

OpenAI just released GPT-3 as an API for a closed beta and has committed to public release by the end of 2020. So why bother with its predecessor, GPT-2?
The current closed beta prevents everyone but Silicon Valley insiders, well-known internet bloggers, and published researchers from accessing GPT-3. Based on the manner of release for GPT-3, OpenAI will probably never release the core code. …

Edward Girling

Mathematician, enjoys his knowledge distilled. Find my insight deep, my jokes laughable, my resources useful? Connect with me on Twitter @Rowlando_13.
