Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts.
Project description
A simple Python package that wraps existing model fine-tuning and generation scripts for OpenAI's GPT-2 text generation model (specifically the "small", 117M-parameter version). Additionally, this package makes text generation easier: it can generate to a file for easy curation, and it accepts a prefix to force the generated text to start with a given phrase.
Usage
An example of downloading the model to the local system, finetuning it on a dataset, and generating some text:
Warning: the pretrained model, and thus any finetuned model, is 500 MB!
import gpt_2_simple as gpt2
gpt2.download_gpt2() # model is saved into current directory under /models/117M/
sess = gpt2.start_tf_sess()
gpt2.finetune(sess, 'shakespeare.txt', steps=1000) # steps is max number of training steps
gpt2.generate(sess)
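finetune exposes a few additional knobs for longer runs; the sketch below uses parameter names from recent releases of the package (run_name, save_every, sample_every), so verify them against your installed version:
gpt2.finetune(sess,
              'shakespeare.txt',
              steps=1000,
              run_name='run1',     # checkpoint subfolder: /checkpoint/run1
              save_every=500,      # write a checkpoint every 500 steps
              sample_every=200)    # print sample generated text every 200 steps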
The generated model checkpoints are by default in /checkpoint/run1. If you want to load a model from that folder and generate text from it:
import gpt_2_simple as gpt2
sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess)
gpt2.generate(sess)
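As noted in the description, generation can be steered with a prefix and written straight to a file for curation. A sketch, assuming the prefix parameter and the generate_to_file helper available in recent releases (the prefix string and filename are illustrative):
gpt2.generate(sess, prefix="LORD")                            # force output to start with this phrase
gpt2.generate_to_file(sess, destination_path='gentext.txt')   # save samples to a file for curation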
As with textgenrnn, you can generate and save text for later use (e.g. an API or a bot) by using the return_as_list parameter.
single_text = gpt2.generate(sess, return_as_list=True)[0]
print(single_text)
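To collect several candidate texts at once, return_as_list combines with the nsamples parameter (part of the generate signature in recent releases; confirm it is exposed in yours):
texts = gpt2.generate(sess, nsamples=5, return_as_list=True)  # list of 5 generated strings
for text in texts:
    print(text)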
You can pass a run_name parameter to finetune and load_gpt2 if you want to store/load multiple models in a checkpoint folder.
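For example, to keep two finetuned models side by side (the second dataset name here is purely illustrative):
gpt2.finetune(sess, 'shakespeare.txt', steps=1000, run_name='run1')  # saves to /checkpoint/run1
# restart the Python session (see the NB below), then:
gpt2.finetune(sess, 'poetry.txt', steps=1000, run_name='run2')       # saves to /checkpoint/run2

# later, load and generate from a specific run:
gpt2.load_gpt2(sess, run_name='run2')
gpt2.generate(sess, run_name='run2')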
NB: Restart the Python session first if you want to finetune on another dataset or load another model.
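If restarting the interpreter is inconvenient (e.g. inside a notebook), a minimal sketch of the same reset using the TensorFlow 1.x API the package is built on; this is a workaround, not a documented feature of gpt-2-simple:
import tensorflow as tf

tf.reset_default_graph()     # discard the previously loaded model's graph (TF 1.x)
sess = gpt2.start_tf_sess()  # open a fresh session before finetuning or loading again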