Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts.

Project description

A simple Python package that wraps existing model fine-tuning and generation scripts for OpenAI's GPT-2 text generation model (specifically the "small", 124M-parameter version). It also makes text generation easier: it can generate to a file for easy curation, and it accepts a prefix to force the text to start with a given phrase.

Usage

An example of downloading the model to the local system, fine-tuning it on a dataset, and generating some text:

Warning: the pretrained model, and thus any finetuned model, is 500 MB!

import gpt_2_simple as gpt2

gpt2.download_gpt2()   # model is saved into current directory under /models/124M/

sess = gpt2.start_tf_sess()
gpt2.finetune(sess, 'shakespeare.txt', steps=1000)   # steps is max number of training steps

gpt2.generate(sess)

By default, the generated model checkpoints are saved in /checkpoint/run1. To load a model from that folder and generate text from it:

import gpt_2_simple as gpt2

sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess)

gpt2.generate(sess)
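Loading can only work once a checkpoint has been written, so it can help to check for the folder first. The following is a minimal sketch that assumes the default layout described above (a checkpoint folder in the current directory with one subfolder per run). The helper names checkpoint_path and has_checkpoint are hypothetical and not part of the package.

```python
import os


def checkpoint_path(run_name="run1", checkpoint_dir="checkpoint"):
    """Return the folder where a run's checkpoints are assumed to live."""
    return os.path.join(checkpoint_dir, run_name)


def has_checkpoint(run_name="run1", checkpoint_dir="checkpoint"):
    """Return True if the run folder exists and contains at least one file."""
    path = checkpoint_path(run_name, checkpoint_dir)
    return os.path.isdir(path) and len(os.listdir(path)) > 0
```

A script could call has_checkpoint() before gpt2.load_gpt2(sess) and fall back to finetuning when it returns False.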

As with textgenrnn, you can generate and save text for later use (e.g. an API or a bot) by using the return_as_list parameter.

single_text = gpt2.generate(sess, return_as_list=True)[0]
print(single_text)
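For several samples at once, the returned list can be cleaned up before it is handed to an API or a bot. The sketch below assumes that generate accepts prefix and nsamples keyword arguments, as the upstream gpt-2-simple package does; this has not been verified for this distribution. The helper generate_samples is hypothetical, and its gpt2 argument is the imported module.

```python
def generate_samples(gpt2, sess, prefix=None, nsamples=3):
    """Generate several texts and return them as a cleaned list.

    Assumes gpt2.generate supports prefix, nsamples and return_as_list.
    """
    texts = gpt2.generate(
        sess,
        prefix=prefix,          # force each text to start with this phrase
        nsamples=nsamples,      # number of texts to generate
        return_as_list=True,    # return the texts instead of printing them
    )
    # Strip surrounding whitespace and drop empty results.
    return [t.strip() for t in texts if t.strip()]
```

With a loaded session, generate_samples(gpt2, sess, prefix="ROMEO:") would return a list of strings that can be filtered or stored.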

You can pass a run_name parameter to finetune and load_gpt2 if you want to store/load multiple models in a checkpoint folder.
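As a rough sketch of how run_name keeps models apart, the two hypothetical helpers below pass the same name to finetune and load_gpt2. The gpt2 argument is the imported module, and the dataset and run names in the usage note are placeholders.

```python
def finetune_run(gpt2, dataset, run_name, steps=1000):
    """Finetune on a dataset and store checkpoints under the given run name."""
    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess, dataset, steps=steps, run_name=run_name)
    return sess


def load_run(gpt2, run_name):
    """Load the model previously stored under the given run name."""
    sess = gpt2.start_tf_sess()
    gpt2.load_gpt2(sess, run_name=run_name)
    return sess
```

For example, finetune_run(gpt2, 'shakespeare.txt', 'shakespeare') in one Python session, then load_run(gpt2, 'shakespeare') followed by gpt2.generate(sess, run_name='shakespeare') in a later one. The run_name argument to generate is assumed from the upstream gpt-2-simple package.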

NB: Restart the Python session first if you want to finetune on another dataset or load another model.

Download files

Source Distribution

gpt2_plot-0.0.7.tar.gz (25.2 kB)

Uploaded Source

Built Distribution

gpt2_plot-0.0.7-py3-none-any.whl (46.2 kB)

Uploaded Python 3

File details

Details for the file gpt2_plot-0.0.7.tar.gz.

File metadata

  • Download URL: gpt2_plot-0.0.7.tar.gz
  • Upload date:
  • Size: 25.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/47.1.1 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.7.6

File hashes

Hashes for gpt2_plot-0.0.7.tar.gz
Algorithm     Hash digest
SHA256        a05de6c263cd9c2a98622b13cc585c7b01c59de46f6b9a5200240c4bd78a1220
MD5           a8bc171dfc5586e5e89387047b76a9fc
BLAKE2b-256   3fc59ea5ed27064fcd1e81a676c19e2721d64a2cac801d8231d6f1d33504fad6

File details

Details for the file gpt2_plot-0.0.7-py3-none-any.whl.

File metadata

  • Download URL: gpt2_plot-0.0.7-py3-none-any.whl
  • Upload date:
  • Size: 46.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/47.1.1 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.7.6

File hashes

Hashes for gpt2_plot-0.0.7-py3-none-any.whl
Algorithm     Hash digest
SHA256        4093bcde8ccf453546fa0f6268ae48e3a2e56ac2189c90e354bdd905ae2e1309
MD5           908ca61e0f8a2ce3b24892d6c2579162
BLAKE2b-256   5aac4877e22bd5a65c8a86266875c423393c8c4223aeac7cea84b677eb0c3790
