
Datasets and models for instruction-tuning


txtinstruct is a framework for training instruction-tuned models.


The objective of this project is to support open data, open models and integration with your own data. One of the biggest problems today is the lack of licensing clarity around instruction-following datasets and large language models. txtinstruct makes it easy to build your own instruction-following datasets and use those datasets to train instruction-tuned models.

txtinstruct is built with Python 3.7+ and txtai.


The easiest way to install is via pip and PyPI:

pip install txtinstruct
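
To confirm that the install succeeded, the installed version can be looked up from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of txtinstruct, and `importlib.metadata` assumes Python 3.8 or later (on 3.7 the `importlib_metadata` backport would be needed).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing.

    Hypothetical helper for illustration; not part of txtinstruct.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if txtinstruct is installed, otherwise None
print(installed_version("txtinstruct"))
```

A result of `None` suggests that pip installed into a different interpreter or environment than the one running the check.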

You can also install txtinstruct directly from GitHub. Using a Python Virtual Environment is recommended.

pip install git+
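
For the recommended virtual environment, one option is to create it with Python's standard `venv` module before running pip. The sketch below is an illustration only; the helper name `create_env` and the `.venv` directory name are assumptions, not something txtinstruct prescribes.

```python
import venv
from pathlib import Path


def create_env(path, with_pip=True):
    """Create a virtual environment at path and return its pyvenv.cfg location.

    Hypothetical helper for illustration; not part of txtinstruct.
    """
    builder = venv.EnvBuilder(with_pip=with_pip)
    builder.create(str(path))
    return Path(path) / "pyvenv.cfg"


# Example usage (the directory name is an assumption):
# create_env(".venv")
```

After activating the environment, either pip command above installs txtinstruct into it rather than into the system Python.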

Python 3.7+ is supported.

See this link to help resolve environment-specific install issues.


The following example notebooks show how to build models with txtinstruct.

Notebook | Description
Introducing txtinstruct | Build instruction-tuned datasets and models


Download files


Source Distribution: txtinstruct-0.1.0.tar.gz (7.4 kB)

Built Distribution: txtinstruct-0.1.0-py3-none-any.whl (13.4 kB, Python 3)
