Model for use with Autodistill
Autodistill GPT Module
This repository contains the code supporting the GPT (text) base model for use with Autodistill.
You can use Autodistill GPT to classify text with OpenAI's GPT models, then use the resulting labels to train smaller, fine-tuned text classification models. Autodistill GPT also works with LLaMAfile text generation models.
Read the full Autodistill documentation.
Installation
To use GPT or LLaMAfile models with Autodistill, you need to install the following dependency:
pip3 install autodistill-gpt-text
Quickstart (LLaMAfile)
from autodistill.detection import CaptionOntology
from autodistill_gpt_text import GPTClassifier
# define an ontology to map class names to our GPT prompt
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the label that will
# be saved for that caption in the generated annotations
# then, load the model
base_model = GPTClassifier(
    ontology=CaptionOntology(
        {
            "computer vision": "computer vision",
            "natural language processing": "nlp"
        }
    ),
    base_url="http://localhost:8080/v1",  # your llamafile server
    model_id="LLaMA_CPP"
)
# label a single text, using the model instance created above
result = base_model.predict("This is a blog post about computer vision.")
# label a JSONL file of texts
base_model.label("data.jsonl", output="output.jsonl")
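The label call above reads a JSONL file, which holds one JSON object per line. The following is a minimal sketch of how such a file could be prepared with the standard library. The write_jsonl helper and the "text" key are assumptions made for illustration; this README does not state which key the module reads, so check the module source before relying on it.

```python
import json
from pathlib import Path

# Hypothetical key name: the field that holds each text is an assumption,
# not something documented in this README.
TEXT_KEY = "text"


def write_jsonl(texts, path, text_key=TEXT_KEY):
    """Write one JSON object per line, each holding a single text."""
    with Path(path).open("w", encoding="utf-8") as f:
        for text in texts:
            f.write(json.dumps({text_key: text}, ensure_ascii=False) + "\n")


if __name__ == "__main__":
    write_jsonl(
        [
            "This is a blog post about computer vision.",
            "A tutorial on tokenizers for natural language processing.",
        ],
        "data.jsonl",
    )
```

If the module expects a different key, pass it through the text_key argument instead of editing the helper.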
Quickstart (GPT)
from autodistill.detection import CaptionOntology
from autodistill_gpt_text import GPTClassifier
# define an ontology to map class names to our GPT prompt
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the label that will
# be saved for that caption in the generated annotations
# then, load the model
base_model = GPTClassifier(
    ontology=CaptionOntology(
        {
            "computer vision": "computer vision",
            "natural language processing": "nlp"
        }
    )
)
# label a single text, using the model instance created above
result = base_model.predict("This is a blog post about computer vision.")
# label a JSONL file of texts
base_model.label("data.jsonl", output="output.jsonl")
The output JSONL file contains all the data from your original file, plus a new classification key in each entry that holds the predicted text label for that entry.
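To check the labels before training on them, you can tally the classification values in the output file. This is a minimal sketch that uses only the standard library; the count_labels helper is a hypothetical name, and the only thing it relies on from this README is the classification key described above.

```python
import json
from collections import Counter


def count_labels(path, label_key="classification"):
    """Count how often each predicted label appears in a JSONL file."""
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            # Entries without the key are counted under None.
            counts[record.get(label_key)] += 1
    return counts


if __name__ == "__main__":
    for label, n in count_labels("output.jsonl").most_common():
        print(f"{label}: {n}")
```

A heavily skewed tally, or many entries counted under None, suggests the ontology captions need adjusting before the labels are used for training.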
License
This project is licensed under an MIT license.
🏆 Contributing
We love your input! Please see the core Autodistill contributing guide to get started. Thank you 🙏 to all our contributors!
Download files
File details
Details for the file autodistill_gpt_text-0.1.0.tar.gz.
File metadata
- Download URL: autodistill_gpt_text-0.1.0.tar.gz
- Upload date:
- Size: 3.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ebe1ed58dab790f2d8b7dcfefb88175ad3e6d9cdf1e3183c34939c0449915962 |
| MD5 | 8a69b8a618dd7bc8b5e1cbe1f6e3fbd5 |
| BLAKE2b-256 | 3fcaf0dafb84a5d0fca25cbc3e5c5bd9fe46ea370f9e759bce8f31e9fc23cb6b |
File details
Details for the file autodistill_gpt_text-0.1.0-py3-none-any.whl.
File metadata
- Download URL: autodistill_gpt_text-0.1.0-py3-none-any.whl
- Upload date:
- Size: 3.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 31e6f5dd7031a1ad5be93f3fdf070d396a385b95f0dd49ecf2f719fd988be1f2 |
| MD5 | 94985d2d51f1bfdbb0f0f485f2008587 |
| BLAKE2b-256 | ff39914b1dd5080d421b263ffa49d019597f034cb64c5a095e3539658058ffff |