
Model for use with Autodistill

Autodistill GPT Module

This repository contains the code supporting the GPT (text) base model for use with Autodistill.

You can use Autodistill GPT to classify text with OpenAI's GPT models, then use the resulting labels to train smaller, fine-tuned text classification models. Autodistill GPT also works with LLaMAfile text generation models.

Read the full Autodistill documentation.

Installation

To use GPT or LLaMAfile models with Autodistill, you need to install the following dependency:

pip3 install autodistill-gpt-text

Quickstart (LLaMAfile)

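# CaptionOntology is provided by the core autodistill package
from autodistill.detection import CaptionOntology
from autodistill_gpt_text import GPTClassifier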

# define an ontology to map class names to our GPT prompt
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the label that will
# be saved for that caption in the generated annotations
# then, load the model
base_model = GPTClassifier(
    ontology=CaptionOntology(
        {
            "computer vision": "computer vision",
            "natural language processing": "nlp"
        }
    ),
    base_url = "http://localhost:8080/v1", # your llamafile server
    model_id="LLaMA_CPP"
)

# label a single text
# predict is called on the model instance, not on the class
result = base_model.predict("This is a blog post about computer vision.")

# label a JSONL file of texts
base_model.label("data.jsonl", output="output.jsonl")

Quickstart (GPT)

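# CaptionOntology is provided by the core autodistill package
from autodistill.detection import CaptionOntology
from autodistill_gpt_text import GPTClassifier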

# define an ontology to map class names to our GPT prompt
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the label that will
# be saved for that caption in the generated annotations
# then, load the model
base_model = GPTClassifier(
    ontology=CaptionOntology(
        {
            "computer vision": "computer vision",
            "natural language processing": "nlp"
        }
    )
)

# label a single text
# predict is called on the model instance, not on the class
result = base_model.predict("This is a blog post about computer vision.")

# label a JSONL file of texts
base_model.label("data.jsonl", output="output.jsonl")

The output JSONL file contains all the data from your original file, plus a new classification key in each entry that holds the predicted text label for that entry.
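Once a file has been labeled, you may want a quick look at how the labels are distributed before training on them. The following is a minimal sketch, not part of this package: it reads a JSONL file and counts the values under the classification key described above. The helper names `read_jsonl` and `label_counts`, the `text` field name, and the sample records are assumptions made for illustration; the sample stands in for a real labeled output file so the snippet runs on its own.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def read_jsonl(path):
    """Return one dict per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def label_counts(records, key="classification"):
    """Count how often each label appears under `key`, skipping entries without it."""
    return Counter(r[key] for r in records if key in r)


# Hypothetical stand-in for a labeled output file.
# The "text" field name is an assumption; only "classification" is documented above.
sample = [
    {"text": "A post about object detection.", "classification": "computer vision"},
    {"text": "A post about tokenizers.", "classification": "nlp"},
    {"text": "A post about image segmentation.", "classification": "computer vision"},
]

with tempfile.TemporaryDirectory() as tmp:
    output_path = Path(tmp) / "output.jsonl"
    # Write one JSON object per line, as a JSONL file expects.
    output_path.write_text(
        "\n".join(json.dumps(r) for r in sample) + "\n", encoding="utf-8"
    )
    records = read_jsonl(output_path)

counts = label_counts(records)
print(counts.most_common())
```

To inspect a real run, replace the temporary file with the path you passed as `output` to `base_model.label`.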

License

This project is licensed under an MIT license.

🏆 Contributing

We love your input! Please see the core Autodistill contributing guide to get started. Thank you 🙏 to all our contributors!
