
AiDetector provides a simple interface for training and running models that classify whether text was generated by AI.


AI Detector: Detecting AI Generated Text

Overview

AI Detector is a Python module, based on PyTorch, that simplifies the process of training and deploying a classification model to detect whether a given text has been generated by AI. It is designed to be platform-agnostic, making AI detection capabilities accessible to users across different work environments.


Installation

There are two methods available for installing the AI Detector module:

  • Using pip: Install AI Detector directly from PyPI by running the following command:

    pip3 install aidetector

  • From this repository: Alternatively, you can clone this repository and install it locally:

    git clone https://github.com/baileytec-labs/aidetector.git
    cd aidetector
    pip3 install .
    

Usage

AI Detector can be operated in two modes: training and inference.

Training

To train a new model, you need a CSV dataset with a classification column (labels: 0 for human-written and 1 for AI-generated text) and a text column (the text data). The script takes the following command-line arguments:

aidetector train --datafile [path_to_data] --modeloutputfile [path_to_model] --vocaboutputfile [path_to_vocab] --tokenmodel [SpaCy_model] --percentsplit [percentage_for_test_split] --classificationlabel [classification_label_in_data] --textlabel [text_label_in_data] --download --lowerbound [lower_bound_for_early_stopping] --upperbound [upper_bound_for_early_stopping] --epochs [number_of_epochs]
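
As a minimal sketch of the dataset layout described above, the following writes a tiny CSV with one label column and one text column using only the Python standard library. The column names `classification` and `text`, the file name `train.csv`, and the sample rows are assumptions chosen for illustration; whichever names you use are the ones to pass via --classificationlabel and --textlabel.

```python
import csv
import os
import tempfile

# Hypothetical column names; pass the same names to
# --classificationlabel and --textlabel when training.
LABEL_COLUMN = "classification"
TEXT_COLUMN = "text"

# Illustrative rows only: 0 = human-written, 1 = AI-generated.
rows = [
    {LABEL_COLUMN: 0, TEXT_COLUMN: "I missed the bus again, so I walked."},
    {LABEL_COLUMN: 1, TEXT_COLUMN: "As an AI language model, I can summarize this."},
]


def write_dataset(path, rows):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=[LABEL_COLUMN, TEXT_COLUMN])
        writer.writeheader()
        writer.writerows(rows)


def read_dataset(path):
    """Read the CSV back as a list of dicts (all values are strings)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Write the example file to the system temp directory.
dataset_path = os.path.join(tempfile.gettempdir(), "train.csv")
write_dataset(dataset_path, rows)
```

A file written this way could then be supplied as the --datafile argument; a real training set would of course need far more than two rows.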

Inference

To make predictions with a trained model, you need to provide the text you want to classify. The script takes the following command-line arguments:

aidetector infer --modelfile [path_to_trained_model] --vocabfile [path_to_vocab] --text [text_to_classify] --tokenmodel [SpaCy_model] --threshold [probability_threshold_for_classification] --download

The prediction will be printed to the console: "This was written by AI" or "This was written by a human."
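
The --threshold argument is described only as a probability threshold for classification. One plausible reading, sketched below, is that a predicted AI probability at or above the threshold yields the AI label. The function names `label_from_probability` and `describe`, the default of 0.5, and the inclusive comparison are all assumptions for illustration, not confirmed behavior of aidetector.

```python
def label_from_probability(probability, threshold=0.5):
    """Return 1 (AI) if the probability reaches the threshold, else 0 (human).

    Assumption: the comparison is inclusive; the real tool may differ.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return 1 if probability >= threshold else 0


def describe(label):
    """Map a 0/1 label to the console messages quoted above."""
    if label == 1:
        return "This was written by AI"
    return "This was written by a human."


# A hypothetical model output of 0.83 checked against a threshold of 0.7.
print(describe(label_from_probability(0.83, threshold=0.7)))
```

Under this reading, raising the threshold makes the classifier more reluctant to flag text as AI-generated, and lowering it does the opposite.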


Python API

You can use all the functionality of AiDetector in your Python programs. Start by importing the modules you need:


from aidetector.aidetectorclass import *
from aidetector.inference import *
from aidetector.training import *
from aidetector.tokenization import *
#or
import aidetector as ad

From there, you have access to all of the training, inference, and tokenization capabilities.

For example:

# Run inference with a trained model from Python
import torch  # needed for torch.load below

from aidetector.tokenization import *
from aidetector.inference import *
from aidetector.aidetectorclass import *

tokenizer = get_tokenizer()
vocab = load_vocab("./myvocabfile.vocab")
model = AiDetector(len(vocab))

# Load the trained weights into the model before classifying
model.load_state_dict(torch.load("./mymodelfile.model"))

testtext = "Is this written by AI?"

isai = check_input(
    model,
    vocab,
    testtext,
    tokenizer=tokenizer,
)

# check_input returns 0 if human, 1 if AI.




Dependencies

The main dependencies for this project include:

  • PyTorch
  • spaCy
  • Torchtext
  • scikit-learn
  • pandas
  • argparse
  • Halo

Note: For tokenization, the project uses spaCy models. By default, it uses the multi-language model xx_ent_wiki_sm, but you can specify another model with the --tokenmodel argument. If the model is not already installed, add the --download flag to download it.
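
What the --download flag automates can be approximated in Python. The sketch below tries to load a spaCy model and downloads it only if loading fails. The helper name `load_or_download` is hypothetical, and the loader and downloader are passed in as parameters so the logic can be followed without spaCy installed; with spaCy available you would pass `spacy.load` and `spacy.cli.download`. This is an illustration, not code taken from aidetector.

```python
def load_or_download(name, load, download):
    """Load a model by name, downloading it first if it is missing.

    `load` and `download` are callables, for example spacy.load and
    spacy.cli.download. spaCy raises OSError when a model package
    cannot be found, so that is the error handled here.
    """
    try:
        return load(name)
    except OSError:
        # Model not installed yet: fetch it, then try once more.
        download(name)
        return load(name)


# Usage with spaCy installed (not executed here):
#   import spacy
#   import spacy.cli
#   nlp = load_or_download("xx_ent_wiki_sm", spacy.load, spacy.cli.download)
```

Passing the callables in keeps the helper independent of any one library; in some environments a freshly downloaded model may only become importable after the interpreter restarts.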

Contributing

Contributions to the AI Detector project are welcome. Please review CONTRIBUTION.md for further instructions.

Download files

  • Source distribution: aidetector-0.0.2.tar.gz (13.5 kB)
  • Built distribution: aidetector-0.0.2-py3-none-any.whl (14.1 kB, Python 3)
