AiDetector provides a simple interface for training and running models that classify whether text was generated by AI.

AI Detector: Detecting AI Generated Text

Overview

AI Detector is a Python module, based on PyTorch, that simplifies the process of training and deploying a classification model to detect whether a given text has been generated by AI. It is designed to be platform-agnostic, making AI detection capabilities accessible to users across different work environments.


Installation

There are two methods available for installing the AI Detector module:

  • Using pip: You can install AI Detector directly from PyPI using pip by running the following command:

    pip3 install aidetector

  • From this repository: Alternatively, you can clone this repository and install it locally:

    git clone https://github.com/baileytec-labs/aidetector.git
    cd aidetector
    pip3 install .
    

Usage

AI Detector can be operated in two modes: training and inference.

Training

To train a new model, you need a CSV dataset with a classification column (labels: 0 for human-written and 1 for AI-generated text) and a text column (the text data). The script takes the following command-line arguments:

aidetector train \
    --datafile [path_to_data] \
    --modeloutputfile [path_to_model] \
    --vocaboutputfile [path_to_vocab] \
    --tokenmodel [SpaCy model] \
    --percentsplit [percentage_for_test_split] \
    --classificationlabel [classification_label_in_data] \
    --textlabel [text_label_in_data] \
    --download \
    --lowerbound [lower_bound_for_early_stopping] \
    --upperbound [upper_bound_for_early_stopping] \
    --epochs [number_of_epochs]
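
As a rough illustration of the expected input, the sketch below writes a two-row CSV with a label column and a text column, then assembles a matching train command without running it. The column names `classification` and `text`, the file names, and the sample sentences are assumptions made for this example; only a subset of the flags listed above is shown, and this README does not state which flags are optional.

```python
import csv
import shlex
import tempfile
from pathlib import Path

# Hypothetical column names; use whichever names you then pass via
# --classificationlabel and --textlabel.
LABEL_COLUMN = "classification"
TEXT_COLUMN = "text"

# Labels follow the convention described above: 0 = human-written, 1 = AI-generated.
rows = [
    {LABEL_COLUMN: 0, TEXT_COLUMN: "Went fishing with my brother, caught nothing."},
    {LABEL_COLUMN: 1, TEXT_COLUMN: "As a language model, I can summarize the text."},
]

workdir = Path(tempfile.mkdtemp())
datafile = workdir / "train.csv"

# Write the dataset with a header row so the columns can be referred to by name.
with datafile.open("w", newline="", encoding="utf-8") as handle:
    writer = csv.DictWriter(handle, fieldnames=[LABEL_COLUMN, TEXT_COLUMN])
    writer.writeheader()
    writer.writerows(rows)

# Assemble the command as a list of arguments; it is only printed here, not executed.
command = [
    "aidetector", "train",
    "--datafile", str(datafile),
    "--modeloutputfile", str(workdir / "detector.model"),
    "--vocaboutputfile", str(workdir / "detector.vocab"),
    "--classificationlabel", LABEL_COLUMN,
    "--textlabel", TEXT_COLUMN,
]
print(shlex.join(command))
```

A real training set needs far more than two rows; the point here is only the file layout and how the column names line up with the flags.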

Inference

To make predictions with a trained model, you need to provide the text you want to classify. The script takes the following command-line arguments:

aidetector infer \
    --modelfile [path_to_trained_model] \
    --vocabfile [path_to_vocab] \
    --text [text_to_classify] \
    --tokenmodel [SpaCy_model] \
    --threshold [probability_threshold_for_classification] \
    --download [flag_to_download_SpaCy_model]

The prediction will be printed to the console: "This was written by AI" or "This was written by a human."
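
To make the role of a probability threshold concrete, here is a minimal sketch of how a score could be turned into one of the two messages. It is an illustration, not the package's code: the function names, the 0.5 default, the assumption that the score is the probability of AI authorship, and the use of `>=` rather than `>` are all assumptions.

```python
def label_from_probability(probability: float, threshold: float = 0.5) -> int:
    """Return 1 (AI) when the score reaches the threshold, otherwise 0 (human).

    Assumes the score is the estimated probability that the text is AI-generated.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return 1 if probability >= threshold else 0


def message_for(label: int) -> str:
    """Map a 0/1 label to the console messages quoted above."""
    return "This was written by AI" if label == 1 else "This was written by a human."


# Hypothetical scores, chosen only to show both outcomes.
for score in (0.2, 0.8):
    print(score, message_for(label_from_probability(score, threshold=0.5)))
```

Under these assumptions, raising the threshold makes the "AI" verdict rarer, which trades missed AI text for fewer false accusations of human authors.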


Python API

You can use all of AiDetector's functionality in your Python programs. Start by importing the modules you need:


from aidetector.aidetectorclass import *
from aidetector.inference import *
from aidetector.training import *
from aidetector.tokenization import *
#or
import aidetector as ad

From there, you have access to all of the training, inference, and tokenization capabilities.

For example:

# Run inference with a trained model from Python
import torch

from aidetector.tokenization import *
from aidetector.inference import *
from aidetector.aidetectorclass import *

# Load the tokenizer and the vocabulary file
tokenizer = get_tokenizer()
vocab = load_vocab("./myvocabfile.vocab")

# Build the model for this vocabulary size and load the trained weights
model = AiDetector(len(vocab))
model.load_state_dict(torch.load("./mymodelfile.model"))

testtext = "Is this written by AI?"

isai = check_input(
    model,
    vocab,
    testtext,
    tokenizer=tokenizer,
)

# check_input returns 0 if human, 1 if AI.
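
To classify several texts with one loaded model, the call above can be wrapped in a small helper. The sketch below is hypothetical: `classify_texts` is not part of aidetector, and it takes the checking function as a parameter so that it can be run here with a stand-in. In real use you might pass something like `lambda t: check_input(model, vocab, t, tokenizer=tokenizer)`.

```python
from typing import Callable, Dict, Iterable


def classify_texts(texts: Iterable[str], check: Callable[[str], int]) -> Dict[str, int]:
    """Apply a 0/1 checker to each text and collect the labels by text."""
    return {text: check(text) for text in texts}


def fake_check(text: str) -> int:
    """Stand-in checker for illustration only; it is NOT the aidetector model."""
    return 1 if "as an ai" in text.lower() else 0


samples = ["As an AI, I cannot say.", "Went fishing today."]
labels = classify_texts(samples, fake_check)

# Count how many samples were flagged as AI-generated (label 1).
flagged = sum(labels.values())
print(labels, flagged)
```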




Dependencies

The main dependencies for this project include:

  • PyTorch
  • SpaCy
  • Torchtext
  • scikit-learn
  • pandas
  • argparse
  • Halo

Note: For tokenization, the project uses SpaCy models. By default, it uses the multi-language model xx_ent_wiki_sm, but other models can be specified using the --tokenmodel argument. If the model is not already downloaded, you can use the --download flag to download the model.

Contributing

Contributions to the AI Detector project are welcome. Please review CONTRIBUTION.md for further instructions.
