
A lightweight framework for training and deploying language models for thematic text classification.


🐝 Beevibe

Feel the vibe of building smarter models.

Current Version: 0.1


🐝 About Beevibe

Beevibe is a Python package that makes it easier to fine-tune language models on themed text datasets and to run inference on new sentences. It is aimed at developers and researchers who want tooling that is efficient, intuitive, and scalable.

Beevibe combines parameter-efficient fine-tuning, configurable classification heads, and built-in validation helpers to keep the training workflow short.


Features

It integrates:

  • Simplified Usage: A lightweight API designed to be simple and efficient to use.
  • Powered Functionalities:
    • Energy-efficient training using QLoRA.
    • Seamless creation of classification heads.
    • High-level functions to manage holdout and cross-validation.
  • Tutorial-Ready:
    • Comes with synthetic datasets for Elegana Customer Relationship Management, covering binary, multi-class, and multi-label classification.
    • Comprehensive tutorials for CamemBERT, CamemBERTV2, and ModernBERT.
  • Streamlined Development:
    • Supports GitHub Codespaces and VSCode for development.
    • Colab integration for GPU testing.
  • Quality Assurance:
    • Implements ruff for code linting and pydantic for parameter validation.
    • Comprehensive test suite for non-regression verification.
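
The three task types mentioned above (binary, multi-class, multi-label) differ mainly in how each text's label is laid out. The sketch below is a generic illustration in plain Python; the theme names and helper functions are hypothetical, and it does not claim to be the input format Beevibe requires.

```python
# Generic illustration of label layouts. The theme names are made up and this
# is not Beevibe's documented input format.
THEMES = ["billing", "delivery", "product"]  # hypothetical themes
THEME_TO_ID = {name: idx for idx, name in enumerate(THEMES)}


def encode_binary(is_positive: bool) -> int:
    """Binary: one label per text, either 0 or 1."""
    return int(is_positive)


def encode_multiclass(theme: str) -> int:
    """Multi-class: exactly one theme per text, stored as its integer id."""
    return THEME_TO_ID[theme]


def encode_multilabel(themes: list[str]) -> list[int]:
    """Multi-label: any number of themes per text, stored as a 0/1 vector."""
    active = set(themes)
    return [1 if name in active else 0 for name in THEMES]


binary_label = encode_binary(True)
multiclass_label = encode_multiclass("delivery")
multilabel_label = encode_multilabel(["billing", "product"])
```

In the multi-label case the vector has one slot per theme, so a text can switch on several themes at once, whereas the multi-class id can only name one.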

📦 Installation

Install Beevibe using pip:

pip install beevibe

🚀 Quickstart

1. Training a Model

Train CamemBERT on your custom thematic dataset:

from Beevibe import BeeMLMClassifier, BeeTrainer
import torch.nn as nn

# `texts` (a list of sentences) and `labels` (one label per sentence)
# must be loaded from your own dataset before running this snippet.

# Number of target classes, shared by the head and the model
num_classes = 5

# Define the classification head: 768 -> 512 -> 256 -> num_classes
head_layers = [
    {"input_size": 768, "output_size": 512, "activation": nn.ReLU, "batch_norm": True},
    {"input_size": 512, "output_size": 256, "activation": nn.ReLU, "layer_norm": True},
    {"input_size": 256, "output_size": num_classes, "activation": None},
]

# Initialize the model for multi-class classification
bee_mlm_model = BeeMLMClassifier(
    model_name="camembert-base",
    num_labels=num_classes,
    head_layers=head_layers,
)

# Initialize the trainer with LoRA enabled
trainer = BeeTrainer(
    model=bee_mlm_model,
    use_lora=True,
    lora_r=64,
    lora_alpha=128,
    lora_dropout=0.01,
)

# Train the model with QLoRA
ret = trainer.train(
    texts=texts,
    labels=labels,
    num_epochs=10,
)

# Save the trained model
trainer.save_model("./sav_model")

# Free CPU/GPU memory
trainer.release_model()
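
To give a feel for what `lora_r = 64` and `lora_alpha = 128` mean, here is a small arithmetic sketch based on the standard LoRA formulation, where a weight matrix of shape d_out x d_in is adapted by two low-rank factors and the update is scaled by alpha / r. It assumes a single 768 x 768 projection for illustration; which matrices Beevibe actually adapts, and whether it uses this exact scaling, is not stated in this README.

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in the two low-rank factors A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)


# Assumed for illustration: one square projection at the 768 hidden size
# used as the head's input_size in the training example.
d_in = d_out = 768
r, alpha = 64, 128  # values passed to the trainer in the example above

full_params = d_in * d_out                          # 589824 weights in the frozen matrix
lora_params = lora_trainable_params(d_in, d_out, r)  # 98304 trainable weights
ratio = lora_params / full_params                    # one sixth of the full matrix
scaling = alpha / r                                  # 2.0 in standard LoRA

print(f"trainable: {lora_params} of {full_params} ({ratio:.1%}), scaling {scaling}")
```

Under these assumptions only about a sixth of the weights of that matrix are trained; lowering `lora_r` reduces the count further at the cost of adapter capacity.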

2. Performing Inference

Use the trained model to classify or extract themes from new sentences:

from Beevibe import BeeMLMClassifier

# Load the trained model
bee_mlm_model = BeeMLMClassifier.load_model_safetensors("./sav_model")

# Infer themes for a new sentence
result = bee_mlm_model.predict(["This is a new sentence to classify."])

print("Predicted Theme:", result)
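
The README does not document what `predict` returns. If, as an assumption, the result can be reduced to a list of integer class ids, a small helper like the hypothetical `decode_predictions` below could turn them back into readable theme names. The mapping and the example ids are placeholders, not real model output.

```python
# Hypothetical id-to-theme mapping; use the one from your own training data.
ID_TO_THEME = {0: "billing", 1: "delivery", 2: "product"}


def decode_predictions(pred_ids, id_to_theme):
    """Map integer class ids to theme names, flagging ids with no known theme."""
    return [id_to_theme.get(int(i), f"unknown({i})") for i in pred_ids]


# Stand-in for model output (assumed to be integer class ids), not real predictions.
example_ids = [2, 0, 7]
decoded = decode_predictions(example_ids, ID_TO_THEME)
```

Flagging unknown ids rather than raising makes it easier to spot a mismatch between the mapping and the number of labels the model was trained with.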

📜 License

Beevibe is licensed under the MIT License. See the LICENSE file for details.


📖 Citing Beevibe

If you use Beevibe in your research, projects, or publications, please cite it as follows:

@misc{Beevibe2024,
  title={Beevibe: Feel the vibe of building smarter models},
  author={François Bullier},
  year={2024},
  url={https://github.com/fbullier/Beevibe},
  note={Version 0.1}
}

By citing Beevibe, you help others discover and build upon this work!


🌟 Acknowledgments

  • Created with the assistance of AI tools like ChatGPT.
  • Inspired by the brilliance of BERT and the power of Python.
  • Special thanks to the vibrant community of developers and data scientists who make innovation possible.

