An advanced, from-scratch NLP framework for training and deploying modern transformer models.


██████ ██████ ██   ██ ██████ ██████ ██  ██ ██   ██ ██     ██████  
    ██ ██     ███  ██   ██     ██   ██  ██ ███  ██ ██     ██   ██ 
   ██  █████  ████ ██   ██     ██   ██████ ████ ██ ██     ██████  
  ██   ██     ██ ████   ██     ██   ██  ██ ██ ████ ██     ██      
 ██    ██     ██  ███   ██     ██   ██  ██ ██  ███ ██     ██      
██████ ██████ ██   ██ ██████   ██   ██  ██ ██   ██ ██████ ██

Zenith NLP Framework

A Framework for Advanced Natural Language Processing

Built with: Python, PyTorch, Hydra, MLflow, Docker, FastAPI, pytest, GitHub Actions

ZenithNLP is an advanced, from-scratch NLP framework built with PyTorch for training, fine-tuning, and deploying modern transformer-based models. It serves as a comprehensive toolkit for NLP practitioners and researchers, featuring a modular architecture and a full suite of MLOps capabilities.

✨ Features

State-of-the-Art Model Architectures: From-scratch implementations of:
- BERT (Encoder-only) for tasks like classification and NER.
- GPT (Decoder-only) for causal language modeling and text generation.
- Seq2SeqTransformer (Encoder-Decoder) for translation and summarization.
Advanced Training Techniques:
- Parameter-Efficient Fine-Tuning (PEFT): Integrated LoRA (Low-Rank Adaptation) for efficient fine-tuning of large models.
- Distributed Training: Support for multi-GPU training using PyTorch's DistributedDataParallel.
- Advanced Optimization: Includes learning rate scheduling with warm-up and gradient clipping.
Full MLOps Pipeline:
- Configuration Management: Powered by Hydra, allowing for flexible and reproducible experiments through YAML files.
- Experiment Tracking: Integrated with MLflow to log parameters, metrics, and model artifacts automatically.
- Containerization: Fully containerized with Docker and Docker Compose for reproducible environments and easy deployment of the MLflow UI.
- Continuous Integration: Automated testing pipeline with GitHub Actions and pytest.
Flexible API for Deployment:
- A ready-to-use FastAPI server that can dynamically load and serve any model trained with the framework.
Custom Core Components:
- A trainable Byte-Pair Encoding (BPE) Tokenizer built from scratch.
- Modular implementations of MultiHeadAttention, PositionalEncoding, and other core transformer building blocks.
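
To make the "trainable BPE tokenizer" item more concrete, here is a minimal sketch of how Byte-Pair Encoding learns its merge rules: count adjacent symbol pairs across the corpus, merge the most frequent pair, and repeat. This is a generic illustration in plain Python, not the framework's own tokenizer; the function names (`learn_bpe`, `get_pair_counts`, `merge_pair`) and the `</w>` end-of-word marker are assumptions made for this example.

```python
from collections import Counter


def get_pair_counts(words):
    """Count adjacent symbol pairs, weighted by how often each word occurs."""
    counts = Counter()
    for symbols, freq in words.items():
        for left, right in zip(symbols, symbols[1:]):
            counts[(left, right)] += freq
    return counts


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single concatenated symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(corpus, num_merges):
    """Return the ordered list of merges learned from a whitespace-split corpus."""
    # Each word starts as a tuple of characters plus an end-of-word marker.
    words = dict(Counter(tuple(word) + ("</w>",) for word in corpus.split()))
    merges = []
    for _ in range(num_merges):
        counts = get_pair_counts(words)
        if not counts:
            break
        # Most frequent pair wins; ties go to the lexicographically larger
        # pair so that repeated runs give the same result.
        best = max(counts, key=lambda p: (counts[p], p))
        merges.append(best)
        words = merge_pair(words, best)
    return merges


merges = learn_bpe("low low low lower lowest", num_merges=3)
print(merges)
```

On this toy corpus the first three merges build up the frequent word "low" step by step: `("o", "w")`, then `("l", "ow")`, then `("low", "</w>")`. A production tokenizer would also store the resulting vocabulary and apply the merges in order when encoding new text.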

🚀 Getting Started

1. Installation (from PyPI)

The framework is published on PyPI as zenith-nlp-framework and can be installed directly with pip:

pip install zenith-nlp-framework

2. Local Development Setup

# 1. Clone the repository
git clone https://github.com/cattolatte/zenith-nlp-framework.git
cd zenith-nlp-framework

# 2. Create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate

# 3. Install all dependencies
pip install -r requirements.txt

# 4. Install the project in editable mode
pip install -e .

📖 Tutorial: Training a Text Classifier

This framework is designed for flexibility. Here’s how you can train your own text classification model.

1. Prepare Your Data and Configs

Place your training data (e.g., my_data.csv) in a local data/ directory. Use the configs/ directory as a template. You can modify config.yaml or create a new one to point to your data file and adjust model/training parameters.
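
The column layout of the CSV is not spelled out here. A common layout for text classification is one free-text column and one integer label column, so the sketch below assumes hypothetical column names `text` and `label`; check the data loader or the files in configs/ for the names the framework actually expects. The example rows are made up for illustration.

```python
import csv
import io

# Hypothetical example rows: (text, integer class label).
EXAMPLE_ROWS = [
    ("The battery lasts all day, even with heavy use.", 1),
    ("Stopped working after a week.", 0),
    ("Exactly what I was looking for.", 1),
]


def to_csv_text(rows):
    """Serialize (text, label) pairs to CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["text", "label"])  # assumed column names
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv_text(EXAMPLE_ROWS)
print(csv_text)
```

Writing `csv_text` to data/my_data.csv gives a file in the location the tutorial describes. Using the `csv` module rather than manual string joins matters here because review text often contains commas and quotes, which the writer escapes correctly.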

2. Run Training

Run the text classification task script. All parameters are managed by the Hydra configuration files in the configs/ directory.

# Run with default settings from the config files
python3 -m my_nlp_framework.tasks.text_classification

You can easily override any parameter from the command line:

# Train for more epochs with a different learning rate
python3 -m my_nlp_framework.tasks.text_classification training.epochs=10 training.learning_rate=0.0005

# Train with LoRA enabled
python3 -m my_nlp_framework.tasks.text_classification model.use_lora=True model.lora_rank=8
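
The dotted names in these commands address keys in the nested YAML configuration. The sketch below shows that mapping in plain Python. It is a simplified illustration of the idea, not Hydra's implementation, and the default values in `defaults` are hypothetical; only the key names come from the commands above.

```python
import copy


def parse_value(raw):
    """Convert a command-line string to bool, int, or float when possible."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw


def apply_overrides(config, overrides):
    """Return a copy of `config` with each `a.b=value` override applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, raw = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            node = node[name]  # KeyError if the group does not exist
        if leaf not in node:
            # Hydra likewise rejects unknown keys unless they are added
            # explicitly with a leading "+".
            raise KeyError(f"unknown config key: {dotted_key}")
        node[leaf] = parse_value(raw)
    return result


# Hypothetical defaults; the real values live in configs/.
defaults = {
    "training": {"epochs": 3, "learning_rate": 0.001},
    "model": {"use_lora": False, "lora_rank": 4},
}

updated = apply_overrides(
    defaults,
    [
        "training.epochs=10",
        "training.learning_rate=0.0005",
        "model.use_lora=True",
        "model.lora_rank=8",
    ],
)
print(updated)
```

The defaults are left untouched and the overrides only affect the current run, which is what keeps experiments reproducible: the YAML files stay the single source of truth, and each run records the overrides it was launched with.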

3. Track Experiments with MLflow

Launch the MLflow server before you start a training run so that parameters and metrics are logged as the run progresses. The docker-compose.yml file in the repository is already configured for this.

# Start the MLflow server in the background
docker-compose up -d

Navigate to http://localhost:5000 in your browser to view the MLflow dashboard.

🌐 Serving Your Model via API

Once you have a trained model (.pth file) and tokenizer (.json file), you can easily deploy it with the built-in FastAPI server.

python3 -m my_nlp_framework.inference.api \
    --model-path /path/to/your/trained_model.pth \
    --tokenizer-path /path/to/your/tokenizer.json \
    --vocab-size 10000 \
    --num-classes 2

The server listens on http://localhost:8000. Open http://localhost:8000/docs for the interactive API documentation, where you can test requests from the browser.
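
The server can also be called from code. The sketch below builds a JSON POST request with the standard library. The endpoint path `/predict` and the `{"text": ...}` payload shape are assumptions made for illustration; the actual route and request schema are listed on the /docs page of the running server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"
PREDICT_PATH = "/predict"  # hypothetical route; confirm it on the /docs page


def build_request(text, base_url=BASE_URL, path=PREDICT_PATH):
    """Build a JSON POST request carrying the text to classify."""
    payload = json.dumps({"text": text}).encode("utf-8")  # assumed schema
    return urllib.request.Request(
        base_url + path,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(text):
    """Send the request and decode the JSON response (server must be running)."""
    with urllib.request.urlopen(build_request(text), timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network, so it is safe to inspect.
request = build_request("This framework is easy to use.")
print(request.get_method(), request.full_url)
```

Call `classify("some text")` once the server from the previous step is up. Keeping request construction separate from sending makes it easy to adjust the path or payload after checking the real schema.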

🐳 Running with Docker

You can also run the entire training process inside a Docker container, so that every run uses the same environment and dependencies.

# 1. Build the Docker image
docker build -t zenith-nlp-framework:latest .

# 2. Run a task (mounting your local data directory)
docker run --rm -v "$(pwd)/data":/app/data zenith-nlp-framework:latest \
  python -m my_nlp_framework.tasks.text_classification

🏛️ Framework Architecture

This framework is organized into several key modules:

src/my_nlp_framework/core: Contains the fundamental building blocks like attention mechanisms, LoRA layers, and tokenizers.
src/my_nlp_framework/models: Defines high-level model architectures like BERT and GPT.
src/my_nlp_framework/data: Includes flexible data loaders.
src/my_nlp_framework/training: The centralized training engine shared by all tasks.
src/my_nlp_framework/tasks: Example scripts that show how to use the framework to solve end-to-end problems.
src/my_nlp_framework/inference: Code for deploying and serving trained models.
configs/: Centralized YAML configuration files for Hydra.
tests/: Unit and integration tests for the framework.
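
The core module includes LoRA layers, and the tutorial enables them with model.lora_rank=8. As a rough illustration of why a low rank is cheap to train, the arithmetic below compares the number of trainable parameters in a full weight update with those in a rank-r update, where the update to a d_out x d_in weight matrix is factored as B (d_out x r) times A (r x d_in). The hidden size of 768 is a hypothetical value chosen for the example, not a framework default.

```python
def full_update_params(d_in, d_out):
    """Trainable parameters when the whole d_out x d_in matrix is updated."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Trainable parameters for a rank-`rank` update B @ A.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * d_in + d_out * rank


d_model = 768  # hypothetical hidden size
rank = 8       # matches model.lora_rank=8 from the tutorial

full = full_update_params(d_model, d_model)
lora = lora_params(d_model, d_model, rank)
print(f"full: {full}, lora: {lora}, ratio: {lora / full:.2%}")
```

For a single 768 x 768 projection this gives 589,824 parameters for the full update versus 12,288 for the rank-8 update, roughly 2% of the original. The saving grows with the hidden size because the full update scales quadratically while the low-rank update scales linearly.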

🤝 Contributing

Contributions are welcome! Please feel free to submit a pull request or open an issue.

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.

Made with ❤️ by K Satya Sai Nischal

Release history

Version 1.0.0, released Oct 14, 2025

Download files

Source Distribution: zenith_nlp_framework-1.0.0.tar.gz (17.4 kB), uploaded Oct 14, 2025
Built Distribution: zenith_nlp_framework-1.0.0-py3-none-any.whl (18.4 kB), uploaded Oct 14, 2025, Python 3

File details

Details for the file zenith_nlp_framework-1.0.0.tar.gz.

File metadata

Download URL: zenith_nlp_framework-1.0.0.tar.gz
Upload date: Oct 14, 2025
Size: 17.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.6

File hashes

Hashes for zenith_nlp_framework-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`a8a5f1bc5bb3568dc2008fc0621a90dbd32a8251afb757c227319ce1ae2ee9b2`
MD5	`e6ea9986403b3a0a82afa7642ea8e245`
BLAKE2b-256	`d0d9127a1146885679966b6d1d430c02078faed4963d1c328313454d41b3a922`


File details

Details for the file zenith_nlp_framework-1.0.0-py3-none-any.whl.

File metadata

Download URL: zenith_nlp_framework-1.0.0-py3-none-any.whl
Upload date: Oct 14, 2025
Size: 18.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.6

File hashes

Hashes for zenith_nlp_framework-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b52c51190fc1165282459bb679eff1bc4b055a6f74f720c4081d0ade014940ca`
MD5	`7ff4156c6f93be4ed54b26f4dc2c76b5`
BLAKE2b-256	`5fb9fcf3a83812e6c45b9dbc35031c1b09fce941f4c2a49fbfa912c6f43762c3`

