
Comprehensive AI/ML library integrating machine learning, computer vision, audio processing, and conversational AI

Gurulearn

A unified AI/ML toolkit for deep learning, computer vision, audio processing, and conversational AI.

Built with lazy loading for minimal import overhead (~0.001s). Fully type-hinted and production-ready.


📦 Installation

pip install gurulearn              # Core only
pip install gurulearn[vision]      # + PyTorch image classification
pip install gurulearn[audio]       # + TensorFlow audio recognition  
pip install gurulearn[agent]       # + LangChain RAG agent
pip install gurulearn[ocr]         # + CTC-based character OCR
pip install gurulearn[full]        # All features

🖼️ ImageClassifier

PyTorch-based image classification with 9 model architectures.

Data Loading Options

from gurulearn import ImageClassifier

clf = ImageClassifier()

# Option 1: From directory structure (data/train/{class_name}/*.jpg)
model, history = clf.train(train_dir="data/train", test_dir="data/test")

# Option 2: From CSV file
model, history = clf.train(
    csv_file="data.csv",
    img_column="image_path",
    label_column="class"
)
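Both loaders assume a conventional on-disk layout. A sketch that builds toy versions of each with only the standard library (file names and paths are illustrative):

```python
import csv
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

# Option 1: one subdirectory per class under data/train
for cls in ("cat", "dog"):
    d = root / "data" / "train" / cls
    d.mkdir(parents=True)
    (d / "example.jpg").touch()          # placeholder image file

classes = sorted(p.name for p in (root / "data" / "train").iterdir())
print(classes)                           # ['cat', 'dog']

# Option 2: a CSV with an image-path column and a label column
csv_path = root / "data.csv"
with csv_path.open("w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["image_path", "class"])
    writer.writeheader()
    writer.writerow({"image_path": "data/train/cat/example.jpg", "class": "cat"})

with csv_path.open() as f:
    rows = list(csv.DictReader(f))
print(rows[0]["class"])                  # cat
```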

Training Parameters

model, history = clf.train(
    train_dir="data/train",
    epochs=20,
    batch_size=32,
    model_name="resnet50",     # See models below
    finetune=True,             # Finetune all layers
    learning_rate=0.001,
    use_amp=True,              # Mixed precision (GPU)
    save_path="model.pth"
)

Available Models

| Model | Best For | Parameters |
|---|---|---|
| simple_cnn | Small datasets (<1K) | 3M |
| vgg16 | General purpose | 138M |
| resnet50 | Large datasets | 25M |
| mobilenet | Mobile deployment | 3.5M |
| inceptionv3 | Fine-grained | 23M |
| densenet | Feature reuse | 8M |
| efficientnet | Accuracy/size balance | 5M |
| convnext | Modern CNN | 28M |
| vit | Vision Transformer | 86M |
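A rule of thumb for picking a `model_name` from the table (the thresholds below are illustrative, not part of the library):

```python
def suggest_model(num_images: int, mobile: bool = False) -> str:
    """Pick a model_name for ImageClassifier from rough dataset size."""
    if mobile:
        return "mobilenet"        # 3.5M params, built for deployment
    if num_images < 1_000:
        return "simple_cnn"       # small datasets, 3M params
    if num_images < 50_000:
        return "efficientnet"     # good accuracy/size balance
    return "resnet50"             # large datasets

print(suggest_model(500))      # simple_cnn
print(suggest_model(100_000))  # resnet50
```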

Prediction

# Load saved model
clf.load("model.pth", model_name="resnet50")

# Single image prediction
result = clf.predict("image.jpg", top_k=3)
print(result.class_name)       # "cat"
print(result.probability)      # 0.95
print(result.top_k)            # [("cat", 0.95), ("dog", 0.03), ...]

# From PIL Image
from PIL import Image
result = clf.predict(image=Image.open("image.jpg"))

# Export for production
clf.export_onnx("model.onnx")

🎵 AudioRecognition

TensorFlow/Keras CNN-LSTM for audio classification.

Data Loading

from gurulearn import AudioRecognition

audio = AudioRecognition(sample_rate=16000, n_mfcc=20)

# From directory structure (data/{class_name}/*.wav)
# Supports: .wav, .mp3, .flac, .ogg, .m4a
history = audio.audiotrain(
    data_path="data/audio",
    epochs=50,
    batch_size=32,
    augment=True,              # Time stretch, pitch shift, noise
    model_dir="models"
)

Training Output

  • models/audio_recognition_model.keras - Trained model
  • models/label_mapping.json - Class labels
  • models/confusion_matrix.png - Evaluation plot
  • models/training_history.png - Loss/accuracy curves

Prediction

# Single file
result = audio.predict("sample.wav", model_dir="models")
print(result.label)            # "speech"
print(result.confidence)       # 0.92
print(result.all_probabilities)  # [0.92, 0.05, 0.03]

# Batch prediction
results = audio.predict_batch(
    ["file1.wav", "file2.wav"], 
    model_dir="models"
)

📊 MLModelAnalysis

AutoML for regression and classification with 10+ algorithms.

Data Loading

from gurulearn import MLModelAnalysis

ml = MLModelAnalysis(
    task_type="auto",              # "auto", "regression", "classification"
    auto_feature_engineering=True  # Extract date features
)

# From CSV
result = ml.train_and_evaluate(
    csv_file="data.csv",
    target_column="price",
    test_size=0.2,
    model_name=None,               # Auto-select best model
    save_path="model.joblib"
)

Available Models

Regression: linear_regression, decision_tree, random_forest, gradient_boosting, svm, knn, ada_boost, mlp, xgboost*, lightgbm*

Classification: logistic_regression, decision_tree, random_forest, gradient_boosting, svm, knn, ada_boost, mlp, xgboost*, lightgbm*

*Optional dependencies (xgboost and lightgbm must be installed separately)

Prediction

# Load and predict
ml.load_model("model.joblib")

# From dictionary
prediction = ml.predict({"feature1": 42, "category": "A"})

# From DataFrame
predictions = ml.predict(test_df)

# Compare all models
comparison = ml.compare_models("data.csv", "target", cv=5)
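One common way `task_type="auto"` can be resolved is by inspecting the target column; a sketch of such a heuristic (not necessarily gurulearn's exact rule):

```python
def infer_task(target_values) -> str:
    """Guess regression vs. classification from the target column."""
    values = list(target_values)
    # Non-numeric targets are always classification.
    if not all(isinstance(v, (int, float)) for v in values):
        return "classification"
    # Few distinct numeric values (e.g. 0/1 labels) also suggest classes.
    if len(set(values)) <= 10:
        return "classification"
    return "regression"

print(infer_task(["spam", "ham", "spam"]))            # classification
print(infer_task([float(i) * 1.5 for i in range(50)]))  # regression
```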

💬 FlowBot

Guided conversation flows with real-time data filtering.

Data Loading

from gurulearn import FlowBot
import pandas as pd

# From DataFrame
bot = FlowBot(pd.read_csv("hotels.csv"), data_dir="user_sessions")

# From list of dicts
bot = FlowBot([
    {"city": "Paris", "price": "$$$", "name": "Le Grand"},
    {"city": "Tokyo", "price": "$$", "name": "Sakura Inn"}
])

Building Flows

# Add filter steps
bot.add("city", "Select destination:", required=True)
bot.add("price", "Choose budget:")

# Define output columns
bot.finish("name", "price")

# Validate flow
errors = bot.validate()

Processing & Prediction

# Process user input (maintains session state)
response = bot.process("user123", "Paris")

# Response structure
{
    "message": "Choose budget:",
    "suggestions": ["$$$", "$$"],
    "completed": False
}

# Final response
{
    "completed": True,
    "results": [{"name": "Le Grand", "price": "$$$"}],
    "message": "Found 1 matching options"
}

# Async support
response = await bot.aprocess("user123", "Paris")

# Export history
history_df = bot.export_history("user123", format="dataframe")
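The step-by-step behavior above can be sketched in plain Python, where each answer narrows the remaining rows and the next step's suggestions come from the survivors (a toy model of the flow, not FlowBot's implementation):

```python
rows = [
    {"city": "Paris", "price": "$$$", "name": "Le Grand"},
    {"city": "Paris", "price": "$$", "name": "Hotel Midi"},
    {"city": "Tokyo", "price": "$$", "name": "Sakura Inn"},
]
steps = ["city", "price"]          # the fields added with bot.add(...)

def apply_answers(rows, answers):
    """Filter rows by each answered step; suggest values for the next one."""
    remaining = rows
    for field, value in zip(steps, answers):
        remaining = [r for r in remaining if r[field] == value]
    if len(answers) < len(steps):
        next_field = steps[len(answers)]
        return {"completed": False,
                "suggestions": sorted({r[next_field] for r in remaining})}
    return {"completed": True,
            "results": [{"name": r["name"], "price": r["price"]} for r in remaining]}

print(apply_answers(rows, ["Paris"]))
# {'completed': False, 'suggestions': ['$$', '$$$']}
print(apply_answers(rows, ["Paris", "$$$"]))
# {'completed': True, 'results': [{'name': 'Le Grand', 'price': '$$$'}]}
```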

🤖 QAAgent

RAG-based question answering with LangChain + Ollama.

Data Loading

from gurulearn import QAAgent
import pandas as pd

# From DataFrame
agent = QAAgent(
    data=pd.read_csv("docs.csv"),
    page_content_fields=["title", "content"],
    metadata_fields=["category", "date"],
    llm_model="llama3.2",
    embedding_model="mxbai-embed-large",
    db_location="./vector_db"
)

# From list of dicts
agent = QAAgent(
    data=[{"title": "Policy", "content": "..."}],
    page_content_fields="content"
)

# Load existing index (no data needed)
agent = QAAgent(db_location="./existing_db")

Querying

# Simple query
answer = agent.query("What is the refund policy?")

# With source documents
result = agent.query("What is the refund policy?", return_sources=True)
print(result["answer"])
print(result["sources"])

# Direct similarity search (no LLM)
docs = agent.similarity_search("refund", k=5)

# Interactive mode
agent.interactive_mode()

# Add more documents
agent.add_documents(new_df, "content", ["category"])
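Under the hood, a RAG agent first retrieves the documents most similar to the query and then feeds them to the LLM as context. The retrieval half can be sketched with simple token overlap (a toy stand-in for the real embedding-based vector search):

```python
def tokenize(text: str) -> set:
    return set(text.lower().split())

def similarity_search(query: str, docs: list, k: int = 2) -> list:
    """Rank documents by token overlap with the query (toy scoring)."""
    q = tokenize(query)
    scored = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return scored[:k]

docs = [
    "Refund policy: refunds are issued within 14 days.",
    "Shipping takes 3-5 business days.",
    "Refund requests require the original receipt.",
]
top = similarity_search("What is the refund policy?", docs, k=2)
# Both refund documents outrank the shipping one.
```

A real embedding model replaces the overlap score with cosine similarity in vector space, which also matches paraphrases that share no tokens.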

🏥 CTScanProcessor

Medical image enhancement with quality metrics.

Processing

from gurulearn import CTScanProcessor

processor = CTScanProcessor(
    kernel_size=5,
    clip_limit=2.0,
    tile_grid_size=(8, 8)
)

# Single image - supports .jpg, .png, .dcm, .nii
result = processor.process_ct_scan(
    "scan.jpg",
    output_folder="output/",
    compare=True               # Save side-by-side comparison
)

# Batch processing
results = processor.process_batch(
    input_folder="scans/",
    output_folder="processed/"
)

Quality Metrics

# result.metrics contains:
print(result.metrics.mse)      # Mean Squared Error
print(result.metrics.psnr)     # Peak Signal-to-Noise Ratio (dB)
print(result.metrics.snr)      # Signal-to-Noise Ratio (dB)
print(result.metrics.detail_preservation)  # Percentage
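MSE and PSNR follow directly from their definitions; a minimal sketch for 8-bit images (pure Python, with flat pixel lists standing in for image arrays):

```python
import math

def mse(original, processed):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((a - b) ** 2 for a, b in zip(original, processed)) / len(original)

def psnr(original, processed, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher means less distortion."""
    err = mse(original, processed)
    if err == 0:
        return float("inf")       # identical images
    return 10 * math.log10(max_val ** 2 / err)

a = [10, 20, 30, 40]
b = [12, 18, 30, 44]
print(round(mse(a, b), 2))    # 6.0
print(round(psnr(a, b), 2))   # 40.35
```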

Individual Operations

import numpy as np

# Apply individual filters
sharpened = processor.sharpen(image)
denoised = processor.median_denoise(image)
enhanced = processor.enhance_contrast(image)
bilateral = processor.bilateral_denoise(image)

# Compare quality
metrics = processor.evaluate_quality(original, processed)

🔤 OCR Module

Character-level OCR with VGG-BiLSTM + CTC decoding. Auto-discovers classes from data.yaml (YOLO dataset format), so it works with any character set.
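CTC decoding collapses repeated per-frame predictions and drops the blank token; only a blank between two identical predictions yields a genuine double letter. A greedy decoder sketch (class indices and the blank id are illustrative):

```python
def ctc_greedy_decode(indices, blank=0):
    """Collapse repeats, then remove blanks (standard greedy CTC decode)."""
    out = []
    prev = None
    for idx in indices:
        if idx != prev and idx != blank:
            out.append(idx)
        prev = idx
    return out

# Per-frame argmax over, say, {0: blank, 1: 'A', 2: 'D'}
frames = [1, 1, 0, 2, 2, 0, 1]
print(ctc_greedy_decode(frames))   # [1, 2, 1] -> "ADA"
```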

Installation

pip install gurulearn[ocr]

Quick Inference

from gurulearn.ocr import OCRPredictor

# Load model — NO dataset needed, everything is in the .guruocr file
predictor = OCRPredictor("best_model.guruocr")

result = predictor.predict("image.jpg")
print(result.text)          # "ASDF"
print(result.confidence)    # 0.97

# Batch prediction
results = predictor.predict_batch(["img1.jpg", "img2.jpg"])

# Visualize with overlay
predictor.visualize("image.jpg", save_path="output.png")

Training

from gurulearn.ocr import OCRTrainer

trainer = OCRTrainer(
    data_dir="path/to/yolo_dataset",  # Must have data.yaml + train/valid/test splits
    output_dir="output/",
    img_h=48, img_w=128,               # Model input size
    hidden=256, num_layers=3,           # Architecture config
    focus_tokens=["I", "O"],            # Optional: boost learning for confusable chars
)

history = trainer.train(
    epochs=150,
    batch_size=64,
    lr=1e-4,
    patience=5,
)

# Evaluate on test set
result = trainer.evaluate()  # accuracy, CER, loss

# Save training curves
trainer.plot_results()

The trainer saves a .guruocr file — a self-contained archive with model weights + metadata (class names, image dimensions, architecture config). This means inference never needs the original dataset.
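A self-contained bundle like this is typically just an archive pairing weights with a metadata file; a sketch of the idea with zipfile (the internal layout of .guruocr is an assumption here, not documented behavior):

```python
import io
import json
import zipfile

# Write a toy bundle: fake weights plus the metadata inference needs.
buf = io.BytesIO()
meta = {"classes": ["A", "B", "C"], "img_h": 48, "img_w": 128, "hidden": 256}
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("weights.bin", b"\x00\x01\x02")   # placeholder weights
    zf.writestr("metadata.json", json.dumps(meta))

# Read it back: no original dataset is required to recover the class names.
with zipfile.ZipFile(buf) as zf:
    loaded = json.loads(zf.read("metadata.json"))
print(loaded["classes"])   # ['A', 'B', 'C']
```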

Dataset Utilities

from gurulearn.ocr import split_datasets, merge_datasets, rebalance_splits, shuffle_augment

# Split datasets by filename keywords
result = split_datasets(
    source_dirs=["dataset_v1", "dataset_v2"],
    output_root="segregated_datasets",
    keywords={"aircraft": "aircraft", "supplier": "suppliers"},
)

# Generate synthetic augmented images with double-letter support
result = shuffle_augment(
    source_dir="segregated_datasets/aircraft",
    num_output=30000,
    doubles=5200,   # ~200 per letter for CTC double-character learning
)

# Merge multiple YOLO datasets
result = merge_datasets(source_root="segregated_datasets", output_name="merged")

# Rebalance train/valid/test split ratios
result = rebalance_splits("path/to/dataset", train_ratio=0.8, valid_ratio=0.1, test_ratio=0.1)
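Rebalancing amounts to re-partitioning the pooled file list by the target ratios; a deterministic sketch (not the library's exact algorithm, which would also shuffle before splitting):

```python
def rebalance(files, train_ratio=0.8, valid_ratio=0.1, test_ratio=0.1):
    """Split a file list into train/valid/test by the given ratios."""
    assert abs(train_ratio + valid_ratio + test_ratio - 1.0) < 1e-9
    n = len(files)
    n_train = int(n * train_ratio)
    n_valid = int(n * valid_ratio)
    return {
        "train": files[:n_train],
        "valid": files[n_train:n_train + n_valid],
        "test": files[n_train + n_valid:],   # remainder absorbs rounding
    }

splits = rebalance([f"img_{i}.jpg" for i in range(100)])
print(len(splits["train"]), len(splits["valid"]), len(splits["test"]))  # 80 10 10
```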

Automated Pipeline

Run the full workflow — split → augment → train → evaluate — in one command:

from gurulearn.ocr import OCRPipeline

pipeline = OCRPipeline(
    source_dirs=["dataset_v1", "dataset_v2"],
    output_root="segregated_datasets",
    dataset_name="aircraft",
    split_keywords={"aircraft": "aircraft", "supplier": "suppliers"},
    augment_count=30000,
    doubles_count=5200,
    train_epochs=150,
)

# Everything at once
result = pipeline.run_all()

# Or step by step
pipeline.step_split()
pipeline.step_augment()
pipeline.step_merge()
pipeline.step_rebalance()
pipeline.step_train()
pipeline.step_evaluate()

# Get a predictor from the trained model
predictor = pipeline.get_predictor()
print(predictor.predict("test.jpg").text)

⚡ Performance

  • Lazy Loading: ~0.001s import time
  • GPU Auto-Detection: CUDA for PyTorch/TensorFlow
  • Mixed Precision: Automatic FP16 on compatible GPUs
  • Batch Processing: All modules support batch inference

📄 License

MIT License - Guru Dharsan T
