PyTorch + LangChain for Agentic Feature Engineering

PyroChain 🔥

Intelligent Feature Engineering with AI Agents

PyroChain combines PyTorch's deep learning capabilities with LangChain's agentic AI to automate feature extraction from complex, multimodal data. AI agents collaborate to understand, process, and extract meaningful features from text, images, and structured data.

🎯 What Problem Does PyroChain Solve?

Traditional Feature Engineering is Hard:

  • Manual feature extraction is time-consuming and error-prone
  • Different data types require different approaches
  • Domain expertise is needed to create meaningful features
  • Features become outdated as data patterns change

PyroChain Makes It Easy:

  • AI agents automatically extract relevant features from text, images, and structured data
  • Collaborative agents validate and refine features using chain-of-thought reasoning
  • Learns from your data to improve feature quality over time
  • Works seamlessly with existing ML pipelines

🚀 Key Features

  • 🤖 AI Agents: Intelligent agents that collaborate to extract, validate, and refine features
  • 📊 Multimodal Processing: Handle text, images, and structured data in one pipeline
  • ⚡ Lightweight & Fast: Efficient LoRA adapters that train quickly on your data
  • 🧠 Memory & Learning: Agents remember past decisions and improve over time
  • 🛒 E-commerce Ready: Built-in tools for product recommendations and customer analysis
  • 🏗️ Production Ready: Scalable architecture designed for real-world applications
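To illustrate why LoRA adapters are cheap to train, here is a minimal PyTorch sketch of the general technique: a frozen linear layer plus a trainable low-rank update. The class name `LoRALinear` and the `rank` and `alpha` values are assumptions chosen for illustration; this is not PyroChain's `LoRAAdapter` implementation.

```python
import torch
from torch import nn


class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear and add a trainable low-rank update."""

    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        # Low-rank factors: A is small random, B starts at zero so the
        # wrapped layer initially behaves exactly like the base layer.
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        update = x @ self.lora_a.T @ self.lora_b.T
        return self.base(x) + self.scale * update


layer = LoRALinear(nn.Linear(768, 768), rank=4)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable parameters: {trainable} of {total}")
```

Only the two low-rank factors receive gradients, so the optimizer updates a small fraction of the layer's parameters.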

💡 Use Cases

E-commerce & Retail:

  • Product recommendation systems
  • Customer sentiment analysis
  • Inventory optimization
  • Price prediction and analysis

Content & Media:

  • Text classification and tagging
  • Image content analysis
  • Content recommendation
  • Automated content moderation

Business Intelligence:

  • Customer behavior analysis
  • Market trend detection
  • Risk assessment
  • Automated reporting

🛠️ Installation

Quick Install

pip install pyrochain

From Source

git clone https://github.com/irfanalidv/PyroChain.git
cd PyroChain
pip install -e .

Requirements

  • Python 3.8+
  • PyTorch 2.0+
  • LangChain 0.1+
  • Transformers 4.20+
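To check an environment against the list above, a small script can compare installed versions with these minimums. This is a sketch using only the standard library; the helper names `parse_major_minor` and `check_requirements` are hypothetical, and the distribution names (`torch`, `langchain`, `transformers`) are the usual PyPI names for these projects.

```python
import sys
from importlib import metadata

# Minimum (major, minor) versions copied from the Requirements list.
MINIMUMS = {
    "torch": (2, 0),
    "langchain": (0, 1),
    "transformers": (4, 20),
}


def parse_major_minor(version: str) -> tuple:
    """Return (major, minor) from a version string such as '2.1.0+cu118'."""
    parts = []
    for piece in version.split(".")[:2]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def check_requirements(minimums=MINIMUMS) -> dict:
    """Map each distribution name to 'ok', 'too old', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = parse_major_minor(metadata.version(name))
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if installed >= minimum else "too old"
    return report


print("python:", "ok" if sys.version_info >= (3, 8) else "too old")
for name, status in check_requirements().items():
    print(f"{name}: {status}")
```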

🚀 Quick Start

Basic Usage

# This example needs two extra packages: pip install textblob datasets
from pyrochain import PyroChain
from transformers import AutoTokenizer, AutoModel
from textblob import TextBlob
from datasets import load_dataset

# Load a pretrained transformer model and tokenizer.
# The snippet below does not use them further.
model_name = "microsoft/DialoGPT-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Initialize PyroChain with its default configuration
pyrochain = PyroChain()

# Load the first 4 reviews from the IMDB training split
print("📚 Loading real IMDB dataset...")
dataset = load_dataset("imdb", split="train[:4]")

# Extract features for each review and compute a TextBlob sentiment score for comparison
for i, sample in enumerate(dataset):
    text = sample["text"]
    label = sample["label"]  # 0 = negative, 1 = positive

    # TextBlob polarity lies in [-1, 1]
    blob = TextBlob(text)
    sentiment_score = (blob.sentiment.polarity + 1) / 2  # Rescale to [0, 1]

    data = {
        "text": text,
        "title": f"IMDB Review {i+1}",
        "rating": 5 if label == 1 else 1,
        "category": "movie_review"
    }

    features = pyrochain.extract_features(
        data,
        "Extract features for sentiment analysis using TextBlob and transformer model"
    )

    print(f"Text: {text[:100]}...")
    print(f"Real Label: {label} | TextBlob Sentiment: {sentiment_score:.3f}")
    print(f"Features: {len(features['features'])}")
    print("---")
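The sentiment score in the loop above is a linear rescaling of TextBlob's polarity, which ranges from -1 to 1. The helper below isolates that arithmetic; the function name `polarity_to_unit_interval` is hypothetical and the range check is an added safeguard, not part of the original snippet.

```python
def polarity_to_unit_interval(polarity: float) -> float:
    """Linearly map a polarity in [-1, 1] onto [0, 1]."""
    if not -1.0 <= polarity <= 1.0:
        raise ValueError(f"polarity out of range: {polarity}")
    return (polarity + 1.0) / 2.0


for p in (-1.0, 0.0, 0.14, 1.0):
    print(f"{p:+.2f} -> {polarity_to_unit_interval(p):.2f}")
```

A polarity of 0.14 maps to 0.57, which is the pairing of `polarity` and `sentiment_score` shown in the sample output for the first IMDB review.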

Real Data Example

# Run the complete real data example
cd examples
python main_example.py

What you'll see:

🔥 PyroChain Real Data Demo - 100% Real Analysis
============================================================

🚀 Real Data Feature Extraction Example
==================================================
📚 Loading real IMDB dataset using transformer models...
📥 Downloading real IMDB dataset...
✅ Loaded 5 real IMDB samples using transformer model

📝 Processing: IMDB Review 1
Text: I rented I AM CURIOUS-YELLOW from my video store because of all the controversy that surrounded it w...
Rating: 1/5 (Real IMDB Label: 0 = Negative)

✅ Extracted 2 feature sets
📊 Modalities: ['text']
⏱️ Processing time: 0.025s
📊 Data source: real_imdb_dataset

🔍 sentiment_analysis:
   sentiment_score: 0.57
   polarity: 0.14
   subjectivity: 0.85
   positive_words: 16
   negative_words: 4
   total_sentiment_words: 20
   confidence: 0.95

🔍 text_features:
   word_count: 288
   char_count: 1640
   sentence_count: 14
   avg_word_length: 4.7
   avg_sentence_length: 20.57
   readability_score: 0.0
   topic_keywords: ['movie', 'review', 'story', 'direction', 'visuals', 'drama']

🛒 Real Data E-commerce Analysis
==================================================

🔍 Analyzing: Wireless Bluetooth Headphones
💰 Price: $199.99
⭐ Rating: 4.5/5 (128 votes)
✅ Recommendation score: 0.91
📊 Features extracted: 2

🏆 Top Recommendations:
1. Wireless Bluetooth Headphones - Score: 0.91
2. Organic Cotton T-Shirt - Score: 0.815
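The `text_features` block in the sample output contains simple length statistics. As a rough illustration of how such values can be computed, here is a plain-Python sketch; the function `basic_text_stats` and its naive sentence splitter are assumptions, not PyroChain's actual implementation.

```python
import re


def basic_text_stats(text: str) -> dict:
    """Compute simple length statistics for a piece of text."""
    words = text.split()
    # Naive sentence split: ., ! or ? followed by whitespace or end of text.
    sentences = [s for s in re.split(r"[.!?]+(?:\s+|$)", text) if s.strip()]
    word_count = len(words)
    sentence_count = max(len(sentences), 1)  # avoid division by zero
    avg_word_length = (
        round(sum(len(w) for w in words) / word_count, 2) if word_count else 0.0
    )
    return {
        "word_count": word_count,
        "char_count": len(text),
        "sentence_count": sentence_count,
        "avg_word_length": avg_word_length,
        "avg_sentence_length": round(word_count / sentence_count, 2),
    }


stats = basic_text_stats("Great sound. Battery lasts all day! Would buy again.")
print(stats)
```

In the sample output, `avg_sentence_length` is consistent with `word_count / sentence_count` (288 / 14 ≈ 20.57), which is the same ratio this sketch computes.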

🏗️ How It Works

  1. Data Ingestion: Accepts multimodal data (text, images, structured)
  2. Agent Processing: AI agents analyze data using chain-of-thought reasoning
  3. Feature Extraction: Collaborative agents extract relevant features
  4. Validation: Agents validate and refine features through discussion
  5. Output: Clean, structured features ready for ML models
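The five steps above can be sketched as a small pipeline. In this sketch the "agents" are ordinary Python functions standing in for LLM-backed agents, and every name (`ingest`, `extract_text_agent`, `extract_structured_agent`, `validate`, `run_pipeline`) is hypothetical; it shows only the shape of the data flow, not PyroChain's internals.

```python
from typing import Any, Callable, Dict, List

Features = Dict[str, Any]


def ingest(record: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Step 1: group raw fields by modality (text vs. structured)."""
    grouped: Dict[str, Dict[str, Any]] = {"text": {}, "structured": {}}
    for key, value in record.items():
        modality = "text" if isinstance(value, str) else "structured"
        grouped[modality][key] = value
    return grouped


def extract_text_agent(grouped) -> Features:
    """Steps 2-3: stand-in agent that proposes text features."""
    text = " ".join(grouped["text"].values())
    return {"word_count": len(text.split()), "char_count": len(text)}


def extract_structured_agent(grouped) -> Features:
    """Steps 2-3: stand-in agent that passes numeric fields through."""
    return {
        k: v
        for k, v in grouped["structured"].items()
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    }


def validate(features: Features) -> Features:
    """Step 4: drop features that are missing or negative."""
    return {k: v for k, v in features.items() if v is not None and v >= 0}


def run_pipeline(record: Dict[str, Any], agents: List[Callable]) -> Dict[str, Any]:
    grouped = ingest(record)
    merged: Features = {}
    for agent in agents:
        merged.update(agent(grouped))
    return {"features": validate(merged)}  # Step 5: structured output


result = run_pipeline(
    {"title": "Organic Cotton T-Shirt", "price": 29.99, "rating": 4.2},
    [extract_text_agent, extract_structured_agent],
)
print(result)
```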

⚙️ Configuration

from pyrochain import PyroChain, PyroChainConfig
from transformers import AutoTokenizer, AutoModel

# Load a pretrained transformer model and tokenizer.
# The snippet below does not use them further.
model_name = "microsoft/DialoGPT-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Example e-commerce product data
products = [
    {
        "id": "prod_001",
        "title": "Wireless Bluetooth Headphones",
        "description": "High-quality wireless headphones with noise cancellation and 30-hour battery life. Perfect for music lovers and professionals.",
        "price": 199.99,
        "category": "electronics",
        "rating": 4.5
    },
    {
        "id": "prod_002",
        "title": "Organic Cotton T-Shirt",
        "description": "Comfortable organic cotton t-shirt in various colors and sizes. Made from 100% organic cotton, eco-friendly and sustainable.",
        "price": 29.99,
        "category": "clothing",
        "rating": 4.2
    }
]

# Configure PyroChain for the e-commerce task
config = PyroChainConfig(
    task_type="ecommerce",           # Task type: "general", "ecommerce", "custom"
    enable_agents=True,              # Enable AI agent collaboration
    enable_training=False,           # Set to True to enable model training
    max_length=512,                  # Maximum input length
    learning_rate=1e-4,              # Learning rate for training
    num_epochs=3,                    # Number of training epochs
    device="auto"                    # Device: "auto", "cpu", "cuda"
)

pyrochain = PyroChain(config=config)

# Extract features for each product
for product in products:
    features = pyrochain.extract_features(
        product,
        "Extract features for product recommendation using transformer model"
    )
    print(f"Product: {product['title']} - Features: {len(features['features'])}")
    print(f"Price: ${product['price']} - Rating: {product['rating']}/5")
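The `device="auto"` option in the configuration above suggests the library picks a device at runtime. How PyroChain resolves it is not documented here; the sketch below shows a common PyTorch pattern as an assumption, with a hypothetical helper name `resolve_device`.

```python
def resolve_device(requested: str = "auto") -> str:
    """Turn "auto", "cpu", or "cuda" into a concrete device string."""
    if requested not in {"auto", "cpu", "cuda"}:
        raise ValueError(f"unsupported device: {requested!r}")
    if requested != "auto":
        return requested
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so fall back to CPU
    return "cuda" if torch.cuda.is_available() else "cpu"


print(resolve_device("auto"))
```

Passing `"cpu"` or `"cuda"` explicitly skips the detection step, which is useful when results must be reproducible across machines.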

📚 API Reference

Core Classes

  • PyroChain: Main library class for feature extraction
  • PyroChainConfig: Configuration class for customizing behavior
  • LoRAAdapter: Lightweight adapter for efficient model fine-tuning
  • MultimodalProcessor: Handles text, image, and structured data processing

Key Methods

  • extract_features(data, task_description): Extract features from data
  • train(training_data, task_description): Train custom agents
  • evaluate(test_data): Evaluate model performance
  • save_model(path): Save trained model
  • load_model(path): Load pre-trained model
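The methods above suggest a train, evaluate, save workflow. The sketch below encodes only the method names and parameters listed here as a `typing.Protocol` and exercises them against a stand-in class. Return types are not documented in this README, so everything the stand-in returns is an assumption, and `FeaturePipeline`, `train_evaluate_save`, and `RecordingPipeline` are hypothetical names.

```python
from typing import Any, Dict, List, Protocol


class FeaturePipeline(Protocol):
    """Method names and parameters as listed under Key Methods."""

    def extract_features(self, data: Any, task_description: str) -> Any: ...
    def train(self, training_data: Any, task_description: str) -> Any: ...
    def evaluate(self, test_data: Any) -> Any: ...
    def save_model(self, path: str) -> Any: ...
    def load_model(self, path: str) -> Any: ...


def train_evaluate_save(pipeline: FeaturePipeline, train_rows: List[Dict],
                        test_rows: List[Dict], task: str, path: str) -> Any:
    """Train, evaluate on held-out rows, then persist the model."""
    pipeline.train(train_rows, task)
    metrics = pipeline.evaluate(test_rows)
    pipeline.save_model(path)
    return metrics


class RecordingPipeline:
    """Stand-in that records call order; it is not PyroChain itself."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def extract_features(self, data, task_description):
        self.calls.append("extract_features")
        return {"features": []}

    def train(self, training_data, task_description):
        self.calls.append("train")

    def evaluate(self, test_data):
        self.calls.append("evaluate")
        return {"n_test": len(test_data)}  # placeholder metric

    def save_model(self, path):
        self.calls.append("save_model")

    def load_model(self, path):
        self.calls.append("load_model")


demo = RecordingPipeline()
metrics = train_evaluate_save(
    demo, [{"text": "a"}], [{"text": "b"}], "demo task", "model_dir"
)
print(metrics, demo.calls)
```

Keeping the evaluation rows separate from the training rows, as the function signature forces, avoids reporting metrics on data the model has already seen.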

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

PyroChain - Transform your data into intelligent features with AI agents. 🔥

Built with ❤️ by Irfan Ali
