Viincci-RAG

Universal multi-domain research system with RAG (Retrieval-Augmented Generation) capabilities

Python 3.9+ | License: MIT | Status: Beta

🎓 Try It Now (Google Colab)

No installation required! Run these notebooks in your browser:

  • Minimal Examples (Open in Colab): safe mock mode plus real SerpAPI integration
  • Complete Testing (Open in Colab): all domains (poetry, medical, botany, art, carpentry)

🚀 Quick Start

Installation

# Install from source
pip install -e .

# With development dependencies
pip install -e ".[dev]"

# All features
pip install -e ".[all]"

Basic Usage

from viincci_rag import ConfigManager, RAGSystem, UniversalResearchSpider

# Initialize configuration
config = ConfigManager(domain="botany")

# Create RAG system
rag = RAGSystem(config)
rag.load_llm()

# Create research spider
spider = UniversalResearchSpider(config)

# Or import everything the package exports (handy in notebooks;
# explicit imports are easier to trace in modules)
from viincci_rag import *

📦 What's Included

| Component | Purpose |
| --- | --- |
| ConfigManager | Configuration management with domain support |
| RAGSystem | Retrieval-Augmented Generation pipeline |
| UniversalResearchSpider | Multi-domain research and web scraping |
| UniversalArticleGenerator | Content generation for any domain |
| SerpAPIMonitor | API credit monitoring and management |
| FloraDatabase | Database operations and management |

🧪 Testing

# Run all tests
pytest tests/

# Run integration tests
pytest tests/test_integration.py -v

# With coverage report
pytest tests/ --cov=viincci_rag --cov-report=html

📚 Documentation

All documentation now lives in the docs/ folder.

🔄 Backward Compatibility

All old imports continue to work:

# Old import (still works)
from V4 import ConfigManager, RAGSystem

# New import (recommended)
from viincci_rag import ConfigManager, RAGSystem

# Both are identical
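
One common way to keep a legacy package name importable is to register the same module object under both names. The sketch below demonstrates the idea with two throwaway module names (`demo_new_pkg` and `demo_old_pkg`, both hypothetical). It is not taken from the Viincci-RAG source, which may implement its compatibility layer differently.

```python
import sys
import types

# Hypothetical stand-in for the new package; not the real viincci_rag.
new_pkg = types.ModuleType("demo_new_pkg")


class ConfigManager:
    """Placeholder class used only for this demonstration."""


new_pkg.ConfigManager = ConfigManager

# Register the same module object under the new and the legacy name.
sys.modules["demo_new_pkg"] = new_pkg
sys.modules["demo_old_pkg"] = new_pkg

from demo_old_pkg import ConfigManager as OldConfigManager
from demo_new_pkg import ConfigManager as NewConfigManager

# Both names resolve to the very same class object.
same_class = OldConfigManager is NewConfigManager
print(same_class)
```

Because both names point at one module object, state such as class attributes or caches is shared rather than duplicated.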

📋 Project Structure

viincci_rag/
├── core/              # Core RAG modules with wrappers
├── database/          # Database adapters
├── utils/             # Utility functions
├── config/            # Configuration files
└── templates/         # Output templates

V4/                    # Original codebase (unchanged)
docs/                  # Documentation
tests/                 # Test suite

🤝 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests: pytest tests/
  5. Submit a Pull Request

📄 License

MIT License - See LICENSE file for details

Version: 4.0.0 | Status: Beta | License: MIT


✨ Features

  • 🔬 Multi-Domain Research: botany, medical, mathematics, carpentry, and more
  • 🤖 RAG System: Retrieval-Augmented Generation for answers grounded in retrieved sources
  • Multiple Database Backends: SQLite, PostgreSQL, MongoDB, MySQL
  • 🎯 API Monitoring: built-in SerpAPI credit tracking
  • ⚙️ Fully Configurable: models, databases, content processing
  • ✅ Tested & Documented: comprehensive test suite and documentation
  • 🔄 Backward Compatible: all old imports still work

Installation

# Clone the repository
git clone https://github.com/yourusername/viincci-rag.git
cd viincci-rag

# Install in development mode
pip install -e .

# Or install with all dependencies
pip install -e ".[all]"

For Development

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Or use the built-in test command
viincci-test

🔑 Setup

  1. Get a SerpAPI Key: Sign up at serpapi.com

  2. Set Environment Variable:

    export SERP_API_KEY='your_api_key_here'
    
  3. Verify Installation:

    viincci-research --list-domains
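
When driving the tools from Python, it can help to fail early with a clear message if the key from step 2 is missing. The helper below is a minimal sketch: `get_serp_api_key` is a hypothetical name, not part of the package, and the only thing it takes from this README is the `SERP_API_KEY` variable name.

```python
import os


def get_serp_api_key(env=None):
    """Return the SerpAPI key from the environment or raise a clear error.

    `env` defaults to os.environ; passing a dict makes the check testable.
    """
    env = os.environ if env is None else env
    key = env.get("SERP_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SERP_API_KEY is not set. Export it before running viincci-research."
        )
    return key


# Demonstration with an explicit mapping instead of the real environment.
print(get_serp_api_key({"SERP_API_KEY": "demo-key"}))
```

Calling `get_serp_api_key()` with no arguments at the top of a script checks the real environment.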
    

📖 Usage

Command Line Interface

Basic Research Article

viincci-research -q "Rosa rubiginosa" -d botany

Plain Text Output

viincci-research -q "diabetes" -d medical --format text

JSON Structured Output

viincci-research -q "Pythagorean theorem" -d mathematics --format json

Arts & Humanities Research

viincci-research -q "Impressionism" -d art_history
viincci-research -q "Shakespeare sonnets" -d literature

Creative Writing with RAG

# Generate a poem
viincci-research -q "Van Gogh" -d art_history --content-type poem --rag

# Generate an essay
viincci-research -q "Baroque music" -d music --content-type essay --rag

Check API Status

viincci-research --check-credits
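
To script the commands above, the argument list can be assembled in Python and handed to `subprocess.run`. This sketch uses only the flags shown in this section (`-q`, `-d`, `--format`, `--content-type`, `--rag`); the function name `build_research_command` and its parameters are hypothetical.

```python
import shlex


def build_research_command(query, domain, output_format=None,
                           content_type=None, use_rag=False):
    """Build an argv list for viincci-research from the flags shown above."""
    argv = ["viincci-research", "-q", query, "-d", domain]
    if output_format is not None:
        argv += ["--format", output_format]
    if content_type is not None:
        argv += ["--content-type", content_type]
    if use_rag:
        argv.append("--rag")
    return argv


cmd = build_research_command("Van Gogh", "art_history",
                             content_type="poem", use_rag=True)
# Show the command as it would be typed in a shell.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting problems with multi-word queries.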

Python API

from viincci_rag import ConfigManager, UniversalResearchSpider, RAGSystem, UniversalArticleGenerator

# Initialize configuration for a domain
config = ConfigManager(domain="mathematics", verbose=True)

# Perform research
spider = UniversalResearchSpider(config)
sources = spider.research("Pythagorean theorem")

# Generate article with RAG
rag = RAGSystem(config)
texts = [s['text'] for s in sources]
metadata = [s['metadata'] for s in sources]
rag.build_index(texts, metadata)
rag.load_llm()

generator = UniversalArticleGenerator(config, rag_system=rag)
article = generator.generate_full_article("Pythagorean theorem", sources)

# Save article
with open("article.html", "w", encoding="utf-8") as f:
    f.write(article)
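
The example above indexes `s['text']` and `s['metadata']` directly, which raises `KeyError` if an entry lacks either key. Whether the spider can return incomplete entries is not stated here, so the following is a defensive sketch under that assumption. `split_sources` is a hypothetical helper and the sample data is invented for illustration.

```python
def split_sources(sources):
    """Split research results into parallel text and metadata lists.

    Entries without non-empty text are skipped; missing metadata becomes {}.
    """
    texts, metadata = [], []
    for source in sources:
        text = (source.get("text") or "").strip()
        if not text:
            continue
        texts.append(text)
        metadata.append(source.get("metadata") or {})
    return texts, metadata


# Hypothetical sample data in the shape used above.
sample = [
    {"text": "a^2 + b^2 = c^2", "metadata": {"url": "https://example.org/a"}},
    {"text": "   ", "metadata": {"url": "https://example.org/b"}},
    {"text": "Proof by rearrangement."},
]
texts, metadata = split_sources(sample)
print(len(texts), metadata[1])
```

The two returned lists stay aligned, so they can be passed straight to `rag.build_index(texts, metadata)`.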

Domain Information

# List all domains
viincci-research --list-domains

# Get detailed domain info
viincci-research --domain-info medical

🏗️ Architecture

viincci-rag/
├── V4/                               # Main package
│   ├── __init__.py                   # Package exports
│   ├── ConfigManager.py              # Configuration management
│   ├── Spider.py                     # Web scraping & search
│   ├── RagSys.py                     # RAG system implementation
│   ├── UniversalArticleGenerator.py  # Article generation
│   ├── ApiMonitor.py                 # API credit monitoring
│   ├── FloraDatabase.py              # Database operations
│   ├── config/                       # Configuration files
│   │   ├── domains.json              # Domain definitions
│   │   ├── ai_settings.json          # AI model settings
│   │   ├── api_monitor.json          # API monitoring config
│   │   └── ...
│   └── db/                           # Database directory
├── research_cli.py                   # Command-line interface
├── test_v4.py                        # Test suite
├── requirements.txt                  # Dependencies
├── setup.py                          # Package setup
└── README.md                         # This file

🧪 Testing

Run the comprehensive test suite:

# Using pytest
pytest

# Using built-in test runner
viincci-test

# With verbose output
viincci-test --verbose

# Run specific test
pytest tests/test_config.py -v

📊 Configuration

All configuration is stored in V4/config/ as JSON files:

  • domains.json: Define research domains, sources, questions
  • ai_settings.json: LLM and embedding model settings
  • api_monitor.json: API usage thresholds and alerts
  • search_config.json: Web scraping parameters
  • domain_reliability.json: Source reliability scores
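
For inspecting these files outside the package, a generic loader is enough. The sketch below reads every `*.json` file in a directory into one dictionary; it runs against a temporary directory rather than the real V4/config/, and the `warn_below` key is made up for the demonstration. How `ConfigManager` itself loads the files is not shown in this README.

```python
import json
import tempfile
from pathlib import Path


def load_json_configs(config_dir):
    """Load every *.json file in config_dir into a dict keyed by file stem."""
    configs = {}
    for path in sorted(Path(config_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            configs[path.stem] = json.load(handle)
    return configs


# Demonstration against a temporary directory, not the real V4/config/.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "domains.json").write_text(
        '{"botany": {"name": "Botany"}}', encoding="utf-8"
    )
    # "warn_below" is a hypothetical key used only for this example.
    Path(tmp, "api_monitor.json").write_text(
        '{"warn_below": 10}', encoding="utf-8"
    )
    loaded = load_json_configs(tmp)

print(sorted(loaded))
```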

Example: Add a New Domain

Edit V4/config/domains.json:

{
  "your_domain": {
    "name": "Your Domain Name",
    "description": "Description of your domain",
    "primary_sources": ["university", "research_institute"],
    "questions": [
      "what are the key concepts",
      "what are the applications"
    ],
    "keywords": ["keyword1", "keyword2"]
  }
}
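
Before saving a new entry, a quick shape check can catch typos. This sketch treats all five fields from the example above as required and checks their types; that is an assumption, since the README does not say which fields are mandatory, and `validate_domain` is a hypothetical helper.

```python
# Field names come from the example entry; treating them all as required
# is an assumption made for this sketch.
REQUIRED_FIELDS = {
    "name": str,
    "description": str,
    "primary_sources": list,
    "questions": list,
    "keywords": list,
}


def validate_domain(entry):
    """Return a list of problems found in one domain entry (empty if none)."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in entry:
            problems.append(f"missing field: {field}")
        elif not isinstance(entry[field], expected_type):
            problems.append(f"{field} should be a {expected_type.__name__}")
    return problems


entry = {
    "name": "Your Domain Name",
    "description": "Description of your domain",
    "primary_sources": ["university", "research_institute"],
    "questions": ["what are the key concepts", "what are the applications"],
    "keywords": ["keyword1", "keyword2"],
}
print(validate_domain(entry))
print(validate_domain({"name": "X", "keywords": "not-a-list"}))
```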

🔧 Advanced Features

RAG System Customization

from viincci_rag import ConfigManager, RAGSystem

config = ConfigManager()

# Change LLM model
config.set_llm_model("LiquidAI/LFM-40B-MoE")

# Initialize RAG with custom settings
rag = RAGSystem(config)
rag.load_llm(device="cuda", load_in_8bit=True)

# Query with custom parameters
result = rag.query(
    "What are the benefits?",
    k=10,
    max_new_tokens=500,
    temperature=0.8
)
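
In most RAG pipelines, `k` is the number of retrieved chunks handed to the LLM. Assuming that is also what `k` means in `rag.query`, the sketch below shows top-k retrieval by cosine similarity on toy vectors. It illustrates the general technique only and is not the library's retrieval code.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k):
    """Return indices of the k documents most similar to the query."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy 2-D "embeddings"; real embeddings come from an embedding model.
docs = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
print(top_k([1.0, 0.1], docs, k=2))  # prints [0, 1]
```

A larger `k` gives the model more context at the cost of a longer prompt.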

API Cost Estimation

from viincci_rag import SerpAPIMonitor, ConfigManager

config = ConfigManager()
monitor = SerpAPIMonitor(config)

# Estimate research cost
estimate = monitor.estimate_research_cost("Plant name", questions=4)
monitor.print_estimate(estimate)

# Check if can afford
if estimate['can_afford']:
    # Proceed with research
    pass
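
To make the `can_afford` check concrete, here is a toy cost model: one search per question by default, compared against the remaining credits. Only the `can_afford` key appears in this README; the function, its parameters, and the other keys are hypothetical, and the real `SerpAPIMonitor` may count credits differently.

```python
def estimate_cost(questions, credits_remaining, searches_per_question=1):
    """Estimate search credits for a research run (illustrative model only)."""
    needed = questions * searches_per_question
    return {
        "credits_needed": needed,
        "credits_remaining": credits_remaining,
        "can_afford": credits_remaining >= needed,
    }


estimate = estimate_cost(questions=4, credits_remaining=100)
print(estimate["can_afford"])  # prints True
```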

📚 Documentation

For detailed documentation, visit the Wiki.

🤝 Contributing

Contributions are welcome! Please read our Contributing Guidelines first.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🗺️ Roadmap

  • Add more research domains
  • Implement caching for search results
  • Add web interface
  • Support for more LLM providers
  • Multilingual support
  • Export to more formats (PDF, DOCX)
  • Integration with reference managers

📈 Changelog

See CHANGELOG.md for version history.


Made with ❤️ by the Viincci-RAG Team
