A sample AI toolkit by Schogini Systems with Retrieval-Augmented Generation (RAG).

Project description

SchoginiAI

SchoginiAI is an AI toolkit developed by Schogini Systems that provides Retrieval-Augmented Generation (RAG) capabilities using LangChain and OpenAI. It leverages both FAISS and ChromaDB for efficient vector storage and retrieval, enabling advanced AI-driven solutions for small businesses and beyond.

🚀 Features

  • Recursive Text Chunking: Efficiently splits large text corpora into manageable chunks.
  • OpenAI Embeddings: Utilizes OpenAI's embedding models for high-quality vector representations.
  • FAISS & ChromaDB Vector Stores: Offers flexibility to choose between FAISS and ChromaDB for vector storage and retrieval via configuration.
  • Retrieval-Augmented Generation (RAG): Combines retrieval mechanisms with language models to generate informed responses.
  • Dockerized Environment: Easily build and run in isolated Docker containers for consistency across environments.
  • Environment Variable Management: Securely handles API keys and sensitive information using .env files.
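The first feature above can be illustrated with a small, self-contained sketch. This is not LangChain's RecursiveCharacterTextSplitter, just an illustration of the recursive idea behind it: split on the coarsest separator first, then recurse with finer separators for any piece that is still too long.

```python
# Illustrative recursive chunker (a sketch, not the LangChain implementation).
def recursive_chunk(text, max_len=100, separators=("\n\n", "\n", " ")):
    """Split `text` into chunks of at most `max_len` characters."""
    if len(text) <= max_len:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to a hard character split.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]
    sep, rest = separators[0], separators[1:]
    chunks = []
    for piece in text.split(sep):
        if len(piece) <= max_len:
            if piece:
                chunks.append(piece)
        else:
            # Piece is still too long: recurse with the finer separators.
            chunks.extend(recursive_chunk(piece, max_len, rest))
    return chunks

corpus = ("Schogini Systems is a pioneer in AI Chatbots.\n\n"
          "We specialize in automation solutions for small businesses.")
for chunk in recursive_chunk(corpus, max_len=60):
    print(repr(chunk))
```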

🛠 Installation

📦 From PyPI

Install the latest version of SchoginiAI directly from PyPI:

pip install SchoginiAI

๐Ÿง‘โ€๐Ÿ’ป From Source

Clone the repository and install the package manually:

git clone https://github.com/yourusername/SchoginiAI.git
cd SchoginiAI
pip install .

Replace yourusername with the GitHub username or organization that hosts the repository.

🔧 Usage

๐Ÿ“ Environment Setup

Create a .env file in the examples02/ directory to store your OpenAI API key and vector store type securely:

OPENAI_API_KEY=your_openai_api_key_here
VECTOR_STORE_TYPE=faiss  # Options: 'faiss' or 'chroma'

โš ๏ธ Important: Do not commit the .env file to version control. Ensure it's listed in your .gitignore.

📚 Knowledge Creation

Build and save the vector store from your text corpus using the knowledge_creation.py script.

๐Ÿ Python Script

cd examples02
python knowledge_creation.py

📦 Using Docker

Build the Docker image and run the container with the create argument to generate the vector store:

  • With FAISS:

    docker build --no-cache -t schogini-examples .
    docker run --rm -e OPENAI_API_KEY="your_openai_api_key_here" -e VECTOR_STORE_TYPE=faiss schogini-examples create

  • With ChromaDB:

    docker run --rm -e OPENAI_API_KEY="your_openai_api_key_here" -e VECTOR_STORE_TYPE=chroma schogini-examples create


Expected Output:

  • For FAISS:

    Running knowledge_creation.py with VECTOR_STORE_TYPE='faiss'...
    FAISS vector store created.
    FAISS vector store saved to faiss_store
    
  • For ChromaDB:

    Running knowledge_creation.py with VECTOR_STORE_TYPE='chroma'...
    ChromaDB vector store created at chroma_store.
    ChromaDB vector store persisted at chroma_store
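
The outputs above show each backend persisting to its own directory. A hypothetical sketch of that branch (the helper name is ours; in the real script the work would be done by calls along the lines of FAISS's save_local or Chroma's persist_directory):

```python
# Illustrative mapping behind the outputs above: each configured store
# type is written to its own directory. The helper name is ours.
def store_destination(vector_store_type):
    """Map a VECTOR_STORE_TYPE value to the directory it is saved in."""
    destinations = {"faiss": "faiss_store", "chroma": "chroma_store"}
    if vector_store_type not in destinations:
        raise ValueError(f"Unsupported vector store: {vector_store_type!r}")
    return destinations[vector_store_type]
```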
    

โ“ Querying the Knowledge Base

Load the pre-built vector store and perform a query using the usage_example02.py script.

๐Ÿ Python Script

cd examples02
python usage_example02.py

📦 Using Docker

Run the container with the query argument to perform the query:

  • With FAISS:

    docker run --rm -e OPENAI_API_KEY="your_openai_api_key_here" -e VECTOR_STORE_TYPE=faiss schogini-examples query

  • With ChromaDB:

    docker run --rm -e OPENAI_API_KEY="your_openai_api_key_here" -e VECTOR_STORE_TYPE=chroma schogini-examples query


Expected Output:

Running usage_example02.py with VECTOR_STORE_TYPE='faiss'...
Answer: Schogini Systems is a pioneer in AI Chatbots.
We specialize in automation solutions for small businesses.

or for ChromaDB:

Running usage_example02.py with VECTOR_STORE_TYPE='chroma'...
Answer: Schogini Systems is a pioneer in AI Chatbots.
We specialize in automation solutions for small businesses.
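Conceptually, the query step embeds the question and retrieves the most similar stored chunks before the language model composes the answer. A toy, network-free illustration of that retrieval step (hand-made vectors stand in for OpenAI embeddings; all names are illustrative):

```python
# Toy retrieval step: rank stored chunks by cosine similarity to a query
# vector. Real embeddings come from OpenAI via the vector store.
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, indexed, k=1):
    """Return the k chunk texts most similar to the query vector."""
    ranked = sorted(indexed, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

indexed = [
    ("Schogini Systems is a pioneer in AI Chatbots.", [0.9, 0.1, 0.0]),
    ("We specialize in automation solutions for small businesses.", [0.1, 0.8, 0.2]),
]
print(retrieve([0.85, 0.2, 0.0], indexed, k=1))
```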

๐Ÿณ Docker Usage

🛠 Build the Docker Image

Navigate to the project root directory (where the Dockerfile is located) and build the Docker image:

docker build --no-cache -t schogini-examples .

🚀 Run the Docker Container

1. Create Vector Store

  • With FAISS:

    docker run --rm -e OPENAI_API_KEY="your_openai_api_key_here" -e VECTOR_STORE_TYPE=faiss schogini-examples create

  • With ChromaDB:

    docker run --rm -e OPENAI_API_KEY="your_openai_api_key_here" -e VECTOR_STORE_TYPE=chroma schogini-examples create


2. Query Vector Store

  • With FAISS:

    docker run --rm -e OPENAI_API_KEY="your_openai_api_key_here" -e VECTOR_STORE_TYPE=faiss schogini-examples query

  • With ChromaDB:

    docker run --rm -e OPENAI_API_KEY="your_openai_api_key_here" -e VECTOR_STORE_TYPE=chroma schogini-examples query


Note: Replace "your_openai_api_key_here" with your actual OpenAI API key.

📦 Dependencies

SchoginiAI relies on the Python packages pinned in requirements.txt. These dependencies are installed automatically when you install SchoginiAI via pip, or from requirements.txt when the Docker image is built.

📄 requirements.txt

langchain>=0.0.200,<0.1.0
langchain-community>=0.0.20,<0.1.0
langchain-chroma>=0.1.0,<1.0.0
openai>=0.28.1,<0.29.0
tiktoken>=0.4.0,<0.5.0
faiss-cpu>=1.7.6,<1.8.0
python-dotenv>=0.21.0,<0.22.0
chromadb>=0.3.22,<0.4.0

๐Ÿณ Docker Configuration

📄 Dockerfile

# Use a lightweight Python base image
FROM python:3.11-slim

# Install bash and other dependencies (if needed)
RUN apt-get update && apt-get install -y bash && rm -rf /var/lib/apt/lists/*

# Set environment variables for Python
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

# Create and set the working directory
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Install Python dependencies into the current directory
RUN pip install --upgrade pip
RUN pip install -r requirements.txt --target=.
RUN pip install -e .

# Ensure the .env file is present
COPY examples02/.env /app/examples02/.env

# Make sure the scripts are executable
RUN chmod +x examples02/doit.sh

# Run doit.sh via bash; the container argument ("create" or "query")
# passed to `docker run` is appended to this entrypoint
ENTRYPOINT ["/bin/bash", "examples02/doit.sh"]

📄 doit.sh

Handles the execution of either the knowledge creation or querying scripts based on input arguments.

#!/bin/bash
set -e  # Exit immediately if a command exits with a non-zero status

# Check if one argument is provided
if [ "$#" -ne 1 ]; then
    echo "Usage: $0 {create|query}"
    exit 1
fi

SCRIPT=$1

if [ "$SCRIPT" == "create" ]; then
    echo "Running knowledge_creation.py..."
    python examples02/knowledge_creation.py
elif [ "$SCRIPT" == "query" ]; then
    echo "Running usage_example02.py..."
    python examples02/usage_example02.py
else
    echo "Invalid argument. Use 'create' or 'query'."
    exit 1
fi

Usage Examples:

  • Create Vector Store with FAISS:

    docker run --rm -e OPENAI_API_KEY="your_openai_api_key_here" -e VECTOR_STORE_TYPE=faiss schogini-examples create

  • Create Vector Store with ChromaDB:

    docker run --rm -e OPENAI_API_KEY="your_openai_api_key_here" -e VECTOR_STORE_TYPE=chroma schogini-examples create

  • Query Vector Store with FAISS:

    docker run --rm -e OPENAI_API_KEY="your_openai_api_key_here" -e VECTOR_STORE_TYPE=faiss schogini-examples query

  • Query Vector Store with ChromaDB:

    docker run --rm -e OPENAI_API_KEY="your_openai_api_key_here" -e VECTOR_STORE_TYPE=chroma schogini-examples query


🗃 Project Structure

SchoginiAI/
├── SchoginiAI/
│   ├── __init__.py
│   └── main.py
├── examples02/
│   ├── usage_example02.py
│   ├── knowledge_creation.py
│   └── .env
├── tests/
│   └── test_main.py
├── .gitignore
├── LICENSE
├── README.md
├── requirements.txt
├── setup.py
├── Dockerfile
├── doit.sh
└── build.sh

🛡 License

This project is licensed under the MIT License. See the LICENSE file for details.

๐Ÿ“ Contributing

Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.

  1. Fork the repository.
  2. Create your feature branch: git checkout -b feature/YourFeature
  3. Commit your changes: git commit -m 'Add some feature'
  4. Push to the branch: git push origin feature/YourFeature
  5. Open a pull request.

📄 .gitignore

Ensure you have a .gitignore file to exclude unnecessary or sensitive files from your GitHub repository.

# Python
__pycache__/
*.py[cod]

# Distribution / packaging
build/
dist/
*.egg-info/

# Environment
venv/
.env/

# Vector Stores
faiss_store/
chroma_store/

# OS generated files
.DS_Store

# IDE configs
.vscode/
.idea/

# Secrets
.pypirc
.env


🎯 Summary

By configuring your SchoginiAI project to select the vector store type (faiss or chroma) through the .env file, you achieve a more streamlined and maintainable setup. This approach centralizes configuration, reduces the need for repetitive command-line arguments, and aligns with best practices for environment-specific settings.

Key Actions Taken:

  1. Environment Variable Configuration:

    • Added VECTOR_STORE_TYPE to the .env file to specify the desired vector store.
  2. Script Modifications:

    • Removed command-line argument parsing for vector store selection.
    • Updated scripts to read VECTOR_STORE_TYPE from environment variables.
  3. Docker Adjustments:

    • Ensured the .env file is copied into the Docker image.
    • Modified doit.sh to eliminate the need for vector store type arguments.
  4. Verification:

    • Provided steps to verify that the changes work both locally and within Docker.
  5. Documentation:

    • Updated README.md to reflect the new configuration method.
  6. Testing:

    • Recommended implementing unit tests to ensure correct vector store selection based on .env settings.
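
The recommended unit test could start as small as this sketch. Here select_store() stands in for whatever helper the real scripts use to read VECTOR_STORE_TYPE; adapt the import to your own code and run with python -m unittest.

```python
# Sketch of a unit test for env-based vector store selection.
# select_store() is a stand-in for the project's real helper.
import os
import unittest

def select_store():
    """Stand-in for the project's store-selection helper."""
    return os.getenv("VECTOR_STORE_TYPE", "faiss").strip().lower()

class TestVectorStoreSelection(unittest.TestCase):
    def test_defaults_to_faiss(self):
        os.environ.pop("VECTOR_STORE_TYPE", None)
        self.assertEqual(select_store(), "faiss")

    def test_reads_chroma_from_env(self):
        os.environ["VECTOR_STORE_TYPE"] = "chroma"
        self.assertEqual(select_store(), "chroma")
```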

Next Steps:

  1. Implement Unit Tests: Ensure that vector store selections work as intended.
  2. Continuous Integration (CI): Set up CI pipelines to automatically test configurations.
  3. Monitor Dependencies: Keep your packages updated to avoid future deprecations or compatibility issues.
  4. User Documentation: Make sure all users are aware of the .env configuration for vector store selection.


By following this guide, your SchoginiAI module is now equipped to handle both FAISS and ChromaDB vector stores seamlessly, ensuring compatibility with the latest LangChain updates and maintaining best practices for security and maintainability.

Project details


Download files

Download the file for your platform.

Source Distribution

schoginiai-0.2.0.tar.gz (14.5 kB)

Uploaded Source

Built Distribution

SchoginiAI-0.2.0-py3-none-any.whl (9.4 kB)

Uploaded Python 3

File details

Details for the file schoginiai-0.2.0.tar.gz.

File metadata

  • Download URL: schoginiai-0.2.0.tar.gz
  • Upload date:
  • Size: 14.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.11.11

File hashes

Hashes for schoginiai-0.2.0.tar.gz:

  • SHA256: b2d749eaf786e225af45e1aa41fca6148524a17e6ed1ebe2ed22fcbc6ba30d77
  • MD5: 64c3dd0e7de940ad174e97de7cc05bf5
  • BLAKE2b-256: 3f89f8f11752348abe469f859ee2f68e9b68b847310c3c8e3feddee7c8498a4e

File details

Details for the file SchoginiAI-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: SchoginiAI-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 9.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.11.11

File hashes

Hashes for SchoginiAI-0.2.0-py3-none-any.whl:

  • SHA256: 6cdb5b52cbed5f8d4cc6ab37fe2fa0b477646b7987b00eab12f936c5eda01a1c
  • MD5: 08d83d5d457ef0b160fd4477bcd4dbf4
  • BLAKE2b-256: 4c9b547662aa9ddc8665f3384678cbc794c374d261df7b30c2c7dd7ab19c737f
