
Lightweight RAG Framework: Simple and Scalable Framework with Efficient Embeddings.

Project description

A lightweight RAG (Retrieval-Augmented Generation) framework: simple and scalable, with efficient embeddings. It leverages FAISS, ChromaDB, and Ollama, and provides a uniform interface over:

  • LLMs
  • Vector Database or Similarity Search engine
  • Embeddings
  • Chunking techniques
  • Data Collection
Each of these components is managed by the framework and exposes the same inputs/outputs, which enables interoperability and makes implementations easy to swap (as this area is moving very fast!).

This project can run entirely locally, meaning on a single laptop without GPUs.

Framework description

Several objects are provided to manage the main RAG features and characteristics:

  • rag: the main interface for managing all requests.
  • IDocument: manages document reading and loading (PDF or direct content)
  • IChunks: manages the list of chunks
  • IEmbeddings: manages the vector and data embeddings
  • INearest: manages the k nearest neighbors retrieved by the similarity search engine
  • IPrompt: manages prompt templating and simple prompts
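As a hypothetical end-to-end sketch of how these objects fit together: the function and variable names below are illustrative only, not the framework's actual API, and a toy hashing embedding stands in for a real embedding model.

```python
import math
import re
from collections import Counter

def embed(text, dim=64):
    """Toy bag-of-words hashing embedding (a stand-in for a real model)."""
    vec = [0.0] * dim
    for word, count in Counter(re.findall(r"\w+", text.lower())).items():
        vec[sum(ord(ch) for ch in word) % dim] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def nearest(query_vec, chunk_vecs, k=2):
    """Brute-force cosine similarity: the job of the similarity search engine."""
    scores = sorted(
        ((sum(q * c for q, c in zip(query_vec, vec)), i)
         for i, vec in enumerate(chunk_vecs)),
        reverse=True)
    return [i for _, i in scores[:k]]

document = "FAISS stores vectors. ChromaDB is a vector database. Ollama runs local LLMs."
chunks = [s.strip() for s in document.split(".") if s.strip()]   # IChunks
chunk_vecs = [embed(c) for c in chunks]                          # IEmbeddings
question = "Which tool runs local LLMs?"
top = nearest(embed(question), chunk_vecs)                       # INearest
prompt = "Context:\n{}\n\nQuestion: {}".format(                  # IPrompt
    "\n".join(chunks[i] for i in top), question)
```

In the framework, the embedding and retrieval steps would go through real backends (sentence-transformers or Ollama for embeddings, FAISS or ChromaDB for search) rather than the toy functions shown here.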

Supported LLMs:

  • Ollama
  • AWS Claude
  • Hugging Face

Supported document reading methods:

  • PDF via PyMuPDF
  • PDF via Llamaparse
  • HTML stream

Supported chunking methods:

  • Character chunking (langchain)
  • Semantic chunking (langchain)
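As a rough illustration of what character chunking does (fixed-size windows with overlap), the helper below is our own sketch, not langchain's implementation:

```python
# Fixed-size character chunking with overlap, sketched by hand.
def chunk_text(text, chunk_size=40, overlap=10):
    step = chunk_size - overlap          # advance by the non-overlapping part
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

text = "".join(chr(97 + i % 26) for i in range(100))  # 100 sample characters
pieces = chunk_text(text)
# Consecutive pieces share their trailing/leading `overlap` characters.
```

Semantic chunking differs in that it chooses split points where the embedding similarity between adjacent sentences drops, instead of cutting at fixed character offsets.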

Currently supported vector stores:

  • FAISS: search + load and store indexes
  • ChromaDB
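FAISS's simplest index type (IndexFlatL2) performs exact k-nearest-neighbor search by squared L2 distance; the toy function below sketches that behavior in plain Python for illustration only, without FAISS's optimized internals:

```python
# Exact k-NN by squared L2 distance -- the metric FAISS's IndexFlatL2 uses.
def knn(query, index, k=3):
    dists = sorted((sum((q - x) ** 2 for q, x in zip(query, vec)), i)
                   for i, vec in enumerate(index))
    return [(i, d) for d, i in dists[:k]]

index = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]  # stored vectors
hits = knn([0.9, 0.1], index, k=2)  # -> vector 1 is the nearest
```

With the real library, indexes can be persisted with faiss.write_index and reloaded with faiss.read_index, which is presumably what the "load and store indexes" feature above refers to.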

Supported embedding methods:

  • via HF Sentence Transformer (the model can be changed)
  • via Ollama Embeddings Models (the model can be changed)

Installation

Python Framework installation

  1. Download and install Python
  2. Install ragfmk using pip:
pip install [--force-reinstall] <wheel file> (see the /dist folder)

Environment variables

Some environment variables may need to be set:

  • If you need to use LlamaParse, the LlamaIndex token (generated on the website: https://cloud.llamaindex.ai/login) must be set in LLAMAINDEX_API_KEY
  • If you need to use Hugging Face, the Hugging Face token must be set in HUGGINGFACE_API_KEY
  • For AWS please set the following variables:
    • AWS_ACCESS_KEY_ID=...
    • AWS_SECRET_ACCESS_KEY=...
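For example, in a Unix shell (all values below are placeholders, not real tokens):

```shell
# Placeholder values -- substitute your own tokens and keys.
export LLAMAINDEX_API_KEY="llx-..."
export HUGGINGFACE_API_KEY="hf_..."
export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="..."
```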

Installation/Preparation for Ollama

  1. Install ollama (https://ollama.com/)
  2. Run ollama from the command line and pull at least one model. tinydolphin, for example, is a good choice: it is a very small model, so it can run on a simple laptop without much latency.
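For example (assuming the ollama binary is installed and on your PATH):

```shell
ollama pull tinydolphin                 # download the model once
ollama run tinydolphin "Hello, world!"  # quick interactive check
```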

Example of use

See the tests folder.

Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distribution


smartgenai-0.1.0.1-py3-none-any.whl (45.3 kB)


File details

Details for the file smartgenai-0.1.0.1-py3-none-any.whl.

File metadata

  • Download URL: smartgenai-0.1.0.1-py3-none-any.whl
  • Upload date:
  • Size: 45.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.1

File hashes

Hashes for smartgenai-0.1.0.1-py3-none-any.whl
  • SHA256: 1483a23ad3ac0180baaf135a5903148337a2b38dbfb8d66cf2fd8e27dca0111e
  • MD5: 85c8d22883ede107f5cc20b9253d6f11
  • BLAKE2b-256: 4c8712ab758b8744ab3d3554933998d118242b46081157673d00f6ad7b73edf2

