
LlamaIndex Readers Integration: Database

Overview

DatabaseReader queries a SQL database and loads the resulting rows as LlamaIndex Document objects.

Key features

  • Accepts connection via SQLDatabase, SQLAlchemy Engine, full URI, or discrete credentials
  • Optional schema selection (namespace)
  • Column-level metadata mapping (metadata_cols) and text exclusion (excluded_text_cols)
  • Custom id_ generation function (document_id)
  • Supports streaming (lazy_load_data) and async (aload_data)
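
To make the column-mapping features concrete, here is an illustrative sketch (using only the standard library, with a hypothetical in-memory table) of how `metadata_cols` and `excluded_text_cols` conceptually shape each row into a document: mapped columns land in metadata, excluded columns are dropped from the text, and the rest become "column: value" lines.

```python
import sqlite3

# Hypothetical table standing in for a real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER, title TEXT, body TEXT)")
conn.execute("INSERT INTO articles VALUES (1, 'Hello', 'World')")

metadata_cols = {"id": "article_id"}  # source column -> metadata key
excluded_text_cols = {"id"}           # columns omitted from the text

cur = conn.execute("SELECT id, title, body FROM articles")
cols = [d[0] for d in cur.description]
docs = []
for row in cur:
    row_map = dict(zip(cols, row))
    # Remaining columns become "column: value" lines of text.
    text = "\n".join(
        f"{c}: {v}" for c, v in row_map.items() if c not in excluded_text_cols
    )
    metadata = {metadata_cols[c]: row_map[c] for c in metadata_cols}
    docs.append({"text": text, "metadata": metadata})

print(docs[0]["text"])      # title: Hello\nbody: World
print(docs[0]["metadata"])  # {'article_id': 1}
```

This is only a mental model of the row-to-document mapping, not the reader's internals.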

Installation

You can install Database Reader via pip:

pip install llama-index-readers-database

Usage

from llama_index.readers.database import DatabaseReader

# Initialize DatabaseReader with your connection details.
# Provide ONE of: sql_database, engine, uri, or the discrete credential fields.
reader = DatabaseReader(
    sql_database="<SQLDatabase Object>",  # Optional: SQLDatabase object
    engine="<SQLAlchemy Engine Object>",  # Optional: SQLAlchemy Engine object
    uri="<Connection URI>",  # Optional: Connection URI
    scheme="<Scheme>",  # Optional: Scheme
    host="<Host>",  # Optional: Host
    port="<Port>",  # Optional: Port
    user="<Username>",  # Optional: Username
    password="<Password>",  # Optional: Password
    dbname="<Database Name>",  # Optional: Database Name
)

# Load data from the database using a query
documents = reader.load_data(
    query="<SQL Query>"  # SQL query that selects the rows (and columns) to load
)
# Initialize DatabaseReader with the SQL connection string and custom database schema
from llama_index.readers.database import DatabaseReader

reader = DatabaseReader(
    uri="postgresql+psycopg2://user:pass@localhost:5432/mydb",
    schema="warehouse",  # optional namespace
)
# Streaming variant: exclude the id column from each Document's text_resource
for doc in reader.lazy_load_data(
    query="SELECT * FROM warehouse.big_table", excluded_text_cols={"id"}
):
    process(doc)

# Async variant: include the region column in each Document's metadata
docs_async = await reader.aload_data(
    query="SELECT * FROM warehouse.big_table", metadata_cols=["region"]
)
# Advanced usage: rename metadata columns, exclude columns from
# `Document.text_resource`, and derive a custom `Document.id_` from row data
from llama_index.readers.database import DatabaseReader

reader_media = DatabaseReader(
    uri="postgresql+psycopg2://user:pass@localhost:5432/mydb",
    schema="media",  # optional namespace
)

docs = reader_media.load_data(
    query="SELECT id, title, body, updated_at FROM media.articles",
    metadata_cols=[
        ("id", "article_id"),
        "updated_at",
    ],  # map / include in metadata
    excluded_text_cols=["updated_at"],  # omit from text
    document_id=lambda row: f"media-articles-{row['id']}",  # custom document id
)

This loader is designed as a way to load data into LlamaIndex.
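
For example, the loaded documents can be passed straight into an index. A minimal sketch, assuming `llama-index` core is installed, the connection URI is reachable, and an embedding model is configured (the table and query below are placeholders):

```python
from llama_index.core import VectorStoreIndex
from llama_index.readers.database import DatabaseReader

# Load rows from the database as Documents.
reader = DatabaseReader(uri="postgresql+psycopg2://user:pass@localhost:5432/mydb")
documents = reader.load_data(query="SELECT id, title, body FROM articles")

# Build a vector index over the documents and query it.
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
print(query_engine.query("Which articles discuss databases?"))
```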
