An integration package connecting MariaDB and LangChain

langchain-mariadb

langchain-mariadb is LangChain's MariaDB integration. It provides vector store capabilities for MariaDB version 11.7.1 and above and is distributed under the MIT license. You can use the provided implementations as-is or customize them for specific needs. Key features include:

  • Built-in vector similarity search
  • Support for cosine and euclidean distance metrics
  • Robust metadata filtering options
  • Performance optimization through connection pooling
  • Configurable table and column settings
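To make the two distance metrics concrete, here is a minimal pure-Python sketch of what each one measures. It is an illustration only, not the package's code (the package delegates distance computation to MariaDB); the function names and sample vectors are made up for this example.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity: 0.0 for vectors pointing the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

query = [1.0, 0.0]
same_direction = [3.0, 0.0]   # same direction as query, different length
orthogonal = [0.0, 1.0]       # perpendicular to query

# Cosine distance ignores vector length; euclidean distance does not.
print(cosine_distance(query, same_direction))
print(euclidean_distance(query, same_direction))
print(cosine_distance(query, orthogonal))
```

Cosine distance is usually the better fit when only the direction of an embedding matters, while euclidean distance is also sensitive to magnitude.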

Getting Started

Setting Up MariaDB

Launch a MariaDB Docker container with:

docker run --name mariadb-container -e MARIADB_ROOT_PASSWORD=langchain -e MARIADB_DATABASE=langchain -p 3306:3306 -d mariadb:11.7

Installing the Package

The package uses SQLAlchemy but works best with the MariaDB connector, which requires C/C++ components:

# Debian, Ubuntu
sudo apt install libmariadb3 libmariadb-dev

# CentOS, RHEL, Rocky Linux
sudo yum install MariaDB-shared MariaDB-devel

# Install Python connector
pip install --quiet -U mariadb

Then install the langchain-mariadb package:

pip install -U langchain-mariadb

The vector store works together with an embedding model. The examples here use langchain-openai:

pip install langchain-openai
export OPENAI_API_KEY=...

Creating a Vector Store

from langchain_openai import OpenAIEmbeddings
from langchain_mariadb import MariaDBStore
from langchain_core.documents import Document

# Connection string (myuser/mypassword are placeholders for your own database credentials)
url = "mariadb+mariadbconnector://myuser:mypassword@localhost/langchain"

# Initialize vector store
vectorstore = MariaDBStore(
    embeddings=OpenAIEmbeddings(),
    embedding_length=1536,
    datasource=url,
    collection_name="my_docs"
)
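If the password contains characters such as @ or /, they have to be percent-encoded before they go into a SQLAlchemy-style URL. The sketch below shows one way to assemble the connection string; build_mariadb_url is a hypothetical helper written for this example, not part of langchain-mariadb, and the credentials are placeholders.

```python
from urllib.parse import quote

def build_mariadb_url(user, password, host, database, port=3306):
    """Assemble a connection URL, percent-encoding the credentials.

    Hypothetical helper for illustration; not provided by langchain-mariadb.
    """
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return (
        "mariadb+mariadbconnector://"
        f"{safe_user}:{safe_password}@{host}:{port}/{database}"
    )

# A password containing '@' and '/' would break an unescaped URL.
url = build_mariadb_url("myuser", "p@ss/word", "localhost", "langchain")
print(url)
```

The resulting string can be passed as the datasource argument in the same way as the hand-written URL.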

Adding Data

You can add data as documents with metadata:

# adding documents
docs = [
    Document(page_content='there are cats in the pond', metadata={"id": 1, "location": "pond", "topic": "animals"}),
    Document(page_content='ducks are also found in the pond', metadata={"id": 2, "location": "pond", "topic": "animals"}),
    # More documents...
]
vectorstore.add_documents(docs)

Or as plain text with optional metadata:

texts = ['a sculpture exhibit is also at the museum', 'a new coffee shop opened on Main Street',]
metadatas = [
    {"id": 6, "location": "museum", "topic": "art"},
    {"id": 7, "location": "Main Street", "topic": "food"},
]

vectorstore.add_texts(texts=texts, metadatas=metadatas)

Searching

# Basic similarity search
results = vectorstore.similarity_search("Hello", k=2)

# Search with metadata filtering
results = vectorstore.similarity_search(
    "Hello",
    filter={"category": "greeting"}
)

Filter Options

The system supports various filtering operations on metadata:

  • Equality: $eq
  • Inequality: $ne
  • Comparisons: $lt, $lte, $gt, $gte
  • List operations: $in, $nin
  • Text matching: $like, $nlike
  • Logical operations: $and, $or, $not

Example:

# Search with simple filter
results = vectorstore.similarity_search('kitty', k=10, filter={
    'id': {'$in': [1, 5, 2, 9]}
})

# Search with multiple conditions (AND)
results = vectorstore.similarity_search('ducks', k=10, filter={
    'id': {'$in': [1, 5, 2, 9]},
    'location': {'$in': ["pond", "market"]}
})
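To reason about what a filter will select, it can help to evaluate it against plain dictionaries. The following is a rough in-memory model of the operators listed above, not the package's implementation (the package applies filters inside the database). It assumes that a bare value means equality, that sibling keys are combined with AND, that $not takes a single nested filter, and that $like uses SQL wildcards (% and _). Unlike MariaDB with a case-insensitive collation, this sketch matches case-sensitively.

```python
import re

def like(value, pattern):
    """SQL-style LIKE: % matches any run of characters, _ matches one."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, str(value), flags=re.DOTALL) is not None

COMPARATORS = {
    "$eq": lambda v, arg: v == arg,
    "$ne": lambda v, arg: v != arg,
    "$lt": lambda v, arg: v < arg,
    "$lte": lambda v, arg: v <= arg,
    "$gt": lambda v, arg: v > arg,
    "$gte": lambda v, arg: v >= arg,
    "$in": lambda v, arg: v in arg,
    "$nin": lambda v, arg: v not in arg,
    "$like": like,
    "$nlike": lambda v, arg: not like(v, arg),
}

def matches(metadata, flt):
    """Return True if the metadata dict satisfies every condition in flt."""
    for key, cond in flt.items():
        if key == "$and":
            ok = all(matches(metadata, sub) for sub in cond)
        elif key == "$or":
            ok = any(matches(metadata, sub) for sub in cond)
        elif key == "$not":
            ok = not matches(metadata, cond)
        elif isinstance(cond, dict):
            # Field with explicit operators, e.g. {"id": {"$in": [1, 2]}}
            ok = key in metadata and all(
                COMPARATORS[op](metadata[key], arg) for op, arg in cond.items()
            )
        else:
            # Bare value is treated as shorthand for equality
            ok = metadata.get(key) == cond
        if not ok:
            return False
    return True

# Metadata taken from the sample documents earlier in this README
sample = [
    {"id": 1, "location": "pond", "topic": "animals"},
    {"id": 2, "location": "pond", "topic": "animals"},
    {"id": 6, "location": "museum", "topic": "art"},
    {"id": 7, "location": "Main Street", "topic": "food"},
]

def select_ids(flt):
    return [m["id"] for m in sample if matches(m, flt)]

print(select_ids({"id": {"$in": [1, 5, 2, 9]}, "location": {"$in": ["pond", "market"]}}))
print(select_ids({"$or": [{"topic": "art"}, {"id": {"$gte": 7}}]}))
```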

Chat Message History

The package also provides a way to store chat message history in MariaDB:

import uuid
from langchain_core.messages import SystemMessage, AIMessage, HumanMessage
from langchain_mariadb import MariaDBChatMessageHistory

# Set up database connection
url = "mariadb+mariadbconnector://myuser:mypassword@localhost/chatdb"

# Create table (one-time setup)
table_name = "chat_history"
MariaDBChatMessageHistory.create_tables(url, table_name)

# Initialize chat history manager
chat_history = MariaDBChatMessageHistory(
    table_name,
    str(uuid.uuid4()), # session_id
    datasource=url
)

# Add messages to the chat history
chat_history.add_messages([
    SystemMessage(content="Meow"),
    AIMessage(content="woof"),
    HumanMessage(content="bark"),
])

print(chat_history.messages)
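The session_id is what keeps separate conversations apart in a shared table. The sketch below illustrates that idea with the standard-library sqlite3 module standing in for MariaDB; the table layout, column names, and helper functions are assumptions made for this example and do not reflect the schema that create_tables actually produces.

```python
import json
import sqlite3
import uuid

# In-memory SQLite database as a stand-in for a MariaDB connection
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chat_history ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "session_id TEXT NOT NULL, "
    "message TEXT NOT NULL)"
)

def add_messages(session_id, messages):
    """Append messages (plain dicts here) to one session, serialized as JSON."""
    conn.executemany(
        "INSERT INTO chat_history (session_id, message) VALUES (?, ?)",
        [(session_id, json.dumps(m)) for m in messages],
    )

def get_messages(session_id):
    """Return one session's messages in insertion order."""
    rows = conn.execute(
        "SELECT message FROM chat_history WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return [json.loads(row[0]) for row in rows]

session_a = str(uuid.uuid4())
session_b = str(uuid.uuid4())

add_messages(session_a, [
    {"type": "system", "content": "Meow"},
    {"type": "ai", "content": "woof"},
])
add_messages(session_b, [{"type": "human", "content": "bark"}])

# Each session only sees its own messages.
print(get_messages(session_a))
print(get_messages(session_b))
```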
