llama-index vector_stores mongodb integration

LlamaIndex Vector_Stores Integration: MongoDB

Setting up MongoDB Atlas as the Datastore Provider

MongoDB Atlas is a multi-cloud database service made by the same people that build MongoDB. Atlas simplifies deploying and managing your databases while offering the versatility you need to build resilient and performant global applications on the cloud providers of your choice.

You can perform semantic search on data in your Atlas cluster running MongoDB v6.0.11, v7.0.2, or later using Atlas Vector Search. You can store vector embeddings for any kind of data along with other data in your collection on the Atlas cluster.

In this section, we provide detailed instructions to run the tests.

Deploy a Cluster

Follow the Getting-Started documentation to create an account, deploy an Atlas cluster, and connect to a database.

Retrieve the URI used by Python to connect to the Cluster

Once the cluster is deployed, you will need a URI (connection string) to access it. Store it in the environment variable MONGODB_URI. It will look something like the following. If you have not yet set a username and password, configure them in Database Access, under Security in the left panel.

export MONGODB_URI="mongodb+srv://<username>:<password>@cluster0.foo.mongodb.net/?retryWrites=true&w=majority"
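If the username or password contains reserved characters such as '@', ':' or '/', they need to be percent-encoded before they go into the connection string. The following is a minimal sketch using only the standard library. The helper name build_mongodb_uri and the sample credentials are hypothetical, and the host is the placeholder from the example above.

```python
from urllib.parse import quote_plus


def build_mongodb_uri(username: str, password: str, host: str) -> str:
    """Return an SRV connection string with percent-encoded credentials."""
    # quote_plus escapes characters such as '@', ':' and '/' that would
    # otherwise be parsed as URI delimiters.
    user = quote_plus(username)
    pwd = quote_plus(password)
    return f"mongodb+srv://{user}:{pwd}@{host}/?retryWrites=true&w=majority"


if __name__ == "__main__":
    # Hypothetical credentials, for illustration only.
    print(build_mongodb_uri("app_user", "p@ss:word/1", "cluster0.foo.mongodb.net"))
```

The resulting string can be exported as MONGODB_URI in place of a hand-edited one.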

Head to Atlas UI to find the connection string.

NOTE: There are a number of ways to navigate the Atlas UI. Keep your eye out for "Connect" and "driver".

On the left panel, find and click 'Database' under DEPLOYMENT. Click the Connect button that appears, then Drivers. Select Python. (Do not worry about the version selector. It refers to the PyMongo version, not the Python version.) Once you have the Connect window open, you will see an instruction to pip install pymongo. You will also see a connection string. This is the URI that a pymongo.MongoClient uses to connect to the database.

Test the connection

Atlas provides a simple check. Once you have your URI and PyMongo installed, try the following in a Python console.

import os

from pymongo.mongo_client import MongoClient

uri = os.environ["MONGODB_URI"]  # Connection string stored earlier
client = MongoClient(uri)  # Create a new client and connect to the server
try:
    # Send a ping to confirm a successful connection
    client.admin.command("ping")
    print("Pinged your deployment. You successfully connected to MongoDB!")
except Exception as e:
    print(e)

Troubleshooting

  • You can edit a database's users and passwords on the 'Database Access' page, under Security.
  • Remember to add your IP address to the cluster's IP access list. (Try curl -4 ifconfig.co to find it.)

Create a Database and Collection

Vector databases serve two purposes. In addition to being the data store, they provide very efficient search based on natural-language queries. With Vector Search, you index and query data with a vector search algorithm that uses Hierarchical Navigable Small World (HNSW) graphs to find similar vectors.
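To make "vector similarity" concrete, the sketch below scores a query vector against a few stored vectors with cosine similarity and ranks them exhaustively. It illustrates the metric only: it is an exact brute-force search, not HNSW, and not how Atlas implements search. The function names and the toy two-dimensional embeddings are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def brute_force_top_k(
    query: Sequence[float],
    documents: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[float, str]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in documents.items()]
    return sorted(scored, reverse=True)[:k]


if __name__ == "__main__":
    # Toy embeddings, for illustration only.
    docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.8, 0.6]}
    for score, doc_id in brute_force_top_k([1.0, 0.0], docs):
        print(doc_id, round(score, 3))
```

An approximate index such as HNSW exists to avoid this full scan, trading a small amount of recall for much faster queries on large collections.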

The indexing runs asynchronously beside the data as a separate service. The search index monitors changes to the collection it applies to. Consequently, you need not upload the data first. We will create an empty collection now, which will simplify setup in the example notebook.

Back in the UI, navigate to the Database Deployments page by clicking Database on the left panel. Click the "Browse Collections" and then "+ Create Database" buttons. This will open a window where you choose the database and collection names. (No additional preferences are needed.) Remember these values, as they will be used as the environment variables MONGODB_DATABASE and MONGODB_COLLECTION.

Set Datastore Environment Variables

To establish a connection to the MongoDB Cluster, Database, and Collection, plus create a Vector Search Index, define the following environment variables. You can confirm that the required ones have been set like this: assert "MONGODB_URI" in os.environ

IMPORTANT: The values you choose in Atlas must match the values set in your Python environment(s).

Name                Description        Example
MONGODB_URI         Connection string  mongodb+srv://<user>:<password>@llama-index.zeatahb.mongodb.net
MONGODB_DATABASE    Database name      llama_index_test_db
MONGODB_COLLECTION  Collection name    llama_index_test_vectorstore
MONGODB_INDEX       Search index name  vector_index

The following will be required to authenticate with OpenAI.

Name            Description
OPENAI_API_KEY  OpenAI token created at https://platform.openai.com/api-keys
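A small check can report every missing setting at once rather than failing on the first assert. This is a sketch under one assumption: it treats all five variables named above as required, which may be stricter than the test suite needs. The helper name missing_settings is hypothetical.

```python
import os
from typing import List, Mapping

# Assumption: all five variables are treated as required here.
REQUIRED_VARS = (
    "MONGODB_URI",
    "MONGODB_DATABASE",
    "MONGODB_COLLECTION",
    "MONGODB_INDEX",
    "OPENAI_API_KEY",
)


def missing_settings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All datastore environment variables are set.")
```

Passing the environment in as a mapping keeps the check easy to exercise with a plain dict.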

Create an Atlas Vector Search Index

The final step to configure MongoDB as the datastore is to create a Vector Search Index. The procedure is described in the Atlas Vector Search documentation.

Under Services on the left panel, choose Atlas Search > Create Search Index > Atlas Vector Search JSON Editor.

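The integration expects an index definition like the following. To begin, choose numDimensions: 1536, which matches the output size of OpenAI's text-embedding-ada-002 model. This value must equal the dimension of whichever embedding model you use. You can experiment with these settings later.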

{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "embedding",
      "similarity": "cosine",
      "type": "vector"
    }
  ]
}
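If you prefer to generate the definition rather than type it, the sketch below builds the same JSON and rejects values that Atlas Vector Search would not accept for the similarity field. The helper name vector_index_definition is hypothetical, and the defaults simply mirror the definition above.

```python
import json
from typing import Any, Dict

# Similarity functions accepted by Atlas Vector Search index definitions.
ALLOWED_SIMILARITY = {"cosine", "dotProduct", "euclidean"}


def vector_index_definition(
    num_dimensions: int = 1536,
    path: str = "embedding",
    similarity: str = "cosine",
) -> Dict[str, Any]:
    """Build a vector index definition for the JSON editor."""
    if num_dimensions <= 0:
        raise ValueError("num_dimensions must be a positive integer")
    if similarity not in ALLOWED_SIMILARITY:
        raise ValueError(f"similarity must be one of {sorted(ALLOWED_SIMILARITY)}")
    return {
        "fields": [
            {
                "numDimensions": num_dimensions,
                "path": path,
                "similarity": similarity,
                "type": "vector",
            }
        ]
    }


if __name__ == "__main__":
    # Print JSON that can be pasted into the Atlas Vector Search JSON Editor.
    print(json.dumps(vector_index_definition(), indent=2))
```

Recent PyMongo releases also expose Collection.create_search_index, which may let you submit the same definition from code. Check the PyMongo documentation for the version you have installed before relying on it.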

Running MongoDB Integration Tests

In addition to the Jupyter Notebook in the documentation, a suite of integration tests is available under ./tests to verify the MongoDB integration. This test suite requires a running cluster and the environment variables defined above.
