AstraPy is a Pythonic SDK for DataStax Astra
Part III - Inserting Documents
- 3.1 - Inserting a document
- 3.2 - Inserting multiple documents
- 3.3 - Inserting multiple vector documents
- 3.4 - Creating a subdocument
- 3.5 - Create a document without an ID
Part V - Finding Documents
- 5.1 - Find documents using vector search
- 5.2 - Find documents using vector search and projection
- 5.3 - Find one and update with vector search
- 5.4 - Find one and replace with vector search
Part I - Getting Started
1.1 Install AstraPy
pip install astrapy
1.2 Setup your Astra client
Create a .env file with the appropriate values:
ASTRA_DB_APPLICATION_TOKEN="<AstraCS:...>"
ASTRA_DB_API_ENDPOINT="<https://...>"
If you have Astra CLI installed, you can create the .env file with astra db create-dotenv DATABASE_NAME.
Load the variables in and then create the client. This collections client can make non-vector and vector calls, depending on the call configuration.
import os
from dotenv import load_dotenv
from astrapy.db import AstraDB, AstraDBCollection
from astrapy.ops import AstraDBOps
load_dotenv()
# Grab the Astra token and api endpoint from the environment
token = os.getenv("ASTRA_DB_APPLICATION_TOKEN")
api_endpoint = os.getenv("ASTRA_DB_API_ENDPOINT")
# Initialize our vector db
astra_db = AstraDB(token=token, api_endpoint=api_endpoint)
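os.getenv returns None when a variable is missing, so a typo in the .env file only shows up later as a failed request. The following is a minimal sketch of an up-front check. require_env is a hypothetical helper and not part of AstraPy; it accepts the environment as a mapping so it can be tried without real credentials, and the token and endpoint values below are placeholders.

```python
import os
from typing import Dict, Iterable, Mapping


def require_env(
    names: Iterable[str], environ: Mapping[str, str] = os.environ
) -> Dict[str, str]:
    """Return the requested variables, raising if any is missing or empty.

    Hypothetical helper for illustration; not part of AstraPy.
    """
    values = {name: environ.get(name, "") for name in names}
    missing = [name for name, value in values.items() if not value]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return values


# Stand-in mapping with placeholder values instead of the real environment
example_env = {
    "ASTRA_DB_APPLICATION_TOKEN": "AstraCS:example",
    "ASTRA_DB_API_ENDPOINT": "https://example.invalid",
}
settings = require_env(
    ["ASTRA_DB_APPLICATION_TOKEN", "ASTRA_DB_API_ENDPOINT"], example_env
)
```

In a real script you would call require_env after load_dotenv() and pass the two returned values to AstraDB.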
Part II - Collections
2.1 Create and Delete Vector Collections
Create a vector collection with a dimension of 5. If you were using OpenAI embeddings here (for example, text-embedding-ada-002), you would use 1536 as the value.
# Create a collection and then delete it
astra_db.create_collection(collection_name="collection_test_delete", dimension=5)
astra_db.delete_collection(collection_name="collection_test_delete")
# Double check the collections in your vector store
astra_db.get_collections()
The remaining examples operate on a collection named "collection_test". In the next section, you will create the object for that collection.
2.2 Connect to existing collection
# create_collection() returns the collection object
collection = astra_db.create_collection(
collection_name="collection_test", dimension=5
)
# Or you can connect to an existing collection directly
collection = AstraDBCollection(
collection_name="collection_test", astra_db=astra_db
)
# You don't even need the astra_db object
collection = AstraDBCollection(
collection_name="collection_test", token=token, api_endpoint=api_endpoint
)
Part III - Inserting Documents
3.1 Inserting a document
Here is an example of inserting a vector object into your vector store (collection), followed by two find commands that try to retrieve a document. The first find command finds nothing because no document with that name exists. The second find command should succeed.
collection.insert_one(
{
"_id": "5",
"name": "Coded Cleats Copy",
"description": "ChatGPT integrated sneakers that talk to you",
"$vector": [0.25, 0.25, 0.25, 0.25, 0.25],
}
)
collection.find_one({"name": "potato"}) # Not found
collection.find_one({"name": "Coded Cleats Copy"})
3.2 Inserting multiple documents
Here is an example of inserting a number of documents into your collection. Note that the json object is 'documents' here, not 'document' as it is in insert_one.
In the first insert, the default behavior is in place. If you are inserting documents that already exist, you will get an error and the process will end.
These two examples are using non-vector objects.
documents = [
{
"_id": "id_1",
"first_name": "Dang",
"last_name": "Son",
},
{
"_id": "id_2",
"first_name": "Yep",
"last_name": "Boss",
},
]
response = collection.insert_many(documents=documents)
In the following insert_many example, options are set so that it skips errors and only inserts successful entries.
documents2 = [
{
"_id": "id_2",
"first_name": "Yep",
"last_name": "Boss",
},
{
"_id": "id_3",
"first_name": "Miv",
"last_name": "Fuff",
},
]
response = collection.insert_many(
documents=documents2,
partial_failures_allowed=True,
)
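To make the expected outcome of the second insert concrete, here is a toy in-memory model of "skip the failures, keep the successes". It is only a sketch of the behavior described above, not how AstraPy or the JSON API implements it; insert_skipping_duplicates and the store dictionary are hypothetical names.

```python
from typing import Any, Dict, List, Tuple

Document = Dict[str, Any]


def insert_skipping_duplicates(
    store: Dict[str, Document], documents: List[Document]
) -> Tuple[List[str], List[str]]:
    """Insert into a dict keyed by _id; return (inserted_ids, skipped_ids)."""
    inserted: List[str] = []
    skipped: List[str] = []
    for doc in documents:
        doc_id = doc["_id"]
        if doc_id in store:
            # A document with this _id already exists: record it and move on
            skipped.append(doc_id)
            continue
        store[doc_id] = doc
        inserted.append(doc_id)
    return inserted, skipped


# State after the first insert_many call (id_1 and id_2 already present)
store: Dict[str, Document] = {
    "id_1": {"_id": "id_1", "first_name": "Dang", "last_name": "Son"},
    "id_2": {"_id": "id_2", "first_name": "Yep", "last_name": "Boss"},
}
documents2 = [
    {"_id": "id_2", "first_name": "Yep", "last_name": "Boss"},
    {"_id": "id_3", "first_name": "Miv", "last_name": "Fuff"},
]
inserted, skipped = insert_skipping_duplicates(store, documents2)
```

In this model id_2 is skipped as a duplicate and id_3 is added, which mirrors what the text says the second insert_many call is meant to do.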
3.3 Inserting multiple vector documents
The following code inserts vector objects into the collection in your vector store.
import uuid

# Generate the ID for the third document up front
vv_uuid = str(uuid.uuid4())
json_query = [
{
"_id": str(uuid.uuid4()),
"name": "Coded Cleats",
"description": "ChatGPT integrated sneakers that talk to you",
"$vector": [0.1, 0.15, 0.3, 0.12, 0.05],
},
{
"_id": str(uuid.uuid4()),
"name": "Logic Layers",
"description": "An AI quilt to help you sleep forever",
"$vector": [0.45, 0.09, 0.01, 0.2, 0.11],
},
{
"_id": vv_uuid,
"name": "Vision Vector Frame",
"description": "Vision Vector Frame - A deep learning display that controls your mood",
"$vector": [0.1, 0.05, 0.08, 0.3, 0.6],
},
]
res = collection.insert_many(documents=json_query)
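Every "$vector" in this guide has five components, matching the dimension=5 used when the collection was created. The sketch below shows one way to check that on the client before calling insert_many. check_vector_dimensions is a hypothetical helper, not an AstraPy function, and the sample documents are made up for illustration.

```python
from typing import Any, Dict, List


def check_vector_dimensions(
    documents: List[Dict[str, Any]], dimension: int
) -> List[str]:
    """Return the _id (or list position) of each document whose $vector has the wrong length."""
    problems: List[str] = []
    for position, doc in enumerate(documents):
        vector = doc.get("$vector")
        # Documents without a vector are allowed; only check the ones that have one
        if vector is not None and len(vector) != dimension:
            problems.append(str(doc.get("_id", position)))
    return problems


sample = [
    {"_id": "ok", "$vector": [0.1, 0.15, 0.3, 0.12, 0.05]},
    {"_id": "too_short", "$vector": [0.1, 0.2]},
    {"_id": "no_vector", "name": "plain document"},
]
bad_ids = check_vector_dimensions(sample, dimension=5)
```

If bad_ids is non-empty, you can fix or drop those documents before sending the batch.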
3.4 Creating a subdocument
The following code uses update to create or update a sub-document under one of your existing documents.
document = collection.update_one(
filter={"_id": "id_1"},
update={"$set": {"name": "Eric"}},
)
document = collection.find_one(filter={"_id": "id_1"})
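The example above sets a top-level field. Mongo-style update operators commonly accept dotted paths such as "addresses.home.city" to create nested sub-documents; whether the JSON API behaves the same way is an assumption here and should be checked against its documentation. The pure-Python sketch below only models that dotted-path idea; apply_set is a hypothetical helper, not part of AstraPy.

```python
import copy
from typing import Any, Dict


def apply_set(document: Dict[str, Any], fields: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of document with each dotted path in fields set to its value."""
    result = copy.deepcopy(document)
    for path, value in fields.items():
        *parents, leaf = path.split(".")
        target = result
        for key in parents:
            # Create intermediate sub-documents as needed
            target = target.setdefault(key, {})
        target[leaf] = value
    return result


original = {"_id": "id_1", "first_name": "Dang", "last_name": "Son"}
updated = apply_set(original, {"name": "Eric", "addresses.home.city": "Austin"})
```

Here updated gains both the top-level "name" field and a nested "addresses" sub-document, while original is left unchanged.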
3.5 Create a document without an ID
response = collection.insert_one(
document={
"first_name": "New",
"last_name": "Guy",
}
)
document = collection.find_one(filter={"first_name": "New"})
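When a document is inserted without an "_id", you have to find it again by some other field, as the find_one call above does. If you would rather know the ID in advance, one option is to generate it on the client, as section 3.3 does with uuid. A minimal sketch, where with_generated_id is a hypothetical helper and not part of AstraPy:

```python
import uuid
from typing import Any, Dict


def with_generated_id(document: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of document that is guaranteed to carry an _id."""
    if "_id" in document:
        return dict(document)
    # No _id supplied: attach a random UUID as a string
    return {"_id": str(uuid.uuid4()), **document}


new_guy = {"first_name": "New", "last_name": "Guy"}
new_guy_with_id = with_generated_id(new_guy)
```

The returned dictionary can be passed to insert_one, and its "_id" value can be used for later lookups.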
Part IV - Updating Documents
4.1 Update a document
collection.update_one(
filter={"_id": "id_1"},
update={"$set": {"name": "Bob"}},
)
document = collection.find_one(filter={"_id": "id_1"})
4.2 Replace a non-vector document
collection.find_one_and_replace(
filter={"_id": "id_1"},
replacement={
"_id": "id_1",
"addresses": {
"work": {
"city": "New York",
"state": "NY",
}
},
},
)
document = collection.find_one(filter={"_id": "id_1"})
document_2 = collection.find_one(
filter={"_id": "id_1"}, projection={"addresses.work.city": 1}
)
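The projection in the last call asks for a single nested field. As a rough illustration of what such a dotted-path projection selects, here is a pure-Python sketch. It assumes, in the style of Mongo-like APIs, that "_id" is always kept; the exact shape returned by the JSON API may differ. project is a hypothetical helper, not part of AstraPy.

```python
from typing import Any, Dict, List


def project(document: Dict[str, Any], paths: List[str]) -> Dict[str, Any]:
    """Return a new dict holding _id plus only the dotted paths listed."""
    result: Dict[str, Any] = {}
    if "_id" in document:
        result["_id"] = document["_id"]  # assumption: _id is always returned
    for path in paths:
        keys = path.split(".")
        source: Any = document
        found = True
        for key in keys:
            if isinstance(source, dict) and key in source:
                source = source[key]
            else:
                found = False
                break
        if not found:
            continue  # silently skip paths the document does not have
        target = result
        for key in keys[:-1]:
            target = target.setdefault(key, {})
        target[keys[-1]] = source
    return result


replaced = {
    "_id": "id_1",
    "addresses": {"work": {"city": "New York", "state": "NY"}},
}
projected = project(replaced, ["addresses.work.city"])
```

In this model projected keeps the city but drops the state.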
Part V - Finding Documents
The below examples show our high-level interfaces for finding documents. Note that corresponding low-level functionality exists, i.e., the high-level interface vector_find is a wrapper around the find function, as available in the JSON API.
5.1 Find documents using vector search
documents = collection.vector_find(
[0.15, 0.1, 0.1, 0.35, 0.55],
limit=100,
)
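To build intuition for which document a query vector is closest to, the sketch below ranks the three vectors from section 3.3 against the query used above. It assumes cosine similarity as the metric; the metric actually configured on your collection may differ, and this is not how the database computes results. cosine_similarity and rank_by_similarity are hypothetical helpers.

```python
import math
from typing import Any, Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], documents: List[Dict[str, Any]], limit: int
) -> List[Dict[str, Any]]:
    """Return up to limit documents, most similar to query first."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query, doc["$vector"]),
        reverse=True,
    )
    return ranked[:limit]


# The three vector documents from section 3.3 (IDs omitted)
products = [
    {"name": "Coded Cleats", "$vector": [0.1, 0.15, 0.3, 0.12, 0.05]},
    {"name": "Logic Layers", "$vector": [0.45, 0.09, 0.01, 0.2, 0.11]},
    {"name": "Vision Vector Frame", "$vector": [0.1, 0.05, 0.08, 0.3, 0.6]},
]
query = [0.15, 0.1, 0.1, 0.35, 0.55]
ranked_names = [doc["name"] for doc in rank_by_similarity(query, products, limit=100)]
```

Under the cosine assumption, "Vision Vector Frame" ranks first because its vector points in nearly the same direction as the query.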
5.2 Find documents using vector search and projection
documents = collection.vector_find(
[0.15, 0.1, 0.1, 0.35, 0.55],
limit=100,
fields=["$vector"],
)
5.3 Find one and update with vector search
update = {"$set": {"status": "active"}}
document = collection.find_one(filter={"status": "active"})
collection.vector_find_one_and_update(
[0.15, 0.1, 0.1, 0.35, 0.55],
update=update,
)
document = collection.find_one(filter={"status": "active"})
5.4 Find one and replace with vector search
replacement = {
"_id": "1",
"name": "Vision Vector Frame",
"description": "Vision Vector Frame - A deep learning display that controls your mood",
"$vector": [0.1, 0.05, 0.08, 0.3, 0.6],
"status": "inactive",
}
collection.vector_find_one_and_replace(
[0.15, 0.1, 0.1, 0.35, 0.55],
replacement=replacement,
)
document = collection.find_one(filter={"name": "Vision Vector Frame"})
Part VI - Deleting Documents
6.1 Delete a subdocument
response = collection.delete_subdocument(id="id_1", subdoc="addresses")
document = collection.find(filter={"_id": "id_1"})
6.2 Delete a document
response = collection.delete(id="id_1")
More Information
Check out the notebook which has examples for finding and inserting information into the database, including vector commands.
Take a look at the astra db tests for specific endpoint examples.
Using the Ops Client
You can use the Ops client to work with the Astra DevOps API. Check the devops tests for examples.
For Developers
Install poetry
pip install poetry
Install the project dependencies
poetry install
Style, linter, typing
AstraPy tries to be consistent in code style and linting. Moreover, type annotations are used everywhere.
To ensure the code complies, you should get no errors (Success: no issues found...) when running the following in the root dir:
poetry run black --check astrapy && poetry run ruff astrapy && poetry run mypy astrapy
Likewise, for the tests:
poetry run black --check tests && poetry run ruff tests && poetry run mypy tests
Testing
Ensure you provide all required environment variables (you can do so by editing tests/.env after tests/.env.template):
export ASTRA_DB_APPLICATION_TOKEN="..."
export ASTRA_DB_API_ENDPOINT="..."
export ASTRA_DB_KEYSPACE="..." # Optional
export ASTRA_DB_ID="..." # For the Ops testing only
export ASTRA_DB_OPS_APPLICATION_TOKEN="..." # Ops-only, falls back to the other token
then you can run:
poetry run pytest
To remove the noise from the logs (on by default), run pytest -o log_cli=0.
To skip all collection deletions (done by default):
TEST_SKIP_COLLECTION_DELETE=1 poetry run pytest [...]
To enable the AstraDBOps testing (off by default):
TEST_ASTRADBOPS=1 poetry run pytest [...]