DashVector Client Python Library
DashVector is a scalable and fully-managed vector-database service for building various machine learning applications. The DashVector client SDK is your gateway to access the DashVector service.
For more information about DashVector, please visit: https://help.aliyun.com/document_detail/2510225.html
Installation
To install the DashVector client Python SDK, simply run:
pip install dashvector
QuickStart
import numpy as np
import dashvector
# Use the DashVector `Client` API to communicate with the backend vector database service.
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
# Create a collection named "quickstart" with dimension of 4, using the default Cosine distance metric
rsp = client.create(name='quickstart', dimension=4)
assert rsp
# Get a collection by name
collection = client.get(name='quickstart')
# Operations on a 'Collection' include Insert/Query/Upsert/Update/Delete/Fetch of docs
# Here we insert a batch of 16 sample docs (4-dimensional vectors)
collection.insert(
    [
        dashvector.Doc(id=str(i), vector=np.random.rand(4), fields={'anykey': 'anyvalue'})
        for i in range(16)
    ]
)
# Query a vector from the collection
docs = collection.query([0.1, 0.2, 0.3, 0.4], topk=5)
print(docs)
# Get statistics about collection
stats = collection.stats()
print(stats)
# Delete a collection by name
client.delete(name='quickstart')
Reference
Create a Client
Client hosts various APIs for interacting with DashVector Collections.
dashvector.Client(
    api_key: str,
    endpoint: str = 'dashvector.cn-hangzhou.aliyuncs.com',
    protocol: dashvector.DashVectorProtocol = dashvector.DashVectorProtocol.GRPC,
    timeout: float = 10.0
) -> Client
Parameters | Type | Required | Description |
---|---|---|---|
api_key | str | Yes | Your DashVector API-KEY |
endpoint | str | No | Service Endpoint. Default value: dashvector.cn-hangzhou.aliyuncs.com |
protocol | DashVectorProtocol | No | Communication protocol; supports HTTP and GRPC. Default value: DashVectorProtocol.GRPC |
timeout | float | No | Timeout period (in seconds), -1 means no timeout. Default value: 10.0 |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
assert client
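Hard-coding the API key is convenient in a demo but awkward in shared code. The sketch below builds the keyword arguments for dashvector.Client from environment variables instead. The variable names DASHVECTOR_API_KEY, DASHVECTOR_ENDPOINT and DASHVECTOR_TIMEOUT and the helper client_kwargs_from_env are assumptions made for this example, not conventions defined by the SDK.

```python
import os


def client_kwargs_from_env(env=None):
    """Build keyword arguments for dashvector.Client from environment variables.

    The variable names used here are hypothetical and chosen for this sketch.
    """
    env = os.environ if env is None else env
    api_key = env.get("DASHVECTOR_API_KEY")
    if not api_key:
        raise RuntimeError("DASHVECTOR_API_KEY is not set")
    kwargs = {"api_key": api_key}
    # Only override the documented defaults when a value is provided.
    if env.get("DASHVECTOR_ENDPOINT"):
        kwargs["endpoint"] = env["DASHVECTOR_ENDPOINT"]
    if env.get("DASHVECTOR_TIMEOUT"):
        kwargs["timeout"] = float(env["DASHVECTOR_TIMEOUT"])
    return kwargs


# Usage (requires the dashvector package and a valid key):
#   import dashvector
#   client = dashvector.Client(**client_kwargs_from_env())
```

Leaving endpoint and timeout out of the dictionary when they are unset keeps the defaults listed in the table above in effect.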
Create Collection
Client.create(
    name: str,
    dimension: int,
    dtype: Union[Type[int], Type[float]] = float,
    fields_schema: Optional[Dict[str, Union[Type[str], Type[int], Type[float], Type[bool]]]] = None,
    metric: str = 'cosine',
    timeout: Optional[int] = None
) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
name | str | Yes | The name of the Collection to create. |
dimension | int | Yes | The dimensions of the Collection's vectors. Valid values: 1-20,000 |
dtype | Union[Type[int], Type[float]] | No | The data type of the Collection's vectors. Default value: float |
fields_schema | Optional[Dict[str, Union[Type[str], Type[int], Type[float], Type[bool]]]] | No | Fields schema of the Collection, e.g. {"name": str, "age": int}. Default value: None |
metric | str | No | Vector similarity metric. Valid values: cosine (default), dotproduct, euclidean. For cosine, dtype must be float. |
timeout | Optional[int] | No | Timeout period (in seconds); -1 means the Collection is created asynchronously. Default value: None |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
rsp = client.create('YOUR-COLLECTION-NAME', dimension=4)
assert rsp
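To make the three metric names concrete, here is a minimal pure-Python sketch of the textbook definitions of dot product, cosine similarity and Euclidean distance. It only illustrates what the names usually mean; how DashVector computes or scales the scores it returns is not stated here, so treat the exact values as an assumption.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product of the two vectors after normalizing each to unit length."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean_distance(a, b):
    """Straight-line distance between the two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


q = [1.0, 0.0, 0.0, 0.0]
v = [2.0, 0.0, 0.0, 0.0]
print(dot(q, v))                 # 2.0
print(cosine_similarity(q, v))   # 1.0
print(euclidean_distance(q, v))  # 1.0
```

The two vectors point in the same direction, so cosine ignores the difference in length while dot product and Euclidean distance do not. That is the main practical difference when choosing a metric.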
List Collections
Client.list() -> DashVectorResponse
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collections = client.list()
for collection in collections:
    print(collection)
# outputs:
# 'quickstart'
Describe Collection
Client.describe(name: str) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
name | str | Yes | The name of the Collection to describe. |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
rsp = client.describe('YOUR-COLLECTION-NAME')
print(rsp)
# example output:
# {
#   "request_id": "8d3ac14e-5382-4736-b77c-4318761ddfab",
#   "code": 0,
#   "message": "",
#   "output": {
#     "name": "quickstart",
#     "dimension": 4,
#     "dtype": "FLOAT",
#     "metric": "dotproduct",
#     "fields_schema": {
#       "name": "STRING",
#       "age": "INT",
#       "height": "FLOAT"
#     },
#     "status": "SERVING",
#     "partitions": {
#       "default": "SERVING"
#     }
#   }
# }
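When a Collection is created asynchronously, a caller typically has to wait until it reports a serving status before using it. The following is a rough polling sketch built on Client.describe. The helper names are hypothetical, and it assumes that the response output exposes status either as a dictionary key or as an attribute, and that the ready value equals the string "SERVING" shown in the example output; verify both against your SDK version.

```python
import time


def _status_of(output):
    # Assumption: "status" is available as a dict key or as an attribute.
    if isinstance(output, dict):
        return output.get("status")
    return getattr(output, "status", None)


def wait_until_serving(client, name, poll_interval=1.0, max_wait=60.0):
    """Poll client.describe(name) until the status is "SERVING".

    Returns True once the collection is serving, False if max_wait elapses.
    """
    deadline = time.monotonic() + max_wait
    while True:
        rsp = client.describe(name)
        # A falsy response is treated as "not ready yet" and retried.
        if rsp and _status_of(rsp.output) == "SERVING":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


# Usage with a real client:
#   if not wait_until_serving(client, 'quickstart'):
#       raise TimeoutError('collection did not become ready in time')
```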
Delete Collection
Client.delete(name: str) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
name | str | Yes | The name of the Collection to delete. |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
client.delete('YOUR-COLLECTION-NAME')
Get a Collection Instance
Collection provides APIs for accessing Doc and Partition.
Client.get(name: str) -> Collection
Parameters | Type | Required | Description |
---|---|---|---|
name | str | Yes | The name of the Collection to get. |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
assert collection
Describe Collection Statistics
Collection.stats() -> DashVectorResponse
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
rsp = collection.stats()
print(rsp)
# example output:
# {
#   "request_id": "14448bcb-c9a3-49a8-9152-0de3990bce59",
#   "code": 0,
#   "message": "Success",
#   "output": {
#     "total_doc_count": "26",
#     "index_completeness": 1.0,
#     "partitions": {
#       "default": {
#         "total_doc_count": "26"
#       }
#     }
#   }
# }
Insert/Update/Upsert Docs
Collection.insert(
    docs: Union[Doc, List[Doc], Tuple, List[Tuple]],
    partition: Optional[str] = None,
    async_req: bool = False
) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
docs | Union[Doc, List[Doc], Tuple, List[Tuple]] | Yes | The docs to Insert/Update/Upsert. |
partition | Optional[str] | No | Name of the partition to Insert/Update/Upsert. Default value: None |
async_req | bool | No | Enable async request or not. Default value: False |
Example:
import dashvector
import numpy as np
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
# insert a doc with Tuple
collection.insert(('YOUR-DOC-ID1', [0.1, 0.2, 0.3, 0.4]))
collection.insert(('YOUR-DOC-ID2', [0.2, 0.3, 0.4, 0.5], {'age': 30, 'name': 'alice', 'anykey': 'anyvalue'}))
# insert a doc with dashvector.Doc
collection.insert(
    dashvector.Doc(
        id='YOUR-DOC-ID3',
        vector=[0.3, 0.4, 0.5, 0.6],
        fields={'foo': 'bar'}
    )
)
# insert in batches with Tuples
ret = collection.insert(
    [
        ('YOUR-DOC-ID4', [0.2, 0.7, 0.8, 1.3], {'age': 1}),
        ('YOUR-DOC-ID5', [0.3, 0.6, 0.9, 1.2], {'age': 2}),
        ('YOUR-DOC-ID6', [0.4, 0.5, 1.0, 1.1], {'age': 3})
    ]
)
# insert in batches with dashvector.Doc
ret = collection.insert(
    [
        dashvector.Doc(id=str(i), vector=np.random.rand(4)) for i in range(10)
    ]
)
# async insert: returns immediately; call get() to wait for the response
ret_future = collection.insert(
    [
        dashvector.Doc(id=str(i+10), vector=np.random.rand(4)) for i in range(10)
    ],
    async_req=True
)
ret = ret_future.get()
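For larger data sets, the batch insert shown above is usually wrapped in a loop that sends fixed-size chunks. The sketch below shows one way to do that. The helpers chunked and insert_in_batches are hypothetical, the default batch size of 16 simply mirrors the QuickStart, and the code assumes, as the other examples suggest, that a response is truthy on success. Any per-request size limit enforced by the service is not covered here.

```python
from itertools import islice


def chunked(items, size):
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def insert_in_batches(collection, docs, batch_size=16):
    """Insert docs batch by batch.

    Returns (number_of_docs_inserted, first_failed_response_or_None).
    Stops at the first batch whose response is falsy.
    """
    inserted = 0
    for batch in chunked(docs, batch_size):
        rsp = collection.insert(batch)
        if not rsp:
            return inserted, rsp
        inserted += len(batch)
    return inserted, None


# Usage with a real collection:
#   count, failed = insert_in_batches(collection, docs, batch_size=16)
```

Stopping at the first failure keeps the count meaningful, so a caller can resume from docs[count:] after inspecting the failed response.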
Query a Collection
Collection.query(
    vector: Optional[Union[List[Union[int, float]], np.ndarray]] = None,
    id: Optional[str] = None,
    topk: int = 10,
    filter: Optional[str] = None,
    include_vector: bool = False,
    partition: Optional[str] = None,
    output_fields: Optional[List[str]] = None,
    async_req: bool = False
) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
vector | Optional[Union[List[Union[int, float]], np.ndarray]] | No | The vector to query with. |
id | Optional[str] | No | The doc id to query by. When id is set, the vector stored under that id is used as the query vector. |
topk | int | No | Number of most similar results to return. Default value: 10 |
filter | Optional[str] | No | Expression used to filter results, e.g. age>20. Default value: None |
include_vector | bool | No | Return vector details or not. Default value: False |
partition | Optional[str] | No | Name of the partition to Query. Default value: None |
output_fields | Optional[List[str]] | No | List of field names to return, e.g. ['name', 'age']. Default value: None, which returns all fields. |
async_req | bool | No | Enable async request or not. Default value: False |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
match_docs = collection.query([0.1, 0.2, 0.3, 0.4], topk=100, filter='age>20', include_vector=True, output_fields=['age','name','foo'])
if match_docs:
    for doc in match_docs:
        print(doc.id)
        print(doc.vector)
        print(doc.fields)
        print(doc.score)
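The async_req flag can be used to send several queries without waiting for each one in turn. The sketch below issues one asynchronous query per vector and then collects the results in order. The helper query_many is hypothetical, and it assumes that an asynchronous query returns an object with a get() method, as the asynchronous insert example does.

```python
def query_many(collection, vectors, topk=10, **query_kwargs):
    """Issue one async query per vector, then wait for all results in order.

    Extra keyword arguments (for example filter or partition) are passed
    through to collection.query unchanged.
    """
    pending = [
        collection.query(vector, topk=topk, async_req=True, **query_kwargs)
        for vector in vectors
    ]
    # Assumption: each pending result exposes get(), which blocks until done.
    return [p.get() for p in pending]


# Usage with a real collection:
#   results = query_many(collection, [[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]],
#                        topk=5, filter='age>20')
#   for rsp in results:
#       if rsp:
#           print([doc.id for doc in rsp])
```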
Delete Docs
Collection.delete(
    ids: Union[str, List[str]],
    delete_all: bool = False,
    partition: Optional[str] = None,
    async_req: bool = False
) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
ids | Union[str, List[str]] | Yes | The id (or list of ids) for the Doc(s) to Delete |
delete_all | bool | No | Delete all vectors from partition. Default value: False |
partition | Optional[str] | No | Name of the partition to Delete from. Default value: None |
async_req | bool | No | Enable async request or not. Default value: False |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
collection.delete(['YOUR-DOC-ID1','YOUR-DOC-ID2'])
Fetch Docs
Collection.fetch(
    ids: Union[str, List[str]],
    partition: Optional[str] = None,
    async_req: bool = False
) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
ids | Union[str, List[str]] | Yes | The id (or list of ids) for the Doc(s) to Fetch |
partition | Optional[str] | No | Name of the partition to Fetch from. Default value: None |
async_req | bool | No | Enable async request or not. Default value: False |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
fetch_docs = collection.fetch(['YOUR-DOC-ID1', 'YOUR-DOC-ID2'])
if fetch_docs:
    for doc_id in fetch_docs:
        doc = fetch_docs[doc_id]
        print(doc.id)
        print(doc.vector)
        print(doc.fields)
Create Collection Partition
Collection.create_partition(name: str, timeout: Optional[int] = None) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
name | str | Yes | The name of the Partition to Create. |
timeout | Optional[int] | No | Timeout period (in seconds); -1 means the Partition is created asynchronously. Default value: None |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
rsp = collection.create_partition('YOUR-PARTITION-NAME')
assert rsp
Delete Collection Partition
Collection.delete_partition(name: str) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
name | str | Yes | The name of the Partition to Delete. |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
rsp = collection.delete_partition('YOUR-PARTITION-NAME')
assert rsp
List Collection Partitions
Collection.list_partitions() -> DashVectorResponse
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
partitions = collection.list_partitions()
assert partitions
for pt in partitions:
    print(pt)
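Listing and creating partitions can be combined into a small "create if missing" helper. The sketch below is one possible shape for it. The function ensure_partition is hypothetical, and it assumes that iterating over the list_partitions response yields partition names as plain strings, as the loop above suggests, and that responses are truthy on success.

```python
def ensure_partition(collection, name):
    """Create the partition only if it is not already listed.

    Returns True if a partition was created, False if it already existed.
    """
    existing = collection.list_partitions()
    if not existing:
        raise RuntimeError("could not list partitions")
    if name in list(existing):
        return False  # already present, nothing to create
    rsp = collection.create_partition(name)
    if not rsp:
        raise RuntimeError(f"could not create partition {name!r}")
    return True


# Usage with a real collection:
#   ensure_partition(collection, 'shoes')
#   collection.insert(docs, partition='shoes')
```

The check and the create are two separate requests, so two callers racing on the same name can both reach create_partition. Handle the failed response in that case rather than relying on the check alone.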
Describe Collection Partition
Collection.describe_partition(name: str) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
name | str | Yes | The name of the Partition to Describe. |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
rsp = collection.describe_partition('shoes')
print(rsp)
# example output:
# {"request_id":"296267a7-68e2-483a-87e6-5992d85a5806","code":0,"message":"","output":"SERVING"}
Statistics for Collection Partition
Collection.stats_partition(name: str) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
name | str | Yes | The name of the Partition to get Statistics. |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
rsp = collection.stats_partition('shoes')
print(rsp)
# example output:
# {
#   "code": 0,
#   "message": "",
#   "request_id": "330a2bcb-e4a7-4fc6-a711-2fe5f8a24e8c",
#   "output": {
#     "total_doc_count": 0
#   }
# }
Class
dashvector.Doc
@dataclass(frozen=True)
class Doc(object):
    id: str
    vector: Union[List[int], List[float], numpy.ndarray]
    fields: Optional[Dict[str, Union[str, int, float, bool]]] = None
    score: float = 0.0
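Because Doc is declared as a frozen dataclass, its attributes cannot be reassigned after construction. The sketch below demonstrates that behavior with a local stand-in class, DocSketch, so that it runs without the SDK. The stand-in mirrors the definition above but is not the real dashvector.Doc.

```python
from dataclasses import dataclass, replace, FrozenInstanceError
from typing import Dict, List, Optional, Union


@dataclass(frozen=True)
class DocSketch:
    """Local stand-in that mirrors the documented Doc fields."""
    id: str
    vector: List[float]
    fields: Optional[Dict[str, Union[str, int, float, bool]]] = None
    score: float = 0.0


doc = DocSketch(id="1", vector=[0.1, 0.2, 0.3, 0.4], fields={"age": 30})

try:
    doc.score = 0.5  # frozen dataclasses reject attribute assignment
    mutated = True
except FrozenInstanceError:
    mutated = False

# replace() builds a new instance instead of modifying the original.
updated = replace(doc, fields={"age": 31})
print(mutated, doc.fields, updated.fields)  # False {'age': 30} {'age': 31}
```

Freezing only blocks rebinding of attributes. The list and dict held by an instance are still mutable objects, so copy them before changing their contents if the original must stay intact.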
dashvector.DashVectorResponse
class DashVectorResponse(object):
    code: DashVectorCode
    message: str
    request_id: str
    output: Any
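The examples in this document check a response with assert rsp or if rsp:, which suggests that a response is truthy when the request succeeded. Building on that assumption, the sketch below turns a failed response into an exception that carries the code, message and request_id fields. The names DashVectorRequestError and unwrap are hypothetical and not part of the SDK.

```python
class DashVectorRequestError(RuntimeError):
    """Hypothetical exception type for this sketch; not defined by the SDK."""


def unwrap(rsp):
    """Return rsp.output on success, raise DashVectorRequestError otherwise.

    Assumption: a response evaluates to True only when the request succeeded.
    """
    if not rsp:
        raise DashVectorRequestError(
            f"request {rsp.request_id} failed with code {rsp.code}: {rsp.message}"
        )
    return rsp.output


# Usage with a real collection:
#   stats = unwrap(collection.stats())
```

Including request_id in the error message makes it easier to correlate a failure with service-side logs when reporting a problem.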
License
This project is licensed under the Apache License (Version 2.0).