Local lightweight AI Native database for RAG, including embedding vectors and text search for LLM generation

Project description

AwaDB - AI Native Database for embedding vectors

Easy to Use - No tedious database schema definition. No need to pay attention to vector indexing details.

Realtime Search - A lock-free realtime index keeps new data fresh with millisecond-level latency. No waiting, no manual operation.

Stability - AwaDB builds upon over 4 years of experience at JD.com running production workloads at scale using a system called Vearch, combined with best-of-breed ideas and practices from the community.

Run awadb locally on Mac OSX or Linux

First install awadb:

pip3 install awadb

Then use as below:

import awadb
# 1. Initialize awadb client!
awadb_client = awadb.Client()

# 2. Create table
awadb_client.Create("test_llm1") 

# 3. Add sentences; each sentence is embedded with SentenceTransformer by default
#    You can also embed the sentences yourself with OpenAI or other LLMs
awadb_client.Add([{'embedding_text':'The man is happy'}, {'source' : 'pic1'}])
awadb_client.Add([{'embedding_text':'The man is very happy'}, {'source' : 'pic2'}])
awadb_client.Add([{'embedding_text':'The cat is happy'}, {'source' : 'pic3'}])
awadb_client.Add([{'embedding_text':'The man is eating'}, {'source':'pic4'}])

# 4. Search for the top 3 sentences most similar to the specified query
query = "The man is happy"
results = awadb_client.Search(query, 3)

# Output the results
print(results)

Here the text is embedded by SentenceTransformer, which is supported by Hugging Face.
For more detailed usage of the local Python library, see here
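To make the idea behind a top-3 search concrete, here is a minimal brute-force sketch in plain Python. It does not use awadb and does not describe awadb's index. The 3-dimensional vectors are hypothetical stand-ins for real embeddings, and squared Euclidean distance is an assumption chosen only for illustration.

```python
import heapq

# Hypothetical 3-dimensional vectors standing in for real sentence embeddings.
documents = [
    {"source": "pic1", "vector": [0.9, 0.1, 0.0]},
    {"source": "pic2", "vector": [0.8, 0.2, 0.1]},
    {"source": "pic3", "vector": [0.1, 0.9, 0.0]},
    {"source": "pic4", "vector": [0.5, 0.0, 0.8]},
]


def squared_l2(a, b):
    """Sum of squared element-wise differences (smaller means closer)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query_vector, docs, k):
    """Return the k documents whose vectors are closest to the query."""
    return heapq.nsmallest(
        k, docs, key=lambda doc: squared_l2(query_vector, doc["vector"])
    )


# Assumed query vector; in practice it would come from the same embedding model.
nearest = top_k([0.9, 0.1, 0.0], documents, 3)
print([doc["source"] for doc in nearest])
```

A vector database replaces this linear scan with an index so that the search stays fast as the table grows.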

Run AwaDB as a service

If you are on the Windows platform or want an awadb service, you can download and deploy the awadb Docker image. For installation of the awadb Docker image, please see here

  • Python Usage

First, install gRPC and the awadb service Python client as below:

pip3 install grpcio
pip3 install awadb-client

A simple example is shown below:

# Import the package and module
from awadb_client import Awa

# Initialize awadb client
client = Awa()

# Add dict with vector to table 'example1'
client.add("example1", {'name':'david', 'feature':[1.3, 2.5, 1.9]})
client.add("example1", {'name':'jim', 'feature':[1.1, 1.4, 2.3]})

# Search
results = client.search("example1", [1.0, 2.0, 3.0])

# Output results
print(results)

# '_id' is the primary key of each document
# It can be specified explicitly when adding documents
# Here no '_id' field is specified, so it is generated by the awadb server
db_name: "default"
table_name: "example1"
results {
  total: 2
  msg: "Success"
  result_items {
    score: 0.860000074
    fields {
      name: "_id" 
      value: "64ddb69d-6038-4311-9118-605686d758d9"
    }
    fields {
      name: "name"
      value: "jim"
    }
  }
  result_items {
    score: 1.55
    fields {
      name: "_id"
      value: "f9f3035b-faaf-48d4-a947-801416c005b3"
    }
    fields {
      name: "name"
      value: "david"
    }
  }
}
result_code: SUCCESS
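The scores in the output above appear consistent with the squared Euclidean distance between the query vector and each stored vector, where a smaller score means a closer match. This is an observation from the numbers shown, not a documented statement about the metric the server uses. The sketch below recomputes the two values in plain Python.

```python
def squared_l2(a, b):
    """Sum of squared element-wise differences between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Query and stored vectors copied from the example above.
query = [1.0, 2.0, 3.0]
stored = {
    "david": [1.3, 2.5, 1.9],
    "jim": [1.1, 1.4, 2.3],
}

scores = {name: squared_l2(query, vector) for name, vector in stored.items()}

# Print closest first, rounded to two decimals.
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: {score:.2f}")
```

The values round to 0.86 for jim and 1.55 for david. The trailing digits in 0.860000074 look like single-precision rounding.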

More on the Python SDK for the service is here
More detailed quick start examples can be found here

  • RESTful Usage

# add documents to table 'test' of db 'default', no need to create table first
curl -H "Content-Type: application/json" -X POST -d '{"db":"default", "table":"test", "docs":[{"_id":1, "name":"lj", "age":23, "f":[1,0]},{"_id":2, "name":"david", "age":32, "f":[1,2]}]}' http://localhost:8080/add

# search documents by the vector field 'f' with the value '[1, 1]'
curl -H "Content-Type: application/json" -X POST -d '{"db":"default", "table":"test", "vector_query":{"f":[1, 1]}}' http://localhost:8080/search

A more detailed RESTful API description is here
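The same two RESTful calls can be sketched with the Python standard library. The host, port, paths, and JSON fields are copied from the curl examples. The helper names build_request and send are hypothetical, and the shape of the response is not assumed here. The requests are only built; sending them needs a running awadb service.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # host and port taken from the curl examples


def build_request(path, payload):
    """Build a JSON POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the raw response text (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


add_request = build_request("/add", {
    "db": "default",
    "table": "test",
    "docs": [
        {"_id": 1, "name": "lj", "age": 23, "f": [1, 0]},
        {"_id": 2, "name": "david", "age": 32, "f": [1, 2]},
    ],
})

search_request = build_request("/search", {
    "db": "default",
    "table": "test",
    "vector_query": {"f": [1, 1]},
})

# With an awadb service listening on localhost:8080:
# print(send(add_request))
# print(send(search_request))
```

Building the payload with json.dumps avoids hand-written JSON mistakes such as a missing comma between fields.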

What are Embeddings?

Any unstructured data (image/text/audio/video) can be transformed by AI (LLMs or other deep neural networks) into vectors, a form that computers can process.

For example, the sentence "The man is happy" can be transformed into a 384-dimension vector (a list of numbers [0.23, 1.98, ....]) by the SentenceTransformer language model. This process is called embedding.

More detailed information about embeddings can be found at OpenAI

Awadb uses Sentence Transformers to embed sentences by default, but you can also use OpenAI or other LLMs to do the embeddings according to your needs.
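As a rough illustration of the idea, the sketch below maps sentences to fixed-length vectors and compares them. The toy_embed function is a hypothetical word-hashing stand-in, not SentenceTransformer: it uses 8 dimensions instead of 384 and captures only shared words, not meaning. It shows only the shape of the process, text in and a list of numbers out.

```python
import math
import zlib

DIMENSIONS = 8  # toy size; the model described above produces 384 dimensions


def toy_embed(sentence):
    """Hypothetical embedding: count words into hashed buckets of a fixed-length vector."""
    vector = [0.0] * DIMENSIONS
    for word in sentence.lower().split():
        bucket = zlib.crc32(word.encode("utf-8")) % DIMENSIONS
        vector[bucket] += 1.0
    return vector


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


happy = toy_embed("The man is happy")
very_happy = toy_embed("The man is very happy")

print(len(happy))                            # fixed length regardless of sentence length
print(cosine_similarity(happy, very_happy))  # sentences sharing words score above 0
```

A real embedding model places sentences with similar meaning close together even when they share no words, which this toy version cannot do.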

Combined with LLMs (here LLaMa and ChatGLM) by LangChain

For examples of combining LLaMa or quantized Alpaca with llama.cpp to build a local knowledge database, please see here
For examples of combining ChatGLM to build a local knowledge database, please see here

Get involved

License

Apache 2.0

Community

Join the AwaDB community to share any problem, suggestion, or discussion with us:

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

awadb-0.3.13-cp311-cp311-manylinux1_x86_64.whl (2.5 MB) - CPython 3.11
awadb-0.3.13-cp311-cp311-macosx_13_0_x86_64.whl (2.2 MB) - CPython 3.11, macOS 13.0+ x86-64
awadb-0.3.13-cp310-cp310-manylinux1_x86_64.whl (2.5 MB) - CPython 3.10
awadb-0.3.13-cp310-cp310-macosx_13_0_x86_64.whl (2.2 MB) - CPython 3.10, macOS 13.0+ x86-64
awadb-0.3.13-cp39-cp39-manylinux1_x86_64.whl (2.5 MB) - CPython 3.9
awadb-0.3.13-cp39-cp39-macosx_13_0_x86_64.whl (2.1 MB) - CPython 3.9, macOS 13.0+ x86-64
awadb-0.3.13-cp38-cp38-manylinux1_x86_64.whl (2.5 MB) - CPython 3.8
awadb-0.3.13-cp38-cp38-macosx_13_0_x86_64.whl (2.2 MB) - CPython 3.8, macOS 13.0+ x86-64
awadb-0.3.13-cp37-cp37m-manylinux1_x86_64.whl (2.5 MB) - CPython 3.7m
awadb-0.3.13-cp37-cp37m-macosx_13_0_x86_64.whl (2.2 MB) - CPython 3.7m, macOS 13.0+ x86-64
