🔮 Super-power your database with AI 🔮

Bring AI to your favorite database!

Docs | Blog | Showcases | Live Jupyter Demo

🔮 SuperDuperDB is open-source: Leave a star ⭐️ to support the project!

📢 Important Announcement!

On the 21st of November, we are going to officially launch SuperDuperDB with the release of v0.1.

The release will include:

- Full integration of major SQL databases, including PostgreSQL, MySQL, SQLite, DuckDB, BigQuery, Snowflake, and many more.
- A massive overhaul of the docs.
- A revamped and modularized testing suite.

⭐️ Leave a star to be informed of more exciting updates!

SuperDuperDB is not another database. It is a framework that transforms your favorite database into an AI powerhouse:

- A single scalable AI deployment of all your models and AI APIs, including output computation (inference), that stays up-to-date because changing data is handled automatically and immediately.
- A model trainer that allows easy training and fine-tuning of models simply by querying the database.
- A feature store in which the model outputs are stored alongside the inputs in any data format.
- A fully functional vector database that allows easy generation of vector embeddings and vector indexes of your data with your preferred models and APIs.
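
The "always up-to-date" behaviour described above can be pictured with a small in-memory sketch. This is not SuperDuperDB's implementation: `ToyCollection`, `add_listener`, and the `_outputs` field are hypothetical names chosen for illustration, and the "model" is a plain word counter.

```python
from typing import Any, Callable, Dict, List


class ToyCollection:
    """In-memory stand-in for a database collection (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.documents: List[Dict[str, Any]] = []
        self._listeners: List[Callable[[Dict[str, Any]], None]] = []

    def add_listener(self, model_name: str, key: str, model: Callable[[Any], Any]) -> None:
        """Register a model that runs on `key` for existing and future documents."""

        def run(doc: Dict[str, Any]) -> None:
            # Store the model output next to its input, keyed by the model name.
            doc.setdefault("_outputs", {})[model_name] = model(doc[key])

        # Backfill documents that were inserted before the listener existed.
        for doc in self.documents:
            run(doc)
        self._listeners.append(run)

    def insert(self, doc: Dict[str, Any]) -> None:
        """Insert a document and immediately compute every registered model output."""
        self.documents.append(doc)
        for run in self._listeners:
            run(doc)


collection = ToyCollection()
collection.insert({"text": "hello world"})
collection.add_listener("word_count", key="text", model=lambda text: len(text.split()))
collection.insert({"text": "one two three"})

for doc in collection.documents:
    print(doc["text"], "->", doc["_outputs"]["word_count"])
```

Both the document inserted before the listener was registered and the one inserted afterwards end up with an output stored alongside the input, which is the feature-store idea in miniature.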

⚡ Integrations (more coming soon):

Build AI applications easily without needing to move your data to complex pipelines and specialized vector databases. Integrate AI and vector search directly with your database, including real-time inference and model training, all through a simple Python interface!

Datastores

Unlock the power of SuperDuperDB to connect and manage various types of data sources effortlessly!

[Datastore logos: 3 with Full Support, 5 Experimental]

AI Frameworks

Leverage SuperDuperDB to discover insights from your data using a variety of AI models!

[AI framework logos: 3 with Full Support]

AI APIs

Let SuperDuperDB make your applications smarter using a suite of ready-to-use AI models!

[AI API logos: 3 with Full Support]

🔥 Featured Examples

Try our ready-to-use notebooks live in your browser.

- Generative AI and chatbots
- Vector search
- Standard use cases (classification, regression, clustering, recommendation, etc.)
- Highly custom AI use cases and workflows with specialized models

Text-To-Image Search	Text-To-Video Search	Question the Docs


Semantic Search Engine	Classical Machine Learning	Cross-Framework Transfer Learning

🚀 Installation

1. Install SuperDuperDB via `pip` (~1 minute):

pip install superduperdb

2. Try SuperDuperDB via Docker (~2 minutes):

Need to install Docker? See the docs here.

docker run -p 8888:8888 superduperdb/demo:latest
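
After the `pip` step, a quick way to confirm that the package is visible to your interpreter is to ask the standard library for its installed version. The helper name `installed_version` is our own; only `importlib.metadata` is used, so nothing here depends on SuperDuperDB's API.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string such as the one you installed, or None if the install failed.
print(installed_version("superduperdb"))
```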

📚 Tutorial

In this tutorial, you will learn how to integrate, train, and manage AI models and APIs directly in your database, using your own data. Visit the docs to learn more.

- Deploy ML/AI models to your database:

Automatically compute outputs (inference) with your database in a single environment.

import pymongo
from sklearn.svm import SVC

from superduperdb import superduper
# Assumed import path: Collection's module has moved between releases,
# so check the docs for your installed version.
from superduperdb.backends.mongodb import Collection

# Make your db superduper!
db = superduper(pymongo.MongoClient().my_db)

# Model clients can be converted to SuperDuperDB objects with a simple wrapper.
model = superduper(SVC())

# Add the model to the database.
db.add(model)

# Predict on the selected data.
model.predict(X='input_col', db=db, select=Collection(name='test_documents').find({'_fold': 'valid'}))
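
To make the roles of `X` and `select` concrete, here is a minimal sketch that uses plain Python lists instead of a database. It is only an analogy under our own assumptions: `predict_in_place`, `threshold_model`, and the `prediction` field are hypothetical and do not reflect how SuperDuperDB stores outputs.

```python
from typing import Any, Callable, Dict, List

documents: List[Dict[str, Any]] = [
    {"input_col": 0.2, "_fold": "train"},
    {"input_col": 0.9, "_fold": "valid"},
    {"input_col": 0.4, "_fold": "valid"},
]


def predict_in_place(
    docs: List[Dict[str, Any]],
    X: str,
    model: Callable[[Any], Any],
    query: Dict[str, Any],
) -> int:
    """Run `model` on field `X` of every document matching `query`; return the match count."""
    selected = [d for d in docs if all(d.get(k) == v for k, v in query.items())]
    for doc in selected:
        # Write the prediction back onto the same document.
        doc["prediction"] = model(doc[X])
    return len(selected)


def threshold_model(value: float) -> int:
    """Toy classifier: 1 if the input is at least 0.5, else 0."""
    return int(value >= 0.5)


n_updated = predict_in_place(documents, "input_col", threshold_model, {"_fold": "valid"})
print(n_updated, documents)
```

Only the documents in the `valid` fold receive a prediction; the `train` document is left untouched.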

- Train models directly from your database.

Simply by querying your database, without additional ingestion and pre-processing:

import pymongo
from sklearn.svm import SVC

from superduperdb import superduper
# Assumed import path: Collection's module has moved between releases,
# so check the docs for your installed version.
from superduperdb.backends.mongodb import Collection

# Make your db superduper!
db = superduper(pymongo.MongoClient().my_db)

# Model clients can be converted to SuperDuperDB objects with a simple wrapper.
model = superduper(SVC())

# Train on the selected data.
model.train(X='input_col', y='target_col', db=db, select=Collection(name='test_documents').find({'_fold': 'valid'}))
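
The idea of training "simply by querying" can be sketched without any database: select rows with a filter, pull out the input and target fields, and fit a model on them. Everything below is a hypothetical illustration (`NearestCentroid1D` and `train_from_query` are our own names, and the classifier is a deliberately tiny stand-in for `SVC`).

```python
from collections import defaultdict
from typing import Any, Dict, List


class NearestCentroid1D:
    """Toy classifier: assigns the label whose mean input value is closest."""

    def fit(self, X: List[float], y: List[str]) -> "NearestCentroid1D":
        groups: Dict[str, List[float]] = defaultdict(list)
        for value, label in zip(X, y):
            groups[label].append(value)
        self.centroids_ = {label: sum(vals) / len(vals) for label, vals in groups.items()}
        return self

    def predict(self, X: List[float]) -> List[str]:
        return [
            min(self.centroids_, key=lambda label: abs(self.centroids_[label] - value))
            for value in X
        ]


def train_from_query(docs: List[Dict[str, Any]], model: Any, X: str, y: str, query: Dict[str, Any]) -> Any:
    """Fit `model` on fields `X` and `y` of the documents matching `query`."""
    rows = [d for d in docs if all(d.get(k) == v for k, v in query.items())]
    return model.fit([d[X] for d in rows], [d[y] for d in rows])


docs = [
    {"input_col": 1.0, "target_col": "low", "_fold": "train"},
    {"input_col": 2.0, "target_col": "low", "_fold": "train"},
    {"input_col": 8.0, "target_col": "high", "_fold": "train"},
    {"input_col": 9.0, "target_col": "high", "_fold": "train"},
    {"input_col": 7.5, "target_col": "high", "_fold": "valid"},
]

model = train_from_query(docs, NearestCentroid1D(), "input_col", "target_col", {"_fold": "train"})
predictions = model.predict([1.2, 7.5])
print(predictions)
```

The held-out `valid` row never reaches `fit`, so it can be used afterwards to check the model.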

- Vector-Search your data:

Use your existing favorite database as a vector search database, including model management and serving.

# First a "Listener" makes sure vectors stay up-to-date
indexing_listener = Listener(model=OpenAIEmbedding(), key='text', select=collection.find())

# This "Listener" is linked with a "VectorIndex"
db.add(VectorIndex('my-index', indexing_listener=indexing_listener))

# The "VectorIndex" may be used to search data. Items to be searched against are passed
# to the registered model and vectorized. No additional app layer is required.
db.execute(collection.like({'text': 'clothing item'}, 'my-index').find({'brand': 'Nike'}))
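
Conceptually, a `like(...)` followed by `find(...)` ranks documents by vector similarity and then applies an ordinary filter. The sketch below shows that idea with a toy bag-of-words embedding and cosine similarity. It is an illustration only, not SuperDuperDB's search implementation; `embed`, `cosine`, `like_then_find`, and `VOCABULARY` are hypothetical names.

```python
import math
from typing import Any, Dict, List, Sequence

VOCABULARY = ["shoe", "shirt", "phone"]


def embed(text: str) -> List[float]:
    """Toy embedding: count how often each vocabulary word appears in the text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCABULARY]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def like_then_find(
    docs: List[Dict[str, Any]], query_text: str, key: str, n: int, filter_: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Keep the `n` documents most similar to `query_text`, then apply an exact-match filter."""
    query_vector = embed(query_text)
    ranked = sorted(docs, key=lambda d: cosine(query_vector, embed(d[key])), reverse=True)[:n]
    return [d for d in ranked if all(d.get(k) == v for k, v in filter_.items())]


docs = [
    {"text": "running shoe", "brand": "Nike"},
    {"text": "cotton shirt", "brand": "Nike"},
    {"text": "smart phone", "brand": "Acme"},
    {"text": "leather shoe", "brand": "Acme"},
]

results = like_then_find(docs, "shoe", key="text", n=2, filter_={"brand": "Nike"})
print(results)
```

In a real deployment the embeddings would come from the registered model and be precomputed by the listener rather than recomputed per query as they are here.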

- Integrate AI APIs to work together with other models.

Use OpenAI, PyTorch, or Hugging Face models as embedding models for vector search.

# Create a ``VectorIndex`` instance with indexing listener as OpenAIEmbedding and add it to the database.
db.add(
    VectorIndex(
        identifier='my-index',
        indexing_listener=Listener(
            model=OpenAIEmbedding(identifier='text-embedding-ada-002'),
            key='abstract',
            select=Collection(name='wikipedia').find(),
        ),
    )
)
# The above also executes the embedding model (openai) with the select query on the key.

# Now we can use the vector-index to search via meaning through the wikipedia abstracts
cur = db.execute(
    Collection(name='wikipedia')
        .like({'abstract': 'philosophers'}, n=10, vector_index='my-index')
)

- Add a Llama 2 model directly into your database:

import torch
import transformers
from transformers import AutoTokenizer

# Assumed import path for SuperDuperDB's transformers wrapper;
# check the docs for your installed version.
from superduperdb.ext.transformers import Pipeline

model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Wrap the Hugging Face pipeline as a SuperDuperDB model.
model = Pipeline(
    identifier='my-sentiment-analysis',
    task='text-generation',
    preprocess=tokenizer,
    object=pipeline,
    torch_dtype=torch.float16,
    device_map="auto",
)

# You can easily predict on your collection documents.
model.predict(
    X=Collection(name='test_documents').find(),
    db=db,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=200
)
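
The `do_sample=True` and `top_k=10` arguments above ask the generator to sample each next token from the 10 highest-scoring candidates instead of always taking the single best one. A minimal sketch of top-k sampling, with made-up token scores and a hypothetical helper `sample_top_k`, might look like this:

```python
import math
import random
from typing import Dict


def sample_top_k(logits: Dict[str, float], k: int, rng: random.Random) -> str:
    """Sample one token from the `k` highest-scoring tokens, weighted by softmax."""
    top = sorted(logits.items(), key=lambda item: item[1], reverse=True)[:k]
    max_logit = top[0][1]
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(score - max_logit) for _, score in top]
    tokens = [token for token, _ in top]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Made-up scores for four candidate tokens.
logits = {"good": 2.0, "great": 1.5, "bad": 0.1, "terrible": -1.0}
rng = random.Random(0)
samples = [sample_top_k(logits, k=2, rng=rng) for _ in range(100)]
print(sorted(set(samples)))
```

With `k=2`, only the two best-scoring tokens can ever be drawn; with `k=1` the procedure reduces to greedy decoding.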

- Use model outputs as inputs to downstream models:

model.predict(
    X='input_col',
    db=db,
    select=coll.find().featurize({'X': '<upstream-model-id>'}),  # already registered upstream model-id
    listen=True,
)
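
The chaining idea can be sketched in plain Python: the stored output of an upstream model is substituted for the raw field before the downstream model runs. The `featurize` function below is our own simplified stand-in, not the library's method, and the `_outputs` layout is an assumption made for this example.

```python
from typing import Any, Dict, List


def upstream_length(text: str) -> int:
    """Upstream toy model: the number of characters in the text."""
    return len(text)


def featurize(docs: List[Dict[str, Any]], key: str, upstream_id: str) -> List[Dict[str, Any]]:
    """Return copies of `docs` in which `key` holds the stored output of the upstream model."""
    return [{**doc, key: doc["_outputs"][upstream_id]} for doc in docs]


def is_long(length: int) -> bool:
    """Downstream toy model: consumes the upstream output rather than the raw text."""
    return length > 20


texts = ["a short text", "a considerably longer piece of text"]
# Simulate outputs that an upstream model has already written next to its inputs.
docs = [{"X": text, "_outputs": {"length": upstream_length(text)}} for text in texts]

features = featurize(docs, "X", "length")
predictions = [is_long(doc["X"]) for doc in features]
print(predictions)
```

Because `featurize` returns copies, the original documents keep their raw `X` values and can still feed other models.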

🤝 Community & Getting Help

If you have any problems, questions, comments, or ideas:

- Join our Slack (we look forward to seeing you there).
- Search through our GitHub Discussions, or add a new question.
- Comment on an existing issue or create a new one.
- Help us improve SuperDuperDB by providing your valuable feedback here!
- Email us at gethelp@superduperdb.com.
- Feel free to contact a maintainer or community volunteer directly!

🌱 Contributing

There are many ways to contribute, and they are not limited to writing code. We welcome all contributions.

Please see our Contributing Guide for details.

❤️ Contributors

Thanks go to all of our wonderful contributors!

License

SuperDuperDB is open-source and intended to be a community effort, and it wouldn't be possible without your support and enthusiasm. It is distributed under the terms of the Apache 2.0 license. Any contribution made to this project will be subject to the same provisions.

Join Us

We are looking for nice people who are invested in the problem we are trying to solve to join us full-time. Find roles that we are trying to fill here!


Release history

Version	Release date
0.1.1	Feb 9, 2024
0.1.0	Dec 5, 2023
0.0.20	Dec 4, 2023
0.0.19	Dec 4, 2023
0.0.17	Dec 4, 2023
0.0.16 (this version)	Nov 9, 2023
0.0.15	Nov 1, 2023
0.0.14	Oct 27, 2023
0.0.13	Oct 19, 2023
0.0.12	Oct 10, 2023
0.0.11	Oct 10, 2023
0.0.10	Oct 9, 2023
0.0.9	Oct 6, 2023
0.0.8	Sep 29, 2023
0.0.7	Sep 14, 2023
0.0.6	Aug 30, 2023
0.0.5	Aug 15, 2023
0.0.4	Aug 2, 2023
0.0.3.dev2 (pre-release)	Jul 10, 2023

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

superduperdb-0.0.16.tar.gz (123.4 kB)

Uploaded Nov 9, 2023 Source

Built Distribution

superduperdb-0.0.16-py3-none-any.whl (150.9 kB)

Uploaded Nov 9, 2023 Python 3

Hashes for superduperdb-0.0.16.tar.gz

Algorithm	Hash digest
SHA256	`9a395e96fbf1cc8ab7ab18cc9c11bd9a23c84d3a72d0e12c25531a297e33b32e`
MD5	`8b18f7e1e9f79a1d12abee6c8acf3349`
BLAKE2b-256	`bbdd6e5ee28944f69b5989a189c51d7e2ebcb9c77a6e8508ce24414375eeb007`

Hashes for superduperdb-0.0.16-py3-none-any.whl

Algorithm	Hash digest
SHA256	`c149d57f73ae5abc085fc78890c294daf0a547a632071c8b9e6e762e0e1253e4`
MD5	`8e7589859f66a2d844c9606a36b460d0`
BLAKE2b-256	`bc733c9f813f4da189190dfc2255c966aa5c4b8c5e20148644351e81eee1620a`
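
The published digests can be used to check a downloaded file. The sketch below uses only the standard library; the helper names `sha256_of_file` and `matches_published_digest` are our own, and the file path in the usage comment assumes the archive sits in the current directory.

```python
import hashlib
from pathlib import Path
from typing import Union


def sha256_of_file(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Union[str, Path], expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example usage, assuming the source distribution was downloaded to the current directory:
# matches_published_digest(
#     "superduperdb-0.0.16.tar.gz",
#     "9a395e96fbf1cc8ab7ab18cc9c11bd9a23c84d3a72d0e12c25531a297e33b32e",
# )
```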
