EvaDB AI-Relational Database System

These details have not been verified by PyPI

Project description

EvaDB: Database System for AI Apps

EvaDB is a database system for building simpler and faster AI-powered applications. We aim to simplify the development and deployment of AI apps that operate on unstructured data (text documents, videos, PDFs, podcasts, etc.) and structured data (tables, vector indexes).

The high-level Python and SQL APIs allow beginners to use EvaDB in a few lines of code. Advanced users can define custom user-defined functions that wrap around any AI model or Python library. EvaDB is fully implemented in Python and licensed under an Apache license.

Quick Links

Features
Quick Start
Documentation
Community and Support
Twitter

Features

🔮 Build simpler AI-powered apps using Python functions or SQL queries
⚡️ 10x faster applications using AI-centric query optimization
💰 Save money spent on inference
🚀 First-class support for your custom deep learning models through user-defined functions
📦 Built-in caching to eliminate redundant model invocations across queries
⌨️ Integrations for PyTorch, Hugging Face, YOLO, and OpenAI models
🐍 Installable via pip and fully implemented in Python

Illustrative Applications

Here are some illustrative AI apps built using EvaDB (each notebook can be opened on Google Colab):

🔮 PrivateGPT
🔮 ChatGPT-based Video Question Answering
🔮 Querying PDF Documents
🔮 Analysing Traffic Flow with YOLO
🔮 Examining Emotions of Movie
🔮 Image Segmentation with Hugging Face

Documentation

- Documentation
  - The Getting Started page shows how you can use EvaDB for different AI tasks and how you can easily extend EvaDB to support your custom deep learning model through user-defined functions.
  - The User Guides section contains Jupyter Notebooks that demonstrate how to use various features of EvaDB. Each notebook includes a link to Google Colab, where you can run the code yourself.
- Join us on Slack
- Follow us on Twitter
- Roadmap

Quick Start

Step 1: Install EvaDB using pip. EvaDB supports Python versions >= 3.8:

pip install evadb
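
Before installing, it can help to confirm that the interpreter meets the stated minimum version. The following is a minimal sketch; the helper name `python_is_supported` is hypothetical and is not part of EvaDB.

```python
import sys

# Minimum version taken from the requirement stated above (Python >= 3.8)
MINIMUM_PYTHON = (3, 8)


def python_is_supported(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if a (major, minor, ...) version meets the minimum."""
    if version_info is None:
        # Default to the interpreter that is running this code
        version_info = sys.version_info
    # Compare only the major and minor components
    return tuple(version_info[:2]) >= tuple(minimum)
```

Calling `python_is_supported()` with no arguments checks the running interpreter; passing a tuple such as `(3, 7, 9)` checks an arbitrary version.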

Step 2: It's time to write an AI app.

import os

import evadb

# Grab an EvaDB cursor to load data into tables and run AI queries
cursor = evadb.connect().cursor()

# Load a collection of news videos into the 'news_videos' table
# This function returns a Pandas dataframe with the query's output
# In this case, the output dataframe indicates the number of loaded videos
cursor.load(
    file_regex="news_videos/*.mp4",
    format="VIDEO",
    table_name="news_videos"
).df()

# Define a function that wraps around your deep learning model
# Here, this function wraps around a speech-to-text model
# After registering the function, we can use the registered function in subsequent queries
cursor.create_function(
    udf_name="SpeechRecognizer",
    type="HuggingFace",
    task='automatic-speech-recognition',
    model='openai/whisper-base'
).df()

# EvaDB automatically extracts the audio from the video
# We only need to run the SpeechRecognizer function on the 'audio' column
# to get the transcript and persist it in a table called 'transcripts'
cursor.query(
    """CREATE TABLE transcripts AS
       SELECT SpeechRecognizer(audio) FROM news_videos;"""
).df()

# We next incrementally construct the ChatGPT query using EvaDB's Python API
# The query is based on the 'transcripts' table
# This table has a column called 'text' with the transcript text
query = cursor.table('transcripts')

# Since ChatGPT is a built-in function, we don't have to define it
# We can just directly use it in the query
# We need to set OPENAI_KEY as an environment variable
# The value below is a placeholder: replace it with your own OpenAI API key
os.environ["OPENAI_KEY"] = "<your-openai-api-key>"
query = query.select("ChatGPT('Is this video summary related to LLMs', text)")

# Finally, we run the query to get the results as a dataframe
# You can then post-process the dataframe using other Python libraries
response = query.df()

Incrementally build an AI query that chains together multiple models

Here is an AI query that analyses the emotions of actors in a clip from the movie Interstellar using multiple PyTorch models.

# Access the Interstellar movie clip table using a cursor
query = cursor.table("Interstellar")
# Get faces using a `FaceDetector` function
query = query.cross_apply("UNNEST(FaceDetector(data))", "Face(bounding_box, confidence)")
# Focus only on frames whose ids lie strictly between 100 and 200
query = query.filter("id > 100 AND id < 200")
# Get the emotions of the detected faces using an `EmotionDetector` function
query = query.select("id, bounding_box, EmotionDetector(Crop(data, bounding_box))")

# Run the query and get the query result as a dataframe
# At each of the above steps, you can run the query and see the output
# If you are familiar with SQL, you can get the SQL query with query.sql_query()
response = query.df()

EvaDB runs AI apps 10x faster using its AI-centric query optimizer.

Three key built-in optimizations are:

💾 Caching: EvaDB automatically caches and reuses model inference results.

⚡️ Parallel Query Execution: EvaDB runs the app in parallel on all the available hardware resources (CPUs and GPUs).

🎯 Model Ordering: EvaDB optimizes the order in which models are evaluated (e.g., runs the faster, more selective model first).
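
The three ideas above can be illustrated with a small, self-contained sketch in plain Python. This is not EvaDB's implementation: `cheap_filter`, `expensive_model`, and `run_query` are hypothetical stand-ins, `functools.lru_cache` plays the role of the inference cache, and a thread pool stands in for parallel execution.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

# Counts invocations of the cheap model (only ever called sequentially here)
calls = {"cheap": 0}


def cheap_filter(frame_id: int) -> bool:
    """Hypothetical fast, selective model: keeps one frame in ten."""
    calls["cheap"] += 1
    return frame_id % 10 == 0


@lru_cache(maxsize=None)
def expensive_model(frame_id: int) -> str:
    """Hypothetical slow model; lru_cache reuses results for repeated inputs."""
    return f"label-{frame_id % 3}"


def run_query(frame_ids):
    # Model ordering: run the cheap, selective model first so that the
    # expensive model only sees the frames that survive the filter
    survivors = [f for f in frame_ids if cheap_filter(f)]
    # Parallel execution: fan the expensive model out over a worker pool;
    # pool.map returns results in the same order as its input
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(expensive_model, survivors))
```

Running `run_query(range(100))` twice invokes the expensive model only on the ten surviving frames, and the second run is served from the cache; `expensive_model.cache_info()` reports the hit and miss counts.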

Architecture Diagram

This diagram presents the key components of EvaDB. EvaDB's AI-centric query optimizer takes a query as input and generates a query plan that is executed by the query engine. The query engine hits the relevant storage engines to quickly retrieve the data required for efficiently running the query:

- Structured data (SQL database system connected via SQLAlchemy).
- Unstructured media data (PDFs, videos, etc. on a cloud or local filesystem).
- Feature data (vector database system).
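
As a rough illustration of the flow described above, the sketch below routes each step of a query plan to a storage engine chosen by the kind of data it needs. Every name here (`PlanStep`, `STORAGE_ENGINES`, `execute_plan`, and the reader functions) is hypothetical; EvaDB's internal classes are not shown in this document and may look quite different.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical readers, one per kind of storage engine listed above
def read_structured(source: str) -> str:
    return f"rows from table {source}"


def read_unstructured(source: str) -> str:
    return f"media files under {source}"


def read_features(source: str) -> str:
    return f"vectors from index {source}"


STORAGE_ENGINES: Dict[str, Callable[[str], str]] = {
    "structured": read_structured,
    "unstructured": read_unstructured,
    "feature": read_features,
}


@dataclass(frozen=True)
class PlanStep:
    kind: str    # which storage engine the step needs
    source: str  # table, directory, or index name


def execute_plan(plan: List[PlanStep]) -> List[str]:
    """Run each step against the storage engine registered for its data kind."""
    results = []
    for step in plan:
        reader = STORAGE_ENGINES.get(step.kind)
        if reader is None:
            raise ValueError(f"no storage engine for kind {step.kind!r}")
        results.append(reader(step.source))
    return results
```

A dictionary dispatch keeps the engine-selection logic in one place, so supporting another kind of storage would only require registering one more reader.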

Screenshots

🔮 Traffic Analysis (Object Detection Model): source video and query result
🔮 PDF Question Answering (Question Answering Model): app
🔮 MNIST Digit Recognition (Image Classification Model): source video and query result
🔮 Movie Emotion Analysis (Face Detection + Emotion Classification Models): source video and query result
🔮 License Plate Recognition (Plate Detection + OCR Extraction Models): query result

Community and Support

👋 If you have general questions about EvaDB, want to say hello or just follow along, we'd like to invite you to join our Slack Community and to follow us on Twitter.

If you run into any problems or issues, please create a GitHub issue and we'll try our best to help.

Don't see a feature in the list? Search our issue tracker to see whether someone has already requested it. If so, add a comment explaining your use case; if not, open a new issue. We prioritize our roadmap based on user feedback, so we'd love to hear from you.

Contributing

EvaDB is the beneficiary of many contributors. All kinds of contributions to EvaDB are appreciated. To file a bug or to request a feature, please use GitHub issues. Pull requests are welcome.

For more information, see our contribution guide.

License

Project details

Release history

Version	Release date
0.3.9	Nov 19, 2023
0.3.8	Oct 18, 2023
0.3.7	Sep 30, 2023
0.3.6	Sep 21, 2023
0.3.4.post0	Sep 7, 2023
0.3.4	Sep 6, 2023
0.3.3	Aug 29, 2023
0.3.2 (this version)	Aug 26, 2023
0.3.1	Jun 30, 2023
0.3.0	Jun 28, 2023
0.2.15	Jun 26, 2023
0.2.14	Jun 24, 2023
0.2.14a0 (pre-release)	Jun 24, 2023
0.2.13	Jun 17, 2023
0.2.12	Jun 17, 2023
0.2.11	Jun 11, 2023
0.2.10	Jun 11, 2023
0.2.9	Jun 9, 2023
0.2.8	Jun 9, 2023
0.2.7	Jun 8, 2023
0.2.6	Jun 3, 2023
0.2.5	Jun 3, 2023
0.2.4	May 18, 2023
0.2.3.post0	May 13, 2023
0.2.3	May 12, 2023
0.2.1	Apr 25, 2023
0.2.0	Apr 17, 2023
0.1.6	Apr 5, 2023
0.1.5	Apr 3, 2023
0.1.4	Jan 28, 2023
0.1.3	Jan 2, 2023
0.1.2	Dec 17, 2022
0.1.1	Nov 21, 2022
0.1.0	Nov 12, 2022
0.0.12	Oct 19, 2022
0.0.10	Sep 22, 2022
0.0.9	Aug 14, 2022
0.0.8	Aug 13, 2022
0.0.6	Aug 5, 2022
0.0.5	Aug 5, 2022
0.0.3	Jul 31, 2022
0.0.2	Jun 20, 2022
0.0.1	May 23, 2022

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

evadb-0.3.2.tar.gz (353.1 kB)

Uploaded Aug 26, 2023 Source

Built Distribution

evadb-0.3.2-py3-none-any.whl (573.5 kB)

Uploaded Aug 26, 2023 Python 3

Hashes for evadb-0.3.2.tar.gz

Algorithm	Hash digest
SHA256	`b932c1954745e638d4b67737b1f5b842bef6965cd1c4f406c107520bae9063d5`
MD5	`706b18849c1d3c3d74806459278b54d3`
BLAKE2b-256	`e23fa9cca7069e2810d096f273c99ac5877c370cf1fabb1180f34e3562d35bb1`

Hashes for evadb-0.3.2-py3-none-any.whl

Algorithm	Hash digest
SHA256	`ad0cbab3b2e4d29f7743ec50707dded7b22d96ede81569204b8c1d475a3c47a5`
MD5	`0486cd1699217469594838f14d8206c2`
BLAKE2b-256	`0e13317ad8ca789acde6e522153ec349e9a2f58ba5531493d9693c640636ea93`
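
To check a downloaded file against the SHA256 digests listed above, the digest can be recomputed locally with Python's standard `hashlib` module. The following is a minimal sketch; the helper names `sha256_of` and `matches_published_hash` are hypothetical, and the file path is assumed to point at wherever the download was saved.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    # Published digests may be copied with stray whitespace or upper case
    return sha256_of(Path(path)) == expected_hex.strip().lower()
```

For example, `matches_published_hash("evadb-0.3.2.tar.gz", "b932c1954745e638d4b67737b1f5b842bef6965cd1c4f406c107520bae9063d5")` should return `True` for an intact copy of the source distribution saved in the current directory.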