Generate SQL queries from natural language

These details have not been verified by PyPI

Project links

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description


Vanna.AI - Personalized AI SQL Agent

Let Vanna.AI write your nasty SQL for you. Vanna is a Python-based AI SQL agent trained on your schema that writes complex SQL in seconds. Run pip install vanna to get started now.

https://github.com/vanna-ai/vanna-py/assets/7146154/61f5f0bf-ce03-47e2-ab95-0750b8df7b6f

An example

A business user asks you "who are the top 2 customers in each region?". Right in the middle of lunch. And they need it for a presentation this afternoon. 😡😡😡

The old way 😡 😫 💩

Simple question to ask, not so fun to answer. You spend over an hour a) finding the tables, b) figuring out the joins, c) looking up the syntax for ranking, d) putting this into a CTE, e) filtering by rank, and f) choosing the correct metrics. Finally, you come up with this ugly mess:

with ranked_customers as (SELECT c.c_name as customer_name,
  r.r_name as region_name,
  row_number() OVER (PARTITION BY r.r_name
     ORDER BY sum(l.l_quantity * l.l_extendedprice) desc) as rank	
     FROM   snowflake_sample_data.tpch_sf1.customer c join snowflake_sample_data.tpch_sf1.orders o
         ON c.c_custkey = o.o_custkey join snowflake_sample_data.tpch_sf1.lineitem l
         ON o.o_orderkey = l.l_orderkey join snowflake_sample_data.tpch_sf1.nation n
         ON c.c_nationkey = n.n_nationkey join snowflake_sample_data.tpch_sf1.region r
         ON n.n_regionkey = r.r_regionkey
             GROUP BY customer_name, region_name)
SELECT region_name,
       customer_name
FROM   ranked_customers
WHERE  rank <= 2;

And you had to skip your lunch. HANGRY!
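
The rank-then-filter pattern behind that query can be tried on a toy table. The following is only a sketch using Python's built-in sqlite3 module (it assumes an SQLite build with window functions, 3.25 or newer); the sales table, its columns, and the numbers are invented for illustration and are not the TPC-H data used above.

```python
import sqlite3

# Made-up rows standing in for the joined customer/order data.
rows = [
    ("ASIA", "alice", 120.0),
    ("ASIA", "alice", 200.0),
    ("ASIA", "bob", 300.0),
    ("ASIA", "carol", 50.0),
    ("EUROPE", "dave", 75.0),
    ("EUROPE", "erin", 10.0),
    ("EUROPE", "frank", 90.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, customer TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# 1) totals: aggregate sales per customer within each region.
# 2) ranked: number the customers inside each region, best first.
# 3) final SELECT: keep only ranks 1 and 2.
query = """
WITH totals AS (
    SELECT region, customer, SUM(amount) AS total_sales
    FROM sales
    GROUP BY region, customer
),
ranked AS (
    SELECT region, customer, total_sales,
           ROW_NUMBER() OVER (
               PARTITION BY region
               ORDER BY total_sales DESC
           ) AS rnk
    FROM totals
)
SELECT region, customer, total_sales
FROM ranked
WHERE rnk <= 2
ORDER BY region, rnk
"""
top_two = conn.execute(query).fetchall()
for row in top_two:
    print(row)
conn.close()
```

Splitting the aggregation and the ranking into two CTEs is a readability choice for this sketch; the query above does both in one step.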

The Vanna way 😍 🌟 🚀

With Vanna, you train up a custom model on your data warehouse, and simply enter this in your Jupyter Notebook -

import vanna as vn
vn.set_model('your-model')
vn.ask('who are the top 2 customers in each region?')

Vanna generates that nasty SQL above for you, runs it (locally & securely) and gives you back a DataFrame in seconds:

region_name	customer_name	total_sales
ASIA	Customer#000000001	68127.72
ASIA	Customer#000000002	65898.69
...

And you ate your lunch in peace. YUMMY!

How Vanna works

Vanna works in two easy steps:

1. Train a model on your data.
2. Ask questions.

When you ask a question, we use a custom model for your dataset to generate SQL, as shown in the diagram below. Your model's performance and accuracy depend on the quality and quantity of the training data you use to train it.

[Diagram: how Vanna works]

Why Vanna?

High accuracy on complex datasets.
- Vanna’s capabilities are tied to the training data you give it
- More training data means better accuracy for large and complex datasets
Secure and private.
- Your database contents are never sent to Vanna’s servers
- We only see the bare minimum - schemas & queries.
Isolated, custom model.
- You train a custom model specific to your database and your schema.
- Nobody else can use your model or view your model’s training data unless you choose to add members to your model or make it public
- We use a combination of third-party foundational models (OpenAI, Google) and our own LLM.
Self learning.
- As you use Vanna more, your model continuously improves as we augment your training data
Supports many databases.
- We have out-of-the-box support for Snowflake, BigQuery, and Postgres
- You can easily make a connector for any database
Pretrained models.
- If you’re a data provider you can publish your models for anyone to use
- As part of our roadmap, we are in the process of pre-training models for common datasets (Google Ads, Facebook Ads, etc.)
Choose your front end.
- Start in a Jupyter Notebook.
- Expose to business users via Slackbot, web app, Streamlit app, or Excel plugin.
- Even integrate into your web app for customers.

Getting started

You can start by training Vanna automatically (currently supported for Snowflake) or by adding training data manually.

Train with DDL Statements

If you prefer to train manually, you do not need to connect to a database. You can call the train function with other parameters, such as ddl:

vn.train(ddl="""
    CREATE TABLE IF NOT EXISTS my_table (
        id INT PRIMARY KEY,
        name VARCHAR(100),
        age INT
    )
""")

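Since manual training does not touch a database, nothing checks the DDL string before it becomes training data. One possible pre-check, sketched below, is to run the DDL against a throwaway in-memory SQLite database. The helper name ddl_parses is hypothetical and not part of Vanna. SQLite's dialect also differs from Snowflake, BigQuery, and Postgres, so treat a failure as a hint to look closer, not as a verdict.

```python
import sqlite3


def ddl_parses(ddl: str) -> bool:
    """Return True if SQLite accepts the DDL (hypothetical helper, not a Vanna API)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(ddl)
        return True
    except sqlite3.Error:
        # Syntax errors surface as sqlite3.OperationalError, a subclass of Error.
        return False
    finally:
        conn.close()


# An unquoted identifier with an underscore parses fine.
good_ddl = """
    CREATE TABLE IF NOT EXISTS my_table (
        id INT PRIMARY KEY,
        name VARCHAR(100),
        age INT
    )
"""

# An unquoted identifier containing a hyphen is rejected by the parser.
bad_ddl = "CREATE TABLE IF NOT EXISTS my-table (id INT PRIMARY KEY)"

print(ddl_parses(good_ddl))
print(ddl_parses(bad_ddl))
```
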
Train with Documentation

Sometimes you may want to add documentation about your business terminology or definitions.

vn.train(documentation="Our business defines OTIF score as the percentage of orders that are delivered on time and in full")

Train with SQL

You can also add SQL queries to your training data. This is useful if you already have some queries lying around: copy and paste them from your editor to begin generating new SQL.

vn.train(sql="SELECT * FROM my_table WHERE name = 'John Doe'")
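
If your existing queries live in one long script, you may want to split it into individual statements before adding them one at a time. The sketch below uses sqlite3.complete_statement from the standard library, which only checks that the text ends in a semicolon outside of any string literal; it does not validate the SQL. The helper name split_statements is hypothetical, and the sketch assumes each statement ends at the end of a line.

```python
import sqlite3
from typing import List


def split_statements(script: str) -> List[str]:
    """Split a SQL script into statements (hypothetical helper, not a Vanna API)."""
    statements = []
    buffer = ""
    for line in script.splitlines():
        buffer += line + "\n"
        # True once the buffer ends with a semicolon that is not inside a quote.
        if sqlite3.complete_statement(buffer):
            statements.append(buffer.strip())
            buffer = ""
    if buffer.strip():
        # Keep a trailing statement that lacks its final semicolon.
        statements.append(buffer.strip())
    return statements


script = """
SELECT * FROM my_table WHERE name = 'John; Doe';
SELECT id,
       age
FROM my_table;
"""

statements = split_statements(script)
for stmt in statements:
    print(stmt)
    print("---")
    # Each statement could then be passed to the call shown above:
    # vn.train(sql=stmt)
```

Note that the semicolon inside the quoted name does not end the first statement, which is the main reason not to split on ";" directly.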

Asking questions

vn.ask("What are the top 10 customers by sales?")

SELECT c.c_name as customer_name,
       sum(l.l_extendedprice * (1 - l.l_discount)) as total_sales
FROM   snowflake_sample_data.tpch_sf1.lineitem l join snowflake_sample_data.tpch_sf1.orders o
        ON l.l_orderkey = o.o_orderkey join snowflake_sample_data.tpch_sf1.customer c
        ON o.o_custkey = c.c_custkey
GROUP BY customer_name
ORDER BY total_sales desc limit 10;

	CUSTOMER_NAME	TOTAL_SALES
0	Customer#000143500	6757566.0218
1	Customer#000095257	6294115.3340
2	Customer#000087115	6184649.5176
3	Customer#000131113	6080943.8305
4	Customer#000134380	6075141.9635
5	Customer#000103834	6059770.3232
6	Customer#000069682	6057779.0348
7	Customer#000102022	6039653.6335
8	Customer#000098587	6027021.5855
9	Customer#000064660	5905659.6159


AI-generated follow-up questions:

What is the country name for each of the top 10 customers by sales?
How many orders does each of the top 10 customers by sales have?
What is the total revenue for each of the top 10 customers by sales?
What are the customer names and total sales for customers in the United States?
Which customers in Africa have returned the most parts with a gross value?
What are the total sales for the top 3 customers?
What are the customer names and total sales for the top 5 customers?
What are the total sales for customers in Europe?
How many customers are there in each country?

More resources

Release history

2.0.2

Feb 2, 2026

2.0.1

Nov 20, 2025

2.0.0

Nov 19, 2025

2.0.0rc1 pre-release

Nov 15, 2025

0.7.9

Apr 10, 2025

0.7.8

Apr 8, 2025

0.7.7

Apr 7, 2025

0.7.6

Feb 8, 2025

0.7.5

Oct 25, 2024

0.7.4

Oct 23, 2024

0.7.3

Sep 16, 2024

0.7.2

Sep 5, 2024

0.7.1

Aug 23, 2024

0.7.0

Aug 21, 2024

0.6.6

Aug 9, 2024

0.6.5

Aug 6, 2024

0.6.4

Jul 26, 2024

0.6.3

Jul 12, 2024

0.6.2

Jun 18, 2024

0.6.1

Jun 18, 2024

0.6.0

Jun 7, 2024

0.5.5

May 20, 2024

0.5.4

May 7, 2024

0.5.3

May 6, 2024

0.5.2

May 6, 2024

0.5.1

May 2, 2024

0.5.0

Apr 30, 2024

0.4.3

Apr 22, 2024

0.4.2

Apr 21, 2024

0.4.1

Apr 17, 2024

0.4.0

Apr 16, 2024

0.3.4

Apr 12, 2024

0.3.3

Apr 5, 2024

0.3.2

Apr 1, 2024

0.3.1

Mar 29, 2024

0.3.0

Mar 27, 2024

0.2.1

Mar 5, 2024

0.2.0

Mar 2, 2024

0.1.1

Feb 21, 2024

0.1.0

Feb 14, 2024

0.0.38

Jan 28, 2024

0.0.37

Jan 27, 2024

0.0.36

Jan 23, 2024

0.0.35

Jan 17, 2024

0.0.34

Jan 17, 2024

0.0.33

Jan 17, 2024

0.0.32

Jan 16, 2024

0.0.31

Jan 8, 2024

0.0.30

Dec 22, 2023

0.0.29

Dec 17, 2023

0.0.28

Nov 20, 2023

0.0.27

Oct 30, 2023

0.0.26

Sep 28, 2023

0.0.25

Sep 22, 2023

0.0.24

Sep 13, 2023

0.0.23

Aug 31, 2023

0.0.22

Aug 31, 2023

This version

0.0.21

Aug 4, 2023

0.0.20

Aug 4, 2023

0.0.19

Aug 3, 2023

0.0.18

Aug 2, 2023

0.0.17

Jul 29, 2023

0.0.16

Jul 28, 2023

0.0.15

Jul 25, 2023

0.0.14

Jul 22, 2023

0.0.13

Jul 21, 2023

0.0.12

Jul 20, 2023

0.0.11

Jul 20, 2023

0.0.10

Jul 19, 2023

0.0.9

Jul 19, 2023

0.0.8

Jul 12, 2023

0.0.7

Jul 7, 2023

0.0.6

Jul 7, 2023

0.0.5

Jul 7, 2023

0.0.4

Jul 1, 2023

0.0.3

Jun 23, 2023

0.0.2

Jun 21, 2023

0.0.1

May 13, 2023

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vanna-0.0.21.tar.gz (20.8 kB view details)

Uploaded Aug 4, 2023 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.


vanna-0.0.21-py3-none-any.whl (18.5 kB view details)

Uploaded Aug 4, 2023 Python 3

File details

Details for the file vanna-0.0.21.tar.gz.

File metadata

Download URL: vanna-0.0.21.tar.gz
Upload date: Aug 4, 2023
Size: 20.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.17

File hashes

Hashes for vanna-0.0.21.tar.gz
Algorithm	Hash digest
SHA256	`a08705ff83cb30fe8d43ed952b9702b2f8fcb3a68c53976400b6b20dd360c770`
MD5	`56914ba3d837638f9f3adc1b9e5e45d7`
BLAKE2b-256	`728e74fb0d00b90bc407897744817e96731fb7589e2b595b94b98b20b0ef73e0`
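
A downloaded file can be checked against the SHA256 digest listed above with Python's hashlib. This is a minimal sketch: the helper names are hypothetical, and the local path in the usage comment assumes you saved the file under its published name.

```python
import hashlib

# SHA256 digest published above for vanna-0.0.21.tar.gz.
EXPECTED_SHA256 = "a08705ff83cb30fe8d43ed952b9702b2f8fcb3a68c53976400b6b20dd360c770"


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare the file's digest with the expected hex string, ignoring case."""
    return sha256_of_file(path) == expected.lower()


# Usage, assuming the sdist sits in the current directory:
# print(matches_published_hash("vanna-0.0.21.tar.gz"))
```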


File details

Details for the file vanna-0.0.21-py3-none-any.whl.

File metadata

Download URL: vanna-0.0.21-py3-none-any.whl
Upload date: Aug 4, 2023
Size: 18.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.17

File hashes

Hashes for vanna-0.0.21-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fbc47e0ff9a9621d36f11584a89f08b8002179e6c968d9ba9fe812189968e02d`
MD5	`298b1274049fcf8285155630b0f74641`
BLAKE2b-256	`06d2131243620847989a10395f522acdbef439c4c85138cb66dff7da6f2b2817`


vanna 0.0.21
