Query local or remote data files with natural language queries powered by OpenAI and DuckDB.

qabot

Query local or remote files with natural language queries powered by langchain, gpt-3.5-turbo, and duckdb 🦆.

Can query Wikidata as well as local files.

Usage:

$ export OPENAI_API_KEY=sk-...
$ export QABOT_MODEL_NAME=gpt-4
$ qabot -q "How many Hospitals are there located in Beijing"
Total tokens 1773 approximate cost in USD: 0.05634

Result:
There are 39 hospitals located in Beijing.
 🚀 anything else I can help you with?: what are the star war films
Query: what are the star war films
Intermediate Steps: 
  Step 1

    wikidata
      SELECT DISTINCT ?film ?filmLabel WHERE { ?film wdt:P31 wd:Q11424; wdt:P179 wd:Q22092344. SERVICE wikibase:label { bd:serviceParam wikibase:language '[AUTO_LANGUAGE],en'. } } ORDER BY ?film

Total tokens 4099 approximate cost in USD: 0.13305


Result:
The Star Wars films are: 1. Star Wars: Episode I – The Phantom Menace, 2. Star Wars: Episode II – Attack of the Clones, 3. Star Wars: Episode III – Revenge of the Sith, 4. Star Wars: Episode IV – A
New Hope, 5. Star Wars: Episode V – The Empire Strikes Back, 6. Star Wars: Episode VI – Return of the Jedi, 7. Star Wars: Episode VII – The Force Awakens, 8. Star Wars: Episode VIII – The Last 
Jedi, and 9. Star Wars Episode IX: The Rise of Skywalker.
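
The "approximate cost" figures in the session above can be sanity-checked by hand. The sketch below assumes the GPT-4 (8K context) list prices of USD 0.03 per 1,000 prompt tokens and USD 0.06 per 1,000 completion tokens. The output above only reports total tokens, so the prompt/completion splits used here are inferred to be consistent with the printed totals; they are not values reported by qabot.

```python
from decimal import Decimal

# Assumed GPT-4 (8K context) list prices in USD per 1,000 tokens.
PROMPT_PRICE_PER_1K = Decimal("0.03")
COMPLETION_PRICE_PER_1K = Decimal("0.06")


def approximate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Return the approximate USD cost of one request under the assumed prices."""
    total_per_1k = (prompt_tokens * PROMPT_PRICE_PER_1K
                    + completion_tokens * COMPLETION_PRICE_PER_1K)
    return total_per_1k / 1000


# Inferred (hypothetical) split of the 1773 tokens: 1668 prompt + 105 completion.
print(approximate_cost(1668, 105))  # 0.05634
# Inferred (hypothetical) split of the 4099 tokens: 3763 prompt + 336 completion.
print(approximate_cost(3763, 336))  # 0.13305
```

Because prompt tokens dominate in both cases, long intermediate steps (table descriptions, query results fed back to the model) are likely what drives the cost up.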

Works on local CSV files:

And on remote CSV files:

$ qabot \
    -f https://www.stats.govt.nz/assets/Uploads/Environmental-economic-accounts/Environmental-economic-accounts-data-to-2020/renewable-energy-stock-account-2007-2020-csv.csv \
    -q "How many Gigawatt hours of generation was there for Solar resources in 2015 through to 2020?"

Even on (public) data stored in S3:

You can even load data from disk via the natural language query, but that doesn't always work...

"Load the file 'data/titanic_survival.parquet' into a table called 'raw_passengers'. Create a view of the raw passengers table for just the male passengers. What was the average fare for surviving male passengers?"

After a bit of back and forth with the model, it gets there:

The average fare for surviving male passengers from the 'male_passengers' view where the passenger survived is 40.82. I ran the query: SELECT AVG(Fare) FROM male_passengers WHERE Survived = 1 AND Sex = 'male'; The average fare for surviving male passengers is 40.82.

Quickstart

You need to set the OPENAI_API_KEY environment variable to your OpenAI API key, which you can create in your OpenAI account.

Install the qabot command line tool using pip/poetry:

$ pip install qabot

Then run the qabot command with either local files (-f my-file.csv) or a database connection string.

Note: if you want to use a database, you will need to install the relevant driver, e.g. pip install psycopg2-binary for Postgres.

Examples

Local CSV file/s

$ qabot -q "how many passengers survived by gender?" -f data/titanic.csv
🦆 Loading data from files...
Loading data/titanic.csv into table titanic...

Query: how many passengers survived by gender?
Result:
There were 233 female passengers and 109 male passengers who survived.


 🚀 any further questions? [y/n] (y): y

 🚀 Query: what was the largest family who did not survive? 
Query: what was the largest family who did not survive?
Result:
The largest family who did not survive was the Sage family, with 8 members.

 🚀 any further questions? [y/n] (y): n

Intermediate steps and database queries

Use the -v flag to see the intermediate steps and database queries.

Sometimes it takes a long route to get to the answer, but it's interesting to see how it gets there:

$ qabot -f data/titanic.csv -q "how many passengers survived by gender?" -v
🦆 Loading data from files...
Query: how many passengers survived by gender?
I need to check the columns in the 'titanic' table to see which ones contain gender and survival information.
Action: Describe Table
Action Input: titanic

Observation: titanic

┌─────────────┬─────────────┬─────────┬─────────┬─────────┬───────┐
│ column_name │ column_type │  null   │   key   │ default │ extra │
│   varchar   │   varchar   │ varchar │ varchar │ varchar │ int32 │
├─────────────┼─────────────┼─────────┼─────────┼─────────┼───────┤
│ PassengerId │ BIGINT      │ YES     │ NULL    │ NULL    │  NULL │
│ Survived    │ BIGINT      │ YES     │ NULL    │ NULL    │  NULL │
│ Pclass      │ BIGINT      │ YES     │ NULL    │ NULL    │  NULL │
│ Name        │ VARCHAR     │ YES     │ NULL    │ NULL    │  NULL │
│ Sex         │ VARCHAR     │ YES     │ NULL    │ NULL    │  NULL │
│ Age         │ DOUBLE      │ YES     │ NULL    │ NULL    │  NULL │
│ SibSp       │ BIGINT      │ YES     │ NULL    │ NULL    │  NULL │
│ Parch       │ BIGINT      │ YES     │ NULL    │ NULL    │  NULL │
│ Ticket      │ VARCHAR     │ YES     │ NULL    │ NULL    │  NULL │
│ Fare        │ DOUBLE      │ YES     │ NULL    │ NULL    │  NULL │
│ Cabin       │ VARCHAR     │ YES     │ NULL    │ NULL    │  NULL │
│ Embarked    │ VARCHAR     │ YES     │ NULL    │ NULL    │  NULL │
├─────────────┴─────────────┴─────────┴─────────┴─────────┴───────┤
│ 12 rows                                               6 columns │
└─────────────────────────────────────────────────────────────────┘

I need to create a view that only includes the columns I need for this question.
Action: Data Op
Action Input: 
        CREATE VIEW titanic_gender_survival AS
        SELECT Sex, Survived
        FROM titanic
Thought:

> Entering new AgentExecutor chain...
This is a valid SQL query creating a view. We can execute it directly.
Action: execute
Action Input: 
        CREATE VIEW titanic_gender_survival AS
        SELECT Sex, Survived
        FROM titanic
Observation: No output
Thought:The view has been created successfully. We can now query it.
Action: execute
Action Input: SELECT * FROM titanic_gender_survival LIMIT 5
Observation: 
┌─────────┬──────────┐
│   Sex   │ Survived │
│ varchar │  int64   │
├─────────┼──────────┤
│ male    │        0 │
│ female  │        1 │
│ female  │        1 │
│ female  │        1 │
│ male    │        0 │
└─────────┴──────────┘

Thought:The view has been created successfully and we can see the first 5 rows of the view. The final answer is the first 5 rows of the titanic_gender_survival view, showing the sex and survival status of passengers on the 
Titanic.
Final Answer: 
┌─────────┬──────────┐
│   Sex   │ Survived │
│ varchar │  int64   │
├─────────┼──────────┤
│ male    │        0 │
│ female  │        1 │
│ female  │        1 │
│ female  │        1 │
│ male    │        0 │
└─────────┴──────────┘

> Finished chain.

Observation: ┌─────────┬──────────┐
│   Sex   │ Survived │
│ varchar │  int64   │
├─────────┼──────────┤
│ male    │        0 │
│ female  │        1 │
│ female  │        1 │
│ female  │        1 │
│ male    │        0 │
└─────────┴──────────┘
I need to group the data by gender and count the number of survivors for each group.
Action: Data Op
Action Input: 
        SELECT Sex, SUM(Survived) AS num_survived
        FROM titanic_gender_survival
        GROUP BY Sex
Thought:

> Entering new AgentExecutor chain...
We need to check if the table titanic_gender_survival exists and if it has the columns Sex and Survived.
Action: Describe Table
Action Input: titanic_gender_survival
Observation: titanic_gender_survival

┌─────────────┬─────────────┬─────────┬─────────┬─────────┬───────┐
│ column_name │ column_type │  null   │   key   │ default │ extra │
│   varchar   │   varchar   │ varchar │ varchar │ varchar │ int32 │
├─────────────┼─────────────┼─────────┼─────────┼─────────┼───────┤
│ Sex         │ VARCHAR     │ YES     │ NULL    │ NULL    │  NULL │
│ Survived    │ BIGINT      │ YES     │ NULL    │ NULL    │  NULL │
└─────────────┴─────────────┴─────────┴─────────┴─────────┴───────┘

Thought:The table titanic_gender_survival exists and has the columns Sex and Survived. We can now run the query.
Action: execute
Action Input: 

SELECT Sex, SUM(Survived) AS num_survived FROM titanic_gender_survival GROUP BY Sex LIMIT 5


Observation: 
┌─────────┬──────────────┐
│   Sex   │ num_survived │
│ varchar │    int128    │
├─────────┼──────────────┤
│ male    │          109 │
│ female  │          233 │
└─────────┴──────────────┘

Thought:The query returned the number of survivors grouped by gender. The table titanic_gender_survival has been used. 
Final Answer: The number of survivors grouped by gender are: 
- 109 males survived
- 233 females survived.

> Finished chain.

Observation: The number of survivors grouped by gender are: 
- 109 males survived
- 233 females survived.
Intermediate Steps: 
  Step 1

    Describe Table
      titanic

      titanic

    ┌─────────────┬─────────────┬─────────┬─────────┬─────────┬───────┐
    │ column_name │ column_type │  null   │   key   │ default │ extra │
    │   varchar   │   varchar   │ varchar │ varchar │ varchar │ int32 │
    ├─────────────┼─────────────┼─────────┼─────────┼─────────┼───────┤
    │ PassengerId │ BIGINT      │ YES     │ NULL    │ NULL    │  NULL │
    │ Survived    │ BIGINT      │ YES     │ NULL    │ NULL    │  NULL │
    │ Pclass      │ BIGINT      │ YES     │ NULL    │ NULL    │  NULL │
    │ Name        │ VARCHAR     │ YES     │ NULL    │ NULL    │  NULL │
    │ Sex         │ VARCHAR     │ YES     │ NULL    │ NULL    │  NULL │
    │ Age         │ DOUBLE      │ YES     │ NULL    │ NULL    │  NULL │
    │ SibSp       │ BIGINT      │ YES     │ NULL    │ NULL    │  NULL │
    │ Parch       │ BIGINT      │ YES     │ NULL    │ NULL    │  NULL │
    │ Ticket      │ VARCHAR     │ YES     │ NULL    │ NULL    │  NULL │
    │ Fare        │ DOUBLE      │ YES     │ NULL    │ NULL    │  NULL │
    │ Cabin       │ VARCHAR     │ YES     │ NULL    │ NULL    │  NULL │
    │ Embarked    │ VARCHAR     │ YES     │ NULL    │ NULL    │  NULL │
    ├─────────────┴─────────────┴─────────┴─────────┴─────────┴───────┤
    │ 12 rows                                               6 columns │
    └─────────────────────────────────────────────────────────────────┘

    

  Step 2

    Data Op
      CREATE VIEW titanic_gender_survival AS
            SELECT Sex, Survived
            FROM titanic

      ┌─────────┬──────────┐
    │   Sex   │ Survived │
    │ varchar │  int64   │
    ├─────────┼──────────┤
    │ male    │        0 │
    │ female  │        1 │
    │ female  │        1 │
    │ female  │        1 │
    │ male    │        0 │
    └─────────┴──────────┘

    

  Step 3

    Data Op
      SELECT Sex, SUM(Survived) AS num_survived
            FROM titanic_gender_survival
            GROUP BY Sex

      The number of survivors grouped by gender are: 
    - 109 males survived
    - 233 females survived.

    


Thought:


Result:
109 males and 233 females survived.
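
For all the back and forth, the query the agent settles on is an ordinary aggregate over a view. A minimal sketch of the same view-then-aggregate steps follows. It uses Python's built-in sqlite3 module as a stand-in for DuckDB (an assumption made only to keep the example dependency-free; qabot itself runs its queries in DuckDB) and a handful of made-up rows rather than the real Titanic data. The ORDER BY clause is an addition to make the output order predictable.

```python
import sqlite3

# Made-up sample rows (Sex, Survived); NOT the real Titanic data.
SAMPLE_ROWS = [
    ("male", 0),
    ("female", 1),
    ("female", 1),
    ("female", 0),
    ("male", 1),
]


def survivors_by_gender(rows):
    """Create the view from the walkthrough and count survivors per gender."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE titanic (Sex TEXT, Survived INTEGER)")
        con.executemany("INSERT INTO titanic VALUES (?, ?)", rows)
        # Same view as in Step 2 of the intermediate steps.
        con.execute(
            "CREATE VIEW titanic_gender_survival AS "
            "SELECT Sex, Survived FROM titanic"
        )
        # Same aggregate as in Step 3, plus ORDER BY for a stable order.
        cursor = con.execute(
            "SELECT Sex, SUM(Survived) AS num_survived "
            "FROM titanic_gender_survival GROUP BY Sex ORDER BY Sex"
        )
        return dict(cursor.fetchall())
    finally:
        con.close()


print(survivors_by_gender(SAMPLE_ROWS))  # {'female': 2, 'male': 1}
```

Since Survived is 0 or 1, SUM(Survived) counts the survivors in each group directly; the intermediate view is not strictly needed, which is why the agent's route looks longer than necessary.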

Data accessed via http/s3

Use the -f <url> flag to load data from a URL, e.g. a CSV file on S3:

$ qabot -f s3://covid19-lake/enigma-jhu-timeseries/csv/jhu_csse_covid_19_timeseries_merged.csv -q "how many confirmed cases of covid are there?" -v
🦆 Loading data from files...
create table jhu_csse_covid_19_timeseries_merged as select * from 's3://covid19-lake/enigma-jhu-timeseries/csv/jhu_csse_covid_19_timeseries_merged.csv';

Result:
264308334 confirmed cases
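
The log line above suggests that each file is loaded into a table named after the file. A rough sketch of that idea is shown below. `table_name_for` and `create_table_sql` are hypothetical helpers written for illustration, not qabot's actual code, and the sanitising rule (replace anything that is not a letter, digit or underscore) is an assumption.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse


def table_name_for(location: str) -> str:
    """Derive a SQL-friendly table name from a file path or URL."""
    path = urlparse(location).path or location
    stem = PurePosixPath(path).stem  # file name without its extension
    name = re.sub(r"\W", "_", stem)  # assumed rule: non-word characters become "_"
    if not name or name[0].isdigit():
        name = f"t_{name}"  # avoid empty names and names starting with a digit
    return name


def create_table_sql(location: str) -> str:
    """Build a CREATE TABLE ... AS SELECT statement like the one logged above."""
    # Double any single quote so the location stays a valid SQL string literal.
    quoted = location.replace("'", "''")
    return f"create table {table_name_for(location)} as select * from '{quoted}';"


print(create_table_sql("data/titanic.csv"))
# create table titanic as select * from 'data/titanic.csv';
```

Applied to the S3 URL used above, the same helper yields the table name jhu_csse_covid_19_timeseries_merged, matching the logged statement.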

Ideas

  • Upgrade to use langchain chat interface
  • Use memory, perhaps wait for langchain's next release
  • A decent Python library API so it can be used from other Python code
  • streaming mode to output results as they come in
  • token limits
  • Supervisor agent - assess whether a query is "safe" to run, could ask for user confirmation to run anything that gets flagged.
  • Often we can zero-shot the question and get a single query out - perhaps we try this before the MRKL chain
  • test each zeroshot agent individually
  • Generate and pass back assumptions made to the user
  • Add an optional "clarify" tool to the chain that asks the user to clarify the question
  • Create a query checker tool that checks if the query looks valid and/or safe
  • Perhaps an explain query tool that shows the steps taken to get the answer
  • Store all queries, actions, and answers in a table
  • Optional settings to switch to different LLM
  • Inject AWS credentials into duckdb so we can access private resources in S3
  • caching
  • A version that uses document embeddings - probably not in this app as needs Torch
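
To make the "query checker" and "supervisor" ideas a little more concrete, here is a deliberately naive sketch that flags anything other than a single read-style statement, so the caller could ask the user for confirmation. `looks_read_only` is a hypothetical helper, not part of qabot, and the keyword list is an assumption. A keyword check like this is easy to fool (for example by semicolons or comment markers inside string literals), so a real checker would need a proper SQL parser.

```python
import re

# Statement types treated as read-only in this sketch (an assumption).
# Anything else, including WITH ... queries, is conservatively flagged.
READ_ONLY_KEYWORDS = ("select", "describe", "show")


def looks_read_only(sql: str) -> bool:
    """Return True if `sql` is one statement that starts with a read-only keyword."""
    # Drop "--" line comments and "/* */" block comments before inspecting.
    stripped = re.sub(r"--[^\n]*|/\*.*?\*/", " ", sql, flags=re.DOTALL)
    statements = [s.strip() for s in stripped.split(";") if s.strip()]
    if len(statements) != 1:
        return False  # empty input or several statements: flag for confirmation
    first_word = statements[0].split(None, 1)[0].lower()
    return first_word in READ_ONLY_KEYWORDS


print(looks_read_only("SELECT Sex, SUM(Survived) FROM titanic GROUP BY Sex"))  # True
print(looks_read_only("DROP TABLE titanic"))  # False
```

Note that the CREATE VIEW statement from the Titanic walkthrough would be flagged by this check, which fits the idea of asking for confirmation rather than silently blocking.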
