tableQA
A tool for answering natural-language questions over tabular data such as CSV files and Excel sheets.
Features
- Supports detection across multiple CSVs
- Supports fuzzy string matching, i.e., incomplete CSV values in a query are automatically detected and completed.
- Open-domain, no training required.
- Add a manual schema for a customized experience
- Auto-generates a schema when none is provided
- Data visualisations
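The fuzzy matching feature listed above can be pictured with a small sketch. It uses Python's standard difflib module rather than tableQA's own code, which is not shown here; the function name complete_value and the sample column values are hypothetical.

```python
from difflib import get_close_matches

def complete_value(fragment, column_values, cutoff=0.6):
    """Return the column value closest to a partial or misspelled fragment, or None."""
    # Compare case-insensitively, then map the match back to the original spelling.
    lowered = {value.lower(): value for value in column_values}
    matches = get_close_matches(fragment.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None

# Hypothetical distinct values of a "Cancer_site" column.
sites = ["Stomach", "Lung", "Colorectal"]
print(complete_value("stomac", sites))  # closest value to the incomplete fragment
print(complete_value("xyz", sites))     # nothing is similar enough
```

The cutoff controls how similar a fragment must be before it is replaced; a stricter value avoids rewriting words that only loosely resemble a column value.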
Configuration:
Install via pip:
pip install tableqa
Install from source:
git clone https://github.com/abhijithneilabraham/tableQA
cd tableQA
python setup.py install
Quickstart
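Run a sample query
import pandas as pd
from tableqa.agent import Agent
df = pd.read_csv("your_file.csv")  # placeholder path: load your own data here
agent = Agent(df)  # pass in your dataframe
response = agent.query_db("Your question here")
print(response)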
Get an SQL query from the question
sql = agent.get_query("Your question here")
print(sql)  # prints the generated SQL query
Adding Manual schema
Schema Format:
{
    "name": DATABASE NAME,
    "keywords": [DATABASE KEYWORDS],
    "columns":
    [
        {
            "name": COLUMN 1 NAME,
            "mapping": {
                CATEGORY 1: [CATEGORY 1 KEYWORDS],
                CATEGORY 2: [CATEGORY 2 KEYWORDS]
            }
        },
        {
            "name": COLUMN 2 NAME,
            "keywords": [COLUMN 2 KEYWORDS]
        },
        {
            "name": COLUMN 3 NAME,
            "keywords": [COLUMN 3 KEYWORDS],
            "summable": "True"
        }
    ]
}
- Mappings are for columns whose values fall into only a few distinct classes.
- Include only the columns that need manual keywords or mappings. The rest will be auto-generated.
- summable is included for numeric columns whose values already represent counts, e.g. Death Count or Cases.
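As an illustration of the format above, here is a filled-in schema written as a Python dictionary, together with a small sanity check. The table name, column names, and keyword lists are made up for this sketch, and check_schema is a hypothetical helper, not part of tableQA.

```python
import json

# Hypothetical schema; every name and keyword here is illustrative.
schema = {
    "name": "cancer_death",
    "keywords": ["cancer", "death"],
    "columns": [
        {
            # A column with few distinct classes uses "mapping".
            "name": "Cancer_site",
            "mapping": {
                "Stomach": ["stomach", "gastric"],
                "Lung": ["lung"],
            },
        },
        {"name": "Year", "keywords": ["year"]},
        # A numeric column whose values are already counts is marked summable.
        {"name": "Death_Count", "keywords": ["died", "deaths"], "summable": "True"},
    ],
}

def check_schema(schema):
    """Hypothetical helper: return a list of problems found in a schema dict."""
    problems = []
    if "name" not in schema:
        problems.append("schema has no name")
    for index, column in enumerate(schema.get("columns", [])):
        if "name" not in column:
            problems.append(f"column {index} has no name")
        if "mapping" not in column and "keywords" not in column:
            problems.append(f"column {index} has neither mapping nor keywords")
    return problems

print(check_schema(schema))          # an empty list means no problems were found
print(json.dumps(schema, indent=2))  # the same schema serialized as JSON
```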
Example (with manual schema):
Database query
from tableqa.agent import Agent
agent=Agent(df,schema) #pass the dataframe and schema objects
response=agent.query_db("how many people died of stomach cancer in 2011")
print(response)
# Response: [(22,)]
SQL query
sql=agent.get_query("How many people died of stomach cancer in 2011")
print(sql)
#sql query: SELECT SUM(Death_Count) FROM cancer_death WHERE Cancer_site = "Stomach" AND Year = "2011"
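To see why the response has the shape of a list of tuples, the generated query can be run against a small in-memory SQLite table. This is only a sketch: the rows are invented for illustration, the column types are assumed, the query is parameterized instead of using quoted literals, and nothing here claims to show how tableQA executes queries internally.

```python
import sqlite3

# Made-up rows for illustration only; they are not taken from the real dataset.
rows = [
    ("Stomach", "2011", 12),
    ("Stomach", "2011", 10),
    ("Lung", "2011", 30),
    ("Stomach", "2012", 7),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cancer_death (Cancer_site TEXT, Year TEXT, Death_Count INTEGER)"
)
conn.executemany("INSERT INTO cancer_death VALUES (?, ?, ?)", rows)

# Same query as the generated SQL, with the two values passed as parameters.
sql = "SELECT SUM(Death_Count) FROM cancer_death WHERE Cancer_site = ? AND Year = ?"
result = conn.execute(sql, ("Stomach", "2011")).fetchall()
print(result)  # one row with one column: the summed count
conn.close()
```

With these sample rows, the two matching entries (12 and 10) sum to 22, so the result is a single-element list containing a one-element tuple.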
Multiple CSVs
Pass the absolute paths of the directories containing the CSVs and the schemas, respectively. Refer to the cleaned_data and schema directories for examples.
Example
csv_path="/content/tableQA/tableqa/cleaned_data"
schema_path="/content/tableQA/tableqa/schema"
agent=Agent(csv_path,schema_path)
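Before passing the two directories, it can help to check which CSVs have a schema and which will fall back to auto-generation. The sketch below assumes each schema is a JSON file named after its CSV; that naming convention and the helper pair_csvs_with_schemas are assumptions made for illustration, not documented tableQA behavior.

```python
import tempfile
from pathlib import Path

def pair_csvs_with_schemas(csv_dir, schema_dir):
    """Hypothetical helper: map each CSV file to a same-named JSON schema, if one exists."""
    csv_dir, schema_dir = Path(csv_dir), Path(schema_dir)
    if not (csv_dir.is_absolute() and schema_dir.is_absolute()):
        raise ValueError("both directories must be given as absolute paths")
    pairs = {}
    for csv_file in sorted(csv_dir.glob("*.csv")):
        schema_file = schema_dir / (csv_file.stem + ".json")
        pairs[csv_file.name] = schema_file.name if schema_file.exists() else None
    return pairs

# Demonstration with throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    csv_dir = Path(root) / "cleaned_data"
    schema_dir = Path(root) / "schema"
    csv_dir.mkdir()
    schema_dir.mkdir()
    (csv_dir / "cancer_death.csv").write_text("Year,Death_Count\n2011,22\n")
    (csv_dir / "other.csv").write_text("a,b\n1,2\n")
    (schema_dir / "cancer_death.json").write_text("{}")
    pairs = pair_csvs_with_schemas(csv_dir, schema_dir)

print(pairs)  # CSVs without a matching schema map to None
```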
Join us
Join our Slack workspace.