Tool for querying natural language on tabular data

tableQA

A tool for running natural-language queries on tabular data such as CSV files and Excel sheets.

Features

  • Supports detection from multiple CSVs.
  • Supports fuzzy string matching, i.e., incomplete CSV values in a query are detected automatically and filled in.
  • Open-domain; no training required.
  • Add a manual schema for a customized experience.
  • Schemas are auto-generated when no schema is provided.
  • Data visualisations.
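
To make the fuzzy-matching feature concrete, here is a minimal sketch of how an incomplete value in a question could be completed against a column's known values. It uses Python's standard `difflib` as a stand-in technique; it is not tableQA's actual implementation, and `fill_value` and the `cancer_sites` list are hypothetical names chosen for illustration.

```python
import difflib

def fill_value(fragment, known_values, cutoff=0.6):
    """Return the known value closest to `fragment`, or None if nothing is close enough."""
    # Compare case-insensitively, but return the value with its original casing.
    by_lower = {v.lower(): v for v in known_values}
    matches = difflib.get_close_matches(
        fragment.lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[matches[0]] if matches else None

cancer_sites = ["Stomach", "Lung", "Breast"]  # hypothetical column values
print(fill_value("stomac", cancer_sites))  # Stomach
print(fill_value("xyz", cancer_sites))     # None
```

The `cutoff` threshold controls how aggressive the completion is; a higher value avoids filling in values that only loosely resemble the query term.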

Configuration:

Install via pip:

pip install tableqa

Install from source:

git clone https://github.com/abhijithneilabraham/tableQA

cd tableQA

python setup.py install

Quickstart

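Run a sample query

import pandas as pd
from tableqa.agent import Agent

df = pd.read_csv("your_data.csv")  # load your own table; the file name is a placeholder
agent = Agent(df)  # pass in your dataframe
response = agent.query_db("Your question here")
print(response)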

Get an SQL query from the question

sql = agent.get_query("Your question here")
print(sql)  # prints the generated SQL query

Adding a manual schema

Schema Format:
{
    "name": DATABASE NAME,
    "keywords": [DATABASE KEYWORDS],
    "columns":
    [
        {
            "name": COLUMN 1 NAME,
            "mapping": {
                CATEGORY 1: [CATEGORY 1 KEYWORDS],
                CATEGORY 2: [CATEGORY 2 KEYWORDS]
            }
        },
        {
            "name": COLUMN 2 NAME,
            "keywords": [COLUMN 2 KEYWORDS]
        },
        {
            "name": COLUMN 3 NAME,
            "keywords": [COLUMN 3 KEYWORDS],
            "summable": "True"
        }
    ]
}

  • Mappings are for columns whose values fall into only a few distinct classes.
  • Include only the columns that need manual keywords or mappings. The rest will be auto-generated.
  • summable is included for numeric columns whose values already represent counts, e.g. Death Count or Cases.
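
As an illustration of the format above, here is a sketch of a filled-in schema written as a Python dictionary. The table and column names (`cancer_death`, `Cancer_site`, `Year`, `Death_Count`) are borrowed from the SQL output in the example that follows; the keyword lists are assumptions made up for this sketch, and `check_schema` is a hypothetical helper that is not part of tableQA.

```python
import json

# Keyword lists below are illustrative guesses, not taken from the project.
schema = {
    "name": "cancer_death",
    "keywords": ["cancer", "death"],
    "columns": [
        {
            "name": "Cancer_site",
            # Few distinct classes, so a mapping is used instead of keywords.
            "mapping": {
                "Stomach": ["stomach", "gastric"],
                "Lung": ["lung"],
            },
        },
        {"name": "Year", "keywords": ["year"]},
        # Values are already counts, so the column is marked summable.
        {"name": "Death_Count", "keywords": ["died", "deaths"], "summable": "True"},
    ],
}

def check_schema(schema):
    """Hypothetical helper (not part of tableqa): list deviations from the format."""
    problems = []
    for key in ("name", "keywords", "columns"):
        if key not in schema:
            problems.append(f"missing top-level key: {key}")
    for col in schema.get("columns", []):
        if "name" not in col:
            problems.append("column without a name")
        if "mapping" not in col and "keywords" not in col:
            problems.append(f"column {col.get('name')} has neither keywords nor mapping")
    return problems

print(check_schema(schema))  # []
print(json.dumps(schema, indent=2))  # the same schema as JSON text
```

Running a check like this before handing the schema to the agent catches missing keys early; the JSON dump is convenient if the schema is to be stored as a file.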

Example (with manual schema):

Database query:

from tableqa.agent import Agent
agent = Agent(df, schema)  # pass the dataframe and schema objects
response = agent.query_db("how many people died of stomach cancer in 2011")
print(response)
# Response: [(22,)]

SQL query:

sql = agent.get_query("How many people died of stomach cancer in 2011")
print(sql)
# SQL query: SELECT SUM(Death_Count) FROM cancer_death WHERE Cancer_site = "Stomach" AND Year = "2011"
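
To show what a response such as `[(22,)]` represents, and why a summable column leads to `SUM` rather than a row count, the sketch below runs an equivalent query with Python's built-in `sqlite3` module. The rows are invented purely for illustration, the query uses bound parameters instead of the quoted literals shown above, and nothing here is claimed to mirror how tableQA executes queries internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cancer_death (Cancer_site TEXT, Year INTEGER, Death_Count INTEGER)"
)
# Invented sample rows; each row already holds a count of deaths.
rows = [
    ("Stomach", 2011, 10),
    ("Stomach", 2011, 12),
    ("Stomach", 2012, 7),
    ("Lung", 2011, 30),
]
conn.executemany("INSERT INTO cancer_death VALUES (?, ?, ?)", rows)

where = "FROM cancer_death WHERE Cancer_site = ? AND Year = ?"
params = ("Stomach", 2011)

# Summing the count column answers "how many people died".
total = conn.execute(f"SELECT SUM(Death_Count) {where}", params).fetchall()
# Counting rows would only say how many records matched.
matched = conn.execute(f"SELECT COUNT(*) {where}", params).fetchall()

print(total)    # [(22,)]
print(matched)  # [(2,)]
conn.close()
```

The result is a list of row tuples, which is why a single aggregate comes back as a one-element tuple inside a list.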

Multiple CSVs

Pass the absolute paths of the directories containing the CSVs and the schemas, respectively. See the cleaned_data and schema directories for examples.

Example
csv_path="/content/tableQA/tableqa/cleaned_data"
schema_path="/content/tableQA/tableqa/schema"
agent=Agent(csv_path,schema_path)
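
Because the agent expects absolute directory paths, a small helper can resolve and verify them before the agent is constructed. This is a sketch under assumptions: `resolve_data_dirs` is a hypothetical helper, and it assumes the two directories are named `cleaned_data` and `schema` and sit under a common base directory, as in the example above. A temporary directory stands in for a real checkout so the snippet runs on its own.

```python
import tempfile
from pathlib import Path

def resolve_data_dirs(base_dir):
    """Hypothetical helper: return absolute (csv_path, schema_path) under base_dir."""
    base = Path(base_dir).resolve()
    csv_dir = base / "cleaned_data"
    schema_dir = base / "schema"
    for d in (csv_dir, schema_dir):
        if not d.is_dir():
            raise FileNotFoundError(f"expected directory not found: {d}")
    return str(csv_dir), str(schema_dir)

# Demonstration with a throwaway directory layout.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "cleaned_data").mkdir()
    (Path(tmp) / "schema").mkdir()
    csv_path, schema_path = resolve_data_dirs(tmp)
    print(Path(csv_path).is_absolute(), Path(schema_path).is_absolute())
    # With tableqa installed and real data in place:
    # agent = Agent(csv_path, schema_path)
```

Failing early with `FileNotFoundError` makes a mistyped or relative path easier to diagnose than an error raised later while loading the data.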

Join us

Join our Slack workspace: Slack

