Tool for querying natural language on tabular data

tableQA

A tool for running natural-language queries on tabular data such as CSV files and Excel sheets.

Features

  • Supports detection across multiple CSVs.
  • Supports fuzzy string matching: incomplete CSV values in a query are detected automatically and completed.
  • Open-domain; no training required.
  • Accepts a manual schema for a customized experience.
  • Auto-generates schemas when none is provided.
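
To give a feel for the fuzzy-matching feature, here is a minimal sketch of how an incomplete value in a question could be completed against the values found in a CSV column. It uses the standard library's difflib and is only an illustration of the idea, not tableQA's actual implementation; the function name complete_value, the cutoff of 0.6, and the sample values are all assumptions made for this example.

```python
import difflib

def complete_value(fragment, known_values, cutoff=0.6):
    """Return the known value closest to `fragment`, or None if nothing is close enough."""
    # Compare case-insensitively, then map the match back to its original spelling.
    lowered = {value.lower(): value for value in known_values}
    matches = difflib.get_close_matches(
        fragment.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None

# Made-up column values, standing in for the distinct values of a CSV column.
cancer_sites = ["Stomach", "Lung", "Pancreas", "Colorectal"]

print(complete_value("stomac", cancer_sites))  # Stomach
print(complete_value("xyz", cancer_sites))     # None
```

A similarity cutoff keeps unrelated words from being forced onto a column value; the right threshold would depend on the data.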

Installation:

Install via pip:

pip install tableqa

Install from source:

git clone https://github.com/abhijithneilabraham/tableQA

cd tableQA

python setup.py install

Quickstart

Getting an SQL query from a CSV

from tableqa.agent import Agent
data_dir = "/absolute/path/to/data"  # placeholder: absolute path of the directory containing the CSVs
agent = Agent(data_dir)
print(agent.get_query("Your question here"))  # returns an SQL query

Running a sample query on the database

response = agent.query_db("Your question here")
print("Response ={}".format(response))  # returns the result of the SQL query after the CSV has been loaded into the database
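
The step of feeding a CSV to a database and then running the generated SQL against it can be pictured with the standard library alone. The sketch below is an assumption about the general approach, not a description of tableQA's internals: the helper load_csv_into_sqlite, the in-memory SQLite database, and the sample rows are all made up for illustration.

```python
import csv
import io
import sqlite3

# Made-up sample data standing in for a CSV file in the data directory.
CSV_TEXT = """Cancer_site,Year,Death_Count
Stomach,2011,5
Stomach,2012,7
Lung,2011,11
"""

def load_csv_into_sqlite(csv_text, table_name):
    """Create an in-memory SQLite table from CSV text and return the connection."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table_name}" ({columns})')
    conn.executemany(f'INSERT INTO "{table_name}" VALUES ({placeholders})', data)
    return conn

conn = load_csv_into_sqlite(CSV_TEXT, "cancer_death")

# A query of the kind get_query returns, with the values passed as parameters.
sql = "SELECT SUM(Death_Count) FROM cancer_death WHERE Cancer_site = ? AND Year = ?"
result = conn.execute(sql, ("Stomach", "2011")).fetchall()
print(result)  # a list with one tuple, the same shape as query_db's response
```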

Adding Manual schema

Provide the directory containing the schemas for the respective CSVs; each schema file must have the same filename as its CSV. Refer to cleaned_data and schema for examples.

Schema Format:
{
    "name": DATABASE NAME,
    "keywords": [DATABASE KEYWORDS],
    "columns":
    [
        {
            "name": COLUMN 1 NAME,
            "mapping": {
                CATEGORY 1: [CATEGORY 1 KEYWORDS],
                CATEGORY 2: [CATEGORY 2 KEYWORDS]
            }
        },
        {
            "name": COLUMN 2 NAME,
            "keywords": [COLUMN 2 KEYWORDS]
        },
        {
            "name": COLUMN 3 NAME,
            "keywords": [COLUMN 3 KEYWORDS],
            "summable": "True"
        }
    ]
}

  • Mappings are for columns whose values fall into only a few distinct classes.
  • Include only the columns that need manual keywords or mappings. The rest will be autogenerated.
  • summable is for numeric columns whose values already represent counts, e.g. Death Count or Cases.
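
As a concrete instance of the format above, the following sketch fills in the template for a cancer_death table and writes it to a schema directory. Several things here are assumptions rather than facts from the project: the keyword lists are invented, the schema is assumed to be stored as a JSON file named after its CSV (cancer_death.json), and check_schema is a hypothetical sanity check written for this example, not part of tableQA.

```python
import json
import os
import tempfile

schema = {
    "name": "cancer_death",
    "keywords": ["cancer", "death"],  # invented keywords
    "columns": [
        {
            # Few distinct classes, so a mapping from category to keywords.
            "name": "Cancer_site",
            "mapping": {
                "Stomach": ["stomach", "gastric"],
                "Lung": ["lung", "pulmonary"],
            },
        },
        {"name": "Year", "keywords": ["year"]},
        {
            # Values are already counts, so the column is marked summable.
            "name": "Death_Count",
            "keywords": ["died", "deaths"],
            "summable": "True",
        },
    ],
}

def check_schema(schema):
    """Return a list of problems found in a schema dict (empty if none)."""
    problems = []
    for key in ("name", "columns"):
        if key not in schema:
            problems.append(f"missing top-level key: {key}")
    for column in schema.get("columns", []):
        if "name" not in column:
            problems.append("column without a name")
        if "keywords" not in column and "mapping" not in column:
            problems.append(
                f"column {column.get('name')!r} has neither keywords nor mapping"
            )
    return problems

# Write the schema next to nothing else, in a temporary schema directory.
schema_dir = tempfile.mkdtemp()
schema_path = os.path.join(schema_dir, "cancer_death.json")
with open(schema_path, "w") as f:
    json.dump(schema, f, indent=4)

with open(schema_path) as f:
    reloaded = json.load(f)
print(check_schema(reloaded))  # []
```

Columns left out of the schema, per the notes above, would be handled by auto-generation.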

Example (with manual schema):

SQL query:

from tableqa.agent import Agent
agent = Agent(data_dir, schema_dir)
print(agent.get_query("How many people died of stomach cancer in 2011"))
# SQL query: SELECT SUM(Death_Count) FROM cancer_death WHERE Cancer_site = "Stomach" AND Year = "2011"

Database query:

response = agent.query_db("how many people died of stomach cancer in 2011")
print("Response ={}".format(response))  # returns the result of the SQL query after the CSV has been loaded into the database
# Response =[(22,)]
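
The generated query uses SUM(Death_Count) because that column is marked summable: each row already holds a count, so counting rows would give a different answer. The sketch below shows the difference in plain SQLite. The rows are made up for illustration and are not the project's data, and the assumption that a non-summable column would lead to a row count instead is an inference from the schema notes, not something the source states.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cancer_death (Cancer_site TEXT, Year TEXT, Death_Count INTEGER)"
)
# Made-up rows: two records for stomach cancer in 2011, one for lung cancer.
conn.executemany(
    "INSERT INTO cancer_death VALUES (?, ?, ?)",
    [("Stomach", "2011", 4), ("Stomach", "2011", 9), ("Lung", "2011", 30)],
)

where = "WHERE Cancer_site = 'Stomach' AND Year = '2011'"

# Counting rows answers "how many records match", not "how many people died".
row_count = conn.execute(f"SELECT COUNT(*) FROM cancer_death {where}").fetchone()[0]

# Summing the count column adds up the deaths recorded in the matching rows.
total_deaths = conn.execute(
    f"SELECT SUM(Death_Count) FROM cancer_death {where}"
).fetchone()[0]

print(row_count, total_deaths)  # 2 13
```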

