Structured and Unstructured Query Language (SUQL) Python API
SUQL (Structured and Unstructured Query Language)
Conversational Search over Structured and Unstructured Data with LLMs
Online demo: https://yelpbot.genie.stanford.edu
What is SUQL
SUQL stands for Structured and Unstructured Query Language. It augments SQL with several important free-text primitives for a precise, succinct, and expressive representation. It can be used to build chatbots for relational data sources that contain both structured and unstructured information. Just as text-to-SQL has seen great success, SUQL can serve as the semantic parsing target language for hybrid databases.
Several important features:
- SUQL seamlessly integrates retrieval models, LLMs, and traditional SQL to deliver a clean, effective interface for hybrid data access;
- It uses the techniques suited to each component: retrieval models and LLMs for unstructured data, and relational SQL for structured data;
- Free-text fields are indexed with faiss, which natively supports dense vector processing methods such as product quantization and HNSW;
- A series of important optimizations to minimize expensive LLM calls;
- Scalability to large databases with PostgreSQL;
- Support for general SQL, e.g. JOINs and GROUP BYs.
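The list above does not spell out how expensive LLM calls are minimized. As a rough, hypothetical illustration of the general idea (not SUQL's actual implementation), the sketch below evaluates the cheap structured filter first, invokes a stand-in for the LLM only on rows that pass it, and stops as soon as the requested limit is met. The names `fake_llm_answer` and `lazy_filter`, the keyword check, and the sample rows are all invented for this example.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]

# Counter that records how many "expensive" calls were made.
llm_calls = 0


def fake_llm_answer(text: str, question: str) -> str:
    """Hypothetical stand-in for an LLM call; a keyword check replaces the model."""
    global llm_calls
    llm_calls += 1
    return "Yes" if "family" in text.lower() else "No"


def lazy_filter(
    rows: Iterable[Row],
    structured_pred: Callable[[Row], bool],
    question: str,
    limit: int,
) -> List[Row]:
    """Apply the cheap predicate first, the costly one second, and stop at `limit`."""
    results: List[Row] = []
    for row in rows:
        if not structured_pred(row):
            continue  # rejected without spending an LLM call
        if fake_llm_answer(str(row["reviews"]), question) == "Yes":
            results.append(row)
            if len(results) >= limit:
                break  # enough rows found; remaining rows are never sent to the LLM
    return results


ROWS: List[Row] = [
    {"name": "A", "rating": 3.0, "reviews": "Great for family dinners."},
    {"name": "B", "rating": 4.5, "reviews": "Loud bar, late-night crowd."},
    {"name": "C", "rating": 4.2, "reviews": "Family friendly with a kids menu."},
    {"name": "D", "rating": 4.8, "reviews": "Perfect family brunch spot."},
]

matches = lazy_filter(ROWS, lambda r: r["rating"] >= 4.0, "Is it family-friendly?", limit=1)
print([m["name"] for m in matches], "LLM calls:", llm_calls)
```

With these sample rows, only two of the four reviews reach the stand-in LLM: row A is rejected by the rating filter, and row D is skipped because the limit is already satisfied.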
The answer function
One important component of SUQL is the answer function, which allows constraints from free text to be easily combined with structured constraints.
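As a hedged illustration, a query of roughly the following shape mixes both kinds of constraint. The table and column names (`restaurants`, `cuisines`, `rating`, `reviews`) are assumptions made for this sketch, and the snippet only builds and inspects the query string; it does not call the SUQL library. See install_pip.md for how to actually execute a query.

```python
# A SUQL-style query: ordinary SQL filters plus an answer() predicate over free text.
# Table and column names are hypothetical.
suql_query = (
    "SELECT name, rating FROM restaurants "
    "WHERE 'italian' = ANY (cuisines) "
    "AND rating >= 4.0 "
    "AND answer(reviews, 'is this restaurant family-friendly?') = 'Yes' "
    "LIMIT 3;"
)

# Split the WHERE clause to show which conditions are structured
# and which rely on free text. This naive split is for display only.
where_clause = suql_query.split("WHERE ", 1)[1].rsplit(" LIMIT", 1)[0]
conditions = where_clause.split(" AND ")
structured = [c for c in conditions if not c.startswith("answer(")]
free_text = [c for c in conditions if c.startswith("answer(")]

print("structured:", structured)
print("free text: ", free_text)
```

The two structured conditions are plain SQL, while the `answer(...)` condition asks a natural-language question of the free-text `reviews` column and compares the reply to `'Yes'`.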
For more details, see our paper at https://arxiv.org/abs/2311.09818.
Installation / Usage tutorial
There are two main ways of installing the SUQL library.
Install from pip
Ideal for integrating the SUQL compiler in a larger codebase / system. See install_pip.md for details.
Install from source
Ideal for using this repo to build a SUQL-powered conversational interface to your data out-of-the-box, like the one for https://yelpbot.genie.stanford.edu discussed in the paper. See install_source.md for details.
Agent tutorial
Check out conv_agent.md for more information on best practices for using SUQL to power your conversational agent.
Release notes
See release_notes.md for the latest release notes.
Bugs / Contribution
If you encounter a problem, first check known_issues.md. If it is not listed there, we welcome Issues and/or PRs!
Paper results
To replicate the HybridQA and restaurant results reported in our paper, see paper_results.md.
Citation
If you find this work useful, please consider citing us:
@inproceedings{liu2024suql,
title={SUQL: Conversational Search over Structured and Unstructured Data with Large Language Models},
author={Shicheng Liu and Jialiang Xu and Wesley Tjangnaka and Sina J. Semnani and Chen Jie Yu and Monica S. Lam},
booktitle = {Findings of the Association for Computational Linguistics: NAACL 2024},
year={2024}
}