
NLP interface for Trilogy

Project description

Trilogy NLP

Natural language interface for generating SQL queries via a Trilogy data model.

Most of the value in a SQL statement comes from the column selection, transformation, and filtering.

Joins, table selection, and group-bys are all opportunities to introduce errors.

Trilogy is easier SQL for humans because it separates those parts of the language out into a reusable metadata layer; the exact same benefits apply to an LLM.

The extra data encoded in the semantic model and the significantly reduced target space for generation remove common sources of LLM errors.

This makes it more testable and less prone to hallucination than generating SQL directly.

Trilogy-NLP is built on common NLP tooling (LangChain, etc.) and supports configurable backends.
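For a sense of the reduced target space, here is a hypothetical Trilogy query for the question used in the example below. The concept names (`question.count`, `question.creation_date.year`) are taken from the generated-query sample later in this document; note that no joins, table names, or group-by clauses need to be generated at all:

```
SELECT
    question.creation_date.year,
    question.count,
ORDER BY
    question.creation_date.year asc
LIMIT 100;
```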

Examples

Basic BQ example

from trilogy_public_models import models
from trilogy import Executor, Dialects
from trilogy_nlp import build_query

# define the model we want to parse
environment = models["bigquery.stack_overflow"]

# set up a Trilogy executor
# the default BigQuery executor requires local default credentials to be configured
executor = Dialects.BIGQUERY.default_executor(environment=environment)

# build a query from text against the selected model
processed_query = build_query(
    "How many questions are asked per year?",
    environment,
)

# make sure we got reasonable outputs
for concept in processed_query.output_columns:
    print(concept.name)

# and run that to get our answer
results = executor.execute_query(processed_query)
for row in results:
    print(row)

Don't Expect Perfection

Results are non-deterministic; review the generated Trilogy to make sure it matches your expectations.

# generated from prompt: What is Taylor Swift's birthday? How many questions were asked on that day in 2020?
SELECT
    question.count,
    answer.creation_date.year,
    question.creation_date.year,
    question.creation_date,
WHERE
    question.creation_date.year = 1989
ORDER BY
    question.count desc,
    question.count desc
LIMIT 100;
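Since generated queries like the one above can contain mistakes, one lightweight guard is to gate execution on a human review of the query text. This is a minimal sketch with hypothetical callables — `render` and `run` stand in for whatever produces the query text and executes it (e.g. `executor.execute_query`):

```python
def confirm_and_run(query, render, run, ask=input):
    """Print the generated query and only execute it after explicit confirmation."""
    text = render(query)
    print("Generated Trilogy:\n", text)
    # only execute when the reviewer explicitly answers "y"
    if ask("Execute? [y/N] ").strip().lower() == "y":
        return run(query)
    return None
```

Swapping `ask` for a callable makes the gate easy to bypass in automated pipelines or to stub out in tests.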

Setting Up Your Environment

We recommend that you work in a virtual environment with requirements from both requirements.txt and requirements-test.txt installed. The latter is necessary to run tests (surprise).

trilogy-nlp requires Python 3.10+.
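To fail fast on an older interpreter, a small version check can run before anything else — a minimal sketch, not part of the library itself:

```python
import sys

def check_python_version(required=(3, 10)):
    # trilogy-nlp targets Python 3.10+; raise a clear error on older interpreters
    if sys.version_info[:2] < required:
        found = ".".join(map(str, sys.version_info[:3]))
        raise RuntimeError(
            f"Python {required[0]}.{required[1]}+ required, found {found}"
        )
    return True
```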

Open AI Config

Requires setting the following environment variables:

  • OPENAI_API_KEY
  • OPENAI_MODEL

We recommend using "gpt-3.5-turbo" or higher as the model.
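Assuming the library reads these variables from the process environment, they can be set before importing trilogy_nlp; the key below is a placeholder, not a real credential:

```python
import os

# Placeholder values -- substitute a real API key before running queries.
os.environ["OPENAI_API_KEY"] = "sk-placeholder"
os.environ["OPENAI_MODEL"] = "gpt-3.5-turbo"
```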

LlamaFile Config

Project details


Download files


Source Distribution

pytrilogy_nlp-0.1.1.tar.gz (32.9 kB)

Uploaded Source

Built Distribution

pytrilogy_nlp-0.1.1-py3-none-any.whl (36.3 kB)

Uploaded Python 3

File details

Details for the file pytrilogy_nlp-0.1.1.tar.gz.

File metadata

  • Download URL: pytrilogy_nlp-0.1.1.tar.gz
  • Upload date:
  • Size: 32.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for pytrilogy_nlp-0.1.1.tar.gz

  • SHA256: 46adc83f915439c1055669263cea72858f2eec0d24ea2627155dd7a681bf2ef3
  • MD5: e292cdb8f5169f628f327b739f26685d
  • BLAKE2b-256: 9a5b309d68d3c6737192fbca0b7759be2536878cf0d60d8df5399164ac6924ad
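To verify a downloaded file against the published digest, the standard library's hashlib suffices; this sketch streams the file in chunks so large archives don't need to fit in memory:

```python
import hashlib

def sha256_of(path: str) -> str:
    # Stream the file in 8 KiB chunks and return its hex SHA256 digest.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Usage, assuming the sdist was downloaded to the working directory:
# expected = "46adc83f915439c1055669263cea72858f2eec0d24ea2627155dd7a681bf2ef3"
# assert sha256_of("pytrilogy_nlp-0.1.1.tar.gz") == expected
```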


Provenance

The following attestation bundles were made for pytrilogy_nlp-0.1.1.tar.gz:

Publisher: pythonpublish.yml on trilogy-data/pytrilogy-nlp


File details

Details for the file pytrilogy_nlp-0.1.1-py3-none-any.whl.

File metadata

File hashes

Hashes for pytrilogy_nlp-0.1.1-py3-none-any.whl

  • SHA256: 6b7dfb8142423edaddc89428efa3cb593d592016ddb87f163e9fa56cf286f767
  • MD5: baffb7b988236700993fa52ccd95db4e
  • BLAKE2b-256: 99f8f7e509204e0aa711820af660a965179796a5b3db01b81a92eada1926b282


Provenance

The following attestation bundles were made for pytrilogy_nlp-0.1.1-py3-none-any.whl:

Publisher: pythonpublish.yml on trilogy-data/pytrilogy-nlp

