Skip to main content

NLP interface for Trilogy

Project description

Trilogy NLP

Natural language interface for generating SQL queries via a Trilogy data model.

Most of the value in a SQL statement comes from the column selection, transformation, and filtering.

Joins, table selection, group bys are all opportunitites to introduce errors.

Trilogy is easier SQL for humans because it separates out those parts in the language into a reusable metadata layer; the exact same benefits apply to an LLM.

The extra data encoded in the semantic model, and the significantly reduced target space for generation reduce common sources of LLM errors.

This makes it more testable and less prone to hallucination than generating SQL directly.

Trilogy-NLP is built on the common NLP backend (langchain, etc) and supports configurable backends.

Examples

[!TIP] These utilize the trilogy-public-models package to get predefined model.s, which can be installed with pip install trilogy-public-models

Hello World

from trilogy_public_models import get_executor
from trilogy_nlp import NLPEngine, Provider, CacheType

# we use this to run queries
# get a Trilogy executor preloaded with the tpc_ds schema in duckdb
# Executors run queries again a model using an engine
executor = get_executor("duckdb.tpc_ds")

# create an NLP engine
# we use this to generate queries against the model
engine = NLPEngine(
    provider=Provider.OPENAI,
    model="gpt-4o-mini",
    cache=CacheType.SQLLITE,
    cache_kwargs={"database_path": ".demo.db"},
)

# We can pass the executor to the engine
# to directly run a querie
results = engine.run_query(
    "What was the store sales for the first 5 days of January 2000 for customers in CA?",
    executor=executor,
)

for row in results:
    print(row)

# Or generate a query without executing it
query = engine.generate_query(
    "What was the store sales for the first 5 days of January 2000 for customers in CA?",
    env=executor.environment,
)

# which can compile it to SQL
# this might be multiple statements in some cases
# but here we can just grab the last one
print(executor.generate_sql(query)[-1])

BQ Example

from trilogy_public_models import models
from trilogy import Executor, Dialects
from trilogy_nlp import build_query

# define the model we want to parse
environment = models["bigquery.stack_overflow"]

# set up preql executor
# default bigquery executor requires local default credentials configured
executor = Dialects.BIGQUERY.default_executor(environment= environment)

# build a query off text and the selected model
processed_query = build_query(
    "How many questions are asked per year?",
    environment,
)

# make sure we got reasonable outputs
for concept in processed_query.output_columns:
    print(concept.name)

# and run that to get our answer
results = executor.execute_query(processed_query)
for row in results:
    print(row)

[!WARNING]
Don't expect perfection - results are non-determistic; review the generated Trilogy to make sure it maches your expectations. Treat queries as a starting point for refinement.

Setting Up Your Environment

Recommend that you work in a virtual environment with requirements from both requirements.txt and requirements-test.txt installed. The latter is necessary to run tests (surprise).

trilogy-nlp is python 3.10+

Open AI Config

Requires setting the following environment variables or passing them into NLPEngine creation.

  • OPENAI_API_KEY
  • OPENAI_MODEL

Recommended to use "gpt-4o-mini" or higher as the model.

Gemini

Requires setting the following environment variables or passing them into NLpEngine reation

  • GOOGLE_API_KEY

LlamaFile Config

Run server locally

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pytrilogy_nlp-0.1.3.tar.gz (35.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pytrilogy_nlp-0.1.3-py3-none-any.whl (38.5 kB view details)

Uploaded Python 3

File details

Details for the file pytrilogy_nlp-0.1.3.tar.gz.

File metadata

  • Download URL: pytrilogy_nlp-0.1.3.tar.gz
  • Upload date:
  • Size: 35.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for pytrilogy_nlp-0.1.3.tar.gz
Algorithm Hash digest
SHA256 eab95618afc0eba24c81192341e8f2bcfc06a19c77ac40e9c5891e6dad696857
MD5 8c8389b4b88616720a70ad43e94ab884
BLAKE2b-256 ca5d4294fd75cf70e2598b236dc05f5f4e03b7a412dd66b79a8eab188062b95a

See more details on using hashes here.

Provenance

The following attestation bundles were made for pytrilogy_nlp-0.1.3.tar.gz:

Publisher: pythonpublish.yml on trilogy-data/pytrilogy-nlp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file pytrilogy_nlp-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: pytrilogy_nlp-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 38.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for pytrilogy_nlp-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 3d3d37debdd7759dd6166969fc76cb951c26096358eb8f4cad1b9e73d933bd88
MD5 9a8d1f73ae75a95df27a603a264cd406
BLAKE2b-256 200c336c02748f6b3f7f489e5e521862a5d91d7f8a1bab84b5742e841c4078b5

See more details on using hashes here.

Provenance

The following attestation bundles were made for pytrilogy_nlp-0.1.3-py3-none-any.whl:

Publisher: pythonpublish.yml on trilogy-data/pytrilogy-nlp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page