Construct trusted SQL queries from untrusted input

HeimdaLLM

Pronounced [ˈhaɪm.dɔl.əm] or HEIM-dall-EM

HeimdaLLM is a robust static analysis framework for validating that LLM-generated structured output is safe. It currently supports SQL.

In simple terms, it helps make sure that AI won't wreck your systems.

Consider the following natural-language database query:

how much have i spent renting movies, broken down by month?

From this query (and a little bit of context), an LLM can produce the following SQL query:

SELECT
   strftime('%Y-%m', payment.payment_date) AS month,
   SUM(payment.amount) AS total_amount
FROM payment
JOIN rental ON payment.rental_id=rental.rental_id
JOIN customer ON payment.customer_id=customer.customer_id
WHERE customer.customer_id=:customer_id
GROUP BY month
LIMIT 10;

But how can you ensure the LLM-generated query is safe and that it only accesses authorized data?

HeimdaLLM performs static analysis on the generated SQL to ensure that only certain columns, tables, and functions are used. It also automatically edits the query to add a LIMIT and to remove forbidden columns. Lastly, it ensures that there is a column constraint that would restrict the results to only the user's data.

It does all of this locally, without AI, using good ol' fashioned grammars and parsers:

✅ Ensuring SELECT statement...
✅ Resolving column and table aliases...
✅ Allowlisting selectable columns...
   ✅ Removing 2 forbidden columns...
✅ Ensuring correct row LIMIT exists...
   ✅ Lowering row LIMIT to 10...
✅ Checking JOINed tables and conditions...
✅ Checking required WHERE conditions...
✅ Ensuring query is constrained to requester's identity...
✅ Allowlisting SQL functions...
   ✅ strftime
   ✅ SUM

The validated query can then be executed:

month    total_amount
2005-05  4.99
2005-06  22.95
2005-07  100.78
2005-08  87.82
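
One way to run the validated query from Python is sketched below, assuming SQLite and a small stand-in schema. The table contents and the names make_demo_db, spending_by_month, and MAX_ROWS are hypothetical. The point it illustrates is that the :customer_id placeholder is bound by the application, for example from the authenticated session, so the LLM never supplies the identity value itself.

```python
import sqlite3

# The validated query from the example above, with its named placeholder.
VALIDATED_QUERY = """
SELECT
   strftime('%Y-%m', payment.payment_date) AS month,
   SUM(payment.amount) AS total_amount
FROM payment
JOIN rental ON payment.rental_id=rental.rental_id
JOIN customer ON payment.customer_id=customer.customer_id
WHERE customer.customer_id=:customer_id
GROUP BY month
LIMIT 10;
"""

MAX_ROWS = 10  # hypothetical application-side cap, mirroring the LIMIT


def make_demo_db():
    """Create an in-memory database with a tiny, made-up data set."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (customer_id INTEGER PRIMARY KEY);
        CREATE TABLE rental (rental_id INTEGER PRIMARY KEY);
        CREATE TABLE payment (
            payment_id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            rental_id INTEGER,
            amount REAL,
            payment_date TEXT
        );
        INSERT INTO customer VALUES (1), (2);
        INSERT INTO rental VALUES (10), (11), (12), (13);
        INSERT INTO payment VALUES
            (1, 1, 10, 2.0, '2005-05-25'),
            (2, 1, 11, 3.0, '2005-05-28'),
            (3, 1, 12, 4.0, '2005-06-15'),
            (4, 2, 13, 9.0, '2005-05-26');
        """
    )
    return conn


def spending_by_month(conn, customer_id):
    """Run the validated query for one requester.

    customer_id is supplied by the application, never by the LLM output.
    """
    cursor = conn.execute(VALIDATED_QUERY, {"customer_id": customer_id})
    # fetchmany is a second, independent cap on how many rows are returned.
    return cursor.fetchmany(MAX_ROWS)
```

Calling spending_by_month(conn, 1) returns only customer 1's monthly totals; the payment belonging to customer 2 never appears in the result.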

Want to get started quickly? Go here.

🥽 Safety

I am in the process of organizing an independent security audit of HeimdaLLM. Until this audit is complete, I do not recommend using HeimdaLLM against any production system without a careful risk assessment. These audits are self-funded, so if you would get value from the confidence that they bring, consider sponsoring me or inquiring about a commercial license.

To understand some of the potential vulnerabilities, take a look at the attack surface to see the risks and the mitigations.

📚 Database support

  • SQLite
  • MySQL
  • Postgres

There is active development for the other top relational SQL databases. To help me prioritize, please vote on which database you would like to see supported.

📜 License

HeimdaLLM is dual-licensed for open-source or for commercial use.

🤝 Open-source license

The open-source license is AGPLv3, which permits free usage, modification, and distribution, and is appropriate for individual or open-source usage. For commercial usage, AGPLv3 has key obligations that your organization may want to avoid:

  • Source Code Disclosure: Any changes you make and use over a network must be made publicly available, potentially revealing your proprietary modifications.

  • Copyleft Clause: If HeimdaLLM is integrated into your application, the whole application may need to adhere to AGPLv3 terms, including code disclosure of your application.

  • Service Providers: If you use HeimdaLLM to provide services, your clients also need to abide by AGPLv3, complicating contracts.

📈 Commercial license

The commercial license eliminates the above restrictions, providing flexibility and protection for your business operations. This commercial license is recommended for commercial use. Please inquire about a commercial license here:

License Inquiry

Download files

Source Distribution

heimdallm-1.0.3.tar.gz (60.0 kB)

Uploaded Source

Built Distribution

heimdallm-1.0.3-py3-none-any.whl (92.6 kB)

Uploaded Python 3

File details

Details for the file heimdallm-1.0.3.tar.gz.

File metadata

  • Download URL: heimdallm-1.0.3.tar.gz
  • Upload date:
  • Size: 60.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for heimdallm-1.0.3.tar.gz
Algorithm Hash digest
SHA256 38da659e9da9339458378be1322f81fc3c1b1a76249ae0a0ec9f7ffb81ecdc63
MD5 ef4c700014577a59121107987243e96d
BLAKE2b-256 a91a32d5e0561342262a164e1a310fb9389a11af47b73b8b3fb1eee485cfd4f2

File details

Details for the file heimdallm-1.0.3-py3-none-any.whl.

File metadata

  • Download URL: heimdallm-1.0.3-py3-none-any.whl
  • Upload date:
  • Size: 92.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for heimdallm-1.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 36690b868fb8775b565c9842ad1ba7b124e568766298a61c2e4ddceddc0eebad
MD5 6644d74c732e31c157221866b4330dba
BLAKE2b-256 272df8c5ffa84b620adaab622babbd5151a2c14188acf87f915067bfca0c73b5
