Bindings for the Low-level Guidance (llguidance) Rust library for use within Guidance

Low-level Guidance (llguidance)

This library implements constrained decoding (also called constrained sampling or structured outputs) for Large Language Models (LLMs). It can enforce an arbitrary context-free grammar on the output of an LLM and is fast: on the order of 1ms of CPU time per token (for a tokenizer with 100k tokens), with negligible startup cost.

The following grammar formats are supported:

The internal format is the most powerful and can be generated by the following libraries:

The library can be used from:

The library is currently integrated in:

  • Guidance - library for interacting with LLMs; uses either llama.cpp or HF Transformers
  • LLGTRT - OpenAI-compatible REST server using NVIDIA's TensorRT-LLM

The integration is ongoing in:

Technical details

Given a context-free grammar, a tokenizer, and a prefix of tokens, llguidance computes a token mask - a set of tokens from the tokenizer - that, when added to the current token prefix, can lead to a valid string in the language defined by the grammar. Mask computation takes approximately 1ms of single-core CPU time for a tokenizer with 100k tokens. While this timing depends on the exact grammar, it holds, for example, for grammars derived from JSON schemas. There is no significant startup cost.
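To make the definition of a token mask concrete, here is a minimal brute-force sketch. It is not llguidance code: the toy vocabulary, the balanced-parentheses language, and the helper names `is_viable_prefix` and `compute_mask` are all assumptions chosen for illustration. It checks every token in the vocabulary against the grammar, which is exactly the slow approach that an optimized implementation avoids.

```python
# Toy vocabulary; a real tokenizer has on the order of 100k entries.
VOCAB = ["(", ")", "()", "((", "))", ")(", "a"]


def is_viable_prefix(text: str) -> bool:
    """Return True if `text` can be extended to a balanced-parentheses string.

    Stand-in for a real grammar: a prefix is viable as long as it contains
    only parentheses and never closes more than it has opened.
    """
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
        else:
            return False
    return True


def compute_mask(vocab, prefix_ids):
    """Return the set of token ids that keep the prefix viable."""
    prefix = "".join(vocab[i] for i in prefix_ids)
    return {i for i, tok in enumerate(vocab) if is_viable_prefix(prefix + tok)}


# With an empty prefix only tokens that start by opening are allowed.
empty_mask = compute_mask(VOCAB, [])
# After "(" a closing parenthesis becomes legal as well.
after_open_mask = compute_mask(VOCAB, [0])
```

During sampling, the logits of every token outside the mask would be set to negative infinity before the next token is drawn.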

The library implements a context-free grammar parser using Earley’s algorithm on top of a lexer based on derivatives of regular expressions. Mask computation is achieved by traversing the prefix tree (trie) of all possible tokens, leveraging highly optimized code.
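The trie traversal can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's implementation: a single depth counter stands in for the Earley parser and the derivative-based lexer, and `TrieNode`, `build_trie`, `step`, and `compute_mask_trie` are hypothetical names. The point it shows is that tokens sharing a prefix share work, and a whole subtree is skipped as soon as one character is rejected.

```python
VOCAB = ["(", ")", "()", "((", "))", ")(", "a"]


class TrieNode:
    """One node of the token prefix tree."""

    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.token_id = None  # set if a token ends at this node


def build_trie(vocab):
    """Insert every token of the vocabulary into a character trie."""
    root = TrieNode()
    for token_id, tok in enumerate(vocab):
        node = root
        for ch in tok:
            node = node.children.setdefault(ch, TrieNode())
        node.token_id = token_id
    return root


def step(depth, ch):
    """Advance the toy recognizer by one character.

    Returns the new nesting depth, or None if `ch` cannot follow.
    """
    if ch == "(":
        return depth + 1
    if ch == ")" and depth > 0:
        return depth - 1
    return None


def compute_mask_trie(root, start_depth):
    """Collect allowed token ids by walking the trie with the recognizer state."""
    allowed = set()
    stack = [(root, start_depth)]
    while stack:
        node, depth = stack.pop()
        if node.token_id is not None:
            allowed.add(node.token_id)
        for ch, child in node.children.items():
            next_depth = step(depth, ch)
            if next_depth is not None:  # otherwise prune the whole subtree
                stack.append((child, next_depth))
    return allowed


TRIE = build_trie(VOCAB)
mask_at_start = compute_mask_trie(TRIE, 0)
mask_after_open = compute_mask_trie(TRIE, 1)
```

Here the recognizer state after the prefix is passed in directly (`start_depth`); in a real system it would be the parser and lexer state reached after consuming the prefix tokens.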

Comparison

LM-format-enforcer and llama.cpp grammars are similar to llguidance in that they dynamically build token masks for every step of the decoding process. Both are significantly slower - the former due to clean Python code and the latter due to the lack of a lexer and use of a backtracking parser, which, while elegant, is inefficient.

Outlines builds an automaton from constraints and then pre-computes token masks for all automaton states, making sampling fast but inherently limiting constraint complexity and introducing significant startup cost and memory overhead. Llguidance computes token masks on the fly and has essentially no startup cost. The lexer’s automata are built lazily and are typically much smaller, as the context-free grammar imposes the top-level structure.

In llguidance, online mask computation takes approximately 1ms of CPU time per sequence in a batch. Thus, with 16 cores and a 10ms forward pass, the library can handle batch sizes up to 160 without slowing down the model. (Note that a 10ms forward pass for small batch sizes typically increases to 20ms+ for batch sizes of 100-200.)
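The batch-size estimate above is simple arithmetic, sketched below. The function name `max_batch_without_stall` is hypothetical, and the calculation assumes that mask computation overlaps fully with the forward pass and scales perfectly across cores, which real deployments will only approximate.

```python
def max_batch_without_stall(cores: int, forward_pass_ms: float, mask_ms: float) -> int:
    """Largest batch whose masks can be computed within one forward pass.

    Each core can compute forward_pass_ms / mask_ms masks while the model
    runs, and each sequence in the batch needs one mask per step.
    """
    return int(cores * forward_pass_ms / mask_ms)


# Figures from the text: 16 cores, 10ms forward pass, 1ms per mask.
batch_limit = max_batch_without_stall(cores=16, forward_pass_ms=10, mask_ms=1)
```

If the forward pass slows to 20ms at larger batch sizes, as the note in the text suggests, the same formula gives twice the headroom.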

Building

If you just need the C or Rust library (llguidance_parser), check the parser directory.

For Python bindings:

  • install Python 3.9 or later; you will very likely need a virtual environment (venv or conda)
  • run ./scripts/install-deps.sh
  • to build, and again after any changes, run ./scripts/test-guidance.sh

The last script builds the Python bindings for the library and runs the tests. Most of the tests live in the Guidance repo, which the script clones.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.
