Formatron empowers everyone to control the output format of language models with minimal overhead.

Formatron allows users to control the output format of language models with minimal overhead. It is lightweight, user-friendly, and seamlessly integrates into existing codebases and frameworks.

Features

  • 🔗 Popular Library Integrations: Supports transformers, exllamav2, vllm and RWKV.
  • 🔌 Plugins, not wrappers: Instead of wrapping third-party libraries in large, cumbersome classes, Formatron offers convenient, clean plugins for different libraries.
  • 💡 Library, not framework: Instead of unifying everything into a bulky framework, Formatron is a flexible library that can be embedded anywhere.
  • ✍️ Fluent Formatting: Describe your format as easily as writing natural language.
  • 📜 Regex and CFG Support: Effortlessly interleave regular expressions and context-free grammars (CFG) in formats.
  • ⚙️ Efficient JSON Generation: Feature-complete JSON generation based on Pydantic models or JSON schemas.
  • 📤 Batched Inference: Freely specify different formats for each sequence in one batch (see the sketch after this list).
  • 🚀 Minimal Runtime Overhead: With Leo optimization, a specialized compacting algorithm, and CFG caches across generations, the Earley algorithm implemented in Rust is asymptotically and practically the fastest algorithm available.
  • 🔧 Customizable: Everything is configurable, including schema generation, grammar generation, and post-generation processing (such as function calls).
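
To make the batched-inference bullet concrete, here is a minimal sketch using the transformers plugin. Treat the names (FormatterBuilder, create_formatter_logits_processor_list, the regex(..., capture_name=...) helper, and passing one formatter per sequence as a list) as assumptions about the API; consult the API reference for the exact interface in this release.

```python
# Sketch only: the Formatron names below are assumptions and may not match this release exactly.
from transformers import AutoModelForCausalLM, AutoTokenizer

from formatron.formatter import FormatterBuilder
from formatron.integrations.transformers import create_formatter_logits_processor_list

tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2", padding_side="left")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")

# One formatter per sequence in the batch: the first must answer with a date, the second with yes/no.
date_f = FormatterBuilder()
date_f.append_line(f"{date_f.regex('[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]', capture_name='date')}")
yesno_f = FormatterBuilder()
yesno_f.append_line(f"{yesno_f.regex('yes|no', capture_name='answer')}")

prompts = ["The ISO date of the moon landing is ", "Is water wet? Answer: "]
inputs = tokenizer(prompts, return_tensors="pt", padding=True)
outputs = model.generate(
    **inputs,
    max_new_tokens=16,
    logits_processor=create_formatter_logits_processor_list(tokenizer, [date_f, yesno_f]),
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```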

Comparison to other libraries

Capability | Formatron | LM Format Enforcer | Guidance | Outlines
---------- | --------- | ------------------ | -------- | --------
Regular Expressions | ✅ | ✅ | ✅ | ✅
Efficient Regex-constrained Generation | ✅ | 🟡 (performance issues still exist) | ❌ | 🟡 (scalability currently suffers)
Context-Free Grammars (CFG) | ✅ | ❌ | ✅ | 🟡 (some bugs exist)
Efficient CFG-constrained Generation | ✅ | ❌ | ❌ | ❌
Custom Format Extractor | 🟡 (some limitations exist) | ❌ | ✅ | ✅
JSON Schema | ✅ (indirectly) | ✅ | ✅ | ✅
Function Call From Callable | ✅ | ❌ | ✅ | ✅
Interleave Python control flow in generation | ❌ | ❌ | ✅ | ❌
Batched Generation | ✅ | ✅ | ❌ | ✅
Beam Search | ❌ | ✅ | ❌ | ✅
Integrates into existing pipelines | ✅ | ✅ | ❌ | ✅
Optional JSON Fields | ✅ | ✅ | ❌ | ❌
LLM Controls JSON field whitespaces | ✅ | ✅ | ❌ | ❌
LLM Controls JSON field orderings | ❌ | ✅ | ❌ | ❌
JSON Schema with recursive classes | ✅ | ✅ | ❌ | ❌

Feel free to open up an issue if something is missing or incorrect!

Examples

Quick Start

TODO: make a fancy example that shows off all the powerful features of Formatron
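
Until that example is written, the sketch below shows the basic flow: describe the desired output as a Pydantic-backed schema, build a formatter, and pass the resulting logits processor to transformers. The names FormatterBuilder, ClassSchema, create_formatter_logits_processor_list and the json(..., capture_name=...) helper are assumptions about the API and may not match this release exactly.

```python
# Sketch only: the Formatron names below are assumptions and may not match this release exactly.
from transformers import AutoModelForCausalLM, AutoTokenizer

from formatron.formatter import FormatterBuilder
from formatron.integrations.transformers import create_formatter_logits_processor_list
from formatron.schemas.pydantic import ClassSchema


class Book(ClassSchema):  # a Pydantic-backed schema describing the desired JSON
    title: str
    year: int


tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")
model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")

f = FormatterBuilder()
f.append_line(f"{f.json(Book, capture_name='book')}")  # constrain the output to Book's JSON schema

inputs = tokenizer("Describe a famous book as JSON: ", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    logits_processor=create_formatter_logits_processor_list(tokenizer, f),
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```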

Function Calls

TODO: show how to call functions in Formatron
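
Until that example is written, here is a sketch of the intended shape: decorate a Python callable so its signature becomes a schema, then constrain generation to a JSON object matching that signature. The callable_schema decorator and the json(..., capture_name=...) helper are assumptions about the API and may not match this release exactly.

```python
# Sketch only: callable_schema and the FormatterBuilder usage are assumptions about the API.
from formatron.formatter import FormatterBuilder
from formatron.schemas.pydantic import callable_schema


@callable_schema
def add(a: int, b: int, c: int = 0):
    """Add two or three integers."""
    return a + b + c


f = FormatterBuilder()
# Constrain the model to emit arguments matching add's parameters; the captured value can be
# used afterwards to actually invoke the callable (see the post-generation sketch below).
f.append_line(f"{f.json(add, capture_name='call')}")
# Plug f into an integration exactly as in the Quick Start sketch above.
```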

Customize Schema Generation

TODO: show how to customize schema generation
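
One concrete lever is the schema itself: since schemas are Pydantic-backed, field metadata such as constraints, defaults, and optionality shapes what gets generated. The sketch below assumes the ClassSchema base class; how much of each Pydantic constraint is enforced at generation time depends on the grammar generator, so treat it as illustrative.

```python
# Sketch only: ClassSchema and the FormatterBuilder usage are assumptions about the API, and
# not every Pydantic constraint is necessarily enforced at generation time.
from typing import Optional

from pydantic import Field

from formatron.formatter import FormatterBuilder
from formatron.schemas.pydantic import ClassSchema


class Product(ClassSchema):
    name: str = Field(min_length=1, max_length=40)  # field metadata shapes the generated schema
    price: float = Field(gt=0)
    tags: Optional[list[str]] = None                # optional field: the model may omit it


f = FormatterBuilder()
f.append_line(f"{f.json(Product, capture_name='product')}")
```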

Customize Grammar Generation

TODO: show how to customize grammar generation
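
Grammar generation is ultimately driven by what the formatter appends: literal text becomes fixed terminals, while regex fragments become regex-constrained rules, and the two can be interleaved freely. The sketch below assumes the FormatterBuilder interface with a regex(..., capture_name=...) helper; for deeper customization, such as supplying hand-written CFG rules, see the API reference.

```python
# Sketch only: the FormatterBuilder methods below are assumptions about the API.
from formatron.formatter import FormatterBuilder

f = FormatterBuilder()
f.append_line(f"Name: {f.regex('[A-Z][a-z]+', capture_name='name')}")  # literal text + regex rule
f.append_line(f"Age: {f.regex('[0-9]+', capture_name='age')}")         # another interleaved rule
```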

Customize Post-Generation Processing

TODO: show how to customize post-generation processing
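
A post-generation step typically takes the value captured during constrained generation and does something with it, such as invoking a function. The sketch below uses only the standard library and a pretend captured string; consult the API reference for how Formatron's plugins actually expose captures.

```python
# Standard-library illustration of a post-generation step: turn captured JSON into a function call.
import json


def add(a: int, b: int, c: int = 0):
    return a + b + c


captured = '{"a": 2, "b": 3, "c": 4}'  # pretend this was captured from the model's constrained output
print(add(**json.loads(captured)))     # -> 9
```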

Integrations

Check out integration examples in the tests directory.

API Reference

Check out the API reference here.

Benchmarks

Effectiveness

TODO: show Formatron's improvements on benchmarks against unconstrained versions.

Efficiency

TODO: show Formatron's speed and memory usage against other libraries

What Formatron Won't Do

Implement an End-to-End Inference Pipeline

Every library related to large language models (LLMs) must consider that LLMs are rapidly evolving. Many libraries, such as Guidance, Outlines, and LMQL, address this by offering their own end-to-end inference pipelines, which are constantly updated to incorporate the latest techniques.

Formatron, however, takes a different approach. Rather than providing a full-fledged inference pipeline, Formatron focuses on being modular and easily embeddable into existing and future pipelines. While this may require users to write a bit more code initially, it makes maintaining and updating the pipeline painless in the long run.

What Formatron Can't Do Now

Support OpenAI or, more generally, API-based LLM solutions

These APIs do not support efficient per-token logits masking, which nullifies most of the benefits of constrained decoding.
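
To make the point concrete, the sketch below shows, in plain PyTorch rather than Formatron's internals, what per-token logits masking involves: at every decoding step the disallowed tokens must be suppressed before sampling, which is only possible when you control the logits locally.

```python
# Plain-PyTorch illustration (not Formatron internals): constrained decoding must suppress
# every disallowed token's logit before sampling, at every single step.
import torch

vocab_size = 8
logits = torch.randn(vocab_size)          # next-token logits from a locally hosted model
allowed = torch.tensor([2, 5, 7])         # token ids permitted by the current grammar state

mask = torch.full_like(logits, float("-inf"))
mask[allowed] = 0.0
next_token = torch.argmax(logits + mask)  # sampling is now restricted to the allowed tokens
print(next_token)
```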

Context-Sensitive Validation

Unfortunately, many formats require context-sensitive validation. For example, two keys in a JSON object must not be equal to each other. Unlike CFGs, there is no efficient, generic algorithm to validate such constraints. However, for a specific format, it is possible to validate them efficiently with a specialized algorithm. In a future release, Formatron will support context-sensitive validation for popular formats like JSON.
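
The duplicate-key example can be illustrated with the standard library alone: the JSON grammar accepts the string, so only a context-sensitive check outside the CFG can reject it.

```python
# Standard-library illustration: a JSON grammar accepts duplicate keys, so rejecting them
# requires a context-sensitive check layered on top of the CFG.
import json

text = '{"a": 1, "a": 2}'  # syntactically valid JSON that any JSON CFG accepts


def reject_duplicates(pairs):
    keys = [k for k, _ in pairs]
    if len(keys) != len(set(keys)):
        raise ValueError(f"duplicate keys: {keys}")
    return dict(pairs)


print(json.loads(text))  # a plain parse silently keeps the last value: {'a': 2}
try:
    json.loads(text, object_pairs_hook=reject_duplicates)
except ValueError as error:
    print(error)         # the context-sensitive check catches what the grammar cannot
```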

Abstract Syntax Tree (AST) Construction

Formatron uses an Earley recognizer rather than a parser under the hood. This approach allows for more efficient generation and validation but also means that the AST of a given format is not available. In most cases, this is not a problem, as it is usually possible to extract the format from the generated string using simple algorithms and then parse it with a parser. However, in some cases, obtaining the AST might be necessary. In a future release, Formatron will support AST construction.
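
In practice, "extract, then parse" can be as simple as the standard-library sketch below: locate the formatted span in the raw output, then hand it to a real parser to obtain structured data.

```python
# Standard-library illustration of "extract, then parse": pull the JSON object out of the raw
# model output with a simple scan, then let a real parser build the structured value.
import json

generated = 'Sure! Here is the record: {"title": "Dune", "year": 1965} Hope that helps.'

start = generated.index("{")
end = generated.rindex("}") + 1
record = json.loads(generated[start:end])  # the full parse happens here, not in the recognizer
print(record["title"], record["year"])
```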

Process batch logits in parallel

While it is technically possible to process batch logits in parallel CPU threads (since Formatron uses Rust internally), most frameworks call Formatron's plugin sequentially for each sequence's logits in a batch. Altering this behaviour would require either a breaking change to the frameworks' APIs or letting Formatron take over the control flow; both options imply substantial work.
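
The calling convention described above roughly has the shape of the loop below (a hypothetical sketch, not any specific framework's code): the plugin is invoked once per sequence, one after another, even though the calls are independent.

```python
# Hypothetical shape of a framework's calling convention (not any real framework's code):
# the constraint plugin is invoked once per sequence, sequentially.
from typing import Callable, List


def apply_constraints(batch_logits: List[List[float]],
                      process: Callable[[int, List[float]], List[float]]) -> List[List[float]]:
    for i, logits in enumerate(batch_logits):  # sequential loop: one plugin call per sequence
        batch_logits[i] = process(i, logits)
    return batch_logits


# Toy usage: a "plugin" that masks out every token except the first for each sequence.
print(apply_constraints([[0.1, 0.9], [0.7, 0.3]], lambda i, logits: [logits[0], 0.0]))
```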
