Efficient, Flexible and Portable Structured Generation
News
- [2025/02] XGrammar has been officially integrated into Modular's MAX.
- [2025/01] XGrammar has been officially integrated into TensorRT-LLM.
- [2024/12] XGrammar has been officially integrated into vLLM.
- [2024/12] We presented research talks on XGrammar at CMU, UC Berkeley, MIT, THU, SJTU, Ant Group, LMSys, Qingke AI, and Camel AI. The slides can be found here.
- [2024/11] XGrammar has been officially integrated into SGLang.
- [2024/11] XGrammar has been officially integrated into MLC-LLM.
- [2024/11] We officially released XGrammar v0.1.0!
Overview
XGrammar is an open-source library for efficient, flexible, and portable structured generation.
It uses constrained decoding to ensure 100% structural correctness of the output. It supports general context-free grammars, which cover a broad range of structures, including JSON, regular expressions, and custom context-free grammars.
XGrammar uses careful optimizations to achieve extremely low overhead in structured generation. It has achieved near-zero overhead in JSON generation, making it one of the fastest structured generation engines available.
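To make the idea of constrained decoding concrete, here is a minimal pure-Python sketch. It does not use XGrammar's API and is not how XGrammar is implemented; the toy vocabulary, the toy grammar `list ::= "[" list* "]"`, and the helper names `check`, `allowed_mask`, and `constrained_greedy_decode` are all assumptions made for illustration. At each step, tokens that would break the grammar have their logits set to negative infinity, so only structurally valid tokens can be selected.

```python
import math

# Hypothetical toy vocabulary; token 0 plays the role of the stop token.
VOCAB = ["<eos>", "[", "]", "[[", "]]", "x"]
EOS = 0


def check(text):
    """Return (is_valid_prefix, is_complete) for the toy grammar
    list ::= "[" list* "]"  (one balanced, bracket-only list)."""
    depth = 0
    closed = False
    for ch in text:
        if closed:  # nothing may follow the outermost "]"
            return False, False
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                return False, False
            closed = depth == 0
        else:  # any other character is outside the grammar
            return False, False
    return True, closed


def allowed_mask(prefix):
    """One boolean per vocabulary entry: may this token come next?"""
    _, complete = check(prefix)
    mask = []
    for tok_id, tok in enumerate(VOCAB):
        if tok_id == EOS:
            mask.append(complete)  # stopping is legal only on complete output
        else:
            mask.append(check(prefix + tok)[0])
    return mask


def constrained_greedy_decode(logits_per_step):
    """Greedy decoding where disallowed tokens are masked to -inf."""
    text = ""
    for logits in logits_per_step:
        mask = allowed_mask(text)
        masked = [l if ok else -math.inf for l, ok in zip(logits, mask)]
        tok_id = max(range(len(VOCAB)), key=masked.__getitem__)
        if tok_id == EOS:
            break
        text += VOCAB[tok_id]
    return text


# Stand-in "model" that always prefers the invalid token "x", then "]]".
fake_logits = [[0.0, 1.0, 2.0, 0.5, 3.0, 5.0]] * 5
print(constrained_greedy_decode(fake_logits))  # prints []
```

Even though the stand-in model scores the invalid tokens highest, the mask forces a well-formed result. A production engine has to compute the same kind of mask over a vocabulary of tens of thousands of tokens at every step, which is where the overhead discussed above comes from.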
XGrammar features universal deployment. It supports:
- Platforms: Linux, macOS, Windows
- Hardware: CPU, NVIDIA GPU, AMD GPU, Apple Silicon, TPU, etc.
- Languages: Python, C++, and JavaScript APIs
- Models: Qwen, Llama, DeepSeek, Phi, Gemma, etc.
XGrammar is easy to integrate with LLM inference engines. It is the default structured generation backend for most LLM inference engines, including vLLM, SGLang, TensorRT-LLM, and MLC-LLM, and it is also used by many companies. You can try it out through the structured generation modes of these engines.
Get Started
Install XGrammar:
pip install xgrammar
Import XGrammar:
import xgrammar as xgr
Please visit our documentation to get started with XGrammar.
Adoption
XGrammar has been adopted by many projects and companies.
Citation
If you find XGrammar useful in your research, please consider citing our paper:
@article{dong2024xgrammar,
title={Xgrammar: Flexible and efficient structured generation engine for large language models},
author={Dong, Yixin and Ruan, Charlie F and Cai, Yaxing and Lai, Ruihang and Xu, Ziyi and Zhao, Yilong and Chen, Tianqi},
journal={Proceedings of Machine Learning and Systems 7},
year={2024}
}
Download files
Built Distributions
File details
Details for the file b10_xgrammar-0.1.23rc1-cp313-cp313-manylinux_2_38_x86_64.whl.
File metadata
- Download URL: b10_xgrammar-0.1.23rc1-cp313-cp313-manylinux_2_38_x86_64.whl
- Upload date:
- Size: 7.6 MB
- Tags: CPython 3.13, manylinux: glibc 2.38+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | de3821efcdaba2adad52688f14f981581414cac0c696c92ab50e53da8c3de3f0 |
| MD5 | 1f48790933778be7cd5212017a6d6913 |
| BLAKE2b-256 | dfb0e2727ebf7c2c2fb8aad0d221d0a209ce726145de6addf37e386a318291a3 |
File details
Details for the file b10_xgrammar-0.1.23rc1-cp312-cp312-manylinux_2_38_x86_64.whl.
File metadata
- Download URL: b10_xgrammar-0.1.23rc1-cp312-cp312-manylinux_2_38_x86_64.whl
- Upload date:
- Size: 7.6 MB
- Tags: CPython 3.12, manylinux: glibc 2.38+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | be920f154904a6e53aa0cde96247214e2c1e90137805c4faca84cbb5259faa78 |
| MD5 | 2b907e2765d401463babeb8e508ff5ca |
| BLAKE2b-256 | 602ec8d3ef4c260e0793bd6c3f7dddb5932a920c1de30f8fa3e697f02f141dd6 |
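A published digest can be compared against a wheel you have downloaded. The following is a minimal sketch using only Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the local file path in the usage comment assumes the wheel was saved to the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large wheels need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare case-insensitively against a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (assumes the wheel is in the current directory):
# ok = matches_published_digest(
#     "b10_xgrammar-0.1.23rc1-cp313-cp313-manylinux_2_38_x86_64.whl",
#     "de3821efcdaba2adad52688f14f981581414cac0c696c92ab50e53da8c3de3f0",
# )
```

If the check returns `False`, the download is corrupt or is not the file that was published, and it should not be installed.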