Modern syntax highlighting for Python 3.14t — O(n) guaranteed, zero ReDoS
⌾⌾⌾ Rosettes
Modern syntax highlighting for Python 3.14t
from rosettes import highlight
html = highlight("def hello(): print('world')", "python")
Why Rosettes?
- O(n) guaranteed — Hand-written state machines, no regex backtracking
- Zero ReDoS — No exploitable patterns, safe for untrusted input
- Thread-safe — Immutable state, optimized for Python 3.14t free-threading
- Pygments compatible — Drop-in CSS class compatibility
- 55 languages — Python, JavaScript, Rust, Go, and 51 more
Installation
pip install rosettes
Requires Python 3.14+
Quick Start
| Function | Description |
|---|---|
| `highlight(code, lang)` | Generate HTML with syntax highlighting |
| `tokenize(code, lang)` | Get raw tokens for custom processing |
| `highlight_many(items)` | Parallel highlighting for multiple blocks |
| `list_languages()` | List all 55 supported languages |
Features
| Feature | Description | Docs |
|---|---|---|
| Basic Highlighting | highlight() and tokenize() functions | Highlighting → |
| Parallel Processing | highlight_many() for multi-core systems | Parallel → |
| Line Highlighting | Highlight specific lines, add line numbers | Lines → |
| CSS Styling | Semantic or Pygments-compatible classes | Styling → |
| Custom Formatters | Build terminal, LaTeX, or custom output | Extending → |
📚 Full documentation: lbliii.github.io/rosettes
Usage
Basic Highlighting — Generate HTML from code
from rosettes import highlight
# Basic highlighting
html = highlight("def foo(): pass", "python")
# <div class="rosettes" data-language="python">...</div>
# With line numbers
html = highlight(code, "python", show_linenos=True)
# Highlight specific lines
html = highlight(code, "python", hl_lines={2, 3, 4})
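To make the two options above concrete, here is a minimal sketch of what line numbering and line highlighting amount to at the markup level. It is not Rosettes's formatter: the helper name `render_lines` and the `lineno` and `hll` class names are assumptions chosen for illustration (`hll` is the class Pygments uses for highlighted lines).

```python
from html import escape


def render_lines(code: str, hl_lines=frozenset(), show_linenos: bool = False) -> str:
    """Hypothetical helper: wrap selected lines and optionally prefix line numbers."""
    rendered = []
    # Line numbers are 1-based, matching the hl_lines={2, 3, 4} example above.
    for number, line in enumerate(code.splitlines(), start=1):
        text = escape(line)
        if show_linenos:
            text = f'<span class="lineno">{number}</span> {text}'
        if number in hl_lines:
            text = f'<span class="hll">{text}</span>'
        rendered.append(text)
    return "\n".join(rendered)


sample = "def foo():\n    return 1"
print(render_lines(sample, hl_lines={2}, show_linenos=True))
```

Using a set for `hl_lines` keeps the membership check constant-time per line, so the whole pass stays linear in the number of lines.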
Parallel Processing — Speed up multiple blocks
For 8+ code blocks, use highlight_many() for parallel processing:
from rosettes import highlight_many
blocks = [
    ("def foo(): pass", "python"),
    ("const x = 1;", "javascript"),
    ("fn main() {}", "rust"),
]
# Highlight in parallel
results = highlight_many(blocks)
On Python 3.14t with free-threading, this provides a 1.5-2x speedup for 50+ blocks.
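The general pattern behind batch highlighting can be sketched with the standard library alone. This is an illustration under assumptions, not how `highlight_many()` is implemented: `fake_highlight` is a hypothetical stand-in that only escapes and wraps the code, and `highlight_blocks` and `max_workers` are names invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from html import escape


def fake_highlight(code: str, language: str) -> str:
    """Hypothetical stand-in for a real highlighter: escape the code and wrap it."""
    return f'<div class="rosettes" data-language="{language}">{escape(code)}</div>'


def highlight_blocks(blocks: list[tuple[str, str]], max_workers: int = 4) -> list[str]:
    """Process (code, language) pairs on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map() yields results in submission order, not completion order.
        return list(pool.map(lambda item: fake_highlight(*item), blocks))


blocks = [
    ("def foo(): pass", "python"),
    ("const x = 1;", "javascript"),
    ("fn main() {}", "rust"),
]
results = highlight_blocks(blocks)
print(results[0])
```

Threads only help CPU-bound work like this when the interpreter can run them in parallel, which is why the speedup is tied to the free-threaded build.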
Tokenization — Raw tokens for custom processing
from rosettes import tokenize
tokens = tokenize("x = 42", "python")
for token in tokens:
    print(f"{token.type.name}: {token.value!r}")
# NAME: 'x'
# WHITESPACE: ' '
# OPERATOR: '='
# WHITESPACE: ' '
# NUMBER_INTEGER: '42'
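As an example of custom processing, a token stream like the one above could be turned into ANSI-colored terminal output. The sketch below is self-contained and does not import Rosettes: `Token` and `TokenType` are hypothetical stand-ins that only mirror the `.type.name` and `.value` attributes shown above, and the color table is arbitrary.

```python
from dataclasses import dataclass
from enum import Enum, auto


class TokenType(Enum):
    """Stand-in token types, limited to the names printed in the example above."""
    NAME = auto()
    WHITESPACE = auto()
    OPERATOR = auto()
    NUMBER_INTEGER = auto()


@dataclass(frozen=True)
class Token:
    type: TokenType
    value: str


# Arbitrary ANSI color codes keyed by token type name.
ANSI_COLORS = {"NAME": "36", "OPERATOR": "35", "NUMBER_INTEGER": "33"}


def to_ansi(tokens) -> str:
    """Wrap each token with a known color in ANSI escapes; pass others through."""
    parts = []
    for token in tokens:
        color = ANSI_COLORS.get(token.type.name)
        if color is None:
            parts.append(token.value)
        else:
            parts.append(f"\x1b[{color}m{token.value}\x1b[0m")
    return "".join(parts)


tokens = [
    Token(TokenType.NAME, "x"),
    Token(TokenType.WHITESPACE, " "),
    Token(TokenType.OPERATOR, "="),
    Token(TokenType.WHITESPACE, " "),
    Token(TokenType.NUMBER_INTEGER, "42"),
]
print(to_ansi(tokens))
```

Because the formatter only reads `type.name` and `value`, the same function should work on any token objects exposing those two attributes.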
CSS Class Styles — Semantic or Pygments
Semantic (default) — Readable, self-documenting:
html = highlight(code, "python")
# <span class="syntax-keyword">def</span>
# <span class="syntax-function">hello</span>
.syntax-keyword { color: #ff79c6; }
.syntax-function { color: #50fa7b; }
.syntax-string { color: #f1fa8c; }
Pygments-compatible — Use existing themes:
html = highlight(code, "python", css_class_style="pygments")
# <span class="k">def</span>
# <span class="nf">hello</span>
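The difference between the two styles is only the class name attached to each span. A rough sketch of that mapping follows; the helper `span` and the `PYGMENTS_CLASSES` table are hypothetical, and only the `k` and `nf` classes come from the examples above (`s` is the standard Pygments short class for strings).

```python
from html import escape

# Semantic role -> Pygments short class. "k" and "nf" appear in the examples
# above; "s" is the standard Pygments class for string literals.
PYGMENTS_CLASSES = {"keyword": "k", "function": "nf", "string": "s"}


def span(role: str, text: str, css_class_style: str = "semantic") -> str:
    """Hypothetical helper: render one token with either naming scheme."""
    if css_class_style == "pygments":
        css_class = PYGMENTS_CLASSES[role]
    else:
        css_class = f"syntax-{role}"
    return f'<span class="{css_class}">{escape(text)}</span>'


print(span("keyword", "def"))
print(span("function", "hello", css_class_style="pygments"))
```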
Supported Languages
55 languages with full syntax support
| Category | Languages |
|---|---|
| Core | Python, JavaScript, TypeScript, JSON, YAML, TOML, Bash, HTML, CSS, Diff |
| Systems | C, C++, Rust, Go, Zig |
| JVM | Java, Kotlin, Scala, Groovy, Clojure |
| Apple | Swift |
| Scripting | Ruby, Perl, PHP, Lua, R, PowerShell |
| Functional | Haskell, Elixir |
| Data/Query | SQL, CSV, GraphQL |
| Markup | Markdown, XML |
| Config | INI, Nginx, Dockerfile, Makefile, HCL |
| Schema | Protobuf |
| Modern | Dart, Julia, Nim, Gleam, V |
| AI/ML | Mojo, Triton, CUDA, Stan |
| Other | PKL, CUE, Tree, Kida, Jinja, Plaintext |
Architecture
State Machine Lexers — O(n) guaranteed
Every lexer is a hand-written finite state machine:
┌─────────────────────────────────────────────────────────────┐
│ State Machine Lexer │
│ │
│ ┌─────────┐ char ┌─────────┐ char ┌─────────┐ │
│ │ INITIAL │ ────────► │ STRING │ ────────► │ ESCAPE │ │
│ │ STATE │ │ STATE │ │ STATE │ │
│ └─────────┘ └─────────┘ └─────────┘ │
│ │ │ │ │
│ │ emit │ emit │ emit │
│ ▼ ▼ ▼ │
│ [Token] [Token] [Token] │
└─────────────────────────────────────────────────────────────┘
Key properties:
- Single character lookahead (O(n) guaranteed)
- No backtracking (no ReDoS possible)
- Immutable state (thread-safe)
- Local variables only (no shared mutable state)
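The properties above can be illustrated with a deliberately small lexer. This is a sketch, not one of Rosettes's lexers: the function name `lex` and the token kinds are invented here, and the three states are borrowed from the diagram. It recognizes double-quoted strings with backslash escapes and treats everything else as plain text.

```python
def lex(source: str) -> list[tuple[str, str]]:
    """Tokenize quoted strings and surrounding text in a single left-to-right pass."""
    tokens: list[tuple[str, str]] = []
    state = "INITIAL"
    start = 0  # index where the current token began

    # Each character is visited exactly once and the index never moves backward,
    # so the running time is linear in len(source).
    for i, ch in enumerate(source):
        if state == "INITIAL":
            if ch == '"':
                if i > start:
                    tokens.append(("TEXT", source[start:i]))
                start = i
                state = "STRING"
        elif state == "STRING":
            if ch == "\\":
                state = "ESCAPE"
            elif ch == '"':
                tokens.append(("STRING", source[start:i + 1]))
                start = i + 1
                state = "INITIAL"
        else:  # ESCAPE: consume exactly one character, then resume the string
            state = "STRING"

    if start < len(source):
        # Flush the tail; an unterminated string is reported as ERROR.
        kind = "TEXT" if state == "INITIAL" else "ERROR"
        tokens.append((kind, source[start:]))
    return tokens


print(lex('x = "a\\"b" + y'))
```

All state lives in local variables (`state`, `start`, `tokens`), so concurrent calls cannot interfere with each other, and there is no pattern an attacker could craft to trigger backtracking.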
Thread Safety — Free-threading ready
All public APIs are thread-safe:
- Lexers use only local variables during tokenization
- Formatter state is immutable
- Registry uses `functools.cache` for memoization
- Module declares itself safe for free-threading (PEP 703)
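The combination of immutable objects, a memoized registry, and local-only working state can be sketched as follows. Everything here is hypothetical and simplified for illustration (`LexerSpec`, `get_lexer`, `count_keywords`, and the keyword table are not Rosettes names); only `functools.cache` is taken from the list above.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from functools import cache


@dataclass(frozen=True)
class LexerSpec:
    """Hypothetical immutable lexer description; frozen, so it cannot be mutated."""
    name: str
    keywords: frozenset[str]


# Illustrative keyword table, read-only after import.
_KEYWORDS = {"python": frozenset({"def", "return", "pass"})}


@cache
def get_lexer(name: str) -> LexerSpec:
    """Memoized lookup: repeated calls with the same name reuse one object."""
    return LexerSpec(name=name, keywords=_KEYWORDS[name])


def count_keywords(lexer: LexerSpec, code: str) -> int:
    """Touches only local variables and the immutable spec."""
    count = 0
    for word in code.split():
        if word in lexer.keywords:
            count += 1
    return count


snippets = ["def foo(): pass", "return x", "x = 1"]
with ThreadPoolExecutor(max_workers=4) as pool:
    counts = list(pool.map(lambda code: count_keywords(get_lexer("python"), code), snippets))
print(counts)
```

Since workers share nothing mutable, no locks are needed around the lexing itself; the only shared structure is the cache, which `functools` keeps internally consistent.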
Performance
Benchmarked against Pygments on a 10,000-line Python file:
| Operation | Rosettes | Pygments | Speedup |
|---|---|---|---|
| Tokenize | 12ms | 45ms | 3.75x |
| Highlight | 18ms | 52ms | 2.89x |
| Parallel (8 blocks) | 22ms | 48ms | 2.18x |
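The Speedup column is the Pygments time divided by the Rosettes time. The sketch below recomputes it from the table and adds a small timing helper of the kind one might use to reproduce such numbers; `speedup` and `best_of` are hypothetical helpers, not part of Rosettes, and the benchmark setup itself is not described in this README.

```python
import time


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """Ratio of baseline time to candidate time, rounded to two decimals."""
    return round(baseline_ms / candidate_ms, 2)


def best_of(fn, repeat: int = 5) -> float:
    """Fastest wall-clock time in milliseconds over `repeat` calls of fn()."""
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000)
    return min(timings)


# (Rosettes ms, Pygments ms), copied from the table above.
rows = {
    "Tokenize": (12, 45),
    "Highlight": (18, 52),
    "Parallel (8 blocks)": (22, 48),
}
for operation, (rosettes_ms, pygments_ms) in rows.items():
    print(f"{operation}: {speedup(pygments_ms, rosettes_ms)}x")
```

Taking the minimum over several runs is a common way to reduce noise from other processes when timing short operations.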
Documentation
| Section | Description |
|---|---|
| Get Started | Installation and quickstart |
| Highlighting | Core highlighting APIs |
| Styling | CSS classes and themes |
| Reference | Complete API documentation |
| About | Architecture and design |
Development
git clone https://github.com/lbliii/rosettes.git
cd rosettes
uv sync --group dev
pytest
The Bengal Ecosystem
A structured reactive stack — every layer written in pure Python for 3.14t free-threading.
| ᓚᘏᗢ | Bengal | Static site generator | Docs |
| ∿∿ | Purr | Content runtime | — |
| ⌁⌁ | Chirp | Web framework | Docs |
| =^..^= | Pounce | ASGI server | Docs |
| )彡 | Kida | Template engine | Docs |
| ฅᨐฅ | Patitas | Markdown parser | Docs |
| ⌾⌾⌾ | Rosettes | Syntax highlighter ← You are here | Docs |
Python-native. Free-threading ready. No npm required.
License
MIT License — see LICENSE for details.