Streaming JSON parser that yields progressively complete values - Python port of the TypeScript jsonriver library
jsonriver - Python Streaming JSON Parser
Parse JSON incrementally as it streams in, e.g. from a network request or a language model. Gives you a sequence of increasingly complete values.
This is a Python port of the TypeScript jsonriver library.
Features
- Incremental parsing: Get progressively complete JSON values as data arrives
- Zero dependencies: Uses only Python standard library
- Fully typed: Complete type hints with mypy strict mode compliance
- Memory efficient: Reuses objects and arrays when possible
- Correct: Final result matches json.loads() exactly
- Fast: Optimized for performance with minimal overhead
Installation
From PyPI (recommended)
Using uv:
uv add jsonriver
Using pip:
pip install jsonriver
From source
Using uv:
git clone https://github.com/chrisschnabl/streamjson.git
cd streamjson
uv pip install -e .
Using pip:
git clone https://github.com/chrisschnabl/streamjson.git
cd streamjson
pip install -e .
Usage
import asyncio
import json
from jsonriver import parse


async def make_stream(text: str, chunk_size: int):
    """Simulate a streaming source"""
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]


async def main():
    json_str = '{"name": "Alice", "age": 30}'
    stream = make_stream(json_str, chunk_size=3)
    async for value in parse(stream):
        print(json.dumps(value))
    # Output shows incremental results:
    # {}
    # {"name": "Al"}
    # {"name": "Alice"}
    # {"name": "Alice", "age": 30.0}


asyncio.run(main())
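If only the finished document matters, the intermediate values can simply be discarded. The sketch below shows one way to do that. `final_value` and `fake_partials` are hypothetical names, not part of jsonriver, and the stand-in generator yields hand-written partial values so the snippet runs without the library installed.

```python
import asyncio
from typing import AsyncIterator, Optional, TypeVar

T = TypeVar("T")


async def final_value(values: AsyncIterator[T]) -> Optional[T]:
    """Drain an async iterator and return the last item it yielded (None if empty)."""
    last: Optional[T] = None
    async for value in values:
        last = value
    return last


async def fake_partials() -> AsyncIterator[dict]:
    # Stand-in for parse(stream): hand-written partial values, not real parser output.
    yield {}
    yield {"name": "Al"}
    yield {"name": "Alice", "age": 30.0}


result = asyncio.run(final_value(fake_partials()))
print(result)
```

In real use, the assumption is that `final_value(parse(stream))` would take the place of `final_value(fake_partials())`.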
How it Works
jsonriver yields a sequence of increasingly complete JSON values. Consider this JSON:
{"name": "Alex", "keys": [1, 20, 300]}
If you parse this one character at a time, it yields:
{}
{"name": ""}
{"name": "A"}
{"name": "Al"}
{"name": "Ale"}
{"name": "Alex"}
{"name": "Alex", "keys": []}
{"name": "Alex", "keys": [1]}
{"name": "Alex", "keys": [1, 20]}
{"name": "Alex", "keys": [1, 20, 300]}
Invariants
The library maintains these guarantees:
- Type stability: Later versions of a value always have the same type (a string never becomes an array)
- Atomic values: null, true, false, and numbers are only yielded when complete
- String growth: Strings may be replaced with longer versions
- Array append-only: Arrays are only modified by appending or by mutating the last element
- Object append-only: Objects are only modified by adding properties or by mutating the last one
- Complete keys: Object properties are only added once the key and the value's type are known
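The invariants above can be read as a relation between two successive values. The function below is a minimal sketch of that relation, useful for example in a test harness. `is_consistent_update` is a hypothetical helper, not part of jsonriver, and it assumes that a "longer version" of a string always keeps the earlier text as a prefix.

```python
from typing import Any


def is_consistent_update(prev: Any, curr: Any) -> bool:
    """Return True if `curr` could follow `prev` under the listed invariants."""
    # Type stability: a value never changes type between versions.
    if type(prev) is not type(curr):
        return False
    if isinstance(prev, str):
        # Assumption: strings only grow, so the old text is a prefix of the new.
        return curr.startswith(prev)
    if isinstance(prev, list):
        if len(curr) < len(prev):
            return False
        if not prev:
            return True
        # All elements but the last are settled; the last may still grow.
        head = len(prev) - 1
        return prev[:head] == curr[:head] and is_consistent_update(prev[head], curr[head])
    if isinstance(prev, dict):
        prev_keys, curr_keys = list(prev), list(curr)
        # Keys are only appended, so earlier keys keep their order.
        if curr_keys[: len(prev_keys)] != prev_keys:
            return False
        if not prev_keys:
            return True
        last_key = prev_keys[-1]
        settled = all(prev[k] == curr[k] for k in prev_keys[:-1])
        return settled and is_consistent_update(prev[last_key], curr[last_key])
    # null, booleans and numbers are atomic: once yielded they do not change.
    return prev == curr


print(is_consistent_update({"name": "Al"}, {"name": "Alex"}))
print(is_consistent_update({"name": "Alex"}, {"name": ["Alex"]}))
```

The two arguments must be independent objects. If the parser hands back the same dict or list on every iteration, compare against a copy of the earlier value.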
Error Handling
The parser raises ValueError for invalid JSON, in line with json.loads() (whose json.JSONDecodeError is a subclass of ValueError):
async def example_error():
    try:
        stream = make_stream('{"invalid": }', 1)
        async for value in parse(stream):
            print(value)
    except ValueError as e:
        print(f"Parse error: {e}")
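Because values are yielded before an error is detected, a caller may want to keep the last partial value rather than lose everything. The following is a rough sketch of that pattern. `last_good_value` and `failing_partials` are hypothetical names, not part of jsonriver, and the stand-in generator only imitates a parser that fails partway through.

```python
import asyncio
from typing import Any, AsyncIterator, Optional, Tuple


async def last_good_value(values: AsyncIterator[Any]) -> Tuple[Any, Optional[ValueError]]:
    """Drain `values`; return (last value seen, error or None)."""
    last: Any = None
    try:
        async for value in values:
            last = value
    except ValueError as exc:
        # Keep whatever was parsed before the input went wrong.
        return last, exc
    return last, None


async def failing_partials() -> AsyncIterator[Any]:
    # Hypothetical stand-in for parse(stream) on malformed input.
    yield {}
    yield {"name": "Al"}
    raise ValueError("unexpected character")


value, error = asyncio.run(last_good_value(failing_partials()))
print(value, error)
```

Whether a partial value is good enough to use is an application decision. The sketch only shows that it remains available after the exception.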
Development
Setup
# Create virtual environment and install dependencies
uv venv
uv pip install -e ".[dev]"
Testing
# Run all tests
python -m pytest tests/ -v
# Run specific test file
python -m pytest tests/test_parse.py -v
# Run with coverage
python -m pytest tests/ --cov=src/jsonriver
Type Checking
# Check types with mypy
mypy src/jsonriver --strict
Running Examples
python example_jsonriver.py
Project Structure
src/jsonriver/
    __init__.py        # Public API exports
    parse.py           # JSON parser implementation
    tokenize.py        # JSON tokenizer implementation
tests/
    test_parse.py      # Parser tests
    test_tokenize.py   # Tokenizer tests
    utils.py           # Test utilities
API Reference
parse(stream: AsyncIterator[str]) -> AsyncIterator[JsonValue]
Incrementally parse a single JSON value from the given async iterator of string chunks.
Parameters:
stream: An async iterator that yields string chunks containing JSON data
Yields:
- Increasingly complete JSON values as more input is parsed
Raises:
ValueError: If the input is not valid JSON
RuntimeError: For internal parsing errors
Example:
async def parse_json():
    json_str = '{"a": 1, "b": 2}'

    async def stream():
        for char in json_str:
            yield char

    async for value in parse(stream()):
        print(value)
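Since parse() takes string chunks, a source that produces bytes (a socket or an HTTP body, for instance) has to be decoded first, and a multi-byte character may be split across two chunks. The adapter below is one possible way to handle that with the standard library's incremental decoder. `decode_utf8` and `byte_source` are hypothetical names, not part of jsonriver, and UTF-8 input is assumed.

```python
import asyncio
import codecs
from typing import AsyncIterator


async def decode_utf8(chunks: AsyncIterator[bytes]) -> AsyncIterator[str]:
    """Turn a stream of UTF-8 byte chunks into a stream of str chunks."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    async for chunk in chunks:
        text = decoder.decode(chunk)
        if text:  # skip chunks that only held part of a character
            yield text
    # Flush; raises UnicodeDecodeError if the input ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


async def byte_source() -> AsyncIterator[bytes]:
    data = '{"name": "Zoë"}'.encode("utf-8")
    # Split inside the two-byte "ë" on purpose.
    cut = data.index("ë".encode("utf-8")) + 1
    yield data[:cut]
    yield data[cut:]


async def demo() -> str:
    return "".join([piece async for piece in decode_utf8(byte_source())])


decoded = asyncio.run(demo())
print(decoded)
```

The assumption is that `parse(decode_utf8(byte_stream))` would then work like any other string stream. UnicodeDecodeError is itself a subclass of ValueError, so an `except ValueError` around the loop catches both decoding and parsing failures.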
Type Definitions
JsonValue = Union[
    None,
    bool,
    float,
    str,
    list['JsonValue'],
    dict[str, 'JsonValue']
]
JsonObject = dict[str, JsonValue]
Performance
jsonriver is designed for performance:
- Processes input synchronously in batches when available
- Reuses objects and arrays to minimize allocations
- Minimal overhead compared to standard json.loads()
- Efficient state machine implementation
In practice, jsonriver adds negligible overhead to the parsing process while providing valuable incremental updates.
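Object and array reuse has a practical consequence: if the same dict or list is handed back on every iteration, storing the yielded values directly would leave a list of references to one object. A cautious way to keep a history is to copy each value as it arrives. This is a sketch only. `collect_snapshots` and `reusing_source` are hypothetical names, not part of jsonriver, and the stand-in generator imitates the reuse behaviour rather than calling the real parser.

```python
import asyncio
import copy
from typing import Any, AsyncIterator, List


async def collect_snapshots(values: AsyncIterator[Any]) -> List[Any]:
    """Keep an independent deep copy of every value yielded by `values`."""
    history: List[Any] = []
    async for value in values:
        history.append(copy.deepcopy(value))
    return history


async def reusing_source() -> AsyncIterator[dict]:
    # Stand-in for a parser that mutates one dict and yields it repeatedly.
    doc: dict = {}
    yield doc
    doc["name"] = "Al"
    yield doc
    doc["name"] = "Alice"
    yield doc


history = asyncio.run(collect_snapshots(reusing_source()))
print(history)  # [{}, {'name': 'Al'}, {'name': 'Alice'}]
```

Copying costs time and memory on every step, so it is only worth doing when earlier versions are actually needed.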
Use Cases
- Streaming APIs: Parse JSON from network requests as data arrives
- Large payloads: Start processing data before complete response
- Real-time UIs: Update UI as JSON parses
- LLM responses: Parse structured output from language models
- Progress indicators: Show parsing progress to users
- Server-sent events: Handle JSON in SSE streams
Comparison with Alternatives
| Feature | jsonriver | json.loads | ijson |
|---|---|---|---|
| Incremental parsing | ✅ | ❌ | ✅ |
| Complete values | ✅ | ✅ | ❌ |
| No dependencies | ✅ | ✅ | ❌ |
| Type hints | ✅ | ✅ | ❌ |
| Memory efficient | ✅ | ❌ | ✅ |
License
BSD-3-Clause License
- Original TypeScript implementation: Copyright (c) 2023 Google LLC
- Python port: Copyright (c) 2024 jsonriver-python contributors
See LICENSE file for full license text.
Credits
This is a Python port of the excellent jsonriver TypeScript library by Peter Burns (@rictic).
Contributing
Contributions are welcome! Please ensure:
- All tests pass: pytest tests/ -v
- Type checking passes: mypy src/jsonriver --strict
- Code follows existing style
- New features include tests
Changelog
0.0.1 (2024)
- Initial Python port from TypeScript
- Full type hints with mypy strict mode
- Comprehensive test suite (37 tests)
- Complete documentation
- Zero dependencies