A lightweight Python library for handling jsonlines files.
# jsonl

## About

jsonl is a lightweight Python library designed to simplify working with JSON Lines data, adhering to the jsonlines and ndjson specifications.
## 🎯 Features

- 🌎 Provides an API similar to Python's standard `json` module.
- 🚀 Supports custom (de)serialization via user-defined callbacks.
- 🗜️ Built-in support for `gzip`, `bzip2`, and `xz` compression formats, as well as ZIP and TAR archives.
- 🔧 Skips malformed lines during file loading.
- 📥 Loads data directly from URLs.
- 🐍 No external dependencies: relies only on the Python standard library.
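The format the library targets is simple enough to illustrate with the standard library alone: one JSON value per line, separated by newlines. A minimal round-trip sketch using only the built-in `json` module (this shows the format, not the library's implementation):

```python
import json

records = [
    {"name": "Gilbert", "wins": [["straight", "7♣"], ["one pair", "10♥"]]},
    {"name": "May", "wins": []},
]

# Serialize: each record becomes a single line of JSON.
text = "\n".join(json.dumps(r, ensure_ascii=False) for r in records) + "\n"

# Parse: decode each non-empty line independently.
parsed = [json.loads(line) for line in text.splitlines() if line.strip()]

assert parsed == records
```

Because every line stands alone, a JSON Lines source can be produced and consumed incrementally, which is what the streaming API below builds on.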
## 📦 Installation

To install jsonl using pip, run the following command:

```shell
pip install py-jsonl
```
## ⚡ Quick Start

### Dumping data to a JSON Lines file

> [!NOTE]
> Use `jsonl.dump` to incrementally write an iterable of dictionaries to a JSON Lines file:

```python
import jsonl

data = [
    {"name": "Gilbert", "wins": [["straight", "7♣"], ["one pair", "10♥"]]},
    {"name": "May", "wins": []},
]

jsonl.dump(data, "file.jsonl")
```
### Loading data from a JSON Lines source

> [!NOTE]
> Use `jsonl.load` to incrementally load a JSON Lines source (such as a filename, URL, or file-like object) as an iterator of dictionaries:

```python
import jsonl

# Load data from a JSON Lines file
iterator = jsonl.load("file.jsonl")
print(tuple(iterator))

# Load data from a URL
iterator = jsonl.load("https://example.com/file.jsonl")
print(tuple(iterator))
```
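The feature list notes that malformed lines are skipped during loading. The exact behavior belongs to the library, but the semantics can be sketched with the standard library (the function name here is illustrative, not part of the jsonl API):

```python
import io
import json

def load_skipping_malformed(fp):
    """Yield one parsed object per line, silently skipping lines
    that are not valid JSON (a sketch of skip-on-error loading)."""
    for line in fp:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue

source = io.StringIO('{"ok": 1}\nnot json at all\n{"ok": 2}\n')
print(list(load_skipping_malformed(source)))
# [{'ok': 1}, {'ok': 2}]
```

Skipping rather than raising keeps long-running imports alive when a single corrupt line appears mid-file.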
### Dumping multiple JSON Lines files into an archive (ZIP or TAR)

> [!NOTE]
> Use `jsonl.dump_archive` to incrementally write structured data to multiple JSON Lines files, which are then stored in a ZIP or TAR archive.

```python
import jsonl

data = [
    # Create `file1.jsonl` within the archive
    ("file1.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
    # Create `file2.jsonl` within the archive
    ("path/to/file2.jsonl", [{"name": "Charlie", "age": 35}, {"name": "David", "age": 40}]),
    # Append to `file1.jsonl` within the archive
    ("file1.jsonl", [{"name": "Eve", "age": 28}]),
]

jsonl.dump_archive("archive.zip", data)
```
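The resulting archive simply contains ordinary `.jsonl` members. As a rough sketch of that layout (not the library's implementation, and without `dump_archive`'s ability to append to a member that was already written), the same structure can be produced with the standard `zipfile` module:

```python
import io
import json
import zipfile

groups = [
    ("file1.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
    ("path/to/file2.jsonl", [{"name": "Charlie", "age": 35}]),
]

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for member, records in groups:
        # Each member is a self-contained JSON Lines document.
        lines = "".join(json.dumps(r) + "\n" for r in records)
        archive.writestr(member, lines)

with zipfile.ZipFile(buffer) as archive:
    print(archive.namelist())
# ['file1.jsonl', 'path/to/file2.jsonl']
```

Because each member is plain JSON Lines text, the archive stays readable by any tool that understands ZIP, not just this library.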
### Loading multiple JSON Lines files from an archive (ZIP or TAR)

> [!NOTE]
> Use `jsonl.load_archive` to incrementally load multiple JSON Lines files from a ZIP or TAR archive.

> [!TIP]
> - The archive can also be loaded from a URL.
> - Files within the archive can be filtered using Unix shell-style wildcards.

```python
import jsonl

# Load the JSON Lines files from a local archive
for filename, iterator in jsonl.load_archive("archive.zip"):
    print("Filename:", filename)
    print("Data:", tuple(iterator))

# Load the JSON Lines files from a remote archive
for filename, iterator in jsonl.load_archive("https://example.com/archive.zip"):
    print("Filename:", filename)
    print("Data:", tuple(iterator))
```
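"Unix shell-style wildcards" refers to the pattern language of the standard `fnmatch` module. Assuming `load_archive` follows those semantics, a pattern such as `*.jsonl` would select members like this:

```python
from fnmatch import fnmatch

members = ["file1.jsonl", "path/to/file2.jsonl", "readme.txt", "data.ndjson"]

# Keep only members whose names match the shell-style pattern.
matches = [name for name in members if fnmatch(name, "*.jsonl")]
print(matches)
# ['file1.jsonl', 'path/to/file2.jsonl']
```

Note that in `fnmatch` patterns, `*` also matches path separators, so nested members like `path/to/file2.jsonl` match `*.jsonl` as well.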
### Dumping data to multiple JSON Lines files

> [!NOTE]
> Use `jsonl.dump_fork` to incrementally write structured data to multiple JSON Lines files, which is useful when you want to separate records based on some criterion.

```python
import jsonl

data = [
    # Create `file1.jsonl`, or overwrite it if it exists
    ("file1.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
    # Create `file2.jsonl`, or overwrite it if it exists
    ("file2.jsonl", [{"name": "Charlie", "age": 35}, {"name": "David", "age": 40}]),
    # Append to `file1.jsonl`
    ("file1.jsonl", [{"name": "Eve", "age": 28}]),
]

jsonl.dump_fork(data)
```
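The create-then-append behavior described in the comments above can be sketched with the standard library: the first time a filename appears the file is truncated, and later appearances append to it (a simplified illustration, not `dump_fork` itself):

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory

data = [
    ("file1.jsonl", [{"name": "Alice"}, {"name": "Bob"}]),
    ("file2.jsonl", [{"name": "Charlie"}]),
    ("file1.jsonl", [{"name": "Eve"}]),  # appends to the first file
]

with TemporaryDirectory() as tmp:
    opened = set()
    for filename, records in data:
        # First appearance truncates ("w"); later appearances append ("a").
        mode = "a" if filename in opened else "w"
        opened.add(filename)
        with Path(tmp, filename).open(mode, encoding="utf-8") as fp:
            fp.writelines(json.dumps(r) + "\n" for r in records)
    # Count lines written to each file.
    counts = {p.name: len(p.read_text().splitlines()) for p in sorted(Path(tmp).iterdir())}

print(counts)
# {'file1.jsonl': 3, 'file2.jsonl': 1}
```

Tracking already-opened filenames is what lets one pass over grouped data fan out to many files without clobbering earlier groups.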
## 📚 Documentation

For more detailed information and usage examples, refer to the project documentation.
## 🛠️ Development

To contribute to the project, you can run the following commands for testing and documentation.

First, ensure you have the latest version of pip:

```shell
python -m pip install --upgrade pip
```

### Running unit tests

Install the development dependencies and run the tests:

```shell
pip install --group=test --upgrade  # Install test dependencies; skip if already installed
python -m pytest tests/             # Run all tests
python -m pytest tests/ --cov       # Run tests with coverage
```

### Running linters

```shell
pip install --group=lint --upgrade  # Install lint dependencies; skip if already installed
ruff check .                        # Run linter
spxl .                              # Run sphinx-linter for docstring issues
pymport .                           # Check for import issues
```

### Building the documentation

To build the documentation locally, use the following commands:

```shell
pip install --group=doc --upgrade  # Install doc dependencies; skip if already installed
mkdocs serve                       # Start live-reloading docs server
mkdocs build                       # Build the documentation site
```
## 🗒️ License

This project is licensed under the MIT license.
## File details

Details for the file `py_jsonl-1.3.22.tar.gz`.

### File metadata

- Download URL: py_jsonl-1.3.22.tar.gz
- Upload date:
- Size: 15.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `8fd944a27354e45bda2ce1868d4ece33431e386b4a9a06682fde45ee6e86f566` |
| MD5 | `c43c58bf5b2093512b5a051b4c64abc0` |
| BLAKE2b-256 | `08497bfbe310788a6003cd726fcc7f8837b57ab3b4bd9e2eef1b2baf87886c97` |
## File details

Details for the file `py_jsonl-1.3.22-py3-none-any.whl`.

### File metadata

- Download URL: py_jsonl-1.3.22-py3-none-any.whl
- Upload date:
- Size: 9.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `3bcffd9870615ccd05f790bc960087b082dad82fab365885bf43ec6dc2584a7a` |
| MD5 | `972c955dc328a77973084bdffc33ce8f` |
| BLAKE2b-256 | `e3f3a0ac98740161152a902f990dc5d9a51907dc66be981ee97a69f10b4c39d1` |