A lightweight Python library for handling jsonlines files.
jsonl
About
jsonl is a lightweight Python library designed to simplify working with JSON Lines data, adhering to the jsonlines and ndjson specifications.
Features
- 🌎 Provides an API similar to Python's standard json module.
- 🚀 Supports custom (de)serialization via user-defined callbacks.
- 🗜️ Built-in support for gzip, bzip2, xz compression formats and ZIP or TAR archives.
- 🔧 Skips malformed lines during file loading.
- 📥 Loads from URLs directly.
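To make the "skips malformed lines" feature concrete, here is a minimal sketch using only the standard library. It is not the library's implementation: `iter_json_lines` and `skip_broken` are hypothetical names chosen for illustration.

```python
import io
import json


def iter_json_lines(stream, skip_broken=True):
    """Yield one parsed value per non-empty line of a text stream.

    Hypothetical helper for illustration only; not part of the jsonl API.
    """
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # blank lines carry no record
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            if not skip_broken:
                raise ValueError(f"invalid JSON on line {line_number}")
            # Otherwise ignore the malformed line and keep reading.


source = io.StringIO('{"name": "Gilbert"}\n{not valid json}\n{"name": "May"}\n')
records = list(iter_json_lines(source))
print(records)
```

Because the helper is a generator, lines are parsed one at a time, so a single bad line does not prevent the rest of the file from being read.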
Installation
To install jsonl using pip, run the following command:
pip install py-jsonl
Getting Started
Dumping data to a JSON Lines File
Use jsonl.dump to incrementally write an iterable of dictionaries to a JSON Lines file:
import jsonl
data = [
{"name": "Gilbert", "wins": [["straight", "7♣"], ["one pair", "10♥"]]},
{"name": "May", "wins": []},
]
jsonl.dump(data, "file.jsonl")
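For readers unfamiliar with the format, the following standard-library sketch shows what incremental writing amounts to: each item is serialized on its own line as it arrives, so the input can be a generator. `write_json_lines` and `generate_players` are hypothetical names, and the exact serialization options the library uses may differ.

```python
import json
import os
import tempfile


def write_json_lines(items, path):
    """Write each item as one JSON document per line (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as fp:
        for item in items:  # items may be a generator; nothing is collected first
            fp.write(json.dumps(item, ensure_ascii=False) + "\n")


def generate_players():
    yield {"name": "Gilbert", "wins": [["straight", "7♣"], ["one pair", "10♥"]]}
    yield {"name": "May", "wins": []}


# Write into a temporary directory so the sketch leaves no files behind in the CWD.
path = os.path.join(tempfile.mkdtemp(), "file.jsonl")
write_json_lines(generate_players(), path)

with open(path, encoding="utf-8") as fp:
    lines = fp.read().splitlines()
print(len(lines), "lines written")
```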
Loading data from a JSON Lines source
Use jsonl.load to incrementally load a JSON Lines source (a filename, URL, or file-like object) as an iterator of dictionaries:
import jsonl
# Load data from a JSON Lines file
iterator = jsonl.load("file.jsonl")
print(tuple(iterator))
# Load data from a URL
iterator = jsonl.load("https://example.com/file.jsonl")
print(tuple(iterator))
Dump multiple JSON Lines Files into an Archive (ZIP or TAR)
Use jsonl.dump_archive to incrementally write structured data to multiple JSON Lines files,
which are then stored in a ZIP or TAR archive.
import jsonl
data = [
# Create `file1.jsonl` within the archive
("file1.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
# Create `path/to/file2.jsonl` within the archive
("path/to/file2.jsonl", [{"name": "Charlie", "age": 35}, {"name": "David", "age": 40}]),
# Append to `file1.jsonl` within the archive
("file1.jsonl", [{"name": "Eve", "age": 28}]),
]
jsonl.dump_archive("archive.zip", data)
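The idea of several JSON Lines members in one archive can be sketched with the standard `zipfile` module. This is an illustration under stated assumptions, not how the library works internally: `write_zip_of_json_lines` is a hypothetical helper, and because a ZIP member cannot be appended to once written, the sketch groups all lines per member name in memory before writing.

```python
import io
import json
import zipfile
from collections import defaultdict


def write_zip_of_json_lines(archive, named_batches):
    """Group batches by member name, then write one JSON Lines member per name.

    Hypothetical helper; it buffers lines in memory, unlike an incremental writer.
    """
    lines_by_name = defaultdict(list)
    for name, items in named_batches:
        lines_by_name[name].extend(json.dumps(item) for item in items)
    with zipfile.ZipFile(archive, "w") as zf:
        for name, lines in lines_by_name.items():
            zf.writestr(name, "\n".join(lines) + "\n")


data = [
    ("file1.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
    ("path/to/file2.jsonl", [{"name": "Charlie", "age": 35}]),
    ("file1.jsonl", [{"name": "Eve", "age": 28}]),  # same name: lines are merged
]

buffer = io.BytesIO()  # in-memory archive, so nothing is written to disk
write_zip_of_json_lines(buffer, data)

with zipfile.ZipFile(buffer) as zf:
    member_names = sorted(zf.namelist())
    file1_lines = zf.read("file1.jsonl").decode("utf-8").splitlines()
print(member_names, len(file1_lines))
```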
Load multiple JSON Lines Files from an Archive (ZIP or TAR)
Use jsonl.load_archive to incrementally load multiple JSON Lines files from a ZIP or TAR archive.
- The archive can be loaded from a URL.
- Files can be filtered using Unix shell-style wildcards.
import jsonl
# Load all JSON Lines files matching the pattern "*.jsonl" from a local archive
for filename, iterator in jsonl.load_archive("archive.zip"):
    print("Filename:", filename)
    print("Data:", tuple(iterator))
# Load all JSON Lines files matching the pattern "*.jsonl" from a remote archive
for filename, iterator in jsonl.load_archive("https://example.com/archive.zip"):
    print("Filename:", filename)
    print("Data:", tuple(iterator))
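Unix shell-style wildcard filtering can be pictured with the standard `fnmatch` and `zipfile` modules. The sketch below is an assumption about the general technique, not the library's code; `iter_zip_json_lines` and its `pattern` parameter are hypothetical names.

```python
import fnmatch
import io
import json
import zipfile


def iter_zip_json_lines(archive, pattern="*.jsonl"):
    """Yield (member name, list of records) for members matching the pattern.

    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if not fnmatch.fnmatch(name, pattern):
                continue  # skip members that do not match the wildcard
            with zf.open(name) as member:
                text = io.TextIOWrapper(member, encoding="utf-8")
                yield name, [json.loads(line) for line in text if line.strip()]


# Build a small in-memory archive to read back.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("file1.jsonl", '{"name": "Alice"}\n{"name": "Bob"}\n')
    zf.writestr("notes.txt", "not JSON Lines\n")
    zf.writestr("path/to/file2.jsonl", '{"name": "Charlie"}\n')

loaded = dict(iter_zip_json_lines(buffer))
only_file1 = dict(iter_zip_json_lines(buffer, pattern="file1*"))
print(sorted(loaded), sorted(only_file1))
```

Note that in `fnmatch`, `*` also matches path separators, so `*.jsonl` selects members in subdirectories too.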
Dumping data to Multiple JSON Lines Files
Use jsonl.dump_fork to incrementally write structured data to multiple JSON Lines files,
which can be useful when you want to separate data based on some criteria.
import jsonl
data = [
# Create `file1.jsonl` or overwrite it if it exists
("file1.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
# Create `file2.jsonl` or overwrite it if it exists
("file2.jsonl", [{"name": "Charlie", "age": 35}, {"name": "David", "age": 40}]),
# Append to `file1.jsonl`
("file1.jsonl", [{"name": "Eve", "age": 28}]),
]
jsonl.dump_fork(data)
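The create-or-overwrite-then-append behavior described in the comments above can be sketched with plain file modes. This is a simplified illustration, not the library's implementation: `write_forked` is a hypothetical helper, and it reopens the target file for every batch, which a real implementation would likely avoid.

```python
import json
import os
import tempfile


def write_forked(named_batches, directory):
    """Route each batch to its file: truncate on first use, append afterwards.

    Hypothetical helper for illustration only.
    """
    seen = set()
    for name, items in named_batches:
        mode = "a" if name in seen else "w"
        seen.add(name)
        with open(os.path.join(directory, name), mode, encoding="utf-8") as fp:
            for item in items:
                fp.write(json.dumps(item) + "\n")


data = [
    ("file1.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
    ("file2.jsonl", [{"name": "Charlie", "age": 35}, {"name": "David", "age": 40}]),
    ("file1.jsonl", [{"name": "Eve", "age": 28}]),
]

directory = tempfile.mkdtemp()  # temporary directory keeps the sketch side-effect free
write_forked(data, directory)

line_counts = {}
for name in ("file1.jsonl", "file2.jsonl"):
    with open(os.path.join(directory, name), encoding="utf-8") as fp:
        line_counts[name] = len(fp.read().splitlines())
print(line_counts)
```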
Documentation
For more detailed information and usage examples, refer to the project documentation.
Development
To contribute to the project, use the commands below to run the tests, the linter, and the documentation build.
First, ensure you have the latest version of pip (the --group option used below requires pip 25.1 or later):
python -m pip install --upgrade pip
Running Unit Tests
Install the development dependencies and run the tests:
pip install --group=test # Install test dependencies
pytest tests/ # Run all tests
pytest --cov jsonl # Run tests with coverage
Running the Linter
pip install --group=lint # Install linter dependencies
ruff check . # Run linter
Building the Documentation
To build the documentation locally, use the following commands:
pip install --group=doc # Install documentation dependencies
mkdocs serve # Start live-reloading docs server
mkdocs build # Build the documentation site
License
This project is licensed under the MIT license.
File details
Details for the file py_jsonl-1.3.18.tar.gz.
File metadata
- Download URL: py_jsonl-1.3.18.tar.gz
- Upload date:
- Size: 14.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.23
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f822ceb07b369df8c28a32a530ffb48d302bdf4b565efa1dcf7d2df356210d13 |
| MD5 | 8fe1e6cf84e3a8072eae87dbaecc12e9 |
| BLAKE2b-256 | ce31b9f88d2a88e384ff409f2ae50c9b850d1376f95b3c09bf2860ef17d8ccca |
File details
Details for the file py_jsonl-1.3.18-py3-none-any.whl.
File metadata
- Download URL: py_jsonl-1.3.18-py3-none-any.whl
- Upload date:
- Size: 8.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.23
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ecdd15a581c328511d2000c7b9ee146b35b24fcbdce4fa46ac9174c681998e33 |
| MD5 | 813b4ca929f90a5c1add3baa3cc3ee59 |
| BLAKE2b-256 | 43eef4dc15240153197a5efa71968cb895086987c5f39f7b85916ca341106113 |