
jsonl



About

jsonl is a lightweight Python library designed to simplify working with JSON Lines data, adhering to the jsonlines and ndjson specifications.

🎯 Features

  • 🌎 Provides an API similar to Python's standard json module.
  • 🚀 Supports custom (de)serialization via user-defined callbacks.
  • 🗜️ Built-in support for gzip, bzip2, and xz compression, as well as ZIP and TAR archives.
  • 🔧 Skips malformed lines during file loading.
  • 📥 Loads from URLs directly.
  • 🐍 No external dependencies: relies only on the Python standard library.
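The format behind all of these features is deliberately simple: one complete JSON document per line, newline-terminated. A stdlib-only sketch of the round trip (this illustrates the format, not the library's internals):

```python
import json

records = [{"name": "Gilbert", "score": 24}, {"name": "May", "score": 0}]

# Serialize: one compact JSON document per line, newline-terminated
text = "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)

# Parse it back, line by line
parsed = [json.loads(line) for line in text.splitlines()]
```

Because each line stands alone, files can be streamed, appended to, and split without re-parsing the whole document.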

📦 Installation

To install jsonl using pip, run the following command:

pip install py-jsonl

⚡ Quick Start

Dumping Data to a JSON Lines File

[!NOTE]

Use jsonl.dump to incrementally write an iterable of dictionaries to a JSON Lines file:

# -*- coding: utf-8 -*-

import jsonl

data = [
    {"name": "Gilbert", "wins": [["straight", "7♣"], ["one pair", "10♥"]]},
    {"name": "May", "wins": []},
]

jsonl.dump(data, "file.jsonl")
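Conceptually, `jsonl.dump` streams the iterable to disk one record at a time, so the whole dataset never has to fit in memory. A roughly equivalent stdlib sketch (the `dump_jsonl` helper is illustrative, not the library's implementation):

```python
import json
import os
import tempfile

def dump_jsonl(iterable, path):
    # Stream the iterable to disk: one compact JSON document per line
    with open(path, "w", encoding="utf-8") as fp:
        for item in iterable:
            fp.write(json.dumps(item, ensure_ascii=False) + "\n")

data = [
    {"name": "Gilbert", "wins": [["straight", "7♣"], ["one pair", "10♥"]]},
    {"name": "May", "wins": []},
]
path = os.path.join(tempfile.mkdtemp(), "file.jsonl")
dump_jsonl(data, path)

with open(path, encoding="utf-8") as fp:
    lines = fp.read().splitlines()
```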

Loading Data from a JSON Lines Source

[!NOTE]

Use jsonl.load to incrementally load a JSON Lines source (a filename, URL, or file-like object) as an iterator of dictionaries:

# -*- coding: utf-8 -*-

import jsonl

# Load data from a JSON Lines file
iterator = jsonl.load("file.jsonl")
print(tuple(iterator))

# Load data from a URL
iterator = jsonl.load("https://example.com/file.jsonl")
print(tuple(iterator))
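The feature list above mentions that malformed lines are skipped during loading. A stdlib sketch of that behavior (the `load_jsonl_lines` helper is illustrative; the library's actual error handling may differ):

```python
import json

def load_jsonl_lines(lines):
    # Yield one parsed object per line, skipping blank and malformed lines
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue  # malformed line: skip it instead of raising

raw = ['{"name": "Gilbert"}', "not json at all", "", '{"name": "May"}']
result = list(load_jsonl_lines(raw))
```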

Dumping Multiple JSON Lines Files into an Archive (ZIP or TAR)

[!NOTE]

Use jsonl.dump_archive to incrementally write structured data to multiple JSON Lines files, which are then stored in a ZIP or TAR archive.

# -*- coding: utf-8 -*-

import jsonl

data = [
    # Create `file1.jsonl` within the archive
    ("file1.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
    # Create `file2.jsonl` within the archive
    ("path/to/file2.jsonl", [{"name": "Charlie", "age": 35}, {"name": "David", "age": 40}]),
    # Append to `file1.jsonl` within the archive
    ("file1.jsonl", [{"name": "Eve", "age": 28}]),
]
jsonl.dump_archive("archive.zip", data)
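Note that repeating a filename in the input appends to that member rather than creating a duplicate entry. Since ZIP members cannot be appended to in place, one way to get these semantics is to buffer lines per member name before writing, as in this stdlib sketch (the `dump_archive` helper is illustrative, not the library's implementation):

```python
import io
import json
import os
import tempfile
import zipfile

def dump_archive(archive_path, named_iterables):
    # Buffer lines per member name so a repeated name appends to the
    # same member instead of creating a duplicate archive entry
    buffers = {}
    for name, items in named_iterables:
        buf = buffers.setdefault(name, io.StringIO())
        for item in items:
            buf.write(json.dumps(item, ensure_ascii=False) + "\n")
    with zipfile.ZipFile(archive_path, "w") as zf:
        for name, buf in buffers.items():
            zf.writestr(name, buf.getvalue())

data = [
    ("file1.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
    ("path/to/file2.jsonl", [{"name": "Charlie", "age": 35}]),
    ("file1.jsonl", [{"name": "Eve", "age": 28}]),
]
archive = os.path.join(tempfile.mkdtemp(), "archive.zip")
dump_archive(archive, data)

with zipfile.ZipFile(archive) as zf:
    members = sorted(zf.namelist())
    file1_lines = zf.read("file1.jsonl").decode("utf-8").splitlines()
```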

Loading Multiple JSON Lines Files from an Archive (ZIP or TAR)

[!NOTE]

Use jsonl.load_archive to incrementally load multiple JSON Lines files from a ZIP or TAR archive.


# -*- coding: utf-8 -*-

import jsonl

# Load all JSON Lines files matching the pattern "*.jsonl" from a local archive
for filename, iterator in jsonl.load_archive("archive.zip"):
    print("Filename:", filename)
    print("Data:", tuple(iterator))

# Load all JSON Lines files matching the pattern "*.jsonl" from a remote archive
for filename, iterator in jsonl.load_archive("https://example.com/archive.zip"):
    print("Filename:", filename)
    print("Data:", tuple(iterator))
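The `(filename, iterator)` pairs come from matching archive members against a glob pattern. A stdlib sketch of that shape for ZIP archives (the `load_archive` helper and its `pattern` parameter are illustrative, not the library's API):

```python
import fnmatch
import io
import json
import zipfile

def load_archive(archive_path, pattern="*.jsonl"):
    # Yield (member_name, list_of_objects) for each member matching the pattern
    with zipfile.ZipFile(archive_path) as zf:
        for name in zf.namelist():
            if fnmatch.fnmatch(name, pattern):
                lines = zf.read(name).decode("utf-8").splitlines()
                yield name, [json.loads(line) for line in lines if line]

# Build a small in-memory archive to demonstrate
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.jsonl", '{"x": 1}\n{"x": 2}\n')
    zf.writestr("readme.txt", "not jsonl")
buf.seek(0)

result = dict(load_archive(buf))
```

Only members matching the pattern are yielded; the `readme.txt` member above is ignored.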

Dumping Data to Multiple JSON Lines Files

[!NOTE]

Use jsonl.dump_fork to incrementally write structured data to multiple JSON Lines files, which is useful when you need to partition records by some criterion.

# -*- coding: utf-8 -*-

import jsonl

data = [
    # Create `file1.jsonl` or overwrite it if it exists
    ("file1.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
    # Create `file2.jsonl` or overwrite it if it exists
    ("file2.jsonl", [{"name": "Charlie", "age": 35}, {"name": "David", "age": 40}]),
    # Append to `file1.jsonl`
    ("file1.jsonl", [{"name": "Eve", "age": 28}]),
]
jsonl.dump_fork(data)
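As the comments in the example show, the first occurrence of a path overwrites the file and later occurrences append to it. A stdlib sketch of those semantics (the `dump_fork` helper and its extra `root` directory parameter are illustrative, not the library's API):

```python
import json
import os
import tempfile

def dump_fork(named_iterables, root):
    # The first time a path appears the file is created (or overwritten);
    # later appearances append to it
    seen = set()
    for name, items in named_iterables:
        path = os.path.join(root, name)
        mode = "a" if name in seen else "w"
        seen.add(name)
        with open(path, mode, encoding="utf-8") as fp:
            for item in items:
                fp.write(json.dumps(item, ensure_ascii=False) + "\n")

root = tempfile.mkdtemp()
dump_fork(
    [
        ("file1.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
        ("file2.jsonl", [{"name": "Charlie", "age": 35}]),
        ("file1.jsonl", [{"name": "Eve", "age": 28}]),
    ],
    root,
)

with open(os.path.join(root, "file1.jsonl"), encoding="utf-8") as fp:
    file1_lines = fp.read().splitlines()
```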

📚 Documentation

For more detailed information and usage examples, refer to the project documentation.

🛠️ Development

To contribute to the project, you can run the following commands for testing and documentation:

First, ensure you have the latest version of pip:

python -m pip install --upgrade pip

Running Unit Tests

Install the development dependencies and run the tests:

pip install --group=test --upgrade # Install test dependencies, skip if already installed
python -m pytest tests/ # Run all tests
python -m pytest tests/ --cov # Run tests with coverage

Running Linters

pip install --group=lint --upgrade # Install lint dependencies, skip if already installed
ruff check . # Run linter
spxl . # Run sphinx-linter for docstring issues
pymport . # Check for import issues

Building the Documentation

To build the documentation locally, use the following commands:

pip install --group=doc --upgrade # Install doc dependencies, skip if already installed
mkdocs serve # Start live-reloading docs server
mkdocs build # Build the documentation site

🗒️ License

This project is licensed under the MIT license.
