A lightweight Python library for handling jsonlines files.

jsonl

About

jsonl is a lightweight Python library for reading and writing data in the JSON Lines format, in which each line of a file is a separate JSON value.
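
To make the format concrete, here is a minimal sketch of a JSON Lines round trip using only the standard json module. It does not use jsonl and is not how the library is implemented; the sample records are made up for illustration.

```python
import io
import json

records = [{"name": "Gilbert", "wins": 2}, {"name": "May", "wins": 0}]

# Write: one compact JSON document per line, each terminated by "\n".
buffer = io.StringIO()
for record in records:
    buffer.write(json.dumps(record, ensure_ascii=False) + "\n")

text = buffer.getvalue()

# Read: every non-empty line is parsed independently of the others.
decoded = [json.loads(line) for line in text.splitlines() if line.strip()]
```

Because each line stands alone, a reader can process a file one record at a time without holding the whole file in memory.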

Features

  • 🌎 Provides an API similar to Python's standard json module.
  • 🚀 Supports custom (de)serialization via user-defined callbacks.
  • 🗜️ Built-in support for gzip, bzip2, xz compression formats and ZIP or TAR archives.
  • 🔧 Skips malformed lines during file loading.
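
The ideas behind custom deserialization callbacks and skipping malformed lines can be sketched with the standard library alone. The function below is an illustration, not the library's code: `iter_json_lines`, `loads`, and `skip_broken` are hypothetical names chosen for this example and may not match jsonl's actual parameters.

```python
import json
from typing import Any, Callable, Iterable, Iterator


def iter_json_lines(
    lines: Iterable[str],
    loads: Callable[[str], Any] = json.loads,  # hypothetical hook for a custom decoder
    skip_broken: bool = False,  # hypothetical flag, not necessarily jsonl's name
) -> Iterator[Any]:
    """Yield one decoded value per non-empty line."""
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines carry no record
        try:
            yield loads(line)
        except ValueError as error:  # json.JSONDecodeError subclasses ValueError
            if not skip_broken:
                raise ValueError(f"line {number} is not valid JSON") from error
            # Otherwise drop the malformed line and keep going.


sample = ['{"id": 1}\n', '{"id": 2\n', "\n", '{"id": 3}\n']
good = list(iter_json_lines(sample, skip_broken=True))
```

In this sketch the strict mode is the default, so a damaged file fails loudly unless the caller opts in to skipping.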

Installation

To install jsonl using pip, run the following command:

pip install py-jsonl

Getting Started

Dumping data to a JSON Lines File

Use jsonl.dump to incrementally write an iterable of dictionaries to a JSON Lines file:

import jsonl

data = [
    {"name": "Gilbert", "wins": [["straight", "7♣"], ["one pair", "10♥"]]},
    {"name": "May", "wins": []},
]

jsonl.dump(data, "file.jsonl")

Loading data from a JSON Lines File

Use jsonl.load to incrementally load a JSON Lines file into an iterable of objects:

import jsonl

iterable = jsonl.load("file.jsonl")
print(tuple(iterable))
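
For comparison, incremental loading from a gzip-compressed file can be sketched with the standard library as a generator. This is only an illustration of the idea and makes no claim about how jsonl handles compression internally; `read_jsonl_gz` is a hypothetical helper.

```python
import gzip
import json
import os
import tempfile


def read_jsonl_gz(path):
    """Lazily yield one decoded record per non-empty line of a gzip file."""
    # Text mode ("rt") lets gzip handle decoding to str.
    with gzip.open(path, "rt", encoding="utf-8") as stream:
        for line in stream:
            if line.strip():
                yield json.loads(line)


rows = [{"name": "Gilbert"}, {"name": "May"}]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "file.jsonl.gz")

    # Write the records one line at a time.
    with gzip.open(path, "wt", encoding="utf-8") as stream:
        for row in rows:
            stream.write(json.dumps(row) + "\n")

    iterator = read_jsonl_gz(path)
    first = next(iterator)  # only the first record has been decoded so far
    rest = list(iterator)   # the remaining records are decoded on demand
```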

Load multiple JSON Lines Files from an Archive (ZIP or TAR)

Use jsonl.load_archive to incrementally load multiple JSON Lines files from a ZIP or TAR archive. This function allows you to filter files using Unix shell-style wildcards.

import jsonl

# Load all JSON Lines files matching the pattern "*.jsonl" from the archive
for filename, items in jsonl.load_archive("path/to/archive.zip"):
    print("Filename:", filename)
    print("Data:", tuple(items))
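
Filtering archive members with Unix shell-style wildcards can be sketched using the standard zipfile and fnmatch modules. This is an illustration of the matching idea only, not the library's implementation; the archive contents are made up.

```python
import fnmatch
import io
import json
import zipfile

# Build a small ZIP archive in memory so the example is self-contained.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("users.jsonl", '{"name": "Alice"}\n{"name": "Bob"}\n')
    zf.writestr("logs/events.jsonl", '{"event": "login"}\n')
    zf.writestr("README.txt", "not JSON Lines\n")

pattern = "*.jsonl"
matched = {}
with zipfile.ZipFile(archive) as zf:
    for name in zf.namelist():
        # fnmatchcase compares case-sensitively on every platform.
        if fnmatch.fnmatchcase(name, pattern):
            with zf.open(name) as member:
                lines = io.TextIOWrapper(member, encoding="utf-8")
                matched[name] = [json.loads(line) for line in lines if line.strip()]
```

Note that in fnmatch the `*` wildcard also matches the `/` separator, so `*.jsonl` selects members in subdirectories too.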

Dump multiple JSON Lines Files into an Archive (ZIP or TAR)

Use jsonl.dump_archive to incrementally write structured data to multiple .jsonl files, which are then stored in a ZIP or TAR archive.

import jsonl

data = [
    # Create `file1.jsonl` within the archive
    ("file1.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
    # Create `path/to/file2.jsonl` within the archive
    ("path/to/file2.jsonl", [{"name": "Charlie", "age": 35}, {"name": "David", "age": 40}]),
    # Append to `file1.jsonl` within the archive
    ("file1.jsonl", [{"name": "Eve", "age": 28}]),
]
jsonl.dump_archive("my_archive.zip", data)

Dumping data to Multiple JSON Lines Files

Use jsonl.dump_fork to incrementally write structured data to multiple .jsonl files, which can be useful when you want to separate data based on some criteria.

import jsonl

data = [
    # Create `file1.jsonl` or overwrite it if it exists
    ("file1.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
    # Create `file2.jsonl` or overwrite it if it exists
    ("file2.jsonl", [{"name": "Charlie", "age": 35}, {"name": "David", "age": 40}]),
    # Append to `file1.jsonl`
    ("file1.jsonl", [{"name": "Eve", "age": 28}]),
]
jsonl.dump_fork(data)
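
When the split criterion is a field of each record, the (filename, items) pairs can be built programmatically. The sketch below only prepares data in the same shape as the example above and does not call jsonl; `fork_by`, the `team` field, and the filename template are hypothetical.

```python
from collections import defaultdict

people = [
    {"name": "Alice", "age": 30, "team": "red"},
    {"name": "Bob", "age": 25, "team": "blue"},
    {"name": "Eve", "age": 28, "team": "red"},
]


def fork_by(records, key, template="{}.jsonl"):
    """Group records into (filename, items) pairs based on one field."""
    groups = defaultdict(list)
    for record in records:
        groups[template.format(record[key])].append(record)
    # Dicts preserve insertion order, so files appear in first-seen order.
    return list(groups.items())


pairs = fork_by(people, "team")
```

The resulting `pairs` list has the same structure as `data` in the example above.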

Documentation

For more detailed information and usage examples, refer to the project documentation.

Development

To contribute to the project, use the commands below to run the tests, run the linter, and build the documentation.

First, make sure pip is up to date (the `--group` option used below requires pip 25.1 or newer):

python -m pip install --upgrade pip

Running Unit Tests

Install the development dependencies and run the tests:

pip install --group=test  # Install test dependencies
pytest tests/ # Run all tests
pytest --cov jsonl # Run tests with coverage

Running Linter

pip install --group=lint  # Install linter dependencies
ruff check . # Run linter

Building the Documentation

To build the documentation locally, use the following commands:

pip install --group=doc  # Install documentation dependencies
mkdocs serve # Start live-reloading docs server
mkdocs build # Build the documentation site

License

This project is licensed under the MIT license.

Download files

Source Distribution

py_jsonl-1.3.14.tar.gz (12.7 kB)

Uploaded Source

Built Distribution

py_jsonl-1.3.14-py3-none-any.whl (7.9 kB)

Uploaded Python 3

File details

Details for the file py_jsonl-1.3.14.tar.gz.

File metadata

  • Download URL: py_jsonl-1.3.14.tar.gz
  • Upload date:
  • Size: 12.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.23

File hashes

Hashes for py_jsonl-1.3.14.tar.gz
Algorithm Hash digest
SHA256 49f4fcd2d382a0c2012bc589a81fe241d5a7ca610c0fda9a85113c4fffb4c45d
MD5 3ef488e8556d7a9ce22c72555c0db972
BLAKE2b-256 5faed5300ec56a3f835f83552d6f5615feaabd6f801b5447c6d17fc60c5740b3

File details

Details for the file py_jsonl-1.3.14-py3-none-any.whl.

File metadata

  • Download URL: py_jsonl-1.3.14-py3-none-any.whl
  • Upload date:
  • Size: 7.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.23

File hashes

Hashes for py_jsonl-1.3.14-py3-none-any.whl
Algorithm Hash digest
SHA256 d73763828b331e11473e6467417e2ffb7882c09f51f0c1819fac73c3c20ad225
MD5 9f75bf5cf2fd1edfdb8667f3a88dbb0d
BLAKE2b-256 82adc0daa7ae751485154fd857f7df98f81f23401e3f7696cfeeebfe98c249fe
