
A lightweight Python library for handling jsonlines files.


jsonl

A lightweight, dependency-free Python library for JSON Lines — read, write, compress, and stream with ease.


Documentation · Changelog · Issues


jsonl provides a simple, Pythonic API for working with JSON Lines data. It follows the conventions of Python's standard json module — if you know json.dump and json.load, you already know how to use jsonl.

Fully compliant with the jsonlines and ndjson specifications.

Features

| Feature | Description |
| --- | --- |
| 🌎 Familiar API | Interface similar to the standard json module (dump, load, dumps) |
| Streaming by default | Read and write incrementally via iterators, keeping memory usage low |
| 🗜️ Built-in compression | Transparent support for gzip, bzip2, xz, and zst (Python ≥ 3.14) |
| 📦 Archive support | Read and write ZIP and TAR archives (.tar.gz, .tar.bz2, .tar.xz, and, on Python ≥ 3.14, .tar.zst) |
| 📥 Load from URLs | Pass a URL directly to load() or load_archive() |
| 🚀 Pluggable serialization | Swap in orjson, or any JSON library |
| 🔧 Error tolerance | Optionally skip malformed lines instead of crashing |
| 🐍 Zero dependencies | Uses only the Python standard library, nothing else |

Installation

pip install py-jsonl

Requires Python 3.8+. No external dependencies.

Quick Start

Write

import jsonl

data = [
    {"name": "Gilbert", "wins": [["straight", "7♣"], ["one pair", "10♥"]]},
    {"name": "May", "wins": []},
]

jsonl.dump(data, "players.jsonl")

Read

import jsonl

for item in jsonl.load("players.jsonl"):
    print(item)

Read from a URL

import jsonl

for item in jsonl.load("https://example.com/data.jsonl"):
    print(item)
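
To illustrate what streaming from a URL involves, here is a minimal standard-library sketch. It is not jsonl's implementation; the helper name `iter_jsonl_from_url` is hypothetical, and the sketch assumes an uncompressed, UTF-8 encoded response body.

```python
import json
import urllib.request


def iter_jsonl_from_url(url, encoding="utf-8"):
    """Yield one decoded object per non-empty line of the response body.

    Hypothetical helper for illustration only, not part of jsonl.
    """
    # urlopen returns a file-like response that can be iterated line by line,
    # so the whole body never has to be held in memory at once.
    with urllib.request.urlopen(url) as response:
        for raw_line in response:
            line = raw_line.decode(encoding).strip()
            if line:  # ignore blank lines
                yield json.loads(line)
```

Because the function is a generator, nothing is fetched until the first item is requested, and the connection is closed once iteration finishes.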

Compressed files

The compression format is determined automatically — by file extension when writing, and by magic numbers when reading if the file extension is not recognized:

import jsonl

data = [{"key": "value"}]

jsonl.dump(data, "file.jsonl.gz")  # gzip
jsonl.dump(data, "file.jsonl.bz2")  # bzip2
jsonl.dump(data, "file.jsonl.xz")  # xz
jsonl.dump(data, "file.jsonl.zst")  # zst (Python ≥ 3.14) 

for item in jsonl.load("file.jsonl.gz"):
    print(item)
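
The idea of choosing a codec by file extension can be sketched with the standard library alone. This is an assumption-laden illustration, not the library's code: `open_by_extension`, `write_jsonl`, and `read_jsonl` are hypothetical names, and zst is left out because it needs Python ≥ 3.14.

```python
import bz2
import gzip
import json
import lzma
from pathlib import Path

# Map a file suffix to the stdlib function that opens that format.
_OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


def open_by_extension(path, mode):
    """Open `path` in text mode, compressed or plain depending on its suffix."""
    opener = _OPENERS.get(Path(path).suffix.lower(), open)
    return opener(path, mode + "t", encoding="utf-8")


def write_jsonl(items, path):
    """Write one JSON document per line."""
    with open_by_extension(path, "w") as fp:
        for item in items:
            fp.write(json.dumps(item) + "\n")


def read_jsonl(path):
    """Lazily yield the objects stored in `path`."""
    with open_by_extension(path, "r") as fp:
        for line in fp:
            if line.strip():
                yield json.loads(line)
```

Only the last suffix is inspected here, so `file.jsonl.gz` is treated as gzip and `file.jsonl` as plain text.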

Archives (ZIP / TAR)

import jsonl

# Write multiple files into an archive
data = [
    ("users.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
    ("orders.jsonl", [{"id": 1, "total": 99.90}, {"id": 2, "total": 45.00}]),
]
jsonl.dump_archive("data.tar.gz", data)

# Read them back
for filename, items in jsonl.load_archive("data.tar.gz"):
    print(f"--- {filename} ---")
    for item in items:
        print(item)
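
For readers curious how (filename, items) pairs map onto archive members, the following is a rough standard-library sketch for the ZIP case. It is not how jsonl implements dump_archive or load_archive; `write_zip` and `read_zip` are hypothetical names, and unlike a streaming reader this sketch collects each member's items into a list.

```python
import io
import json
import zipfile


def write_zip(path, named_groups):
    """Store each (member_name, items) pair as one JSON Lines member."""
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for member_name, items in named_groups:
            body = "".join(json.dumps(item) + "\n" for item in items)
            archive.writestr(member_name, body)


def read_zip(path):
    """Yield (member_name, list_of_items) for every .jsonl member."""
    with zipfile.ZipFile(path) as archive:
        for member_name in archive.namelist():
            if not member_name.endswith(".jsonl"):
                continue  # skip members that are not JSON Lines files
            with archive.open(member_name) as raw:
                lines = io.TextIOWrapper(raw, encoding="utf-8")
                yield member_name, [json.loads(line) for line in lines if line.strip()]
```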

Multiple output files

import jsonl

data = [
    ("file1.jsonl", [{"name": "Alice"}, {"name": "Bob"}]),
    ("file2.jsonl", [{"name": "Charlie"}]),
    ("file1.jsonl", [{"name": "Eve"}]),  # appended to file1.jsonl
]

jsonl.dump_fork(data)
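
The fan-out behaviour shown above, where a path that appears twice receives the lines from both groups, can be approximated with the standard library as follows. This is a sketch under assumptions, not dump_fork itself: `write_forked` is a hypothetical name, and the sketch truncates each target the first time it is opened, which may or may not match what the library does with files that already exist on disk.

```python
import json
from contextlib import ExitStack


def write_forked(named_groups):
    """Write each (path, items) pair; a path seen again receives more lines."""
    handles = {}
    with ExitStack() as stack:
        for path, items in named_groups:
            if path not in handles:
                # Open each target once and keep it open until the end.
                handles[path] = stack.enter_context(
                    open(path, "w", encoding="utf-8")
                )
            for item in items:
                handles[path].write(json.dumps(item) + "\n")
```

Keeping the handles in an ExitStack means every file is closed together, even if one of the groups raises partway through.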

API Overview

Reading

| Function | Description |
| --- | --- |
| jsonl.load(source, **kw) | Read from a file, URL, or file-like object |
| jsonl.load_archive(file, **kw) | Unpack JSON Lines files from a ZIP or TAR archive |
| jsonl.loader(stream, broken, **kw) | Low-level generator deserializing a line stream |

Tip: All read functions accept cls and **kwargs for custom decoding.
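
As a rough picture of what a low-level, error-tolerant line generator does, consider the standard-library sketch below. The name `iter_json_lines` and the `skip_broken` flag are hypothetical and are not claimed to match the signature or semantics of jsonl.loader and its broken parameter.

```python
import json


def iter_json_lines(lines, skip_broken=False):
    """Yield one object per non-empty line, optionally skipping bad lines."""
    for line in lines:
        text = line.strip()
        if not text:
            continue  # blank lines carry no data
        try:
            yield json.loads(text)
        except json.JSONDecodeError:
            if not skip_broken:
                raise  # strict mode: surface the first malformed line
            # tolerant mode: drop the malformed line and keep going
```

Any iterable of strings works as input, including an open text file, so the generator never needs more than one line in memory.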

Writing

| Function | Description |
| --- | --- |
| jsonl.dump(iterable, file, **kw) | Write objects to a JSON Lines file |
| jsonl.dumps(iterable, **kw) | Serialize to a JSON Lines string |
| jsonl.dump_fork(paths, **kw) | Write to multiple JSON Lines files at once |
| jsonl.dump_archive(path, data, **kw) | Pack multiple JSON Lines files into a ZIP or TAR archive |
| jsonl.dumper(iterable, **kw) | Low-level generator yielding formatted lines |

Tip: All write functions accept cls and **kwargs for custom encoding.
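
The relationship between a line-yielding generator and a string serializer can be sketched with the standard json module. The names `iter_lines` and `to_jsonl_string` are hypothetical stand-ins, not the library's dumper and dumps, and the sketch assumes each line ends with a single "\n".

```python
import json


def iter_lines(items, **json_kwargs):
    """Yield each item as one newline-terminated JSON document."""
    for item in items:
        yield json.dumps(item, **json_kwargs) + "\n"


def to_jsonl_string(items, **json_kwargs):
    """Join the generated lines into a single JSON Lines string."""
    return "".join(iter_lines(items, **json_kwargs))
```

Building the string on top of the generator keeps a single code path for formatting, whether the lines go to a file, a socket, or memory.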

For complete parameter documentation, see the full docs →

Custom Serialization

Plug in any JSON-compatible serializer.

For example, orjson for high-performance encoding:

import orjson  # ensure orjson is installed: pip install orjson
import jsonl

data = [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]

# Write with orjson (returns bytes → set text_mode=False)
jsonl.dump(data, "fast.jsonl", text_mode=False, cls=orjson.dumps)

# Read with orjson
for item in jsonl.load("fast.jsonl", cls=orjson.loads):
    print(item)

Another example: passing json.JSONEncoder and json.JSONDecoder subclasses as cls to customize how values are written and read:

import datetime
import decimal
import json

import jsonl


class UpperDecoder(json.JSONDecoder):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, object_hook=self.object_hook, parse_float=decimal.Decimal, **kwargs)

    def object_hook(self, obj):
        return {k.upper(): v for k, v in obj.items()}


class ISODateEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime.date):
            return obj.isoformat()
        return super().default(obj)


data = [
    {"name": "Alice", "birthdate": datetime.date(2000, 1, 1)},
    {"name": "Bob", "birthdate": datetime.date(2005, 1, 1)}
]

# Write using a custom encoder to serialize date objects as ISO strings
jsonl.dump(data, "file.jsonl", cls=ISODateEncoder)

# Read using a custom decoder to convert floats into Decimal and uppercase all keys
for item in jsonl.load("file.jsonl", cls=UpperDecoder):
    print(item)

Keyword arguments are forwarded to the underlying serializer:

import jsonl

data = [{"name": "Alice", "score": 9.5}, {"name": "Bob", "score": 7.2}]

jsonl.dump(data, "compact.jsonl", separators=(",", ":"))  # compact output
jsonl.dump(data, "sorted.jsonl", sort_keys=True)  # deterministic keys
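
To see what these two options do to a single line, the snippet below calls the standard json.dumps directly. It assumes the default serializer behaves like json.dumps, which is consistent with the forwarding described above but is not verified against the library's source.

```python
import json

record = {"score": 9.5, "name": "Alice"}

default_line = json.dumps(record)
compact_line = json.dumps(record, separators=(",", ":"))  # no spaces after , and :
sorted_line = json.dumps(record, sort_keys=True)  # keys in alphabetical order

print(default_line)  # {"score": 9.5, "name": "Alice"}
print(compact_line)  # {"score":9.5,"name":"Alice"}
print(sorted_line)   # {"name": "Alice", "score": 9.5}
```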

Supported Formats

| Type | Extensions |
| --- | --- |
| Plain | .jsonl |
| Compressed | .jsonl.gz, .jsonl.bz2, .jsonl.xz, .jsonl.zst (Python ≥ 3.14) |
| ZIP archive | .zip |
| TAR archive | .tar, .tar.gz, .tar.bz2, .tar.xz, .tar.zst (Python ≥ 3.14) |

When reading, if the file extension is not recognized, jsonl falls back to magic-number detection to identify the compression format automatically.
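
Magic-number detection generally means comparing the first few bytes of a file against known format signatures. The sketch below shows the technique with the well-known signatures for gzip, bzip2, xz, and Zstandard; `sniff_compression` is a hypothetical helper and not the library's detection code.

```python
# Leading byte signatures of common compression formats.
_MAGIC = (
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"\x28\xb5\x2f\xfd", "zstd"),
)


def sniff_compression(path):
    """Return the detected format name, or None for unrecognized data."""
    with open(path, "rb") as fp:
        head = fp.read(6)  # the longest signature above is 6 bytes
    for magic, name in _MAGIC:
        if head.startswith(magic):
            return name
    return None
```

A plain JSON Lines file normally starts with a character such as `{` or `[`, so it matches none of these signatures and falls through to None.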

Contributing

# Install dev dependencies
pip install --group=test --upgrade

# Run tests
python -Wd -m pytest tests/
python -Wd -m pytest tests/ --cov  # run with coverage reporting

# Lint
pip install --group=lint --upgrade
ruff check .

# Docs
pip install --group=doc --upgrade

# zensical usage: https://zensical.org/docs/usage/
zensical build 
zensical serve

License

MIT — see LICENSE for details.
