Reading JSON lines (jl) files

This is a tiny library for reading JSON lines (.jl) files, including gzipped and broken files.

JSON lines is a text file format in which each line is a single JSON-encoded item.

Why?

Reading a well-formed JSON lines file is a one-liner in Python. But if the file may be broken (which happens when the process writing it is killed), handling all the exceptions takes about 10x more code, especially when the file is compressed.
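To make the contrast concrete, here is a minimal sketch that uses only the standard library, not this package. The helper name read_well_formed is hypothetical. It shows the one-liner for a well-formed file and how the same one-liner fails on input that was cut off in the middle of an item:

```python
import io
import json


def read_well_formed(f):
    """Parse every line of an open file as JSON, with no error handling."""
    return [json.loads(line) for line in f]


# Well-formed input: every line is a complete JSON item.
good = io.StringIO('{"x": 1}\n{"x": 2}\n')
print(read_well_formed(good))

# Truncated input: the writer was killed in the middle of the second item.
truncated = io.StringIO('{"x": 1}\n{"x": ')
try:
    read_well_formed(truncated)
except json.JSONDecodeError as exc:
    # The whole read fails, and the first (valid) item is lost with it.
    print("failed on truncated input:", exc.msg)
```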

Installation

pip install json-lines

Usage

To read a well-formed JSON lines file, pass an open file as the first argument to json_lines.reader. The file can be opened in text or binary mode, but if it’s opened in text mode, the encoding must be set correctly:

import json_lines

with open('file.jl', 'rb') as f:
    for item in json_lines.reader(f):
        print(item['x'])
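As a rough illustration of why the mode matters, the sketch below uses only the standard library's json module rather than json_lines itself; how json_lines handles decoding internally is not shown here. The temporary file and the sample item are made up for the example. In binary mode, json.loads receives bytes and expects UTF-8, UTF-16 or UTF-32 input; in text mode, decoding depends on the encoding passed to open:

```python
import json
import os
import tempfile

# Write a small JSON lines file containing a non-ASCII character.
path = os.path.join(tempfile.mkdtemp(), "file.jl")
with open(path, "w", encoding="utf-8") as f:
    f.write(json.dumps({"x": "café"}, ensure_ascii=False) + "\n")

# Binary mode: each line is bytes, and json.loads decodes it itself.
with open(path, "rb") as f:
    from_binary = [json.loads(line) for line in f]

# Text mode: open() does the decoding, so the encoding must match the file.
with open(path, "r", encoding="utf-8") as f:
    from_text = [json.loads(line) for line in f]

print(from_binary == from_text)
```

Opening the same file in text mode with a mismatched encoding would either raise a decoding error or silently produce wrong characters, which is why binary mode is the safer default when the encoding is not known.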

There is also a helper function, json_lines.open_file, that recognizes the “.gz” and “.gzip” extensions and opens such files with gzip:

with json_lines.open_file('file.jl.gz') as f:
    for item in json_lines.reader(f):
        print(item['x'])
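For readers curious what such a helper might look like, here is a minimal sketch built on the standard library. The function name open_maybe_gzip is hypothetical, and this is an assumption about the general approach rather than the package's actual implementation:

```python
import gzip
import json
import os
import tempfile


def open_maybe_gzip(path):
    """Hypothetical helper: pick the opener from the file extension.

    Files ending in .gz or .gzip are opened with gzip, everything else
    with the built-in open(). Both are opened in binary mode.
    """
    if path.endswith((".gz", ".gzip")):
        return gzip.open(path, "rb")
    return open(path, "rb")


# Create a small gzipped JSON lines file to read back.
gz_path = os.path.join(tempfile.mkdtemp(), "file.jl.gz")
with gzip.open(gz_path, "wb") as f:
    f.write(b'{"x": 1}\n{"x": 2}\n')

with open_maybe_gzip(gz_path) as f:
    items = [json.loads(line) for line in f]
print(items)
```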

Handling broken (truncated) files: with broken=True, the reader keeps going for as long as it can decode the compressed stream and parse JSON, then stops silently at the first error, logging only a warning:

with json_lines.open_file('file.jl.gz') as f:
    for item in json_lines.reader(f, broken=True):
        print(item['x'])
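The behaviour described above can be approximated with the standard library alone. The following is a sketch under stated assumptions, not the package's code: the generator name read_until_broken, the set of exceptions caught, and the log messages are all choices made for this example.

```python
import gzip
import io
import json
import logging
import zlib

logger = logging.getLogger(__name__)


def read_until_broken(f):
    """Hypothetical sketch: yield items until the stream or the JSON breaks."""
    lines = iter(f)
    while True:
        try:
            line = next(lines)
        except StopIteration:
            return  # clean end of file
        except (EOFError, OSError, zlib.error) as exc:
            # The compressed stream is truncated or corrupt.
            logger.warning("stopped reading, stream is broken: %s", exc)
            return
        try:
            item = json.loads(line)
        except ValueError as exc:
            # json.JSONDecodeError and UnicodeDecodeError are both ValueErrors.
            logger.warning("stopped reading, line is not valid JSON: %s", exc)
            return
        yield item


# A plain file cut off in the middle of its third item.
plain = io.BytesIO(b'{"x": 1}\n{"x": 2}\n{"x": ')
plain_items = list(read_until_broken(plain))
print(plain_items)

# Gzipped data with the second half of the compressed bytes missing.
original = [{"x": i} for i in range(1000)]
payload = "".join(json.dumps(o) + "\n" for o in original).encode("utf-8")
blob = gzip.compress(payload)
truncated = gzip.GzipFile(fileobj=io.BytesIO(blob[: len(blob) // 2]))
gz_items = list(read_until_broken(truncated))
print(len(gz_items), "of", len(original), "items recovered")
```

Stopping at the first error, rather than skipping the bad line and continuing, matches the truncated-file case: once the writer has been killed, nothing after the break is expected to be valid.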

License

License is MIT.

Download files

Source Distribution

json-lines-0.1.0.tar.gz (2.4 kB)

Uploaded Source

Built Distribution

json_lines-0.1.0-py2.py3-none-any.whl (4.2 kB)

Uploaded Python 2 Python 3
