A simple Python library for handling jsonlines files.
jsonl
About
Useful functions for working with jsonlines data, as described at https://jsonlines.org/
Features:
- Exposes an API similar to the json module from the standard library.
- Supports the orjson and ujson libraries or the standard json module for serialization/deserialization, prioritizing orjson, then ujson, and defaulting to the standard json module if neither is installed.
- Supports gzip and bzip2 compression formats.
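The backend priority described above can be sketched with guarded imports. This is a minimal illustration under stated assumptions, not the library's actual code: `pick_json_backend` is a hypothetical helper name, and the sketch decodes the output of `orjson.dumps` because that function returns bytes rather than str.

```python
import json


def pick_json_backend():
    """Return (name, dumps, loads) for the first backend that can be imported."""
    try:
        import orjson  # optional third-party dependency

        # orjson.dumps returns bytes, so decode to get a str like the others.
        return "orjson", lambda obj: orjson.dumps(obj).decode("utf-8"), orjson.loads
    except ImportError:
        pass
    try:
        import ujson  # optional third-party dependency

        return "ujson", ujson.dumps, ujson.loads
    except ImportError:
        pass
    # Neither optional library is installed: use the standard library.
    return "json", json.dumps, json.loads


name, dumps, loads = pick_json_backend()
roundtrip = loads(dumps({"foo": 1}))
```

Whichever backend is picked, a value serialized with `dumps` should deserialize back to an equal object with `loads`.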
Installation (via pip)
pip install py-jsonl
Usage
dumps
Serialize an iterable into a jsonlines formatted string.
dumps(iterable, **kwargs)
:param Iterable[Any] iterable: Iterable of objects
:param kwargs: `json.dumps` kwargs
:rtype: str
Examples:
import jsonl
data = ({'foo': 1}, {'bar': 2})
result = jsonl.dumps(data)
print(result) # >> '{"foo": 1}\n{"bar": 2}\n'
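For intuition, the output format shown above can be reproduced with the standard library alone. The sketch below is an assumption about the format only (one JSON document per line, each terminated by a newline), not the library's implementation; `dumps_sketch` is a hypothetical name.

```python
import json


def dumps_sketch(iterable, **kwargs):
    """Serialize each object on its own line, each line ending with a newline."""
    return "".join(json.dumps(obj, **kwargs) + "\n" for obj in iterable)


sketch_result = dumps_sketch(({"foo": 1}, {"bar": 2}))
```

Because every line is a complete JSON document, the string can be split on newlines and each piece parsed independently.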
dump
Dump an iterable to a jsonlines file.
- Use a .gz, .gzip, or .bz2 extension to write a compressed file.
- Serialization falls back through the following functions, in order: orjson.dumps, ujson.dumps, and json.dumps.
dump(iterable, file, **kwargs)
:param Iterable[Any] iterable: Iterable of objects
:param Union[str, bytes, os.PathLike, io.IOBase] file: File to dump
:param kwargs: `json.dumps` kwargs
Examples:
import gzip
import jsonl
data = ({'foo': 1}, {'bar': 2})
# Dump the data into an uncompressed file at the given path.
jsonl.dump(data, "file1.jsonl")
# Dump the data into a gzipped file at the given path.
jsonl.dump(data, "file2.jsonl.gz")
# Dump the data into the already opened gzipped file.
with gzip.open("file3.jsonl.gz", mode="wb") as fp:
    jsonl.dump(data, fp)
# Append the data to the end of the existing gzipped file.
with gzip.open("file3.jsonl.gz", mode="ab") as fp:
    jsonl.dump(data, fp)
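One way the extension-based compression choice could work is sketched below with the standard library only. This is an illustration under assumptions, not the library's code: `open_by_extension` and `dump_sketch` are hypothetical names, and the sketch always uses `json.dumps` instead of the orjson/ujson fallback chain.

```python
import bz2
import gzip
import json
import os
import tempfile


def open_by_extension(path, mode):
    """Open `path` in text mode, choosing the opener from its extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext in (".gz", ".gzip"):
        return gzip.open(path, mode + "t", encoding="utf-8")
    if ext == ".bz2":
        return bz2.open(path, mode + "t", encoding="utf-8")
    return open(path, mode, encoding="utf-8")


def dump_sketch(iterable, path):
    """Write one JSON document per line to `path`."""
    with open_by_extension(path, "w") as fp:
        for obj in iterable:
            fp.write(json.dumps(obj) + "\n")


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "file2.jsonl.gz")
    dump_sketch(({"foo": 1}, {"bar": 2}), target)
    # Read the file back with gzip to confirm it was written compressed.
    with gzip.open(target, "rt", encoding="utf-8") as fp:
        written = fp.read()
```

Passing mode `"a"` instead of `"w"` to `open_by_extension` would append, which mirrors the `mode="ab"` example above.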
dump_fork
Incrementally dumps multiple iterables into the specified jsonlines file paths, effectively reducing memory consumption.
- Use a .gz, .gzip, or .bz2 extension to write a compressed file.
- Serialization falls back through the following functions, in order: orjson.dumps, ujson.dumps, and json.dumps.
dump_fork(path_iterables, dump_if_empty=True, **kwargs)
:param Iterable[Tuple[str, Iterable[Any]]] path_iterables: Iterable of (filepath, iterable) pairs
:param bool dump_if_empty: If false, don't create an empty jsonlines file.
:param kwargs: `json.dumps` kwargs
Examples:
import jsonl
path_iterables = (
    ("num.jsonl", ({"value": 1}, {"value": 2})),
    ("foo.jsonl", ({"a": "1"}, {"b": 2})),
    ("num.jsonl", ({"value": 3},)),
    ("foo.jsonl", ()),
)
jsonl.dump_fork(path_iterables)
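The incremental behavior can be pictured as keeping one open writer per path and writing each item as soon as it is produced, so no iterable is held in memory as a whole. The sketch below is an assumption about the approach, not the library's implementation; `dump_fork_sketch` is a hypothetical name, and it handles only uncompressed files with `json.dumps`.

```python
import contextlib
import json
import os
import tempfile


def dump_fork_sketch(path_iterables, dump_if_empty=True):
    """Stream items from several iterables into their target files."""
    with contextlib.ExitStack() as stack:
        writers = {}  # path -> open file, reused when a path repeats

        def writer_for(path):
            if path not in writers:
                writers[path] = stack.enter_context(
                    open(path, "w", encoding="utf-8")
                )
            return writers[path]

        for path, iterable in path_iterables:
            if dump_if_empty:
                writer_for(path)  # create the file even if nothing is written
            for obj in iterable:
                writer_for(path).write(json.dumps(obj) + "\n")


with tempfile.TemporaryDirectory() as tmp:
    num = os.path.join(tmp, "num.jsonl")
    empty = os.path.join(tmp, "empty.jsonl")
    dump_fork_sketch(
        [(num, [{"value": 1}, {"value": 2}]), (empty, []), (num, [{"value": 3}])],
        dump_if_empty=False,
    )
    with open(num, encoding="utf-8") as fp:
        num_lines = fp.read().splitlines()
    empty_created = os.path.exists(empty)
```

In this sketch a path that appears more than once accumulates all of its items in a single file, and with `dump_if_empty=False` a path whose iterables yield nothing is never created.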
load
Deserialize a UTF-8-encoded jsonlines file into an iterable of Python objects.
- Recognizes .gz, .gzip, and .bz2 extensions to load compressed files.
- Deserialization falls back through the following functions, in order: orjson.loads, ujson.loads, and json.loads.
load(file, **kwargs)
:param Union[str, bytes, os.PathLike, io.IOBase] file: File to load
:param kwargs: `json.loads` kwargs
:rtype: Iterable[Any]
Examples:
import gzip
import jsonl
# Load the uncompressed file from the given path.
iterable1 = jsonl.load("file1.jsonl")
print(tuple(iterable1))
# Load the gzipped file from the given path.
iterable2 = jsonl.load("file2.jsonl.gz")
print(tuple(iterable2))
# Load the gzipped file from the given open file.
with gzip.open("file3.jsonl.gz", mode="rb") as fp:
    iterable3 = jsonl.load(fp)
    print(tuple(iterable3))
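If the returned iterable is lazy (an assumption here, suggested by the `Iterable` return type), it has to be consumed while the file is still open, and only the lines actually requested get parsed. The sketch below illustrates that idea with the standard library; `load_sketch` is a hypothetical name and not the library's implementation.

```python
import gzip
import itertools
import json
import os
import tempfile


def load_sketch(fp):
    """Yield one Python object per non-empty line of an open text file."""
    for line in fp:
        line = line.strip()
        if line:
            yield json.loads(line)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file3.jsonl.gz")
    with gzip.open(path, "wt", encoding="utf-8") as fp:
        fp.write('{"foo": 1}\n{"bar": 2}\n{"baz": 3}\n')

    with gzip.open(path, "rt", encoding="utf-8") as fp:
        # Only the first two lines are parsed; the third is never decoded.
        first_two = list(itertools.islice(load_sketch(fp), 2))
```

Consuming the generator inside the `with` block matters: once the file is closed, a lazy reader can no longer pull further lines from it.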
Unit tests
(env)$ pip install -r requirements.txt # Ignore this command if it has already been executed
(env)$ pytest tests/
(env)$ pytest --cov jsonl # Tests with coverage