Skip to main content

Leverage the built-in python csv module to write files in jsonl format

Project description

csv-jsonl

A convenient module for writing a list of dictionaries or list of lists to a .jsonl-formatted text file, suitable for ingestion by BigQuery and other services.

csv-jsonl is built on top of Python's built-in csv module. It allows you to specify a fieldnames list to add a bit of assurance. Otherwise, no schema-handling is offered.

Why not Just Use csv Files?

If you are here asking that question, I'm guessing you have not spent exciting times attempting to clean up poorly-formatted csv files (I'm looking at you, Excel).

Other Data Formats

Basically supports anything with a __getitem__, as well as dataclasses. See test for everything.

Installation

pip install csv-jsonl

Usage

List of Dictonaries

>>> from csv_jsonl import JSONLinesDictWriter
>>> l = [{"foo": "bar", "bat": 1}, {"foo": "bar", "bat": 2}]
>>> with open("foo.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesDictWriter(_fh)
...     writer.writerows(l)
...
>>> d = {"foo": "bar", "bat": 1}
>>> with open("bar.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesDictWriter(_fh)
...     writer.writerow(d)
...
>>> from collections import OrderedDict
>>> od = OrderedDict([('foo', 'bar'), ('bat', 1)])
>>> with open("qux.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesDictWriter(_fh)
...     writer.writerow(od)
...
>>> fieldnames = ["foo", "bar"] # keys = ["foo", "bat"] expect fail
>>> with open("baz.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesDictWriter(_fh, fieldnames=fieldnames)
...     writer.writerows(l)
...
Expect ValueError

List of Lists

        >>> from csv_jsonl import JSONLineslistWriter
        >>> l = zip(["foo", "bar", "bat"], range(3), range(3))
        >>> with open("foo.jsonl", "w", encoding="utf-8") as _fh:
        ...     writer = JSONLinesListWriter(_fh)
        ...     writer.writerows(l)
        ...
        >>> l = zip(["foo", "bar", "bat"], range(3), range(3))
        >>> with open("bar.jsonl", "w", encoding="utf-8") as _fh:
        ...     writer = JSONLinesDictWriter(_fh)
        ...     writer.writerow(next(l))
        ...
        >>> fieldnames = ["baz", "qux", "quux"]
        >>> l = zip(["foo", "bar", "bat"], range(3), range(3))
        >>> with open("foo.jsonl", "w", encoding="utf-8") as _fh:
        ...     writer = JSONLinesListWriter(_fh, fieldnames=fieldnames)
        ...     writer.writeheader()
        ...     writer.writerows(l)
        ...

pipeline status Latest Release Downloads

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

csv-jsonl-0.1.6.tar.gz (17.2 kB view hashes)

Uploaded Source

Built Distribution

csv_jsonl-0.1.6-py3-none-any.whl (18.2 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page