Leverage the built-in Python csv module to write files in JSONL format

Project description

csv-jsonl

A convenient module for writing a list of dictionaries or list of lists to a .jsonl-formatted text file, suitable for ingestion by BigQuery and other services.

csv-jsonl is built on top of Python's built-in csv module. You can pass a fieldnames list to add a bit of assurance: rows whose keys do not match it are rejected. Otherwise, no schema handling is offered.
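
For readers new to the format: a .jsonl file holds one JSON document per line. The sketch below uses only the standard library's json module to show the shape of the output. The helper name write_jsonl is hypothetical, and this is not csv-jsonl's own implementation.

```python
import io
import json


def write_jsonl(rows, fh):
    """Write each row as a single JSON document followed by a newline."""
    for row in rows:
        fh.write(json.dumps(row) + "\n")


rows = [{"foo": "bar", "bat": 1}, {"foo": "bar", "bat": 2}]
buffer = io.StringIO()  # in-memory stand-in for a real file handle
write_jsonl(rows, buffer)
jsonl_text = buffer.getvalue()
print(jsonl_text, end="")
```

Each line can be parsed on its own, which is what makes the format convenient for line-oriented loaders.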

Why not Just Use csv Files?

If you are here asking that question, I'm guessing you have not spent exciting times attempting to clean up poorly-formatted csv files (I'm looking at you, Excel).

Other Data Formats

Rows can be basically anything with a __getitem__, as well as dataclasses. See the tests for the full list.
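
How csv-jsonl handles dataclasses internally is not shown here. As a rough illustration only, one common way to turn a dataclass instance into a JSON-ready mapping is dataclasses.asdict from the standard library. The Record class below is a hypothetical example type.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Record:  # hypothetical example type, not part of csv-jsonl
    foo: str
    bat: int


records = [Record("bar", 1), Record("bar", 2)]

# asdict() converts each instance to a plain dict, which json can serialize.
lines = [json.dumps(asdict(record)) for record in records]
print("\n".join(lines))
```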

Installation

pip install csv-jsonl

Usage

List of Dictionaries

>>> from csv_jsonl import JSONLinesDictWriter
>>> l = [{"foo": "bar", "bat": 1}, {"foo": "bar", "bat": 2}]
>>> with open("foo.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesDictWriter(_fh)
...     writer.writerows(l)
...
>>> d = {"foo": "bar", "bat": 1}
>>> with open("bar.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesDictWriter(_fh)
...     writer.writerow(d)
...
>>> from collections import OrderedDict
>>> od = OrderedDict([('foo', 'bar'), ('bat', 1)])
>>> with open("qux.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesDictWriter(_fh)
...     writer.writerow(od)
...
>>> fieldnames = ["foo", "bar"] # keys = ["foo", "bat"] expect fail
>>> with open("baz.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesDictWriter(_fh, fieldnames=fieldnames)
...     writer.writerows(l)
...
Expect a ValueError: the rows contain the key "bat", which is not in fieldnames.
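
The same check exists in the standard library's csv.DictWriter, which by default raises ValueError when a row has keys missing from fieldnames. The sketch below demonstrates that stdlib behaviour only. That csv-jsonl inherits it is an inference from the description above, not something verified against its source.

```python
import csv
import io

fieldnames = ["foo", "bar"]
row = {"foo": "bar", "bat": 1}  # "bat" is not listed in fieldnames

# extrasaction defaults to "raise", so unexpected keys are rejected.
writer = csv.DictWriter(io.StringIO(), fieldnames=fieldnames)
try:
    writer.writerow(row)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```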

List of Lists

>>> from csv_jsonl import JSONLinesListWriter
>>> l = zip(["foo", "bar", "bat"], range(3), range(3))
>>> with open("foo.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesListWriter(_fh)
...     writer.writerows(l)
...
>>> l = zip(["foo", "bar", "bat"], range(3), range(3))
>>> with open("bar.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesListWriter(_fh)
...     writer.writerow(next(l))
...
>>> fieldnames = ["baz", "qux", "quux"]
>>> l = zip(["foo", "bar", "bat"], range(3), range(3))
>>> with open("foo.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesListWriter(_fh, fieldnames=fieldnames)
...     writer.writeheader()
...     writer.writerows(l)
...
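
To sanity-check a written file, each line can be parsed back with the standard library's json module. This is a minimal sketch: read_jsonl is a hypothetical helper, and the sample file contents are an assumed shape (one JSON array per row), not output captured from csv-jsonl.

```python
import json
import os
import tempfile


def read_jsonl(path):
    """Parse a .jsonl file into a list, one JSON document per non-empty line."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Stand-in file; in practice this would be a file written by csv-jsonl.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.jsonl")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write('["foo", 0, 0]\n["bar", 1, 1]\n["bat", 2, 2]\n')
    parsed = read_jsonl(path)

print(parsed)
```

If any line fails to parse, json.loads raises json.JSONDecodeError, which makes a malformed row easy to spot.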

Download files

Source Distribution

csv-jsonl-0.1.6.tar.gz (17.2 kB)

Uploaded Source

Built Distribution

csv_jsonl-0.1.6-py3-none-any.whl (18.2 kB)

Uploaded Python 3

File details

Details for the file csv-jsonl-0.1.6.tar.gz.

File metadata

  • Download URL: csv-jsonl-0.1.6.tar.gz
  • Size: 17.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.6

File hashes

Hashes for csv-jsonl-0.1.6.tar.gz
Algorithm Hash digest
SHA256 f270e5babf7f8e42804de30289f1fed7cb1f3d2b70899b6e2ecf814a26eb3b98
MD5 38d54d9dc912de83c4cb472e4e3ea52f
BLAKE2b-256 a17aeada95f82b5f627d77c8b4f4f34ec66e4d21c3c52587908eb5c2fddaaa72

File details

Details for the file csv_jsonl-0.1.6-py3-none-any.whl.

File metadata

  • Download URL: csv_jsonl-0.1.6-py3-none-any.whl
  • Size: 18.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.6

File hashes

Hashes for csv_jsonl-0.1.6-py3-none-any.whl
Algorithm Hash digest
SHA256 a1f0113ca916cbb60775bea2ebd31fec94dcf5c4794811189cff0cce7ed280a2
MD5 e6b1d7a2d9659cc7380a3ce6d417601b
BLAKE2b-256 a434c51199c7c5d2243bccfd4ddf9be898c049e6a453705dc851d73cca68d6a2
