Python support for linear TSV files

  • Free software: MIT license

What is Linear TSV

In contrast to Excel’s TSV dialect, linear TSV is line-based.

“But hey,” I hear you say, “isn’t TSV always line-based?” Well, the issue arises when a cell contains a tab or newline character. In Excel’s TSV format, that cell is surrounded by quotes and the entry continues on the next line. Now you have:

  • entries spanning several lines

  • quotes that need to be ignored (")

  • quotes that are escaped by doubling them ("")

Since entries can span several lines, many naïve file manipulations aren’t possible:

  • Taking the first 50 entries of a dataset: head -n 50 customers.tsv

  • Filtering entries: grep "Zürich" customers.tsv

  • Sorting the entries alphabetically: sort customers.tsv
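To make the problem concrete, here is a small sketch using Python's standard csv module with its excel-tab dialect. The sample rows are invented for illustration. It shows that one cell containing a newline makes the number of physical lines differ from the number of records, which is exactly what trips up head, grep and sort.

```python
import csv
import io

# Invented sample data: the second row has a cell containing a newline.
rows = [["name", "note"], ["Ada", "line one\nline two"], ["Bob", "plain"]]

# Write the rows in Excel's TSV dialect; the multi-line cell gets quoted.
buffer = io.StringIO()
csv.writer(buffer, dialect="excel-tab").writerows(rows)
text = buffer.getvalue()

# What line-based tools see: physical lines.
physical_lines = text.splitlines()

# What a dialect-aware parser sees: logical records.
records = list(csv.reader(io.StringIO(text), dialect="excel-tab"))

print(len(physical_lines), "physical lines,", len(records), "records")
```

Because the quoted cell spills onto a second line, a tool that cuts, filters or sorts by line can split a record in two.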

All of this can be prevented if you simply:

  • escape tabs: \t

  • escape newlines: \n

  • escape carriage returns: \r

  • escape backslashes: \\

Lastly, linear TSV can also encode None as \N.

That’s linear TSV in a nutshell.
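The escaping rules above can be sketched in a few lines of Python. This is a minimal illustration of the format, not tsv2dict's actual API: the function names escape_cell, unescape_cell, encode_row and decode_row are hypothetical and exist only for this example.

```python
import re

# Mapping from raw characters to their linear TSV escapes, and back.
_ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}
_UNESCAPES = {escaped: raw for raw, escaped in _ESCAPES.items()}


def escape_cell(value):
    """Hypothetical helper: encode one cell; None becomes \\N."""
    if value is None:
        return "\\N"
    return re.sub(r"[\\\t\n\r]", lambda m: _ESCAPES[m.group()], value)


def unescape_cell(cell):
    """Hypothetical helper: decode one cell; \\N becomes None."""
    if cell == "\\N":
        return None
    # A single left-to-right pass keeps an escaped backslash followed
    # by "t" from being misread as an escaped tab.
    return re.sub(r"\\[\\tnr]", lambda m: _UNESCAPES[m.group()], cell)


def encode_row(values):
    """Join escaped cells with tabs; the result never contains a newline."""
    return "\t".join(escape_cell(v) for v in values)


def decode_row(line):
    """Split one physical line on tabs and unescape each cell."""
    return [unescape_cell(c) for c in line.rstrip("\n").split("\t")]


row = ["a\tb", "x\ny", None, "c:\\dir"]
line = encode_row(row)
print(line)
print(decode_row(line + "\n") == row)
```

Since every tab, newline and carriage return inside a cell is escaped, each record occupies exactly one physical line, and a literal backslash followed by N stays distinguishable from the None marker because its backslash is doubled.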

Installation

pip install tsv2dict

You can also install the in-development version with:

pip install https://github.com/nkurmann/tsv2dict/archive/master.zip

Documentation

https://tsv2dict.readthedocs.io/

Development

To run all the tests run:

tox

Note, to combine the coverage data from all the tox environments run:

Windows

set PYTEST_ADDOPTS=--cov-append
tox

Other

PYTEST_ADDOPTS=--cov-append tox

Changelog

0.0.3 (2021-03-04)

  • Converters now won’t attempt to convert None.

0.0.2 (2021-03-03)

  • Converters can convert rows retrieved into types other than strings.

0.0.1 (2021-03-02)

  • (De)Serialize None as \N, consistent with SQL.

0.0.0 (2021-02-28)

  • First release on PyPI.
