Python support for linear TSV files

  • Free software: MIT license

What is Linear TSV

In contrast to Excel’s TSV dialect, linear TSV is line-based.

“But hey”, I hear you say, “isn’t TSV always line-based?” Well, the issue arises when a cell contains a tab or newline character. In Excel’s TSV format, that cell is surrounded by quotes and the entry is continued on the next line. Now you have:

  • entries spanning several lines

  • quotes that need to be ignored (")

  • quotes that are escaped by doubling them ("")

Since entries can span several lines, many naïve file manipulations aren’t possible:

  • Taking the first 50 entries of a dataset: head -n 50 customers.tsv

  • Filtering entries: grep "Zürich" customers.tsv

  • Sorting the entries alphabetically: sort customers.tsv
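To make the problem concrete, here is a small sketch using the "excel-tab" dialect of Python's standard csv module. The sample names and addresses are made up for illustration. Two records are written, but the output has three physical lines, which is what trips up line-oriented tools.

```python
import csv
import io

# Write a header and one data row whose second cell contains a newline,
# using the csv module's Excel-style tab-separated dialect.
buffer = io.StringIO()
writer = csv.writer(buffer, dialect="excel-tab")
writer.writerow(["name", "address"])
writer.writerow(["Anna", "Main St 1\nZürich"])

text = buffer.getvalue()

# Physical lines, as head, grep, and sort would see them.
physical_lines = text.splitlines()

# Logical records, as a quote-aware parser sees them.
records = list(csv.reader(io.StringIO(text, newline=""), dialect="excel-tab"))

print(len(records), "records in", len(physical_lines), "physical lines")
for line in physical_lines:
    print(repr(line))
```

A grep for "Zürich" on this output would return only the tail end of Anna's record, without her name.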

All of this can be prevented if you simply:

  • escape tabs: \t

  • escape newlines: \n

  • escape carriage returns: \r

  • escape backslashes: \\

Lastly, linear TSV can also encode None as \N.

That’s linear TSV in a nutshell.
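The escaping rules above can be sketched in a few lines of Python. This is an illustration of the format only, not the tsv2dict API. The function names (encode_cell, decode_cell, encode_row, decode_row) are hypothetical and chosen for this example.

```python
import re

# Escape sequences from the list above. All four are handled in a single
# pass so that an escaped backslash is never escaped or unescaped twice.
_ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}
_UNESCAPES = {escaped: raw for raw, escaped in _ESCAPES.items()}


def encode_cell(value):
    """Encode one cell; None becomes the two characters backslash and N."""
    if value is None:
        return "\\N"
    return re.sub(r"[\\\t\n\r]", lambda m: _ESCAPES[m.group()], value)


def decode_cell(cell):
    """Reverse encode_cell."""
    if cell == "\\N":
        return None
    return re.sub(r"\\[\\tnr]", lambda m: _UNESCAPES[m.group()], cell)


def encode_row(values):
    """Join encoded cells with real tabs; the result is a single line."""
    return "\t".join(encode_cell(v) for v in values)


def decode_row(line):
    """Split one line on real tabs and decode each cell."""
    return [decode_cell(c) for c in line.rstrip("\n").split("\t")]


row = ["Anna", "Main St 1\nZürich", None, "a\tb\\c"]
line = encode_row(row)
print(line)
print(decode_row(line) == row)
```

Because every tab and newline inside a cell is escaped, the only real tabs and newlines left in the file are separators, so one record always occupies exactly one line.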

Installation

pip install tsv2dict

You can also install the in-development version with:

pip install https://github.com/nkurmann/tsv2dict/archive/master.zip

Documentation

https://tsv2dict.readthedocs.io/

Development

To run all the tests, run:

tox

To combine the coverage data from all the tox environments, run:

Windows

set PYTEST_ADDOPTS=--cov-append
tox

Other

PYTEST_ADDOPTS=--cov-append tox

Changelog

0.0.3 (2021-03-04)

  • Converters now won’t attempt to convert None.

0.0.2 (2021-03-03)

  • Converters can convert rows retrieved into types other than strings.

0.0.1 (2021-03-02)

  • (De)Serialize None as \N, consistent with SQL.

0.0.0 (2021-02-28)

  • First release on PyPI.
