
Python reader/writer for CSV files with YAML header information.


CSVY for Python

CSV is a popular format for storing tabular data and is used in many disciplines. Metadata concerning the contents of the file is often included in the header, but it rarely follows a machine-readable format, and sometimes it is not even human readable! In some cases, such information is provided in a separate file, which is not ideal as it is easy for data and metadata to get separated.

CSVY is a small Python package to handle CSV files in which the metadata in the header is formatted in YAML. It supports reading and writing tabular data contained in numpy arrays, pandas DataFrames and nested lists, with the metadata held in a standard Python dictionary. Ultimately, it aims to incorporate information about the CSV dialect used and a Table Schema specifying the contents of each column, to aid the reading and interpretation of the data.

Installation

pycsvy is available on PyPI, so installing it is as easy as:

pip install pycsvy

In order to support reading into numpy arrays or into pandas DataFrames, you will need to install those two packages, too.

Usage

In the simplest case, to save tabular data stored in a variable named data, together with a dictionary named metadata, into a CSVY file called important_data.csv (the extension is not relevant), just do the following:

import csvy

csvy.write("important_data.csv", data, metadata)

The resulting file will have the YAML-formatted header in between --- markers with, optionally, a comment character starting each header line. It could look something like the following:

---
name: my-dataset
title: Example file of csvy
description: Show a csvy sample file.
encoding: utf-8
schema:
  fields:
  - name: Date
    type: object
  - name: WTI
    type: number
---
Date,WTI
1986-01-02,25.56
1986-01-03,26.00
1986-01-06,26.53
1986-01-07,25.85
1986-01-08,25.87
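
To make the layout above concrete, here is a minimal sketch of how the header and the tabular body could be separated by hand, using only the standard library. It is not how pycsvy is implemented: the helper name split_csvy is hypothetical, the header lines are returned as raw text rather than parsed as YAML, and the sketch assumes no comment character is used.

```python
import csv


def split_csvy(text, marker="---"):
    """Split CSVY text into (header_lines, csv_rows).

    Hypothetical helper for illustration only; not part of pycsvy.
    Assumes the header is not prefixed with a comment character.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != marker:
        raise ValueError("no opening header marker found")

    # Find the closing marker, searching after the opening one
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == marker),
        None,
    )
    if end is None:
        raise ValueError("no closing header marker found")

    header_lines = lines[1:end]  # raw YAML text, left unparsed here
    csv_rows = list(csv.reader(lines[end + 1:]))  # every value stays a string
    return header_lines, csv_rows


sample = """---
name: my-dataset
encoding: utf-8
---
Date,WTI
1986-01-02,25.56
1986-01-03,26.00
"""

header, rows = split_csvy(sample)
print(header)
print(rows)
```

In practice the header lines would then be handed to a YAML parser, and the rows to whichever tabular reader suits the target data type.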

For reading the information back:

import csvy

# To read into a numpy array
data, metadata = csvy.read_to_array("important_data.csv")

# To read into a pandas DataFrame
data, metadata = csvy.read_to_dataframe("important_data.csv")

The appropriate writer/reader will be selected based on the type of data:

  • numpy array: np.savetxt and np.loadtxt
  • pandas DataFrame: pd.DataFrame.to_csv and pd.read_csv
  • nested lists: csv.writer and csv.reader

Options can be passed to the tabular data writer/reader by setting the csv_options dictionary. Likewise, you can set the yaml_options dictionary with whatever options you want to pass to yaml.safe_load when reading the YAML-formatted header, or to yaml.safe_dump when writing it.
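
Since nested lists are handled with csv.writer, a csv_options dictionary for that case would plausibly end up as keyword arguments to it. The following is a minimal sketch of that kind of forwarding using only the standard library; rows_to_csv is a hypothetical helper, not a pycsvy function, and the actual package may pass the options along differently.

```python
import csv
import io


def rows_to_csv(rows, csv_options=None):
    """Serialise nested lists to CSV text, forwarding options to csv.writer.

    Hypothetical helper for illustration only; not part of pycsvy.
    """
    buffer = io.StringIO()
    # Unpack the dictionary as keyword arguments, e.g. delimiter=";"
    writer = csv.writer(buffer, **(csv_options or {}))
    writer.writerows(rows)
    return buffer.getvalue()


rows = [["Date", "WTI"], ["1986-01-02", 25.56]]

# Defaults of csv.writer: comma delimiter, "\r\n" line terminator
default_text = rows_to_csv(rows)

# Same data with a different dialect, chosen through the options dictionary
semicolon_text = rows_to_csv(rows, {"delimiter": ";", "lineterminator": "\n"})
print(semicolon_text)
```

Any keyword accepted by csv.writer (delimiter, quotechar, quoting and so on) can go into such a dictionary; the numpy and pandas backends accept their own, different sets of options.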

Finally, you can control the character(s) used to indicate comments by setting the comment keyword when writing a file. By default, there is no comment character (""). When reading, the comment character is detected automatically.
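
One way such automatic detection could work is to look at the first line of the file: since it ends with the --- marker, anything before the marker can be taken as the comment prefix. The sketch below illustrates that idea under this assumption; detect_comment and strip_comment are hypothetical names, and pycsvy's own detection logic may differ.

```python
def detect_comment(first_line, marker="---"):
    """Return whatever precedes the marker on the first header line.

    Hypothetical helper for illustration only; not part of pycsvy.
    """
    stripped = first_line.rstrip()
    if not stripped.endswith(marker):
        raise ValueError("first line does not end with the header marker")
    return stripped[: -len(marker)]


def strip_comment(lines, comment):
    """Remove the comment prefix from each line that starts with it."""
    return [
        line[len(comment):] if line.startswith(comment) else line
        for line in lines
    ]


commented_header = ["# ---", "# name: my-dataset", "# ---"]

comment = detect_comment(commented_header[0])
clean_header = strip_comment(commented_header, comment)
print(repr(comment))
print(clean_header)
```

With an uncommented header the detected prefix is the empty string, and strip_comment leaves the lines unchanged.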

Contributors ✨

Thanks go to these wonderful people:

  • Diego Alonso Álvarez: 🚇 🤔 🚧 ⚠️ 🐛 💻
  • Alex Dewar: 🤔 ⚠️ 💻
  • Adrian D'Alessandro: 🐛 💻 📖

This project follows the all-contributors specification. Contributions of any kind are welcome!
