Lightweight csv read/write, keeping track of csv dialect and other metadata.

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

CSVMeta

CSVMeta is an extremely lightweight Python package designed to work with csv files and attached metadata. It writes data to an arbitrary folder such as mydata.csv/ and creates two internal files:

mydata.csv/data.csv: the usual csv file.
mydata.csv/metadata.json: metadata about the csv file, such as the csv dialect.

When reading data from mydata.csv/, it uses dialect information from the metadata file to read the csv data correctly. The metadata file can also be used to store additional information about the data, such as a data schema and a header indicator.
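
The folder layout described above can be reproduced with the standard library alone. The following is a minimal sketch of the idea, not CSVMeta's actual implementation: the helper names `write_folder` and `read_folder` are hypothetical, and the metadata keys are only assumed to resemble what the package writes.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_folder(folder, rows, dialect, **extra):
    """Hypothetical helper: write rows to <folder>/data.csv plus metadata.json."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    # newline="" lets the csv module control line endings itself.
    with open(folder / "data.csv", "w", newline="", encoding="utf-8") as f:
        csv.writer(f, **dialect).writerows(rows)
    # Assumed metadata layout: the dialect plus any extra keyword arguments.
    metadata = {"path": "data.csv", "dialect": dialect, **extra}
    with open(folder / "metadata.json", "w", encoding="utf-8") as f:
        json.dump(metadata, f, indent=4)


def read_folder(folder):
    """Hypothetical helper: read rows back using the stored dialect."""
    folder = Path(folder)
    with open(folder / "metadata.json", encoding="utf-8") as f:
        dialect = json.load(f)["dialect"]
    with open(folder / "data.csv", newline="", encoding="utf-8") as f:
        return list(csv.reader(f, **dialect))


rows = [["name", "age"], ["Nicole", 43]]
dialect = {"delimiter": ";", "quotechar": '"', "lineterminator": "\n"}

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "mydata.csv"
    write_folder(target, rows, dialect, header=True)
    roundtrip = read_folder(target)
    files = sorted(p.name for p in target.iterdir())
```

Because the semicolon delimiter is stored next to the data, the reader does not have to guess it. The values come back as strings, as they would from the csv module.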

The package has no external dependencies beyond Python's standard library and is tested with Python 3.7+ on Linux, Windows, and macOS.

Installation

pip install csvmeta

Usage

Reading and Writing Data

Input and output data formats for the read and write functions are modelled on Python's csv module: data to write should be an iterable of rows, and data read back is a list of rows in which every value is a string. The header, if the data has one, is always returned as the first row.

import csvmeta as csvm

data = [
    ['name', 'age', 'state'],
    ['Nicole', 43, 'CA'],
    ['John', 28, 'DC']
]

# Write data to a csv file folder
csvm.write('mydata.csv', data)

# Read data from a csv file folder
data = csvm.read('mydata.csv')
## [
##     ['name', 'age', 'state'],
##     ['Nicole', '43', 'CA'],
##     ['John', '28', 'DC']
## ]
##
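
Since the header comes back as the first row, it is often convenient to split it off and pair it with each data row. A small sketch using plain Python, with the rows hard-coded in the shape shown in the output above so that it runs without the package installed:

```python
# Rows shaped like the read output above: header first, all values as strings.
data = [
    ["name", "age", "state"],
    ["Nicole", "43", "CA"],
    ["John", "28", "DC"],
]

# Separate the header row from the data rows.
header, *body = data

# Pair each value with its column name, one dict per data row.
records = [dict(zip(header, row)) for row in body]
```

Here `records[0]` is `{'name': 'Nicole', 'age': '43', 'state': 'CA'}`; the values are still strings and need converting if numeric types are required.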

Reading and Writing Metadata

Metadata is stored in a json file in the csv folder. The metadata file is created automatically when writing data, and only the dialect object is used when reading data. The dialect object is a dictionary of csv dialect parameters, such as delimiter, quotechar, and lineterminator. See the csv module documentation for more information.
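
To illustrate what the dialect parameters mean, the sketch below passes a dialect dictionary straight to the standard library's `csv.writer` and `csv.reader` as keyword arguments. It uses only the csv module; how CSVMeta applies the dictionary internally is not shown here and may differ.

```python
import csv
import io

# A dialect expressed as a plain dictionary of csv formatting parameters.
dialect = {"delimiter": "\t", "quotechar": "'", "lineterminator": "\n"}

# Write two rows into an in-memory buffer using the dialect.
buffer = io.StringIO()
writer = csv.writer(buffer, **dialect)
writer.writerows([["name", "note"], ["Nicole", "likes\ttabs"]])
text = buffer.getvalue()

# Read the text back with the same dialect; the quoted tab survives intact.
parsed = list(csv.reader(io.StringIO(text), **dialect))
```

The field containing a tab is wrapped in the configured quote character on write, so reading with the same parameters recovers the original value.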

Arbitrary metadata can be added to the metadata file by passing keyword arguments to the write function. We recommend setting the header keyword argument to True if the first row of the data is a header row, and setting the schema keyword argument to a list of column names and data types. The frictionless tabular data resource standard is a good reference for metadata schemas.

Metadata can be read using the metadata() function:

import csvmeta as csvm

data = [
    ['name', 'age', 'state'],
    ['Nicole', 43, 'CA'],
    ['John', 28, 'DC']
]

# Write data and metadata to a csv file folder
csvm.write(
    'mydata.csv',
    data,
    header=True,
    schema=['name', 'age', 'state'],
    dialect={
        'delimiter': ',',
        'quotechar': '"',
        'lineterminator': '\n'
    },
    description='This is an example dataset.'
)

# Read metadata from a csv file folder
csvm.metadata('mydata.csv')
## {
##     "name": "mydata.csv",
##     "path": "data.csv",
##     "mediatype": "text/csv",
##     "dialect": {
##         "delimiter": ",",
##         "quotechar": "\"",
##         "lineterminator": "\n"
##     },
##     "header": true,
##     "schema": [
##         "name",
##         "age",
##         "state"
##     ],
##     "description": "This is an example dataset."
## }
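
The example above stores only column names in the schema. A schema that also records data types, in the style of the frictionless Table Schema (a `fields` list of `name`/`type` entries), could look like the sketch below. The `CASTS` mapping and the `cast_rows` helper are hypothetical and not part of CSVMeta. Since metadata is stored as json, it is assumed that any json-serializable schema can be passed as the schema keyword argument.

```python
import json

# A frictionless-style schema: one entry per column, with a name and a type.
schema = {
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": "integer"},
        {"name": "state", "type": "string"},
    ]
}

# Hypothetical mapping from schema type names to Python converters.
CASTS = {"string": str, "integer": int, "number": float}


def cast_rows(rows, schema):
    """Hypothetical helper: convert string values using the schema's types."""
    converters = [CASTS[field["type"]] for field in schema["fields"]]
    return [
        [convert(value) for convert, value in zip(converters, row)]
        for row in rows
    ]


# Data rows as they would be read back: every value is a string.
rows = [["Nicole", "43", "CA"], ["John", "28", "DC"]]
typed = cast_rows(rows, schema)

# The schema is json-serializable, so it can live in a metadata file.
schema_json = json.dumps(schema)
```

Keeping the types next to the data lets a reader restore numeric columns instead of guessing them from the string values.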

Reading to Pandas DataFrame

import csvmeta as csvm
import pandas as pd

data = [
    ['name', 'age', 'state'],
    ['Nicole', 43, 'CA'],
    ['John', 28, 'DC']
]

# Write data and metadata to a csv file folder
csvm.write('mydata.csv', data, header=True)


# Read the data back and build a DataFrame, honoring the header flag
data = csvm.read('mydata.csv')
metadata = csvm.metadata('mydata.csv')
if metadata.get("header", False):
    df = pd.DataFrame(data[1:], columns=data[0])
else:
    df = pd.DataFrame(data)

df
##      name age state
## 0  Nicole  43    CA
## 1    John  28    DC

Links and References

Changelog

1.1.2 (2023-11-25)

Make DEFAULT_DIALECT an explicit dictionary specification rather than "unix".
Add DEFAULT_DIALECT to tests.

1.1.1 (2023-11-25)

Change Iterable typing to Sequence to account for order and allow multiple passes over data.
Improve tests.

1.1.0 (2023-11-25)

Fix read function return type: now return list of lists instead of generator.

1.0.0 (2023-11-25)

Initial release.

Release history

1.1.2 (this version): Nov 26, 2023
1.1.1: Nov 26, 2023
1.1.0: Nov 26, 2023
1.0.0: Nov 26, 2023

Download files

Source Distribution

csvmeta-1.1.2.tar.gz (14.0 kB view details)

Uploaded Nov 26, 2023 Source

Built Distribution

csvmeta-1.1.2-py3-none-any.whl (6.1 kB view details)

Uploaded Nov 26, 2023 Python 3

File details

Details for the file csvmeta-1.1.2.tar.gz.

File metadata

Download URL: csvmeta-1.1.2.tar.gz
Upload date: Nov 26, 2023
Size: 14.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.12.0

File hashes

Hashes for csvmeta-1.1.2.tar.gz
Algorithm	Hash digest
SHA256	`64ababfbb41e1dcadb4a4ad61e6f9d2370566a440e55017e1402e79af7163c2b`
MD5	`cf79ead3997a47e92c63a1b134291d7e`
BLAKE2b-256	`4068254edc832aaa2e9f61eac22b19ec1ca42ac65868795da459441354435ded`

File details

Details for the file csvmeta-1.1.2-py3-none-any.whl.

File metadata

Download URL: csvmeta-1.1.2-py3-none-any.whl
Upload date: Nov 26, 2023
Size: 6.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.12.0

File hashes

Hashes for csvmeta-1.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a7b0e901aff1753a042594d43b1780d4b37bd172dae1ea816565339fc2d2af80`
MD5	`acb3b42e263585586227337fb94b1917`
BLAKE2b-256	`cd36cb36536063372ca6f32c9a5788d862c2dd90978241914e254e8ca5cd9e3e`
