validated reading of tabular files (CSV, Excel, ODS, PRN)

Project description

Cutplace is a tool and API to validate that tabular data stored in CSV, Excel, ODS and PRN files conform to a cutplace interface definition (CID).

As an example, consider the following customers.csv file that stores data about customers:

customer_id,surname,first_name,born,gender
1,Beck,Tyler,1995-11-15,male
2,Gibson,Martin,1969-08-18,male
3,Hopkins,Chester,1982-12-19,male
4,Lopez,Tyler,1930-10-13,male
5,James,Ana,1943-08-10,female
6,Martin,Jon,1932-09-27,male
7,Knight,Carolyn,1977-05-25,female
8,Rose,Tammy,2004-01-12,female
9,Gutierrez,Reginald,2010-05-18,male
10,Phillips,Pauline,1960-11-09,female

A CID can describe such a file in an easy-to-read way. It consists of three sections. First, there is the general data format:

  Property        Value
D Format          Delimited
D Encoding        UTF-8
D Header          1
D Line delimiter  LF
D Item delimiter  ,
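
To make these properties concrete, here is a minimal sketch that reads delimited text with Python's standard csv module using the same settings: comma as item delimiter and one header row to skip. It only illustrates what the properties mean and is not how cutplace works internally; the helper name read_delimited and its parameters are made up for this example.

```python
import csv
import io


def read_delimited(text, item_delimiter=",", header_rows=1):
    """Yield the data rows of delimited text, skipping the header rows.

    Hypothetical helper mirroring the "Item delimiter" and "Header"
    properties; it is not part of cutplace.
    """
    # newline="" leaves line ending handling to the csv module.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=item_delimiter)
    for row_index, row in enumerate(reader):
        if row_index >= header_rows:
            yield row


sample = (
    "customer_id,surname,first_name,born,gender\n"
    "1,Beck,Tyler,1995-11-15,male\n"
    "2,Gibson,Martin,1969-08-18,male\n"
)
rows = list(read_delimited(sample))
```

When reading from a file instead of a string, the "Encoding" property would correspond to opening the file with open(path, encoding="utf-8", newline="").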

Next there are the fields stored in the data file:

  Name           Example     Empty  Length  Type      Rule
F customer_id    3798                       Integer   0…99999
F surname        Miller             …60
F first_name     John        X      …60
F date_of_birth  1978-11-27                 DateTime  YYYY-MM-DD
F gender         male        X              Choice    female, male
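
As a rough illustration of what these field rules express, the following sketch checks a single row by hand using only the standard library. It reads an X in the Empty column as "the field may be empty" and …60 as "at most 60 characters". The function validate_customer and its error messages are hypothetical and do not reflect cutplace's actual implementation or wording.

```python
import datetime


def validate_customer(row):
    """Return a list of error messages for one customer row (empty if valid).

    Hypothetical stand-in for the field rules, not cutplace code.
    """
    customer_id, surname, first_name, date_of_birth, gender = row
    errors = []

    # customer_id: Integer within 0...99999, must not be empty.
    try:
        id_is_valid = 0 <= int(customer_id) <= 99999
    except ValueError:
        id_is_valid = False
    if not id_is_valid:
        errors.append("customer_id must be an integer between 0 and 99999")

    # surname: at most 60 characters, must not be empty.
    if not 1 <= len(surname) <= 60:
        errors.append("surname must have 1 to 60 characters")

    # first_name: at most 60 characters, may be empty.
    if len(first_name) > 60:
        errors.append("first_name must have at most 60 characters")

    # date_of_birth: DateTime with format YYYY-MM-DD, must not be empty.
    try:
        datetime.datetime.strptime(date_of_birth, "%Y-%m-%d")
    except ValueError:
        errors.append("date_of_birth must match format YYYY-MM-DD")

    # gender: Choice of female or male, may be empty.
    if gender not in ("", "female", "male"):
        errors.append("gender must be one of: female, male")

    return errors
```

Calling validate_customer(["1", "Beck", "Tyler", "1995-11-15", "male"]) returns an empty list, while a row with the date written as 04.08.1953 yields one message about date_of_birth.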

Optionally you can describe conditions that must be met across the whole file:

  Description              Type      Rule
C customer must be unique  IsUnique  customer_id
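
Unlike a field rule, such a condition cannot be decided by looking at one row in isolation. A minimal sketch of the idea behind a uniqueness check, assuming rows are lists of strings with customer_id in the first position (the function find_duplicate_ids is hypothetical and not part of cutplace):

```python
def find_duplicate_ids(rows, id_index=0):
    """Return (row_number, value) for every row that repeats an earlier id.

    Row numbers start at 1 and refer to positions in ``rows``.
    """
    seen = set()
    duplicates = []
    for row_number, row in enumerate(rows, start=1):
        key = row[id_index]
        if key in seen:
            duplicates.append((row_number, key))
        else:
            seen.add(key)
    return duplicates


customers = [
    ["1", "Beck"],
    ["2", "Gibson"],
    ["1", "Lopez"],  # repeats customer_id 1
]
duplicates = find_duplicate_ids(customers)
```

The set of values seen so far is the state that has to be carried across the whole file, which is why this kind of rule is listed separately from the fields.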

The CID can be stored in common spreadsheet formats, in particular Excel and ODS, for example cid_customers.ods.

Cutplace can validate that the data file conforms to the CID:

$ cutplace cid_customers.ods customers.csv

Now add a new line with a broken date_of_birth:

73921,Harris,Diana,04.08.1953,female

Cutplace rejects this file with the error message:

customers.csv (R12C4): cannot accept field ‘date_of_birth’: date must match format YYYY-MM-DD (%Y-%m-%d) but is: ‘04.08.1953’
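
The location R12C4 in this message appears to count rows and columns from 1: the header is row 1, the ten original customers are rows 2 to 11, the appended line is row 12, and the date is the fourth column. The sketch below reproduces that kind of location for a shortened sample using only the standard library. The function first_bad_date is hypothetical and is not how cutplace detects or reports errors.

```python
import csv
import datetime
import io


def first_bad_date(text, column=4, header_rows=1):
    """Return "R<row>C<column>" for the first cell in ``column`` that is not
    a YYYY-MM-DD date, or None if all cells are fine.

    Rows and columns are counted from 1; header rows count as rows.
    """
    reader = csv.reader(io.StringIO(text, newline=""))
    for row_number, row in enumerate(reader, start=1):
        if row_number <= header_rows:
            continue  # do not validate header rows
        try:
            datetime.datetime.strptime(row[column - 1], "%Y-%m-%d")
        except ValueError:
            return "R{0}C{1}".format(row_number, column)
    return None


sample = (
    "customer_id,surname,first_name,born,gender\n"
    "1,Beck,Tyler,1995-11-15,male\n"
    "73921,Harris,Diana,04.08.1953,female\n"
)
location = first_bad_date(sample)
```

With this shortened sample the broken date sits in the third line, so location is "R3C4".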

Additionally, cutplace provides an easy-to-use API to read and write tabular data files through a common interface, without having to deal with the intricacies of format-specific modules. To read and validate the above example:

import cutplace
import cutplace.errors

cid_path = 'cid_customers.ods'
data_path = 'customers.csv'
try:
    for row in cutplace.rows(cid_path, data_path):
        pass  # We could also do something useful with the data in ``row`` here.
except cutplace.errors.DataError as error:
    print(error)

For more information, read the documentation at http://cutplace.readthedocs.org/ or visit the project at https://github.com/roskakori/cutplace.

Download files

Filename (size)                                File type  Python version  Upload date
cutplace-0.8.8-py2.py3-none-any.whl (72.9 kB)  Wheel      3.4             Nov 13, 2015
cutplace-0.8.8.zip (821.8 kB)                  Source     None            Nov 13, 2015
