validate data stored in CSV, PRN, ODS or Excel files

Project description

Cutplace is a tool and API to validate that tabular data stored in CSV, Excel, ODS and PRN files conform to a cutplace interface definition (CID).

As an example, consider the following customers.csv file that stores data about customers:

customer_id,surname,first_name,born,gender
1,Beck,Tyler,1995-11-15,male
2,Gibson,Martin,1969-08-18,male
3,Hopkins,Chester,1982-12-19,male
4,Lopez,Tyler,1930-10-13,male
5,James,Ana,1943-08-10,female
6,Martin,Jon,1932-09-27,male
7,Knight,Carolyn,1977-05-25,female
8,Rose,Tammy,2004-01-12,female
9,Gutierrez,Reginald,2010-05-18,male
10,Phillips,Pauline,1960-11-09,female

A CID describes such a file in an easy-to-read way. It consists of three sections. The first one describes the general data format:

    Property        Value
--  --------------  ---------
D   Format          Delimited
D   Encoding        UTF-8
D   Header          1
D   Line delimiter  LF
D   Item delimiter  ,
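To make these properties concrete, here is a minimal sketch of how they could map onto Python's standard csv module. It is an illustration only and does not show how cutplace reads files internally; the helper name read_delimited and the shortened sample data are assumptions made for this example.

```python
import csv
import io

# Shortened copy of customers.csv, using LF as line delimiter.
SAMPLE = (
    "customer_id,surname,first_name,born,gender\n"
    "1,Beck,Tyler,1995-11-15,male\n"
    "2,Gibson,Martin,1969-08-18,male\n"
)


def read_delimited(data, encoding="utf-8", header_rows=1, item_delimiter=","):
    """Yield data rows as lists of strings, skipping the header rows.

    Hypothetical helper for illustration; not part of cutplace.
    """
    text = data.decode(encoding)  # Encoding: UTF-8
    # newline="" leaves line endings untouched so the csv module handles them.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=item_delimiter)
    for row_number, row in enumerate(reader, start=1):
        if row_number > header_rows:  # Header: 1
            yield row


rows = list(read_delimited(SAMPLE.encode("utf-8")))
print(rows)
```

The header row is skipped rather than interpreted, so only the two data rows remain.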

Next, it lists the fields stored in the data file:

    Name           Example     Empty  Length  Type      Rule
--  -------------  ----------  -----  ------  --------  ------------
F   customer_id    3798                       Integer   0…99999
F   surname        Miller             …60
F   first_name     John        X      …60
F   date_of_birth  1978-11-27                 DateTime  YYYY-MM-DD
F   gender         male        X              Choice    female, male
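As a rough illustration of what these field rules mean, the following sketch checks a single row in plain Python. It is not cutplace's implementation. It assumes that an X in the Empty column marks a field that may be empty and that …60 means at most 60 characters; the function name check_customer_row and the error texts are made up for this example.

```python
from datetime import datetime

ALLOWED_GENDERS = {"female", "male"}


def check_customer_row(row):
    """Return a list of error messages for one row; an empty list means valid.

    Hypothetical helper for illustration; not part of cutplace.
    """
    customer_id, surname, first_name, date_of_birth, gender = row
    errors = []

    # customer_id: Integer within 0...99999, must not be empty.
    try:
        if not 0 <= int(customer_id) <= 99999:
            errors.append("customer_id must be within 0...99999")
    except ValueError:
        errors.append("customer_id must be an integer")

    # surname: must not be empty, at most 60 characters.
    if not surname or len(surname) > 60:
        errors.append("surname must have 1 to 60 characters")

    # first_name: may be empty (assumed meaning of X), at most 60 characters.
    if len(first_name) > 60:
        errors.append("first_name must have at most 60 characters")

    # date_of_birth: must not be empty and must match YYYY-MM-DD.
    try:
        datetime.strptime(date_of_birth, "%Y-%m-%d")
    except ValueError:
        errors.append("date_of_birth must match format YYYY-MM-DD")

    # gender: may be empty (assumed meaning of X), otherwise a listed choice.
    if gender and gender not in ALLOWED_GENDERS:
        errors.append("gender must be one of: female, male")

    return errors


print(check_customer_row(["1", "Beck", "Tyler", "1995-11-15", "male"]))
print(check_customer_row(["1", "Beck", "Tyler", "15.11.1995", "male"]))
```

The first call prints an empty list; the second reports the date that does not match the expected format.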

Optionally, you can describe conditions that must be met across the whole file:

    Description              Type      Rule
--  -----------------------  --------  -----------
C   customer must be unique  IsUnique  customer_id
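Unlike a field rule, such a check needs to remember values across rows. The following sketch shows the general idea of a uniqueness check in plain Python. It is an assumption about how a check of this kind can work, not cutplace's actual code, and find_duplicates is a hypothetical name.

```python
def find_duplicates(rows, column_index=0):
    """Return (row_number, value, first_row_number) for each repeated value.

    Hypothetical helper for illustration; not part of cutplace.
    """
    first_seen = {}  # maps each value to the row number where it first appeared
    duplicates = []
    for row_number, row in enumerate(rows, start=1):
        value = row[column_index]
        if value in first_seen:
            duplicates.append((row_number, value, first_seen[value]))
        else:
            first_seen[value] = row_number
    return duplicates


rows = [
    ["1", "Beck", "Tyler"],
    ["2", "Gibson", "Martin"],
    ["1", "Hopkins", "Chester"],  # repeats customer_id 1
]
print(find_duplicates(rows))
```

Here the third row is reported because its customer_id already appeared in the first row.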

The CID can be stored in common spreadsheet formats, in particular Excel and ODS, for example cid_customers.ods.

Cutplace can validate that the data file conforms to the CID:

$ cutplace cid_customers.ods customers.csv

Now add a new line with a broken date_of_birth:

73921,Harris,Diana,04.08.1953,female

Cutplace rejects this file with the error message:

customers.csv (R12C4): cannot accept field ‘date_of_birth’: date must match format YYYY-MM-DD (%Y-%m-%d) but is: ‘04.08.1953’
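The error message shows both the readable format YYYY-MM-DD and the equivalent %Y-%m-%d code used by Python's datetime module. The sketch below reproduces that relationship for just these three tokens. The helper names are hypothetical, and cutplace may support more tokens and handle them differently.

```python
from datetime import datetime


def to_strptime_format(human_format):
    """Translate YYYY, MM and DD into strptime codes (illustrative subset)."""
    return (
        human_format.replace("YYYY", "%Y")
        .replace("MM", "%m")
        .replace("DD", "%d")
    )


def matches_date_format(value, human_format="YYYY-MM-DD"):
    """Return True if value can be parsed using the given readable format."""
    try:
        datetime.strptime(value, to_strptime_format(human_format))
    except ValueError:
        return False
    return True


print(to_strptime_format("YYYY-MM-DD"))   # %Y-%m-%d
print(matches_date_format("1953-08-04"))  # True
print(matches_date_format("04.08.1953"))  # False
```

The value 04.08.1953 fails because it uses dots as separators and puts the day first, so it cannot be parsed as year, month and day separated by hyphens.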

Additionally, cutplace provides an easy-to-use API to read and write tabular data files through a common interface, without having to deal with the intricacies of format-specific modules. To read and validate the above example:

import cutplace
import cutplace.errors

cid_path = 'cid_customers.ods'
data_path = 'customers.csv'
try:
    for row in cutplace.rows(cid_path, data_path):
        pass  # We could also do something useful with the data in ``row`` here.
except cutplace.errors.DataError as error:
    print(error)

For more information, read the documentation at http://cutplace.readthedocs.org/ or visit the project at https://github.com/roskakori/cutplace.
