validate data stored in CSV, PRN, ODS or Excel files
Project description
Cutplace is a tool and API to validate that tabular data stored in CSV, Excel, ODS and PRN files conform to a cutplace interface definition (CID).
As an example, consider the following customers.csv file that stores data about customers:
customer_id,surname,first_name,born,gender
1,Beck,Tyler,1995-11-15,male
2,Gibson,Martin,1969-08-18,male
3,Hopkins,Chester,1982-12-19,male
4,Lopez,Tyler,1930-10-13,male
5,James,Ana,1943-08-10,female
6,Martin,Jon,1932-09-27,male
7,Knight,Carolyn,1977-05-25,female
8,Rose,Tammy,2004-01-12,female
9,Gutierrez,Reginald,2010-05-18,male
10,Phillips,Pauline,1960-11-09,female
A CID can describe such a file in an easy-to-read way. It consists of three sections. First, there is the general data format:
| | Property | Value |
|---|---|---|
| D | Format | Delimited |
| D | Encoding | UTF-8 |
| D | Header | 1 |
| D | Line delimiter | LF |
| D | Item delimiter | , |
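To make these properties concrete, here is a minimal sketch that reads the same kind of data with Python's standard `csv` module instead of cutplace. The mapping of each property to a constant is an assumption made for illustration, and `read_delimited` is a hypothetical helper, not part of the cutplace API.

```python
import csv
import io

# Assumed mapping of the CID data format properties to plain Python settings.
ENCODING = "utf-8"      # Encoding: UTF-8
HEADER_ROWS = 1         # Header: 1 (number of leading rows to skip)
LINE_DELIMITER = "\n"   # Line delimiter: LF
ITEM_DELIMITER = ","    # Item delimiter: ,


def read_delimited(data: bytes) -> list[list[str]]:
    """Decode ``data`` and return its data rows without the header rows."""
    text = data.decode(ENCODING)
    # newline="" leaves line endings untouched so the csv module can split them.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=ITEM_DELIMITER)
    return list(reader)[HEADER_ROWS:]


# A shortened copy of customers.csv, joined with the line delimiter from above.
sample = LINE_DELIMITER.join(
    [
        "customer_id,surname,first_name,born,gender",
        "1,Beck,Tyler,1995-11-15,male",
        "2,Gibson,Martin,1969-08-18,male",
    ]
).encode(ENCODING)

for row in read_delimited(sample):
    print(row)
```

Unlike cutplace, this sketch only splits the data into items; it does not check any of them.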
Next there are the fields stored in the data file:
| | Name | Example | Empty | Length | Type | Rule |
|---|---|---|---|---|---|---|
| F | customer_id | 3798 | | | Integer | 0…99999 |
| F | surname | Miller | | …60 | | |
| F | first_name | John | X | …60 | | |
| F | date_of_birth | 1978-11-27 | | | DateTime | YYYY-MM-DD |
| F | gender | male | X | | Choice | female, male |
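The field definitions above can be read as a set of per-field checks. The following is a hand-written approximation in plain Python, not how cutplace implements them. It assumes that an X in the Empty column means the field may be empty, that `…60` means "at most 60 characters", and that `YYYY-MM-DD` corresponds to the `strptime` format `%Y-%m-%d`. `check_customer_row` is a hypothetical name.

```python
from datetime import datetime


def check_customer_row(row: list[str]) -> list[str]:
    """Return the problems found in ``row``; an empty list means it passed."""
    customer_id, surname, first_name, date_of_birth, gender = row
    problems = []

    # customer_id: Integer with rule 0...99999, must not be empty.
    try:
        id_is_valid = 0 <= int(customer_id) <= 99999
    except ValueError:
        id_is_valid = False
    if not id_is_valid:
        problems.append("customer_id must be an integer between 0 and 99999")

    # surname: at most 60 characters, must not be empty.
    if not 1 <= len(surname) <= 60:
        problems.append("surname must have between 1 and 60 characters")

    # first_name: at most 60 characters, may be empty (assumed meaning of X).
    if len(first_name) > 60:
        problems.append("first_name must have at most 60 characters")

    # date_of_birth: DateTime with rule YYYY-MM-DD, must not be empty.
    try:
        datetime.strptime(date_of_birth, "%Y-%m-%d")
    except ValueError:
        problems.append("date_of_birth must match format YYYY-MM-DD")

    # gender: Choice of female or male, may be empty (assumed meaning of X).
    if gender not in ("", "female", "male"):
        problems.append("gender must be one of: female, male")

    return problems


print(check_customer_row(["3798", "Miller", "John", "1978-11-27", "male"]))
print(check_customer_row(["100000", "", "John", "27.11.1978", "other"]))
```

Collecting all problems per row is a choice made for this sketch; whether cutplace reports one or several problems per row is not stated here.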
Optionally you can describe conditions that must be met across the whole file:
| | Description | Type | Rule |
|---|---|---|---|
| C | customer must be unique | IsUnique | customer_id |
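Unlike a field rule, a condition such as this one needs to remember values across rows. As a rough illustration of the idea (an assumption about the general technique, not cutplace's implementation), a uniqueness check can keep a set of the values seen so far. `find_duplicates` is a hypothetical helper.

```python
def find_duplicates(rows: list[list[str]], key_index: int = 0) -> list[tuple[int, str]]:
    """Return (row_number, value) pairs for values already seen in an earlier row."""
    seen = set()
    duplicates = []
    for row_number, row in enumerate(rows, start=1):
        value = row[key_index]
        if value in seen:
            duplicates.append((row_number, value))
        else:
            seen.add(value)
    return duplicates


rows = [
    ["1", "Beck"],
    ["2", "Gibson"],
    ["1", "Hopkins"],  # repeats customer_id 1
]
print(find_duplicates(rows))  # [(3, '1')]
```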
The CID can be stored in common spreadsheet formats, in particular Excel and ODS, for example cid_customers.ods.
Cutplace can validate that the data file conforms to the CID:
$ cutplace cid_customers.ods customers.csv
Now add a new line with a broken date_of_birth:
73921,Harris,Diana,04.08.1953,female
Cutplace rejects this file with the error message:
customers.csv (R12C4): cannot accept field ‘date_of_birth’: date must match format YYYY-MM-DD (%Y-%m-%d) but is: ‘04.08.1953’
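The location `R12C4` in this message is consistent with row 12, column 4: one header row, ten existing customers, and the new line as row 12, with date_of_birth as the fourth field. Assuming that reading, and assuming `%Y-%m-%d` is the `strptime` format used for the check, the following sketch reproduces the location with the standard library only. `find_broken_dates` is a hypothetical helper, not part of cutplace.

```python
import csv
import io
from datetime import datetime

DATA = """customer_id,surname,first_name,born,gender
1,Beck,Tyler,1995-11-15,male
2,Gibson,Martin,1969-08-18,male
3,Hopkins,Chester,1982-12-19,male
4,Lopez,Tyler,1930-10-13,male
5,James,Ana,1943-08-10,female
6,Martin,Jon,1932-09-27,male
7,Knight,Carolyn,1977-05-25,female
8,Rose,Tammy,2004-01-12,female
9,Gutierrez,Reginald,2010-05-18,male
10,Phillips,Pauline,1960-11-09,female
73921,Harris,Diana,04.08.1953,female
"""

DATE_COLUMN = 4  # date_of_birth is the fourth field, counting from 1


def find_broken_dates(text: str, header_rows: int = 1) -> list[str]:
    """Return assumed RnCm locations of dates that do not match %Y-%m-%d."""
    locations = []
    for row_number, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if row_number <= header_rows:
            continue  # header rows count towards the row number but are not checked
        try:
            datetime.strptime(row[DATE_COLUMN - 1], "%Y-%m-%d")
        except ValueError:
            locations.append(f"R{row_number}C{DATE_COLUMN}")
    return locations


print(find_broken_dates(DATA))  # ['R12C4']
```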
Additionally, cutplace provides an easy-to-use API to read and write tabular data files through a common interface, without having to deal with the intricacies of format-specific modules. To read and validate the above example:
import cutplace
import cutplace.errors

cid_path = 'cid_customers.ods'
data_path = 'customers.csv'
try:
    for row in cutplace.rows(cid_path, data_path):
        pass  # We could also do something useful with the data in ``row`` here.
except cutplace.errors.DataError as error:
    print(error)
For more information, read the documentation at http://cutplace.readthedocs.org/ or visit the project at https://github.com/roskakori/cutplace.
File details
Details for the file cutplace-0.9.2.tar.gz.
File metadata
- Download URL: cutplace-0.9.2.tar.gz
- Upload date:
- Size: 58.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.4 CPython/3.13.0 Darwin/24.1.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dc402ea72348e1fcb70565e16bff3385f00ccb2261017cfae7daca58eed74865 |
| MD5 | aa7d272a551b7a88c2cfa7f94b0d1ede |
| BLAKE2b-256 | 823d152072db0087523e18b27ac24515c3d5c32331032cf968f8d3bf557f9d2b |
File details
Details for the file cutplace-0.9.2-py3-none-any.whl.
File metadata
- Download URL: cutplace-0.9.2-py3-none-any.whl
- Upload date:
- Size: 67.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.4 CPython/3.13.0 Darwin/24.1.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9530da835534b4f898945c4787339a75169eefacf4c77d010fa17f7c35ebe9a8 |
| MD5 | c4a3fc11123987ab1d48343b4ea5ebc5 |
| BLAKE2b-256 | 0c982bcc81b5d352c8368a3e264dcb02af087d6f6874a83cdbd2a13fe90e8165 |