Vertical display of delimited data
csvchk
This program will show you the first record of a delimited text file transposed vertically.
It is meant to complement the many features of the csvkit tools.
For example, given a file like this:
$ csvlook test/test.csv
| id | val |
| -- | --- |
| 1 | foo |
| 2 | bar |
| 3 | baz |
This program will show:
// ****** Record 1 ****** //
id : 1
val : foo
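The transposition shown above can be sketched in a few lines of Python. This is a minimal illustration built on the standard csv module, not csvchk's actual source; the function name format_record and the in-memory sample data are hypothetical.

```python
import csv
import io


def format_record(record: dict, number: int) -> str:
    """Render one record vertically, one 'name : value' line per field."""
    lines = [f"// ****** Record {number} ****** //"]
    for name, value in record.items():
        lines.append(f"{name} : {value}")
    return "\n".join(lines)


# In-memory stand-in for a comma-separated file with a header row.
data = io.StringIO("id,val\n1,foo\n2,bar\n")

# DictReader takes the field names from the first row.
first = next(csv.DictReader(data))
print(format_record(first, 1))
```

Reading only the first record with next() means the rest of the file is never parsed, which keeps a quick check cheap even on large inputs.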
Usage and options
Run with -h or --help for the full usage statement:
$ ./csvchk.py -h
usage: csvchk.py [-h] [-s sep] [-f names] [-l nrecs] [-L nrecs] [-g grep] [-d]
                 [-n] [-N] [-e encode] [--version]
                 FILE [FILE ...]

Check a delimited text file

positional arguments:
  FILE                  Input file(s)

optional arguments:
  -h, --help            show this help message and exit
  -s sep, --sep sep     Field separator (default: )
  -f names, --fieldnames names
                        Field names (no header) (default: )
  -l nrecs, --limit nrecs
                        How many records to show (default: 1)
  -L nrecs, --field-limit nrecs
                        How many fields to show (default: 0)
  -g grep, --grep grep  Only show records with a given value (default: )
  -d, --dense           Not sparse (skip empty fields) (default: False)
  -n, --number          Show field number (e.g., for awk) (default: False)
  -N, --noheaders       No headers in first row (default: False)
  -e encode, --encoding encode
                        File encoding (default: utf-8)
  --version             show program's version number and exit
Separator
The default field separator is a tab character unless the input file has the extension .csv, in which case a comma is used.
You can change this value using the -s or --sep option.
For example, given this file:
$ cat test/test2.txt
id:val
1:foo
2:bar
You could run:
$ csvchk -s ':' test/test2.txt
// ****** Record 1 ****** //
id : 1
val : foo
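One way to picture the separator rule is the sketch below. It assumes that an explicit separator always wins and that a .csv extension implies a comma; the function name guess_separator is hypothetical and is not taken from csvchk itself.

```python
import os


def guess_separator(filename: str, sep: str = "") -> str:
    """Return the explicit separator if given, else guess from the extension."""
    if sep:
        return sep
    _, ext = os.path.splitext(filename)
    # Assumption: ".csv" implies a comma; anything else falls back to a tab.
    return "," if ext == ".csv" else "\t"
```

Under these assumptions, guess_separator("test/test2.txt", ":") returns the colon passed in, while a .tab or .txt file with no explicit separator is read as tab-delimited.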
Field names
The input file is assumed to contain column headers/field names in the first row.
If a file has no such headers, you can use -f or --fieldnames to provide a comma-separated string of names to use instead.
For example, given this file:
$ cat test/nohdr.csv
1,foo
2,bar
You can run:
$ csvchk -f 'id, value' test/nohdr.csv
// ****** Record 1 ****** //
id : 1
value : foo
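Supplying names for a headerless file maps directly onto the fieldnames argument of csv.DictReader. The sketch below is illustrative only: read_with_names is a hypothetical helper, and trimming spaces around each name is an assumption suggested by the 'id, value' example above.

```python
import csv
import io


def read_with_names(handle, names: str, sep: str = ","):
    """Read a headerless file, using caller-supplied field names."""
    # Split the comma-separated string and trim spaces around each name
    # (the trimming is an assumption, not confirmed behavior).
    fieldnames = [name.strip() for name in names.split(",")]
    # With fieldnames given, DictReader treats the first row as data.
    return list(csv.DictReader(handle, fieldnames=fieldnames, delimiter=sep))


# In-memory stand-in for a file with no header row.
records = read_with_names(io.StringIO("1,foo\n2,bar\n"), "id, value")
```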
Limit
By default, the program will use the -l or --limit value of 1 to show the first record.
You can increase this, for example:
$ csvchk -l 2 test/test.csv
// ****** Record 1 ****** //
id : 1
val : foo
// ****** Record 2 ****** //
id : 2
val : bar
To see all the records, use a negative value like -1:
$ csvchk -l -1 test/test.csv
// ****** Record 1 ****** //
id : 1
val : foo
// ****** Record 2 ****** //
id : 2
val : bar
// ****** Record 3 ****** //
id : 3
val : baz
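The limit behavior, including the "negative means everything" convention, can be expressed with itertools.islice. This is a sketch of the idea rather than the program's own code; the helper name take is hypothetical.

```python
from itertools import islice


def take(records, limit: int = 1):
    """Yield up to `limit` records; a negative limit yields all of them."""
    # islice accepts None as "no upper bound".
    stop = None if limit < 0 else limit
    return islice(records, stop)
```

Because islice is lazy, a small limit stops reading as soon as enough records have been produced.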
Dense output
By default, all fields and values will be shown for each record. For example, given this file:
$ cat test/sparse.csv
id,val
1,foo
2,
,baz
This will be shown:
$ csvchk test/sparse.csv -l -1
// ****** Record 1 ****** //
id : 1
val : foo
// ****** Record 2 ****** //
id : 2
val :
// ****** Record 3 ****** //
id :
val : baz
You can use the -d or --dense option to omit fields that have no values:
$ csvchk test/sparse.csv -l -1 -d
// ****** Record 1 ****** //
id : 1
val : foo
// ****** Record 2 ****** //
id : 2
// ****** Record 3 ****** //
val : baz
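Dense output amounts to filtering each record before printing it. The following sketch assumes that "no value" means an empty string or a missing value (None); visible_fields is a hypothetical name.

```python
def visible_fields(record: dict, dense: bool = False) -> dict:
    """Return the fields to print; in dense mode, drop empty ones."""
    if not dense:
        return dict(record)
    # Assumption: a field counts as empty if it is "" or None.
    return {
        name: value
        for name, value in record.items()
        if value not in ("", None)
    }
```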
Numbering fields
The -n or --number option will prepend the field number to each line of the output:
$ csvchk -n test/test.tab
// ****** Record 1 ****** //
1 id : 1
2 val : foo
This can be useful if you would like to know the field number to use with awk, e.g., we could look for records where the val column (in the second position) has an "a":
$ awk '$2 ~ /a/' test/test.tab
id val
2 bar
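Field numbering of this kind is a one-based enumeration over the record's fields, which is what makes the numbers line up with awk's $1, $2, and so on. A minimal sketch, with the hypothetical helper name numbered_lines:

```python
def numbered_lines(record: dict) -> list:
    """Prefix each 'name : value' line with its 1-based field position."""
    return [
        f"{position} {name} : {value}"
        # start=1 so the positions match awk's $1, $2, ...
        for position, (name, value) in enumerate(record.items(), start=1)
    ]
```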
No headers
If the input file does not have headers (column names) in the first row, you can use the -N or --noheaders option to have the program create names like "Field1," "Field2," etc.:
$ csvchk -N test/nohdr.csv
// ****** Record 1 ****** //
Field1 : 1
Field2 : foo
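Generated names like these can be derived from the width of the first row. The sketch below is an assumption about how such names might be produced, not the program's implementation; read_without_headers is a hypothetical helper.

```python
import csv
import io


def read_without_headers(handle, sep: str = ","):
    """Read every row as data, naming the columns Field1, Field2, ..."""
    rows = list(csv.reader(handle, delimiter=sep))
    if not rows:
        return []
    # Assumption: the first row determines how many names are needed.
    names = [f"Field{i}" for i in range(1, len(rows[0]) + 1)]
    return [dict(zip(names, row)) for row in rows]


# In-memory stand-in for a file with no header row.
records = read_without_headers(io.StringIO("1,foo\n2,bar\n"))
```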
Filter by record contents
You can use the -g or --grep option to view only records containing a string:
$ csvchk -g ba -l 2 tests/test.csv
// ****** Record 1 ****** //
id : 2
val : bar
// ****** Record 2 ****** //
id : 3
val : baz
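The filter can be thought of as a substring test over each record's values. Whether the real option does plain substring matching or regular-expression matching is not stated here, so the sketch below assumes plain substrings; grep_records and the sample data are hypothetical.

```python
def grep_records(records, pattern: str):
    """Keep records in which any value contains `pattern` (plain substring)."""
    return [
        record
        for record in records
        if any(pattern in str(value) for value in record.values())
    ]


# Sample data standing in for the parsed input file.
records = [
    {"id": "1", "val": "foo"},
    {"id": "2", "val": "bar"},
    {"id": "3", "val": "baz"},
]
matches = grep_records(records, "ba")
```

Note that in the output above the record numbers count the matches, not the original row positions: the record with id 2 is shown as Record 1.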
Multiple file inputs
If given multiple files as inputs, the program will insert a header noting the basename of each file:
$ csvchk test/test.csv test/test.tab
==> test.csv <==
// ****** Record 1 ****** //
id : 1
val : foo
==> test.tab <==
// ****** Record 1 ****** //
id : 1
val : foo
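The per-file header resembles the one printed by head when given several files. A sketch of producing it, under the assumption that the header is emitted only when more than one file is given; file_header and headers_for are hypothetical names.

```python
import os


def file_header(path: str) -> str:
    """Build the '==> name <==' banner from the file's basename."""
    return f"==> {os.path.basename(path)} <=="


def headers_for(paths: list) -> list:
    """Return one banner per file, or none when only one file is given."""
    # Assumption: a single input file gets no banner.
    return [file_header(path) for path in paths] if len(paths) > 1 else []
```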
Duplicate Column Names
Duplicate column names will have a suffix of _<num> starting at the second occurrence.
For instance, this file:
$ cat tests/duplicate_cols.csv
name,age,age
Keith,42,42
Jorge,35,35
Geoffrey,51,51
Will produce this output:
$ csvchk tests/duplicate_cols.csv
// ****** Record 1 ****** //
name : Keith
age : 42
age_2 : 42
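The renaming rule can be sketched with a running count per name. This is an illustration of the behavior described above, not the program's own code; dedupe_names is a hypothetical helper.

```python
from collections import Counter


def dedupe_names(names: list) -> list:
    """Append _2, _3, ... to the second and later occurrences of a name."""
    seen = Counter()
    unique = []
    for name in names:
        seen[name] += 1
        # The first occurrence keeps its name; later ones get a suffix.
        unique.append(name if seen[name] == 1 else f"{name}_{seen[name]}")
    return unique
```

This simple version does not guard against a file that already contains a column literally named age_2; handling that case would need an extra check.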
Limiting the Columns Shown
You may wish to limit the number of columns shown using the -L or --field-limit option:
$ csvchk --field-limit 1 tests/test.csv
// ****** Record 1 ****** //
id : 1
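Limiting the fields is a truncation of each record to its first few columns. The sketch below assumes that the default of 0 means "no limit," which is consistent with the default output showing every field; limit_fields is a hypothetical name.

```python
from itertools import islice


def limit_fields(record: dict, field_limit: int = 0) -> dict:
    """Keep the first `field_limit` fields; 0 keeps them all."""
    # Assumption: zero (the default) or a negative value means no limit.
    if field_limit <= 0:
        return dict(record)
    # Dicts preserve insertion order, so this keeps the leftmost columns.
    return dict(islice(record.items(), field_limit))
```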
Author
Ken Youens-Clark <kyclark@gmail.com>