Vertical display of delimited data

csvchk

This program will show you the first record of a delimited text file transposed vertically. It is meant to complement the many features of the csvkit tools. For example, given a file like this:

$ csvlook test/test.csv
| id | val |
| -- | --- |
|  1 | foo |
|  2 | bar |
|  3 | baz |

This program will show:

// ****** Record 1 ****** //
id  : 1
val : foo
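
The vertical layout can be approximated in a few lines of Python. This is only a minimal sketch of the idea, not csvchk's actual implementation; the function name `format_record` is hypothetical, and the sample data is an in-memory stand-in for test/test.csv.

```python
import csv
import io


def format_record(record: dict, number: int) -> str:
    """Render one record vertically, padding field names to equal width."""
    width = max(len(name) for name in record)  # length of the longest name
    lines = [f"// ****** Record {number} ****** //"]
    for name, value in record.items():
        lines.append(f"{name:<{width}} : {value}")
    return "\n".join(lines)


# In-memory stand-in for a small comma-separated file with a header row
data = io.StringIO("id,val\n1,foo\n2,bar\n")
first = next(csv.DictReader(data))
print(format_record(first, 1))
```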

Usage and options

Run with -h or --help to see the full usage statement:

$ ./csvchk.py -h
usage: csvchk.py [-h] [-s sep] [-f names] [-l nrecs] [-L nrecs] [-g grep] [-d]
                 [-n] [-N] [-e encode] [--version]
                 FILE [FILE ...]

Check a delimited text file

positional arguments:
  FILE                  Input file(s)

optional arguments:
  -h, --help            show this help message and exit
  -s sep, --sep sep     Field separator (default: )
  -f names, --fieldnames names
                        Field names (no header) (default: )
  -l nrecs, --limit nrecs
                        How many records to show (default: 1)
  -L nrecs, --field-limit nrecs
                        How many fields to show (default: 0)
  -g grep, --grep grep  Only show records with a given value (default: )
  -d, --dense           Not sparse (skip empty fields) (default: False)
  -n, --number          Show field number (e.g., for awk) (default: False)
  -N, --noheaders       No headers in first row (default: False)
  -e encode, --encoding encode
                        File encoding (default: utf-8)
  --version             show program's version number and exit

Separator

The default field separator is a tab character, unless the input file has the extension .csv, in which case it is a comma. You can override the separator with the -s or --sep option.

For example, given this file:

$ cat test/test2.txt
id:val
1:foo
2:bar

You could run:

$ csvchk -s ':' test/test2.txt
// ****** Record 1 ****** //
id  : 1
val : foo
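
The separator rule described in this section could be sketched as follows. The helper name `guess_sep` is an assumption for illustration, and the comma default for .csv files is inferred from the examples in this document rather than taken from the program's source.

```python
import os


def guess_sep(filename: str, sep: str = "") -> str:
    """Return the explicit separator if one was given, else guess from the extension."""
    if sep:
        return sep  # an explicit -s/--sep value always wins
    ext = os.path.splitext(filename)[1]
    # Assumption: .csv implies a comma; everything else falls back to a tab
    return "," if ext == ".csv" else "\t"
```

With this sketch, `guess_sep("test/test2.txt", ":")` returns the colon that was passed in, while `guess_sep("test/test.tab")` falls back to a tab.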

Field names

The input file is assumed to contain column headers (field names) in the first row. If a file has no such headers, you can supply a comma-separated string of names to use instead with -f or --fieldnames.

For example, given this file:

$ cat test/nohdr.csv
1,foo
2,bar

You can run:

$ csvchk -f 'id, value' test/nohdr.csv
// ****** Record 1 ****** //
id    : 1
value : foo
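
In Python, the same effect can be had by passing explicit field names to `csv.DictReader`, which then treats the first row as data. This is a sketch under assumptions: the helper `read_with_names` is hypothetical, and trimming whitespace around each name is inferred from the example output, where 'id, value' yields "value" without a leading space.

```python
import csv
import io
import re


def read_with_names(handle, names: str, sep: str = ","):
    """Parse a headerless file, using a comma-separated string as field names."""
    # Split on commas and discard surrounding whitespace (assumed behavior)
    fieldnames = re.split(r"\s*,\s*", names.strip())
    return list(csv.DictReader(handle, fieldnames=fieldnames, delimiter=sep))


# In-memory stand-in for a headerless file such as test/nohdr.csv
rows = read_with_names(io.StringIO("1,foo\n2,bar\n"), "id, value")
```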

Limit

By default, the program will use the -l or --limit value of 1 to show the first record. You can increase this, for example:

$ csvchk -l 2 test/test.csv
// ****** Record 1 ****** //
id  : 1
val : foo
// ****** Record 2 ****** //
id  : 2
val : bar

To see all the records, use a negative value like -1:

$ csvchk -l -1 test/test.csv
// ****** Record 1 ****** //
id  : 1
val : foo
// ****** Record 2 ****** //
id  : 2
val : bar
// ****** Record 3 ****** //
id  : 3
val : baz
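
One way to express "a negative limit means everything" is with `itertools.islice`, which accepts `None` as an open-ended stop value. The function name `take` is hypothetical, and this is a sketch rather than the program's own logic.

```python
from itertools import islice


def take(records, limit: int = 1):
    """Yield at most `limit` records; a negative limit yields all of them."""
    stop = None if limit < 0 else limit  # None tells islice not to stop early
    return islice(records, stop)
```

Because `islice` is lazy, a small limit avoids reading the rest of a large file.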

Dense output

By default, all fields and values will be shown for each record. For example, given this file:

$ cat test/sparse.csv
id,val
1,foo
2,
,baz

This will be shown:

$ csvchk test/sparse.csv -l -1
// ****** Record 1 ****** //
id  : 1
val : foo
// ****** Record 2 ****** //
id  : 2
val :
// ****** Record 3 ****** //
id  :
val : baz

You can use the -d or --dense option to omit fields that have no values:

$ csvchk test/sparse.csv -l -1 -d
// ****** Record 1 ****** //
id  : 1
val : foo
// ****** Record 2 ****** //
id : 2
// ****** Record 3 ****** //
val : baz
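
Dense output amounts to filtering out empty values before a record is printed. A minimal sketch, assuming records are dictionaries of strings (the name `drop_empty` is hypothetical):

```python
def drop_empty(record: dict) -> dict:
    """Return a copy of the record without fields whose value is empty or missing."""
    return {
        name: value
        for name, value in record.items()
        if value not in ("", None)
    }
```

In the dense output above, the second record prints as "id : 2" with a single space, which suggests the name padding is computed after the empty fields are dropped.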

Numbering fields

The -n or --number option will prepend the field number to each line of output:

$ csvchk -n test/test.tab
// ****** Record 1 ****** //
  1 id  : 1
  2 val : foo

This is useful for finding the field number to use with awk. For example, to find records where the val column (the second field) contains an "a":

$ awk '$2 ~ /a/' test/test.tab
id	val
2	bar
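
Numbering can be sketched with `enumerate` starting at 1, so the numbers match awk's 1-based field positions. The name `numbered_lines` is hypothetical, and the three-character number column is an assumption read off the sample output above.

```python
def numbered_lines(record: dict) -> list:
    """Format each field as '<number> <name> : <value>', numbering from 1."""
    width = max(len(name) for name in record)
    return [
        f"{num:3} {name:<{width}} : {value}"  # number right-aligned in 3 columns
        for num, (name, value) in enumerate(record.items(), start=1)
    ]
```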

No headers

If the input file does not have headers (column names) in the first row, you can use the -N or --noheaders option to have the program create names like "Field1," "Field2," etc.:

$ csvchk -N test/nohdr.csv
// ****** Record 1 ****** //
Field1 : 1
Field2 : foo
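
Generating placeholder names is straightforward once the width of the first row is known. A minimal sketch, assuming every row is keyed by names built from the first row's length (the helper `read_without_headers` is hypothetical):

```python
import csv
import io


def read_without_headers(handle, sep: str = ","):
    """Parse a headerless file, naming the columns Field1, Field2, and so on."""
    rows = list(csv.reader(handle, delimiter=sep))
    if not rows:
        return []
    names = [f"Field{i}" for i in range(1, len(rows[0]) + 1)]
    return [dict(zip(names, row)) for row in rows]


# In-memory stand-in for a headerless file such as test/nohdr.csv
records = read_without_headers(io.StringIO("1,foo\n2,bar\n"))
```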

Filter by record contents

You can use the -g or --grep option to view only records containing a string:

$ csvchk -g ba -l 2 tests/test.csv
// ****** Record 1 ****** //
id  : 2
val : bar
// ****** Record 2 ****** //
id  : 3
val : baz
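
A filter of this kind could be written as a generator. The sketch below assumes a plain, case-sensitive substring match against each field value; whether csvchk matches this way, or uses a regular expression, is not stated here. The name `matching` is hypothetical.

```python
def matching(records, needle: str):
    """Yield only the records in which some field value contains `needle`."""
    for record in records:
        if any(needle in value for value in record.values()):
            yield record


# Stand-in for the three records of the test file
sample = [
    {"id": "1", "val": "foo"},
    {"id": "2", "val": "bar"},
    {"id": "3", "val": "baz"},
]
hits = list(matching(sample, "ba"))
```

Note that in the output above the record numbers count the records shown, so the first match is labeled "Record 1" even though its id is 2.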

Multiple file inputs

If given multiple files as inputs, the program will insert a header noting the basename of each file:

$ csvchk test/test.csv test/test.tab
==> test.csv <==
// ****** Record 1 ****** //
id  : 1
val : foo

==> test.tab <==
// ****** Record 1 ****** //
id  : 1
val : foo
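
The per-file header resembles what `head` prints for multiple files. A sketch using `os.path.basename`, assuming no header is emitted for a single file, as the single-file examples in this document suggest (the name `headers_for` is hypothetical):

```python
import os


def headers_for(paths: list) -> list:
    """Return one '==> name <==' header per file, or none for a single file."""
    if len(paths) < 2:
        return []
    return [f"==> {os.path.basename(path)} <==" for path in paths]
```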

Duplicate column names

Duplicate column names will have a suffix of _<num> starting at the second occurrence. For instance, this file:

$ cat tests/duplicate_cols.csv
name,age,age
Keith,42,42
Jorge,35,35
Geoffrey,51,51

will produce this output:

$ csvchk tests/duplicate_cols.csv
// ****** Record 1 ****** //
name  : Keith
age   : 42
age_2 : 42
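
The renaming rule can be sketched with a counter per name. This is an illustration under assumptions, not the program's source: the function name `dedupe` is hypothetical, and numbering a third occurrence as _3 is inferred from the "_<num>" description.

```python
from collections import Counter


def dedupe(names: list) -> list:
    """Append _2, _3, ... to the second and later occurrences of a name."""
    seen = Counter()
    result = []
    for name in names:
        seen[name] += 1
        # The first occurrence keeps its name; later ones get a numeric suffix
        result.append(name if seen[name] == 1 else f"{name}_{seen[name]}")
    return result
```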

Limiting the columns shown

You can limit the number of columns shown using the -L or --field-limit option:

$ csvchk --field-limit 1 tests/test.csv
// ****** Record 1 ****** //
id  : 1
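
Truncating a record to its first few fields could look like the sketch below. It assumes that a limit of 0, the default shown in the usage statement, means "show all fields", which is consistent with the other examples; the name `limit_fields` is hypothetical.

```python
from itertools import islice


def limit_fields(record: dict, field_limit: int = 0) -> dict:
    """Keep only the first `field_limit` fields; 0 keeps every field."""
    if field_limit <= 0:
        return dict(record)
    # Dictionaries preserve insertion order, so this keeps the leading columns
    return dict(islice(record.items(), field_limit))
```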

Author

Ken Youens-Clark <kyclark@gmail.com>
