xgrep: A grep for Excel (and CSV/TSV) files.


xgrep

Documentation

The documentation is available at https://xgrep.readthedocs.io

Installation

$ pip install xgrep

Usage is similar to regular grep. Give a pattern and then one or more filenames. There are various options (some with the same name and effect as grep options, like -c, -h, -H, -i, -q, and -v). Run xgrep --help for a full listing.

Example usage

To give an example of using xgrep, suppose you have an Excel file (example.xlsx) with a sheet that looks like this:

[Image: screenshot of the sheet in example.xlsx]

You can download example.xlsx (or find it in the docs/source directory in the repo) if you want to try the following commands.

Find cells based on a regular expression

Look for the regular expression 'Xia|radius|Jilin':

$ xgrep 'Xia|radius|Jilin' example.xlsx
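
As a rough illustration of what a cell-level regular expression search involves, the sketch below applies the same pattern to a small in-memory table using Python's re module. The rows are made-up sample data (not the contents of example.xlsx), and find_matching_cells is a hypothetical helper, not part of xgrep or a description of how xgrep is implemented.

```python
import re

# Hypothetical sample data; NOT the contents of example.xlsx.
header = ["Name", "City", "Note"]
rows = [
    ["Xia", "Beijing", "radius 3"],
    ["Li", "Jilin", "none"],
    ["Wang", "Shanghai", "none"],
]


def find_matching_cells(pattern, rows):
    """Return (row_index, col_index, value) for every cell matching pattern."""
    regex = re.compile(pattern)
    return [
        (r, c, value)
        for r, row in enumerate(rows)
        for c, value in enumerate(row)
        if regex.search(str(value))
    ]


matches = find_matching_cells("Xia|radius|Jilin", rows)
for r, c, value in matches:
    print(f"row {r}, column {header[c]}: {value}")
```

Like grep, the sketch uses a search (a match anywhere in the cell) rather than requiring the whole cell to match.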

Note that the command-line output is all text. It is produced using the Table class of the wonderful rich package.

Display the row and column information

Show row numbers (--rn) and Excel column labels (--ec) from the input file:

$ xgrep --rn --ec 'Xia|radius|Jilin' example.xlsx
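
For readers unfamiliar with Excel column labels, the sketch below shows the usual conversion from a 1-based column number to a label (A, B, ..., Z, AA, AB, ...). The function name excel_column_label is hypothetical, and this is only an illustration of the labelling scheme, not necessarily how xgrep computes it.

```python
def excel_column_label(index):
    """Convert a 1-based column number to an Excel-style label (1 -> 'A')."""
    if index < 1:
        raise ValueError("column numbers start at 1")
    label = ""
    while index > 0:
        # Excel labels are a bijective base-26 system, hence the "- 1".
        index, remainder = divmod(index - 1, 26)
        label = chr(ord("A") + remainder) + label
    return label


for number in (1, 26, 27, 702, 703):
    print(number, excel_column_label(number))
```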

De-emphasize unmatched cells

If you don't care about the values in cells that were not matched, you can give a replacement string (-u) to show in their place:

$ xgrep -u . --rn --ec 'Xia|radius|Jilin' example.xlsx
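
The effect of a replacement string for unmatched cells can be sketched in a few lines of Python. The sample rows are made up, and mask_unmatched is a hypothetical helper that assumes rows without any matching cell are dropped (as grep drops non-matching lines); it is not xgrep's own code.

```python
import re

# Hypothetical sample data; NOT the contents of example.xlsx.
rows = [
    ["Xia", "Beijing", "radius 3"],
    ["Li", "Jilin", "none"],
    ["Wang", "Shanghai", "none"],
]


def mask_unmatched(pattern, rows, placeholder="."):
    """Keep rows with at least one match; blank out their unmatched cells."""
    regex = re.compile(pattern)
    masked = []
    for row in rows:
        hits = [bool(regex.search(str(value))) for value in row]
        if any(hits):
            masked.append(
                [value if hit else placeholder for value, hit in zip(row, hits)]
            )
    return masked


result = mask_unmatched("Xia|radius|Jilin", rows)
for row in result:
    print(row)
```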

Only show matching columns

To exclude columns with no matching cells, you can show only matching columns (--omc):

$ xgrep --omc --rn --ec 'Xia|radius|Jilin' example.xlsx
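
Conceptually, showing only matching columns means keeping a column if any of its cells matches the pattern. The sketch below illustrates this on made-up data; only_matching_columns is a hypothetical helper (not xgrep's implementation), and it assumes every row has the same number of cells as the header.

```python
import re

# Hypothetical sample data; NOT the contents of example.xlsx.
header = ["Name", "City", "Population"]
rows = [
    ["Xia", "Beijing", "21"],
    ["Li", "Jilin", "4"],
]


def only_matching_columns(pattern, header, rows):
    """Drop every column in which no cell matches pattern."""
    regex = re.compile(pattern)
    keep = [
        c
        for c in range(len(header))
        if any(regex.search(str(row[c])) for row in rows)
    ]
    kept_header = [header[c] for c in keep]
    kept_rows = [[row[c] for c in keep] for row in rows]
    return kept_header, kept_rows


kept_header, kept_rows = only_matching_columns("Xia|radius|Jilin", header, rows)
print(kept_header)
print(kept_rows)
```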

Write CSV (or TSV)

The default output format is a rich Table. You can also produce CSV, TSV, or Excel using the --format option:

$ xgrep --format csv 'Xia|radius|Jilin' example.xlsx

If you use --format excel you will also need to give an output filename using --out.
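
To give a feel for what delimited output of matching rows amounts to, here is a minimal sketch using Python's standard csv module. The data and the matching_rows_as_text helper are hypothetical; the exact columns and quoting xgrep emits for --format csv or --format tsv may differ.

```python
import csv
import io
import re

# Hypothetical sample data; NOT the contents of example.xlsx.
header = ["Name", "City"]
rows = [
    ["Xia", "Beijing"],
    ["Wang", "Shanghai"],
    ["Li", "Jilin"],
]


def matching_rows_as_text(pattern, header, rows, delimiter=","):
    """Return the header plus every matching row as delimited text."""
    regex = re.compile(pattern)
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        if any(regex.search(str(value)) for value in row):
            writer.writerow(row)
    return buffer.getvalue()


csv_text = matching_rows_as_text("Xia|radius|Jilin", header, rows)
tsv_text = matching_rows_as_text("Xia|radius|Jilin", header, rows, delimiter="\t")
print(csv_text, end="")
```

Passing delimiter="\t" gives tab-separated output from the same function.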

More...

By default, xgrep will only search the first sheet in a workbook. You can search them all by passing --sheet-id 0.

Usage

Usage: xgrep [OPTIONS] PATTERN FILENAMES...

  Command-line interface.

Options:
  -o, --out FILE                  The output file. If not given, output is
                                  written to standard out. Note that if the
                                  --quiet (-q) option is also given, no output
                                  will be written to the output file.
  --header / --no-header          Don't look for a header line in input files.
                                  In this case, text in what would otherwise
                                  be considered a header can also be matched
                                  by the grep pattern.
  --skip INTEGER RANGE            Skip this many rows at the start of the
                                  input file(s).  [x>=0]
  --format [csv|excel|rich|tsv]   The output format. The 'rich' format
                                  produces a rich Table (see https://rich.read
                                  thedocs.io/en/stable/tables.html).
  -c, --count                     Only print the number of matching lines
                                  (like grep -c).
  --width INTEGER                 The width to use for --format rich tables.
  -v, --invert                    Only output rows that do not match (like
                                  grep -v).
  -q, --quiet, --silent           Do not show any output, just exit with a
                                  status indicating whether a match was found
                                  (0) or not (1) (like grep -q).
  --ignore-missing-sheets, --ims  Do not exit if a requested sheet cannot be
                                  found in an input Excel file. A warning will
                                  be printed unless --quiet is used.
  --only-matching-cols, --omc, --mco
                                  Only show columns that have a matching cell.
  -i, --ignore-case               Ignore case while matching (like grep -i).
  --color TEXT                    The highlight color.
  -u, --unmatched TEXT            The string to show for cells whose values do
                                  not match. If not given, non-matching cells
                                  are shown with their value (in which case
                                  you will need to use the output color to see
                                  matches).
  -n, --row-numbers, --rn, --line-number
                                  Show row numbers (like grep -n).
  --col-numbers, --cn             Show numeric column numbers. For alphabetic
                                  Excel column labels, use --excel-cols.
  --excel-cols, --ec              Add Excel column labels to column names.
  -b, --basename                  Only show basenames of input files in the
                                  output (and in Excel sheet names, in the
                                  case of --format excel).
  --save-empty-output, --seo      If there are no matches in a file (or Excel
                                  sheet), nothing will be written to the
                                  output file (so the output file will not
                                  exist after xgrep exits). Use this option to
                                  force the writing of empty output files (and
                                  creation of empty Excel worksheets in the
                                  case of --format excel).
  Filenames: [mutually_exclusive]
                                  Whether to display names of matching files.
    -H, --filenames-always, --fa  Always print the name of matching files
                                  (like grep -H).
    -h, --no-filename, --nf       Never print the name of matching files (like
                                  grep -h).
    --only-filename, --of         Only print the names of matching files, not
                                  their matched content.
  --sheet-name, --sn TEXT         The name(s) of the sheet(s) to read. May be
                                  repeated. Cannot be used with --sheet-id.
  --sheet-id, --si INTEGER        The numeric number(s) of the sheet(s) to
                                  read. The default is to search all workbook
                                  sheets in all Excel files (this is
                                  equivalent to --sheet-id 0). Individual
                                  sheet numbering starts from 1. May be
                                  repeated. Cannot be used with --sheet-name.
  --sheet-separator, --ss TEXT    The string used to separate filenames from
                                  sheet names when --format excel is used and
                                  multiple files are being searched. Note that
                                  Excel does not allow some characters (e.g.,
                                  ':') in sheet names.
  --help                          Show this message and exit.
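
Because -q reports its result through the exit status (0 if a match was found, 1 if not, per the help text above), xgrep can be used as a yes/no test from a script. The sketch below assumes xgrep is installed and on your PATH. The has_match helper and its run parameter (there only so the function can be exercised without xgrep installed) are hypothetical, and treating any other exit status as an error is an assumption.

```python
import subprocess


def has_match(pattern, filename, run=subprocess.run):
    """Return True if `xgrep -q PATTERN FILENAME` reports a match."""
    completed = run(["xgrep", "-q", pattern, filename])
    if completed.returncode == 0:
        return True  # documented: 0 means a match was found
    if completed.returncode == 1:
        return False  # documented: 1 means no match
    # Assumption: any other status indicates a problem (e.g. a bad filename).
    raise RuntimeError(f"xgrep exited with status {completed.returncode}")
```

For example, has_match('Xia|radius|Jilin', 'example.xlsx') would run the quiet form of the first search shown earlier.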

Todo

  1. Document format, polars_df, and rich_table (in readthedocs.io).
  2. Can click allow a -e option that also can be used to specify the pattern?
  3. Write tests for unequal numbers of cols.
