Print paragraphs matching regular expressions
paragrep - Paragraph Grep utility
Usage
paragrep [-aiotv] [-p eop_regexp] [-e regexp] … [-f expr_file] … [file] …
paragrep [-itv] [-p eop_regexp] regexp [file] …
Options
- -a, --and
Logically AND all regular expressions
- -e regexp, --regexp=regexp, --expr=regexp
Specify a regular expression to find. This option may be specified multiple times.
- -f expr_file, --file=expr_file
Specify a file of regular expressions, one per line.
- -h, --help
Show this message and exit
- -i, --caseblind
Match without regard to case
- -o, --or
Logically OR all regular expressions
- -p eop_regexp, --eop=eop_regexp
Specify an alternate regular expression to match the end of a paragraph. Default: ^\s*$
Description
paragrep is a paragraph grep utility. It searches for a series of regular expressions in a text file (or several text files) and prints out the paragraphs containing those expressions. Normally paragrep displays a paragraph if it contains any of the expressions; this behavior can be modified by using the -a option.
By default, a paragraph is defined as a block of text delimited by an empty or blank line; this behavior can be altered with the -p option.
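To make the paragraph definition concrete, here is a minimal Python sketch of how text could be split into paragraphs on a delimiter expression. It is an illustration only, not paragrep's actual source: the function name split_paragraphs and its eop_regexp parameter are hypothetical, and the default pattern is the one listed under Options.

```python
import re


def split_paragraphs(text, eop_regexp=r"^\s*$"):
    """Split text into paragraphs (hypothetical helper, not paragrep's code).

    Lines matching eop_regexp act as delimiters and are dropped, so they
    never appear in the returned paragraphs.
    """
    eop = re.compile(eop_regexp)
    paragraphs = []
    current = []
    for line in text.splitlines():
        if eop.search(line):
            # A delimiter line closes the paragraph collected so far, if any.
            if current:
                paragraphs.append("\n".join(current))
                current = []
        else:
            current.append(line)
    # Flush the final paragraph when the text does not end with a delimiter.
    if current:
        paragraphs.append("\n".join(current))
    return paragraphs


if __name__ == "__main__":
    sample = "alpha\nbeta\n\n   \ngamma\n"
    print(split_paragraphs(sample))
    # A custom delimiter, comparable in spirit to the -p option.
    print(split_paragraphs("one\n----\ntwo", r"^-+$"))
```

Consecutive delimiter lines produce no empty paragraphs in this sketch; whether paragrep behaves the same way is an assumption.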
If no files are specified on the command line, paragrep searches standard input.
This is the third implementation of paragrep. The first implementation, in 1989, was in C. The second implementation, in 2003, was in Perl. This is the latest and greatest.
Options in Detail
-a
The and option: Only display a paragraph if it contains all the regular expressions specified. The default is to display a paragraph if it contains any of the regular expressions. See the -o option, below.
-e expression
Adds a regular expression to the set of expressions to use when matching paragraphs. More than one -e argument may be specified. If there’s only one expression, the -e may be omitted for brevity. (Think sed.)
-f expfile
Specifies a file containing regular expressions, one expression per line. Each expression in the file is added to the set of expressions against which paragraphs are to be matched. More than one -f argument is permitted. Also, -f and -e may be specified together.
-i
Considers upper- and lower-case letters to be identical when making comparisons.
-o
The or option: Display a paragraph if it contains any of the regular expressions specified. Since this option is the default, it is rarely specified on the command line. It exists primarily to negate the effect of a previous -a option. (For example, if you’ve defined an alias for paragrep that specifies the -a option, -o would be necessary to force the or behavior.)
-p eop_expression
Specifies a regular expression to be used to match paragraph delimiters. Any line that matches this regular expression is assumed to delimit paragraphs without actually being part of a paragraph (i.e., lines matching this expression are never printed). If this option is not specified, it defaults to:
^[ \t]*$
which matches blank or empty lines. (\t represents the horizontal tab character. If you need to specify a horizontal tab, you’ll need to type the actual character; paragrep doesn’t recognize C-style metacharacters.)
-v
Displays all paragraphs that do not match the specified expressions. The negation logic follows De Morgan’s laws. Normally, if -a is specified, paragrep uses the following logic to match a paragraph:
match = contains(expr1) AND contains(expr2) ...
Specifying -v along with -a changes this logic to:
match = lacks(expr1) OR lacks(expr2) ...
Likewise, without -a or -v (i.e., using -o, which is the default), the matching logic is:
match = contains(expr1) OR contains(expr2) ...
Negating that logic with -v causes paragrep to match paragraphs with:
match = lacks(expr1) AND lacks(expr2) ...
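The four cases above can be sketched in Python as a single predicate. This is a hedged illustration of the logic as described, not paragrep's actual implementation: the function paragraph_matches and its parameters use_and, negate, and ignore_case are hypothetical names standing in for -a, -v, and -i, and the sketch assumes each expression is searched against the whole paragraph text.

```python
import re


def paragraph_matches(paragraph, patterns, use_and=False, negate=False,
                      ignore_case=False):
    """Decide whether a paragraph is selected (hypothetical sketch).

    use_and     -> all expressions must be present (like -a)
    negate      -> invert the selection via De Morgan's laws (like -v)
    ignore_case -> case-insensitive matching (like -i)
    """
    flags = re.IGNORECASE if ignore_case else 0
    contains = [re.search(p, paragraph, flags) is not None for p in patterns]
    if negate:
        lacks = [not found for found in contains]
        # NOT (a AND b) == lacks(a) OR lacks(b)
        # NOT (a OR b)  == lacks(a) AND lacks(b)
        return any(lacks) if use_and else all(lacks)
    return all(contains) if use_and else any(contains)


if __name__ == "__main__":
    text = "foo bar"
    exprs = ["foo", "baz"]
    print(paragraph_matches(text, exprs))                             # OR
    print(paragraph_matches(text, exprs, use_and=True))               # AND
    print(paragraph_matches(text, exprs, use_and=True, negate=True))  # -a -v
    print(paragraph_matches(text, exprs, negate=True))                # -v
```

For any fixed set of expressions, the negated result is always the logical inverse of the non-negated one, which is exactly what De Morgan's laws guarantee.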
See Also
The Unix grep command
The Python re module (http://docs.python.org/lib/module-re.html)
Copyright and License
Copyright (c) 1989-2008 Brian M. Clapper
This is free software, released under the following BSD-like license:
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
The end-user documentation included with the redistribution, if any, must include the following acknowledgement:
This product includes software developed by Brian M. Clapper (bmc@clapper.org, http://www.clapper.org/bmc/). That software is copyright (c) 2008 Brian M. Clapper.
Alternately, this acknowledgement may appear in the software itself, if and wherever such third-party acknowledgements normally appear.
THIS SOFTWARE IS PROVIDED "AS IS" AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL BRIAN M. CLAPPER BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.