A command line tool for searching multiple substrings in a multiline text file

findany

A command-line utility that retains only those lines from a text file or standard input that contain at least one substring from a list.

Features

  • Searches for multiple substrings at once, and is designed to match millions of substrings efficiently.
  • Reads from standard input or a text file.
  • Writes filtered lines to standard output or redirects them to a text file.
  • Optional case-insensitive search.
  • Optional inversion of search.
  • Optionally prints only the first matched substring instead of the entire line (incompatible with inverted search).
  • Supports binary files.
  • Shows a progress bar while processing large files when the output is redirected to a file.
  • Runs on Windows and Linux.
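
The filtering behaviour listed above can be modelled in a few lines of Python. This is a naive reference sketch for reasoning about expected output, not findany's actual implementation, which is written in C and tuned for very large substring lists. The function name `filter_lines` and its parameters are hypothetical, and modelling case-insensitive matching with `str.lower()` is an assumption.

```python
from typing import Iterable, Iterator


def filter_lines(
    lines: Iterable[str],
    substrings: Iterable[str],
    case_insensitive: bool = False,
    invert: bool = False,
) -> Iterator[str]:
    """Yield lines that contain at least one substring (or none, if invert)."""
    # Assumption: case-insensitive matching is modelled with str.lower().
    needles = [s.lower() if case_insensitive else s for s in substrings]
    for line in lines:
        haystack = line.lower() if case_insensitive else line
        matched = any(needle in haystack for needle in needles)
        # With invert=True, keep only the lines that matched nothing.
        if matched != invert:
            yield line


if __name__ == "__main__":
    text = ["apple pie", "Banana split", "cherry tart"]
    print(list(filter_lines(text, ["apple", "banana"])))
    print(list(filter_lines(text, ["apple", "banana"], case_insensitive=True)))
    print(list(filter_lines(text, ["apple", "banana"], invert=True)))
```

With the sample data, the default search keeps only "apple pie", the case-insensitive search also keeps "Banana split", and the inverted search keeps the two lines that the default search rejected.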

Installation

Option 1: Use npm

npm i -g findany

Option 2: Use pip

pip install findany

On Windows, make sure that the Scripts directory of your Python installation is on PATH.

Option 3: Download binary

  1. Download the latest release from the Releases page.
  2. Save it to any folder (e.g. /usr/local/bin/ or C:\Users\<user>\AppData\Local\) and add that folder to the PATH variable.
  3. Make executable if needed (chmod +x /usr/local/bin/findany).

Build

The program is written in C and can be compiled with gcc. It is recommended to enable SSE4.1 for low-level optimizations.

gcc -msse4.1 ./src/findany.c -o findany -O3
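
Before building with -msse4.1, it can help to confirm that the target CPU supports SSE4.1. On Linux, the kernel reports this capability as `sse4_1` in the `flags` line of /proc/cpuinfo. The sketch below assumes a Linux x86 machine; the helper name `has_sse41` is hypothetical and not part of findany.

```python
from pathlib import Path


def has_sse41(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in cpuinfo text lists sse4_1."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "sse4_1" in value.split():
            return True
    return False


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only; absent on Windows
    if path.exists():
        print("SSE4.1 supported:", has_sse41(path.read_text()))
    else:
        print("/proc/cpuinfo not found; check CPU features another way")
```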

Test

Functional tests are configured using YAML files and run with pytest.

cd ./test && python -m pytest ./test.py

Usage

findany [OPTIONS] [SUBSTRINGS] [FILE]

Options

  • -i, --case-insensitive: Perform a case-insensitive search. By default, searches are case-sensitive.
  • -v, --invert: Search for lines that contain none of the specified substrings.
  • -o, --output OUTPUT: Write the output to OUTPUT instead of printing to standard output. This also enables the progress bar.
  • -s, --substring SUBSTRING: Pass a substring as a command-line argument instead of reading substrings from a file. Can be used multiple times. Must not be used together with the SUBSTRINGS argument.
  • -h, --help: Display the help message and exit.

Arguments

  • SUBSTRINGS: A file containing the substrings to search for, one substring per line.
  • FILE: The file to search in. If not provided, standard input will be used.
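
When findany is driven from a script, the rules above translate into a small argument-building step. The sketch below uses only the flags listed in this section. The helper `build_findany_command` and its parameter names are hypothetical, and it only assembles the argument list without running anything. Passing FILE as the sole positional argument when -s is used is an assumption drawn from the usage line.

```python
from typing import List, Optional, Sequence


def build_findany_command(
    substrings_file: Optional[str] = None,
    substrings: Sequence[str] = (),
    input_file: Optional[str] = None,
    output_file: Optional[str] = None,
    case_insensitive: bool = False,
    invert: bool = False,
) -> List[str]:
    """Assemble a findany argument list from the documented options."""
    # -s must not be combined with the SUBSTRINGS file argument.
    if substrings_file and substrings:
        raise ValueError("use either a SUBSTRINGS file or -s, not both")
    if not substrings_file and not substrings:
        raise ValueError("no substrings given")

    cmd = ["findany"]
    if case_insensitive:
        cmd.append("-i")
    if invert:
        cmd.append("-v")
    if output_file:
        cmd += ["-o", output_file]
    for s in substrings:
        cmd += ["-s", s]  # -s may be repeated
    if substrings_file:
        cmd.append(substrings_file)
    if input_file:
        # Assumption: with -s, FILE is the only positional argument.
        cmd.append(input_file)  # when omitted, findany reads standard input
    return cmd


if __name__ == "__main__":
    print(build_findany_command(substrings=["mySubstring", "otherSubstring"]))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once findany is installed and on PATH.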

Example

  1. Search in a file for any of the substrings specified in substrings.txt and write the matching lines to output.txt:
findany -o output.txt substrings.txt input.txt
  2. Case-insensitive search for substrings:
findany -i substrings.txt input.txt
  3. Read from standard input and write to standard output:
cat input.txt | findany substrings.txt > output.txt
  4. Read from standard input, write to standard output, pass two substrings via command-line arguments:
findany -s mySubstring -s otherSubstring < input.txt > output.txt

More examples are available in the test cases folder.

License

This program is licensed under the GNU General Public License v3.0. See the LICENSE file for more details.
