Search a directory path for any of a list of strings.
Walk a directory tree, searching for files containing any of a set of text strings.
Why not just use find and grep?
- Avoids long, hard-to-debug shell commands with lots of backticks and parentheses.
- Works on Windows without needing to install a unix work-alike like Cygwin.
- Searches for a bunch of different strings in one go.
- Searches within (possibly compressed) jar, tar, or zip archives.
- Outputs results in CSV or Excel format.
- Tested on Windows 10 and Linux. May work on other platforms supported by Python.
- Python 3.4 or later (https://www.python.org/downloads/).
- Pip module appropriate to the installed Python version. (https://pip.pypa.io/en/stable/installing/).
A note about installing python and pip
Some Linux systems (also Cygwin on Windows) come with python 2 pre-installed. You have to install python 3 yourself. There are gotchas involved:
- The "python" and "pip" packages may be reserved for version 2. It may not be as easy as "apt-get install python3" either. The latest package might be called "python3.7" or similar. Ditto with pip.
- Once installed, the python 3 binary may be called "python3", not "python". Ditto with pip.
Installation from pypi:
$ python -m pip install string_path_search --user
Installation from GitHub
$ git clone email@example.com:j-lawrence-b1/string-path-search.git $ python string_path_search/setup.py install --user
Although you can import and use this package in other Python scripts, string_path_search is primarily intended to be invoked as a console app:
$ python -m string_path_search [OPTIONS] <scan-root> [<search-term> [...]] where: -a, --scan-archives = Unpack and scan within archives (Default: Skip arhive files. Only jar, tar, and zip archives will be unpacked. Tar bzip2, gzip, and xz compression is supported. -B, --branding-text=<branding-text> = A string of text containing company or other information to add above the column headers in scan reports (Default: no text). -b, --branding-logo=<branding-logo> = (MS Excel only) An image file containing a corporate logo or other graphic to add above the column headers in scan reports (Default: no logo). -h, --help = Print usage information and exit. -e, --excel-output = Generate Microsoft Excel 2007 (.xlsx) output (Default: Generate comma-separated-value (CSV) text output) -i --ingore-case = Ignore UPPER/lowercase differences when matching strings (Default: case differences are significant). -o, --output-dir=<output-dir> = Location for output (Default: <current working directory>). -s, --search-strings-file=<search-strings> = A file containing strings to search for, one per line (Default: Get search strings from the command line). -q, --quiet = Decrease logging verbosity (may repeat). -qqqq will suppress all logging. -t, --temp-dir=<temp-dir> = Location for unpacking archives (Default: <output_dir>/temp). -v, --verbose = Increase logging verbosity. -x, --exclusions-file=<exclusion-file> = A file containing (base) filenames to exclude from the search results, one per line (Default: Include all results). <scan-root> = Directory to scan. <search-term> ... = One or more terms to search for in <scan-root>.
Perform a caseless search of the test/data directory for any occurrence of 'copyright', 'gpl', 'foo', 'bar', or 'baz' and output the results to a file called 'scan-<timestamp>.csv' in the current working directory.
$ python -m string_path_search -i test/data copyright gpl foo bar baz
Same as example 1, except output to an Excel spreadsheet:
$ python -m string_path_search -i -e test/data copyright gpl foo bar baz
string_path_search is distributed under the MIT License.
Disclaimer regarding the test data:
The files in the test/data folder were randomly downloaded from publicly available Open Source projects. Distributing these materials with string_path_search as test data may or may not be in violation of the applicable licenses.
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
|Filename, size||File type||Python version||Upload date||Hashes|
|Filename, size string_path_search-0.3.1.tar.gz (12.6 kB)||File type Source||Python version None||Upload date||Hashes View|
|Filename, size string_path_search-0.3.1-py3.7.egg (23.0 kB)||File type Egg||Python version 3.7||Upload date||Hashes View|
|Filename, size string_path_search-0.3.1-py3-none-any.whl (14.2 kB)||File type Wheel||Python version py3||Upload date||Hashes View|
Hashes for string_path_search-0.3.1-py3.7.egg
Hashes for string_path_search-0.3.1-py3-none-any.whl