Utility library for gitignore style pattern matching of file paths.

pathspec: Path Specification

pathspec is a utility library for pattern matching of file paths. So far it only includes Git’s wildmatch pattern matching, which is itself derived from Rsync’s wildmatch. Git uses wildmatch for its gitignore files.

Tutorial

Say you have a “Projects” directory and want to back it up, selecting only certain files and ignoring others based on a set of conditions:

>>> import pathspec
>>> # The gitignore-style patterns for files to select, but we're including
>>> # instead of ignoring.
>>> spec = """
...
... # This is a comment because the line begins with a hash: "#"
...
... # Include several project directories (and all descendants) relative to
... # the current directory. To reference a directory you must end with a
... # slash: "/"
... /project-a/
... /project-b/
... /project-c/
...
... # Patterns can be negated by prefixing with exclamation mark: "!"
...
... # Ignore temporary files beginning or ending with "~", or ending with
... # ".swp".
... !~*
... !*~
... !*.swp
...
... # These are python projects so ignore compiled python files from
... # testing.
... !*.pyc
...
... # Ignore the build directories but only directly under the project
... # directories.
... !/*/build/
...
... """

We want to use the GitWildMatchPattern class to compile our patterns. The PathSpec class provides an interface around pattern implementations:

>>> spec = pathspec.PathSpec.from_lines(pathspec.patterns.GitWildMatchPattern, spec.splitlines())

That may be a mouthful, but it allows additional pattern types to be implemented in the future without their having to handle anything beyond matching the paths sent to them. GitWildMatchPattern implements the actual pattern, which is converted internally into a regular expression. PathSpec is a simple wrapper around a list of compiled patterns.

To make things simpler, we can use the registered name for a pattern class instead of always having to provide a reference to the class itself. The GitWildMatchPattern class is registered as gitwildmatch:

>>> spec = pathspec.PathSpec.from_lines('gitwildmatch', spec.splitlines())
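As a rough illustration of how the compiled patterns behave, the sketch below builds a small spec and checks a few paths one at a time. The pattern list, the name demo_spec, and the candidate paths are hypothetical and chosen only for this example; the last pattern that matches a path decides whether it is included.

```python
import pathspec

# Hypothetical patterns, a shortened version of the tutorial's list.
lines = [
    "/project-a/",   # include everything under project-a
    "!*.pyc",        # negated: drop compiled Python files
    "!/*/build/",    # negated: drop build directories directly under a project
]
demo_spec = pathspec.PathSpec.from_lines("gitwildmatch", lines)

# Hypothetical paths, relative to the directory the patterns apply to.
candidates = [
    "project-a/main.py",
    "project-a/main.pyc",
    "project-a/build/out.bin",
    "project-a/src/build/notes.txt",
    "project-d/main.py",
]

# Map each path to whether the spec selects it.
results = {path: demo_spec.match_file(path) for path in candidates}
for path, matched in results.items():
    print(path, matched)
```

Only "project-a/main.py" and "project-a/src/build/notes.txt" should be selected here: the negated patterns remove the .pyc file and the top-level build directory, and nothing includes project-d.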

If we want to compile the patterns manually, we can do the following:

>>> patterns = map(pathspec.patterns.GitWildMatchPattern, spec.splitlines())
>>> spec = pathspec.PathSpec(patterns)

PathSpec.from_lines() is simply a class method which does just that.
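To make the equivalence concrete, here is a minimal sketch that builds one spec by hand and one through from_lines(), then compares what they match. The two patterns and the sample paths are made up for the example.

```python
import pathspec
from pathspec.patterns import GitWildMatchPattern

lines = ["*.py", "!test_*.py"]  # hypothetical patterns

# Compile each line by hand, then wrap the compiled patterns.
manual_spec = pathspec.PathSpec([GitWildMatchPattern(line) for line in lines])

# Let from_lines() do the same work from the raw lines.
factory_spec = pathspec.PathSpec.from_lines(GitWildMatchPattern, lines)

paths = ["app.py", "test_app.py", "README.md"]  # hypothetical paths
manual = [manual_spec.match_file(p) for p in paths]
factory = [factory_spec.match_file(p) for p in paths]
print(manual == factory)
```

Both specs should select "app.py" and reject the other two paths.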

If you want to load the patterns from a file, you can pass the file instance directly as well:

>>> with open('patterns.list', 'r') as fh:
...     spec = pathspec.PathSpec.from_lines('gitwildmatch', fh)
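The following sketch shows the file-based variant end to end. It writes a throwaway pattern file into a temporary directory so the example is self-contained; the file contents and the name file_spec are assumptions made for illustration.

```python
import os
import tempfile

import pathspec

with tempfile.TemporaryDirectory() as tmp:
    pattern_path = os.path.join(tmp, "patterns.list")

    # Write a small, hypothetical pattern file: one comment and two patterns.
    with open(pattern_path, "w") as fh:
        fh.write("# select logs, but not the debug log\n*.log\n!debug.log\n")

    # Pass the open file straight to from_lines(); it iterates over the lines.
    with open(pattern_path, "r") as fh:
        file_spec = pathspec.PathSpec.from_lines("gitwildmatch", fh)

# The spec is already compiled, so it remains usable after the file is gone.
kept = file_spec.match_file("server.log")
dropped = file_spec.match_file("debug.log")
print(kept, dropped)
```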

You can perform matching on a whole directory tree with:

>>> matches = spec.match_tree('path/to/directory')

Or you can perform matching on a specific set of file paths with:

>>> matches = spec.match_files(file_paths)

Or check to see if an individual file matches:

>>> is_matched = spec.match_file(file_path)

License

pathspec is licensed under the Mozilla Public License Version 2.0. See LICENSE or the FAQ for more information.

In summary, you may use pathspec with any closed or open source project without affecting the license of the larger work so long as you:

  • give credit where credit is due,

  • and release any custom changes made to pathspec.

Source

The source code for pathspec is available from the GitHub repo cpburnz/python-path-specification.

Installation

pathspec requires the following packages:

  • setuptools

pathspec can be installed from source with:

python setup.py install

pathspec is also available for install through PyPI:

pip install pathspec

Documentation

Documentation for pathspec is available on Read the Docs.

Other Languages

pathspec is also available as a Ruby gem.

Change History

0.8.0 (2020-04-09)

  • Issue #30: Expose what patterns matched paths. Added util.detailed_match_files().

  • Issue #31: match_tree() doesn’t return symlinks.

  • Add PathSpec.match_tree_entries() and util.iter_tree_entries() to support directories and symlinks.

  • API change: match_tree() has been renamed to match_tree_files(). The old name match_tree() is still available as an alias.

  • API change: match_tree_files() now returns symlinks. This is a bug fix but it will change the returned results.

0.7.0 (2019-12-27)

  • Issue #28: Add support for Python 3.8, and drop Python 3.4.

  • Issue #29: Publish bdist wheel.

0.6.0 (2019-10-03)

0.5.9 (2018-09-15)

  • Fixed file system error handling.

0.5.8 (2018-09-15)

  • Improved type checking.

  • Created scripts to test Python 2.6 because Tox removed support for it.

  • Improved byte string handling in Python 3.

  • Issue #22: Handle dangling symlinks.

0.5.7 (2018-08-14)

  • Issue #21: Fix collections deprecation warning.

0.5.6 (2018-04-06)

  • Improved unit tests.

  • Improved type checking.

  • Issue #20: Support current directory prefix.

0.5.5 (2017-09-09)

  • Add documentation link to README.

0.5.4 (2017-09-09)

  • Issue #17: Add link to Ruby implementation of pathspec.

  • Add sphinx documentation.

0.5.3 (2017-07-01)

0.5.2 (2017-04-04)

  • Fixed change log.

0.5.1 (2017-04-04)

  • Issue #13: Add equality methods to PathSpec and RegexPattern.

0.5.0 (2016-08-22)

  • Issue #12: Add PathSpec.match_file().

  • Renamed gitignore.GitIgnorePattern to patterns.gitwildmatch.GitWildMatchPattern.

  • Deprecated gitignore.GitIgnorePattern.

0.4.0 (2016-07-15)

  • Issue #11: Support converting patterns into regular expressions without compiling them.

  • API change: Subclasses of RegexPattern should implement pattern_to_regex().

0.3.4 (2015-08-24)

  • Issue #7: Fixed non-recursive links.

  • Issue #8: Fixed edge cases in gitignore patterns.

  • Issue #9: Fixed minor usage documentation.

  • Fixed recursion detection.

  • Fixed trivial incompatibility with Python 3.2.

0.3.3 (2014-11-21)

  • Improved documentation.

0.3.2 (2014-11-08)

  • Issue #5: Use tox for testing.

  • Issue #6: Fixed matching Windows paths.

  • Improved documentation.

  • API change: spec.match_tree() and spec.match_files() now return iterators instead of sets.

0.3.1 (2014-09-17)

  • Updated README.

0.3.0 (2014-09-17)

  • Issue #3: Fixed trailing slash in gitignore patterns.

  • Issue #4: Fixed test for trailing slash in gitignore patterns.

  • Added registered patterns.

0.2.2 (2013-12-17)

  • Fixed setup.py.

0.2.1 (2013-12-17)

  • Added tests.

  • Fixed comment gitignore patterns.

  • Fixed relative path gitignore patterns.

0.2.0 (2013-12-07)

  • Initial release.

