Skip to main content

A Python library for .gitignore pattern filtering

Project description

gitignore-filter

A Python library that provides Git-compatible .gitignore pattern filtering with high performance and flexibility. Perfect for building tools that need to process files while respecting Git ignore rules.

Features

  • Git-Compatible Pattern Matching:

    • Full support for all Git pattern syntax
    • Handles multiple .gitignore files with proper precedence
    • Supports negative patterns (!pattern)
    • Correctly processes directory-specific patterns
    • Compatible with global gitignore settings
  • High Performance:

    • Efficient parallel processing for large directories
    • Pattern caching system for faster matching
    • Memory-efficient file system scanning
    • Optimized pattern compilation
  • Robust File Processing:

    • Proper handling of UTF-8 and other encodings
    • Symlink handling with configurable behavior
    • Handles permission errors gracefully
    • Cross-platform path normalization
  • Flexible Configuration:

    • Custom pattern support
    • Case sensitivity control
    • Configurable worker count for parallel processing
    • Integration with Git config settings

Installation

Install from PyPI:

pip install gitignore-filter

Or install from source:

git clone https://github.com/ykawataki/gitignore-filter.git
cd gitignore-filter
pip install .

Basic Usage

Python API

from gitignore_filter import git_ignore_filter

# Basic usage
files = git_ignore_filter("./my_project")

# With custom patterns
custom_patterns = [
    "*.log",
    "temp/",
    "!important.log"
]
files = git_ignore_filter("./my_project", custom_patterns=custom_patterns)

# Control case sensitivity
files = git_ignore_filter("./my_project", case_sensitive=True)

# Configure parallel processing
files = git_ignore_filter("./my_project", num_workers=4)

Advanced Features

Pattern Support

Supports all Git pattern types:

  • Basic glob patterns (*.txt, [abc].txt)
  • Character ranges ([0-9], [a-z])
  • Single character matching (?)
  • Directory patterns (pattern/)
  • Relative paths (./pattern, pattern/subpattern)
  • Recursive matching (**/pattern)
  • Negative patterns (!pattern)
  • Pattern escaping (\#notcomment)
  • Slash patterns (foo/bar/baz)
  • Root-relative patterns (starting with /)
  • Brace expansion ({jpg,jpeg,png})
  • Negative character classes ([!abc])
  • Mid-pattern ** usage (foo/**/bar)
  • Directory-only patterns with trailing slash
  • Empty pattern handling

File Processing

  • Root .gitignore processing
  • Subdirectory .gitignore handling with precedence
  • Global gitignore support via core.excludesFile
  • Empty line skipping
  • Comment handling
  • Trailing space removal
  • BOM processing
  • Absolute path handling
  • Path separator normalization
  • Empty directory pattern handling
  • Symlink handling (configurable)
  • Case sensitivity control (core.ignorecase)
  • Non-existent directory handling
  • UTF-8 conversion for non-UTF-8 .gitignore files
  • Windows path separator handling
  • Trailing whitespace handling

Pattern Precedence

The library implements Git's pattern precedence rules:

  • Lower-level .gitignore files override higher ones
  • Negative patterns override positive ones
  • More specific patterns take precedence
  • Later patterns in the same file override earlier ones
  • Defined precedence between global and local patterns

Performance Optimization

  • Pattern caching mechanism
  • Efficient multi-pattern matching
  • Fast directory tree traversal
  • Parallel processing support

Special Handling

  • Automatic .git directory exclusion
  • Relative path return values
  • Permission error warnings
  • Pathspec-based implementation for simplicity

Error Handling

The library handles various error conditions gracefully:

  • Permission errors
  • Missing directories
  • Invalid patterns
  • Encoding issues
  • File system errors

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

Bug reports and feature requests can be submitted through GitHub Issues.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gitignore_filter-0.1.3.tar.gz (32.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

gitignore_filter-0.1.3-py3-none-any.whl (22.6 kB view details)

Uploaded Python 3

File details

Details for the file gitignore_filter-0.1.3.tar.gz.

File metadata

  • Download URL: gitignore_filter-0.1.3.tar.gz
  • Upload date:
  • Size: 32.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for gitignore_filter-0.1.3.tar.gz
Algorithm Hash digest
SHA256 0e69c9efb6cb1b65465989589e01a8885f461b9bf9ef566aef28d0b9540117bf
MD5 ccbf9c430a387207ae35c7b3c47fa431
BLAKE2b-256 335ddc28b7f02bfccecb547ab7bfeffc0f1a998c8e319a8d9c48467b5d73b7d2

See more details on using hashes here.

File details

Details for the file gitignore_filter-0.1.3-py3-none-any.whl.

File metadata

File hashes

Hashes for gitignore_filter-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 bec5e4cdab1c32556af7c316cc2e9f50644f8dc1d0ed2887bf0e4471907fbee3
MD5 1a9f7e7e40977d038ae0e2e73d6030bf
BLAKE2b-256 c87f7b0b9a870d55e445612b63bf251903ae7c0a6f22d3bf0ecf07c927294f42

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page