Filter files in a directory tree based on configurable glob rules.
Project description
filefilter - File Filter Library
A small, cross-platform Python library for selecting files from a directory tree using JSON-defined include/exclude filters.
It focuses on predictable pattern logic rather than the underlying filesystem walk.
Features
- Pattern-driven filtering — no manual path checks.
- Case-insensitive matching across all platforms.
- Supports rich glob-style patterns:
*→ one or more characters**→ zero or more characters- Works in both directory and filename segments.
- Anchored directory semantics:
- No prefix → root-anchored
*/→ exactly N levels below root**/→ anywhere
- Explicit extension filtering (
["py", "tar.gz", ".*"]) - Fine-grained include/exclude precedence
- Symlinks ignored (never followed)
Folder patterns
| Group | Example Pattern | Meaning (actual behavior) | Example Matches | Notes / Clever use-cases |
|---|---|---|---|---|
| Root-anchored | /folder |
Match directory named folder directly under the root directory |
/folder, /folder/file.txt |
Root only |
/folder/* |
Match subdirectories one level under /folder |
/folder/a |
Single depth | |
/folder/*/* |
Match subdirectories two levels under /folder |
/folder/a/b |
Exact depth control | |
/folder/** |
Match /folder and all its subdirectories recursively |
/folder/x/y |
Deep traversal | |
/folder/**/a |
Match any nested a under /folder |
/folder/a, /folder/x/y/a |
Anchored root start | |
| Relative (no prefix) | folder |
Same as /folder, relative to root_dir |
/folder |
Root alias |
folder/* |
Match subdirectories one level below folder |
/folder/sub |
Controlled depth | |
folder/*/* |
Match directories two levels below folder |
/folder/a/b |
Exact two levels | |
folder/** |
Match folder and all subdirectories recursively |
/folder/x/y |
Recursive within folder | |
| Single-level leading wildcard | */folder |
Match any folder one level below any top-level dir |
/src/folder, /data/folder |
“Exactly one level up” |
*/folder/** |
Same as above, but include everything under each matched folder | /src/folder/x/y |
Deep from one-level up | |
| Multi-level leading wildcard | */*/folder |
folder that is two levels below any top-level dir |
/apps/web/folder, /lib/core/folder |
“Exactly two levels up” |
*/*/*/folder |
folder that is three levels below any top-level dir |
/a/b/c/folder |
Scale to N levels: repeat */ |
|
*/*/*/*/folder/** |
Same as above, then recursive under folder |
/a/b/c/d/folder/x/y |
Useful in deep monorepos | |
| Recursive leading wildcard | **/folder |
Match any directory named folder at any depth |
/folder, /a/b/c/folder |
“Any depth before” |
**/folder/** |
Match any directory folder and all subdirectories under it |
/folder/x, /x/y/folder/z |
Fully recursive | |
** |
Match all directories | (everything) | Wildcard all | |
**/** |
Same as ** |
(everything) | Redundant doublestar | |
| Mixed / Middle wildcards | folder/**/sub |
Match any sub folder nested inside folder (any depth gap) |
/folder/x/y/sub |
“Any depth in between” |
*/folder/**/sub |
Match any sub inside any folder one level below any root dir |
/a/folder/sub, /a/folder/x/y/sub |
Combined leading + middle wildcards | |
**/folder/another |
Match any another directly inside any folder at any depth |
/folder/another, /x/folder/another |
Common nested pattern | |
**/pkg/* |
Match one level after any pkg at any depth |
/pkg/a, /x/y/pkg/z |
Limit scope to immediate children | |
**/pkg/*/* |
Match two levels after any pkg at any depth |
/pkg/a/b, /x/y/pkg/z/t |
Exact depth after target | |
/modules/*/test |
A test dir exactly one level under /modules/<name> |
/modules/auth/test, /modules/pay/test |
Standard per-module test folder | |
/apps/**/dist |
Any dist inside /apps |
/apps/web/dist, /apps/mobile/a/b/dist |
Clean build artifacts | |
| Clever use-cases (post-patch) | */*/folder/*/another_folder |
Exactly 1 segment between folder and another_folder |
a/b/folder/c/another_folder |
Middle * now means one segment |
*/*/folder/**/another_folder |
Any number (0+) of segments between them | a/b/folder/another_folder, a/b/folder/x/another_folder |
Use when the gap can vary | |
*/*/folder/*/*/another_folder |
Exactly 2 segments between them | a/b/folder/x/y/another_folder |
Chain * for exact middle depth |
|
**/services/**/migrations/* |
Immediate children under any migrations inside any services subtree |
/x/services/a/migrations/001, /services/m/2 |
“Find migration versions but not nested subfolders” | |
**/features/*/* |
Exactly two levels under any features directory |
/a/features/x/y, /features/p/q |
Depth-limited collection | |
/packages/*/**/dist |
Any dist under any direct child of /packages |
/packages/core/dist, /packages/ui/x/y/dist |
Works across package families | |
| Literal / Edge cases | **/folder/***/anotherfolder |
Requires a literal *** directory between folder and anotherfolder |
/folder/***/anotherfolder |
*** is literal (not wildcard) |
/folder/***/anotherfolder |
Same, but anchored at root | /folder/***/anotherfolder |
Rare in practice | |
| Edge / invalid | / (or empty) |
Ignored / invalid after normalization | — | Stripped by normalization |
File patterns
| Group | Example Pattern | Meaning (actual behavior) | Example Matches | Notes / Clever use-cases |
|---|---|---|---|---|
| Root-anchored | /file.py |
Match file.py at project root |
/file.py |
Exact root match |
/README.* |
Files starting with README. at root, with at least one char after the dot |
/README.md, /README.rst |
Because filename * = 1+ chars |
|
/README.** |
Files starting with README. at root, allowing zero or more chars after the dot |
/README, /README.md, /README.rst |
Filename ** can match empty |
|
/.* |
Dotfiles at root with at least one char after the dot | /.env, /.gitignore |
Matches hidden files | |
/folder/*.py |
Any .py directly under /folder |
/folder/a.py, /folder/test.py |
One level below /folder |
|
/folder/**/test_*.py |
test_*.py anywhere under /folder |
/folder/test_a.py, /folder/x/y/test_utils.py |
Classic recursive test selector | |
| Relative (no prefix) | Makefile |
Match Makefile at project root |
/Makefile |
Case-insensitive |
*.toml |
Any .toml file directly under the root directory |
/pyproject.toml |
Only at root; use **/*.toml for recursive |
|
**/*.toml |
Any .toml file anywhere |
/pyproject.toml, /pkg/a/b/config.toml |
Recursive | |
folder/**/*.md |
Any .md anywhere under folder |
/folder/README.md, /folder/docs/a/b/guide.md |
Folder-scoped recursion | |
| Single-level leading wildcard | */Dockerfile |
Dockerfile exactly one directory deep |
/api/Dockerfile, /web/Dockerfile |
“One level up” |
*/LICENSE* |
Files whose name starts with LICENSE exactly one level deep |
/pkg/LICENSE, /lib/LICENSE-MIT |
Filename * = 1+ chars (so LICENSE alone won’t match; use LICENSE**) |
|
| Multi-level leading wildcard | */*/Makefile |
Makefile exactly two directories deep |
/x/y/Makefile, /apps/web/Makefile |
Repeat */ to require N levels |
*/*/*/package.json |
package.json exactly three directories deep |
/a/b/c/package.json |
Deep monorepo layouts | |
| Recursive leading wildcard | **/Makefile |
Makefile at any depth |
/Makefile, /a/Makefile, /a/b/c/Makefile |
Ignore structure depth |
**/*.py |
Any .py file at any depth |
/a.py, /pkg/mod/x.py, /x/y/z/__init__.py |
Global Python selection | |
** |
All files (filename ** matches empty, so any name) |
Every file | Combine with directory excludes to prune | |
| Mixed / middle wildcards | src/**/test_*.py |
Any test_*.py under src, at any depth |
/src/test_a.py, /src/unit/core/test_utils.py |
Common test pattern |
**/migrations/* |
Any file one level under a migrations dir at any depth |
/app/migrations/001.sql, /x/y/migrations/init.py |
Excludes deeper levels; add /*/* for exactly two |
|
**/migrations/*/* |
Any file two levels under a migrations dir |
/app/migrations/v1/001.sql |
Exact depth after target | |
**/assets/**/*.png |
Any .png somewhere under any assets dir |
/assets/logo.png, /x/assets/img/icons/a.png |
Nested assets | |
/packages/*/**/dist/*.js |
.js directly in any dist below any direct child of /packages |
/packages/core/dist/index.js, /packages/ui/x/y/dist/app.js |
Package builds | |
**/pkg/*/*/index.* |
index.<ext> exactly two levels under any pkg dir |
/pkg/a/b/index.js, /x/y/pkg/z/t/index.html |
Depth-limited bundle entry | |
| Clever filename globs | **/file*.py |
Filenames starting with file then ≥1 char, ending .py |
/fileA.py, /x/y/file_utils.py |
Does not match /file.py (use file**.py for optional tail) |
**/file**.py |
file plus 0+ chars, .py |
/file.py, /x/fileA.py, /x/y/file_utils.py |
Optional tail | |
**/*test*.py |
Any file with test substring and .py |
/test_main.py, /x/y/unittest_tools.py |
Broad test sweep | |
**/.env** |
.env or .env.<suffix> anywhere |
/.env, /app/.env.local, /x/.envrc |
Dotfile families | |
**/*.tar.** |
.tar followed by any (even empty) extension |
/a.tar.gz, /b/c.tar.bz2, /d/e.tar. |
** in filename can match empty |
|
**/*.* |
Files with a dot and at least one char extension | /a.txt, /b/c.tar.gz |
“Has extension” filter | |
**/*. |
Filenames ending with a dot | /strange. |
Rare but supported | |
| Directory gaps (post-patch) | */*/folder/*/another.py |
Exactly 1 directory between folder and another.py, and 2 before folder |
/a/b/folder/c/another.py |
Middle * = one segment (post-patch) |
*/*/folder/**/another.py |
Any number (0+) directories between folder and another.py, and 2 before folder |
/a/b/folder/another.py, /a/b/folder/x/y/another.py |
Flexible gap | |
*/*/folder/*/*/another.py |
Exactly 2 directories between folder and another.py, and 2 before folder |
/a/b/folder/x/y/another.py |
Chain * for exact middle depth |
|
| Exact names at any depth | **/LICENSE** |
LICENSE with optional suffix at any depth |
/LICENSE, /pkg/LICENSE-MIT |
Use ** to allow empty/extra chars |
**/Dockerfile |
Dockerfile at any depth |
/Dockerfile, /services/api/Dockerfile |
Service Dockerfiles | |
**/Makefile |
Makefile at any depth |
/Makefile, /lib/x/Makefile |
Build roots | |
| Edge / literal cases | **/folder/***/file.txt |
Requires a literal directory named *** between folder and file |
/folder/***/file.txt |
*** is literal, not a wildcard |
folder/*/file.txt |
Exactly 1 dir between folder and file (post-patch) |
/folder/a/file.txt |
Before the patch, this would not have matched | |
folder/**/file.txt |
Any depth between folder and file |
/folder/file.txt, /folder/a/b/file.txt |
Versatile | |
| Gotchas | file*.py (at root) |
Requires at least one char after file |
Matches /fileA.py; does not match /file.py |
Use file**.py if file.py should match |
LICENSE* |
Requires at least one char after LICENSE |
Matches LICENSE-MIT; not LICENSE |
Use LICENSE** to include bare LICENSE |
|
? |
Not supported | — | Stick to * and ** |
Decision Order (Inclusion / Exclusion Logic)
When scanning files, the library applies filters in this exact sequence:
1️⃣ Include-file override
- If the file matches any
include.filespattern → included immediately,
even if its extension isn’t ininclude.extensions.
2️⃣ Hard excludes
- If the file extension matches
exclude.extensions→ excluded. - If the full path matches
exclude.files→ excluded. - If its parent directory matches
exclude.dirs→ excluded.
3️⃣ Inclusion gating
- If there are any include patterns (dirs or files), the file must match
one of them to be included.
Otherwise, it’s excluded.
4️⃣ Extension whitelist
- If
include.extensionsis not empty, only matching extensions are included.
5️⃣ Everything else
- Included by default.
Configuration Schema
{
"root_dir": ".",
"filters": {
"include": {
"dirs": [],
"files": [],
"extensions": []
},
"exclude": {
"dirs": [],
"files": [],
"extensions": []
}
}
}
| Field | Description |
|---|---|
root_dir |
Base directory (absolute or relative). |
filters.include.dirs |
Directory inclusion patterns. |
filters.include.files |
File inclusion patterns (glob-like). |
filters.include.extensions |
Extension whitelist. |
filters.exclude.* |
Same structure, but acts as exclusion filters. |
Pattern Rules Recap
| Symbol | Meaning |
|---|---|
* |
one or more characters (no slashes) |
** |
zero or more characters (can cross dirs) |
Leading **/ |
match anywhere in the tree |
Leading */ |
match exactly N levels below root |
Trailing /** |
match directory and all descendants |
.ext or ext |
file extension match (case-insensitive) |
.* in extensions |
any non-empty extension |
Examples (Practical Scenarios)
Example 1 — Include all .py files anywhere
{
"root_dir": ".",
"filters": {
"include": {
"dirs": ["**"],
"files": [],
"extensions": ["py"]
},
"exclude": {
"dirs": [],
"files": [],
"extensions": []
}
}
}
| File Path | Result | Reason |
|---|---|---|
main.py |
✅ | .py extension matches |
src/app.py |
✅ | .py extension matches |
docs/readme.md |
❌ | wrong extension |
a/b/c/module.py |
✅ | .py file anywhere |
✅ All
.pyfiles anywhere in the project.
Example 2 — Only Python files in root
{
"root_dir": ".",
"filters": {
"include": {
"dirs": [],
"files": ["*.py"],
"extensions": []
},
"exclude": {
"dirs": [],
"files": [],
"extensions": []
}
}
}
| File Path | Result | Reason |
|---|---|---|
main.py |
✅ | matches *.py at root |
helper.py |
✅ | matches *.py at root |
src/app.py |
❌ | subdirectory, not root |
src/helper.py |
❌ | subdirectory, not root |
✅ Matches
.pyfiles only in the project root, not deeper folders.
Example 3 — Include only certain substructure
{
"root_dir": ".",
"filters": {
"include": {
"dirs": [],
"files": ["*/hi/**/hello.py"],
"extensions": []
},
"exclude": {
"dirs": [],
"files": [],
"extensions": []
}
}
}
| File Path | Result | Reason |
|---|---|---|
a/hi/hello.py |
✅ | matches */hi/**/hello.py |
a/hi/x/hello.py |
✅ | hi one level below root, deeper path allowed |
hi/hello.py |
❌ | missing leading dir before hi |
a/b/hi/hello.py |
❌ | hi too deep |
a/hi/hello.txt |
❌ | wrong filename |
✅ Collects any
hello.pyfile under a folder namedhithat is exactly one level below root, with any depth below that.
Example 4 — Exclude specific pattern while including .py files
{
"root_dir": ".",
"filters": {
"include": {
"dirs": ["**"],
"files": [],
"extensions": [".py"]
},
"exclude": {
"dirs": [],
"files": ["*/hi/**/hello.py"],
"extensions": []
}
}
}
| File Path | Result | Reason |
|---|---|---|
main.py |
✅ | .py, no exclusion hit |
src/lib/module.py |
✅ | .py, no exclusion hit |
a/hi/hello.py |
❌ | excluded by */hi/**/hello.py |
src/hi/utils/hello.py |
❌ | matches exclusion pattern |
hi/hello.py |
✅ | not excluded (no leading dir before hi) |
✅ Includes all
.pyfiles except those namedhello.pyinside ahifolder that is one directory below root.
Example 5 — Directory pattern focus
{
"root_dir": ".",
"filters": {
"include": {
"dirs": ["**/myfolder/**"],
"files": [],
"extensions": ["py"]
},
"exclude": {
"dirs": ["**/__pycache__/**"],
"files": [],
"extensions": []
}
}
}
| File Path | Result | Reason |
|---|---|---|
src/myfolder/x/hello.py |
✅ | inside myfolder |
src/myfolder/sub/a.py |
✅ | inside myfolder |
src/myfolder/__pycache__/cached.py |
❌ | excluded dir |
src/other/hello.py |
❌ | outside myfolder |
✅ Collects
.pyfiles anywhere under directories namedmyfolder, excluding__pycache__.
Example 6 — Combined complex filters
{
"root_dir": ".",
"filters": {
"include": {
"dirs": ["src/**", "scripts"],
"files": ["**/*.sh", "*.py"],
"extensions": ["py", "sh"]
},
"exclude": {
"dirs": ["**/__pycache__/**", "build"],
"files": ["*/legacy/**"],
"extensions": ["log"]
}
}
}
| File Path | Result | Reason |
|---|---|---|
src/main.py |
✅ | .py, included dir |
scripts/setup.sh |
✅ | matched by **/*.sh |
build/config.py |
❌ | excluded dir build |
src/legacy/module.py |
❌ | excluded by */legacy/** |
docs/readme.md |
❌ | wrong extension |
src/utils/tool.log |
❌ | excluded by extension |
✅ Keeps
.pyor.shfiles undersrcorscripts, ignoring logs, builds, and legacy directories.
Summary of Behavior
- Exclusions always win first.
- A matching
include.filesforces inclusion. - If any include filters exist, at least one must match.
- Extensions act as a final whitelist.
- Matching is case-insensitive and normalized.
**can span directory boundaries, not just characters.
Example Usage
from filefilter import *
cfg_json = """{
"root_dir": ".",
"filters": {
"include": { "dirs": ["**"], "files": [], "extensions": ["py"] },
"exclude": { "dirs": ["__pycache__"], "files": [], "extensions": [] }
}
}"""
# Deprecated option
files = filter_paths(cfg_json)
# Option 1: one-liner convenience
matched_files = select(cfg_json)
# Option 2: explicit load + scan
rules = load(cfg_json, base="cwd")
matched_files = scan(rules)
# Print any matched file
for f in matched_files:
print(f)
License
MIT License © 2025 Ioannis D. (devcoons)
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file filefilter-0.1.12.tar.gz.
File metadata
- Download URL: filefilter-0.1.12.tar.gz
- Upload date:
- Size: 26.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.24
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
1a221d975da4bfe28f52c8223d8c53f93de13666df07fdf3426ba0256df1f20c
|
|
| MD5 |
ee682a43befe700246b1577de4a936b2
|
|
| BLAKE2b-256 |
1b40ba3d97cacb410e21aa86008e1ad34a22321b599a880bf2daca84350362f2
|
File details
Details for the file filefilter-0.1.12-py3-none-any.whl.
File metadata
- Download URL: filefilter-0.1.12-py3-none-any.whl
- Upload date:
- Size: 15.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.24
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
3723b1dd94802d73696aceeff22746aafb1168ccbd6f0b2e5d3db4737c5efbfb
|
|
| MD5 |
6637c5bd23739d8f9b2ab5d4ae080d40
|
|
| BLAKE2b-256 |
cd8e2a205ae66ac01bd8075cd380b17ec4eb65617ba252b2a48d7c153ff36d18
|