File search tools (by name and content) for AI agent pipelines
🔍 AzureAICommunity - Agent - File Search
File search tools (by name and content) for AI agent applications built on the Agent Framework.
Give your agent the ability to search files by name or content — with full glob support, encoding fallback, and configurable depth.
Overview
azureaicommunity-agent-file-search provides two @tool-decorated functions that can be wired directly into any agent-framework agent. The agent can search for files by glob pattern or scan file contents for a string — with sensible defaults and a fully configurable SearchConfig for fine-grained control.
✨ Features
| | Feature |
|---|---|
| 🔎 | Search by name — glob-pattern matching (*.py, main*, report.pdf) across a directory tree |
| 📄 | Search by content — full-text scan of file contents with case-sensitive or case-insensitive matching |
| ⚙️ | Configurable — control max results, depth, hidden files, extension filters, file size limits, and more |
| 🔌 | Agent-ready — tools decorated with @tool, drop straight into any agent-framework agent |
| 🌐 | Encoding-aware — tries multiple encodings (UTF-8, Latin-1) before skipping a file |
| 🚫 | Binary-safe — skips binary files automatically via null-byte detection |
| 🔁 | Symlink-safe — optional symlink following with loop detection |
| 📦 | Provider-agnostic — works with Ollama, Azure OpenAI, or any agent-framework compatible client |
📦 Installation
pip install azureaicommunity-agent-file-search
🚀 Quick Start
import asyncio
from agent_framework.ollama import OllamaChatClient
from file_search_module import file_search_by_name, file_search_by_content, configure
# Optional: set global defaults
configure(max_results=50, max_depth=10, skip_hidden=True)
agent = OllamaChatClient(model="llama3.2").as_agent(
    name="FileSearchAgent",
    instructions="You are a helpful file-search assistant. Always search under C:\\MyProject.",
    tools=[file_search_by_name, file_search_by_content],
)

async def main():
    session = agent.create_session()
    response = await agent.run("Find all Python files in the project.", session=session)
    print(response.text)

asyncio.run(main())
🧑‍💻 Usage
Search by file name
from file_search_module import file_search_by_name
# All Python files
results = file_search_by_name("*.py", path="C:\\MyProject")
# Files starting with "main"
results = file_search_by_name("main*", path="C:\\MyProject")
# Case-sensitive match
results = file_search_by_name("README.md", path="C:\\MyProject", case_sensitive=True)
# Only .py and .txt files
results = file_search_by_name("*", path="C:\\MyProject", file_types=[".py", ".txt"])
Search by file content
from file_search_module import file_search_by_content
# Case-sensitive (default)
results = file_search_by_content("middleware", path="C:\\MyProject")
# Case-insensitive
results = file_search_by_content("TODO", path="C:\\MyProject", case_sensitive=False)
# Restrict to Python files only
results = file_search_by_content("async def", path="C:\\MyProject", file_types=[".py"])
Per-call config override
from file_search_module import file_search_by_name, SearchConfig
custom = SearchConfig(max_results=5, include_extensions=[".py"], skip_hidden=True)
results = file_search_by_name("*.py", path="C:\\MyProject", config=custom)
⚙️ Configuration
configure() — set global defaults
from file_search_module import configure
configure(
    max_results=100,
    max_depth=5,
    skip_hidden=True,
    exclude_extensions=[".pyc", ".pyo", ".exe", ".dll"],
    encodings=["utf-8", "latin-1"],
)
SearchConfig fields
| Parameter | Type | Default | Description |
|---|---|---|---|
| max_results | int | 200 | Maximum number of paths returned per search call |
| max_depth | int | 20 | Maximum directory recursion depth |
| max_file_size_bytes | int | 10485760 (10 MB) | Files larger than this are skipped during content search |
| binary_check_bytes | int | 8192 | Bytes read to detect binary files (null-byte probe) |
| follow_symlinks | bool | False | Whether to follow symbolic links |
| skip_hidden | bool | False | Skip files and folders starting with . |
| include_extensions | list[str] \| None | None | Whitelist of extensions; None means all |
| exclude_extensions | list[str] \| None | None | Blacklist of extensions |
| encodings | list[str] | ["utf-8", "latin-1"] | Encoding fallback chain for content search |
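To make the table concrete, here is a minimal sketch of how such a configuration object could be modelled with a standard-library dataclass. SketchConfig and extension_allowed are hypothetical names used only for illustration; they are not the package's SearchConfig, and the precedence between the include and exclude lists shown here is an assumption.

```python
import os
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in mirroring the documented fields and defaults."""
    max_results: int = 200
    max_depth: int = 20
    max_file_size_bytes: int = 10 * 1024 * 1024  # 10485760 bytes (10 MB)
    binary_check_bytes: int = 8192
    follow_symlinks: bool = False
    skip_hidden: bool = False
    include_extensions: Optional[list[str]] = None  # None means "all"
    exclude_extensions: Optional[list[str]] = None
    encodings: list[str] = field(default_factory=lambda: ["utf-8", "latin-1"])


def extension_allowed(filename: str, cfg: SketchConfig) -> bool:
    """Apply the include list first, then the exclude list (assumed order)."""
    ext = os.path.splitext(filename)[1].lower()
    if cfg.include_extensions is not None:
        if ext not in {e.lower() for e in cfg.include_extensions}:
            return False
    if cfg.exclude_extensions is not None:
        if ext in {e.lower() for e in cfg.exclude_extensions}:
            return False
    return True


# Derive a per-call variant without mutating the shared defaults.
defaults = SketchConfig()
python_only = replace(defaults, max_results=5, include_extensions=[".py"])
print(extension_allowed("app.py", python_only))
print(extension_allowed("notes.txt", python_only))
```

Making the sketch immutable and deriving variants with dataclasses.replace keeps global defaults and per-call overrides from interfering with each other, which is one plausible way to support both styles of configuration.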
⚙️ How It Works
file_search_by_name
1. Validate query and path inputs
2. Auto-normalize bare extensions: 'py' → '*.py', '.py' → '*.py'
3. Walk directory tree (respecting max_depth, skip_hidden, follow_symlinks)
4. For each file: check extension filters, apply glob match
5. Return relative paths, capped at max_results
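The steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's implementation: normalize_pattern, search_by_name, and KNOWN_EXTENSIONS are hypothetical names, the rule for recognising a bare extension is a guess, the case-insensitive default is inferred from the usage examples, and extension filters and symlink following are left out for brevity.

```python
import fnmatch
import os

# Hypothetical list: how the real package recognises a bare extension is not documented.
KNOWN_EXTENSIONS = {"py", "txt", "md", "json", "pdf"}


def normalize_pattern(query: str) -> str:
    """Turn a bare extension such as 'py' or '.py' into the glob '*.py'."""
    q = query.strip()
    bare = q[1:] if q.startswith(".") else q
    if bare.lower() in KNOWN_EXTENSIONS:
        return "*." + bare
    return q


def search_by_name(query, root, max_results=200, max_depth=20,
                   skip_hidden=False, case_sensitive=False):
    """Return paths relative to root whose file name matches the glob."""
    if not query or not os.path.isdir(root):
        raise ValueError("query must be non-empty and root must be a directory")
    pattern = normalize_pattern(query)
    if not case_sensitive:
        pattern = pattern.lower()
    root = os.path.abspath(root)
    matches = []
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        rel_dir = os.path.relpath(dirpath, root)
        # Assumed depth convention: the root itself is depth 0.
        depth = 0 if rel_dir == "." else rel_dir.count(os.sep) + 1
        if depth >= max_depth:
            dirnames[:] = []  # pruning in place stops os.walk from descending
        if skip_hidden:
            dirnames[:] = [d for d in dirnames if not d.startswith(".")]
            filenames = [f for f in filenames if not f.startswith(".")]
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            candidate = name if case_sensitive else name.lower()
            if fnmatch.fnmatchcase(candidate, pattern):
                matches.append(name if rel_dir == "." else os.path.join(rel_dir, name))
                if len(matches) >= max_results:
                    return matches
    return matches
```

Pruning dirnames in place is what lets a top-down os.walk respect both the depth limit and the hidden-directory rule without visiting the skipped subtrees at all.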
file_search_by_content
1. Validate query and path inputs
2. Walk directory tree (same depth/hidden/symlink rules)
3. For each file: skip if > max_file_size_bytes, skip if binary
4. Try each encoding in the fallback chain until the file is readable
5. Search for the query string (case-sensitive or insensitive)
6. Return relative paths of matching files, capped at max_results
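The per-file checks in steps 3 to 5 can be illustrated with a short standard-library sketch. The helper names (looks_binary, read_text_with_fallback, file_contains) are hypothetical and the package's internals may differ; the defaults are taken from the SearchConfig table. Note that Latin-1 maps every byte value to a character, so a chain ending in "latin-1" will decode any file that passes the binary check.

```python
import os


def looks_binary(path, probe_bytes=8192):
    """Null-byte probe: treat a file as binary if its first bytes contain NUL."""
    with open(path, "rb") as fh:
        return b"\x00" in fh.read(probe_bytes)


def read_text_with_fallback(path, encodings=("utf-8", "latin-1")):
    """Try each encoding in order; return None if none of them can decode the file."""
    for enc in encodings:
        try:
            with open(path, "r", encoding=enc) as fh:
                return fh.read()
        except (UnicodeDecodeError, LookupError):
            # Decoding failed or the encoding name is unknown: try the next one.
            continue
    return None


def file_contains(path, query, case_sensitive=True,
                  max_file_size_bytes=10 * 1024 * 1024):
    """Return True if a readable, non-binary, small-enough file contains query."""
    if os.path.getsize(path) > max_file_size_bytes:
        return False
    if looks_binary(path):
        return False
    text = read_text_with_fallback(path)
    if text is None:
        return False
    if case_sensitive:
        return query in text
    return query.lower() in text.lower()
```

Checking size and the null-byte probe before decoding keeps the expensive full read for files that are plausible text, which matters when the search walks a large tree.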
🤝 Contributing
Contributions are welcome! Please open an issue to discuss what you'd like to change before submitting a pull request.
- Fork the repository
- Create a feature branch (git checkout -b feature/my-feature)
- Commit your changes (git commit -m 'Add my feature')
- Push to the branch (git push origin feature/my-feature)
- Open a Pull Request
📄 License
MIT — see LICENSE for details.