🔍 AzureAICommunity - Agent - File Search

File search tools (by name and content) for AI agent applications built on the Agent Framework.

Give your agent the ability to search files by name or content — with full glob support, encoding fallback, and configurable depth.


Overview

azureaicommunity-agent-file-search provides two @tool-decorated functions that can be wired directly into any agent-framework agent. The agent can search for files by glob pattern or scan file contents for a string — with sensible defaults and a fully configurable SearchConfig for fine-grained control.


✨ Features

🔎 Search by name — glob-pattern matching (*.py, main*, report.pdf) across a directory tree
📄 Search by content — full-text scan of file contents with case-sensitive or case-insensitive matching
⚙️ Configurable — control max results, depth, hidden files, extension filters, file size limits, and more
🔌 Agent-ready — tools decorated with @tool, drop straight into any agent-framework agent
🌐 Encoding-aware — tries multiple encodings (UTF-8, Latin-1) before skipping a file
🚫 Binary-safe — skips binary files automatically via null-byte detection
🔁 Symlink-safe — optional symlink following with loop detection
📦 Provider-agnostic — works with Ollama, Azure OpenAI, or any agent-framework compatible client
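
The "symlink-safe" feature can be pictured with a short standard-library sketch. This is not the package's code: walk_files is a hypothetical helper, and it assumes loop detection means remembering the resolved path of every directory already visited, so a link that points back up the tree is entered at most once.

```python
import os


def walk_files(root, follow_symlinks=False):
    """Yield file paths under root, visiting each real directory at most once."""
    seen = set()      # resolved paths of directories already visited
    stack = [root]
    while stack:
        current = stack.pop()
        real = os.path.realpath(current)
        if real in seen:
            # A symlink led back to a directory we have already walked.
            continue
        seen.add(real)
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=follow_symlinks):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=follow_symlinks):
                    yield entry.path
```

With follow_symlinks=False the sketch ignores symbolic links entirely. With follow_symlinks=True it follows them but still terminates on cyclic links.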

📦 Installation

pip install azureaicommunity-agent-file-search

🚀 Quick Start

import asyncio
from agent_framework.ollama import OllamaChatClient
from file_search_module import file_search_by_name, file_search_by_content, configure

# Optional: set global defaults
configure(max_results=50, max_depth=10, skip_hidden=True)

agent = OllamaChatClient(model="llama3.2").as_agent(
    name="FileSearchAgent",
    instructions="You are a helpful file-search assistant. Always search under C:\\MyProject.",
    tools=[file_search_by_name, file_search_by_content],
)

async def main():
    session = agent.create_session()
    response = await agent.run("Find all Python files in the project.", session=session)
    print(response.text)

asyncio.run(main())

🧑‍💻 Usage

Search by file name

from file_search_module import file_search_by_name

# All Python files
results = file_search_by_name("*.py", path="C:\\MyProject")

# Files starting with "main"
results = file_search_by_name("main*", path="C:\\MyProject")

# Case-sensitive match
results = file_search_by_name("README.md", path="C:\\MyProject", case_sensitive=True)

# Only .py and .txt files
results = file_search_by_name("*", path="C:\\MyProject", file_types=[".py", ".txt"])

Search by file content

from file_search_module import file_search_by_content

# Case-sensitive (default)
results = file_search_by_content("middleware", path="C:\\MyProject")

# Case-insensitive
results = file_search_by_content("TODO", path="C:\\MyProject", case_sensitive=False)

# Restrict to Python files only
results = file_search_by_content("async def", path="C:\\MyProject", file_types=[".py"])

Per-call config override

from file_search_module import file_search_by_name, SearchConfig

custom = SearchConfig(max_results=5, include_extensions=[".py"], skip_hidden=True)
results = file_search_by_name("*.py", path="C:\\MyProject", config=custom)

⚙️ Configuration

configure() — set global defaults

from file_search_module import configure

configure(
    max_results=100,
    max_depth=5,
    skip_hidden=True,
    exclude_extensions=[".pyc", ".pyo", ".exe", ".dll"],
    encodings=["utf-8", "latin-1"],
)
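
How global defaults and a per-call config might interact can be pictured with a small stand-in. The sketch below is not the package's implementation: Settings, _defaults, and effective are hypothetical names, only two of the fields are modelled, and it assumes that a config passed to a call replaces the global defaults wholesale rather than merging with them.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical stand-in for SearchConfig with two of its fields."""
    max_results: int = 200
    skip_hidden: bool = False


_defaults = Settings()  # module-level defaults shared by every call


def configure(**overrides):
    """Replace the global defaults; unknown field names raise TypeError."""
    global _defaults
    _defaults = replace(_defaults, **overrides)


def effective(config: Optional[Settings] = None) -> Settings:
    """Return the per-call config if one was given, else the global defaults."""
    return config if config is not None else _defaults
```

Under these assumptions, configure(max_results=50) changes what every later call sees, while a call that passes its own Settings instance ignores the global values entirely.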

SearchConfig fields

| Parameter | Type | Default | Description |
|---|---|---|---|
| max_results | int | 200 | Maximum number of paths returned per search call |
| max_depth | int | 20 | Maximum directory recursion depth |
| max_file_size_bytes | int | 10485760 (10 MB) | Files larger than this are skipped during content search |
| binary_check_bytes | int | 8192 | Bytes read to detect binary files (null-byte probe) |
| follow_symlinks | bool | False | Whether to follow symbolic links |
| skip_hidden | bool | False | Skip files and folders starting with . |
| include_extensions | list[str] \| None | None | Whitelist of extensions (None means all) |
| exclude_extensions | list[str] \| None | None | Blacklist of extensions |
| encodings | list[str] | ["utf-8", "latin-1"] | Encoding fallback chain for content search |
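
The two extension filters can be illustrated with a small predicate. This is a sketch under assumptions the table does not settle: extension_allowed is a hypothetical helper, extensions are compared case-insensitively, and an excluded extension is rejected even if it also appears in the include list.

```python
import os


def extension_allowed(path, include_extensions=None, exclude_extensions=None):
    """Decide whether a file passes the include/exclude extension filters."""
    ext = os.path.splitext(path)[1].lower()
    # Assumption: the exclude list wins over the include list.
    if exclude_extensions and ext in {e.lower() for e in exclude_extensions}:
        return False
    # None means "no whitelist", so every remaining extension is accepted.
    if include_extensions is None:
        return True
    return ext in {e.lower() for e in include_extensions}
```

In this sketch a file without an extension, such as Makefile, passes only when include_extensions is None.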

⚙️ How It Works

file_search_by_name

1. Validate query and path inputs
2. Auto-normalize bare extensions: 'py' → '*.py', '.py' → '*.py'
3. Walk directory tree (respecting max_depth, skip_hidden, follow_symlinks)
4. For each file: check extension filters, apply glob match
5. Return relative paths, capped at max_results
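
The steps above can be sketched with the standard library alone. This is an illustration, not the package's source. normalize_pattern and search_by_name are hypothetical names. The sketch treats any query without a dot or wildcard as a bare extension, counts the root directory as depth 0, and matches case-insensitively unless asked otherwise. The real package may differ on each of these points. Extension filters and symlink handling are left out for brevity.

```python
import fnmatch
import os


def normalize_pattern(query):
    """Turn a bare extension such as 'py' or '.py' into the glob '*.py'."""
    if any(ch in query for ch in "*?["):
        return query            # already a glob pattern
    if query.startswith("."):
        return "*" + query      # '.py' becomes '*.py'
    if "." not in query:
        return "*." + query     # 'py' becomes '*.py' (heuristic assumption)
    return query                # 'report.pdf' stays a literal name


def search_by_name(query, root, max_depth=20, max_results=200,
                   skip_hidden=False, case_sensitive=False):
    """Return paths relative to root whose file name matches the pattern."""
    pattern = normalize_pattern(query)
    if not case_sensitive:
        pattern = pattern.lower()
    root = os.path.abspath(root)
    results = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        depth = 0 if rel_dir == "." else rel_dir.count(os.sep) + 1
        if depth >= max_depth:
            dirnames[:] = []    # pruning in place stops os.walk descending
        if skip_hidden:
            dirnames[:] = [d for d in dirnames if not d.startswith(".")]
            filenames = [f for f in filenames if not f.startswith(".")]
        for name in sorted(filenames):
            candidate = name if case_sensitive else name.lower()
            if fnmatch.fnmatchcase(candidate, pattern):
                full = os.path.join(dirpath, name)
                results.append(os.path.relpath(full, root))
                if len(results) >= max_results:
                    return results  # cap reached, stop walking
    return results
```

Returning as soon as the cap is reached keeps a search over a very large tree from walking directories whose results would be discarded anyway.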

file_search_by_content

1. Validate query and path inputs
2. Walk directory tree (same depth/hidden/symlink rules)
3. For each file: skip if > max_file_size_bytes, skip if binary
4. Try each encoding in the fallback chain until the file is readable
5. Search for the query string (case-sensitive or insensitive)
6. Return relative paths of matching files, capped at max_results
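
Steps 3 to 5 are per-file checks, and a minimal sketch of them fits in two functions. read_text and file_contains are hypothetical names and this is not the package's code. The defaults mirror the documented SearchConfig values, and the sketch assumes the null-byte probe looks only at the first binary_check_bytes bytes of the file.

```python
import os


def read_text(path, encodings=("utf-8", "latin-1"),
              max_file_size_bytes=10 * 1024 * 1024, binary_check_bytes=8192):
    """Return decoded text, or None if the file is too large, binary, or undecodable."""
    if os.path.getsize(path) > max_file_size_bytes:
        return None
    with open(path, "rb") as fh:
        data = fh.read()
    if b"\x00" in data[:binary_check_bytes]:
        return None                 # a null byte suggests a binary file
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue                # try the next encoding in the chain
    return None


def file_contains(path, query, case_sensitive=True, **read_options):
    """Check whether the decoded file text contains the query string."""
    text = read_text(path, **read_options)
    if text is None:
        return False
    if case_sensitive:
        return query in text
    return query.lower() in text.lower()
```

One consequence of the default chain is worth knowing: Latin-1 maps every byte to a character, so with "latin-1" as the last entry decoding never fails. A file that is not valid UTF-8 is still searched, although its non-ASCII characters may be misread.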

🤝 Contributing

Contributions are welcome! Please open an issue to discuss what you'd like to change before submitting a pull request.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/my-feature)
  3. Commit your changes (git commit -m 'Add my feature')
  4. Push to the branch (git push origin feature/my-feature)
  5. Open a Pull Request

📄 License

MIT — see LICENSE for details.
