# LLM Filesystem Tools

Production-ready filesystem access tools for LLMs with TOCTOU protection.
Secure filesystem access for Large Language Models with governance-first design.
Stop reinventing filesystem tools for every LLM project. llm-fs-tools provides production-ready, secure file operations that work with any LLM supporting function calling - Ollama, OpenAI, Anthropic, and more.
## The Problem

You want your AI assistant to analyze code, search files, or explore directories. You have three bad options:

- **Inject everything into the prompt**: wastes tokens, hits context limits, can't scale
- **Use heavy frameworks**: LangChain/LlamaIndex lock you into their ecosystem
- **Roll your own**: reinvent security, path validation, and tool schemas every time
## The Solution

```bash
pip install llm-fs-tools
```

```python
from llm_fs_tools import FileSystemTools, SecurityPolicy

# Define security boundaries
policy = SecurityPolicy(
    allowed_roots=["./my-project"],
    max_file_size_mb=5,
    blocked_patterns=["*.env", ".git/*"]
)

# Initialize tools
fs_tools = FileSystemTools(policy)

# Use with any LLM (Ollama example)
import ollama

response = ollama.chat(
    model='qwen2.5-coder',
    messages=[{'role': 'user', 'content': 'Analyze the codebase structure'}],
    tools=fs_tools.get_tool_definitions()  # Auto-generates schemas
)

# Execute tool calls
for tool_call in response.message.tool_calls:
    result = fs_tools.execute(
        tool_call.function.name,
        tool_call.function.arguments
    )
```

That's it. Your model can now safely explore filesystems.
## Features

### 🔒 Security First

- **Path traversal protection**: validates that all paths stay within allowed roots
- **Configurable boundaries**: whitelist directories, block patterns
- **Automatic filtering**: excludes `.env`, `.git`, and `node_modules` by default
- **Size limits**: prevents reading massive files that blow up context

### 🛠️ Rich Tool Set

- `get_directory_tree`: hierarchical structure with configurable depth
- `read_file`: read with line numbers and range support
- `search_codebase`: grep-style regex search across files
- `list_directory`: fast flat listings

### 🎯 Zero Lock-In

- **Framework-agnostic**: works with raw API calls, not just frameworks
- **Provider-agnostic**: same tools work with Ollama, OpenAI, and Anthropic
- **Minimal dependencies**: no heavy frameworks required
- **Standard schemas**: uses the OpenAI function-calling format
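As an illustration of that format, an OpenAI-style function-calling schema for a `read_file`-like tool looks roughly like this. This is a hand-written sketch based on the parameter tables later in this document; the exact schemas emitted by `get_tool_definitions()` may differ in wording.

```python
# Hypothetical sketch of an OpenAI-style function-calling schema.
# The field layout follows the OpenAI spec; descriptions are illustrative.
read_file_schema = {
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Read file content with an optional line range.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "File to read"},
                "start_line": {"type": "integer", "description": "First line to read (1-indexed)"},
                "end_line": {"type": "integer", "description": "Last line to read (inclusive)"},
            },
            "required": ["path"],
        },
    },
}
```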
### 🚀 Production Ready

- **Comprehensive error handling**: graceful failures with detailed messages
- **Type hints throughout**: full mypy compliance
- **Extensive logging**: debug tool execution and security checks
- **Tested**: 80%+ coverage
## Quick Examples

### Ollama (Local Models)

```python
import json

import ollama
from llm_fs_tools import FileSystemTools, SecurityPolicy

policy = SecurityPolicy(allowed_roots=["./src"])
fs_tools = FileSystemTools(policy)

messages = [{
    'role': 'user',
    'content': 'Find all database queries in this codebase'
}]
response = ollama.chat(
    model='codellama',
    messages=messages,
    tools=fs_tools.get_tool_definitions()
)

# Handle tool calls in a loop
while response.message.tool_calls:
    messages.append(response.message)
    for tool_call in response.message.tool_calls:
        result = fs_tools.execute(
            tool_call.function.name,
            tool_call.function.arguments
        )
        messages.append({
            'role': 'tool',
            'content': json.dumps(result),
            'tool_call_id': tool_call.id
        })
    response = ollama.chat(model='codellama', messages=messages)

print(response.message.content)
```
### OpenAI

```python
import json

from openai import OpenAI
from llm_fs_tools import FileSystemTools, SecurityPolicy

client = OpenAI()
policy = SecurityPolicy(allowed_roots=["./"])
fs_tools = FileSystemTools(policy)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Summarize the README"}],
    tools=fs_tools.get_tool_definitions(format="openai")
)

# Execute tool calls
for tool_call in response.choices[0].message.tool_calls:
    result = fs_tools.execute(
        tool_call.function.name,
        json.loads(tool_call.function.arguments)
    )
```
### Anthropic Claude

```python
import anthropic
from llm_fs_tools import FileSystemTools, SecurityPolicy

client = anthropic.Anthropic()
policy = SecurityPolicy(allowed_roots=["./docs"])
fs_tools = FileSystemTools(policy)

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    messages=[{"role": "user", "content": "What's in the docs?"}],
    tools=fs_tools.get_tool_definitions(format="anthropic")
)

# Handle tool use
for block in response.content:
    if block.type == "tool_use":
        result = fs_tools.execute(block.name, block.input)
```
## Security Model

### Path Validation

Every file operation validates paths through the security policy:

```python
policy = SecurityPolicy(
    allowed_roots=[
        "/home/user/projects",
        "/home/user/documents"
    ],
    blocked_patterns=[
        "*.env",           # Environment files
        "*.key",           # Key files
        ".git/*",          # Git internals
        "node_modules/*",  # Dependencies
        "__pycache__/*"    # Python cache
    ],
    blocked_extensions=[
        ".pem",
        ".secret"
    ],
    max_file_size_mb=10
)
```
**Validation process:**

1. Resolve symlinks and relative paths
2. Check that the resolved path is within `allowed_roots`
3. Match against `blocked_patterns` and `blocked_extensions`
4. Verify the file size is under `max_file_size_mb`
**Security guarantees:**

- ✅ No path traversal attacks (`../../../etc/passwd`)
- ✅ No symlink escapes
- ✅ No sensitive file access
- ✅ Explicit allowlist model
### Error Handling

Security violations return structured errors instead of raising exceptions to the LLM:

```python
{
    "success": False,
    "error": "Access denied: Path outside allowed roots",
    "data": None,
    "metadata": {
        "tool": "read_file",
        "attempted_path": "/etc/passwd",
        "violation_type": "outside_allowed_roots"
    }
}
```
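In an agent loop, this means the caller can forward failures to the model as plain data instead of crashing. A small illustration (the `format_tool_reply` helper is ours, not part of the package):

```python
import json

def format_tool_reply(result):
    """Turn a tool result dict into a string for the model's tool message."""
    if result.get("success"):
        return json.dumps(result.get("data"))
    # Hand the failure reason back so the model can adjust its next call.
    return f"Tool error: {result.get('error', 'unknown error')}"
```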
## Tool Reference

### get_directory_tree

Returns a hierarchical directory structure.

**Parameters:**

- `path` (str, required): directory to analyze
- `max_depth` (int, default=3): maximum recursion depth
- `include_hidden` (bool, default=False): include hidden files
**Example output:**

```json
{
  "success": true,
  "data": {
    "name": "src",
    "type": "directory",
    "children": [
      {
        "name": "main.py",
        "type": "file",
        "size": 1024
      },
      {
        "name": "utils",
        "type": "directory",
        "children": [...]
      }
    ]
  }
}
```
### read_file

Reads file content with optional line ranges.

**Parameters:**

- `path` (str, required): file to read
- `start_line` (int, optional): first line to read (1-indexed)
- `end_line` (int, optional): last line to read (inclusive)

**Example:**

```python
# Read entire file
fs_tools.execute("read_file", {"path": "./main.py"})

# Read lines 10-20
fs_tools.execute("read_file", {
    "path": "./main.py",
    "start_line": 10,
    "end_line": 20
})
```
### search_codebase

Grep-style search with regex support.

**Parameters:**

- `pattern` (str, required): search pattern (regex)
- `path` (str, required): directory to search
- `file_pattern` (str, default="*"): file glob filter
- `case_sensitive` (bool, default=False): case sensitivity
- `max_results` (int, default=100): result limit

**Example:**

```python
# Find all TODO comments in Python files
fs_tools.execute("search_codebase", {
    "pattern": r"TODO:.*",
    "path": "./src",
    "file_pattern": "*.py"
})
```

**Output:**

```json
{
  "success": true,
  "data": {
    "matches": [
      {
        "file": "./src/main.py",
        "line": 42,
        "content": "# TODO: Refactor this function",
        "match": "TODO: Refactor this function"
      }
    ],
    "total_matches": 1,
    "truncated": false
  }
}
```
### list_directory

Fast flat directory listing.

**Parameters:**

- `path` (str, required): directory to list
- `include_hidden` (bool, default=False): include hidden files
## Configuration

### Basic Setup

```python
from llm_fs_tools import FileSystemTools, SecurityPolicy

policy = SecurityPolicy(
    allowed_roots=["./project"],
)
fs_tools = FileSystemTools(policy)
```
### Advanced Configuration

```python
from pathlib import Path

from llm_fs_tools import FileSystemTools, SecurityPolicy

policy = SecurityPolicy(
    # Multiple allowed directories
    allowed_roots=[
        "./src",
        "./docs",
        str(Path.home() / "projects")
    ],
    # File size limits
    max_file_size_mb=5,
    # Block sensitive patterns
    blocked_patterns=[
        "*.env",
        "*.key",
        "*.pem",
        ".git/*",
        "node_modules/*",
        "__pycache__/*",
        "*.pyc",
        ".venv/*"
    ],
    # Block by extension
    blocked_extensions=[
        ".secret",
        ".private"
    ],
    # Custom validation
    custom_validator=lambda path: not path.name.startswith("temp_")
)
fs_tools = FileSystemTools(policy)
```
Configuration File
# llm-fs-config.yaml
security:
allowed_roots:
- ./src
- ./docs
max_file_size_mb: 10
blocked_patterns:
- "*.env"
- ".git/*"
import yaml
from llm_fs_tools import SecurityPolicy, FileSystemTools
with open("llm-fs-config.yaml") as f:
config = yaml.safe_load(f)
policy = SecurityPolicy(**config["security"])
fs_tools = FileSystemTools(policy)
## Architecture

### Design Principles

- **Governance over scale**: security boundaries define capability, not model size
- **Explicit over implicit**: allowlists, not denylists
- **Simple over complex**: minimal API surface, zero magic
- **Portable over coupled**: works everywhere, depends on nothing
### Component Overview

```
┌──────────────────────────────────────────┐
│             Your Application             │
│       (Ollama/OpenAI/Anthropic/etc)      │
└──────────────┬───────────────────────────┘
               │
               ├── get_tool_definitions()
               │     (Returns JSON schemas)
               │
               └── execute(name, args)
                     (Runs tool, returns result)
               │
┌──────────────┴──────────────────┐
│         FileSystemTools         │
│  ┌───────────────────────────┐  │
│  │      Security Policy      │  │
│  │  - Path validation        │  │
│  │  - Size limits            │  │
│  │  - Pattern blocking       │  │
│  └───────────────────────────┘  │
│  ┌───────────────────────────┐  │
│  │   Tool Implementations    │  │
│  │  - get_directory_tree     │  │
│  │  - read_file              │  │
│  │  - search_codebase        │  │
│  │  - list_directory         │  │
│  └───────────────────────────┘  │
└─────────────────────────────────┘
```
## Use Cases

### AI Coding Assistants

```python
# Let Claude explore and refactor your codebase
policy = SecurityPolicy(allowed_roots=["./src"])
fs_tools = FileSystemTools(policy)

response = claude.chat(
    messages=[{
        'role': 'user',
        'content': 'Refactor the authentication module for better testability'
    }],
    tools=fs_tools.get_tool_definitions()
)
```

### Automated Code Reviews

```python
# LLM reviews your PR changes
policy = SecurityPolicy(
    allowed_roots=["./"],
    blocked_patterns=["*.env", "node_modules/*"]
)
fs_tools = FileSystemTools(policy)

response = gpt4.chat(
    messages=[{
        'role': 'user',
        'content': 'Review the changes in src/ for security issues and best practices'
    }],
    tools=fs_tools.get_tool_definitions()
)
```

### Documentation Generation

```python
# Generate docs from codebase structure
policy = SecurityPolicy(allowed_roots=["./src", "./docs"])
fs_tools = FileSystemTools(policy)

response = ollama.chat(
    model='codellama',
    messages=[{
        'role': 'user',
        'content': 'Generate API documentation from the source files'
    }],
    tools=fs_tools.get_tool_definitions()
)
```

### Dependency Analysis

```python
# Find all imports and dependencies
fs_tools.execute("search_codebase", {
    "pattern": r"^import |^from .* import",
    "path": "./src",
    "file_pattern": "*.py"
})
```
## Comparison

| Feature | llm-fs-tools | LangChain | MCP Servers | Roll Your Own |
|---|---|---|---|---|
| Installation | `pip install llm-fs-tools` | `pip install langchain` | Server setup + client | N/A |
| Dependencies | Minimal | 50+ packages | MCP protocol | ❌ You maintain |
| Security model | Built-in policy engine | Manual | Per-server | ❌ You build |
| Provider support | All (OpenAI/Anthropic/Ollama) | LangChain models only | MCP clients only | Up to you |
| Framework lock-in | ✅ None | ❌ LangChain ecosystem | ❌ MCP protocol | ✅ None |
| Path validation | ✅ Automatic | ❌ Manual | Varies | ❌ You build |
| Learning curve | 5 minutes | Days | Hours | ❌ Weeks |
## Roadmap

### v0.1.0 (Current)

- ✅ Core filesystem tools
- ✅ Security policy engine
- ✅ Multi-provider schemas
- ✅ Path validation

### v0.2.0 (Next)

- Caching layer for repeated reads
- File watching/change detection
- Batch operations
- Performance optimizations

### v0.3.0

- Git integration tools
- Diff/patch operations
- Binary file support
- Archive handling (zip, tar)

### v1.0.0

- Stable API
- Full test coverage
- Production hardening
- Performance benchmarks
## Contributing

We welcome contributions! This project follows a governance-first philosophy: intelligence emerges from coordination, not complexity.

### Development Setup

```bash
# Clone the repo
git clone https://github.com/dansasser/llm-filesystem-tools.git
cd llm-filesystem-tools

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # or `venv\Scripts\activate` on Windows

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run linting
ruff check .
mypy llm_fs_tools
```
### Guidelines

- **Security first**: all PRs must maintain security guarantees
- **Test coverage**: new features need tests
- **Type hints**: full typing required
- **Documentation**: update docs for API changes

### Areas for Contribution

- 🔧 New tool implementations
- 🛡️ Enhanced security features
- 📚 Documentation improvements
- 🧪 Test coverage expansion
- 🐛 Bug fixes
## FAQ

**Q: Does this work with LangChain/LlamaIndex?**

A: Yes! You can wrap these tools in LangChain/LlamaIndex tool interfaces, but you don't need those frameworks to use this package.

**Q: Can I use this in production?**

A: Yes, but audit the security policy for your use case. The default blocked patterns are a starting point, not a complete security solution.

**Q: What about write operations?**

A: Currently read-only by design. Write operations may come in v0.3.0 with additional safeguards.

**Q: Does this work on Windows?**

A: Yes! Path handling is cross-platform using pathlib.

**Q: Can I use this with streaming responses?**

A: Yes! Tool calls work with both streaming and non-streaming LLM responses.

**Q: What's the performance impact?**

A: Minimal. Tool execution is typically under 100 ms, and directory trees are cached per call.
## License

MIT License; see LICENSE for details.

## Credits

Created by Dan Sasser as part of the SIM-ONE Framework, a governance-first AI architecture.

**Related projects:**

- ollama-prompt: command-line tool using llm-filesystem-tools
- SIM-ONE: comprehensive AI governance system

## Support

- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: Contact

Star this repo if it's useful! ⭐
## File details

Details for the file `llm_fs_tools-1.1.1.tar.gz`.

### File metadata

- Download URL: llm_fs_tools-1.1.1.tar.gz
- Upload date:
- Size: 33.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `53345cd2964a93022a9dd1d9ca56677cc55dcb6ad738ce335c72c90fd7c48995` |
| MD5 | `acd0fa94d83984cd95d8f33f60e2624b` |
| BLAKE2b-256 | `6459ff5aa78ab71b07394825503e4cc5ac8632223f47ff63afb549e36dff56d6` |
### Provenance

The following attestation bundle was made for `llm_fs_tools-1.1.1.tar.gz`:

**Publisher:** `publish.yml` on dansasser/llm-filesystem-tools

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llm_fs_tools-1.1.1.tar.gz
- Subject digest: `53345cd2964a93022a9dd1d9ca56677cc55dcb6ad738ce335c72c90fd7c48995`
- Sigstore transparency entry: 748351674
- Sigstore integration time:
- Permalink: dansasser/llm-filesystem-tools@1413f56f230e3e951cbe218dbde1104dd48fa3a6
- Branch / Tag: refs/tags/V1.1.1-fixed
- Owner: https://github.com/dansasser
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@1413f56f230e3e951cbe218dbde1104dd48fa3a6
- Trigger Event: release
## File details

Details for the file `llm_fs_tools-1.1.1-py3-none-any.whl`.

### File metadata

- Download URL: llm_fs_tools-1.1.1-py3-none-any.whl
- Upload date:
- Size: 33.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `a6d10f8194fa904e8cdd4835410fe8f619f584cdeb20a15fc358032a375c62a1` |
| MD5 | `0dc5f24a15fb0b53496611ff27f83ae9` |
| BLAKE2b-256 | `84c207bb06a154b59f18d632a216f2ea546fc80eb8f4b2361e45a155194b47ca` |
### Provenance

The following attestation bundle was made for `llm_fs_tools-1.1.1-py3-none-any.whl`:

**Publisher:** `publish.yml` on dansasser/llm-filesystem-tools

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llm_fs_tools-1.1.1-py3-none-any.whl
- Subject digest: `a6d10f8194fa904e8cdd4835410fe8f619f584cdeb20a15fc358032a375c62a1`
- Sigstore transparency entry: 748351677
- Sigstore integration time:
- Permalink: dansasser/llm-filesystem-tools@1413f56f230e3e951cbe218dbde1104dd48fa3a6
- Branch / Tag: refs/tags/V1.1.1-fixed
- Owner: https://github.com/dansasser
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@1413f56f230e3e951cbe218dbde1104dd48fa3a6
- Trigger Event: release