Modern MongoDB query sanitizer with layered security protection
Project description
Sanitongo 🛡️
Modern MongoDB Query Sanitizer with Layered Security Protection
Sanitongo is a comprehensive security library for sanitizing MongoDB queries with multiple layers of protection against NoSQL injection attacks, malicious queries, and complexity-based DoS attacks.
Features
Five-Layer Protection System
- Type Validation - Ensure inputs are valid types and structures
- Schema Enforcement - Validate against predefined field schemas
- Operator Filtering - Remove/validate MongoDB operators
- Pattern Validation - Detect dangerous patterns in string values
- Complexity Limiting - Prevent DoS through query complexity limits
Security Features
- NoSQL Injection Prevention - Blocks
$where,$function, and other dangerous operators - JavaScript Injection Protection - Detects and prevents JS code execution attempts
- ReDoS Prevention - Identifies potentially dangerous regex patterns
- XSS Protection - Blocks script tags and malicious HTML
- Command Injection Prevention - Detects shell command injection attempts
- Prototype Pollution Protection - Prevents
__proto__and constructor manipulation - Complexity DoS Prevention - Limits query depth, key count, array sizes, and string lengths
Modern Architecture
- Layered Design - Each protection layer operates independently
- Configurable - Extensive configuration options for different security postures
- Detailed Reporting - Comprehensive sanitization reports with security insights
- Type Safety - Full TypeScript-style type hints and validation
- Performance Focused - Optimized for production use
- Extensive Testing - Security-focused test suites
Installation
pip install sanitongo
For development with all dependencies:
pip install "sanitongo[dev]"
For documentation building:
pip install "sanitongo[docs]"
For testing (includes MongoDB integration tests):
pip install "sanitongo[test]"
Quick Start
Basic Usage
from sanitongo import create_sanitizer
# Create a sanitizer with default settings
sanitizer = create_sanitizer(strict_mode=True)
# Sanitize a query
query = {
"name": "John Doe",
"age": {"$gte": 18},
"email": "john@example.com"
}
# Safe sanitization
sanitized_query = sanitizer.sanitize_query(query)
print(sanitized_query) # Clean query ready for MongoDB
# Check if query is safe without modification
is_safe = sanitizer.is_query_safe(query)
print(f"Query is safe: {is_safe}")
Detailed Sanitization Report
from sanitongo import MongoSanitizer, SanitizerConfig
# Create sanitizer with custom config
config = SanitizerConfig(
strict_operators=False, # Remove dangerous operators instead of failing
enable_logging=True,
log_level="INFO"
)
sanitizer = MongoSanitizer(config)
# Dangerous query example
dangerous_query = {
"name": "John",
"$where": "function() { return true; }", # Dangerous JS execution
"payload": "<script>alert('xss')</script>", # XSS attempt
}
# Get detailed report
report = sanitizer.sanitize(dangerous_query)
print(f"Success: {report.success}")
print(f"Modified: {report.has_modifications()}")
print(f"Warnings: {len(report.warnings)}")
print(f"Removed items: {report.removed_items}")
print(f"Security issues: {report.security_issues}")
print(f"Summary: {report.get_summary()}")
Schema-Based Validation
from sanitongo import create_sanitizer
# Define your schema using simple field configs
schema = {
"_id": {"type": "objectid"},
"name": {
"type": "string",
"required": True,
"min_length": 1,
"max_length": 100
},
"email": {
"type": "string",
"pattern": r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
},
"age": {"type": "integer"},
"active": {"type": "boolean"},
"tags": {"type": "array"}
}
# Create sanitizer with schema
sanitizer = create_sanitizer(schema=schema, strict_mode=True)
# Valid query
valid_query = {
"name": "John Doe",
"email": "john@example.com",
"age": 30,
"active": True
}
result = sanitizer.sanitize_query(valid_query) # ✅ Passes
# Invalid query with schema violations
invalid_query = {
"name": "", # Too short
"email": "invalid-email", # Wrong format
"unknown_field": "not allowed" # Not in schema
}
try:
sanitizer.sanitize_query(invalid_query) # ❌ Raises ValidationError
except Exception as e:
print(f"Validation failed: {e}")
Configuration
Basic Configuration
from sanitongo import SanitizerConfig, MongoSanitizer
config = SanitizerConfig(
# Type validation
strict_types=True,
# Operator filtering
strict_operators=True,
dangerous_operators={"$where", "$function", "$regex"},
allowed_operators={"$eq", "$ne", "$gt", "$gte", "$lt", "$lte", "$in", "$nin"},
# Pattern validation
enable_pattern_validation=True,
custom_dangerous_patterns={
"sql_injection": r"(?i)(union\s+select|drop\s+table)",
"custom_threat": r"malicious_pattern"
},
# Complexity limits
max_depth=10,
max_keys=100,
max_array_length=1000,
max_string_length=10000,
# Error handling
fail_on_schema_violation=True,
fail_on_dangerous_operators=True,
fail_on_dangerous_patterns=True,
fail_on_complexity_exceeded=True,
# Logging
enable_logging=True,
log_level="INFO",
log_removed_items=True
)
sanitizer = MongoSanitizer(config)
Advanced Configuration
from sanitongo import SanitizerConfig, MongoSanitizer
# Create detailed configuration
config = SanitizerConfig(
strict_types=True,
strict_operators=False, # Remove dangerous operators instead of failing
enable_pattern_validation=True,
max_depth=15,
max_keys=200,
enable_logging=True,
log_level="WARNING"
)
sanitizer = MongoSanitizer(config)
Schema-Based Configuration
from sanitongo import SanitizerConfig, MongoSanitizer
from sanitongo.schema import SchemaValidator, FieldRule, FieldType
# Define schema rules
schema_rules = {
"_id": FieldRule(FieldType.OBJECT_ID),
"name": FieldRule(
FieldType.STRING,
required=True,
min_length=1,
max_length=100
),
"email": FieldRule(
FieldType.STRING,
pattern=r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
),
"age": FieldRule(FieldType.INTEGER),
"active": FieldRule(FieldType.BOOLEAN)
}
# Create schema validator
schema_validator = SchemaValidator(schema_rules)
# Create configuration with schema
config = SanitizerConfig(
schema_validator=schema_validator,
strict_types=True,
enable_logging=True
)
sanitizer = MongoSanitizer(config)
Security Features
Attack Prevention Examples
from sanitongo import create_sanitizer
sanitizer = create_sanitizer(strict_mode=True)
# NoSQL Injection Prevention
malicious_queries = [
{"$where": "function() { return true; }"},
{"$where": "this.username == 'admin'"},
{"username": {"$ne": None}},
]
# JavaScript Injection Prevention
js_attacks = [
{"payload": "function() { while(true) {} }"},
{"code": "eval('rm -rf /')"},
{"script": "setTimeout(() => { attack(); }, 1000)"},
]
# XSS Prevention
xss_attacks = [
{"html": "<script>alert('xss')</script>"},
{"payload": "<img src=x onerror=alert('xss')>"},
{"injection": "javascript:alert('xss')"},
]
# All of these will be blocked or sanitized
for attack in malicious_queries + js_attacks + xss_attacks:
try:
sanitizer.sanitize_query(attack)
print("❌ Attack not blocked!")
except Exception as e:
print(f"✅ Attack blocked: {type(e).__name__}")
Complexity DoS Prevention
# These will be blocked by complexity limits
complex_attacks = [
# Deep nesting attack
{"a": {"b": {"c": {"d": {"e": {"f": {"g": {"h": {"i": {"j": {"k": "deep"}}}}}}}}}}},
# Key explosion attack
{f"key_{i}": f"value_{i}" for i in range(500)},
# Large array attack
{"large_array": list(range(5000))},
# Long string attack
{"long_string": "A" * 50000}
]
for attack in complex_attacks:
try:
sanitizer.sanitize_query(attack)
print("❌ Complexity attack not blocked!")
except Exception as e:
print(f"✅ Complexity attack blocked: {type(e).__name__}")
Testing
Run the comprehensive test suite:
# Run all tests
pytest
# Run with coverage
pytest --cov=src/sanitongo --cov-report=html
# Run only security tests
pytest tests/test_security.py -m security
# Run performance benchmarks
pytest --benchmark-only
Contributing
We welcome contributions! Please see our Contributing Guide for details.
Development Setup
# Clone the repository
git clone https://github.com/izikeros/sanitongo.git
cd sanitongo
# Install in development mode
pip install -e ".[dev]"
# Install pre-commit hooks
pre-commit install
# Run tests
make test
# Run linting
make lint
# Format code
make format
Implementation Status
Fully Implemented
- Core Protection Layers: All five security layers are implemented and tested
- Basic API:
MongoSanitizer,SanitizerConfig, andcreate_sanitizer - Schema Validation: Field types, constraints, and validation rules
- Security Features: NoSQL injection, XSS, pattern detection, complexity limits
- Error Handling: Comprehensive exception hierarchy
- Testing: 83 tests with 69% code coverage
Partially Implemented
- Configuration Management: Basic config loading exists but needs more testing
- Advanced Schema Types: Some field types may need refinement
- Documentation: API docs could be expanded
Planned Features
- JSON/YAML Configuration: File-based configuration loading
- Environment Variables: Configuration from environment
- Advanced Patterns: More sophisticated threat detection
- Performance Optimizations: Further speed improvements
Performance
Sanitongo is designed for production use with minimal performance impact:
- Lightweight: Small memory footprint
- Fast: Optimized algorithms for each protection layer
- Scalable: Handles complex queries efficiently
- Configurable: Adjust security vs. performance trade-offs
Benchmark results (from automated tests):
- Simple queries: ~45-80μs processing time
- Complex queries: ~100-200μs processing time
- Schema validation: ~38-45μs processing time
- Memory usage: <10MB for typical configurations
Security Considerations
When to Use Strict Mode
Use strict mode when:
- Handling untrusted user input
- Building public APIs
- Processing queries from external sources
- Maximum security is required
Use lenient mode when:
- Processing internal/trusted queries
- You need detailed sanitization reports
- Gradual security implementation
- Legacy system integration
Security Best Practices
- Always validate input - Use schema validation for all external input
- Log security events - Enable logging for security auditing
- Monitor removed items - Track what gets sanitized
- Regular updates - Keep sanitongo updated for latest security fixes
- Test thoroughly - Include security tests in your test suite
API Reference
Main Classes
MongoSanitizer- Main sanitizer class with full configurationSanitizerConfig- Configuration containerSanitizationReport- Detailed sanitization resultsSchemaValidator- Schema-based field validation
Exceptions
SanitizerError- Base exception classValidationError- Input validation failuresSchemaViolationError- Schema constraint violationsSecurityError- Security threats detectedComplexityError- Query complexity limits exceededPatternError- Dangerous patterns detected
Factory Functions
create_sanitizer()- Create sanitizer with common configurations
Schema Components
FieldType- Enum of supported field types (STRING, INTEGER, BOOLEAN, etc.)FieldRule- Validation rules for individual fieldsSchemaValidator- Schema-based field validation
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Thanks to the MongoDB security community for vulnerability research
- Inspired by various NoSQL injection prevention techniques
- Built with modern Python security best practices
Support
- Documentation: GitHub Wiki
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Security: Report security issues privately to ksafjan@gmail.com
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file sanitongo-0.1.1.tar.gz.
File metadata
- Download URL: sanitongo-0.1.1.tar.gz
- Upload date:
- Size: 29.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
650d30bacc0cb59a6b9ecace5d3e5813dbcaa53f8cf16ce92583aff476da0058
|
|
| MD5 |
091f616e5f72ca71b7dc8004ac0e8d2f
|
|
| BLAKE2b-256 |
9f8cf7136d2fbda7036ca9df62c31c07e8cd04943cee71f92383ebb893021e3a
|
File details
Details for the file sanitongo-0.1.1-py3-none-any.whl.
File metadata
- Download URL: sanitongo-0.1.1-py3-none-any.whl
- Upload date:
- Size: 22.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
838bb36a1e6075d457438d61e8976ae38a37d0dcb5eb81996185b3a9aa7f77c0
|
|
| MD5 |
91762dac224942674a7ab3465dbf005c
|
|
| BLAKE2b-256 |
18afe31457281fe22256186e8e833e73dd9c56abb6ad794d06d2c75c72397c69
|