Sanitongo 🛡️

Modern MongoDB Query Sanitizer with Layered Security Protection

Sanitongo is a comprehensive security library for sanitizing MongoDB queries with multiple layers of protection against NoSQL injection attacks, malicious queries, and complexity-based DoS attacks.

Features

Five-Layer Protection System

Type Validation - Ensure inputs are valid types and structures
Schema Enforcement - Validate against predefined field schemas
Operator Filtering - Remove/validate MongoDB operators
Pattern Validation - Detect dangerous patterns in string values
Complexity Limiting - Prevent DoS through query complexity limits
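
The layered idea can be pictured as a list of independent functions applied in order. The sketch below is only an illustration under that assumption: `check_types`, `drop_where`, and `run_layers` are hypothetical names, not part of Sanitongo's API, and the real layers do considerably more.

```python
from typing import Any, Callable

# Hypothetical layer signature: take a query, return it (possibly modified) or raise.
Layer = Callable[[dict[str, Any]], dict[str, Any]]


def check_types(query: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for type validation: the query must be a dict with string keys."""
    if not isinstance(query, dict) or not all(isinstance(k, str) for k in query):
        raise TypeError("query must be a dict with string keys")
    return query


def drop_where(query: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for operator filtering: remove a top-level $where key."""
    return {k: v for k, v in query.items() if k != "$where"}


def run_layers(query: dict[str, Any], layers: list[Layer]) -> dict[str, Any]:
    """Pass the query through each layer in order."""
    for layer in layers:
        query = layer(query)
    return query


cleaned = run_layers({"name": "John", "$where": "1"}, [check_types, drop_where])
print(cleaned)  # {'name': 'John'}
```

Because each stand-in layer only depends on its input, layers can be reordered, removed, or tested in isolation.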

Security Features

NoSQL Injection Prevention - Blocks $where, $function, and other dangerous operators
JavaScript Injection Protection - Detects and prevents JS code execution attempts
ReDoS Prevention - Identifies potentially dangerous regex patterns
XSS Protection - Blocks script tags and malicious HTML
Command Injection Prevention - Detects shell command injection attempts
Prototype Pollution Protection - Prevents __proto__ and constructor manipulation
Complexity DoS Prevention - Limits query depth, key count, array sizes, and string lengths
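
To make the operator-blocking idea concrete, here is a small standard-library sketch that walks a nested query and reports every `$`-prefixed key outside an allowlist. It is not Sanitongo's implementation; `ALLOWED` and `find_disallowed_operators` are names made up for this example.

```python
from typing import Any, Iterator

# Illustrative allowlist; not necessarily the library's default.
ALLOWED = {"$eq", "$ne", "$gt", "$gte", "$lt", "$lte", "$in", "$nin"}


def find_disallowed_operators(value: Any, path: str = "") -> Iterator[str]:
    """Yield the path of every $-prefixed key that is not in ALLOWED."""
    if isinstance(value, dict):
        for key, child in value.items():
            here = f"{path}.{key}" if path else str(key)
            if isinstance(key, str) and key.startswith("$") and key not in ALLOWED:
                yield here
            yield from find_disallowed_operators(child, here)
    elif isinstance(value, list):
        for index, item in enumerate(value):
            yield from find_disallowed_operators(item, f"{path}[{index}]")


query = {"age": {"$gte": 18}, "$where": "this.a == 1", "tags": [{"$function": {}}]}
print(list(find_disallowed_operators(query)))
# ['$where', 'tags[0].$function']
```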

Modern Architecture

Layered Design - Each protection layer operates independently
Configurable - Extensive configuration options for different security postures
Detailed Reporting - Comprehensive sanitization reports with security insights
Type Safety - Full type hints and validation
Performance Focused - Optimized for production use
Extensive Testing - Security-focused test suites

Installation

pip install sanitongo

For development with all dependencies:

pip install "sanitongo[dev]"

For documentation building:

pip install "sanitongo[docs]"

For testing (includes MongoDB integration tests):

pip install "sanitongo[test]"

Quick Start

Basic Usage

from sanitongo import create_sanitizer

# Create a sanitizer in strict mode
sanitizer = create_sanitizer(strict_mode=True)

# Sanitize a query
query = {
    "name": "John Doe",
    "age": {"$gte": 18},
    "email": "john@example.com"
}

# Safe sanitization
sanitized_query = sanitizer.sanitize_query(query)
print(sanitized_query)  # Clean query ready for MongoDB

# Check if query is safe without modification
is_safe = sanitizer.is_query_safe(query)
print(f"Query is safe: {is_safe}")

Detailed Sanitization Report

from sanitongo import MongoSanitizer, SanitizerConfig

# Create sanitizer with custom config
config = SanitizerConfig(
    strict_operators=False,  # Remove dangerous operators instead of failing
    enable_logging=True,
    log_level="INFO"
)
sanitizer = MongoSanitizer(config)

# Dangerous query example
dangerous_query = {
    "name": "John",
    "$where": "function() { return true; }",  # Dangerous JS execution
    "payload": "<script>alert('xss')</script>",  # XSS attempt
}

# Get detailed report
report = sanitizer.sanitize(dangerous_query)

print(f"Success: {report.success}")
print(f"Modified: {report.has_modifications()}")
print(f"Warnings: {len(report.warnings)}")
print(f"Removed items: {report.removed_items}")
print(f"Security issues: {report.security_issues}")
print(f"Summary: {report.get_summary()}")

Schema-Based Validation

from sanitongo import create_sanitizer

# Define your schema using simple field configs
schema = {
    "_id": {"type": "objectid"},
    "name": {
        "type": "string",
        "required": True,
        "min_length": 1,
        "max_length": 100
    },
    "email": {
        "type": "string", 
        "pattern": r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
    },
    "age": {"type": "integer"},
    "active": {"type": "boolean"},
    "tags": {"type": "array"}
}

# Create sanitizer with schema
sanitizer = create_sanitizer(schema=schema, strict_mode=True)

# Valid query
valid_query = {
    "name": "John Doe",
    "email": "john@example.com", 
    "age": 30,
    "active": True
}

result = sanitizer.sanitize_query(valid_query)  # ✅ Passes

# Invalid query with schema violations
invalid_query = {
    "name": "",  # Too short
    "email": "invalid-email",  # Wrong format
    "unknown_field": "not allowed"  # Not in schema
}

try:
    sanitizer.sanitize_query(invalid_query)  # ❌ Raises ValidationError
except Exception as e:
    print(f"Validation failed: {e}")
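
For intuition about why the invalid query above is rejected, the following standalone sketch re-creates the three checks by hand using only the standard library. The function `check_fields` and the `KNOWN_FIELDS` set are hypothetical stand-ins, not Sanitongo internals; only the email pattern is copied from the schema.

```python
import re

# Same expression as the "email" pattern in the schema above.
EMAIL_PATTERN = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")
KNOWN_FIELDS = {"_id", "name", "email", "age", "active", "tags"}


def check_fields(query: dict, known_fields: set[str]) -> list[str]:
    """Return human-readable violations for a flat query (stand-in logic)."""
    problems = []
    for field, value in query.items():
        if field not in known_fields:
            problems.append(f"{field}: not in schema")
        elif field == "name" and not 1 <= len(value) <= 100:
            problems.append("name: length must be between 1 and 100")
        elif field == "email" and not EMAIL_PATTERN.match(value):
            problems.append("email: does not match pattern")
    return problems


invalid_query = {"name": "", "email": "invalid-email", "unknown_field": "not allowed"}
for problem in check_fields(invalid_query, KNOWN_FIELDS):
    print(problem)
# name: length must be between 1 and 100
# email: does not match pattern
# unknown_field: not in schema
```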

Configuration

Basic Configuration

from sanitongo import SanitizerConfig, MongoSanitizer

config = SanitizerConfig(
    # Type validation
    strict_types=True,
    
    # Operator filtering  
    strict_operators=True,
    dangerous_operators={"$where", "$function", "$regex"},
    allowed_operators={"$eq", "$ne", "$gt", "$gte", "$lt", "$lte", "$in", "$nin"},
    
    # Pattern validation
    enable_pattern_validation=True,
    custom_dangerous_patterns={
        "sql_injection": r"(?i)(union\s+select|drop\s+table)",
        "custom_threat": r"malicious_pattern"
    },
    
    # Complexity limits
    max_depth=10,
    max_keys=100, 
    max_array_length=1000,
    max_string_length=10000,
    
    # Error handling
    fail_on_schema_violation=True,
    fail_on_dangerous_operators=True,
    fail_on_dangerous_patterns=True,
    fail_on_complexity_exceeded=True,
    
    # Logging
    enable_logging=True,
    log_level="INFO",
    log_removed_items=True
)

sanitizer = MongoSanitizer(config)
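
Before adding a custom pattern, it can help to try the expression against a few strings with Python's `re` module. The sketch below tests the `sql_injection` expression from the configuration above; the sample strings are invented for illustration, and how Sanitongo applies the pattern internally (for example `search` versus `match`) is an assumption here.

```python
import re

# Same expression as the "sql_injection" entry above.
SQL_INJECTION = re.compile(r"(?i)(union\s+select|drop\s+table)")

samples = [
    "UNION   SELECT password FROM users",
    "please drop table users",
    "a union of select committees",
    "dropdown table",
]

for text in samples:
    print(f"{text!r}: {bool(SQL_INJECTION.search(text))}")
# 'UNION   SELECT password FROM users': True
# 'please drop table users': True
# 'a union of select committees': False
# 'dropdown table': False
```

The last two samples show that the pattern requires the two keywords to be separated only by whitespace, which keeps ordinary prose from triggering it.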

Advanced Configuration

from sanitongo import SanitizerConfig, MongoSanitizer

# Create detailed configuration
config = SanitizerConfig(
    strict_types=True,
    strict_operators=False,  # Remove dangerous operators instead of failing
    enable_pattern_validation=True,
    max_depth=15,
    max_keys=200,
    enable_logging=True,
    log_level="WARNING"
)

sanitizer = MongoSanitizer(config)

Schema-Based Configuration

from sanitongo import SanitizerConfig, MongoSanitizer
from sanitongo.schema import SchemaValidator, FieldRule, FieldType

# Define schema rules
schema_rules = {
    "_id": FieldRule(FieldType.OBJECT_ID),
    "name": FieldRule(
        FieldType.STRING,
        required=True,
        min_length=1,
        max_length=100
    ),
    "email": FieldRule(
        FieldType.STRING,
        pattern=r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
    ),
    "age": FieldRule(FieldType.INTEGER),
    "active": FieldRule(FieldType.BOOLEAN)
}

# Create schema validator
schema_validator = SchemaValidator(schema_rules)

# Create configuration with schema
config = SanitizerConfig(
    schema_validator=schema_validator,
    strict_types=True,
    enable_logging=True
)

sanitizer = MongoSanitizer(config)

Security Features

Attack Prevention Examples

from sanitongo import create_sanitizer

sanitizer = create_sanitizer(strict_mode=True)

# NoSQL Injection Prevention
malicious_queries = [
    {"$where": "function() { return true; }"},
    {"$where": "this.username == 'admin'"},
    {"username": {"$ne": None}},
]

# JavaScript Injection Prevention  
js_attacks = [
    {"payload": "function() { while(true) {} }"},
    {"code": "eval('rm -rf /')"},
    {"script": "setTimeout(() => { attack(); }, 1000)"},
]

# XSS Prevention
xss_attacks = [
    {"html": "<script>alert('xss')</script>"},
    {"payload": "<img src=x onerror=alert('xss')>"},
    {"injection": "javascript:alert('xss')"},
]

# In strict mode each of these should raise; a clean return counts as not blocked
for attack in malicious_queries + js_attacks + xss_attacks:
    try:
        sanitizer.sanitize_query(attack)
        print("❌ Attack not blocked!")
    except Exception as e:
        print(f"✅ Attack blocked: {type(e).__name__}")

Complexity DoS Prevention

from sanitongo import create_sanitizer

sanitizer = create_sanitizer(strict_mode=True)

# These will be blocked by complexity limits
complex_attacks = [
    # Deep nesting attack
    {"a": {"b": {"c": {"d": {"e": {"f": {"g": {"h": {"i": {"j": {"k": "deep"}}}}}}}}}}},
    
    # Key explosion attack  
    {f"key_{i}": f"value_{i}" for i in range(500)},
    
    # Large array attack
    {"large_array": list(range(5000))},
    
    # Long string attack
    {"long_string": "A" * 50000}
]

for attack in complex_attacks:
    try:
        sanitizer.sanitize_query(attack)
        print("❌ Complexity attack not blocked!")
    except Exception as e:
        print(f"✅ Complexity attack blocked: {type(e).__name__}")

Testing

Run the comprehensive test suite:

# Run all tests
pytest

# Run with coverage
pytest --cov=src/sanitongo --cov-report=html

# Run only security tests
pytest tests/test_security.py -m security

# Run performance benchmarks
pytest --benchmark-only

Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Setup

# Clone the repository
git clone https://github.com/izikeros/sanitongo.git
cd sanitongo

# Install in development mode
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

# Run tests
make test

# Run linting
make lint

# Format code
make format

Implementation Status

Fully Implemented

Core Protection Layers: All five security layers are implemented and tested
Basic API: MongoSanitizer, SanitizerConfig, and create_sanitizer
Schema Validation: Field types, constraints, and validation rules
Security Features: NoSQL injection, XSS, pattern detection, complexity limits
Error Handling: Comprehensive exception hierarchy
Testing: 83 tests with 69% code coverage

Partially Implemented

Configuration Management: Basic config loading exists but needs more testing
Advanced Schema Types: Some field types may need refinement
Documentation: API docs could be expanded

Planned Features

JSON/YAML Configuration: File-based configuration loading
Environment Variables: Configuration from environment
Advanced Patterns: More sophisticated threat detection
Performance Optimizations: Further speed improvements

Performance

Sanitongo is designed for production use with minimal performance impact:

Lightweight: Small memory footprint
Fast: Optimized algorithms for each protection layer
Scalable: Handles complex queries efficiently
Configurable: Adjust security vs. performance trade-offs

Benchmark results (from automated tests):

Simple queries: ~45-80μs processing time
Complex queries: ~100-200μs processing time
Schema validation: ~38-45μs processing time
Memory usage: <10MB for typical configurations
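
To check figures like these on your own hardware, a rough per-call timing can be taken with the standard `timeit` module. In the sketch below `stand_in_sanitize` is a placeholder workload, not Sanitongo code; replace it with a call to your own sanitizer instance. Results will vary by machine and configuration.

```python
import timeit


def stand_in_sanitize(query: dict) -> dict:
    """Placeholder workload; swap in your own sanitizer call."""
    return {k: v for k, v in query.items() if not k.startswith("$")}


query = {"name": "John Doe", "age": {"$gte": 18}, "email": "john@example.com"}
runs = 10_000

# Total wall-clock seconds for all runs, then averaged per call.
seconds = timeit.timeit(lambda: stand_in_sanitize(query), number=runs)
print(f"{seconds / runs * 1e6:.2f} microseconds per call")
```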

Security Considerations

When to Use Strict Mode

Use strict mode when:

Handling untrusted user input
Building public APIs
Processing queries from external sources
Enforcing maximum security

Use lenient mode when:

Processing internal/trusted queries
Generating detailed sanitization reports
Rolling out security checks gradually
Integrating with legacy systems
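
The difference between the two modes can be sketched with a toy top-level operator filter: strict raises on the first dangerous key, lenient drops it and records what was removed. `DANGEROUS` and `filter_operators` are illustrative names only, and the real library inspects nested values as well.

```python
DANGEROUS = {"$where", "$function"}  # illustrative subset


def filter_operators(query: dict, strict: bool) -> tuple[dict, list[str]]:
    """Strict: raise on the first dangerous key. Lenient: drop it and record it."""
    cleaned, removed = {}, []
    for key, value in query.items():
        if key in DANGEROUS:
            if strict:
                raise ValueError(f"dangerous operator: {key}")
            removed.append(key)
        else:
            cleaned[key] = value
    return cleaned, removed


query = {"name": "John", "$where": "function() { return true; }"}
print(filter_operators(query, strict=False))
# ({'name': 'John'}, ['$where'])

try:
    filter_operators(query, strict=True)
except ValueError as exc:
    print(exc)
# dangerous operator: $where
```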

Security Best Practices

Always validate input - Use schema validation for all external input
Log security events - Enable logging for security auditing
Monitor removed items - Track what gets sanitized
Regular updates - Keep sanitongo updated for latest security fixes
Test thoroughly - Include security tests in your test suite

API Reference

Main Classes

MongoSanitizer - Main sanitizer class with full configuration
SanitizerConfig - Configuration container
SanitizationReport - Detailed sanitization results
SchemaValidator - Schema-based field validation

Exceptions

SanitizerError - Base exception class
ValidationError - Input validation failures
SchemaViolationError - Schema constraint violations
SecurityError - Security threats detected
ComplexityError - Query complexity limits exceeded
PatternError - Dangerous patterns detected

Factory Functions

create_sanitizer() - Create sanitizer with common configurations

Schema Components

FieldType - Enum of supported field types (STRING, INTEGER, BOOLEAN, etc.)
FieldRule - Validation rules for individual fields
SchemaValidator - Schema-based field validation

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Thanks to the MongoDB security community for vulnerability research
Inspired by various NoSQL injection prevention techniques
Built with modern Python security best practices

Support

Documentation: GitHub Wiki
Issues: GitHub Issues
Discussions: GitHub Discussions
Security: Report security issues privately to ksafjan@gmail.com

Project details

Release history

0.1.3: Jan 16, 2026
0.1.2: Jan 15, 2026
0.1.1: Oct 24, 2025 (this version)

Download files

Source Distribution

sanitongo-0.1.1.tar.gz (29.7 kB), uploaded Oct 24, 2025, Source

Built Distribution

sanitongo-0.1.1-py3-none-any.whl (22.6 kB), uploaded Oct 24, 2025, Python 3

File details

Details for the file sanitongo-0.1.1.tar.gz.

File metadata

Download URL: sanitongo-0.1.1.tar.gz
Upload date: Oct 24, 2025
Size: 29.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sanitongo-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`650d30bacc0cb59a6b9ecace5d3e5813dbcaa53f8cf16ce92583aff476da0058`
MD5	`091f616e5f72ca71b7dc8004ac0e8d2f`
BLAKE2b-256	`9f8cf7136d2fbda7036ca9df62c31c07e8cd04943cee71f92383ebb893021e3a`

File details

Details for the file sanitongo-0.1.1-py3-none-any.whl.

File metadata

Download URL: sanitongo-0.1.1-py3-none-any.whl
Upload date: Oct 24, 2025
Size: 22.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sanitongo-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`838bb36a1e6075d457438d61e8976ae38a37d0dcb5eb81996185b3a9aa7f77c0`
MD5	`91762dac224942674a7ab3465dbf005c`
BLAKE2b-256	`18afe31457281fe22256186e8e833e73dd9c56abb6ad794d06d2c75c72397c69`
