High-performance regex with JIT/SIMD optimizations
FastRegex
A high-performance regular expression library for Python with JIT compilation and SIMD optimizations.
🚀 Features
- JIT Compilation: LLVM-based just-in-time compilation for complex patterns
- SIMD Optimizations: AVX2/AVX512/SSE4.2/NEON support for vectorized operations
- Smart Caching: Automatic caching of compiled patterns to avoid recompilation
- Python Integration: Seamless integration via pybind11
- High Performance: Up to 1000x faster than the standard re module for complex patterns
📊 Performance Benchmarks
| Test Case | Python re (ms) | FastRegex (ms) | Speedup |
|---|---|---|---|
| Email validation | 2.250 ±0.157 | 0.021 ±0.002 | 107x ✅ |
| Word boundaries | 0.166 ±0.011 | 0.025 ±0.002 | 6.6x ✅ |
| Complex pattern | 19.665 ±4.100 | 0.017 ±0.003 | 1156x ✅ |
| Multiline text | 0.855 ±0.163 | 0.219 ±0.002 | 3.9x ✅ |
Key insights:
- Up to 1156x faster for complex patterns
- Roughly 4x to 107x faster in the other test cases
- Best performance on repetitive operations
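The benchmark script itself is not shown here, so the following is only a sketch of how figures in the "mean ± deviation" form of the table could be produced. It times the standard re module with timeit; the helper names `time_call` and `speedup`, the email-like pattern, and the sample text are all hypothetical and not taken from tests/benchmark.py.

```python
import re
import statistics
import timeit


def time_call(func, *args, repeats=5, number=100):
    """Return (mean_ms, stdev_ms) per call, measured over `repeats` timing runs."""
    timer = timeit.Timer(lambda: func(*args))
    runs = timer.repeat(repeat=repeats, number=number)
    # Each run executes `number` calls; convert seconds per run to ms per call.
    per_call_ms = [seconds / number * 1000.0 for seconds in runs]
    return statistics.mean(per_call_ms), statistics.stdev(per_call_ms)


def speedup(baseline_ms, candidate_ms):
    """Ratio of baseline time to candidate time (greater than 1 means faster)."""
    return baseline_ms / candidate_ms


# Hypothetical workload: an email-like pattern over repeated text.
EMAIL = r"[\w.+-]+@[\w-]+\.[\w.-]+"
TEXT = "contact alice@example.com or bob@example.org " * 200

mean_ms, stdev_ms = time_call(re.findall, EMAIL, TEXT)
print(f"re.findall: {mean_ms:.3f} ±{stdev_ms:.3f} ms")
```

To compare the two libraries, the same `time_call` helper could be pointed at `fastregex.find_all` and the two means passed to `speedup`. The speedups in the table appear to be this ratio, truncated.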
🛠 Installation
From PyPI (Recommended)
pip install fastregex
From Source
git clone https://github.com/baksvell/fastregex.git
cd fastregex
pip install -e .
Prerequisites
- CMake 3.20+
- Python 3.10+
- C++17 compiler (GCC/MSVC/Clang)
📖 Usage
Basic Usage
import fastregex
# Simple search
result = fastregex.search(r'\d+', 'abc123def')
print(result) # True
# Find all matches
matches = fastregex.find_all(r'\w+', 'hello world test')
print(matches) # ['hello', 'world', 'test']
# Replace
new_text = fastregex.replace(r'\d+', 'abc123def456', 'XXX')
print(new_text) # 'abcXXXdefXXX'
# Compile for reuse
compiled = fastregex.compile(r'\d+')
result = compiled.search('abc123def')
print(result) # True
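For readers migrating existing code, the sketch below reproduces the calls above with the standard re module only. The wrapper functions are hypothetical stand-ins that mirror the signatures shown in this README; they are not part of FastRegex. The point to notice is argument order: the replace example here takes (pattern, text, replacement), while re.sub takes (pattern, repl, string).

```python
import re


def search(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None


def find_all(pattern, text):
    """Return all non-overlapping matches as a list."""
    return re.findall(pattern, text)


def replace(pattern, text, replacement):
    """Replace every match; note the swapped order of the last two arguments."""
    # re.sub expects (pattern, repl, string).
    return re.sub(pattern, replacement, text)


print(search(r"\d+", "abc123def"))             # True
print(find_all(r"\w+", "hello world test"))    # ['hello', 'world', 'test']
print(replace(r"\d+", "abc123def456", "XXX"))  # abcXXXdefXXX
```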
Advanced Features
# Check cache statistics
print(f"Cache size: {fastregex.cache_size()}")
print(f"Hit rate: {fastregex.hit_rate():.2%}")
# SIMD capabilities
caps = fastregex.simd_capabilities()
print(f"AVX2 support: {caps['avx2']}")
print(f"AVX512 support: {caps['avx512']}")
# SIMD statistics
stats = fastregex.get_simd_stats()
print(f"Total calls: {stats['total_calls']}")
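How FastRegex implements its cache is not described here. As an illustration of what a cache size and a hit rate measure, the following toy cache is built on re.compile and a dictionary. The class `PatternCache` and its methods are hypothetical and say nothing about the library's internals.

```python
import re


class PatternCache:
    """Toy compiled-pattern cache that counts hits and misses."""

    def __init__(self):
        self._patterns = {}
        self._hits = 0
        self._misses = 0

    def get(self, pattern):
        """Return a compiled pattern, compiling it only on first use."""
        compiled = self._patterns.get(pattern)
        if compiled is None:
            self._misses += 1
            compiled = re.compile(pattern)
            self._patterns[pattern] = compiled
        else:
            self._hits += 1
        return compiled

    def cache_size(self):
        """Number of distinct patterns currently stored."""
        return len(self._patterns)

    def hit_rate(self):
        """Fraction of lookups served from the cache (0.0 when unused)."""
        total = self._hits + self._misses
        return self._hits / total if total else 0.0

    def clear(self):
        """Drop all patterns and reset the counters."""
        self._patterns.clear()
        self._hits = self._misses = 0


cache = PatternCache()
for _ in range(4):
    cache.get(r"\d+")   # 1 miss, then 3 hits
cache.get(r"\w+")       # 1 miss
print(f"Cache size: {cache.cache_size()}")  # Cache size: 2
print(f"Hit rate: {cache.hit_rate():.2%}")  # Hit rate: 60.00%
```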
🎯 When to Use FastRegex
✅ Use FastRegex for:
- Complex patterns (where JIT compilation shines)
- Repetitive matching (where the cache pays off)
- SIMD-friendly patterns (literals, digit checks)
- Large texts (over 1 MB, processed in optimized chunks)
⚠️ Use standard re for:
- Simple one-time matches (no JIT overhead)
- Code that needs 100% compatibility with Python's re module
- Dynamic patterns (generated on the fly)
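Since compatibility is one of the deciding factors, one way to gauge it for your own patterns is to compare a candidate function against re.findall on a handful of cases. This is a minimal sketch: the helper `find_mismatches` and the case list are hypothetical, and which constructs FastRegex supports is not stated here, so the lookbehind case is only a probe.

```python
import re


def find_mismatches(candidate_find_all, cases):
    """Return the cases where the candidate's output differs from re.findall."""
    mismatches = []
    for pattern, text in cases:
        expected = re.findall(pattern, text)
        try:
            actual = candidate_find_all(pattern, text)
        except Exception as exc:  # an unsupported pattern counts as a mismatch
            actual = exc
        if actual != expected:
            mismatches.append((pattern, text, expected, actual))
    return mismatches


CASES = [
    (r"\d+", "abc123def456"),
    (r"\w+", "hello world test"),
    (r"(?<=a)b", "ab cb ab"),  # lookbehind, a construct worth probing
]

# Sanity check against re itself, which trivially agrees with re.findall.
print(find_mismatches(re.findall, CASES))  # []
```

Passing `fastregex.find_all` as the first argument would run the same comparison against the library.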
🔄 Hybrid approach:
import re
import fastregex as fr
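def smart_match(pattern, text):
    # Long pattern on a large text: compilation overhead is more likely to pay off
    if len(pattern) > 15 and len(text) > 1000:
        return fr.search(pattern, text)
    # re.search returns a Match object or None; return a bool like fr.search does
    return re.search(pattern, text) is not None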
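A variant of the hybrid function that also tolerates FastRegex not being installed is sketched below. It assumes, based on the usage examples, that `fastregex.search` returns a truthy value on a match. The keyword parameters `min_pattern_len` and `min_text_len` are hypothetical names for the two thresholds used above.

```python
import re

try:
    import fastregex as fr  # optional dependency
except ImportError:
    fr = None  # fall back to the standard library everywhere


def smart_match(pattern, text, min_pattern_len=15, min_text_len=1000):
    """Return True if the pattern matches anywhere in the text."""
    use_fast = (
        fr is not None
        and len(pattern) > min_pattern_len
        and len(text) > min_text_len
    )
    if use_fast:
        return bool(fr.search(pattern, text))
    # Normalize re's Match-or-None result to a bool.
    return re.search(pattern, text) is not None


print(smart_match(r"\d+", "abc123def"))  # True
```

Returning a bool from both branches keeps callers independent of which engine handled the match.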
🧪 Testing
Run the test suite:
python -m pytest tests/
Run performance benchmarks:
python tests/benchmark.py
📚 API Reference
Core Functions
- fastregex.match(pattern, text) - Match from start of string
- fastregex.search(pattern, text) - Search anywhere in string
- fastregex.find_all(pattern, text) - Find all matches
- fastregex.replace(pattern, text, replacement) - Replace matches
- fastregex.compile(pattern) - Compile pattern for reuse
Cache Management
- fastregex.cache_size() - Get current cache size
- fastregex.hit_rate() - Get cache hit rate
- fastregex.clear_cache() - Clear the cache
SIMD Features
- fastregex.simd_capabilities() - Get SIMD support info
- fastregex.get_simd_stats() - Get SIMD usage statistics
- fastregex.set_simd_mode(mode) - Set SIMD mode
- fastregex.get_simd_mode() - Get current SIMD mode
🤝 Contributing
Contributions are welcome! Please see CONTRIBUTING.md for details.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.