NLP Processor & SBERT Training Tool
🧠 Vatrix
Vatrix is an NLP log processor that renders natural-language descriptions from machine data. It serves several use cases:
- streaming NLP & vector embedding
- batch NDJSON file processing
- augmented data injection
- generating training pairs for fine-tuning Sentence Transformers (SBERT)
✨ Features
- CLI-powered NDJSON log processing
- Modular template system powered by Jinja2
- SBERT data generation and similarity scoring
- Supports file mode, stream mode, and CLI flags
- Exports training pairs to CSV
- Exports highly similar sentence pairs for SBERT fine-tuning
- Flexible and colorful logging with log rotation
- Direct integration with Qdrant vector database (OSAI-Demo Stack)
- Unit & integration testing
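As a rough illustration of the training-pair export listed above, the sketch below scores sentence pairs and writes the similar ones to CSV. The column names (`sentence1`, `sentence2`, `score`), the helper names, and the 0.5 threshold are assumptions for illustration. A token-overlap (Jaccard) score stands in for SBERT-based similarity so the sketch runs without extra dependencies; Vatrix's actual scoring and CSV layout may differ.

```python
import csv
import io
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]; a stand-in for an SBERT cosine score."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def build_pairs(sentences, threshold=0.5):
    """Return (sentence1, sentence2, score) for pairs scoring at or above threshold."""
    pairs = []
    for s1, s2 in combinations(sentences, 2):
        score = jaccard(s1, s2)
        if score >= threshold:
            pairs.append((s1, s2, round(score, 3)))
    return pairs


def write_pairs(pairs, fh):
    """Write pairs as CSV with a header row (hypothetical column names)."""
    writer = csv.writer(fh)
    writer.writerow(["sentence1", "sentence2", "score"])
    writer.writerows(pairs)


if __name__ == "__main__":
    rendered = [
        "User alice logged in from 10.0.0.5",
        "User alice logged in from 10.0.0.9",
        "Disk usage on host db01 reached 91 percent",
    ]
    buf = io.StringIO()
    write_pairs(build_pairs(rendered), buf)
    print(buf.getvalue())
```

Swapping `jaccard` for a cosine score over sentence embeddings would keep the rest of the flow unchanged.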
📦 Installation
pip install vatrix
Or install the latest from source:
git clone https://github.com/brianbatesactual/vatrix.git
cd vatrix
make setup
🛠️ Usage
vatrix --mode file \
--render-mode all \
--input data/input_logs.json \
--output data/processed_logs.csv \
--unmatched data/unmatched_logs.json \
--generate-sbert-data \
--log-level DEBUG \
--log-file logs/vatrix_debug.log
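To make the flags above more concrete, here is a minimal sketch of what a file-mode pass might look like: read NDJSON, render each record through a template chosen by an event field, and set aside records that no template can render (the role `--unmatched` appears to play). The `event` field, the template table, and `process_ndjson` are hypothetical, and `string.Template` from the standard library stands in for Jinja2 so the sketch has no dependencies. It is not Vatrix's actual implementation.

```python
import json
from string import Template

# Hypothetical templates keyed by an assumed "event" field.
TEMPLATES = {
    "login": Template("User $user logged in from $src_ip"),
    "logout": Template("User $user logged out"),
}


def process_ndjson(lines):
    """Split NDJSON lines into rendered sentences and unmatched records."""
    rendered, unmatched = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        template = TEMPLATES.get(record.get("event"))
        if template is None:
            unmatched.append(record)  # no template for this event type
            continue
        try:
            rendered.append(template.substitute(record))
        except KeyError:
            # A template matched, but a field it needs is missing.
            unmatched.append(record)
    return rendered, unmatched
```

Keeping unmatched records rather than dropping them makes it easy to see which event types still need a template.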
Makefile Commands
make setup # Create venv and install dependencies
make run # Run log processor on default file
make stream # Start reading NDJSON from stdin
make retrain # Export SBERT sentence pairs
make freeze # Regenerate requirements.txt
make clean # Clean environment and build artifacts
make nuke # Full reset of the project environment
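For `make stream`, which reads NDJSON from stdin, the core loop can be sketched as a generator that parses one JSON object per line. The function name `iter_ndjson` and the choice to skip malformed lines with a warning are assumptions for illustration; how Vatrix handles bad input in stream mode is not documented here.

```python
import io
import json
import sys


def iter_ndjson(stream):
    """Yield one parsed object per non-empty line; skip lines that are not valid JSON."""
    for lineno, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            # Report the problem on stderr and keep the stream going.
            print(f"line {lineno}: skipped ({exc.msg})", file=sys.stderr)


# Demo on an in-memory stream; passing sys.stdin instead would read piped input.
sample = io.StringIO('{"event": "login"}\nnot json\n\n{"event": "logout"}\n')
records = list(iter_ndjson(sample))
print(records)
```

Because the generator yields records lazily, it works the same way on a finite file and on a long-running pipe.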
🧠 Example
🧪 Testing
make test
📁 Logs
All logs are saved to the logs/ directory with daily rotation.
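Daily rotation of this kind can be set up with the standard library's `TimedRotatingFileHandler`. The sketch below is one way to do it, not necessarily how Vatrix configures its logger; the file name `app.log`, the logger name, and the seven-file retention are assumptions.

```python
import logging
from logging.handlers import TimedRotatingFileHandler
from pathlib import Path


def build_logger(log_dir="logs", name="vatrix_sketch", level=logging.DEBUG):
    """Return a logger that writes to log_dir and rolls the file over at midnight."""
    Path(log_dir).mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid stacking duplicate handlers on repeated calls
        handler = TimedRotatingFileHandler(
            str(Path(log_dir) / "app.log"),  # hypothetical file name
            when="midnight",  # rotate once per day
            backupCount=7,  # assumed retention: keep a week of old files
            encoding="utf-8",
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Calling `build_logger().info("started")` creates `logs/app.log` if needed; rotated files get a date suffix appended by the handler.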
🧼 Cleanup
make clean # Clean temp data
make nuke # Wipe and rebuild virtualenv
📚 License
MIT © Brian Bates
Built with ❤️ for log intelligibility and NLP adventures.
File details
Details for the file vatrix-0.2.1.tar.gz.
File metadata
- Download URL: vatrix-0.2.1.tar.gz
- Upload date:
- Size: 22.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.9.21
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9228dc4bbd539d47b5146981fc12573971dc4bc5c67baf242e539c1633511e19 |
| MD5 | 04d6f8d9acfe1e96115c10d53f8a026d |
| BLAKE2b-256 | 4d6ef9e0248e52d6d0f083da01613471b830d7d4a9b5971c52e795f25f52f5b5 |
File details
Details for the file vatrix-0.2.1-py3-none-any.whl.
File metadata
- Download URL: vatrix-0.2.1-py3-none-any.whl
- Upload date:
- Size: 26.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.9.21
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cf0b05fd25a8e86f475f5c3ff90e8aa25d282e4b3e8969d58a0618c263fd74b1 |
| MD5 | 54506ca1406c99a81e74bee7fa1bd2bb |
| BLAKE2b-256 | 50c5414dd1457f6bd4af3a0ce075f2abcf60515e7d75877b2cf070304257ccf4 |