Pure Python ByteCNN for edge deployment
PinyByteCNN
Pure Python implementation of ByteCNN for toxicity detection and edge deployment.
Overview
PinyByteCNN is a lightweight, dependency-free neural network implementation designed for production deployment in constrained environments. It provides CNN-based text classification with a minimal memory footprint and fast inference.
Quick Start
from tinybytecnn.model import ByteCNN
# Create model
model = ByteCNN(
    vocab_size=256,
    embed_dim=14,
    conv_filters=28,
    conv_kernel_size=3,
    hidden_dim=48,
    max_len=512,
)
# Predict toxicity
score = model.predict("Hello world")  # Returns a float in [0.0, 1.0]
Features
- Pure Python: No external dependencies beyond the standard library
- Memory Efficient: Optimized for minimal RAM usage
- Fast Inference: Single-pass prediction with pre-allocated buffers
- Multiple Architectures: Support for 1-3 layer CNN configurations
- Flexible Input: Handles variable-length text with multiple strategies
Architecture
ByteCNN processes text through the following pipeline:
- Byte Encoding: Convert text to UTF-8 bytes (0-255)
- Embedding: Map bytes to dense vectors
- Convolution: 1D CNN with ReLU activation
- Pooling: Global average/max pooling
- Classification: Dense layers with sigmoid output
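The pipeline above can be sketched in plain Python. This is a minimal illustration with random weights, not the library's implementation: the helper names (`encode_bytes`, `conv1d_relu`, `global_max_pool`, `dense`, `forward`), the zero-byte padding for very short inputs, the ReLU on the hidden layer, and the choice of max pooling over average pooling are all assumptions made for the example.

```python
import math
import random


def encode_bytes(text, max_len):
    """Convert text to UTF-8 byte values (0-255), truncated to max_len."""
    return list(text.encode("utf-8"))[:max_len]


def conv1d_relu(seq, kernels, biases):
    """Valid 1D convolution over a sequence of vectors, followed by ReLU.

    seq: T vectors of length C_in
    kernels: F filters, each a list of K weight vectors of length C_in
    """
    k = len(kernels[0])
    out = []
    for t in range(len(seq) - k + 1):
        row = []
        for filt, bias in zip(kernels, biases):
            acc = bias
            for j in range(k):
                for x, w in zip(seq[t + j], filt[j]):
                    acc += x * w
            row.append(max(0.0, acc))
        out.append(row)
    return out


def global_max_pool(feature_map):
    """Maximum over the time axis, one value per filter."""
    return [max(column) for column in zip(*feature_map)]


def dense(vec, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [b + sum(x * w for x, w in zip(vec, row))
            for row, b in zip(weights, biases)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_random_params(rng, vocab=256, embed=4, filters=6, kernel=3, hidden=5):
    """Random toy weights; a trained model would load these instead."""
    def rand_matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                for _ in range(rows)]

    return {
        "embedding": rand_matrix(vocab, embed),
        "kernels": [rand_matrix(kernel, embed) for _ in range(filters)],
        "conv_bias": [0.0] * filters,
        "hidden_w": rand_matrix(hidden, filters),
        "hidden_b": [0.0] * hidden,
        "out_w": rand_matrix(1, hidden),
        "out_b": [0.0],
    }


def forward(text, params, max_len=512):
    ids = encode_bytes(text, max_len)
    kernel = len(params["kernels"][0])
    # Assumption: pad with zero bytes so short inputs yield at least one window.
    ids += [0] * max(0, kernel - len(ids))
    embedded = [params["embedding"][i] for i in ids]
    features = conv1d_relu(embedded, params["kernels"], params["conv_bias"])
    pooled = global_max_pool(features)
    hidden = [max(0.0, h)
              for h in dense(pooled, params["hidden_w"], params["hidden_b"])]
    return sigmoid(dense(hidden, params["out_w"], params["out_b"])[0])


params = make_random_params(random.Random(0))
print(forward("Hello world", params))
```

Because the weights are random, the printed score is meaningless; the point is only the order of operations and the shapes passed between stages.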
Installation
Clone the repository and import directly:
git clone <repository-url>
cd PinyByteCNN
python3 -c "from tinybytecnn.model import ByteCNN; print('Success')"
Usage
Basic Classification
from tinybytecnn.model import ByteCNN
model = ByteCNN(vocab_size=256, embed_dim=14, conv_filters=28,
                conv_kernel_size=3, hidden_dim=48)
# Single prediction
score = model.predict("This is a test message")
# Batch processing
texts = ["Hello", "Goodbye", "Test message"]
scores = [model.predict(text) for text in texts]
Multi-Layer Models
from tinybytecnn.multi_layer_optimized import MultiLayerByteCNN
# Define layer configuration
layers = [
    {"in_channels": 14, "out_channels": 28, "kernel_size": 3},
    {"in_channels": 28, "out_channels": 40, "kernel_size": 3},
]
model = MultiLayerByteCNN(layers_config=layers, hidden_dim=128, max_len=512)
score = model.predict("Multi-layer processing")
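In the configuration above, each layer's in_channels matches the previous layer's out_channels (14 to 28, then 28 to 40). A small check like the following can catch a mismatched configuration before a model is built. The function `validate_layers` is hypothetical and not part of the library, and the idea that the first layer's in_channels must equal the embedding dimension is an assumption based on the example values.

```python
def validate_layers(layers, embed_dim):
    """Check that channel counts chain correctly; return the final width.

    Hypothetical helper: assumes the first layer consumes the embedding
    output, so its in_channels should equal embed_dim.
    """
    expected = embed_dim
    for index, layer in enumerate(layers):
        if layer["in_channels"] != expected:
            raise ValueError(
                f"layer {index}: in_channels={layer['in_channels']}, "
                f"expected {expected}"
            )
        if layer["kernel_size"] < 1:
            raise ValueError(f"layer {index}: kernel_size must be >= 1")
        expected = layer["out_channels"]
    return expected


layers = [
    {"in_channels": 14, "out_channels": 28, "kernel_size": 3},
    {"in_channels": 28, "out_channels": 40, "kernel_size": 3},
]
print(validate_layers(layers, embed_dim=14))  # width fed to the dense layers
```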
Prediction Strategies
- truncate: Use the first max_len bytes (fastest)
- average: Average predictions over sliding windows
- attention: Weighted average with attention mechanism
score = model.predict("Long text...", strategy="average")
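To make the average strategy concrete, here is a minimal sketch of sliding-window averaging over bytes. It does not reproduce the library's internals: `sliding_windows`, `predict_average`, the `stride` parameter, and the toy `score_window` callable are all assumptions introduced for illustration.

```python
def sliding_windows(data, max_len, stride):
    """Split a byte string into windows of at most max_len bytes."""
    if len(data) <= max_len:
        return [data]
    starts = list(range(0, len(data) - max_len + 1, stride))
    # Add a final window if the stride left the tail uncovered.
    if starts[-1] + max_len < len(data):
        starts.append(len(data) - max_len)
    return [data[s:s + max_len] for s in starts]


def predict_average(text, score_window, max_len=512, stride=256):
    """Average a per-window score over all windows of the encoded text."""
    data = text.encode("utf-8")
    scores = [score_window(w) for w in sliding_windows(data, max_len, stride)]
    return sum(scores) / len(scores)


def toy_score(window):
    # Hypothetical scorer standing in for a model: fraction of '!' bytes.
    return window.count(b"!") / max(1, len(window))


print(predict_average("calm text " * 100 + "!!!", toy_score))
```

Windows are cut on byte boundaries, so a multi-byte UTF-8 character may be split between two windows; a byte-level model sees raw bytes either way.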
Testing
Run the test suite:
python3 -m unittest discover tests/
Smoke Tests
Validate against production models:
python3 tests/test_bytecnn_10k_smoke.py
Performance
| Model | Parameters | Accuracy | Inference Time |
|---|---|---|---|
| ByteCNN-10K | 10,009 | 78.97% | 0.5ms |
| ByteCNN-32K | 32,768 | 82.15% | 1.2ms |
Benchmarks were run on a MacBook Pro M1, single-threaded.
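Inference times like those in the table depend on hardware and input length, so it can be useful to measure them locally. The harness below is a sketch using only the standard library; `benchmark` and `stand_in_predict` are hypothetical names, and the stand-in workload should be replaced with a real model's predict method to get meaningful numbers.

```python
import statistics
import time


def benchmark(predict, texts, warmup=3, repeats=20):
    """Return the median per-call latency of predict() in milliseconds."""
    for _ in range(warmup):
        for text in texts:
            predict(text)
    timings = []
    for _ in range(repeats):
        for text in texts:
            start = time.perf_counter()
            predict(text)
            timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def stand_in_predict(text):
    # Placeholder workload; swap in model.predict for a real measurement.
    return sum(text.encode("utf-8")) % 2


median_ms = benchmark(stand_in_predict, ["Hello", "Goodbye", "Test message"])
print(f"median latency: {median_ms:.4f} ms")
```

The median is reported rather than the mean so that occasional slow calls, such as those interrupted by garbage collection, do not dominate the result.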
Production Deployment
PinyByteCNN is designed for edge deployment scenarios:
- Cloudflare Workers: Sub-10ms inference
- AWS Lambda: Cold start friendly
- Mobile/IoT: Minimal memory footprint
- Air-gapped Systems: No external dependencies
See DEPLOYMENT.md for detailed deployment guides.
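As a rough idea of what an AWS Lambda deployment might look like, the sketch below wraps a scoring function in a handler for an API Gateway proxy-style event. It is not taken from DEPLOYMENT.md: the request shape (a JSON body with a "text" field) is an assumption, and `score_text` is a hypothetical stand-in for a model's predict call so that the example runs on its own.

```python
import json


def score_text(text):
    # Hypothetical stand-in for model.predict(text); returns a float in [0, 1].
    # In a real deployment the model would be created once at module level
    # so that warm invocations reuse it.
    data = text.encode("utf-8")
    return data.count(b"!") / max(1, len(data))


def handler(event, context):
    """Lambda entry point: expects a JSON body like {"text": "..."}."""
    try:
        payload = json.loads(event.get("body") or "{}")
        text = payload["text"]
        if not isinstance(text, str):
            raise TypeError("text must be a string")
    except (ValueError, KeyError, TypeError):
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "expected JSON with a 'text' field"}),
        }
    return {"statusCode": 200, "body": json.dumps({"score": score_text(text)})}


response = handler({"body": json.dumps({"text": "Hello world"})}, None)
print(response["statusCode"], response["body"])
```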
Model Architecture Details
For detailed architecture information and training procedures, see ARCHITECTURE.md.
Development
Setup Development Environment
With uv (recommended):
# Install dev dependencies
uv sync --dev
# Run linting (performance-optimized rules)
uv run python scripts/lint.py
# Quick lint check
uv run ruff check tinybytecnn/
# Format code
uv run ruff format .
With pip:
# Install development tools
python scripts/setup_dev.py
# Run linting
python scripts/lint.py
Linting Philosophy
PinyByteCNN uses performance-focused linting rules:
- Core library (tinybytecnn/): Strict quality checks
- Performance exceptions: Complexity rules relaxed for optimization
- Documentation: Optional (prioritizes code density)
- Tests/Scripts: Lenient rules for development flexibility
Contributing
- Run python scripts/setup_dev.py to install dev tools
- Ensure python scripts/lint.py passes on the core library
- Maintain 80%+ test coverage with python scripts/coverage_analyzer.py
- Add tests for new features
License
MIT License - see LICENSE file for details.