Monitor and filter Fediverse hashtags, curate quality content, and distribute via external tools like Zhongli
FenLiu (分流)
Created by marvin8 with assistance from Claude and DeepSeek AI assistants.
⚠️ DISCLAIMER / PROVISO: This project is a work in progress with major changes still happening. It is in no way anywhere close to finished and is only borderline useful for actual production use. Expect breaking changes, incomplete features, and significant architectural evolution as development continues.
Divide the Fediverse content flow
FenLiu is a web application that monitors Fediverse hashtags, filters spam, allows human review, learns from feedback, and exports quality content for boosting. Inspired by the ancient Chinese Dujiangyan irrigation system (256 BC) that separated silt from water, FenLiu applies 2,300-year-old engineering wisdom to modern digital content streams.
Current Status — v0.7.0
FenLiu is a fully functional spam filtering and content management system with complete Curated Queue integration, flexible pattern-based user blocking, automated queue lifecycle management, production-ready containerization, and ML training data collection. Monitor hashtags, score posts for spam, manually review content, reliably export quality posts, and manage queue health with automatic cleanup and trimming.
Latest updates (v0.7.0): Review page pagination (20 posts/page), bulk approve/reject buttons, auto-refresh when page empties, ML training data snapshot collection on every review action, stream deletion cascade fix; 402 total tests passing.
Features
Core Functionality
- Hashtag Monitoring: Monitor multiple Fediverse hashtags with customizable instance sources and scheduling
- Spam Scoring: Rule-based detection (0-100 scale) with 7 intelligent detection rules
- Manual Review Interface: Web interface for reviewing and approving/rejecting posts with scoring
- Bulk Operations: Fetch and process posts in bulk with real-time progress tracking
- Curated Queue Export: API-driven queue with ack/nack/error reliability pattern
Reblog Controls (Export Filters)
- Pattern-Based User Blocking: Block users with flexible matching modes:
  - exact: Exact account identifier (e.g., @user@mastodon.social)
  - suffix: Block all users from a domain (e.g., bsky.app for all Bluesky users)
  - prefix: Block by username prefix (e.g., bot_ for bot accounts)
  - contains: Block by substring (e.g., spam for accounts with "spam" in the name)
- "Don't Reblog" Hashtag Blocklist: Exclude posts with blocked hashtags
- Attachments-Only Mode: Export only posts with media attachments
- Auto-Reject on Fetch: Automatically reject blocked content before review
- Blocklist Refresh: Apply Settings changes to review page instantly without losing progress
Web Interface
- Dashboard: Real-time analytics, top hashtags, review progress
- Streams Management: Create, edit, manage hashtag streams with CRUD operations
- Review Workflow: Approve/reject posts with manual score adjustment and spam breakdown
- Pagination: 20 posts per page with prev/next navigation
- Bulk Actions: Approve All / Reject All buttons for the current page
- Auto-refresh: Page reloads automatically when emptied but more posts remain
- Pattern Blocking Settings: Intuitive UI for adding pattern-based user blocks with examples
- Queue Preview: Monitor queue health (pending/reserved/delivered/error counts)
- Statistics: Charts for posts over time and hashtag distribution
- Responsive Design: Fully responsive across desktop, tablet, mobile
REST API
- Hashtag Streams: Full CRUD for stream management and bulk fetching
- Posts: List, filter, update with approval/rejection and scoring
- Curated Queue: /next, /ack, /nack, /error, /requeue endpoints
- Reblog Controls: Manage blocked users (with pattern types) and hashtags
- Statistics: Post counts, hashtag distribution, approval rates
- Authentication: API key-based authentication for queue endpoints
- Health: Health check and application info endpoints
Technical Quality
- Type Safety: Comprehensive type hints throughout
- Testing: 402 tests with 100% pass rate
- Resource Management: Proper cleanup of DB sessions and HTTP connections
- Database Migrations: Alembic with automatic schema migration on startup
- API Key Security: Secure generation and management of API keys
- Code Complexity: All functions optimized for maintainability
- No JavaScript Bloat: Pure HTML/CSS frontend, no external JS dependencies
Quick Start
Prerequisites
- Python 3.12 or higher
- uv package manager (recommended)
Installation
# Install dependencies
uv sync -U --all-groups
# Optional: Set up pre-commit hooks
uv run pre-commit install
Running the Application
# Development mode with auto-reload
fenliu --reload --debug
# Alternative development mode
uv run python -m fenliu --reload --debug
# Production mode
fenliu --host 0.0.0.0 --port 8000
# See all options
fenliu --help
Container Deployment (Docker/Podman)
FenLiu includes production-ready containerization with minimal image size (~207 MB):
podman build -t fenliu -f Containerfile .
cp .env.example .env # edit with your settings
podman run -d -p 8000:8000 \
-v fenliu-data:/app/data \
-v fenliu-logs:/app/logs \
--env-file .env \
fenliu
See the Container Deployment guide for full instructions including volumes, compose examples, and security notes.
First Steps
- Start the server: fenliu --reload
- Open browser: Navigate to http://localhost:8000
- Add a hashtag: Go to Streams page and create a hashtag stream (e.g., "python")
- Fetch posts: Click "Fetch" on the stream to retrieve posts from Fediverse
- Review posts: Use the Review interface to approve quality content or reject spam
- Block patterns: Go to Settings to add pattern-based blocks (optional)
- Export: Monitor the Queue Preview to see posts flowing to Curated Queue
Pattern-Based Blocking Examples
Settings Page Usage
- Go to Settings → Don't Reblog — Users
- Enter pattern: bsky.app
- Select type: suffix
- Click "Block"
- Result: All users from Bluesky are now blocked
Common Patterns
- Block all Bluesky users: Pattern bsky.app, Type suffix
- Block bot accounts: Pattern bot_, Type prefix
- Block accounts with spam keyword: Pattern spam, Type contains
- Block specific user: Pattern @user@mastodon.social, Type exact
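The four pattern types above can be sketched in a few lines of Python. This is a minimal illustration, not FenLiu's actual implementation: the function names, the case-insensitive comparison, and the stripping of a leading "@" are all assumptions made for the example.

```python
from typing import Literal

PatternType = Literal["exact", "suffix", "prefix", "contains"]


def normalize(account: str) -> str:
    """Lower-case the identifier and drop one leading '@' (assumed normalization)."""
    return account.strip().lower().removeprefix("@")


def matches(account: str, pattern: str, pattern_type: PatternType) -> bool:
    """Return True if the account identifier matches the block pattern."""
    acct = normalize(account)
    pat = normalize(pattern)
    if pattern_type == "exact":
        return acct == pat
    if pattern_type == "suffix":
        return acct.endswith(pat)
    if pattern_type == "prefix":
        return acct.startswith(pat)
    if pattern_type == "contains":
        return pat in acct
    raise ValueError(f"unknown pattern type: {pattern_type!r}")


def is_blocked(account: str, rules: list[tuple[str, PatternType]]) -> bool:
    """An account is blocked as soon as any single rule matches it."""
    return any(matches(account, pattern, kind) for pattern, kind in rules)
```

With the four rules from Common Patterns loaded into `rules`, a hypothetical account such as `@bot_news@example.social` would be caught by the prefix rule, while `@alice@example.social` would pass through.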
Applying to Review Page
- While reviewing posts, go to Settings to add new patterns
- Return to Review page
- Click Refresh Blocklists button (next to Refresh)
- Current posts instantly re-evaluated with new patterns
- Continue reviewing without page reload
Debug Logging
Enable detailed debug logging with the --debug flag:
# Enable debug logging to file
fenliu --debug
# View logs in real-time
tail -f logs/fenliu_debug.log
# Custom log directory
fenliu --debug --log-dir=/var/log/fenliu
In your code: import get_logger from fenliu.logging, create a logger with it, then call logger.debug("message").
API Usage
Authentication
All queue endpoints require API key authentication. Generate a key in Settings, then include it in requests:
curl -H "X-API-Key: your-api-key-here" \
http://localhost:8000/api/v1/curated/next
Common Examples
# List all hashtag streams
curl http://localhost:8000/api/v1/streams
# Create a new hashtag stream
curl -X POST http://localhost:8000/api/v1/streams \
-H "Content-Type: application/json" \
-d '{"hashtag": "python", "instance": "mastodon.social", "active": true}'
# Fetch posts for a stream
curl -X POST "http://localhost:8000/api/v1/streams/1/fetch?limit=20"
# Get next post from Curated Queue
curl -H "X-API-Key: your-api-key-here" \
http://localhost:8000/api/v1/curated/next
# Acknowledge successful reblog
curl -X POST -H "X-API-Key: your-api-key-here" \
http://localhost:8000/api/v1/curated/123/ack
# Report permanent failure
curl -X POST -H "X-API-Key: your-api-key-here" \
-H "Content-Type: application/json" \
-d '{"reason": "Account suspended"}' \
http://localhost:8000/api/v1/curated/123/error
# Review a post (approve)
curl -X PATCH http://localhost:8000/api/v1/posts/123 \
-H "Content-Type: application/json" \
-d '{"approved": true, "reviewer_notes": "Quality content"}'
# Adjust spam score manually
curl -X PATCH http://localhost:8000/api/v1/posts/123 \
-H "Content-Type: application/json" \
-d '{"manual_spam_score": 15}'
# Add a pattern-based block (suffix type)
curl -X POST http://localhost:8000/api/v1/reblog-controls/blocked-users \
-H "Content-Type: application/json" \
-d '{"account_identifier": "bsky.app", "pattern_type": "suffix", "notes": "Block all Bluesky"}'
# List blocked users with pattern types
curl http://localhost:8000/api/v1/reblog-controls/blocked-users
API Endpoints
Streams & Posts:
- GET /api/v1/streams - List streams
- POST /api/v1/streams - Create stream
- GET/PUT/DELETE /api/v1/streams/{id} - Stream operations
- POST /api/v1/streams/{id}/fetch - Fetch posts for stream
- POST /api/v1/streams/fetch-all - Fetch all active streams
- GET /api/v1/posts - List posts with filtering
- GET /api/v1/posts/{id} - Get post details
- PATCH /api/v1/posts/{id} - Update post (review, approve, score)
- GET /api/v1/stats - Application statistics
Curated Queue:
- GET /api/v1/curated/next - Get next post (returns 204 if empty)
- POST /api/v1/curated/{post_id}/ack - Confirm successful reblog
- POST /api/v1/curated/{post_id}/nack - Return to queue (transient failure)
- POST /api/v1/curated/{post_id}/error - Mark permanently failed
- POST /api/v1/curated/{post_id}/requeue - Return errored post to queue
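One way to read the Curated Queue endpoints is as a small state machine over the four states shown in Queue Preview (pending, reserved, delivered, error). The sketch below is an interpretation of the endpoint descriptions, not FenLiu's code: the `State` enum, the transition table, and the idea that an expired reservation falls back to pending after RESERVE_TIMEOUT_SECONDS are assumptions.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    RESERVED = "reserved"
    DELIVERED = "delivered"
    ERROR = "error"


# (endpoint action, current state) -> next state
TRANSITIONS: dict[tuple[str, State], State] = {
    ("next", State.PENDING): State.RESERVED,    # consumer picks up the post
    ("ack", State.RESERVED): State.DELIVERED,   # reblog succeeded
    ("nack", State.RESERVED): State.PENDING,    # transient failure, retry later
    ("error", State.RESERVED): State.ERROR,     # permanent failure
    ("requeue", State.ERROR): State.PENDING,    # operator puts it back
}


def transition(action: str, state: State) -> State:
    """Apply one endpoint action; reject actions that make no sense in this state."""
    try:
        return TRANSITIONS[(action, state)]
    except KeyError:
        raise ValueError(f"{action!r} is not valid for a {state.value} post") from None


def expire_reservation(state: State, reserved_for: float, timeout: float = 300.0) -> State:
    """Assumed behaviour: a reservation older than the timeout returns to pending."""
    if state is State.RESERVED and reserved_for >= timeout:
        return State.PENDING
    return state
```

Under this reading, a consumer that crashes between /next and /ack does not lose the post; it becomes available again once the reservation expires.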
Reblog Controls (Pattern-Based Blocking):
- GET /api/v1/reblog-controls/settings - Get reblog filter settings
- PUT /api/v1/reblog-controls/settings - Update settings
- GET /api/v1/reblog-controls/blocked-users - List blocked users with pattern types
- POST /api/v1/reblog-controls/blocked-users - Add blocked user (with pattern_type)
- DELETE /api/v1/reblog-controls/blocked-users/{id} - Remove blocked user
- GET /api/v1/reblog-controls/blocked-hashtags - List blocked hashtags
- POST /api/v1/reblog-controls/blocked-hashtags - Add blocked hashtag
- DELETE /api/v1/reblog-controls/blocked-hashtags/{id} - Remove blocked hashtag
- POST /api/v1/reblog-controls/reject-blocked - Bulk reject posts matching any pattern
System:
- GET /health - Health check
- GET /info - Application info
Configuration
Environment variables (via .env file):
# Database
DATABASE_URL=sqlite:///./fenliu.db
# Fediverse settings
DEFAULT_INSTANCE=mastodon.social
API_TIMEOUT=30
MAX_POSTS_PER_FETCH=20
RATE_LIMIT_DELAY=1.0
# Application
DEBUG=false
SECRET_KEY=your-secret-key-change-in-production
APP_NAME=FenLiu
# Spam scoring thresholds
VERY_HIGH_THRESHOLD=76
LOW_MAX_THRESHOLD=25
# Queue timeout
RESERVE_TIMEOUT_SECONDS=300
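To make the two spam scoring thresholds concrete, the following sketch shows how a 0-100 score might be bucketed with them. The band names ("low", "medium", "very_high"), the inclusive boundaries, and the helper functions are assumptions for illustration; only the variable names and default values come from the configuration above.

```python
import os
from collections.abc import Mapping


def load_thresholds(env: Mapping[str, str] = os.environ) -> tuple[int, int]:
    """Read both thresholds, falling back to the documented defaults."""
    low_max = int(env.get("LOW_MAX_THRESHOLD", "25"))
    very_high = int(env.get("VERY_HIGH_THRESHOLD", "76"))
    if not 0 <= low_max < very_high <= 100:
        raise ValueError("thresholds must satisfy 0 <= low_max < very_high <= 100")
    return low_max, very_high


def classify(score: int, low_max: int, very_high: int) -> str:
    """Bucket a spam score; boundaries are inclusive on the named side (assumed)."""
    if not 0 <= score <= 100:
        raise ValueError("spam score must be between 0 and 100")
    if score <= low_max:
        return "low"
    if score >= very_high:
        return "very_high"
    return "medium"
```

Passing the environment in as a mapping makes the threshold parsing easy to exercise with a plain dictionary.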
Development
Testing
# Run full test suite
pytest
# Run with coverage
pytest --cov=src/fenliu tests/
# Quick validation
python -m pytest -q
# Run specific test file
pytest tests/test_pattern_blocking.py -v
Code Quality
# Linting
ruff check src/fenliu/
# Formatting
ruff format src/fenliu/
# Complexity check
complexipy src
# Pre-commit checks
prek run --all-files
# Full CI simulation
nox
Database Migrations
# Apply pending migrations
alembic upgrade head
# Create new migration
alembic revision --autogenerate -m "description"
# Show current revision
alembic current
# View all revisions
alembic history
Development Workflow
# After dependency changes
uv sync -U --all-groups
# Quick validation before commits
prek run --all-files
# Full validation before commits
nox
Project Structure
fenliu/
├── src/fenliu/
│ ├── __init__.py # Package definition
│ ├── __main__.py # CLI entry point
│ ├── main.py # PyView application
│ ├── config.py # Configuration
│ ├── database.py # Database setup
│ ├── models.py # SQLAlchemy models
│ ├── schemas.py # Pydantic validation
│ ├── api/ # REST API endpoints
│ │ ├── curated.py # Queue API
│ │ ├── reblog_controls.py # Filter management (pattern-based)
│ │ └── api_keys.py # API key management
│ ├── services/ # Business logic
│ │ ├── spam_scoring.py # Spam detection
│ │ ├── fediverse.py # Fediverse client
│ │ ├── export_eligibility.py # Export filtering with pattern matching
│ │ ├── scheduler.py # Task scheduling
│ │ └── api_key.py # API key service
│ ├── templates/ # HTML templates
│ └── static/ # CSS and assets
├── alembic/ # Database migrations
├── tests/ # Test suite (402 tests)
├── docs/ # MkDocs documentation
├── pyproject.toml # Project configuration
├── ROADMAP.md # Development roadmap
├── README.md # This file
└── PATTERN_BLOCKING_FEATURE.md # Pattern blocking documentation
Documentation
Complete documentation available in the docs/ folder built with MkDocs:
# Serve locally with hot reload
mkdocs serve
# Build static site
mkdocs build
📚 Live Documentation: https://marvinsmastodontools.codeberg.page/fenliu/
Includes: Installation, Quick Start, API Reference, Pattern Blocking Guide, Curated Queue Integration, Contributing Guide, Roadmap, and FAQ.
Technical Stack
- Framework: PyView (Starlette-based LiveView) with real-time capabilities
- Database: SQLAlchemy with SQLite, optimized with eager loading
- API Client: minimal-activitypub for Fediverse integration
- Async: Full async/await throughout (sync for SQLite only)
- Type Hints: Comprehensive type annotations with Pydantic validation
- Frontend: Jinja2 templates with Tailwind CSS, responsive design
- Testing: pytest with 402 tests (100% pass rate)
- Linting: ruff for formatting and linting
- Migrations: Alembic for schema management
- Package Manager: uv for dependency management
Upcoming Features
See Roadmap for detailed plans. Phase 4 focus:
- Docker containerization and CI/CD
- Performance optimization and caching for pattern matching
- Multi-user support with roles
- Advanced monitoring dashboard
- PostgreSQL/MySQL support
What's New in v0.7.0
Review Page Improvements
The review workflow is now faster and more ergonomic for large queues:
- Pagination: Posts are shown 20 at a time with prev/next navigation — no more infinite scrolling through hundreds of posts
- Bulk Actions: "Approve All" and "Reject All" buttons at the bottom-right of the table act on all posts currently visible on the page (already individually reviewed posts are excluded)
- Auto-refresh: When the current page is emptied by reviewing all posts, the page automatically loads the next batch if more unreviewed posts exist
- Scroll-to-top: Page scrolls to the top automatically when navigating between pages or when the auto-refresh triggers
ML Training Data Collection
Review decisions now capture a full feature snapshot at review time, so ML training data survives the queue cleanup job that deletes old posts:
- Snapshot fields on ReviewFeedback: content snippet, spam score, hashtag count/list, attachment count, video flag, engagement counts (boosts/likes/replies), author bot flag, instance, stream ID
- Complete coverage: Snapshots are captured for every approve, reject, score-adjust, and bulk action from the LiveView UI (previously only REST API reviews were recorded)
- Post-deletion safe: Training data is self-contained in ReviewFeedback rows and does not depend on the original post existing
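The snapshot idea can be illustrated with a plain dataclass that copies every listed field out of a post at review time. This is a sketch only: `ReviewSnapshot`, `snapshot_from_post`, the `decision` field, the dictionary keys, and the 200-character snippet length are hypothetical and do not reflect the real ReviewFeedback model.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ReviewSnapshot:
    """Self-contained copy of the features listed above (field names assumed)."""

    decision: str
    content_snippet: str
    spam_score: int
    hashtags: tuple[str, ...]
    attachment_count: int
    has_video: bool
    boosts: int
    likes: int
    replies: int
    author_is_bot: bool
    instance: str
    stream_id: int

    @property
    def hashtag_count(self) -> int:
        return len(self.hashtags)


def snapshot_from_post(post: dict[str, Any], decision: str, snippet_len: int = 200) -> ReviewSnapshot:
    """Copy values out of the post so the snapshot survives the post's deletion."""
    attachments = post.get("attachments", [])
    return ReviewSnapshot(
        decision=decision,
        content_snippet=post.get("content", "")[:snippet_len],
        spam_score=int(post.get("spam_score", 0)),
        hashtags=tuple(post.get("hashtags", [])),
        attachment_count=len(attachments),
        has_video=any(a.get("type") == "video" for a in attachments),
        boosts=int(post.get("boosts", 0)),
        likes=int(post.get("likes", 0)),
        replies=int(post.get("replies", 0)),
        author_is_bot=bool(post.get("author_is_bot", False)),
        instance=post.get("instance", ""),
        stream_id=int(post.get("stream_id", 0)),
    )
```

Because the snapshot holds copies rather than a reference to the post, clearing or deleting the source afterwards leaves the training record intact.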
Bug Fix: Stream Deletion
Deleting a hashtag stream no longer raises an integrity error. Previously, cascade-deleting posts would fail because SQLAlchemy tried to NULL-out review_feedback.post_id (a NOT NULL column) rather than deleting the orphaned rows. The Post → ReviewFeedback relationship now uses cascade="all, delete-orphan".
Code Quality
- 13 new tests: cascade delete (2), ReviewFeedback creation on review actions (3), pagination and scroll behaviour (8); 402 total
- Type safety: Zero errors under ty check
- Linting: All code passes ruff checks
Previous Release — v0.6.0
Queue Lifecycle Management
- Auto-Delete Delivered Posts: Posts automatically deleted after 7 days (configurable), with historical stats preserved
- Trim Excess Pending Posts: Weighted random deletion maintains invariant: pending_count ≥ 2 × daily_consumption_rate
- Cleanup API Endpoints: POST /api/v1/curated/cleanup and POST /api/v1/curated/trim-pending
- Queue UI Controls: "Purge old delivered" and "Trim excess pending" buttons on Queue Preview page
- Historical Stats: All-time deletion counts preserved; stats page shows active and historical data
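The trimming invariant can be worked through with a short sketch. It assumes that "excess" means everything above the floor of 2 × daily_consumption_rate, and it uses spam score as the deletion weight; the README does not say what the weights are, so that choice and both function names are purely illustrative.

```python
import math
import random


def trim_count(pending_count: int, daily_consumption_rate: float) -> int:
    """How many pending posts may be deleted while keeping pending >= 2 x rate."""
    floor = math.ceil(2 * daily_consumption_rate)
    return max(0, pending_count - floor)


def pick_for_deletion(weights: dict[int, float], k: int, rng: random.Random) -> set[int]:
    """Weighted random sampling without replacement over post IDs.

    Each weight must be positive; a larger weight makes deletion more likely.
    """
    remaining = dict(weights)
    chosen: set[int] = set()
    for _ in range(min(k, len(remaining))):
        ids = list(remaining)
        pick = rng.choices(ids, weights=[remaining[i] for i in ids], k=1)[0]
        chosen.add(pick)
        del remaining[pick]
    return chosen
```

For example, with 10 pending posts and a consumption rate of 3 posts per day, the floor is 6, so at most 4 posts are trimmed and the invariant still holds afterwards.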
Production Containerization
- Multi-stage Dockerfile: Minimal final image (~207 MB)
- Non-root User: Runs as fenliu (UID 1000)
- Persistent Volumes: Separate data and logs volumes
- Automatic Migrations: Schema migrated automatically on container startup
- Docker/Podman Support: Works with both runtimes
Previous Release — v0.5.3
Pattern-Based User Blocking (v0.5.3)
Users can block Fediverse accounts using flexible pattern matching:
- Four Pattern Types: exact, suffix, prefix, contains
- Real-World Examples: Block all Bluesky users, all bot accounts, or any account with a keyword
- Settings UI: Intuitive pattern selector with helpful examples
- Review Page Integration: Pattern-based blocks show on review page with instant visibility
- Blocklist Refresh: New button allows applying Settings changes to review page without losing progress
See PATTERN_BLOCKING_FEATURE.md for complete details and examples.
Cultural Context
The name "FenLiu" (分流) means "divide the flow" in Chinese, inspired by the ancient Dujiangyan irrigation system (256 BC). This project applies the same engineering wisdom to digital content streams, separating valuable content from spam and noise while maintaining the natural flow of community conversation.
Key Resources
- Roadmap - Development plans and future features
- Pattern Blocking Guide - Detailed pattern matching documentation
- LLM System Prompt - Development standards
- Live Docs - Complete documentation
License
AGPL-3.0 License - See LICENSE file for details.
Contributing
- Follow existing code style (ruff formatted with comprehensive type hints)
- Write tests for new functionality (maintain 100% test pass rate)
- Update documentation as needed
- Run nox before submitting changes
- Run alembic upgrade head after pulling changes with new migrations
- Version: 0.7.0
- Status: Production Ready ✅
- Released: 2026-03-14
- Tests: 402 passing ✅
- Code Quality: All checks passing ✅
- Container Size: ~207 MB (multi-stage optimized)
- Framework: PyView (Starlette-based LiveView)
- Architecture: Async Python with comprehensive type hints
- Repository: https://codeberg.org/marvinsmastodontools/fenliu