Monitor and filter Fediverse hashtags, curate quality content, and distribute via external tools like Zhongli

FenLiu (分流)

Created by marvin8 with assistance from Claude and DeepSeek AI assistants.

⚠️ DISCLAIMER / PROVISO: This project is a work in progress and is still undergoing major changes. It is far from finished and only borderline suitable for production use. Expect breaking changes, incomplete features, and significant architectural evolution as development continues.

Divide the Fediverse content flow

FenLiu is a web application that monitors Fediverse hashtags, filters spam, allows human review, learns from feedback, and exports quality content for boosting. Inspired by the ancient Chinese Dujiangyan irrigation system (256 BC), which separated silt from water, FenLiu applies 2,300-year-old engineering wisdom to modern digital content streams.

Current Status — v0.7.0

FenLiu is a fully functional spam filtering and content management system with complete Curated Queue integration, flexible pattern-based user blocking, automated queue lifecycle management, production-ready containerization, and ML training data collection. Monitor hashtags, score posts for spam, manually review content, reliably export quality posts, and manage queue health with automatic cleanup and trimming.

Latest updates (v0.7.0): Review page pagination (20 posts/page), bulk approve/reject buttons, auto-refresh when page empties, ML training data snapshot collection on every review action, stream deletion cascade fix; 402 total tests passing.

Features

Core Functionality

Hashtag Monitoring: Monitor multiple Fediverse hashtags with customizable instance sources and scheduling
Spam Scoring: Rule-based detection (0-100 scale) with 7 intelligent detection rules
Manual Review Interface: Web interface for reviewing and approving/rejecting posts with scoring
Bulk Operations: Fetch and process posts in bulk with real-time progress tracking
Curated Queue Export: API-driven queue with ack/nack/error reliability pattern

Reblog Controls (Export Filters)

Pattern-Based User Blocking: Block users with flexible matching modes:
- exact: Exact account identifier (e.g., @user@mastodon.social)
- suffix: Block all users from domain (e.g., bsky.app for all Bluesky users)
- prefix: Block by username prefix (e.g., bot_ for bot accounts)
- contains: Block by substring (e.g., spam for accounts with "spam" in name)
"Don't Reblog" Hashtag Blocklist: Exclude posts with blocked hashtags
Attachments-Only Mode: Export only posts with media attachments
Auto-Reject on Fetch: Automatically reject blocked content before review
Blocklist Refresh: Apply Settings changes to review page instantly without losing progress

Web Interface

Dashboard: Real-time analytics, top hashtags, review progress
Streams Management: Create, edit, manage hashtag streams with CRUD operations
Review Workflow: Approve/reject posts with manual score adjustment and spam breakdown
- Pagination: 20 posts per page with prev/next navigation
- Bulk Actions: Approve All / Reject All buttons for the current page
- Auto-refresh: Page reloads automatically when emptied but more posts remain
Pattern Blocking Settings: Intuitive UI for adding pattern-based user blocks with examples
Queue Preview: Monitor queue health (pending/reserved/delivered/error counts)
Statistics: Charts for posts over time and hashtag distribution
Responsive Design: Fully responsive across desktop, tablet, mobile

REST API

Hashtag Streams: Full CRUD for stream management and bulk fetching
Posts: List, filter, update with approval/rejection and scoring
Curated Queue: /next, /ack, /nack, /error, /requeue endpoints
Reblog Controls: Manage blocked users (with pattern types) and hashtags
Statistics: Post counts, hashtag distribution, approval rates
Authentication: API key-based authentication for queue endpoints
Health: Health check and application info endpoints

Technical Quality

Type Safety: Comprehensive type hints throughout
Testing: 402 tests with 100% pass rate
Resource Management: Proper cleanup of DB sessions and HTTP connections
Database Migrations: Alembic with automatic schema migration on startup
API Key Security: Secure generation and management of API keys
Code Complexity: All functions optimized for maintainability
No JavaScript Bloat: Pure HTML/CSS frontend, no external JS dependencies

Quick Start

Prerequisites

Python 3.12 or higher
uv package manager (recommended)

Installation

# Install dependencies
uv sync -U --all-groups

# Optional: Set up pre-commit hooks
uv run pre-commit install

Running the Application

# Development mode with auto-reload
fenliu --reload --debug

# Alternative development mode
uv run python -m fenliu --reload --debug

# Production mode
fenliu --host 0.0.0.0 --port 8000

# See all options
fenliu --help

Container Deployment (Docker/Podman)

FenLiu includes production-ready containerization with minimal image size (~207 MB):

podman build -t fenliu -f Containerfile .
cp .env.example .env  # edit with your settings
podman run -d -p 8000:8000 \
  -v fenliu-data:/app/data \
  -v fenliu-logs:/app/logs \
  --env-file .env \
  fenliu

See the Container Deployment guide for full instructions including volumes, compose examples, and security notes.

First Steps

Start the server: fenliu --reload
Open browser: Navigate to http://localhost:8000
Add a hashtag: Go to Streams page and create a hashtag stream (e.g., "python")
Fetch posts: Click "Fetch" on the stream to retrieve posts from Fediverse
Review posts: Use the Review interface to approve quality content or reject spam
Block patterns: Go to Settings to add pattern-based blocks (optional)
Export: Monitor the Queue Preview to see posts flowing to Curated Queue

Pattern-Based Blocking Examples

Settings Page Usage

Go to Settings → Don't Reblog — Users
Enter pattern: bsky.app
Select type: suffix
Click "Block"
Result: All users from Bluesky are now blocked

Common Patterns

Block all Bluesky users: Pattern bsky.app, Type suffix
Block bot accounts: Pattern bot_, Type prefix
Block accounts with spam keyword: Pattern spam, Type contains
Block specific user: Pattern @user@mastodon.social, Type exact
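
The four pattern types can be pictured as simple string tests on the account identifier. The following is a minimal sketch, not FenLiu's actual matching code: `BlockPattern`, `matches`, the case-insensitive comparison, and the choice to apply `prefix` to the username part only are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockPattern:
    """Hypothetical container for one blocklist entry."""

    pattern: str
    pattern_type: str  # "exact", "suffix", "prefix", or "contains"


def matches(account: str, block: BlockPattern) -> bool:
    """Return True if an identifier such as '@user@domain' matches the block."""
    acct = account.lower().lstrip("@")
    pat = block.pattern.lower()
    # Username is everything before the first remaining '@'.
    username = acct.partition("@")[0]
    if block.pattern_type == "exact":
        return acct == pat.lstrip("@")
    if block.pattern_type == "suffix":
        # Assumed: compare against the end of the whole identifier, so a
        # domain such as bsky.app matches every account hosted there.
        return acct.endswith(pat)
    if block.pattern_type == "prefix":
        # Assumed: the prefix applies to the username, not the domain.
        return username.startswith(pat)
    if block.pattern_type == "contains":
        return pat in acct
    raise ValueError(f"unknown pattern type: {block.pattern_type}")


blocklist = [
    BlockPattern("bsky.app", "suffix"),
    BlockPattern("bot_", "prefix"),
    BlockPattern("spam", "contains"),
    BlockPattern("@user@mastodon.social", "exact"),
]


def is_blocked(account: str) -> bool:
    """Check an account against every entry in the example blocklist."""
    return any(matches(account, block) for block in blocklist)
```

Note that a plain suffix test would also match a domain like `notbsky.app`; a stricter variant could require the match to start at a domain boundary.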

Applying to Review Page

While reviewing posts, go to Settings to add new patterns
Return to Review page
Click Refresh Blocklists button (next to Refresh)
Current posts instantly re-evaluated with new patterns
Continue reviewing without page reload

Debug Logging

Enable detailed debug logging with the --debug flag:

# Enable debug logging to file
fenliu --debug

# View logs in real-time
tail -f logs/fenliu_debug.log

# Custom log directory
fenliu --debug --log-dir=/var/log/fenliu

In your code, import get_logger from fenliu.logging, use it to obtain a logger, then call logger.debug("message").

API Usage

Authentication

All queue endpoints require API key authentication. Generate a key in Settings, then include it in requests:

curl -H "X-API-Key: your-api-key-here" \
  http://localhost:8000/api/v1/curated/next

Common Examples

# List all hashtag streams
curl http://localhost:8000/api/v1/streams

# Create a new hashtag stream
curl -X POST http://localhost:8000/api/v1/streams \
  -H "Content-Type: application/json" \
  -d '{"hashtag": "python", "instance": "mastodon.social", "active": true}'

# Fetch posts for a stream
curl -X POST "http://localhost:8000/api/v1/streams/1/fetch?limit=20"

# Get next post from Curated Queue
curl -H "X-API-Key: your-api-key-here" \
  http://localhost:8000/api/v1/curated/next

# Acknowledge successful reblog
curl -X POST -H "X-API-Key: your-api-key-here" \
  http://localhost:8000/api/v1/curated/123/ack

# Report permanent failure
curl -X POST -H "X-API-Key: your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{"reason": "Account suspended"}' \
  http://localhost:8000/api/v1/curated/123/error

# Review a post (approve)
curl -X PATCH http://localhost:8000/api/v1/posts/123 \
  -H "Content-Type: application/json" \
  -d '{"approved": true, "reviewer_notes": "Quality content"}'

# Adjust spam score manually
curl -X PATCH http://localhost:8000/api/v1/posts/123 \
  -H "Content-Type: application/json" \
  -d '{"manual_spam_score": 15}'

# Add a pattern-based block (suffix type)
curl -X POST http://localhost:8000/api/v1/reblog-controls/blocked-users \
  -H "Content-Type: application/json" \
  -d '{"account_identifier": "bsky.app", "pattern_type": "suffix", "notes": "Block all Bluesky"}'

# List blocked users with pattern types
curl http://localhost:8000/api/v1/reblog-controls/blocked-users

API Endpoints

Streams & Posts:

GET /api/v1/streams - List streams
POST /api/v1/streams - Create stream
GET/PUT/DELETE /api/v1/streams/{id} - Stream operations
POST /api/v1/streams/{id}/fetch - Fetch posts for stream
POST /api/v1/streams/fetch-all - Fetch all active streams
GET /api/v1/posts - List posts with filtering
GET /api/v1/posts/{id} - Get post details
PATCH /api/v1/posts/{id} - Update post (review, approve, score)
GET /api/v1/stats - Application statistics

Curated Queue:

GET /api/v1/curated/next - Get next post (returns 204 if empty)
POST /api/v1/curated/{post_id}/ack - Confirm successful reblog
POST /api/v1/curated/{post_id}/nack - Return to queue (transient failure)
POST /api/v1/curated/{post_id}/error - Mark permanently failed
POST /api/v1/curated/{post_id}/requeue - Return errored post to queue

Reblog Controls (Pattern-Based Blocking):

GET /api/v1/reblog-controls/settings - Get reblog filter settings
PUT /api/v1/reblog-controls/settings - Update settings
GET /api/v1/reblog-controls/blocked-users - List blocked users with pattern types
POST /api/v1/reblog-controls/blocked-users - Add blocked user (with pattern_type)
DELETE /api/v1/reblog-controls/blocked-users/{id} - Remove blocked user
GET /api/v1/reblog-controls/blocked-hashtags - List blocked hashtags
POST /api/v1/reblog-controls/blocked-hashtags - Add blocked hashtag
DELETE /api/v1/reblog-controls/blocked-hashtags/{id} - Remove blocked hashtag
POST /api/v1/reblog-controls/reject-blocked - Bulk reject posts matching any pattern

System:

GET /health - Health check
GET /info - Application info

Configuration

Environment variables (via .env file):

# Database
DATABASE_URL=sqlite:///./fenliu.db

# Fediverse settings
DEFAULT_INSTANCE=mastodon.social
API_TIMEOUT=30
MAX_POSTS_PER_FETCH=20
RATE_LIMIT_DELAY=1.0

# Application
DEBUG=false
SECRET_KEY=your-secret-key-change-in-production
APP_NAME=FenLiu

# Spam scoring thresholds
VERY_HIGH_THRESHOLD=76
LOW_MAX_THRESHOLD=25

# Queue timeout
RESERVE_TIMEOUT_SECONDS=300

Development

Testing

# Run full test suite
pytest

# Run with coverage
pytest --cov=src/fenliu tests/

# Quick validation
python -m pytest -q

# Run specific test file
pytest tests/test_pattern_blocking.py -v

Code Quality

# Linting
ruff check src/fenliu/

# Formatting
ruff format src/fenliu/

# Complexity check
complexipy src

# Pre-commit checks
prek run --all-files

# Full CI simulation
nox

Database Migrations

# Apply pending migrations
alembic upgrade head

# Create new migration
alembic revision --autogenerate -m "description"

# Show current revision
alembic current

# View all revisions
alembic history

Development Workflow

# After dependency changes
uv sync -U --all-groups

# Quick validation before commits
prek run --all-files

# Full validation before commits
nox

Project Structure

fenliu/
├── src/fenliu/
│   ├── __init__.py              # Package definition
│   ├── __main__.py              # CLI entry point
│   ├── main.py                  # PyView application
│   ├── config.py                # Configuration
│   ├── database.py              # Database setup
│   ├── models.py                # SQLAlchemy models
│   ├── schemas.py               # Pydantic validation
│   ├── api/                     # REST API endpoints
│   │   ├── curated.py           # Queue API
│   │   ├── reblog_controls.py   # Filter management (pattern-based)
│   │   └── api_keys.py          # API key management
│   ├── services/                # Business logic
│   │   ├── spam_scoring.py      # Spam detection
│   │   ├── fediverse.py         # Fediverse client
│   │   ├── export_eligibility.py # Export filtering with pattern matching
│   │   ├── scheduler.py         # Task scheduling
│   │   └── api_key.py           # API key service
│   ├── templates/               # HTML templates
│   └── static/                  # CSS and assets
├── alembic/                     # Database migrations
├── tests/                       # Test suite (402 tests)
├── docs/                        # MkDocs documentation
├── pyproject.toml               # Project configuration
├── ROADMAP.md                   # Development roadmap
├── README.md                    # This file
└── PATTERN_BLOCKING_FEATURE.md  # Pattern blocking documentation

Documentation

Complete documentation available in the docs/ folder built with MkDocs:

# Serve locally with hot reload
mkdocs serve

# Build static site
mkdocs build

📚 Live Documentation: https://marvinsmastodontools.codeberg.page/fenliu/

Includes: Installation, Quick Start, API Reference, Pattern Blocking Guide, Curated Queue Integration, Contributing Guide, Roadmap, and FAQ.

Technical Stack

Framework: PyView (Starlette-based LiveView) with real-time capabilities
Database: SQLAlchemy with SQLite, optimized with eager loading
API Client: minimal-activitypub for Fediverse integration
Async: Full async/await throughout (sync for SQLite only)
Type Hints: Comprehensive type annotations with Pydantic validation
Frontend: Jinja2 templates with Tailwind CSS, responsive design
Testing: pytest with 402 tests (100% pass rate)
Linting: ruff for formatting and linting
Migrations: Alembic for schema management
Package Manager: uv for dependency management

Upcoming Features

See Roadmap for detailed plans. Phase 4 focus:

Docker containerization and CI/CD
Performance optimization and caching for pattern matching
Multi-user support with roles
Advanced monitoring dashboard
PostgreSQL/MySQL support

What's New in v0.7.0

Review Page Improvements

The review workflow is now faster and more ergonomic for large queues:

Pagination: Posts are shown 20 at a time with prev/next navigation — no more infinite scrolling through hundreds of posts
Bulk Actions: "Approve All" and "Reject All" buttons at the bottom-right of the table act on all posts currently visible on the page (already individually reviewed posts are excluded)
Auto-refresh: When the current page is emptied by reviewing all posts, the page automatically loads the next batch if more unreviewed posts exist
Scroll-to-top: Page scrolls to the top automatically when navigating between pages or when the auto-refresh triggers
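
The pagination and auto-refresh rules above reduce to a little arithmetic. The sketch below is illustrative only; the function names are hypothetical and the real LiveView code may be organised differently.

```python
import math

PAGE_SIZE = 20  # posts per page, as described above


def total_pages(post_count: int) -> int:
    """Number of pages needed; an empty list still shows one page."""
    return max(1, math.ceil(post_count / PAGE_SIZE))


def page_slice(post_ids: list[int], page: int) -> list[int]:
    """Return the post IDs shown on a 1-based page number."""
    start = (page - 1) * PAGE_SIZE
    return post_ids[start:start + PAGE_SIZE]


def should_auto_refresh(visible_unreviewed: int, remaining_unreviewed: int) -> bool:
    """Reload only when the page is empty but unreviewed posts remain."""
    return visible_unreviewed == 0 and remaining_unreviewed > 0
```

With 45 unreviewed posts this gives three pages, the last holding five posts.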

ML Training Data Collection

Review decisions now capture a full feature snapshot at review time, so ML training data survives the queue cleanup job that deletes old posts:

Snapshot fields on ReviewFeedback: content snippet, spam score, hashtag count/list, attachment count, video flag, engagement counts (boosts/likes/replies), author bot flag, instance, stream ID
Complete coverage: Snapshots are captured for every approve, reject, score-adjust, and bulk action from the LiveView UI (previously only REST API reviews were recorded)
Post-deletion safe: Training data is self-contained in ReviewFeedback rows and does not depend on the original post existing

Bug Fix: Stream Deletion

Deleting a hashtag stream no longer raises an integrity error. Previously, cascade-deleting posts would fail because SQLAlchemy tried to NULL-out review_feedback.post_id (a NOT NULL column) rather than deleting the orphaned rows. The Post → ReviewFeedback relationship now uses cascade="all, delete-orphan".
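
The underlying constraint can be reproduced without the ORM. The sketch below uses the standard-library sqlite3 module rather than SQLAlchemy, with a simplified two-table schema that is an assumption made for illustration: nulling out a NOT NULL foreign key fails, whereas deleting the child rows first (which is what a delete-orphan cascade amounts to) succeeds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE posts (id INTEGER PRIMARY KEY);
    CREATE TABLE review_feedback (
        id INTEGER PRIMARY KEY,
        post_id INTEGER NOT NULL REFERENCES posts(id)
    );
    INSERT INTO posts (id) VALUES (1);
    INSERT INTO review_feedback (id, post_id) VALUES (10, 1);
    """
)


def null_out_children(db: sqlite3.Connection, post_id: int) -> None:
    """Roughly what the ORM attempted before the fix."""
    db.execute(
        "UPDATE review_feedback SET post_id = NULL WHERE post_id = ?", (post_id,)
    )


def delete_post_with_feedback(db: sqlite3.Connection, post_id: int) -> None:
    """Delete the child rows first, then the parent, in one transaction."""
    with db:
        db.execute("DELETE FROM review_feedback WHERE post_id = ?", (post_id,))
        db.execute("DELETE FROM posts WHERE id = ?", (post_id,))


try:
    null_out_children(conn, 1)
    null_out_failed = False
except sqlite3.IntegrityError:
    # NOT NULL constraint on review_feedback.post_id rejects the update.
    null_out_failed = True
    conn.rollback()

delete_post_with_feedback(conn, 1)
remaining_feedback = conn.execute(
    "SELECT COUNT(*) FROM review_feedback"
).fetchone()[0]
```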

Code Quality

13 new tests: cascade delete (2), ReviewFeedback creation on review actions (3), pagination and scroll behaviour (8); 402 total
Type safety: Zero errors under ty check
Linting: All code passes ruff checks

Previous Release — v0.6.0

Queue Lifecycle Management

Auto-Delete Delivered Posts: Posts automatically deleted after 7 days (configurable), with historical stats preserved
Trim Excess Pending Posts: Weighted random deletion maintains invariant: pending_count ≥ 2 × daily_consumption_rate
Cleanup API Endpoints: POST /api/v1/curated/cleanup and POST /api/v1/curated/trim-pending
Queue UI Controls: "Purge old delivered" and "Trim excess pending" buttons on Queue Preview page
Historical Stats: All-time deletion counts preserved; stats page shows active and historical data

Production Containerization

Multi-stage Containerfile: Minimal final image (~207 MB)
Non-root User: Runs as fenliu (UID 1000)
Persistent Volumes: Separate data and logs volumes
Automatic Migrations: Schema migrated automatically on container startup
Docker/Podman Support: Works with both runtimes

Previous Release — v0.5.3

Pattern-Based User Blocking (v0.5.3)

Users can block Fediverse accounts using flexible pattern matching:

Four Pattern Types: exact, suffix, prefix, contains
Real-World Examples: Block all Bluesky users, all bot accounts, or any account with a keyword
Settings UI: Intuitive pattern selector with helpful examples
Review Page Integration: Pattern-based blocks show on review page with instant visibility
Blocklist Refresh: New button allows applying Settings changes to review page without losing progress

See PATTERN_BLOCKING_FEATURE.md for complete details and examples.

Cultural Context

The name "FenLiu" (分流) means "divide the flow" in Chinese, inspired by the ancient Dujiangyan irrigation system (256 BC). This project applies the same engineering wisdom to digital content streams, separating valuable content from spam and noise while maintaining the natural flow of community conversation.

Key Resources

Roadmap - Development plans and future features
Pattern Blocking Guide - Detailed pattern matching documentation
LLM System Prompt - Development standards
Live Docs - Complete documentation

License

AGPL-3.0 License - See LICENSE file for details.

Contributing

Follow existing code style (ruff formatted with comprehensive type hints)
Write tests for new functionality (maintain 100% test pass rate)
Update documentation as needed
Run nox before submitting changes
Run alembic upgrade head after pulling changes with new migrations

Version: 0.7.0
Status: Production Ready ✅
Released: 2026-03-14
Tests: 402 passing ✅
Code Quality: All checks passing ✅
Container Size: ~207 MB (multi-stage optimized)
Framework: PyView (Starlette-based LiveView)
Architecture: Async Python with comprehensive type hints
Repository: https://codeberg.org/marvinsmastodontools/fenliu

Release history

0.12.1: Apr 6, 2026
0.11.0: Apr 3, 2026
0.10.1: Apr 1, 2026
0.10.0: Mar 31, 2026
0.9.0: Mar 24, 2026
0.8.0: Mar 19, 2026
0.7.10: Mar 18, 2026
0.7.9: Mar 18, 2026
0.7.8: Mar 18, 2026
0.7.5: Mar 17, 2026
0.7.3: Mar 17, 2026
0.7.2: Mar 15, 2026
0.7.1: Mar 15, 2026
0.7.0 (this version): Mar 14, 2026
0.6.5: Mar 13, 2026
0.6.4: Mar 13, 2026
0.6.3: Mar 12, 2026
0.6.2: Mar 11, 2026
0.5.2: Mar 7, 2026
0.5.1: Mar 6, 2026
0.4.1: Mar 4, 2026
0.1.0: Feb 21, 2026

Download files

Source Distribution

fenliu-0.7.0.tar.gz (427.2 kB), uploaded Mar 14, 2026, Source

Built Distribution

fenliu-0.7.0-py3-none-any.whl (441.0 kB), uploaded Mar 14, 2026, Python 3

File details

Details for the file fenliu-0.7.0.tar.gz.

File metadata

Download URL: fenliu-0.7.0.tar.gz
Upload date: Mar 14, 2026
Size: 427.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.10.10 (Debian GNU/Linux 13 "trixie", CI)

File hashes

Hashes for fenliu-0.7.0.tar.gz
Algorithm	Hash digest
SHA256	`fe3d418cbbb4627cb930a4c9face104193eb4816175a51f69ce83e277229a00f`
MD5	`080234fab3bab4b2b35681e483466ed1`
BLAKE2b-256	`49fbf68c7615e7a8a3b5de0b45a09508dafd695d41fa46338efb8c98e6188fb0`


File details

Details for the file fenliu-0.7.0-py3-none-any.whl.

File metadata

Download URL: fenliu-0.7.0-py3-none-any.whl
Upload date: Mar 14, 2026
Size: 441.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.10.10 (Debian GNU/Linux 13 "trixie", CI)

File hashes

Hashes for fenliu-0.7.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`365ae3e338428fbee54363d081eb29e85f32de7cb76db2cafed539f95ec212bf`
MD5	`deffada301afddf0f0ec236d23530020`
BLAKE2b-256	`a3a807352ab0455eaa9bafa20d9e9d761436504a692f3c4d72bb11d0ba565832`

