Monitor and filter Fediverse hashtags, curate quality content, and distribute via external tools like Zhongli

FenLiu (分流)

Created by marvin8 with assistance from Claude and DeepSeek AI assistants.

⚠️ DISCLAIMER / PROVISO: This project is a work in progress and major changes are still happening. It is nowhere near finished and is only borderline useful in production. Expect breaking changes, incomplete features, and significant architectural evolution as development continues.

Divide the Fediverse content flow

FenLiu is a web application that monitors Fediverse hashtags, filters spam, allows human review, learns from feedback, and exports quality content for boosting. Inspired by the ancient Chinese Dujiangyan irrigation system (256 BC), which separated silt from water, FenLiu applies 2,300-year-old engineering wisdom to modern digital content streams.

Current Status — v0.6.0

FenLiu is a fully functional spam filtering and content management system with complete Curated Queue integration, flexible pattern-based user blocking, automated queue lifecycle management, and production-ready containerization. Monitor hashtags, score posts for spam, manually review content, reliably export quality posts, and manage queue health with automatic cleanup and trimming.

Latest updates (v0.6.0): Auto-delete old delivered posts (7-day retention), weighted random deletion of excess pending posts based on age/engagement/author-activity, cleanup/trim API endpoints with manual UI controls, production containerization with Podman/Docker support, persistent data volumes for database and logs; 389 total tests passing.

Features

Core Functionality

  • Hashtag Monitoring: Monitor multiple Fediverse hashtags with customizable instance sources and scheduling
  • Spam Scoring: Rule-based detection (0-100 scale) with 7 intelligent detection rules
  • Manual Review Interface: Web interface for reviewing and approving/rejecting posts with scoring
  • Bulk Operations: Fetch and process posts in bulk with real-time progress tracking
  • Curated Queue Export: API-driven queue with ack/nack/error reliability pattern

Reblog Controls (Export Filters)

  • Pattern-Based User Blocking: Block users with flexible matching modes:
    • exact: Exact account identifier (e.g., @user@mastodon.social)
    • suffix: Block all users from domain (e.g., bsky.app for all Bluesky users)
    • prefix: Block by username prefix (e.g., bot_ for bot accounts)
    • contains: Block by substring (e.g., spam for accounts with "spam" in name)
  • "Don't Reblog" Hashtag Blocklist: Exclude posts with blocked hashtags
  • Attachments-Only Mode: Export only posts with media attachments
  • Auto-Reject on Fetch: Automatically reject blocked content before review
  • Blocklist Refresh: Apply Settings changes to review page instantly without losing progress

Web Interface

  • Dashboard: Real-time analytics, top hashtags, review progress
  • Streams Management: Create, edit, manage hashtag streams with CRUD operations
  • Review Workflow: Approve/reject posts with manual score adjustment and spam breakdown
  • Pattern Blocking Settings: Intuitive UI for adding pattern-based user blocks with examples
  • Queue Preview: Monitor queue health (pending/reserved/delivered/error counts)
  • Statistics: Charts for posts over time and hashtag distribution
  • Responsive Design: Fully responsive across desktop, tablet, mobile

REST API

  • Hashtag Streams: Full CRUD for stream management and bulk fetching
  • Posts: List, filter, update with approval/rejection and scoring
  • Curated Queue: /next, /ack, /nack, /error, /requeue endpoints
  • Reblog Controls: Manage blocked users (with pattern types) and hashtags
  • Statistics: Post counts, hashtag distribution, approval rates
  • Authentication: API key-based authentication for queue endpoints
  • Health: Health check and application info endpoints

Technical Quality

  • Type Safety: Comprehensive type hints throughout
  • Testing: 389 tests with 100% pass rate (including 29 pattern matching tests)
  • Resource Management: Proper cleanup of DB sessions and HTTP connections
  • Database Migrations: Alembic with automatic schema migration on startup
  • API Key Security: Secure generation and management of API keys
  • Code Complexity: All functions optimized for maintainability
  • No JavaScript Bloat: Pure HTML/CSS frontend, no external JS dependencies

Quick Start

Prerequisites

  • Python 3.12 or higher
  • uv package manager (recommended)

Installation

# Install dependencies
uv sync -U --all-groups

# Optional: Set up pre-commit hooks
uv run pre-commit install

Running the Application

# Development mode with auto-reload
fenliu --reload --debug

# Alternative development mode
uv run python -m fenliu --reload --debug

# Production mode
fenliu --host 0.0.0.0 --port 8000

# See all options
fenliu --help

Container Deployment (Docker/Podman)

FenLiu includes production-ready containerization with minimal image size (~207 MB).

Build the Image

# With Podman (recommended)
podman build -t fenliu -f Containerfile .

# Or with Docker
docker build -t fenliu -f Containerfile .

Run the Container

# Copy environment file and edit with your settings
cp .env.example .env
# Edit .env with your configuration

# Run with persistent volumes (recommended)
podman run -p 8000:8000 \
  -v fenliu-data:/app/data \
  -v fenliu-logs:/app/logs \
  --env-file .env \
  fenliu

# Or specify DATABASE_URL and SECRET_KEY directly
podman run -p 8000:8000 \
  -v fenliu-data:/app/data \
  -v fenliu-logs:/app/logs \
  -e DATABASE_URL="sqlite:////app/data/fenliu.db" \
  -e SECRET_KEY="your-production-secret-key" \
  fenliu

Container Features

  • Multi-stage build: Minimal final image (~207 MB)
  • Non-root user: Runs as fenliu user (UID 1000) for security
  • Persistent volumes: Separate volumes for data (/app/data) and logs (/app/logs)
  • Automatic migrations: Alembic migrations run on container startup
  • Production ready: Uses python:3.13-slim-bookworm base image

Docker Compose Example

version: '3.8'

services:
  fenliu:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - fenliu-data:/app/data
      - fenliu-logs:/app/logs
    environment:
      - DATABASE_URL=sqlite:////app/data/fenliu.db
      - DEBUG=false
      - SECRET_KEY=your-secret-key-change-in-production
    restart: unless-stopped

volumes:
  fenliu-data:
  fenliu-logs:

Run with: docker-compose up -d

First Steps

  1. Start the server: fenliu --reload
  2. Open browser: Navigate to http://localhost:8000
  3. Add a hashtag: Go to Streams page and create a hashtag stream (e.g., "python")
  4. Fetch posts: Click "Fetch" on the stream to retrieve posts from Fediverse
  5. Review posts: Use the Review interface to approve quality content or reject spam
  6. Block patterns: Go to Settings to add pattern-based blocks (optional)
  7. Export: Monitor the Queue Preview to see posts flowing to Curated Queue

Pattern-Based Blocking Examples

Settings Page Usage

  1. Go to Settings → Don't Reblog — Users
  2. Enter pattern: bsky.app
  3. Select type: suffix
  4. Click "Block"
  5. Result: All users from Bluesky are now blocked

Common Patterns

  • Block all Bluesky users: Pattern bsky.app, Type suffix
  • Block bot accounts: Pattern bot_, Type prefix
  • Block accounts with spam keyword: Pattern spam, Type contains
  • Block specific user: Pattern @user@mastodon.social, Type exact
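
The four pattern types above can be sketched in Python as follows. This is a minimal illustration, not FenLiu's actual implementation: the function name matches_block_pattern and the normalization step (lowercasing, ignoring a leading "@") are assumptions made for the example.

```python
def matches_block_pattern(account: str, pattern: str, pattern_type: str) -> bool:
    """Return True if `account` matches `pattern` under the given mode.

    Hypothetical helper for illustration only.
    """
    # Assumed normalization: compare case-insensitively and ignore a leading "@".
    acct = account.lower().lstrip("@")
    pat = pattern.lower().lstrip("@")

    if pattern_type == "exact":
        return acct == pat
    if pattern_type == "suffix":
        # e.g. "bsky.app" matches every account that ends with that domain
        return acct.endswith(pat)
    if pattern_type == "prefix":
        # e.g. "bot_" matches usernames that start with "bot_"
        return acct.startswith(pat)
    if pattern_type == "contains":
        return pat in acct
    raise ValueError(f"unknown pattern_type: {pattern_type!r}")
```

For example, matches_block_pattern("@alice@bsky.app", "bsky.app", "suffix") is true under these assumptions, while the same pattern does not match "@alice@mastodon.social".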

Applying to Review Page

  1. While reviewing posts, go to Settings to add new patterns
  2. Return to Review page
  3. Click Refresh Blocklists button (next to Refresh)
  4. Current posts are instantly re-evaluated against the new patterns
  5. Continue reviewing without page reload

Debug Logging

Enable detailed debug logging with the --debug flag:

# Enable debug logging to file
fenliu --debug

# View logs in real-time
tail -f logs/fenliu_debug.log

# Custom log directory
fenliu --debug --log-dir=/var/log/fenliu

In your code, import get_logger from fenliu.logging, then log with logger.debug("message").

API Usage

Authentication

All queue endpoints require API key authentication. Generate a key in Settings, then include it in requests:

curl -H "X-API-Key: your-api-key-here" \
  http://localhost:8000/api/v1/curated/next

Common Examples

# List all hashtag streams
curl http://localhost:8000/api/v1/streams

# Create a new hashtag stream
curl -X POST http://localhost:8000/api/v1/streams \
  -H "Content-Type: application/json" \
  -d '{"hashtag": "python", "instance": "mastodon.social", "active": true}'

# Fetch posts for a stream
curl -X POST "http://localhost:8000/api/v1/streams/1/fetch?limit=20"

# Get next post from Curated Queue
curl -H "X-API-Key: your-api-key-here" \
  http://localhost:8000/api/v1/curated/next

# Acknowledge successful reblog
curl -X POST -H "X-API-Key: your-api-key-here" \
  http://localhost:8000/api/v1/curated/123/ack

# Report permanent failure
curl -X POST -H "X-API-Key: your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{"reason": "Account suspended"}' \
  http://localhost:8000/api/v1/curated/123/error

# Review a post (approve)
curl -X PATCH http://localhost:8000/api/v1/posts/123 \
  -H "Content-Type: application/json" \
  -d '{"approved": true, "reviewer_notes": "Quality content"}'

# Adjust spam score manually
curl -X PATCH http://localhost:8000/api/v1/posts/123 \
  -H "Content-Type: application/json" \
  -d '{"manual_spam_score": 15}'

# Add a pattern-based block (suffix type)
curl -X POST http://localhost:8000/api/v1/reblog-controls/blocked-users \
  -H "Content-Type: application/json" \
  -d '{"account_identifier": "bsky.app", "pattern_type": "suffix", "notes": "Block all Bluesky"}'

# List blocked users with pattern types
curl http://localhost:8000/api/v1/reblog-controls/blocked-users
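
The same pattern-block request can be prepared from Python with only the standard library. This is a sketch: the helper name build_block_request is hypothetical, and the payload fields are copied from the curl example above rather than from a published schema.

```python
import json
import urllib.request

# The four pattern types listed in this README.
PATTERN_TYPES = ("exact", "suffix", "prefix", "contains")


def build_block_request(
    pattern: str,
    pattern_type: str,
    notes: str = "",
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build (but do not send) a POST request that adds a pattern-based block."""
    if pattern_type not in PATTERN_TYPES:
        raise ValueError(f"pattern_type must be one of {PATTERN_TYPES}")
    payload = {
        "account_identifier": pattern,
        "pattern_type": pattern_type,
        "notes": notes,
    }
    return urllib.request.Request(
        f"{base_url}/api/v1/reblog-controls/blocked-users",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen() sends it to a running FenLiu instance.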

API Endpoints

Streams & Posts:

  • GET /api/v1/streams - List streams
  • POST /api/v1/streams - Create stream
  • GET/PUT/DELETE /api/v1/streams/{id} - Stream operations
  • POST /api/v1/streams/{id}/fetch - Fetch posts for stream
  • POST /api/v1/streams/fetch-all - Fetch all active streams
  • GET /api/v1/posts - List posts with filtering
  • GET /api/v1/posts/{id} - Get post details
  • PATCH /api/v1/posts/{id} - Update post (review, approve, score)
  • GET /api/v1/stats - Application statistics

Curated Queue:

  • GET /api/v1/curated/next - Get next post (returns 204 if empty)
  • POST /api/v1/curated/{post_id}/ack - Confirm successful reblog
  • POST /api/v1/curated/{post_id}/nack - Return to queue (transient failure)
  • POST /api/v1/curated/{post_id}/error - Mark permanently failed
  • POST /api/v1/curated/{post_id}/requeue - Return errored post to queue
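
A consumer of the queue endpoints above might follow the ack/nack/error pattern roughly like this. It is a sketch under stated assumptions: the response body of /next is assumed to be a JSON object with an "id" field, and TransientError, make_transport, and process_next are hypothetical names introduced for the example.

```python
import json
import urllib.request
from typing import Callable, Optional, Tuple

Response = Tuple[int, Optional[dict]]
Transport = Callable[[str, str, Optional[dict]], Response]


class TransientError(Exception):
    """Raised by the reblog callback for failures that are worth retrying."""


def make_transport(api_key: str, base_url: str = "http://localhost:8000") -> Transport:
    """Return a function that sends one authenticated request to FenLiu."""

    def send(method: str, path: str, payload: Optional[dict] = None) -> Response:
        data = json.dumps(payload).encode("utf-8") if payload is not None else None
        headers = {"X-API-Key": api_key}
        if data is not None:
            headers["Content-Type"] = "application/json"
        req = urllib.request.Request(
            base_url + path, data=data, headers=headers, method=method
        )
        with urllib.request.urlopen(req) as resp:
            raw = resp.read()
            return resp.status, (json.loads(raw) if raw else None)

    return send


def process_next(send: Transport, reblog: Callable[[dict], None]) -> str:
    """Fetch one post, try to reblog it, and report the outcome to the queue."""
    status, post = send("GET", "/api/v1/curated/next", None)
    if status == 204 or post is None:
        return "empty"  # queue is empty

    post_id = post["id"]  # assumption: the post id is exposed as "id"
    base = f"/api/v1/curated/{post_id}"
    try:
        reblog(post)
    except TransientError:
        send("POST", f"{base}/nack", None)  # return the post to the queue
        return "nack"
    except Exception as exc:
        send("POST", f"{base}/error", {"reason": str(exc)})  # permanent failure
        return "error"
    send("POST", f"{base}/ack", None)  # confirm the successful reblog
    return "ack"
```

Keeping the transport separate from the decision logic means process_next can be exercised with a fake send function, without a running server.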

Reblog Controls (Pattern-Based Blocking):

  • GET /api/v1/reblog-controls/settings - Get reblog filter settings
  • PUT /api/v1/reblog-controls/settings - Update settings
  • GET /api/v1/reblog-controls/blocked-users - List blocked users with pattern types
  • POST /api/v1/reblog-controls/blocked-users - Add blocked user (with pattern_type)
  • DELETE /api/v1/reblog-controls/blocked-users/{id} - Remove blocked user
  • GET /api/v1/reblog-controls/blocked-hashtags - List blocked hashtags
  • POST /api/v1/reblog-controls/blocked-hashtags - Add blocked hashtag
  • DELETE /api/v1/reblog-controls/blocked-hashtags/{id} - Remove blocked hashtag
  • POST /api/v1/reblog-controls/reject-blocked - Bulk reject posts matching any pattern

System:

  • GET /health - Health check
  • GET /info - Application info
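
When starting FenLiu in a container or a script, it can be handy to wait for the health endpoint before sending further requests. The sketch below assumes that GET /health answers with HTTP 200 once the application is ready; the function names are hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str) -> int:
    """Return the HTTP status for a GET on `url`, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # server answered with a 4xx/5xx status
    except OSError:
        return 0  # connection refused, timeout, DNS failure, ...


def wait_until_healthy(
    url: str,
    attempts: int = 10,
    delay: float = 1.0,
    fetch: Callable[[str], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `url` until it returns 200 or the attempts are used up."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # pause between attempts, not after the last one
    return False
```

A typical call would be wait_until_healthy("http://localhost:8000/health").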

Configuration

Environment variables (via .env file):

# Database
DATABASE_URL=sqlite:///./fenliu.db

# Fediverse settings
DEFAULT_INSTANCE=mastodon.social
API_TIMEOUT=30
MAX_POSTS_PER_FETCH=20
RATE_LIMIT_DELAY=1.0

# Application
DEBUG=false
SECRET_KEY=your-secret-key-change-in-production
APP_NAME=FenLiu

# Spam scoring thresholds
VERY_HIGH_THRESHOLD=76
LOW_MAX_THRESHOLD=25

# Queue timeout
RESERVE_TIMEOUT_SECONDS=300
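
To illustrate how the two spam scoring thresholds could partition the 0-100 score range, here is a small sketch. The band names ("low", "medium", "very_high"), the inclusive boundaries, and the helper names are assumptions for the example; FenLiu's own configuration code may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ScoringThresholds:
    very_high: int
    low_max: int


def load_thresholds(env: Optional[Mapping[str, str]] = None) -> ScoringThresholds:
    """Read the thresholds from the environment, using the defaults shown above."""
    env = os.environ if env is None else env
    return ScoringThresholds(
        very_high=int(env.get("VERY_HIGH_THRESHOLD", "76")),
        low_max=int(env.get("LOW_MAX_THRESHOLD", "25")),
    )


def score_band(score: int, thresholds: ScoringThresholds) -> str:
    """Map a spam score (0-100) to a coarse band. Band names are hypothetical."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= thresholds.very_high:
        return "very_high"
    if score <= thresholds.low_max:
        return "low"
    return "medium"
```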

Development

Testing

# Run full test suite
pytest

# Run with coverage
pytest --cov=src/fenliu tests/

# Quick validation
python -m pytest -q

# Run specific test file
pytest tests/test_pattern_blocking.py -v

Code Quality

# Linting
ruff check src/fenliu/

# Formatting
ruff format src/fenliu/

# Complexity check
complexipy src

# Pre-commit checks
prek run --all-files

# Full CI simulation
nox

Database Migrations

# Apply pending migrations
alembic upgrade head

# Create new migration
alembic revision --autogenerate -m "description"

# Show current revision
alembic current

# View all revisions
alembic history

Development Workflow

# After dependency changes
uv sync -U --all-groups

# Quick validation before commits
prek run --all-files

# Full validation before commits
nox

Project Structure

fenliu/
├── src/fenliu/
│   ├── __init__.py              # Package definition
│   ├── __main__.py              # CLI entry point
│   ├── main.py                  # PyView application
│   ├── config.py                # Configuration
│   ├── database.py              # Database setup
│   ├── models.py                # SQLAlchemy models
│   ├── schemas.py               # Pydantic validation
│   ├── api/                     # REST API endpoints
│   │   ├── curated.py           # Queue API
│   │   ├── reblog_controls.py   # Filter management (pattern-based)
│   │   └── api_keys.py          # API key management
│   ├── services/                # Business logic
│   │   ├── spam_scoring.py      # Spam detection
│   │   ├── fediverse.py         # Fediverse client
│   │   ├── export_eligibility.py # Export filtering with pattern matching
│   │   ├── scheduler.py         # Task scheduling
│   │   └── api_key.py           # API key service
│   ├── templates/               # HTML templates
│   └── static/                  # CSS and assets
├── alembic/                     # Database migrations
├── tests/                       # Test suite (389 tests)
├── docs/                        # MkDocs documentation
├── pyproject.toml               # Project configuration
├── ROADMAP.md                   # Development roadmap
├── README.md                    # This file
└── PATTERN_BLOCKING_FEATURE.md  # Pattern blocking documentation

Documentation

Complete documentation is available in the docs/ folder and is built with MkDocs:

# Serve locally with hot reload
mkdocs serve

# Build static site
mkdocs build

📚 Live Documentation: https://marvinsmastodontools.codeberg.page/fenliu/

Includes: Installation, Quick Start, API Reference, Pattern Blocking Guide, Curated Queue Integration, Contributing Guide, Roadmap, and FAQ.

Technical Stack

  • Framework: PyView (Starlette-based LiveView) with real-time capabilities
  • Database: SQLAlchemy with SQLite, optimized with eager loading
  • API Client: minimal-activitypub for Fediverse integration
  • Async: Full async/await throughout (sync for SQLite only)
  • Type Hints: Comprehensive type annotations with Pydantic validation
  • Frontend: Jinja2 templates with Tailwind CSS, responsive design
  • Testing: pytest with 389 tests (100% pass rate)
  • Linting: ruff for formatting and linting
  • Migrations: Alembic for schema management
  • Package Manager: uv for dependency management

Upcoming Features

See Roadmap for detailed plans. Phase 4 focus:

  • Docker containerization and CI/CD
  • Performance optimization and caching for pattern matching
  • Multi-user support with roles
  • Advanced monitoring dashboard
  • PostgreSQL/MySQL support

What's New in v0.6.0

Queue Lifecycle Management

Automatic management of pending and delivered posts to prevent indefinite queue growth:

  • Auto-Delete Delivered Posts: Posts automatically deleted after 7 days (configurable), with historical stats preserved
  • Trim Excess Pending Posts: Weighted random deletion maintains invariant: pending_count ≥ 2 × daily_consumption_rate
    • Age-based weighting: Older posts have higher deletion probability
    • Engagement-based weighting: Posts with fewer likes deleted preferentially
    • Author activity weighting: Posts from prolific authors in pending queue have higher deletion probability
  • Cleanup API Endpoints: POST /api/v1/curated/cleanup and POST /api/v1/curated/trim-pending for manual control
  • Queue UI Controls: New "Purge old delivered" and "Trim excess pending" buttons on Queue Preview page
  • Historical Stats: All-time deletion counts preserved in database; stats page shows both active and historical data
  • 5 New Tests: Comprehensive coverage of cleanup/trim logic (389 total tests)
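
The weighted random trimming described above could look roughly like the following. This is a sketch, not the project's _trim_pending_posts(): the weight formula, the PendingPost fields, and the reading of the invariant (never trim below 2 × daily_consumption_rate) are all assumptions.

```python
import math
import random
from collections import Counter
from dataclasses import dataclass
from typing import List, Mapping, Sequence


@dataclass(frozen=True)
class PendingPost:
    post_id: int
    age_days: float
    likes: int
    author: str


def deletion_weight(post: PendingPost, author_counts: Mapping[str, int]) -> float:
    """Higher weight means more likely to be deleted (assumed formula)."""
    age_factor = 1.0 + post.age_days               # older posts weigh more
    engagement_factor = 1.0 / (1.0 + post.likes)   # fewer likes weigh more
    author_factor = float(author_counts[post.author])  # prolific authors weigh more
    return age_factor * engagement_factor * author_factor


def pick_posts_to_trim(
    posts: Sequence[PendingPost],
    daily_consumption_rate: float,
    rng: random.Random,
) -> List[PendingPost]:
    """Choose posts to delete so that 2 x daily_consumption_rate posts remain."""
    target = math.ceil(2 * daily_consumption_rate)
    excess = len(posts) - target
    if excess <= 0:
        return []  # nothing to trim; the queue is at or below the target

    remaining = list(posts)
    removed: List[PendingPost] = []
    for _ in range(excess):
        # Recount authors each round so weights follow the shrinking queue.
        counts = Counter(p.author for p in remaining)
        weights = [deletion_weight(p, counts) for p in remaining]
        victim = rng.choices(remaining, weights=weights, k=1)[0]
        remaining.remove(victim)
        removed.append(victim)
    return removed
```

Passing a seeded random.Random makes the selection reproducible, which helps when testing the trimming logic.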

Production Containerization

FenLiu is now production-ready for containerized deployment:

  • Multi-stage Dockerfile: Minimal final image (~207 MB)
  • Non-root User: Runs as fenliu (UID 1000) for enhanced security
  • Persistent Volumes: Separate data and logs volumes for durability
  • Automatic Migrations: Database schema migrated automatically on container startup
  • Environment Configuration: .env.example with comprehensive documentation
  • Docker/Podman Support: Works with both Docker and Podman
  • Docker Compose Example: Ready-to-use configuration in documentation

Code Quality

  • Complexity Optimization: Refactored _trim_pending_posts() from complexity 16 to 6 via helper functions
  • Type Safety: Full type hints across all new functions with zero type errors
  • Linting: All code passes ruff checks (no warnings)
  • Test Pass Rate: 389 tests passing (100%)

Previous Release — v0.5.3

Pattern-Based User Blocking (v0.5.3)

Users can block Fediverse accounts using flexible pattern matching:

  • Four Pattern Types: exact, suffix, prefix, contains
  • Real-World Examples: Block all Bluesky users, all bot accounts, or any account with a keyword
  • Settings UI: Intuitive pattern selector with helpful examples
  • Review Page Integration: Pattern-based blocks are reflected on the review page immediately
  • Blocklist Refresh: New button allows applying Settings changes to review page without losing progress

See PATTERN_BLOCKING_FEATURE.md for complete details and examples.

Cultural Context

The name "FenLiu" (分流) means "divide the flow" in Chinese, inspired by the ancient Dujiangyan irrigation system (256 BC). This project applies the same engineering wisdom to digital content streams, separating valuable content from spam and noise while maintaining the natural flow of community conversation.

License

AGPL-3.0 License - See LICENSE file for details.

Contributing

  1. Follow existing code style (ruff formatted with comprehensive type hints)
  2. Write tests for new functionality (maintain 100% test pass rate)
  3. Update documentation as needed
  4. Run nox before submitting changes
  5. Run alembic upgrade head after pulling changes with new migrations

  • Version: 0.6.0
  • Status: Production Ready ✅
  • Released: 2026-03-12
  • Tests: 389 passing ✅
  • Code Quality: All checks passing ✅
  • Container Size: ~207 MB (multi-stage optimized)
  • Framework: PyView (Starlette-based LiveView)
  • Architecture: Async Python with comprehensive type hints
  • Repository: https://codeberg.org/marvinsmastodontools/fenliu
