Search for Mastodon profiles using mechanical and LLM filtering.

mastodon_finder

Mastodon account discovery, enrichment, and LLM-based scoring tool.

This project automates the workflow:

  1. Discover candidate accounts on Mastodon via keywords, hashtags, profile terms, and "follow what they follow" expansion.
  2. Enrich each account into a uniform dossier (bio, fields, recent original posts, stats, discovery reasons).
  3. Apply a stack of deterministic pre-LLM filters (language, activity, link-only, bots, etc.).
  4. Optionally hand each dossier to an LLM for rubric-based FOLLOW / MAYBE / SKIP decisions.
  5. Output a human-readable report to the terminal (rich) and optionally to CSV / Markdown.

Install (pipx)

This tool is intended to be used as a CLI. Prefer pipx so that dependencies stay isolated and you can update easily.

# If your package is published or in a local path, do something like:
pipx install mastodon-finder  # or: pipx install .

If you are developing locally:

# from repo root
pipx install --editable .

That will expose the mastodon-finder (module: mastodon_finder) entrypoint on your PATH without polluting your global Python.

Quick Start

  1. Create a config (once):

    mastodon-finder init
    

    This writes finder.toml and suggests adding it to .gitignore.

  2. Authenticate to Mastodon (once per account):

    mastodon-finder auth
    

    This interactive flow will:

    • Ask for your instance URL (e.g. mastodon.social).
    • Register an app called mastodon_finder with read scope.
    • Print an authorization URL for you to open in a browser.
    • Ask you to paste back the authorization code.
    • Write MASTODON_BASE_URL=... and MASTODON_ACCESS_TOKEN=... into .env.
  3. Run a discovery/eval pass:

    mastodon-finder run --yes
    

    --yes skips the interactive "are you sure" run-summary.

You can override most things from the CLI without editing the TOML.
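
The auth step above ends by writing two entries into .env. As a rough illustration of how an append-or-update of that file could work, here is a minimal standard-library sketch. The helper name upsert_env and its exact behavior are assumptions made for this example, not the tool's actual code.

```python
from pathlib import Path


def upsert_env(path: Path, updates: dict[str, str]) -> None:
    """Set KEY=value lines in a .env file (hypothetical helper).

    Existing keys are replaced in place; new keys are appended.
    Comments and unrelated lines are left untouched.
    """
    lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    remaining = dict(updates)
    out = []
    for line in lines:
        key = line.split("=", 1)[0].strip()
        is_assignment = "=" in line and not line.lstrip().startswith("#")
        if is_assignment and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    # Keys that were not already present go at the end of the file.
    out.extend(f"{k}={v}" for k, v in remaining.items())
    path.write_text("\n".join(out) + "\n", encoding="utf-8")
```

Replacing in place rather than blindly appending avoids duplicate MASTODON_ACCESS_TOKEN lines when you re-authenticate.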

CLI Overview

The main entrypoint is the package itself:

mastodon-finder [command]

Commands:

  • init — write a starter finder.toml with reasonable defaults.
  • auth — run interactive OAuth flow and append creds into .env.
  • run (default) — perform discovery → enrichment → filter → (optional) LLM → report.

If you run without a subcommand, it defaults to run.
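
One way a "default to run" CLI can be built with argparse is sketched below. This is an assumption about the approach, not the project's real parser; only the command names and the --yes / --no-llm flags come from this README, and parse_args is a hypothetical helper.

```python
import argparse

COMMANDS = ("init", "auth", "run")


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse argv, treating a missing subcommand as `run` (illustrative)."""
    parser = argparse.ArgumentParser(prog="mastodon-finder")
    sub = parser.add_subparsers(dest="command")
    sub.add_parser("init")
    sub.add_parser("auth")
    run = sub.add_parser("run")
    run.add_argument("--yes", action="store_true")
    run.add_argument("--no-llm", action="store_true")

    # No subcommand given (empty argv, or argv starts with a run option):
    # prepend "run" so the options are handled by the run subparser.
    if not argv or (argv[0] not in COMMANDS and argv[0] not in ("-h", "--help")):
        argv = ["run", *argv]
    return parser.parse_args(argv)
```

With this sketch, parse_args(["--yes"]) and parse_args(["run", "--yes"]) produce the same result.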

Pipeline

  1. Get your current friend list.
  2. Find possible friends by keyword, hashtag, or "follows a special-interest account".
  3. Remove accounts with bad metrics (inactivity, no original content, etc.).
  4. Ask the LLM to grade each candidate against a rubric.
  5. Display a "FOLLOW" / "MAYBE" / "SKIP" report.

Features

  • Search by keyword or hashtag in posts, and by keyword in account bios
  • Search by "followers of an account" as a signal of interest, geography, etc.
  • Filters out accounts you already follow
  • Caching for info about self, e.g. current friends
  • Caching for other API calls, e.g. for resuming a failed run
  • langdetect for posts
  • Filter
    • by minimum number of posts - is anyone home?
    • post recency - is anyone home now?
    • original vs boost (retweet) - do they write their own content?
    • original vs all links - is this an RSS feed cross-posted to Mastodon?
    • does the author ever reply to anyone?
  • LLM filter
    • By static rubric
      • Is it the right language?
      • Is it the right topic?
      • Did it hit all the topics?
      • Is there some other unforeseen problem?
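
The mechanical filters in the list above boil down to counting a few things over an account's recent posts. The sketch below assumes each status is a dict shaped like the Mastodon API Status entity (reblog, in_reply_to_id, and HTML content fields). ActivityStats, summarize, and the link-only heuristic are hypothetical names and choices for illustration, not the tool's implementation.

```python
import re
from dataclasses import dataclass

TAG_RE = re.compile(r"<[^>]+>")       # crude HTML tag stripper
URL_RE = re.compile(r"https?://\S+")  # bare URLs left after stripping tags


@dataclass
class ActivityStats:
    total: int      # all fetched statuses
    original: int   # statuses that are not boosts
    replies: int    # original statuses that answer someone
    link_only: int  # original statuses with no text besides URLs


def summarize(statuses: list[dict]) -> ActivityStats:
    """Count originals, replies, and link-only posts (assumed heuristics)."""
    originals = [s for s in statuses if s.get("reblog") is None]
    replies = sum(1 for s in originals if s.get("in_reply_to_id") is not None)
    link_only = 0
    for s in originals:
        text = TAG_RE.sub(" ", s.get("content", ""))
        # Link-only: contains at least one URL and nothing else.
        if URL_RE.search(text) and not URL_RE.sub("", text).strip():
            link_only += 1
    return ActivityStats(len(statuses), len(originals), replies, link_only)
```

A filter can then compare ratios such as link_only / original against a threshold to spot cross-posted RSS feeds.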

Example Runs

Run with alternative discovery terms:

mastodon-finder run \
  --keywords python ai mastodon \
  --hashtags fediverse \
  --profile-keywords "django" "fastapi" \
  --max-accounts 80 \
  --max-statuses 120 \
  --yes

Run without LLM, just pre-filters and report:

mastodon-finder run --no-llm --yes

Run focused on a single follow-target:

mastodon-finder run \
  --follow-targets "@coolconnector@mastodon.social" \
  --follow-target-limit 500 \
  --yes

How it works: Discovery & Filtering Pipeline

  1. Discovery (discovery.discover_accounts)

    • Collects IDs and tags them with reasons.
    • Important: follow-target expansion can return many accounts, so there is a per-target limit (--follow-target-limit).
  2. Enrichment (enrich.build_dossiers)

    • Sorts candidates by number of discovery reasons (simple prioritization).
    • Stops at limits.max_accounts.
    • Produces a clean, LLM-ready, language-aware dossier list.
  3. Pre-LLM Filters (finder._pre_llm_filter)

    • Activity cutoff (posted within N days).
    • Bot flag.
    • Language match (or "none").
    • Must-have replies.
    • Link-only threshold.
    • Minimum original posts.
    • "Friend full up" (high following/follower + low follow-back ratio).
    • Too chatty (posts/year cap).
    • Reject bio keywords (e.g. to block crypto/NFT/etc.).
    • Non-empty bio.
    • Minimum account age.
    • Special-case block for bsky.brid.gy relays.
    • Each discard increments a counter, reported at the end.
  4. LLM Evaluation (optional)

    • Only on the survivors.
    • LLM rubric comes from settings.llm.topics and the fixed template.
    • Can be disabled with --no-llm, in which case all pre-filtered accounts become MAYBE.
  5. Output

    • Rich terminal render with follow URLs normalized to your instance.
    • Optional CSV/MD report for offline review.
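
The prioritization and pre-LLM filtering stages described above can be sketched in a few lines. Everything here is an illustrative assumption: the Candidate fields, the three example checks, and the 90-day cutoff are hypothetical, and the real filter stack has more rules than shown.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Candidate:
    acct: str
    reasons: set[str] = field(default_factory=set)  # why it was discovered
    bot: bool = False
    days_since_last_post: int = 0
    bio: str = ""


# Each check returns a discard label, or None to keep the account.
Check = Callable[[Candidate], Optional[str]]

CHECKS: list[Check] = [
    lambda c: "bot" if c.bot else None,
    lambda c: "inactive" if c.days_since_last_post > 90 else None,  # assumed cutoff
    lambda c: "empty_bio" if not c.bio.strip() else None,
]


def prioritize(cands: list[Candidate], max_accounts: int) -> list[Candidate]:
    """Most discovery reasons first, then stop at max_accounts."""
    ranked = sorted(cands, key=lambda c: len(c.reasons), reverse=True)
    return ranked[:max_accounts]


def pre_filter(cands: list[Candidate]) -> tuple[list[Candidate], Counter]:
    """Apply checks in order; count the first failing check per account."""
    kept: list[Candidate] = []
    discards: Counter = Counter()
    for c in cands:
        for check in CHECKS:
            label = check(c)
            if label:
                discards[label] += 1
                break
        else:
            kept.append(c)  # no check objected
    return kept, discards
```

Because sorted is stable, accounts with the same number of reasons keep their discovery order. Running the cheap deterministic checks first means the LLM only sees survivors, which keeps cost down.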

Environment / Secrets

  • .env is the source of truth for:

    • MASTODON_BASE_URL
    • MASTODON_ACCESS_TOKEN
    • OPENROUTER_API_KEY + OPENROUTER_BASE_URL + OPENROUTER_MODEL (optional)
    • OPENAI_API_KEY (optional)

auth will append the Mastodon entries for you. Keep .env out of version control.
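
A minimal sketch of validating these variables at startup follows. It assumes the .env file has already been loaded into the process environment by some other means, and it treats the two Mastodon entries as required because the others are marked optional above. read_secrets is a hypothetical helper, not part of the tool.

```python
import os
from typing import Mapping, Optional

REQUIRED = ("MASTODON_BASE_URL", "MASTODON_ACCESS_TOKEN")
OPTIONAL = (
    "OPENROUTER_API_KEY",
    "OPENROUTER_BASE_URL",
    "OPENROUTER_MODEL",
    "OPENAI_API_KEY",
)


def read_secrets(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Return the known settings that are set; fail early if required ones are missing."""
    env = os.environ if env is None else env
    missing = [k for k in REQUIRED if not env.get(k)]
    if missing:
        raise RuntimeError(
            "Missing in environment/.env: " + ", ".join(missing)
            + " (try `mastodon-finder auth`)"
        )
    # Empty or unset optional values are simply omitted.
    return {k: env[k] for k in (*REQUIRED, *OPTIONAL) if env.get(k)}
```

Passing a plain dict instead of os.environ makes the check easy to exercise in tests.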

Reporting & Exports

  • Terminal shows decisions, discovery reasons, first reasoning line, and a follow link rooted at your Mastodon host.
  • CSV export makes it easy to sort/filter in spreadsheets.
  • Markdown export is human-friendly for PRs or sharing with teammates.
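
To make the export formats concrete, here is a small sketch of writing the same rows as CSV and as a Markdown table, plus one plausible way to root a follow link at your own instance. The column names, the helper names, and the https://your-host/@user@remote URL shape are assumptions for illustration; the tool's actual columns may differ.

```python
import csv
import io

FIELDS = ["decision", "acct", "reasons", "follow_url"]  # assumed columns


def follow_url(base_url: str, acct: str) -> str:
    """Profile link on the user's own instance (assumed URL shape)."""
    return f"{base_url.rstrip('/')}/@{acct.lstrip('@')}"


def to_csv(rows: list[dict]) -> str:
    """Render rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_markdown(rows: list[dict]) -> str:
    """Render rows as a simple pipe table (values are not escaped)."""
    lines = [
        "| " + " | ".join(FIELDS) + " |",
        "|" + " --- |" * len(FIELDS),
    ]
    lines += ["| " + " | ".join(str(r[f]) for f in FIELDS) + " |" for r in rows]
    return "\n".join(lines) + "\n"
```

The csv module handles quoting of commas in values; the Markdown renderer here does not escape pipe characters, so a real exporter would need to.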

Prior Art

Poor profile search, and poor search in general, have been touted as an intentional privacy feature.

Directories

