AI-powered GitHub repository SEO, inspired by the X Algorithm's recommendation pipeline

repo-seo

License: MIT | Python 3.9+

AI-powered tool to optimize GitHub repositories for better discoverability through improved descriptions, topics, and README content.

Architecture

Inspired by the X Algorithm's recommendation pipeline, repo-seo uses a composable pipeline architecture:

┌─────────────────────────────────────────────────────────────────────┐
│                      SEO OPTIMIZATION PIPELINE                       │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│   ┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐   │
│   │ Sources  │────▶│Hydrators │────▶│ Filters  │────▶│ Scorers  │   │
│   │          │     │          │     │          │     │          │   │
│   │ Local    │     │ README   │     │ Quality  │     │ README   │   │
│   │ GitHub   │     │ Language │     │ Dedup    │     │ Topic    │   │
│   └──────────┘     │ Keywords │     │ Relevance│     │ SEO      │   │
│                    └──────────┘     └──────────┘     └──────────┘   │
│                                                             │        │
│                                                             ▼        │
│                                                      ┌──────────┐   │
│                                                      │ Selector │   │
│                                                      │  Top-K   │   │
│                                                      └──────────┘   │
│                                                             │        │
└─────────────────────────────────────────────────────────────│────────┘
                                                              ▼
                                                    Optimized Results
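
The stages in the diagram can be sketched in plain Python. This is only an illustration of the composable idea, not repo-seo's actual implementation: the `Candidate` dataclass, `run_pipeline`, and the toy source, hydrator, filter, and scorer below are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Candidate:
    """One suggestion flowing through the pipeline (hypothetical shape)."""
    id: str
    features: Dict[str, float] = field(default_factory=dict)
    score: float = 0.0


def run_pipeline(
    sources: Iterable[Callable[[], List[Candidate]]],
    hydrators: Iterable[Callable[[Candidate], None]],
    filters: Iterable[Callable[[Candidate], bool]],
    scorers: Iterable[Callable[[Candidate], float]],
    k: int,
) -> List[Candidate]:
    # 1. Sources: gather candidates from every source.
    candidates = [c for source in sources for c in source()]
    # 2. Hydrators: enrich each candidate in place with features.
    for c in candidates:
        for hydrate in hydrators:
            hydrate(c)
    # 3. Filters: keep only candidates that pass every filter.
    candidates = [c for c in candidates if all(f(c) for f in filters)]
    # 4. Scorers: the final score is the sum of the individual scores.
    for c in candidates:
        c.score = sum(score(c) for score in scorers)
    # 5. Selector: return the k highest-scoring candidates.
    return sorted(candidates, key=lambda c: c.score, reverse=True)[:k]


# Toy stages, purely for demonstration.
def topic_source() -> List[Candidate]:
    return [Candidate(t) for t in ("machine-learning", "cli", "api-client", "x")]


def length_hydrator(c: Candidate) -> None:
    c.features["length"] = float(len(c.id))


def min_length_filter(c: Candidate) -> bool:
    return c.features["length"] >= 2


def length_scorer(c: Candidate) -> float:
    return c.features["length"] * 10.0


top = run_pipeline(
    [topic_source], [length_hydrator], [min_length_filter], [length_scorer], k=2
)
print([(c.id, c.score) for c in top])
```

Because every stage is a plain callable, stages can be added, removed, or reordered without touching the others, which is the property the diagram is meant to convey.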

Features

  • Auto-Apply SEO: Update GitHub topics and description directly with repo-seo suggest --apply
  • Phoenix SEO: Topic recommendations ranked with the X Algorithm's Two-Tower and Multi-Action approach
  • Pipeline Architecture: Composable sources, hydrators, filters, scorers, and selectors
  • Dynamic Trending Topics: Real-time matching against GitHub trending keywords
  • README Analysis: Section-ordering suggestions and keyword optimization
  • AI-Powered Analysis: Supports OpenAI, Anthropic Claude, and DeepSeek
  • Multi-Signal Scoring: Combines README quality, topic relevance, and trending score
  • Rule-Based Fallback: Works without API keys

Installation

# Install from source
pip install -e .

# With AI provider support
pip install -e ".[all]"

Quick Start

Using the Pipeline (Recommended)

from repo_seo import (
    Pipeline, Query,
    LocalRepoSource,
    ReadmeHydrator,
    ReadmeScorer, TopicScorer,
    TopKSelector,
)
from repo_seo.pipeline import QualityFilter, DuplicateFilter

# Create optimization pipeline
pipeline = Pipeline(
    sources=[LocalRepoSource()],
    hydrators=[ReadmeHydrator()],
    pre_filters=[QualityFilter(), DuplicateFilter()],
    scorers=[ReadmeScorer(), TopicScorer()],
    selector=TopKSelector(k=10),
)

# Run optimization
query = Query(repo_path="./my-project", repo_name="my-project")
results = pipeline.run(query)

# Process results
for candidate in results:
    print(f"{candidate.type}: {candidate.id} (score: {candidate.final_score:.1f})")

Command Line

# SEO suggestions with README/topic analysis + auto-apply to GitHub
repo-seo suggest --top-k 10
repo-seo suggest --apply  # Actually update GitHub topics & description

# Phoenix SEO recommendations (X Algorithm style)
repo-seo phoenix --detailed

# Get trending topic suggestions
repo-seo trending --language python

# Analyze current repository
repo-seo analyze

# Optimize with AI
repo-seo optimize --repo-path . --provider openai

Auto-Apply SEO Changes

The suggest command analyzes your repo and, with --apply, updates its GitHub topics and description directly:

# Preview suggestions
repo-seo suggest --top-k 8

# Apply changes to GitHub (updates topics + description)
repo-seo suggest --apply

Output:

📝 README Optimization Suggestions:
  1. Add [installation]: Include installation instructions
  2. Add status badges (build, coverage, version, license)

🏷️ Topic Keywords (Priority Order):
  🔥 1. api (score: 84 +20)        # +20 = content match boost
  🔥 2. machine-learning (score: 82 +20)
  📊 3. cli (score: 65)

📋 Description Optimization:
  Current:   My project...
  Suggested: AI-powered tool for X. Built with Python. Features api support.

🚀 Applying Changes to GitHub
  ✅ Topics updated successfully!
  ✅ Description updated successfully!
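
The "+20 = content match boost" annotation in the sample output suggests that a topic gains a fixed bonus when it also appears in the repository's own text. A minimal sketch of that idea follows. The function name, the matching rule (whole words, with hyphens and spaces treated alike), and the base scores are assumptions made for illustration; the README does not say how repo-seo detects a match or whether the displayed score already includes the boost.

```python
import re

CONTENT_MATCH_BOOST = 20  # value taken from the "+20" in the sample output


def boosted_score(topic: str, base_score: int, readme: str) -> int:
    """Add a fixed boost when the topic appears as whole words in the README.

    Hypothetical helper: hyphens in the topic match hyphens or whitespace,
    so "machine-learning" also matches "machine learning".
    """
    parts = [re.escape(p) for p in re.split(r"[-\s]+", topic.lower()) if p]
    if not parts:
        return base_score
    pattern = r"\b" + r"[\s-]+".join(parts) + r"\b"
    matched = re.search(pattern, readme.lower()) is not None
    return base_score + (CONTENT_MATCH_BOOST if matched else 0)


readme_text = "A REST API for machine learning pipelines."
for topic, base in [("api", 50), ("machine-learning", 45), ("cli", 65)]:
    print(topic, boosted_score(topic, base, readme_text))
```

Word-boundary matching keeps short topics such as "cli" from matching inside unrelated words.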

Simple API

from repo_seo import RepoAnalyzer

repo_info = {
    "name": "my-project",
    "description": "A sample project",
    "languages": ["Python"],
    "topics": ["python", "cli"],
    "readme": "# My Project\n\nDescription here.",
}

analyzer = RepoAnalyzer(repo_info)
results = analyzer.analyze()
print(f"SEO Score: {results['score']}/100")
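
To give a feel for what a rule-based score out of 100 could look like, here is a small standalone sketch that works on the same `repo_info` dictionary shape. The rules and point values are invented for this example and are not the ones RepoAnalyzer uses.

```python
from typing import Any, Dict


def rule_based_score(repo_info: Dict[str, Any]) -> int:
    """Hypothetical 0-100 heuristic; all thresholds and weights are assumptions."""
    score = 0

    description = repo_info.get("description", "")
    if description:
        score += 20              # a description exists
    if len(description) >= 30:
        score += 10              # and it is reasonably descriptive

    topics = repo_info.get("topics", [])
    score += min(len(topics), 5) * 6   # up to 30 points for topics

    readme = repo_info.get("readme", "").lower()
    if len(readme) >= 500:
        score += 20              # README has some substance
    if "install" in readme:
        score += 10              # installation instructions
    if "usage" in readme or "quick start" in readme:
        score += 10              # usage guidance

    return min(score, 100)


sample_info = {
    "name": "my-project",
    "description": "A sample project",
    "languages": ["Python"],
    "topics": ["python", "cli"],
    "readme": "# My Project\n\nDescription here.",
}
print(f"Heuristic score: {rule_based_score(sample_info)}/100")
```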

Phoenix SEO (X Algorithm Style)

Topic recommendation using X Algorithm's Two-Tower architecture with Multi-Action User Behavior Prediction:

┌─────────────────────────────────────────────────────────────────┐
│                     PHOENIX SEO PIPELINE                        │
├─────────────────────────────────────────────────────────────────┤
│   ┌─────────────────┐              ┌─────────────────────────┐  │
│   │  REPO TOWER     │              │  TRENDING TOWER         │  │
│   │  (Your Repo)    │              │  (GitHub LIVE)          │  │
│   │  README         │   Dot        │  Trending Repos Topics  │  │
│   │  Description    │─ Product ───▶│  Featured Topics        │  │
│   │  Languages      │              │  (Real-time from API)   │  │
│   └─────────────────┘              └─────────────────────────┘  │
│            │                                    │               │
│            └────────────┬───────────────────────┘               │
│                         ▼                                       │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │         MULTI-ACTION USER BEHAVIOR PREDICTION            │   │
│   │  ┌────────────────────────┬────────────────────────────┐│   │
│   │  │ POSITIVE ACTIONS       │ NEGATIVE ACTIONS           ││   │
│   │  │ ⭐ P(star)             │ ⛔ P(ignore)               ││   │
│   │  │ 🍴 P(fork)             │ 🚫 P(report)               ││   │
│   │  │ 👆 P(click)            │                            ││   │
│   │  │ 👁️ P(watch)            │                            ││   │
│   │  │ 📥 P(clone)            │                            ││   │
│   │  │ 🤝 P(contribute)       │                            ││   │
│   │  └────────────────────────┴────────────────────────────┘│   │
│   │  Final Score = Σ(weight × P(positive))                  │   │
│   │              - Σ(weight × P(negative))                  │   │
│   └─────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
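
from pathlib import Path

from repo_seo.pipeline import PhoenixSEO, phoenix_recommend

# Quick recommendation with user behavior prediction
recommendations = phoenix_recommend(
    # read_text closes the file for us, unlike a bare open().read()
    readme=Path("README.md").read_text(encoding="utf-8"),
    languages=["Python"],
)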

for rec in recommendations:
    print(f"{rec['topic']}: Score={rec['final_score']}")
    actions = rec['action_scores']
    print(f"  ⭐ P(star)={actions['star']}  🍴 P(fork)={actions['fork']}")
    print(f"  👆 P(click)={actions['click']}  👁️ P(watch)={actions['watch']}")
    print(f"  ⛔ P(ignore)={actions['ignore']}")
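
The "Dot Product" arrow between the two towers can be illustrated with a toy example. A real two-tower model uses learned embeddings; the sketch below substitutes normalized bag-of-words vectors, so it shows only the shape of the computation. The `embed`, `dot`, and `rank_topics` helpers are hypothetical and are not part of repo-seo.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    """L2-normalized bag-of-words vector, a stand-in for a learned tower."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {token: v / norm for token, v in counts.items()} if norm else {}


def dot(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Dot product of two sparse vectors."""
    return sum(weight * b.get(token, 0.0) for token, weight in a.items())


def rank_topics(repo_text: str, trending_topics: List[str]) -> List[Tuple[str, float]]:
    repo_vec = embed(repo_text)                      # repo tower
    scored = [(topic, dot(repo_vec, embed(topic)))   # trending tower
              for topic in trending_topics]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked = rank_topics(
    "A command line tool for machine learning experiments in Python",
    ["web-framework", "python", "machine-learning"],
)
for topic, similarity in ranked:
    print(f"{topic}: {similarity:.3f}")
```

The appeal of this arrangement is that each side is embedded independently, so topic vectors can be computed once and reused across repositories.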

CLI with detailed predictions:

repo-seo phoenix --detailed
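
The final-score formula from the diagram, Σ(weight × P(positive)) - Σ(weight × P(negative)), is simple to write out. In the sketch below the action names follow the diagram, but the weights are placeholders chosen for the example; the README does not list the values repo-seo uses.

```python
from typing import Dict

# Hypothetical weights, chosen only to make the arithmetic easy to follow.
POSITIVE_WEIGHTS = {
    "star": 1.0, "fork": 0.5, "click": 0.25,
    "watch": 0.5, "clone": 0.5, "contribute": 1.0,
}
NEGATIVE_WEIGHTS = {"ignore": 0.5, "report": 2.0}


def final_score(action_probs: Dict[str, float]) -> float:
    """Weighted positive action probabilities minus weighted negative ones."""
    positive = sum(w * action_probs.get(action, 0.0)
                   for action, w in POSITIVE_WEIGHTS.items())
    negative = sum(w * action_probs.get(action, 0.0)
                   for action, w in NEGATIVE_WEIGHTS.items())
    return positive - negative


probs = {"star": 0.5, "fork": 0.25, "ignore": 0.5}
# positive = 1.0*0.5 + 0.5*0.25 = 0.625; negative = 0.5*0.5 = 0.25
print(final_score(probs))
```

Giving negative actions their own weights lets a strong "report" signal outweigh several mild positive ones.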

Trending Topics

Dynamic matching with GitHub's trending keywords:

from pathlib import Path

from repo_seo.pipeline import TrendingTopicSuggester, get_trending_topics

# Get trending topics for Python
topics = get_trending_topics("python", max_topics=10)
print(topics)  # ['machine-learning', 'fastapi', 'langchain', ...]

# Get personalized suggestions for your repo
suggester = TrendingTopicSuggester()
suggestions = suggester.suggest(
    repo_path="./my-project",
    current_topics=["python", "cli"],
    languages=["Python"],
    # read_text closes the file for us, unlike a bare open().read()
    readme_content=Path("README.md").read_text(encoding="utf-8"),
)

for s in suggestions:
    print(f"{s['topic']}: {s['combined_score']:.1f}")
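
The `combined_score` field implies that more than one signal is blended per topic. One plausible way to do that is sketched here, under stated assumptions: a linear mix of a trending score and a content-relevance score, with topics the repo already has skipped. The function, its parameters, and the 50/50 default weighting are hypothetical and are not taken from TrendingTopicSuggester.

```python
from typing import Dict, List


def suggest_topics(
    trending: Dict[str, float],      # topic -> trending score (0-100)
    relevance: Dict[str, float],     # topic -> match with repo content (0-100)
    current_topics: List[str],
    trend_weight: float = 0.5,       # assumed blend factor
) -> List[Dict[str, object]]:
    current = {t.lower() for t in current_topics}
    suggestions = []
    for topic, trend_score in trending.items():
        if topic.lower() in current:
            continue  # no point suggesting a topic the repo already has
        combined = (trend_weight * trend_score
                    + (1 - trend_weight) * relevance.get(topic, 0.0))
        suggestions.append({"topic": topic, "combined_score": combined})
    return sorted(suggestions, key=lambda s: s["combined_score"], reverse=True)


demo = suggest_topics(
    trending={"python": 90, "fastapi": 80, "langchain": 60},
    relevance={"fastapi": 20, "langchain": 100},
    current_topics=["python"],
)
for s in demo:
    print(f"{s['topic']}: {s['combined_score']:.1f}")
```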

Pipeline Components

Component   Description
Source      Fetches candidates (LocalRepoSource, GitHubTrendingSource)
Hydrator    Enriches with features (ReadmeHydrator, TrendingHydrator)
Filter      Removes invalid items (QualityFilter, DuplicateFilter)
Scorer      Computes scores (ReadmeScorer, TopicScorer, TrendingScorer)
Selector    Picks top candidates (TopKSelector, DiversitySelector)
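
To illustrate how a diversity-aware selector might differ from a plain top-K selector, the sketch below caps how many results of each type can be chosen. It is a guess at the general idea behind a selector such as DiversitySelector, not its real behavior; `select_diverse` and the dictionary-based candidates are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def select_diverse(candidates: List[Dict], k: int, max_per_type: int = 2) -> List[Dict]:
    """Pick up to k candidates by score, allowing at most max_per_type per type."""
    picked: List[Dict] = []
    per_type: Dict[str, int] = defaultdict(int)
    for cand in sorted(candidates, key=lambda c: c["score"], reverse=True):
        if len(picked) >= k:
            break
        if per_type[cand["type"]] >= max_per_type:
            continue  # this type is already well represented
        picked.append(cand)
        per_type[cand["type"]] += 1
    return picked


pool = [
    {"id": "a", "type": "topic", "score": 90},
    {"id": "b", "type": "topic", "score": 85},
    {"id": "c", "type": "topic", "score": 80},
    {"id": "d", "type": "readme", "score": 70},
    {"id": "e", "type": "description", "score": 60},
]
print([c["id"] for c in select_diverse(pool, k=3)])
```

A plain top-K selection over the same pool would return three topic suggestions; the cap makes room for the README suggestion instead.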

Configuration

# Set API keys
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
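
Since the tool is described as working without API keys, a caller could check the environment and fall back to rule-based analysis when no key is set. The sketch below shows one way to do that. The `pick_provider` helper, the "rule-based" label, and the preference order are assumptions made for this example; only the two environment variable names come from this README.

```python
import os
from typing import Mapping, Optional

# (provider name, environment variable) in an assumed order of preference.
PROVIDER_ENV_VARS = (
    ("openai", "OPENAI_API_KEY"),
    ("anthropic", "ANTHROPIC_API_KEY"),
)


def pick_provider(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first provider with a non-empty key, else 'rule-based'."""
    env = os.environ if env is None else env
    for name, variable in PROVIDER_ENV_VARS:
        if env.get(variable, "").strip():
            return name
    return "rule-based"


print(pick_provider())
```

Passing the environment in as a mapping keeps the helper easy to test without modifying real environment variables.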

Development

pip install -e ".[dev]"
pytest
black repo_seo/
ruff check repo_seo/

License

MIT License - see LICENSE for details.
