
Citation portrait analysis tool: automatically crawls citing papers, identifies renowned scholars, and generates a visual HTML report.




CitationClaw: A Lightweight Engine for Discovering Scientific Impact through Citations

Turning Every Citation into Explainable Impact

License: CC BY-NC 4.0

Input paper titles (or import from Google Scholar profiles), and generate a full citation portrait report in minutes.

🚀 Contribute with PRs

CitationClaw is community-driven and PR-friendly.

📢 News

  • 2026-03-15: Released beta v1.0.6 — English README as default, Chinese switch at top, and usage flow linked to Guidelines Quick Start.
  • 2026-03-14: Released v1.0.5 — AI assistant widgets for UI/report pages and reliability fixes.
  • 2026-03-14: Released v1.0.4 — improved UI and introduced Basic/Advanced/Full service tiers.
  • 2026-03-12: Released v1.0 — first public release.

Key Features

  • 🧠 Five-Phase Citation Pipeline: crawl -> author intelligence -> export -> citing description -> dashboard.
  • 🎯 Renowned Scholar Focus: auto-identifies high-impact scholars and generates dedicated outputs.
  • Tiered Analysis Modes: Basic / Advanced / Full, trading off speed, cost, and depth.
  • 🔁 Resumable + Cache-Aware: supports resume-by-page, author cache, and citing-description cache.
  • 📊 Shareable HTML Report: standalone dashboard file, no extra server needed for viewing.
  • 🧩 Skills Runtime Inside: keeps five-phase logic while moving execution to modular skills.
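The resume-by-page behaviour listed above can be illustrated with a small checkpointing loop. This is a minimal sketch, not CitationClaw's actual implementation: `crawl_with_resume`, the `fetch_page` callback, and the JSON checkpoint file are all hypothetical names chosen for illustration.

```python
import json
import tempfile
from pathlib import Path


def crawl_with_resume(fetch_page, total_pages, checkpoint_path):
    """Fetch pages 0..total_pages-1, skipping pages already in the checkpoint.

    Hypothetical helper: fetch_page(page) returns JSON-serialisable data.
    """
    path = Path(checkpoint_path)
    # Load pages fetched by an earlier run, if a checkpoint exists.
    done = json.loads(path.read_text()) if path.exists() else {}
    for page in range(total_pages):
        key = str(page)  # JSON object keys are always strings
        if key in done:
            continue  # resume-by-page: this page was fetched earlier
        done[key] = fetch_page(page)
        # Persist after every page so an interruption loses at most one page.
        path.write_text(json.dumps(done))
    return [done[str(p)] for p in range(total_pages)]


if __name__ == "__main__":
    fetched = []

    def fake_fetch(page):
        fetched.append(page)
        return [f"citing-paper-{page}-{i}" for i in range(2)]

    with tempfile.TemporaryDirectory() as tmp:
        ckpt = Path(tmp) / "checkpoint.json"
        crawl_with_resume(fake_fetch, 2, ckpt)  # first run covers pages 0 and 1
        crawl_with_resume(fake_fetch, 4, ckpt)  # second run only fetches 2 and 3
        print(fetched)
```

Writing the checkpoint after each page trades a little I/O for the guarantee that a crash never repeats more than one paid request; the same idea would apply to an author cache keyed by author name.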

🏗️ Architecture

CitationClaw keeps deterministic business phases while using a skills-style runtime for orchestration.

UI/REST/WebSocket
      │
      ▼
TaskExecutor (Orchestrator)
      │
      ▼
Skills Runtime
  ├─ phase1_citation_fetch
  ├─ phase2_author_intel
  ├─ phase3_export
  ├─ phase4_citation_desc
  └─ phase5_report_generate
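
To make the diagram concrete, here is a minimal sketch of how a skills-style runtime might run the five phases in a fixed order. Only the phase names come from the diagram; the `skill` decorator, the `SKILLS` registry, and `run_pipeline` are hypothetical and do not describe CitationClaw's real classes.

```python
from typing import Callable, Dict, List

# Phase names taken from the diagram; everything else here is illustrative.
PHASE_ORDER: List[str] = [
    "phase1_citation_fetch",
    "phase2_author_intel",
    "phase3_export",
    "phase4_citation_desc",
    "phase5_report_generate",
]

Skill = Callable[[dict], dict]
SKILLS: Dict[str, Skill] = {}


def skill(name: str):
    """Decorator that registers a function as the skill for one phase."""
    def register(fn: Skill) -> Skill:
        SKILLS[name] = fn
        return fn
    return register


def run_pipeline(context: dict) -> dict:
    """Run the phases in their fixed order, threading one context dict through."""
    for name in PHASE_ORDER:
        context = SKILLS[name](context)
    return context


def make_placeholder(name: str) -> Skill:
    """Register a stand-in skill that only records that its phase ran."""
    @skill(name)
    def run(ctx: dict) -> dict:
        return {**ctx, "completed": ctx.get("completed", []) + [name]}
    return run


for phase in PHASE_ORDER:
    make_placeholder(phase)

result = run_pipeline({"title": "Example paper"})
print(result["completed"])
```

Keeping the phase order in one list is what makes the business flow deterministic, while the registry lets each phase be swapped or tested on its own.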

More details: Technical Report


📦 Install

Requires Python 3.10+ (Python 3.12 recommended).

Install from PyPI (recommended)

pip install citationclaw
citationclaw                  # default: 127.0.0.1:8000
citationclaw --port 8080      # custom port

Install from source

git clone https://github.com/VisionXLab/CitationClaw.git
cd CitationClaw
pip install -r requirements.txt
python start.py               # default: 127.0.0.1:8000
python start.py --port 8080

🚀 Quick Start

First-time users should follow the complete guide with screenshots in the project's Guidelines Quick Start.

⚙️ Configuration Highlights

  • Required keys:
    • ScraperAPI Key(s) for Google Scholar crawling
    • OpenAI-compatible API Key for LLM-based analysis
  • Recommended search model:
    • Keep gemini-3-flash-preview-search for search-capable stages
  • Service tiers:
    • Basic: lower cost and faster for first runs
    • Advanced: citing descriptions for renowned-scholar papers only
    • Full: citing descriptions for all citing papers
  • For papers with >1000 citations:
    • Enable year traverse mode
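
The tier and year-traverse guidance above can be read as two small decision rules, sketched below. The function and class names are hypothetical. Two points are assumptions not stated in this README: that the Basic tier skips citing descriptions entirely, and that year traverse mode means issuing one query per publication year.

```python
from dataclasses import dataclass
from typing import List, Tuple

YEAR_TRAVERSE_THRESHOLD = 1000  # from the guidance above: papers with >1000 citations


@dataclass
class CitingPaper:
    title: str
    by_renowned_scholar: bool


def papers_needing_description(tier: str, papers: List[CitingPaper]) -> List[CitingPaper]:
    """Select the citing papers that get a citing description under each tier."""
    if tier == "basic":
        return []  # assumption: Basic skips citing descriptions
    if tier == "advanced":
        return [p for p in papers if p.by_renowned_scholar]
    if tier == "full":
        return list(papers)
    raise ValueError(f"unknown tier: {tier!r}")


def should_use_year_traverse(citation_count: int) -> bool:
    """Recommend year traverse mode for heavily cited papers."""
    return citation_count > YEAR_TRAVERSE_THRESHOLD


def year_windows(first_year: int, last_year: int) -> List[Tuple[int, int]]:
    """Assumed behaviour: split the crawl into one (start, end) window per year."""
    return [(year, year) for year in range(first_year, last_year + 1)]


papers = [CitingPaper("A", True), CitingPaper("B", False)]
print([p.title for p in papers_needing_description("advanced", papers)])
print(should_use_year_traverse(2500), year_windows(2023, 2025))
```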

📁 Project Structure

citationclaw/
├── app/                 # FastAPI app, task orchestration, config, logs
├── core/                # scraping / search / export / dashboard engines
├── skills/              # skills runtime and five phase skills
├── static/              # frontend assets
├── templates/           # Jinja2 pages
docs/                    # docs and demos
test/                    # tests

📤 Outputs

Each run creates a timestamped folder, data/result-{timestamp}/, which usually includes:

  • paper_results.xlsx
  • paper_results_all_renowned_scholar.xlsx
  • paper_results_top-tier_scholar.xlsx
  • paper_results_with_citing_desc.xlsx
  • paper_results.json
  • paper_dashboard.html
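
For post-processing, a short script can locate the most recent run folder and check which of the files listed above are present. This is a sketch under assumptions: the helper names are hypothetical, the newest folder is picked by modification time because the exact timestamp format is not documented here, and nothing is assumed about the structure of paper_results.json beyond it being valid JSON.

```python
import json
from pathlib import Path

# File names as listed above; a given run may not produce all of them.
EXPECTED_FILES = [
    "paper_results.xlsx",
    "paper_results_all_renowned_scholar.xlsx",
    "paper_results_top-tier_scholar.xlsx",
    "paper_results_with_citing_desc.xlsx",
    "paper_results.json",
    "paper_dashboard.html",
]


def latest_result_dir(data_dir="data"):
    """Return the most recently modified data/result-* folder, or None."""
    candidates = [p for p in Path(data_dir).glob("result-*") if p.is_dir()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


def missing_outputs(result_dir):
    """List the expected output files that are absent from a run folder."""
    return [name for name in EXPECTED_FILES if not (Path(result_dir) / name).exists()]


def load_results(result_dir):
    """Parse paper_results.json; its schema is not assumed here."""
    with open(Path(result_dir) / "paper_results.json", encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    run_dir = latest_result_dir()
    if run_dir is None:
        print("no runs found under data/")
    else:
        print(run_dir, "missing:", missing_outputs(run_dir))
```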

🤝 Contribute & Roadmap

PRs are welcome and appreciated.

Suggested directions:

  • richer skill metadata and registry conventions
  • stronger retry and network-failure resilience
  • dashboard readability and UX improvements
  • tests for pipeline contracts and compatibility
  • provider/model compatibility presets

