Tool to automatically generate scripts that show usage of scientific code from scientific papers

PaperProbe

Automatically analyze and generate usage examples for scientific code from research papers

[Screenshot: PaperProbe TUI, a terminal user interface for analyzing scientific repositories]

What is PaperProbe?

PaperProbe bridges the gap between scientific papers and their associated code repositories. It automatically:

  • Scans research papers (arXiv URLs or local PDFs) for GitHub repository links
  • Analyzes repositories to understand structure, dependencies, and usage patterns
  • Generates usage examples using AI-powered code analysis
  • Provides insights on repository statistics, code quality, and documentation
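The link-scanning step above can be illustrated with a small sketch. This is not PaperProbe's actual implementation: the function name `find_github_repos` and the regular expression are assumptions made for illustration, and the pattern only approximates the characters GitHub allows in owner and repository names.

```python
import re

# Matches github.com/<owner>/<repo>. The character classes are an
# approximation of valid GitHub names, not an official grammar.
GITHUB_REPO_RE = re.compile(
    r"(?:https?://)?(?:www\.)?github\.com/([A-Za-z0-9-]+)/([A-Za-z0-9._-]+)",
    re.IGNORECASE,
)


def find_github_repos(text: str) -> list[str]:
    """Return unique repository URLs in order of first appearance (hypothetical helper)."""
    seen: set[str] = set()
    repos: list[str] = []
    for owner, repo in GITHUB_REPO_RE.findall(text):
        # Sentence-ending periods and a ".git" suffix often cling to URLs in paper text.
        repo = repo.rstrip(".")
        if repo.endswith(".git"):
            repo = repo[:-4]
        if not repo:
            continue
        url = f"https://github.com/{owner}/{repo}"
        key = url.lower()  # GitHub names are case-insensitive
        if key not in seen:
            seen.add(key)
            repos.append(url)
    return repos
```

Paths below the repository level (such as `/issues`) are dropped, so several links into the same repository collapse to one entry.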

Perfect for researchers, developers, and students who want to quickly understand and utilize code from scientific publications.

Features

Interactive TUI

  • Beautiful terminal interface built with Textual
  • Real-time progress updates and visual feedback
  • Keyboard shortcuts for efficient navigation

Multi-Source Support

  • arXiv papers: Direct URL scanning
  • Local PDFs: Parse papers from your filesystem
  • GitHub URLs: Direct repository analysis
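As a rough illustration of how the three source types could be told apart, here is a minimal sketch. The names `SourceKind` and `classify_source` are hypothetical and are not taken from the PaperProbe codebase; the check for local PDFs only inspects the file extension and does not verify that the file exists.

```python
from enum import Enum
from pathlib import Path
from urllib.parse import urlparse


class SourceKind(Enum):
    ARXIV = "arxiv"
    GITHUB = "github"
    LOCAL_PDF = "local_pdf"
    UNKNOWN = "unknown"


def classify_source(source: str) -> SourceKind:
    """Guess the source type from its shape alone (hypothetical helper)."""
    source = source.strip()
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        if host == "arxiv.org":
            return SourceKind.ARXIV
        if host == "github.com":
            return SourceKind.GITHUB
        return SourceKind.UNKNOWN
    # Anything that is not an http(s) URL is treated as a filesystem path.
    if Path(source).suffix.lower() == ".pdf":
        return SourceKind.LOCAL_PDF
    return SourceKind.UNKNOWN
```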

AI-Powered Analysis

  • Intelligent code structure analysis
  • Automated usage example generation
  • Context-aware documentation extraction

Repository Insights

  • GitHub statistics (stars, forks, contributors)
  • Dependency analysis
  • Code quality metrics
  • File structure visualization

Installation

Using pipx (Recommended)

pipx install paperprobe

Using pip

pip install paperprobe

From Source

git clone https://github.com/Brook-B-Nigatu/PaperProbe.git
cd PaperProbe
pip install -e .

[!TIP] You may need to specify the package version:

pipx install paperprobe==0.2.0

Prerequisites

  • Python: 3.12 (some dependencies do not work on Python 3.13)
  • API Keys: AI API key for AI-powered analysis
  • GitHub Token (optional): For enhanced GitHub API access
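A quick way to check the interpreter before installing is sketched below. It reads the prerequisite as "Python 3.12 only", which is an interpretation of the note about 3.13 rather than something the project states explicitly; `is_supported` is a hypothetical helper.

```python
import sys


def is_supported(version: tuple[int, int]) -> bool:
    """Assumption: only Python 3.12 is known to work with all dependencies."""
    return version == (3, 12)


if __name__ == "__main__":
    current = sys.version_info[:2]
    if is_supported(current):
        print(f"Python {current[0]}.{current[1]} should work.")
    else:
        print(f"Python {current[0]}.{current[1]} may not work; try 3.12.")
```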

Configuration

Environment Variables (Recommended)

PaperProbe requires API keys to function.

Add to your shell configuration file (~/.zshrc, ~/.bashrc, etc.):

export CONSTRUCTOR_KM_ID=""
export CONSTRUCTOR_API_KEY=""
export CONSTRUCTOR_API_URL=""
export GITHUB_TOKEN=""  # optional

Then reload your shell:

source ~/.zshrc
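To confirm the variables are visible to Python after reloading the shell, a small check like the following can help. The variable names come from the export lines above; treating an empty string as "missing" and the helper name `missing_variables` are assumptions, not documented PaperProbe behavior.

```python
import os

REQUIRED_VARS = ("CONSTRUCTOR_KM_ID", "CONSTRUCTOR_API_KEY", "CONSTRUCTOR_API_URL")
OPTIONAL_VARS = ("GITHUB_TOKEN",)


def missing_variables(env=None) -> list[str]:
    """Return required variable names that are unset or blank (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        print("Missing required variables: " + ", ".join(missing))
    else:
        print("All required variables are set.")
    for name in OPTIONAL_VARS:
        if not os.environ.get(name):
            print(f"Optional variable {name} is not set.")
```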

Usage

Launch the TUI

paperprobe

Basic Workflow

  1. Enter a source:
    • arXiv URL: https://arxiv.org/abs/2301.12345
    • Local PDF: /path/to/paper.pdf
    • GitHub URL: https://github.com/username/repo
  2. Scan for repositories: PaperProbe extracts GitHub links from papers
  3. Select a repository: Choose from the discovered repos
  4. Choose analysis mode:
    • Basic: Quick repository overview
    • Deep: Comprehensive analysis with usage examples
  5. View results: Interactive markdown display with insights

Keyboard Shortcuts

Shortcut    Action
Ctrl+S      Use sample URL
Ctrl+Q      Quit application
Enter       Submit input / Select item
↑/↓         Navigate lists

Sample Usage

Try the built-in sample by pressing Ctrl+S on the intro screen, or paste this URL:

https://github.com/Brook-B-Nigatu/PaperProbe

Project Structure

PaperProbe/
├── src/
│   ├── core/               # Core analysis logic
│   │   ├── llm_service.py  # AI service integration
│   │   └── task_manager.py # Async task orchestration
│   ├── github_repo/        # GitHub repository handling
│   ├── preprocessing_utilities/
│   │   └── pdf_parser.py   # PDF text extraction
│   ├── tool_providers/     # Analysis tool providers
│   └── ui/                 # Terminal UI components
│       ├── app.py          # Main TUI application
│       ├── controller.py   # Business logic
│       └── style.tcss      # TUI styling
├── pyproject.toml          # Project configuration
└── README.md

Development

Setup Development Environment

# Clone the repository
git clone https://github.com/Brook-B-Nigatu/PaperProbe.git
cd PaperProbe

# Install in editable mode
pip install -e .

# Or use uv for faster installs
uv pip install -e .

Linting and Formatting

# Run linter
ruff check .

# Format code
ruff format .

License

This project is licensed under the MIT License - see the LICENSE.txt file for details.


Made with ❤️ for the research community
