PaperForge SDK
Convert arXiv research papers into runnable Python implementations — in seconds.
PaperForge reads an arXiv paper, extracts the methodology using Claude AI, and generates a working Python implementation — complete with section references, usage examples, and honest limitations.
Installation
pip install paperforge-sdk
Quick Start
from paperforge import PaperForge
pf = PaperForge(base_url="http://localhost:8000") # or your deployed URL
# Analyze a paper
analysis = pf.analyze("https://arxiv.org/abs/1706.03762")
print(analysis.key_algorithm) # Transformer
print(analysis.implementation_difficulty) # hard
print(analysis.reported_results) # {'WMT 2014 EN-DE BLEU': '28.4'}
# Generate a Python implementation
code = pf.generate("https://arxiv.org/abs/1706.03762")
print(code.strategy) # core (hard papers get core mechanism only)
print(code.estimated_lines) # 67
code.save("transformer.py") # save to disk
# Full pipeline in one call
result = pf.paper("https://arxiv.org/abs/1706.03762")
result.save_code("output/")
print(f"Used {result.total_tokens:,} tokens")
Features
| Feature | Description |
|---|---|
| Paper analysis | Extracts algorithm, datasets, metrics, novelty, reproducibility notes |
| Code generation | Generates runnable Python with paper section references |
| Smart strategy | Easy → full impl · Hard → core mechanism · Non-implementable → skeleton |
| PDF upload | Works with local PDFs, not just arXiv |
| Benchmarking | Run generated code against your CSV dataset via E2B sandbox |
| Type-safe | Full dataclass models with properties and methods |
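The "Smart strategy" row can be read as a mapping from paper difficulty to generation strategy. The following is a minimal sketch of that mapping, assuming the three difficulty levels ("easy", "medium", "hard") that appear elsewhere in this README. The function name `choose_strategy` and the `implementable` flag are hypothetical and not part of the SDK; the actual selection happens on the server and may use other signals.

```python
def choose_strategy(difficulty: str, implementable: bool = True) -> str:
    """Map a difficulty label to a generation strategy (illustrative only)."""
    # Theory or otherwise non-implementable papers get documented stubs.
    if not implementable:
        return "skeleton"
    # Hard papers get only the core mechanism.
    if difficulty == "hard":
        return "core"
    # Easy and medium papers get a complete implementation.
    if difficulty in ("easy", "medium"):
        return "full"
    raise ValueError(f"Unknown difficulty: {difficulty!r}")


print(choose_strategy("hard"))    # core
print(choose_strategy("medium"))  # full
```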
API Reference
PaperForge(base_url, timeout)
Main client class.
# Against local dev server
pf = PaperForge(base_url="http://localhost:8000")
# Against deployed API
pf = PaperForge(base_url="https://paperforge.onrender.com")
# As context manager (auto-closes HTTP client)
with PaperForge(base_url="http://localhost:8000") as pf:
    analysis = pf.analyze("1706.03762")
pf.analyze(url) → PaperAnalysis
Fetch and analyze any arXiv paper.
analysis = pf.analyze("https://arxiv.org/abs/1706.03762")
# or bare ID:
analysis = pf.analyze("1706.03762")
print(analysis.title) # "Attention Is All You Need"
print(analysis.key_algorithm) # "Transformer"
print(analysis.implementation_difficulty) # "hard"
print(analysis.is_hard) # True
print(analysis.datasets_used) # ["WMT 2014 English-German", ...]
print(analysis.evaluation_metrics) # ["BLEU"]
print(analysis.reported_results) # {"WMT 2014 EN-DE BLEU": "28.4"}
print(analysis.dependencies) # ["torch", "numpy"]
print(analysis.reproducibility_notes) # "Full hyperparameters in appendix..."
print(analysis.tokens_used) # 12868
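Since `analyze` accepts either a full URL or a bare ID, it can help to normalize references on your side before caching or logging them. The helper below is a hypothetical sketch, not the SDK's own parsing logic. It assumes new-style arXiv identifiers (such as 1706.03762), drops any version suffix, and does not handle old-style IDs like hep-th/9901001.

```python
import re

# Hypothetical pattern: optional abs/ or pdf/ URL prefix, a new-style ID,
# an optional version suffix (v2), and an optional .pdf extension.
_ARXIV_REF = re.compile(
    r"(?:https?://arxiv\.org/(?:abs|pdf)/)?(\d{4}\.\d{4,5})(?:v\d+)?(?:\.pdf)?"
)


def normalize_arxiv_id(ref: str) -> str:
    """Return the bare arXiv ID from a URL or an ID string."""
    match = _ARXIV_REF.fullmatch(ref.strip())
    if match is None:
        raise ValueError(f"Not a recognizable arXiv reference: {ref!r}")
    return match.group(1)


print(normalize_arxiv_id("https://arxiv.org/abs/1706.03762"))  # 1706.03762
print(normalize_arxiv_id("1706.03762"))                        # 1706.03762
```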
PaperAnalysis properties:
- .is_hard: True if difficulty is "hard"
- .is_easy: True if difficulty is "easy"
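To illustrate how such properties sit on a dataclass model, here is a minimal stand-in. `PaperAnalysisSketch` is a hypothetical class written for this example. It carries only a few of the fields shown above and is not the SDK's actual `PaperAnalysis` definition.

```python
from dataclasses import dataclass, field


@dataclass
class PaperAnalysisSketch:
    """Illustrative stand-in for an analysis result (not the real model)."""

    title: str
    key_algorithm: str
    implementation_difficulty: str  # assumed to be "easy", "medium", or "hard"
    dependencies: list = field(default_factory=list)

    @property
    def is_hard(self) -> bool:
        return self.implementation_difficulty == "hard"

    @property
    def is_easy(self) -> bool:
        return self.implementation_difficulty == "easy"


analysis = PaperAnalysisSketch(
    title="Attention Is All You Need",
    key_algorithm="Transformer",
    implementation_difficulty="hard",
)
print(analysis.is_hard)  # True
print(analysis.is_easy)  # False
```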
pf.generate(url) → GeneratedCode
Generate a Python implementation from an arXiv paper.
code = pf.generate("https://arxiv.org/abs/1603.02754")
print(code.strategy) # "full" (XGBoost is medium difficulty)
print(code.estimated_lines) # 85
print(code.explanation) # "Implements XGBoost gradient boosting..."
print(code.install_command) # "pip install sklearn numpy"
print(code.limitations) # "No distributed training, no GPU support..."
print(code.code) # Full Python source code
# Save to file
code.save("xgboost_impl.py") # saves to file
code.save("output/") # saves as paperforge_implementation.py
code.print_usage() # prints usage example to stdout
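The two `save` calls above differ only in whether the target is a file or a directory. One plausible way to resolve that is sketched below. `resolve_save_path` is a hypothetical helper, and treating a trailing slash or an existing directory as "directory" is an assumption; only the default filename comes from the example above.

```python
from pathlib import Path

# Default filename taken from the save("output/") example above.
DEFAULT_FILENAME = "paperforge_implementation.py"


def resolve_save_path(target: str) -> Path:
    """Return the file path that a save to `target` would write to."""
    path = Path(target)
    # Assumption: a trailing separator or an existing directory means
    # "save inside this directory under the default filename".
    if target.endswith(("/", "\\")) or path.is_dir():
        return path / DEFAULT_FILENAME
    return path


print(resolve_save_path("output/").name)          # paperforge_implementation.py
print(resolve_save_path("xgboost_impl.py").name)  # xgboost_impl.py
```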
Generation strategies:
- "full": complete implementation (easy/medium papers)
- "core": core mechanism only (hard papers like Transformer, BERT)
- "skeleton": documented stubs (non-implementable or theory papers)
pf.generate_from_analysis(analysis) → GeneratedCode
Skip re-parsing when you already have an analysis.
analysis = pf.analyze("1706.03762")
# Regenerate without fetching the paper again:
code = pf.generate_from_analysis(analysis)
pf.paper(url) → PaperResult
Full pipeline: analyze + generate in one call.
result = pf.paper("https://arxiv.org/abs/1706.03762")
print(result.arxiv_id) # "1706.03762"
print(result.analysis.key_algorithm) # "Transformer"
print(result.code.strategy) # "core"
print(result.total_tokens) # 14968
result.save_code("output/") # save generated code
result.save_code("transformer_impl.py") # save to specific file
pf.analyze_pdf(path) → PaperAnalysis
Analyze a local PDF file.
analysis = pf.analyze_pdf("papers/my_paper.pdf")
print(analysis.title)
pf.benchmark(csv_path, analysis, code) → BenchmarkResult
Run generated code against your dataset in an E2B cloud sandbox.
Requires E2B_API_KEY configured on the server.
analysis = pf.analyze("https://arxiv.org/abs/1603.02754")
code = pf.generate_from_analysis(analysis)
result = pf.benchmark("data/iris.csv", analysis, code)
print(result.status) # "success"
print(result.dataset_rows) # 150
print(result.interpretation) # Claude's plain-English analysis
print(result.execution_time_ms) # 6690
for metric in result.metrics:
    print(metric.name, metric.your_value, metric.paper_value, metric.gap_pct)
    print(metric.beat_paper) # True/False/None
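The README does not define how `gap_pct` and `beat_paper` are computed. The sketch below shows one plausible reading, assuming `gap_pct` is the relative difference from the paper's value in percent and that `beat_paper` is `None` when the paper reports no comparable value. The function names and the `higher_is_better` flag are hypothetical.

```python
from typing import Optional


def gap_pct(your_value: float, paper_value: Optional[float]) -> Optional[float]:
    """Relative gap to the paper's value in percent (assumed definition)."""
    if paper_value is None or paper_value == 0:
        return None  # no meaningful relative gap
    return (your_value - paper_value) / abs(paper_value) * 100.0


def beat_paper(
    your_value: float,
    paper_value: Optional[float],
    higher_is_better: bool = True,
) -> Optional[bool]:
    """True/False when comparable, None when the paper reports no value."""
    if paper_value is None:
        return None
    if higher_is_better:
        return your_value > paper_value
    return your_value < paper_value


print(gap_pct(50.0, 100.0))    # -50.0
print(beat_paper(0.97, 0.95))  # True
print(beat_paper(0.97, None))  # None
```

For error-style metrics such as RMSE, pass `higher_is_better=False` so that a lower value counts as beating the paper.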
Error Handling
from paperforge import PaperForge
from paperforge.exceptions import (
    PaperNotFoundError,
    InvalidArxivURLError,
    TimeoutError,
    ConnectionError,
    APIError,
)
pf = PaperForge(base_url="http://localhost:8000")
try:
    analysis = pf.analyze("https://arxiv.org/abs/1706.03762")
except PaperNotFoundError:
    print("Paper not found on arXiv; check the ID")
except InvalidArxivURLError:
    print("Invalid arXiv URL format")
except TimeoutError:
    print("Request timed out; try increasing the timeout parameter")
except ConnectionError:
    print("Cannot connect to PaperForge API; is the server running?")
except APIError as e:
    print(f"API error {e.status_code}: {e}")
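Timeouts and connection failures are often transient, so a small retry wrapper can be useful around calls like `pf.analyze`. The sketch below is a generic pattern, not an SDK feature. `call_with_retries` is a hypothetical helper, and it uses Python's built-in `TimeoutError` and `ConnectionError` as stand-ins. With the SDK you would pass the exception classes from `paperforge.exceptions` via `retry_on`, since this README does not say whether they subclass the built-ins.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on the given exceptions with exponential backoff."""
    if retries < 0:
        raise ValueError("retries must be >= 0")
    attempt = 0
    while True:
        try:
            return fn()
        except retry_on:
            if attempt == retries:
                raise  # out of retries: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
            attempt += 1


# Usage with a stand-in function that fails twice before succeeding.
state = {"calls": 0}


def flaky_request() -> str:
    state["calls"] += 1
    if state["calls"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


print(call_with_retries(flaky_request, base_delay=0.01))  # ok
```

Errors such as a missing paper or an invalid URL are not retried here because repeating the request would not change the outcome.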
Self-Hosting
The SDK works against any PaperForge API instance. To run one locally:
# Clone the PaperForge backend
git clone https://github.com/GPREETHAMSAXON/PaperForge
cd PaperForge
# Install and configure
pip install -r requirements.txt
cp .env.example .env
# Add ANTHROPIC_API_KEY to .env
# Start the server
uvicorn app.main:app --reload
Then use the SDK:
pf = PaperForge(base_url="http://localhost:8000")
Examples
# Run the quickstart example (requires local server running)
python examples/quickstart.py
License
MIT © Saxon
Related
- PaperForge Web App — the full-stack product
- AutoViz AI — data analytics platform
- ModelPulse — ML model monitoring