
Production-ready GitHub repository forking built on PyGithub — retry, backoff, readiness polling, thread pool, background jobs, upstream remotes, and webhooks.


github-forker


Production-ready GitHub repository forking built on PyGithub.

A bare repo.create_fork() call returns immediately but the fork is not actually usable yet — GitHub builds the copy asynchronously in the background. github-forker handles everything you need for real-world use:

  • Idempotency: detects pre-existing forks so re-runs never crash
  • Retry + exponential backoff with jitter: survives 5xx, timeouts, rate limits, and GitHub's secondary ("abuse") rate limit
  • Fork-readiness polling: waits until the fork is actually populated before returning
  • Thread pool: fork_many() runs up to N forks concurrently
  • Background / fire-and-forget: fork_async() returns a ForkJob you can query or wait on from any thread
  • Streaming generator: fork_iter() yields results as each fork completes
  • Post-fork upstream remote: runs git remote add upstream <url> in your local clone
  • Post-fork webhook: registers a webhook for push/fork (or any other) events on the new fork
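
The retry bullet above names exponential backoff with jitter. A minimal sketch of how such a delay schedule could be computed, assuming the "full jitter" variant; the function name and its defaults are illustrative and are not taken from github-forker's source:

```python
import random


def backoff_delay(attempt, base=2.0, cap=120.0, rng=random):
    """Return a sleep time in seconds for a 1-based retry attempt.

    Hypothetical helper: the ceiling doubles with each attempt until it
    reaches `cap`, and the delay is drawn uniformly from [0, ceiling].
    """
    ceiling = min(cap, base * (2 ** (attempt - 1)))
    return rng.uniform(0.0, ceiling)


# Print one randomly drawn delay per attempt; values differ on every run.
for attempt in range(1, 9):
    print(attempt, round(backoff_delay(attempt), 2))
```

Randomizing the delay spreads out retries from concurrent workers so they do not all hit the API again at the same instant.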

Installation

pip install github-forker

Requires Python ≥ 3.9 and PyGithub ≥ 1.55.


Quick start

from github import Github
from pygithub_fork import GitHubForker

gh = Github("ghp_your_token")
forker = GitHubForker(gh)

result = forker.fork("octocat/Hello-World")
print(result.status)      # ForkStatus.READY
print(result.clone_url)   # https://github.com/you/Hello-World.git

Usage

1. fork() — synchronous, blocking

Forks one repo and blocks until it is confirmed ready on GitHub's side.

from pygithub_fork import GitHubForker

# gh is the authenticated Github client created in Quick start
forker = GitHubForker(gh)

result = forker.fork("octocat/Hello-World")
# result.status  → ForkStatus.READY
# result.fork    → github.Repository.Repository
# result.clone_url
# result.ssh_url
# result.already_existed  → False (or True on re-run)
# result.elapsed_seconds

Fork into an organization with a custom name:

result = forker.fork(
    "octocat/Hello-World",
    organization="my-org",
    name="hello-world-internal",
    default_branch_only=True,
)
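
fork() is described as blocking until the fork is confirmed ready. A rough sketch of what a readiness loop of that kind can look like, assuming a caller-supplied probe function; wait_until_ready and probe are hypothetical names, and the timeout and interval defaults only echo the ready_timeout_seconds and ready_poll_interval_seconds options:

```python
import time


def wait_until_ready(probe, timeout=120.0, interval=3.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Call probe() until it returns True or `timeout` seconds pass.

    `probe` is a hypothetical zero-argument callable; for a fork it could
    check that the new repository already lists at least one branch.
    Returns True if the probe succeeded, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Stand-in probe that reports "ready" on its third call.
calls = {"n": 0}


def fake_probe():
    calls["n"] += 1
    return calls["n"] >= 3


print(wait_until_ready(fake_probe, timeout=5.0, interval=0.01))
```

Using a monotonic clock for the deadline keeps the timeout correct even if the system wall clock is adjusted while waiting.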

2. fork_async() — fire-and-forget, background thread

Submit a fork to the background thread pool and return immediately.
Query the ForkJob handle from anywhere — the caller is never blocked.

job = forker.fork_async("octocat/Hello-World")

# --- do other things in the meantime ---

# Non-blocking status check:
print(job.done)          # True / False
print(job.status)        # ForkStatus.PENDING | CREATED | READY | FAILED …

# Access result without blocking (returns None if still running):
result = job.result      # ForkResult | None

# Block when you actually need the answer:
result = job.wait()      # blocks until done, returns ForkResult
result = job.wait(timeout=30)  # TimeoutError after 30s if not done

This covers the "fork now, check status later" case: submit with fork_async() and poll job.done / job.status from any thread at any time without blocking. The work runs on the background thread pool inside the same process, not in a separate OS process.

Concrete pattern — submit all, poll separately:

jobs = [forker.fork_async(repo) for repo in ["owner/a", "owner/b", "owner/c"]]

# ... do other work ...

# Later, collect all results:
results = [job.wait() for job in jobs]

# Or poll individually without waiting:
for job in jobs:
    if job.done:
        print(job.source_full_name, job.status)
    else:
        print(job.source_full_name, "still running")
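
To illustrate the idea behind a job handle with non-blocking and blocking accessors, here is a small sketch built on concurrent.futures. The Job class and slow_task function are stand-ins invented for this example; they are not ForkJob's actual implementation:

```python
import time
from concurrent.futures import ThreadPoolExecutor


class Job:
    """Hypothetical stand-in for a ForkJob-style handle around a Future."""

    def __init__(self, future, name):
        self._future = future
        self.name = name

    @property
    def done(self):
        # Non-blocking: True once the task has finished or raised.
        return self._future.done()

    @property
    def result(self):
        # Non-blocking: None while the task is still running.
        # If the task raised, the exception is re-raised here.
        return self._future.result() if self._future.done() else None

    def wait(self, timeout=None):
        # Blocking: raises a timeout error if `timeout` seconds elapse first.
        return self._future.result(timeout=timeout)


def slow_task(name):
    time.sleep(0.1)  # stands in for the fork API call plus readiness polling
    return f"{name}: ready"


with ThreadPoolExecutor(max_workers=4) as pool:
    jobs = [Job(pool.submit(slow_task, n), n) for n in ("owner/a", "owner/b")]
    print([job.done for job in jobs])   # tasks may still be running here
    print([job.wait() for job in jobs])
```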

3. fork_many() — bulk fork with thread pool

Fork a list of repositories in parallel (default) or sequentially:

results = forker.fork_many([
    "owner/repo-a",
    "owner/repo-b",
    "owner/repo-c",
])

for r in results:
    print(r.source_full_name, r.status, r.succeeded)

Parallel vs sequential:

# Parallel (default) — up to config.pool_workers concurrent forks
results = forker.fork_many(repos, parallel=True)

# Sequential — one at a time, guaranteed order, easier to debug
results = forker.fork_many(repos, parallel=False)

Per-item control with ForkRequest:

from pygithub_fork import ForkRequest

requests = [
    ForkRequest("owner/public-repo", organization="my-org"),
    ForkRequest("owner/private-repo", name="private-fork", default_branch_only=True),
    ForkRequest("owner/widget",       organization="other-org", register_webhook=True,
                webhook_url="https://ci.example.com/hooks/github"),
]
results = forker.fork_many(requests)

Stop on first failure:

results = forker.fork_many(repos, stop_on_error=True)

4. fork_iter() — streaming results (completion order)

Yields each ForkResult as soon as it finishes — useful for large batches where you want to start processing early:

for result in forker.fork_iter(["owner/a", "owner/b", "owner/c"]):
    # results arrive in completion order, not submission order
    print(result.source_full_name, result.status)
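
Completion-order streaming of this kind is commonly built on concurrent.futures.as_completed. The sketch below shows that general technique with made-up helper names (iter_completed, fake_fork); whether fork_iter() is implemented this way is an assumption:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def iter_completed(func, items, workers=4):
    """Yield func(item) for each item as soon as that call finishes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(func, item) for item in items]
        for future in as_completed(futures):
            yield future.result()


def fake_fork(spec):
    name, seconds = spec
    time.sleep(seconds)  # stands in for a fork that takes `seconds` to finish
    return name


# The slower item is submitted first but is expected to be yielded last.
for name in iter_completed(fake_fork, [("owner/a", 0.3), ("owner/b", 0.1)]):
    print(name)
```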

5. Post-fork: upstream remote

After forking, automatically run git remote add upstream <source_url> in a local clone:

from pygithub_fork import ForkerConfig

config = ForkerConfig(
    add_upstream_remote=True,
    local_clone_path="/path/to/your/local/clone",
)
forker = GitHubForker(gh, config)
result = forker.fork("octocat/Hello-World")

print(result.upstream_remote_added)  # True
# Now: git remote -v shows `upstream → https://github.com/octocat/Hello-World.git`

Override per-call:

result = forker.fork(
    "octocat/Hello-World",
    add_upstream_remote=True,
    local_path="/path/to/clone",
)
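
For reference, the git command named above can be run from Python with subprocess. This is a sketch of that step only, with hypothetical function names, and is not the library's own code:

```python
import subprocess


def upstream_command(source_url, remote="upstream"):
    """Build the git argv; passing a list avoids shell quoting problems."""
    return ["git", "remote", "add", remote, source_url]


def add_upstream(local_path, source_url):
    """Run the command inside `local_path`.

    Raises subprocess.CalledProcessError if git exits non-zero, for
    example when a remote named "upstream" already exists.
    """
    subprocess.run(
        upstream_command(source_url),
        cwd=local_path,
        check=True,
        capture_output=True,
        text=True,
    )


print(upstream_command("https://github.com/octocat/Hello-World.git"))
```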

6. Post-fork: webhook registration

Register a GitHub webhook on the new fork immediately after creation:

config = ForkerConfig(
    register_webhook=True,
    webhook_url="https://ci.example.com/hooks/github",
    webhook_events=["push", "pull_request", "fork"],
    webhook_secret="s3cr3t",
)
forker = GitHubForker(gh, config)
result = forker.fork("octocat/Hello-World")

print(result.webhook_id)   # GitHub hook ID

Override per-call:

result = forker.fork(
    "octocat/Hello-World",
    register_webhook=True,
    webhook_url="https://ci.example.com/hooks/github",
    webhook_events=["push"],
)
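
webhook_secret is what lets the receiving endpoint authenticate deliveries: GitHub signs each payload with HMAC-SHA256 using the secret and sends the result in the X-Hub-Signature-256 header. The receiver is outside what this README describes, so the following is only a sketch of the verification side, with hypothetical function names:

```python
import hashlib
import hmac


def sign(secret, body):
    """Compute the signature string expected in X-Hub-Signature-256."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def is_valid(secret, body, header_value):
    """Compare expected and received signatures in constant time."""
    return hmac.compare_digest(sign(secret, body), header_value)


body = b'{"action": "example"}'    # raw request bytes, before JSON parsing
header = sign("s3cr3t", body)      # what a correctly signed delivery carries
print(is_valid("s3cr3t", body, header))
print(is_valid("s3cr3t", body + b" ", header))
```

The signature must be computed over the raw request body; re-serializing parsed JSON can change the bytes and break the comparison.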

7. Advanced configuration

from pygithub_fork import ForkerConfig

config = ForkerConfig(
    # Retry
    max_retries=8,
    base_backoff_seconds=2.0,
    max_backoff_seconds=120.0,

    # Readiness polling
    wait_for_ready=True,
    ready_timeout_seconds=120.0,
    ready_poll_interval_seconds=3.0,

    # Thread pool size (keep ≤ 4 to avoid GitHub secondary rate limits)
    pool_workers=4,

    # Post-fork actions
    add_upstream_remote=True,
    local_clone_path="/repos/my-clone",
    register_webhook=True,
    webhook_url="https://ci.example.com/hooks/github",
    webhook_events=["push", "fork"],
    webhook_secret="s3cr3t",

    # Callbacks
    on_retry=lambda attempt, exc, sleep: print(f"retry {attempt}: {exc}"),
    on_fork_done=lambda result: print(f"done: {result.source_full_name}"),
)

forker = GitHubForker(gh, config)

8. Context manager (pool cleanup)

with GitHubForker(gh) as forker:
    results = forker.fork_many(repos)
# Thread pool is shut down cleanly here

ForkResult fields

Field                  Type               Description
source_full_name       str                "owner/repo" of the source
fork                   Repository | None  The forked repo object (PyGithub)
status                 ForkStatus         READY, CREATED, ALREADY_EXISTED, TIMED_OUT_WAITING, FAILED
already_existed        bool               True if the fork pre-existed
attempts               int                How many API attempts were made
elapsed_seconds        float              Wall time from call to return
clone_url              str | None         HTTPS clone URL of the fork
ssh_url                str | None         SSH URL of the fork
upstream_remote_added  bool               Whether git remote add upstream ran
webhook_id             int | None         GitHub hook ID if registered
error                  Exception | None   Set on failure; None on success
succeeded              bool               fork is not None and error is None

ForkJob fields / methods (fork_async)

Member                 Description
.done                  bool; non-blocking check
.status                ForkStatus; PENDING while running, real status when done
.result                ForkResult | None; non-blocking, None if still running
.wait(timeout=None)    Block and return ForkResult; raises ForkError on failure
.source_full_name      The "owner/repo" string passed in

Exception hierarchy

ForkError
├── ForkTimeoutError       # readiness timeout
├── ForkPermissionError    # 401/403 (distinct from secondary rate limit)
├── RepositoryNotFoundError
├── WebhookError           # webhook registration failed
└── UpstreamRemoteError    # git remote add upstream failed
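
Because every exception in the tree derives from ForkError, handlers can go from specific to general. The sketch below redefines stand-in classes locally so that it runs without the package installed; the classes mirror the tree above but are not the library's source, and classify is a hypothetical helper:

```python
class ForkError(Exception):
    """Stand-in base class; catching it covers every subclass below."""


class ForkTimeoutError(ForkError): ...
class ForkPermissionError(ForkError): ...
class RepositoryNotFoundError(ForkError): ...
class WebhookError(ForkError): ...
class UpstreamRemoteError(ForkError): ...


def classify(action):
    """Run `action` and label the outcome, most specific handler first."""
    try:
        action()
    except ForkTimeoutError:
        return "timeout"
    except ForkPermissionError:
        return "permission"
    except ForkError:
        return "other fork error"
    return "ok"


def raiser(exc):
    """Return a zero-argument function that raises `exc` when called."""
    def run():
        raise exc
    return run


print(classify(raiser(WebhookError("hook rejected"))))
```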

License

MIT © Hadi Cahyadi

👤 Author

Hadi Cahyadi

