Production-ready GitHub repository forking built on PyGithub — retry, backoff, readiness polling, thread pool, background jobs, upstream remotes, and webhooks.

github-forker

Production-ready GitHub repository forking built on PyGithub.

A bare repo.create_fork() call returns immediately but the fork is not actually usable yet — GitHub builds the copy asynchronously in the background. github-forker handles everything you need for real-world use:

  • Idempotency: detects pre-existing forks so re-runs never crash
  • Retry with exponential backoff and jitter: survives 5xx, timeouts, rate limits, and GitHub's secondary ("abuse") rate limit
  • Fork-readiness polling: waits until the fork is actually populated before returning
  • Thread pool: fork_many() runs up to N forks concurrently
  • Background / fire-and-forget: fork_async() returns a ForkJob you can query or wait on from any thread
  • Streaming generator: fork_iter() yields results as each fork completes
  • Post-fork upstream remote: runs git remote add upstream <url> in your local clone
  • Post-fork webhook: registers GitHub push/fork (or any) events on the new fork
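
To make the readiness-polling idea concrete, here is a minimal standard-library sketch of a poll-until-ready loop. It is an illustration only, not github-forker's implementation: `wait_until_ready` and the `probe` callable are hypothetical names, and the default timeout and interval are simply borrowed from the configuration example later in this README.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_seconds: float = 120.0,
    poll_interval_seconds: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call probe() until it returns True or the timeout elapses.

    Hypothetical helper: probe() stands in for "does the fork look populated?".
    """
    deadline = clock() + timeout_seconds
    while True:
        if probe():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(poll_interval_seconds, remaining))


if __name__ == "__main__":
    # Hypothetical probe that reports "ready" on its third call.
    answers = iter([False, False, True])
    print(wait_until_ready(lambda: next(answers), poll_interval_seconds=0.01))
```

Passing `sleep` and `clock` as parameters is a design choice for the sketch: it lets the loop be exercised in tests without real waiting.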

Installation

pip install github-forker

Requires Python ≥ 3.9 and PyGithub ≥ 1.55.


Quick start

from github import Github
from pygithub_fork import GitHubForker

gh = Github("ghp_your_token")
forker = GitHubForker(gh)

result = forker.fork("octocat/Hello-World")
print(result.status)      # ForkStatus.READY
print(result.clone_url)   # https://github.com/you/Hello-World.git

Usage

1. fork() — synchronous, blocking

Forks one repo and blocks until it is confirmed ready on GitHub's side.

from pygithub_fork import GitHubForker, ForkerConfig

forker = GitHubForker(gh)

result = forker.fork("octocat/Hello-World")
# result.status  → ForkStatus.READY
# result.fork    → github.Repository.Repository
# result.clone_url
# result.ssh_url
# result.already_existed  → False (or True on re-run)
# result.elapsed_seconds

Fork into an organization with a custom name:

result = forker.fork(
    "octocat/Hello-World",
    organization="my-org",
    name="hello-world-internal",
    default_branch_only=True,
)

2. fork_async(): fire-and-forget on a background thread

Submit a fork to the background thread pool and return immediately.
Query the ForkJob handle from anywhere — the caller is never blocked.

job = forker.fork_async("octocat/Hello-World")

# --- do other things in the meantime ---

# Non-blocking status check:
print(job.done)          # True / False
print(job.status)        # ForkStatus.PENDING | CREATED | READY | FAILED …

# Access result without blocking (returns None if still running):
result = job.result      # ForkResult | None

# Block when you actually need the answer:
result = job.wait()      # blocks until done, returns ForkResult
result = job.wait(timeout=30)  # TimeoutError after 30s if not done

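This covers the case of forking now and checking status later from elsewhere in the program: submit with fork_async(), then poll job.done / job.status from any thread at any time without blocking. The work runs on a background thread pool in the same process, not in a separate OS process.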

Concrete pattern — submit all, poll separately:

jobs = [forker.fork_async(repo) for repo in ["owner/a", "owner/b", "owner/c"]]

# ... do other work ...

# Later, collect all results:
results = [job.wait() for job in jobs]

# Or poll individually without waiting:
for job in jobs:
    if job.done:
        print(job.source_full_name, job.status)
    else:
        print(job.source_full_name, "still running")
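
For readers curious how a handle with these semantics can be built, the sketch below wraps a `concurrent.futures.Future`. It is an assumption-laden illustration, not the library's ForkJob: `JobHandle` and `submit` are hypothetical names, and only the `done` / `result` / `wait` behaviour described above is modelled.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Optional


class JobHandle:
    """Hypothetical stand-in for a ForkJob-style handle around a Future."""

    def __init__(self, name: str, future: Future) -> None:
        self.source_full_name = name
        self._future = future

    @property
    def done(self) -> bool:
        # Non-blocking: True once the background call has finished.
        return self._future.done()

    @property
    def result(self) -> Optional[Any]:
        # Non-blocking: None while the call is still running.
        # If the call raised, the exception is re-raised here.
        return self._future.result() if self._future.done() else None

    def wait(self, timeout: Optional[float] = None) -> Any:
        # Blocking: raises concurrent.futures.TimeoutError if the
        # timeout passes before the call finishes.
        return self._future.result(timeout=timeout)


def submit(pool: ThreadPoolExecutor, name: str, work: Callable[[str], Any]) -> JobHandle:
    """Run work(name) on the pool and hand back a handle immediately."""
    return JobHandle(name, pool.submit(work, name))


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        job = submit(pool, "owner/a", lambda repo: f"forked {repo}")
        print(job.wait(timeout=5))
        print(job.done)
```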

3. fork_many() — bulk fork with thread pool

Fork a list of repositories in parallel (default) or sequentially:

results = forker.fork_many([
    "owner/repo-a",
    "owner/repo-b",
    "owner/repo-c",
])

for r in results:
    print(r.source_full_name, r.status, r.succeeded)
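
After a bulk run it is often useful to split successes from failures. The sketch below does that using only the `source_full_name`, `succeeded`, and `error` fields listed in the ForkResult table. `Outcome` is a hypothetical stand-in so the example runs without the package or a token; `summarize` is likewise an illustrative name, not part of the library.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class Outcome:
    """Hypothetical stand-in exposing three fields named in the ForkResult table."""

    source_full_name: str
    succeeded: bool
    error: Optional[Exception] = None


def summarize(results: Iterable[Outcome]) -> Tuple[List[str], Dict[str, str]]:
    """Return (names that succeeded, {name: error text} for failures)."""
    ok: List[str] = []
    failed: Dict[str, str] = {}
    for r in results:
        if r.succeeded:
            ok.append(r.source_full_name)
        else:
            failed[r.source_full_name] = repr(r.error)
    return ok, failed


if __name__ == "__main__":
    sample = [
        Outcome("owner/repo-a", True),
        Outcome("owner/repo-b", False, RuntimeError("boom")),
    ]
    ok, failed = summarize(sample)
    print("ok:", ok)
    print("failed:", failed)
```

Because the function only reads attributes, it should work unchanged on any objects that expose the same three fields.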

Parallel vs sequential:

# Parallel (default) — up to config.pool_workers concurrent forks
results = forker.fork_many(repos, parallel=True)

# Sequential — one at a time, guaranteed order, easier to debug
results = forker.fork_many(repos, parallel=False)

Per-item control with ForkRequest:

from pygithub_fork import ForkRequest

requests = [
    ForkRequest("owner/public-repo", organization="my-org"),
    ForkRequest("owner/private-repo", name="private-fork", default_branch_only=True),
    ForkRequest("owner/widget",       organization="other-org", register_webhook=True,
                webhook_url="https://ci.example.com/hooks/github"),
]
results = forker.fork_many(requests)

Stop on first failure:

results = forker.fork_many(repos, stop_on_error=True)

4. fork_iter() — streaming results (completion order)

Yields each ForkResult as soon as it finishes — useful for large batches where you want to start processing early:

for result in forker.fork_iter(["owner/a", "owner/b", "owner/c"]):
    # results arrive in completion order, not submission order
    print(result.source_full_name, result.status)
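
Completion-order streaming of this kind is commonly built on `concurrent.futures.as_completed`. The following sketch shows the general technique under that assumption; whether fork_iter() is implemented this way is not stated here, and `iter_completed` and `fake_fork` are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def iter_completed(
    items: Iterable[str],
    work: Callable[[str], T],
    max_workers: int = 4,
) -> Iterator[T]:
    """Yield work(item) results as each one finishes, not in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(work, item) for item in items]
        for future in as_completed(futures):
            # result() re-raises any exception from the worker.
            yield future.result()


def fake_fork(name: str) -> str:
    # Hypothetical workload: names ending in "slow" take noticeably longer.
    time.sleep(0.3 if name.endswith("slow") else 0.0)
    return name


if __name__ == "__main__":
    for finished in iter_completed(["owner/slow", "owner/fast"], fake_fork):
        print(finished)
```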

5. Post-fork: upstream remote

After forking, automatically run git remote add upstream <source_url> in a local clone:

from pygithub_fork import ForkerConfig

config = ForkerConfig(
    add_upstream_remote=True,
    local_clone_path="/path/to/your/local/clone",
)
forker = GitHubForker(gh, config)
result = forker.fork("octocat/Hello-World")

print(result.upstream_remote_added)  # True
# Now: git remote -v shows `upstream → https://github.com/octocat/Hello-World.git`
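
The equivalent manual step can be sketched with `subprocess`. This is an illustration of the git command named above, not the library's code: `upstream_command` and `add_upstream` are hypothetical helpers, and the `run` parameter exists only so the example can be tested without git installed.

```python
import subprocess
from typing import Any, Callable, List


def upstream_command(clone_path: str, source_url: str) -> List[str]:
    # `git -C <path>` runs the command as if started inside that directory.
    return ["git", "-C", clone_path, "remote", "add", "upstream", source_url]


def add_upstream(
    clone_path: str,
    source_url: str,
    run: Callable[..., Any] = subprocess.run,
) -> bool:
    """Return True when git exits with status 0."""
    completed = run(
        upstream_command(clone_path, source_url),
        capture_output=True,
        text=True,
    )
    return completed.returncode == 0


if __name__ == "__main__":
    print(upstream_command("/path/to/your/local/clone",
                           "https://github.com/octocat/Hello-World.git"))
```

Note that `git remote add` exits with a non-zero status when a remote named upstream already exists, so a caller that wants re-runs to be harmless would need to check for that case first.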

Override per-call:

result = forker.fork(
    "octocat/Hello-World",
    add_upstream_remote=True,
    local_path="/path/to/clone",
)

6. Post-fork: webhook registration

Register a GitHub webhook on the new fork immediately after creation:

config = ForkerConfig(
    register_webhook=True,
    webhook_url="https://ci.example.com/hooks/github",
    webhook_events=["push", "pull_request", "fork"],
    webhook_secret="s3cr3t",
)
forker = GitHubForker(gh, config)
result = forker.fork("octocat/Hello-World")

print(result.webhook_id)   # GitHub hook ID
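
For orientation, registering a webhook by hand with PyGithub goes through `Repository.create_hook(name, config, events, active)`. The sketch below only builds the arguments for that call so it can run without a token; `build_hook_kwargs` is a hypothetical helper, and whether github-forker assembles the request this way internally is an assumption.

```python
from typing import Any, Dict, Optional, Sequence


def build_hook_kwargs(
    url: str,
    events: Sequence[str],
    secret: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for PyGithub's Repository.create_hook()."""
    config = {"url": url, "content_type": "json"}
    if secret:
        # GitHub uses the secret to sign each delivery it sends.
        config["secret"] = secret
    # Repository webhooks always use the hook name "web".
    return {"name": "web", "config": config, "events": list(events), "active": True}


if __name__ == "__main__":
    kwargs = build_hook_kwargs(
        "https://ci.example.com/hooks/github",
        ["push", "pull_request", "fork"],
        secret="s3cr3t",
    )
    print(kwargs)
    # With a real PyGithub Repository object `fork`, one could then call:
    #     hook = fork.create_hook(**kwargs)
    #     print(hook.id)
```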

Override per-call:

result = forker.fork(
    "octocat/Hello-World",
    register_webhook=True,
    webhook_url="https://ci.example.com/hooks/github",
    webhook_events=["push"],
)

7. Advanced configuration

from pygithub_fork import ForkerConfig

config = ForkerConfig(
    # Retry
    max_retries=8,
    base_backoff_seconds=2.0,
    max_backoff_seconds=120.0,

    # Readiness polling
    wait_for_ready=True,
    ready_timeout_seconds=120.0,
    ready_poll_interval_seconds=3.0,

    # Thread pool size (keep ≤ 4 to avoid GitHub secondary rate limits)
    pool_workers=4,

    # Post-fork actions
    add_upstream_remote=True,
    local_clone_path="/repos/my-clone",
    register_webhook=True,
    webhook_url="https://ci.example.com/hooks/github",
    webhook_events=["push", "fork"],
    webhook_secret="s3cr3t",

    # Callbacks
    on_retry=lambda attempt, exc, sleep: print(f"retry {attempt}: {exc}"),
    on_fork_done=lambda result: print(f"done: {result.source_full_name}"),
)

forker = GitHubForker(gh, config)
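
To give a feel for how base_backoff_seconds and max_backoff_seconds might interact, the sketch below computes "full jitter" delays: the ceiling doubles per attempt up to the cap, and the actual sleep is drawn uniformly below it. This is one common variant, shown as an assumption; the exact formula github-forker uses is not documented here, and both function names are hypothetical.

```python
import random
from typing import Optional


def backoff_ceiling(attempt: int, base: float = 2.0, cap: float = 120.0) -> float:
    """Upper bound for the sleep before a 0-based retry attempt."""
    return min(cap, base * (2 ** attempt))


def backoff_delay(
    attempt: int,
    base: float = 2.0,
    cap: float = 120.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Full-jitter delay: uniform between 0 and the capped ceiling."""
    if rng is None:
        rng = random.Random()
    return rng.uniform(0.0, backoff_ceiling(attempt, base, cap))


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demo repeatable
    for attempt in range(8):
        ceiling = backoff_ceiling(attempt)
        delay = backoff_delay(attempt, rng=rng)
        print(f"attempt {attempt}: ceiling {ceiling:.0f}s, sleep {delay:.2f}s")
```

The jitter spreads retries from concurrent workers over time, which matters when several pool threads hit the same rate limit at once.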

8. Context manager (pool cleanup)

with GitHubForker(gh) as forker:
    results = forker.fork_many(repos)
# Thread pool is shut down cleanly here
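
The general pattern behind this kind of cleanup is an object that owns a `ThreadPoolExecutor` and shuts it down in `__exit__`. The sketch below illustrates that pattern with a hypothetical `PooledRunner` class; it is not the library's implementation.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable


class PooledRunner:
    """Hypothetical object that owns a thread pool and releases it on exit."""

    def __init__(self, workers: int = 4) -> None:
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def submit(self, fn: Callable[..., Any], *args: Any) -> Future:
        return self._pool.submit(fn, *args)

    def __enter__(self) -> "PooledRunner":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # wait=True blocks until already-submitted work has finished.
        self._pool.shutdown(wait=True)


if __name__ == "__main__":
    with PooledRunner(workers=2) as runner:
        future = runner.submit(pow, 2, 5)
    # The pool is closed here, but finished results remain readable.
    print(future.result())
```

Once the block exits, further `submit` calls raise `RuntimeError`, which is the standard executor behaviour after shutdown.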

ForkResult fields

Field                  Type               Description
source_full_name       str                "owner/repo" of the source
fork                   Repository | None  The forked repo object (PyGithub)
status                 ForkStatus         READY, CREATED, ALREADY_EXISTED, TIMED_OUT_WAITING, FAILED
already_existed        bool               True if the fork pre-existed
attempts               int                How many API attempts were made
elapsed_seconds        float              Wall time from call to return
clone_url              str | None         HTTPS clone URL of the fork
ssh_url                str | None         SSH URL of the fork
upstream_remote_added  bool               Whether git remote add upstream ran
webhook_id             int | None         GitHub hook ID if registered
error                  Exception | None   Set on failure; None on success
succeeded              bool               True when fork is not None and error is None

ForkJob fields / methods (fork_async)

Field / method       Description
.done                bool; non-blocking check
.status              ForkStatus; PENDING while running, real status when done
.result              ForkResult | None; non-blocking, None if still running
.wait(timeout=None)  Block and return ForkResult; raises ForkError on failure
.source_full_name    The "owner/repo" string passed in

Exception hierarchy

ForkError
├── ForkTimeoutError       # readiness timeout
├── ForkPermissionError    # 401/403 (distinct from secondary rate limit)
├── RepositoryNotFoundError
├── WebhookError           # webhook registration failed
└── UpstreamRemoteError    # git remote add upstream failed
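
Because every exception derives from ForkError, callers can handle specific cases first and fall back to the base class. The sketch below redeclares the tree locally as stand-ins so it runs without the package installed (the import path for the real classes is not shown in this README), and `advice` is a hypothetical helper whose messages are illustrative only.

```python
# Local stand-ins mirroring the tree above; real code would import
# the library's own classes instead of redefining them.
class ForkError(Exception):
    pass


class ForkTimeoutError(ForkError):
    pass


class ForkPermissionError(ForkError):
    pass


class RepositoryNotFoundError(ForkError):
    pass


class WebhookError(ForkError):
    pass


class UpstreamRemoteError(ForkError):
    pass


def advice(exc: Exception) -> str:
    """Map an exception to a short hint, most specific classes first."""
    if isinstance(exc, ForkTimeoutError):
        return "readiness timed out"
    if isinstance(exc, ForkPermissionError):
        return "check credentials and permissions"
    if isinstance(exc, RepositoryNotFoundError):
        return "check the owner/repo name"
    if isinstance(exc, ForkError):
        # Catches WebhookError, UpstreamRemoteError, and plain ForkError.
        return "fork-related failure"
    return "unrelated error"


if __name__ == "__main__":
    for exc in (ForkTimeoutError(), WebhookError(), ValueError()):
        print(type(exc).__name__, "->", advice(exc))
```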

License

MIT © Hadi Cahyadi

👤 Author

Hadi Cahyadi
