🕵️ OSINT tool to extract identities (names, emails, GitHub logins) from git repositories and the GitHub API.

Gitcolombo

OSINT tool that extracts identities — names, emails, and links between seemingly unrelated accounts — from git repositories and GitHub.

  • Python CLI (gitcolombo.py) — clones repos, walks git log, and can call the GitHub API for richer signals.
  • Web version (gitcolombo.html) — a single static HTML file; open it in a browser and query the GitHub API directly, no install.

For the full breakdown of where each email/name comes from (PGP keys, public events, commit search, commit-message trailers, etc.) see docs.md.

Web version

Hosted at https://gitcolombo.soxoj.com, or open gitcolombo.html locally. It is a single static HTML file that queries the GitHub API straight from your browser, so there is nothing to install and no backend.

Install

Requires Python 3.10+ and a working git binary. No third-party Python dependencies.

pip install gitcolombo

Or from source:

git clone https://github.com/Soxoj/gitcolombo
cd gitcolombo
pip install -e .

Usage

# from any git URL
gitcolombo -u https://github.com/Soxoj/maigret

# from a local directory, recursively
gitcolombo -d ./maigret -r

# clone and scan every public repo of a GitHub user/org
gitcolombo --nickname octocat

# API-only: find emails for a GitHub username without cloning
gitcolombo --search Soxoj

# change where remote repos get cloned (default: ./repos)
gitcolombo -u https://github.com/Soxoj/maigret --repos-dir ./clones

python -m gitcolombo works equivalently if you'd rather not put the script on $PATH.

Remote repositories are cloned into ./repos/ by default; override with --repos-dir. For batch cloning from GitLab and Bitbucket groups, use ghorg.
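To give a feel for what an API-only lookup can involve, here is a minimal sketch that pulls names and emails out of a GitHub commit search response (GET /search/commits with q=author:USERNAME). It is an illustration only, not gitcolombo's actual code: the function names fetch_commit_search and extract_identities are hypothetical, and the set of sources the real --search mode consults is described in docs.md.

```python
import json
import urllib.parse
import urllib.request

API = "https://api.github.com/search/commits"


def fetch_commit_search(username, token=None):
    """Query GitHub commit search for commits authored by `username`.

    Hypothetical helper: unauthenticated calls are heavily rate limited,
    so passing a personal access token is usually necessary in practice.
    """
    query = urllib.parse.urlencode({"q": f"author:{username}", "per_page": 100})
    req = urllib.request.Request(
        f"{API}?{query}",
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "identity-sketch",
        },
    )
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def extract_identities(payload):
    """Collect (role, name, email) tuples from a commit search payload."""
    found = set()
    for item in payload.get("items", []):
        commit = item.get("commit") or {}
        for role in ("author", "committer"):
            person = commit.get(role) or {}
            name, email = person.get("name"), person.get("email")
            if name and email:
                found.add((role, name, email.lower()))
    return sorted(found)


# Offline example with a hand-written payload in the documented shape.
sample_payload = {
    "items": [
        {"commit": {
            "author": {"name": "Mona", "email": "Mona@Home.example"},
            "committer": {"name": "GitHub", "email": "noreply@github.com"},
        }},
    ]
}
identities = extract_identities(sample_payload)
```

Keeping the network call separate from the parsing step makes the extraction logic easy to check against saved responses without touching the API.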

Output

  • Per-person details: name, email, author/committer counts, and other identities that may belong to the same person.
  • Emails that share a name.
  • Different names tied to the same email.
  • General statistics across the scanned repos.
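The two cross-reference views above (emails that share a name, names tied to the same email) can be sketched with two dictionaries built in one pass. This is a simplified illustration under the assumption that identities arrive as (name, email) pairs; the function name correlate and the sample data are made up, and gitcolombo's real output logic may differ.

```python
from collections import defaultdict


def correlate(identities):
    """Group (name, email) pairs in both directions.

    Returns two dicts: names seen with more than one email, and
    emails seen with more than one name.
    """
    emails_by_name = defaultdict(set)
    names_by_email = defaultdict(set)
    for name, email in identities:
        name = name.strip()
        email = email.strip().lower()  # treat emails case-insensitively
        emails_by_name[name].add(email)
        names_by_email[email].add(name)
    shared_names = {n: sorted(e) for n, e in emails_by_name.items() if len(e) > 1}
    shared_emails = {e: sorted(n) for e, n in names_by_email.items() if len(n) > 1}
    return shared_names, shared_emails


sample = [
    ("Alice Doe", "alice@corp.example"),
    ("Alice Doe", "alice@home.example"),
    ("adoe", "alice@home.example"),
    ("Bob", "bob@corp.example"),
]
shared_names, shared_emails = correlate(sample)
```

In this sample, "Alice Doe" links a work and a home address, and the home address in turn links to the handle "adoe", which is the kind of chain the report surfaces.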

Why it works

Developers often commit with one identity (e.g. a work account), then switch to another (e.g. a personal account) and run git commit --amend, forgetting that this rewrites the committer but leaves the original author in place. The author and committer fields of the same commit then name different identities, and that mismatch is exactly what gitcolombo correlates.
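The author/committer mismatch can be spotted with plain git log output. The sketch below is not gitcolombo's implementation; it assumes a local clone, uses git's %an/%ae/%cn/%ce format placeholders, and the helper names find_mismatches and scan_repo are hypothetical.

```python
import subprocess

# %x1f makes git emit an ASCII unit separator between fields,
# which is very unlikely to appear in a name or email.
LOG_FORMAT = "%H%x1f%an%x1f%ae%x1f%cn%x1f%ce"
SEP = "\x1f"


def find_mismatches(log_text):
    """Return commits whose author identity differs from the committer."""
    mismatches = []
    for line in log_text.split("\n"):
        parts = line.split(SEP)
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        sha, a_name, a_email, c_name, c_email = parts
        author = (a_name, a_email.lower())
        committer = (c_name, c_email.lower())
        if author != committer:
            mismatches.append({"sha": sha, "author": author, "committer": committer})
    return mismatches


def scan_repo(path):
    """Run git log in `path` and report author/committer mismatches."""
    result = subprocess.run(
        ["git", "-C", path, "log", "--all", f"--format={LOG_FORMAT}"],
        capture_output=True, text=True, check=True,
    )
    return find_mismatches(result.stdout)


# Offline example: the first commit was amended under a second identity.
sample_log = "\n".join([
    SEP.join(["a1", "Alice", "alice@corp.example", "Alice", "alice@home.example"]),
    SEP.join(["b2", "Bob", "bob@x.example", "Bob", "bob@x.example"]),
])
mismatched = find_mismatches(sample_log)
```

Not every mismatch is an identity leak: commits created through the GitHub web UI typically carry GitHub's own committer identity, so results need filtering before drawing conclusions.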

Short explainer on author vs. committer: https://stackoverflow.com/questions/18750808/difference-between-author-and-committer-in-git

Testing

The test suite uses only the standard library, with no third-party dependencies. From the repo root (after pip install -e .):

python3 -m unittest test_gitcolombo -v

The end-to-end test creates a real git repository in a temp directory, so a working git binary is required (the test is skipped if git is missing).

Tests run on every push and pull request via GitHub Actions (.github/workflows/tests.yml) across Python 3.10–3.13.

Roadmap

  • Total statistics for repos in a directory
  • GitHub support: clone all repos from account/group
  • GitHub support: extract links to accounts from commit info
  • GitHub support: API pagination
  • Exclude "system" accounts (e.g. noreply@github.com, @users.noreply.github.com)
  • Reverse mapping email → names (currently only name → emails)
  • Probabilistic graph links based on shared names/emails and Levenshtein distance
  • Other popular git platforms: GitLab, Bitbucket
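As an illustration of the "system" accounts item, the sketch below classifies an email as a GitHub system address, a per-user noreply address, or an ordinary address. It assumes the two commonly seen noreply shapes (login@users.noreply.github.com and ID+login@users.noreply.github.com); the function name classify_email and the category labels are hypothetical, not part of gitcolombo. Per-user noreply addresses are worth keeping rather than discarding outright, because they embed the GitHub login.

```python
import re

# Matches "login@users.noreply.github.com" and
# "12345+login@users.noreply.github.com" (bot logins may contain "[bot]").
NOREPLY_RE = re.compile(
    r"^(?:(?P<id>\d+)\+)?(?P<login>[a-z0-9\[\]-]+)@users\.noreply\.github\.com$"
)

# Addresses that identify GitHub itself rather than a person (assumed list).
SYSTEM_EMAILS = {"noreply@github.com"}


def classify_email(email):
    """Return (kind, login) where kind is 'system', 'noreply' or 'personal'."""
    email = email.strip().lower()
    if email in SYSTEM_EMAILS:
        return ("system", None)
    match = NOREPLY_RE.match(email)
    if match:
        # The login is lowercased; GitHub logins are case-insensitive.
        return ("noreply", match.group("login"))
    return ("personal", None)


kinds = [
    classify_email("noreply@github.com"),
    classify_email("12345+octocat@users.noreply.github.com"),
    classify_email("Octocat@users.noreply.github.com"),
    classify_email("dev@example.com"),
]
```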

Source Distribution

gitcolombo-0.3.0.tar.gz (15.7 kB view details)

Uploaded Source

Built Distribution

gitcolombo-0.3.0-py3-none-any.whl (15.3 kB view details)

Uploaded Python 3

File details

Details for the file gitcolombo-0.3.0.tar.gz.

File metadata

  • Download URL: gitcolombo-0.3.0.tar.gz
  • Size: 15.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gitcolombo-0.3.0.tar.gz
Algorithm Hash digest
SHA256 d57b37950e6597bac98dee10678d69100721354d241b93ca8fb1fbf573bb34ae
MD5 ca4a284201c039ebcb1490a62d3792dc
BLAKE2b-256 e890171951fc2f670e21789acb75d61ad8b00bfb50c64229dbf3893c0e99c938

Provenance

The following attestation bundles were made for gitcolombo-0.3.0.tar.gz:

Publisher: python-publish.yml on soxoj/gitcolombo

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file gitcolombo-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: gitcolombo-0.3.0-py3-none-any.whl
  • Size: 15.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gitcolombo-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 2ba1b8e00941cf5d9ad020ea6165ed460d2c2092d6b18bbbdd6823b43a33cbbc
MD5 405c7a4c5bc37c124c18db8112ece251
BLAKE2b-256 8be70f505e799680d13533173f6d8d20c0c65eb2e4c6a25071383cf49f4c84fa

Provenance

The following attestation bundles were made for gitcolombo-0.3.0-py3-none-any.whl:

Publisher: python-publish.yml on soxoj/gitcolombo

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
