Exposed repository metadata testing tool

PTREPO - Exposed repository testing tool

ptrepo is a Penterep tool for testing exposed source-code repositories on web servers. The planned scope covers repository discovery, best-effort repository metadata/content download, commit/revision listing where practical, and native secret scanning of recovered content.

Current MVP status

The current version supports discovery; best-effort Git/SVN content download; Mercurial/Bazaar metadata download; listing of reachable and dangling Git commits; reporting of observed SVN revisions; reporting of observed Mercurial/Bazaar counts from metadata where possible; and native secret scanning of recovered Git/SVN content, Git history, and recovered Mercurial/Bazaar metadata. Terminal output shows concise history/count summaries; JSON output keeps the detailed recovery source in historyCoverage.

Extended SCM types such as Darcs, Fossil, CVSweb, and RCS are supported at discovery level only in this MVP. If --download, --commits, or --secrets is requested for a confirmed but unsupported SCM type, ptrepo reports an explicit "unsupported" warning instead of printing a false commit count or an ambiguous empty result.

Implemented:

  • URL normalization
  • .git, .svn, _svn, .bzr, .hg, cgi-bin/cvsweb.cgi, _darcs, _fossil_, Fossil checkout markers, and RCS candidate generation
  • HTTP probing
  • discovery classification
  • Git recovery for metadata, refs, reflogs, loose objects, pack files, files recoverable from .git/index, and files exportable from the reconstructed Git object database
  • SVN recovery for entries, text-base, wc.db, pristine, and recovered file contents
  • Mercurial and Bazaar/Breezy metadata recovery for known metadata paths
  • Git validation and history reporting through a defensive low-level git backend
  • SVN observed revision reporting from recovered entries and wc.db metadata
  • Mercurial changeset count reporting from .hg/store/00changelog.i when the recovered revlog index is usable
  • Bazaar/Breezy revision count reporting from .bzr/branch/last-revision when available
  • native Git commit patch/message, recovered Git/SVN file, and recovered Mercurial/Bazaar metadata secret scanning with built-in rules, redaction, fingerprints, and coverage reporting
  • human and JSON output
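
The first two items above, URL normalization and candidate generation, can be pictured roughly as follows. This is a minimal sketch, not ptrepo's actual code: the function names, the normalization rules, and the choice of one marker file per repository type are assumptions made for illustration (the paths themselves are taken from the lists in this README).

```python
from urllib.parse import urlsplit, urlunsplit

# Marker paths come from this README; which ones ptrepo really probes,
# and in what order, is an assumption of this sketch.
CANDIDATES = {
    "git": [".git/HEAD"],
    "svn": [".svn/entries", ".svn/wc.db", "_svn/entries", "_svn/wc.db"],
    "hg": [".hg/requires"],
    "bzr": [".bzr/branch-format"],
    "cvs": ["cgi-bin/cvsweb.cgi"],
    "darcs": ["_darcs/"],
    "fossil": ["_fossil_"],
}


def normalize_url(url: str) -> str:
    """Lower-case scheme and host, drop query/fragment, force a trailing slash."""
    parts = urlsplit(url if "://" in url else "http://" + url)
    path = parts.path or "/"
    if not path.endswith("/"):
        path += "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, "", ""))


def candidate_urls(url: str, repo_types=None):
    """Return (repo_type, candidate_url) pairs for the requested types."""
    base = normalize_url(url)
    types = repo_types or list(CANDIDATES)
    return [(t, base + p) for t in types for p in CANDIDATES[t]]


for repo_type, candidate in candidate_urls("https://www.example.com/plugins/mpdf", ["git", "hg"]):
    print(repo_type, candidate)
```

Each candidate would then be probed over HTTP and classified; that step depends on response heuristics this README does not describe, so it is left out here.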

Git download currently saves:

  • .git/HEAD
  • .git/config
  • .git/index
  • .git/packed-refs
  • .git/info/refs
  • .git/objects/info/packs
  • .git/logs/HEAD and discovered/common ref logs
  • branch/tag ref files where discovered
  • loose objects discovered from refs, reflogs, commits, trees, and .git/index
  • pack files listed in .git/objects/info/packs
  • locally reconstructed pack indexes where a .pack file is recovered but the matching .idx file is unavailable
  • recovered blob contents under git/files/
  • files exported from reachable or dangling commit trees when the local Git object database is usable
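
Loose-object recovery relies on the standard Git on-disk layout: an object ID maps to .git/objects/<first two hex characters>/<remaining 38>, and the file is a zlib stream containing a "<type> <size>" header, a NUL byte, and the body. The sketch below shows that mapping and decoding with SHA-1 object IDs; the function names are illustrative and are not taken from ptrepo.

```python
import hashlib
import zlib


def loose_object_path(sha1_hex: str) -> str:
    """Map a SHA-1 object ID to its loose-object path under .git/."""
    if len(sha1_hex) != 40 or any(c not in "0123456789abcdef" for c in sha1_hex):
        raise ValueError("expected a 40-character lowercase SHA-1 hex digest")
    return f".git/objects/{sha1_hex[:2]}/{sha1_hex[2:]}"


def parse_loose_object(raw: bytes):
    """Decompress a loose object and return (type, body, object_id)."""
    data = zlib.decompress(raw)
    header, _, body = data.partition(b"\x00")
    obj_type, size = header.split(b" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    # The object ID is the SHA-1 of the uncompressed header plus body.
    return obj_type.decode(), body, hashlib.sha1(data).hexdigest()


# Round trip with a locally built blob, so no network access is needed.
blob = b"hello\n"
raw = zlib.compress(b"blob %d\x00" % len(blob) + blob)
obj_type, body, oid = parse_loose_object(raw)
print(obj_type, oid, loose_object_path(oid))
```

Recomputing the object ID after download is a cheap way to detect truncated or substituted responses, for example a custom 404 page served with status 200.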

SVN download currently saves:

  • .svn/entries or _svn/entries
  • .svn/wc.db or _svn/wc.db
  • old working-copy text-base files where discoverable
  • recursive old working-copy entries/text-base files where subdirectory metadata is exposed
  • new working-copy pristine files where discoverable from wc.db
  • recovered file contents under svn/files/
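
For new-format working copies, pristine files can be located from wc.db because the NODES table stores a checksum such as $sha1$<hex> for each file, and the pristine copy lives at pristine/<first two hex characters>/<hex>.svn-base. The sketch below illustrates that lookup against a reduced in-memory table; the real NODES table has many more columns, and the helper names are hypothetical rather than ptrepo's own.

```python
import sqlite3


def pristine_path(checksum: str, admin_dir: str = ".svn") -> str:
    """Map a wc.db checksum such as '$sha1$<hex>' to its pristine file path."""
    prefix = "$sha1$"
    if not checksum.startswith(prefix):
        raise ValueError("unsupported checksum format")
    digest = checksum[len(prefix):]
    return f"{admin_dir}/pristine/{digest[:2]}/{digest}.svn-base"


def pristine_candidates(conn: sqlite3.Connection, admin_dir: str = ".svn") -> dict:
    """Return {working-copy path: pristine path} for file nodes with a checksum."""
    rows = conn.execute(
        "SELECT local_relpath, checksum FROM NODES "
        "WHERE kind = 'file' AND checksum IS NOT NULL"
    )
    return {relpath: pristine_path(checksum, admin_dir) for relpath, checksum in rows}


# Demo database with only the three columns this sketch needs (an assumption;
# a recovered wc.db would be opened from disk instead).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE NODES (local_relpath TEXT, kind TEXT, checksum TEXT)")
conn.execute("INSERT INTO NODES VALUES (?, ?, ?)", ("config.php", "file", "$sha1$" + "ab" * 20))
conn.execute("INSERT INTO NODES VALUES (?, ?, ?)", ("src", "dir", None))
print(pristine_candidates(conn))
```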

Mercurial/Bazaar download currently saves selected metadata only. It does not reconstruct full working trees or history:

  • .hg/requires
  • .hg/hgrc
  • .hg/dirstate
  • .hg/store/00changelog.i
  • .hg/store/00manifest.i
  • .bzr/branch-format
  • .bzr/branch/format
  • .bzr/branch/last-revision
  • .bzr/branch/branch.conf
  • .bzr/repository/format
  • .bzr/repository/pack-names
  • .bzr/checkout/format
  • .bzr/checkout/dirstate
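
The count reporting for these two systems can be approximated from the metadata alone. The sketch below assumes a version 1 Mercurial revlog index (64-byte entries, with an inline-data flag in the first four bytes) and a Bazaar last-revision file of the form "<revno> <revision-id>". Both layouts are stated here as assumptions about typical repositories, the function names are hypothetical, and ptrepo's own parsing may differ.

```python
import struct

ENTRY_SIZE = 64              # size of one revlog v1 index entry
FLAG_INLINE_DATA = 1 << 16   # revision data is interleaved with index entries


def count_revlog_entries(index: bytes) -> int:
    """Count changesets in a revlog v1 index such as .hg/store/00changelog.i."""
    if len(index) < ENTRY_SIZE:
        return 0
    header = struct.unpack(">I", index[:4])[0]
    if header & 0xFFFF != 1:
        raise ValueError("only revlog version 1 is handled in this sketch")
    if not header & FLAG_INLINE_DATA:
        if len(index) % ENTRY_SIZE:
            raise ValueError("index length is not a multiple of the entry size")
        return len(index) // ENTRY_SIZE
    # Inline revlog: each entry is followed by its compressed revision data.
    count, pos = 0, 0
    while pos + ENTRY_SIZE <= len(index):
        compressed_len = struct.unpack(">i", index[pos + 8:pos + 12])[0]
        if compressed_len < 0:
            raise ValueError("negative compressed length")
        pos += ENTRY_SIZE + compressed_len
        count += 1
    if pos != len(index):
        raise ValueError("index appears truncated")
    return count


def parse_last_revision(text: str):
    """Split '<revno> <revision-id>' from .bzr/branch/last-revision."""
    revno, _, revision_id = text.strip().partition(" ")
    return int(revno), revision_id


print(parse_last_revision("42 jane@example.com-20240101120000-abcdef\n"))
```

Rejecting a truncated or malformed index, rather than returning a partial count, matches the README's wording that counts are reported only when the recovered index is usable.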

When --download is used, ptrepo also reports available Git commit counts, observed SVN revision counts, and observed Mercurial/Bazaar counts without printing detailed history entries. Use --commits for bounded detailed history output.

These planned options are accepted by the CLI contract but intentionally fail in the current MVP slice:

  • -r/--redirects
  • -C/--cache

Installation

pip install ptrepo

Adding to PATH

If you cannot invoke ptrepo from your terminal, its install directory is probably not in your PATH. Run the following commands for your shell to fix this:

For Bash Users

echo "export PATH=\"`python3 -m site --user-base`/bin:\$PATH\"" >> ~/.bashrc
source ~/.bashrc

For ZSH Users

echo "export PATH=\"`python3 -m site --user-base`/bin:\$PATH\"" >> ~/.zshrc
source ~/.zshrc

Usage examples

ptrepo -u https://www.example.com/
ptrepo -u https://www.example.com/plugins/mpdf
ptrepo -u https://www.example.com/ -t git svn bzr hg cvs darcs fossil rcs
ptrepo -f urls.txt -w repository_paths.txt
ptrepo -u https://www.example.com/ --download
ptrepo -u https://www.example.com/ --download ~/Download/repo
ptrepo -u https://www.example.com/ --commits
ptrepo -u https://www.example.com/ --commits --commit-limit 20
ptrepo -u https://www.example.com/ --download --commits
ptrepo -u https://www.example.com/ --secrets
ptrepo -u https://www.example.com/ --max-response-bytes 32768 -j

Options

   -u   --url           <url>           Test specified URL
   -f   --file          <file>          Load URLs from file
   -w   --wordlist      <file>          Load additional supported repository path candidates from file
   -t   --repo-type     <type>          Repository type(s) to test: git, svn, bzr, hg, cvs, darcs, fossil, rcs
        --download      [directory]     Download recoverable repository content/metadata; defaults to current directory
        --commits                       Temporarily recover metadata and list Git commits or observed SVN/Hg/Bzr counts
        --commit-limit  <count>         Maximum commit/revision entries to print and Git commits to scan; 0 disables both
        --secrets                       Temporarily recover supported repository content/metadata and scan for secrets
        --secrets-rules <file>          Load additional JSON secret rules
        --secrets-baseline <file>       Ignore previously reported secret finding fingerprints
        --secrets-mode  <mode>          Secret scan mode: auto, files, or history
        --entropy                       Enable entropy checks for generic secret rules
        --no-entropy                    Disable entropy checks for generic secret rules
        --allowlist     <file>          Load JSON secret allowlist
        --max-secret-file-size <bytes>  Maximum recovered file size to scan for secrets
   -H   --headers       <header:value>  Set custom header(s)
   -T   --timeout       <timeout>       Set timeout
        --max-response-bytes <bytes>    Maximum bytes to read from each discovery response
        --max-download-bytes <bytes>    Maximum bytes to write for each downloaded file
   -a   --user-agent    <user-agent>    Set User-Agent header
   -c   --cookie        <cookie=value>  Set cookie(s)
   -p   --proxy         <proxy>         Set proxy (e.g. http://127.0.0.1:8080)
   -v   --version                       Show script version and exit
   -h   --help                          Show this help message and exit
   -j   --json                          Output JSON only, suppresses banner and human output
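
Limits such as --max-response-bytes and --max-download-bytes amount to a capped read of a streamed response. The following sketch shows one way to do that over any iterable of byte chunks; read_limited is a hypothetical helper, and how ptrepo actually enforces or reports these limits is not described in this README.

```python
from typing import Iterable, Tuple


def read_limited(chunks: Iterable[bytes], max_bytes: int) -> Tuple[bytes, bool]:
    """Read at most max_bytes from chunks; return (body, truncated)."""
    buf = bytearray()
    for chunk in chunks:
        remaining = max_bytes - len(buf)
        if len(chunk) > remaining:
            # Keep only what still fits and stop consuming the stream.
            buf += chunk[:remaining]
            return bytes(buf), True
        buf += chunk
    return bytes(buf), False


print(read_limited([b"abcd", b"efgh"], 6))

# With the requests library, the chunks could come from a streamed response:
#   resp = requests.get(url, stream=True, timeout=10)
#   body, truncated = read_limited(resp.iter_content(chunk_size=8192), 32768)
```

Working on an iterable keeps the helper independent of the HTTP client and stops reading as soon as the cap is reached.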

Planned options

These options are accepted by the CLI contract but intentionally fail in the current MVP slice.

   -r   --redirects                     Planned, not implemented in current MVP slice
   -C   --cache                         Planned, not implemented in current MVP slice

Secret rule files

--secrets-rules loads additional JSON rules. The file may contain either a list of rules or an object with a rules list. Custom rules must include at least one keyword so the scanner can skip regex evaluation on unrelated lines. Custom rule regexes are length-limited, secret_group must reference an existing capture group, and entropy_threshold must be between 0.0 and 8.0.

{
  "rules": [
    {
      "id": "custom-demo-token",
      "name": "Custom demo token",
      "description": "Project-specific token",
      "regex": "(DEMO_[A-Z0-9]{12})",
      "secret_group": 1,
      "keywords": ["DEMO_"],
      "severity": "high",
      "confidence": "medium",
      "allowlist": {
        "patterns": ["DEMO_PUBLIC_FIXTURE"],
        "regexes": ["^DEMO_TEST_[A-Z0-9]+$"]
      }
    }
  ]
}
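
The constraints above can be pictured as a small validation step followed by a keyword prefilter. This is a sketch under assumptions: MAX_REGEX_LENGTH is a made-up value (the README does not state the real limit), and load_rule and scan_line are illustrative names, not ptrepo functions.

```python
import re

MAX_REGEX_LENGTH = 1024  # hypothetical limit; the real value is not documented here


def load_rule(rule: dict):
    """Validate one custom rule and return (compiled_regex, group, keywords)."""
    keywords = rule.get("keywords") or []
    if not keywords:
        raise ValueError("custom rules need at least one keyword")
    pattern = rule["regex"]
    if len(pattern) > MAX_REGEX_LENGTH:
        raise ValueError("regex is too long")
    compiled = re.compile(pattern)
    group = rule.get("secret_group", 0)
    if not 0 <= group <= compiled.groups:
        raise ValueError("secret_group does not reference an existing capture group")
    threshold = rule.get("entropy_threshold")
    if threshold is not None and not 0.0 <= threshold <= 8.0:
        raise ValueError("entropy_threshold must be between 0.0 and 8.0")
    return compiled, group, keywords


def scan_line(line: str, compiled, group: int, keywords) -> list:
    """Run the regex only when a keyword is present on the line."""
    if not any(keyword in line for keyword in keywords):
        return []
    return [match.group(group) for match in compiled.finditer(line)]


demo_rule = {
    "id": "custom-demo-token",
    "regex": "(DEMO_[A-Z0-9]{12})",
    "secret_group": 1,
    "keywords": ["DEMO_"],
}
compiled, group, keywords = load_rule(demo_rule)
print(scan_line('token = "DEMO_ABCDEF123456"', compiled, group, keywords))
```

The substring check is much cheaper than a regex search, which is why requiring a keyword lets a scanner skip most lines without evaluating the pattern.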

--allowlist loads JSON allowlists:

{
  "patterns": ["known-fixture-value"],
  "regexes": ["^example_[A-Za-z0-9]+$"]
}

Allowlist regexes are length-limited before compilation.
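
An allowlist check along these lines might look like the sketch below. It assumes that patterns are literal substrings and that regexes are applied with a search; the README does not specify either point, and MAX_ALLOWLIST_REGEX_LENGTH is a hypothetical value.

```python
import re

MAX_ALLOWLIST_REGEX_LENGTH = 512  # hypothetical limit, checked before compilation


def compile_allowlist(allowlist: dict):
    """Return (literal_patterns, compiled_regexes) from an allowlist object."""
    patterns = list(allowlist.get("patterns", []))
    regexes = []
    for source in allowlist.get("regexes", []):
        if len(source) > MAX_ALLOWLIST_REGEX_LENGTH:
            raise ValueError("allowlist regex is too long")
        regexes.append(re.compile(source))
    return patterns, regexes


def is_allowlisted(value: str, patterns, regexes) -> bool:
    """Assumed semantics: substring match for patterns, search for regexes."""
    return any(p in value for p in patterns) or any(r.search(value) for r in regexes)


patterns, regexes = compile_allowlist({
    "patterns": ["known-fixture-value"],
    "regexes": ["^example_[A-Za-z0-9]+$"],
})
for candidate in ("known-fixture-value", "example_abc123", "real_secret"):
    print(candidate, is_allowlisted(candidate, patterns, regexes))
```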

--secrets-baseline loads previously reported fingerprints and suppresses matching findings. It accepts a JSON list of fingerprint strings, an object with a fingerprints list, or a previous PTREPO-style JSON report that contains nested fingerprint fields. Suppressed findings are counted as ignored baseline findings in human and JSON output.
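
The three accepted baseline shapes can be handled with one loader, sketched below. The key name "fingerprint" for nested report fields and the sample report structure are assumptions for illustration; the README does not document the report schema.

```python
def _nested_fingerprints(node) -> set:
    """Collect every string stored under a 'fingerprint' key at any depth."""
    found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "fingerprint" and isinstance(value, str):
                found.add(value)
            else:
                found |= _nested_fingerprints(value)
    elif isinstance(node, list):
        for item in node:
            found |= _nested_fingerprints(item)
    return found


def load_baseline(document) -> set:
    """Accept a list, an object with 'fingerprints', or a nested report."""
    if isinstance(document, list):
        return {item for item in document if isinstance(item, str)}
    if isinstance(document, dict) and isinstance(document.get("fingerprints"), list):
        return {item for item in document["fingerprints"] if isinstance(item, str)}
    return _nested_fingerprints(document)


def apply_baseline(findings: list, baseline: set):
    """Return (kept_findings, ignored_count)."""
    kept = [f for f in findings if f["fingerprint"] not in baseline]
    return kept, len(findings) - len(kept)


# Hypothetical report shape, used only to exercise the nested case.
report = {"results": [{"findings": [{"fingerprint": "fp-1"}, {"fingerprint": "fp-2"}]}]}
baseline = load_baseline(report)
print(apply_baseline([{"fingerprint": "fp-1"}, {"fingerprint": "fp-9"}], baseline))
```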

When Git history contains more commits than --commit-limit, history-aware secret scanning reports partial coverage instead of implying that the whole history was scanned. Setting --commit-limit 0 disables detailed history listing and Git commit secret scanning; file-mode secret scanning can still run. Mercurial and Bazaar/Breezy secret scanning is metadata-only in this MVP; it does not reconstruct or scan each historical changeset/revision.

Built-in secret rules cover common provider and generic credential patterns, including private key markers, AWS AKIA/ASIA access key IDs, GitHub tokens, GitLab access/build/deploy/runner/OAuth token prefixes, Slack tokens and incoming webhooks, Stripe secret/restricted/webhook keys, Google API keys and OAuth client secrets, Google service-account JSON, database URLs with credentials, URLs with embedded credentials, JWT-like tokens, generic password/token/API key assignments, and conservative base64/hex decoded credential assignments. Git history scanning checks added and deleted patch lines plus commit message text. Recovered-file scanning skips oversized files and files that look binary based on NUL bytes or a high ratio of binary control bytes.
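
Two of the heuristics mentioned in this section can be illustrated briefly: Shannon entropy over bytes, which ranges from 0.0 to 8.0 bits per byte and so matches the entropy_threshold range, and a binary check based on NUL bytes or a high share of control bytes. The 0.3 ratio and the function names are assumptions of this sketch, not ptrepo's actual values.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte; 0.0 for constant input, 8.0 at most."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def looks_binary(sample: bytes, max_control_ratio: float = 0.3) -> bool:
    """Treat NUL bytes or many control bytes as a sign of binary content."""
    if b"\x00" in sample:
        return True
    if not sample:
        return False
    allowed = {9, 10, 13}  # tab, line feed, carriage return
    control = sum(1 for byte in sample if byte < 32 and byte not in allowed)
    return control / len(sample) > max_control_ratio


print(round(shannon_entropy(b"password"), 2), round(shannon_entropy(b"aaaaaaaa"), 2))
print(looks_binary(b"hello\n"), looks_binary(b"a\x00b"))
```

An entropy check helps generic rules separate random-looking tokens from placeholder values, while the binary check avoids running text rules over images or archives.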

Dependencies

ptlibs>=1.0.33,<2
requests>=2.31,<3

License

Copyright (c) 2026 Penterep Security s.r.o.

ptrepo is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

ptrepo is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with ptrepo. If not, see https://www.gnu.org/licenses/.

Warning

Run this tool only against websites that you have permission to pentest. We accept no responsibility for any damage or harm that this application causes to your computer or your network. Penterep is not responsible for any illegal or malicious use of this code. Be ethical!

Download files

Source Distribution

ptrepo-0.0.3.tar.gz (90.8 kB)

Uploaded Source

Built Distribution

ptrepo-0.0.3-py3-none-any.whl (71.0 kB)

Uploaded Python 3

File details

Details for the file ptrepo-0.0.3.tar.gz.

File metadata

  • Download URL: ptrepo-0.0.3.tar.gz
  • Upload date:
  • Size: 90.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.14

File hashes

Hashes for ptrepo-0.0.3.tar.gz
Algorithm Hash digest
SHA256 99fe10ebe5c6870450edaf4b8e15c840291b83fd040bc6957ec5f986ae40952b
MD5 ef7edf52b39a9f299a12c2fed4ab5132
BLAKE2b-256 4bcec77432aa0867c81a6f165e0522caca9e3f8041a74f2facdedf6dd4ad4e04

File details

Details for the file ptrepo-0.0.3-py3-none-any.whl.

File metadata

  • Download URL: ptrepo-0.0.3-py3-none-any.whl
  • Upload date:
  • Size: 71.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.14

File hashes

Hashes for ptrepo-0.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 4957b02deaca01e1b050e5cd9e3db05e2f8b704fe0464a16785892787b54089a
MD5 5415354fd6f0a47ec14e4bb3fa7b533c
BLAKE2b-256 5c6b15fa4d9dbc6ae82ef39848eb4497972ae7d5d560d7239c8a782d65446e0f
