Source-control automation -- audit, secret-scan, and remediate Git workspaces
source-control-automation v3 (Python rewrite)
[LOCKED] Safe to run. Bare `sca` is read-only. It walks your code root, writes JSON + an HTML report to its own output dir, and opens the HTML in your browser -- that's it. The pipeline produces a plan of what would need fixing but never executes it. After the report opens, you'll get a `y/N` prompt to apply the plan; type `n` (or just Enter) to skip and review first. Destructive operations always require an explicit `--apply` flag.
Cross-platform Python rewrite of v1's PowerShell framework, with a smaller surface and a few capabilities v1 was missing.
Quick start
cd ~/code # or wherever your repos live
sca # produces a report, opens it, asks before fixing anything
That's the whole onboarding. The same sca command works on Linux, macOS, and Windows. On Windows specifically, the auto-open uses explorer.exe <path> so the browser launches in your interactive user session even when sca was started from an elevated PowerShell.
Why v3 exists
v1 (the rest of this repo) is a thorough PowerShell framework -- five numbered orchestration scripts, a Pester suite, ~25 specialized fix scripts at the root, and a gitignore template library. Real value, real coverage. But:
- Windows / PowerShell only -- won't run on a Linux dev box or a CI runner that defaults to bash.
- Surface area -- 128 files for what's logically a 5-stage workflow. The `Fix-RemainingIssues.ps1`-style scripts are one-off remediations that crept into the tree.
- Doesn't notice "wrapper repo wrapping nested repos" -- the pattern where a parent `.git` tracks paths that have their own `.git` directories. We hit this on the original `C:\code\.git` and had to extract via `git subtree split` by hand.
- No secret scanning during audit -- v1 looks for "sensitive file extensions" (`.pfx`, `.env`) but not for content patterns like `Obfuscated*` config + decode function in same repo, leaked PATs in committed scripts, base64+XOR secrets, etc. We found three leaked PATs and a leaked Azure cert by hand during one audit pass -- those should be detected by the tool.
- HEAD-only branch view -- only inspects the current branch; can't tell you that you have 8 unpushed feature branches or that `main` is 63 commits behind your active branch.
- The tool has its own leaked PAT -- `Fix-RemoteUrls.ps1` line 3 hardcodes a `github_pat_*`. A tool that standardizes source control should not be the place this happens.
v3 is the smaller, opinionated rewrite that keeps v1's good ideas and adds the missing pieces.
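The HEAD-only limitation is the easiest of these to picture in code. The sketch below shows one way to get per-branch state, assuming `git for-each-ref` with the `%(upstream:track)` field as the data source. The function names are hypothetical and are not taken from `sca/branches.py`.

```python
import re
import subprocess

# Tab-separated fields: branch name, upstream name, tracking state.
# Tabs cannot appear in Git ref names, so the split is unambiguous.
REF_FORMAT = "%(refname:short)\t%(upstream:short)\t%(upstream:track)"


def parse_branch_line(line):
    """Turn one for-each-ref output line into a dict of branch flags."""
    name, upstream, track = line.split("\t", 2)
    ahead = re.search(r"ahead (\d+)", track)
    behind = re.search(r"behind (\d+)", track)
    return {
        "branch": name,
        "upstream": upstream or None,   # None: branch was never pushed
        "gone": "gone" in track,        # upstream was deleted on the remote
        "ahead": int(ahead.group(1)) if ahead else 0,
        "behind": int(behind.group(1)) if behind else 0,
    }


def audit_branches(repo_path):
    """Return flags for every local branch, not just the checked-out one."""
    out = subprocess.run(
        ["git", "-C", str(repo_path), "for-each-ref",
         f"--format={REF_FORMAT}", "refs/heads"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_branch_line(line) for line in out.splitlines() if line]
```

A branch with `upstream` set to `None` has never been pushed, and a non-zero `behind` on `main` is the "main is N commits behind" case.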
Architecture
v3/
+-- README.md (this file)
+-- pyproject.toml (sca package metadata)
+-- sca/
| +-- __init__.py
| +-- cli.py (entry: `sca audit`, `sca scan`, `sca branches`, `sca render`)
| +-- audit.py (walk tree -> classify repos/orphans/loose files -> JSON)
| +-- branches.py (per-repo branch audit: unpushed, main-behind, diverged)
| +-- secrets.py (NEW: pattern-scan for leaked PATs, XOR-obfuscated configs, etc.)
| +-- render.py (JSON -> single-file HTML report)
| +-- classify.py (port v1's 5-state model; repo strategy decision)
| +-- remediate.py (port v1's state-based fixes; backup-before-modify)
| +-- extract.py (NEW: detect + fix wrapper-repo-wrapping-nested-repos pattern)
+-- templates/
| +-- gitignore/ (port v1's library: dotnet, python, node, powershell, etc.)
+-- tests/
+-- ... (pytest port of v1's Pester suite)
What carries over from v1
| v1 idea | v3 home |
|---|---|
| 5-state classification (NoSC / LocalGitOnly / Incomplete / PartialSync / Compliant) | sca/classify.py |
| Dedicated-vs-consolidated repo strategy decision (file count, .sln/.csproj presence) | sca/classify.py |
| .gitignore template library by project type | templates/gitignore/ |
| Backup-before-modify (zip the project before destructive ops) | sca/remediate.py |
| Per-state remediation workflows (init / push / commit / sync) | sca/remediate.py |
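The backup-before-modify idea from the table above can be sketched with the standard library alone. This is an illustration, not the code in `sca/remediate.py`; the function names and the timestamped archive name are assumptions.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path


def backup_project(project_dir, backup_root):
    """Zip project_dir into backup_root and return the archive path.

    backup_root should live outside project_dir so the archive
    does not end up inside the tree it is backing up.
    """
    project = Path(project_dir).resolve()
    dest = Path(backup_root)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    base = dest / f"{project.name}-{stamp}"
    # make_archive appends ".zip" and returns the final file name.
    archive = shutil.make_archive(
        str(base), "zip", root_dir=project.parent, base_dir=project.name
    )
    return Path(archive)


def run_destructive(project_dir, backup_root, operation):
    """Call operation(project_dir) only once a backup archive exists."""
    archive = backup_project(project_dir, backup_root)
    if not archive.is_file():
        raise RuntimeError(f"backup missing, refusing to modify {project_dir}")
    operation(project_dir)
    return archive
```

Routing every destructive step through one wrapper like `run_destructive` keeps the "no backup, no change" rule in a single place.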
What carries over from v2 (the audit scripts written 2026-04-30)
| v2 script | v3 home |
|---|---|
| `audit-code.py` (walk + classify into JSON) | `sca/audit.py` |
| `branch-audit.py` (per-branch flags) | `sca/branches.py` |
| `render-audit.py` (JSON -> HTML) | `sca/render.py` |
Plus: cross-platform support (`--root` / `CODE_ROOT` env var) so the same code runs on Linux and Windows.
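One plausible way to resolve the code root is sketched below. The precedence (flag, then `CODE_ROOT`, then the current directory) is an assumption inferred from the usage examples, and `resolve_root` is a hypothetical name.

```python
import argparse
import os
from pathlib import Path


def resolve_root(argv=None, env=None):
    """Pick the code root: --root, then CODE_ROOT, then the current directory."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(prog="sca-root-demo")
    parser.add_argument("--root", default=None)
    # parse_known_args ignores flags this sketch does not define.
    args, _unknown = parser.parse_known_args(argv)
    raw = args.root or env.get("CODE_ROOT") or os.getcwd()
    # expanduser lets "~/code" work the same on Linux and Windows.
    return Path(raw).expanduser()
```

Using `pathlib.Path` throughout, rather than string joins, is what keeps the same code path working with both `/` and `\` separators.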
What's NEW in v3 (not in v1 or v2)
- Secret scanning during audit -- `sca/secrets.py` scans every committed file for:
  - Hardcoded `github_pat_...`, `ghp_...`, `sk-...`, AWS access keys
  - `Obfuscated*` field names paired with a `Get-DecryptedValue` (or similar) decode function in same repo (XOR/base64 antipattern)
  - Embedded PFX/PEM blobs
  - SharePoint URLs, tenant/client/list UUIDs that look like real values
  - Any `password\s*[:=]` patterns inside config files
- Wrapper-repo detection + extraction -- `sca/extract.py` detects the "parent `.git` wraps child repos that have their own `.git`" pattern and offers a clean extraction (the `git subtree split` workflow we did manually).
- Visibility-before-push gate -- every `git push` is preceded by a quick check: is the remote repo public? If yes, run `secrets.py` before the push. Stops the 68-minute-public-exposure problem we hit with ResetIntuneEnrollment.
- Branch-level audit -- already had this in v2; v1 was HEAD-only. Surfaces unpushed branches and `main`-behind situations as first-class report items.
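To make the secret-scanning item concrete, here is a minimal line-by-line pattern scan. The regexes are approximations written for this example (real token formats may be stricter), and neither the rule names nor `scan_text` come from `sca/secrets.py`.

```python
import re

# Approximate patterns for illustration only.
PATTERNS = {
    "github_fine_grained_pat": re.compile(r"github_pat_[A-Za-z0-9_]{20,}"),
    "github_classic_pat": re.compile(r"ghp_[A-Za-z0-9]{36}"),
    "sk_style_key": re.compile(r"sk-[A-Za-z0-9]{20,}"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "password_assignment": re.compile(r"password\s*[:=]", re.IGNORECASE),
}


def scan_text(text, source="<memory>"):
    """Return one finding per (line, rule) match in text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                findings.append({
                    "source": source,
                    "line": lineno,
                    "rule": rule,
                    # Keep only a short prefix so the report itself
                    # does not leak the secret a second time.
                    "preview": match.group(0)[:12] + "...",
                })
    return findings
```

The paired `Obfuscated*` plus decode-function check needs repo-wide context (two matches in different files), so it would not fit this single-line loop.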
Out of scope for v3
- Continuous monitoring / dashboard -- v1's Mode 4. The audit is one-shot. If you want recurring runs, schedule via cron / Task Scheduler.
- Cross-platform support beyond Windows + Linux. macOS should work but isn't tested.
- Non-GitHub forges (GitLab, Bitbucket, Gitea). v1 had aspirations here; v3 is GitHub-only by design.
Status / progress (this branch)
| Module | Status | Notes |
|---|---|---|
| `sca/audit.py` | [DONE] ported from v2 | walk + classify dirs -> JSON; cross-platform (`--root`) |
| `sca/branches.py` | [DONE] ported from v2 | per-repo branch state; unpushed / behind / diverged |
| `sca/render.py` | [DONE] ported from v2 | JSON -> single-file HTML report |
| `sca/secrets.py` | [DONE] NEW | live token regex, XOR-obfuscation pair detector, real-looking GUIDs, SharePoint URLs |
| `sca/classify.py` | [DONE] NEW | 5-state model + 3 extra states v1 didn't have (Empty, LooseFile, WrapperRepo) + dedicated/consolidated decision |
| `sca/extract.py` | [DONE] NEW | wrapper-repo detection + repair (subtree-split / archive-and-delete) |
| `sca/templates.py` + `templates/gitignore/` | [DONE] NEW | stack detection (python/node/dotnet/powershell) -> curated gitignore |
| `sca/remediate.py` | [DONE] NEW | per-state plan + executor; backup-before-modify; visibility-before-push gate |
| `sca/cli.py` | [DONE] NEW | `sca audit \| branches \| scan \| classify \| extract \| gitignore \| remediate \| render` |
| `tests/` | [DONE] NEW | 35 pytest tests, all green |
| `pyproject.toml` | [DONE] | `pip install -e v3` |
Smoke runs on C:\code (real workspace):
- audit + classify: 24 entries -> 20 FullyCompliant, 2 WrapperRepo, 1 IncompleteSourceControl, 1 LooseFile
- secrets scan on this very repo: caught 16 hardcoded GitHub PATs in v1's root remediation scripts (subsequently rotated and redacted)
- extract: identified `UnicodeReplacementTool/` as the one wrapper-repo situation in the workspace
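For readers unfamiliar with the wrapper-repo pattern, a simplified detector might look like the following. It only checks directory layout; confirming that the parent actually tracks the nested paths would need a Git query, which this sketch leaves out. `find_wrapper_repos` is a hypothetical name, not the `extract.detect` function.

```python
import os
from pathlib import Path


def find_wrapper_repos(root):
    """Map each repo under root to the repos nested inside its working tree."""
    repos = []
    for dirpath, dirnames, _files in os.walk(Path(root)):
        # Only .git *directories* count here; submodules and worktrees
        # use a .git file instead and are deliberately not flagged.
        if ".git" in dirnames:
            repos.append(Path(dirpath))
            dirnames.remove(".git")  # never descend into Git's own metadata
    wrappers = {}
    for repo in repos:
        nested = sorted(r for r in repos if repo in r.parents)
        if nested:
            wrappers[repo] = nested
    return wrappers
```

Any key in the returned mapping is a candidate wrapper, and its values are the child repos that would need extracting.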
Open work
- The visibility-before-push gate in `remediate.py` calls `gh api` to determine repo visibility -- that's a network dependency. A `--offline` mode that errs on the side of "treat as public" would be safer for CI use.
- `sca extract --archive-and-delete` works but the subtree-split path needs an integration test against a real wrapper repo (currently only unit-tested via `extract.detect`).
- `sca render` still writes to `C:/code/temp/audit-report.html` by default -- that path needs to be parameterized or written next to the input JSON.
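The proposed offline behaviour for the first item amounts to failing closed. The sketch below assumes the visibility lookup goes through `gh api repos/OWNER/REPO` and reads the `private` field of the JSON response. The `runner` parameter is a hypothetical seam added so the logic can be exercised without a network.

```python
import json
import subprocess


def remote_is_public(owner_repo, offline=False, runner=subprocess.run):
    """Return True when the repo is public or when that cannot be ruled out."""
    if offline:
        return True  # no network allowed: assume the worst case
    try:
        result = runner(
            ["gh", "api", f"repos/{owner_repo}"],
            capture_output=True, text=True, check=True, timeout=15,
        )
        # A missing "private" field is also treated as public.
        return not json.loads(result.stdout).get("private", False)
    except (OSError, subprocess.SubprocessError, ValueError):
        # gh missing, non-zero exit, timeout, or unparseable output.
        return True
```

With this shape, the only way to skip the pre-push secret scan is a successful lookup that explicitly reports the repo as private.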
How to use it
pip install -e v3 # editable install puts `sca` on PATH
sca audit --root ~/code # walks tree, prints JSON
sca audit --root ~/code | sca classify --summary
sca scan ~/code/some-repo # secret scan one repo
sca extract --root ~/code # find wrapper-repo situations
sca gitignore ~/code/some-dir --write # write a stack-aware .gitignore
sca audit --root ~/code | sca remediate --plan # dry-run plan
sca audit --root ~/code | sca remediate --apply # actually run it
Set CODE_ROOT to skip --root everywhere.
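The piped commands above work because each stage reads JSON on stdin. As a rough illustration of a `--summary`-style consumer, the sketch below assumes the audit output is a JSON list of objects with a `state` field. That schema is a guess made for this example, not the documented format.

```python
import json
import sys
from collections import Counter


def summarize(entries):
    """Collapse audit entries into a one-line count per state."""
    counts = Counter(entry["state"] for entry in entries)
    # most_common() orders by count, largest first.
    parts = [f"{n} {state}" for state, n in counts.most_common()]
    return f"{len(entries)} entries -> " + ", ".join(parts)


def main(stream=sys.stdin):
    """Read the JSON list from a pipe and print the summary line."""
    print(summarize(json.load(stream)))
```

Calling `main()` at the end of a script would let it sit on the right-hand side of a pipe in the same way `sca classify --summary` does.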