Wordlists forged for your target, not for everyone's. Hyper-contextual wordlist generation for offensive security.
WordForge
Why WordForge?
Generic wordlists like rockyou, SecLists/discovery, or common.txt are
noisy against specific targets. Throwing 200,000 generic passwords at an
enterprise login is inefficient and loud.
WordForge generates hyper-contextual wordlists from passive OSINT of your target: their website, GitHub orgs, public docs, employee profiles, and historical content. The resulting wordlists are meant to be far more relevant than generic ones: usernames realistic to the company, directory paths matching the actual stack, parameters matching internal naming, and subdomains rooted in real codenames.
Designed as a companion to SubSift: same stack, same philosophy, pipe-friendly.
Features
- Passive OSINT collection — website crawler, GitHub org metadata, Wayback Machine, DNS, response headers — all async, rate-limited, polite.
- NER + pattern extraction — spaCy identifies people, organizations, products, locations; heuristics surface internal jargon and codenames.
- Multi-provider LLM — Ollama (local, free), Anthropic (premium), or OpenAI. Switch at runtime from the UI or CLI.
- HashCat-style mutations — capitalization, leet, year suffixes, role combinations, configurable rules.
- Categorized output — usernames, passwords, paths, subdomains, parameters, emails, company name variants.
- Pipe-friendly: wordforge generate ... | subsift scan.
- Live dashboard: the web UI streams per-stage progress over SSE (collect → extract → generate → mutate), then renders a tabbed result card with copy / download / pipe-to-SubSift actions.
- Operator seeds: --seed employees.txt folds LinkedIn-derived names straight into the LLM context and the offline fallback.
- Personal OPSEC awareness: wordforge person --username … --github … gathers an employee's public footprint (handle enumeration across ~25 platforms, GitHub, supplied URLs, pasted --text), derives the passwords an attacker would guess, and checks them against HaveIBeenPwned, so awareness training can say "295 of these are already in breaches." For authorized engagements.
- Self-diagnostic: wordforge doctor reports DNS, HTTPS, LLM provider, spaCy model, cache, and DB readiness in one shot.
- One-command Docker: docker compose up and you're scanning.
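To make the mutation feature concrete, here is a minimal sketch of case, leet, and year-suffix mutations applied to a single base word. It is not WordForge's rule engine: the `mutate` function, the `LEET` table, and the choice of suffixes are assumptions made for illustration only.

```python
from itertools import product

# Assumed leet substitutions (lowercase only); real rule sets are configurable.
LEET = str.maketrans({"a": "4", "e": "3", "i": "1", "o": "0", "s": "5"})


def mutate(word: str, years: list[int]) -> list[str]:
    """Return case, leet, and year-suffix variants of a word, deduplicated in order."""
    lower, capitalized = word.lower(), word.capitalize()
    bases = [lower, capitalized, word.upper()]
    # Leet variants of the two most common casings.
    bases += [lower.translate(LEET), capitalized.translate(LEET)]
    # Empty suffix keeps the bare base word in the output.
    suffixes = [""] + [str(y) for y in years] + [f"{y}!" for y in years]
    seen: dict[str, None] = {}  # dict preserves insertion order
    for base, suffix in product(bases, suffixes):
        seen.setdefault(base + suffix, None)
    return list(seen)


candidates = mutate("acme", [2024])
```

Even this small rule set multiplies one seed word into 15 candidates, which is why a short, target-specific seed list matters more than a large generic one.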
LLM providers
WordForge supports three LLM providers. Choose the trade-off that fits.
| Provider | Privacy | Cost | Quality | Setup |
|---|---|---|---|---|
| Ollama (default) | 🟢 Local-only | 🟢 Free | 🟡 Good | Run ollama serve; default model llama3.1:8b |
| Anthropic Claude | 🔴 API call | 🟡 Pay-per-token | 🟢 Excellent | Set ANTHROPIC_API_KEY; default model claude-sonnet-4-5 |
| OpenAI | 🔴 API call | 🟡 Pay-per-token | 🟢 Excellent | Set OPENAI_API_KEY; default model gpt-4o-mini |
Set WORDFORGE_LLM_PROVIDER=ollama|anthropic|openai in .env, or switch on
the fly with --provider on the CLI, or click the provider icon in the web UI.
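The selection logic described above could be sketched as follows. The environment variable names and default models come from this README; the `resolve_provider` helper and the precedence order (CLI flag first, then environment, then the Ollama default) are assumptions, not WordForge's actual code.

```python
import os
from collections.abc import Mapping
from typing import Optional

# Default models as listed in the provider table.
DEFAULT_MODELS = {
    "ollama": "llama3.1:8b",
    "anthropic": "claude-sonnet-4-5",
    "openai": "gpt-4o-mini",
}


def resolve_provider(
    cli_provider: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> tuple[str, str]:
    """Pick (provider, model); assumed precedence: CLI flag > env var > default."""
    env = os.environ if env is None else env
    name = (cli_provider or env.get("WORDFORGE_LLM_PROVIDER") or "ollama").lower()
    if name not in DEFAULT_MODELS:
        raise ValueError(f"unknown provider: {name!r}")
    # OLLAMA_MODEL / ANTHROPIC_MODEL / OPENAI_MODEL override the default model.
    model = env.get(f"{name.upper()}_MODEL", DEFAULT_MODELS[name])
    return name, model
```

Passing the environment in as a mapping keeps the function easy to test without touching the real process environment.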
Quickstart
Install as a tool from PyPI (easiest — nothing to clone):
uv tool install wordforge # or: pipx install wordforge
wordforge models download # one-time: fetch the spaCy NER model
wordforge doctor
wordforge generate example.com
Or run the whole stack with Docker:
git clone https://github.com/Ataraxia-ia-labs/WordForge.git
cd WordForge
cp .env.example .env # edit if you want non-default provider
docker compose up --build # web UI on http://localhost:8001
Or from source with uv (the spaCy NER model installs automatically via uv sync):
uv sync --extra dev
uv run wordforge doctor
uv run wordforge generate example.com --provider ollama
New here? The full Usage Guide walks through install, provider setup, your first wordlist, and feeding the output into ffuf / hydra / hashcat / SubSift.
Usage
CLI
# First time? Check your setup is healthy.
wordforge doctor
# Generate all categories for a target
wordforge generate example.com
# Pipe subdomains directly to SubSift
wordforge generate example.com --format subdomains | \
subsift scan --wordlist - example.com
# Seed with employees you've already collected (LinkedIn export, etc.)
wordforge generate example.com --seed employees.txt
# Choose provider per run
wordforge generate example.com --provider anthropic
# Export to a ZIP bundle
wordforge generate example.com --format zip --output bundle.zip
# Apply a hashcat rule file to the password candidates
wordforge generate example.com --rules best64.rule
# Run a whole list of targets (one per line); failures don't abort the batch
wordforge generate-batch targets.txt
# Compare two runs (or two snapshots of the same target over time)
wordforge diff out/example.com.old out/example.com --show
# Personal OPSEC awareness (authorized): gather a footprint, show the risk
wordforge person --name "Jane Doe" --username jdoe --github jdoe --company acme.com
# ...with a handout HTML report, or a whole roster at once
wordforge person --username jdoe --report jdoe.html
wordforge person-batch employees.csv # per-employee reports + index.html
# Browse past generations
wordforge list
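The person command checks derived candidates against HaveIBeenPwned. One way such a check can be done without sending the password itself is the Pwned Passwords range endpoint: only the first five hex characters of the SHA-1 hash leave the machine. The sketch below assumes that approach; whether WordForge uses it is not stated here, and the function names and User-Agent string are hypothetical.

```python
import hashlib
import urllib.request


def sha1_split(password: str) -> tuple[str, str]:
    """Return the (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix: str, range_body: str) -> int:
    """Find our suffix in a range response made of 'SUFFIX:COUNT' lines."""
    for line in range_body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0


def breach_count(password: str) -> int:
    """Query the range endpoint; only the 5-character hash prefix is sent."""
    prefix, suffix = sha1_split(password)
    request = urllib.request.Request(
        f"https://api.pwnedpasswords.com/range/{prefix}",
        headers={"User-Agent": "wordlist-audit-sketch"},  # hypothetical UA
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return count_in_range(suffix, response.read().decode("utf-8"))
```

Splitting the parsing from the network call means the matching logic can be exercised offline, and the network call can be rate-limited or cached separately.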
Web UI
Open http://localhost:8001, enter a target, pick a provider from the selector, and click Forge. Results stream in real time; download them per category or grab the ZIP bundle.
Integration with SubSift
wordforge generate target.com --format subdomains | \
subsift scan --wordlist - target.com
WordForge detects pipes automatically: when stdout is not a TTY, the banner and logs are suppressed and only data goes to stdout.
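The pipe detection described above can be sketched with the standard `isatty()` check. This is an illustration, not WordForge's code: the `emit` helper and the banner text are hypothetical, and sending the banner to stderr in interactive mode is an assumption.

```python
import sys
from typing import Iterable, Optional, TextIO


def emit(
    words: Iterable[str],
    out: Optional[TextIO] = None,
    log: Optional[TextIO] = None,
) -> bool:
    """Write one word per line; show a banner only when `out` is a terminal."""
    out = sys.stdout if out is None else out
    log = sys.stderr if log is None else log
    interactive = out.isatty()
    if interactive:
        # Banner goes to the log stream so the data stream stays clean.
        print("WordForge (sketch): generating wordlist", file=log)
    for word in words:
        print(word, file=out)
    return interactive
```

When the output is piped, `isatty()` returns False, so a downstream tool reading from stdin receives only the wordlist lines.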
Configuration
See .env.example for the complete list. Key variables:
| Variable | Default | Description |
|---|---|---|
| WORDFORGE_PORT | 8001 | Web UI / API port |
| WORDFORGE_LLM_PROVIDER | ollama | ollama, anthropic, openai |
| OLLAMA_HOST | http://localhost:11434 | Ollama endpoint |
| OLLAMA_MODEL | llama3.1:8b | Ollama model |
| ANTHROPIC_MODEL | claude-sonnet-4-5 | Anthropic model |
| OPENAI_MODEL | gpt-4o-mini | OpenAI model |
| WORDFORGE_RATE_LIMIT_PER_HOST | 1.0 | Requests/sec per hostname |
| WORDFORGE_CRAWL_MAX_DEPTH | 2 | Crawler depth |
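To illustrate what a per-host rate limit such as WORDFORGE_RATE_LIMIT_PER_HOST means in practice, here is a minimal sketch that computes how long a request should wait before being sent. The `HostRateLimiter` class is hypothetical and synchronous for clarity; WordForge's collectors are described as async and may work differently.

```python
import time
from typing import Callable
from urllib.parse import urlsplit


class HostRateLimiter:
    """Space out requests so each hostname sees at most `rate_per_sec` per second."""

    def __init__(self, rate_per_sec: float = 1.0, clock: Callable[[], float] = time.monotonic):
        self.min_interval = 1.0 / rate_per_sec
        self.clock = clock
        self.next_allowed: dict[str, float] = {}  # hostname -> earliest send time

    def delay_for(self, url: str) -> float:
        """Reserve the next slot for this URL's host and return seconds to wait."""
        host = urlsplit(url).hostname or ""
        now = self.clock()
        ready_at = max(now, self.next_allowed.get(host, now))
        self.next_allowed[host] = ready_at + self.min_interval
        return ready_at - now
```

A caller would sleep for the returned delay (for example with `time.sleep` or `asyncio.sleep`) before issuing the request. Because slots are tracked per hostname, a slow target never delays requests to other hosts.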
Architecture
flowchart LR
A[Target] --> B[Collectors]
B -->|Website| C[Extractors]
B -->|GitHub| C
B -->|Wayback| C
B -->|DNS| C
C -->|NER + Patterns| D[LLM Provider]
D -->|Ollama / Claude / OpenAI| E[Generators]
E --> F[Mutators]
F --> G[Exporters]
G -->|txt / json / zip| H[Wordlists]
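The staged flow in the diagram can be sketched as a generator that runs each stage in order and yields a progress event after it completes, which is the kind of per-stage update a dashboard could stream. All stage functions below are toy placeholders invented for illustration; none of them reflect WordForge's real collectors, extractors, or generators.

```python
from typing import Callable, Iterator

Stage = tuple[str, Callable[[object], object]]


def run_pipeline(target: str, stages: list[Stage]) -> Iterator[tuple[str, object]]:
    """Feed each stage's output into the next, yielding (stage_name, output)."""
    data: object = target
    for name, stage_fn in stages:
        data = stage_fn(data)
        yield name, data


def collect(target):
    # Placeholder: pretend these snippets were fetched from public sources.
    return [f"Welcome to {target}", "Project Falcon launch"]


def extract(documents):
    # Placeholder for NER: keep capitalized tokens, lowercased and sorted.
    return sorted({w.lower() for doc in documents for w in doc.split() if w[0].isupper()})


def generate(terms):
    # Placeholder for the LLM step: keep terms longer than four characters.
    return [term for term in terms if len(term) > 4]


def mutate(words):
    # Placeholder mutation: bare word plus one year suffix.
    return [word + suffix for word in words for suffix in ("", "2024")]


STAGES: list[Stage] = [
    ("collect", collect),
    ("extract", extract),
    ("generate", generate),
    ("mutate", mutate),
]

events = list(run_pipeline("example.com", STAGES))
```

Yielding after every stage lets a consumer report progress incrementally instead of waiting for the final wordlist.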
Roadmap
Shipped in v0.1.0
- Async pipeline with 4 collectors (Website BFS+robots, DNS, Wayback CDX, GitHub REST)
- LLM-driven generators with cached prompts (Ollama / Anthropic / OpenAI)
- HashCat-style rule engine + case/leet/year/suffix mutators
- HTMX dashboard with provider selector + recent-runs panel
- Runtime provider switching from the dashboard
- Pipe-friendly integration with SubSift
- wordforge doctor self-diagnostic
- --seed flag for operator-supplied seed lists (e.g. LinkedIn names)
- SQLite-backed history (wordforge list)
Shipped in v0.2.0
- SSE-streamed live progress in the dashboard (per-stage updates)
Shipped (unreleased)
- PyPI distribution (uv tool install wordforge / pipx) + automated tag releases
- Multi-target batch mode (wordforge generate-batch targets.txt)
- Run-diff: compare two run outputs (wordforge diff a/ b/)
- Hashcat ruleset import (generate --rules best64.rule)
Planned for v0.3
- Optional API auth (HMAC-signed bearer for /api/generate)
- Prometheus /metrics endpoint
- Burp Suite extension (separate repo)
- Plugin API for custom collectors
Contributing
See CONTRIBUTING.md. Issues and PRs welcome.
Disclaimer
WordForge is for authorized security testing only. Read DISCLAIMER.md before use. Unauthorized scanning may violate computer fraud laws.
License
AGPL-3.0-or-later. If you run a modified version as a network service, you must release your modifications under the same license.
Acknowledgements
Built on the shoulders of: FastAPI, Typer, httpx, trafilatura, spaCy, Ollama, and the broader ProjectDiscovery ecosystem that inspires the pipe-friendly philosophy.