Auto-generate human-readable documentation for Salesforce Data 360 orgs.
Project description
data360-autodoc
Auto-generate human-readable documentation for Salesforce Data 360 (Data Cloud) orgs — in seconds, not days.
Point it at an org and it produces a full data dictionary (DMOs, DLOs, fields, keys), the data streams and field-level mappings behind them, the DMO relationship graph, an ERD, and a deterministic JSON snapshot.
- 📓 Data dictionary — every DMO and DLO as clean Markdown tables, with field names, types, and keys.
- 🌊 Data streams + field mappings — per-stream source/refresh metadata, the Stream → DLO field map, and the DLO → DMO field map (with real labels, not just API names).
- 🔗 Relationships + ERD — DMO-to-DMO relationships with cardinality and status, plus a Mermaid graph of DLO → DMO mappings and relationship edges.
- 🧊 JSON snapshot — a deterministic, diff-friendly export of your whole org schema (the foundation for drift detection — see below).
For who
Built for Salesforce SI consultants and Data Cloud practitioners who lose days hand-writing org documentation for every engagement. Works against any Data 360 org you can authenticate to with a connected app — including Developer Edition / Data Cloud Dev orgs, so you can try it on a sandbox before pointing it at a client.
Quick start
pip install data360-autodoc
data360-autodoc generate \
--instance-url https://mydomain.my.salesforce.com \
--client-id <connected-app-consumer-key> \
--private-key ./server.pem \
--username admin@myorg.com \
--output ./docs \
--format all
Wrote acme-data-cloud.md
Wrote acme-data-cloud.mmd
Wrote acme-data-cloud.json
Generated docs for 24 DMOs, 11 DLOs, 0 Identity Rulesets
Authentication uses the OAuth 2.0 JWT Bearer flow (connected app + private key — no passwords stored).
Options that affect the metadata fetch:
--sandbox— authenticate againsttest.salesforce.com(sandbox / scratch orgs).--api-version— the Salesforce REST API version used for the/ssot/*metadata calls (e.g.v62.0). By default the tool auto-detects your org's highest supported version (fromGET /services/data/), so you normally don't set this. Force it only if auto-detection picks a version where a Data Cloud endpoint misbehaves, or to pin output to a specific version. It must be a valid Salesforce REST API version your org supports.
(The Identity Rulesets count is currently always 0 — see "Not supported yet" below.)
What you get
--format controls the output:
| Format | Files | What it is |
|---|---|---|
markdown |
.md + .mmd |
Data dictionary + Mermaid ERD |
json |
.json |
Deterministic org-schema snapshot |
pdf |
— | Coming soon |
all |
all of the above | Everything |
Example output
The Markdown data dictionary. DMO field types come from the org's relationships metadata, and fall back to the mapped DLO field type when the DMO endpoint returns a generic type (shown as (via DLO)); DLO keys come from the data streams:
## Data Model Objects (DMOs)
### Individual (`Individual__dmo`)
| Name | Type | Key |
| --- | --- | --- |
| Email__c | EmailAddress | |
| Id__c | Text | |
## Data Lake Objects (DLOs)
### Order (Home) (`Order_Home__dll`)
| Name | Type | Key |
| --- | --- | --- |
| Amount | Number | |
| OrderId | Text | PrimaryKey |
Beyond the dictionary, the document includes (in this order):
- Data Streams — one row per stream: data source, category, primary key, schedule, refresh mode.
- Field Mapping (Streams → DLO) — every Data Lake field with its source field, DLO label, type, and a
KQ_-prefix foreign-key flag. - DLO → DMO Field Mappings — field-level source → target mappings, grouped by DLO → DMO pairing, with real labels joined from DLO metadata.
- Relationships — DMO-to-DMO links with cardinality and status (inactive standard relationships stay visible, never dropped):
## Relationships
| Object | Field | Cardinality | Related Object | Related Field | Status |
| --- | --- | --- | --- | --- | --- |
| Account | ssot__PrimarySalesContactPointId__c | N:1 | ssot__ContactPointEmail__dlm | ssot__Id__c | INACTIVE |
The ERD (renders natively in GitHub). Solid arrows are DLO → DMO mappings; dashed, cardinality-labeled arrows are active DMO → DMO relationships:
graph LR
Order_Home__dll["Order (Home)"]
Individual__dmo["Individual"]
Order_Home__dll --> Individual__dmo
ContactPointEmail__dlm["Contact Point Email"] -.->|N:1| Individual__dmo
Output is deterministic — the same org always produces byte-identical docs (collections are sorted alphabetically). That makes the output safe to commit and easy to diff.
What it reads — and what it doesn't yet
Under the hood it calls the Data 360 Connect REST API (/services/data/v…/ssot/*): data-model-objects (DMOs), data-model-object-mappings (DLO→DMO mappings + field names), …/{dmo}/relationships (DMO field types and the DMO-to-DMO relationship graph), and data-streams (DLOs + their fields + the per-stream and field-level mappings, including primary keys). Full request/response shapes are in agent_docs/api_reference.md.
Not supported yet. Calculated Insights and Identity Resolution rulesets are not fetched — those sections render as empty placeholders (e.g. _No Calculated Insights found._) and the Identity Rulesets count stays 0. Documenting them is on the roadmap. (Profile and Engagement DMOs are covered — those are DMO categories, not separate entities.)
Resilient by default. If one DMO's metadata can't be read, that DMO is skipped with a warning and the rest of the document is still produced. If the org has more than 500 DMOs, the list is capped (with a warning). A failure fetching the DMO list or the data streams stops the run with a clean one-line error — never a stack trace.
Future: drift monitoring (paid tier)
The open-source CLI documents your org once. The thing that actually bites consultants is when an org changes after you've documented it — a client admin adds a DLO, a field type changes, an identity rule shifts — and your beautiful docs quietly go stale.
A hosted tier (planned) will turn the deterministic JSON snapshot into drift monitoring: re-run on a schedule, diff today's snapshot against the last one, and get a client-ready changelog of exactly what changed — without ever handing over your org credentials (drift runs in your own environment; the hosted service only stores snapshots and sends alerts). The CLI stays free forever; the recurring watching, history, and multi-org dashboard are the paid layer.
Hosted version
A hosted web UI is in the works at data360doc.com (placeholder) — same docs, plus scheduled drift alerts and a multi-org dashboard for agencies.
License
MIT
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file data360_autodoc-0.2.0.tar.gz.
File metadata
- Download URL: data360_autodoc-0.2.0.tar.gz
- Upload date:
- Size: 50.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.6
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0d04bebdfb77baf2a69ab1ce96d26580444dbc64edac113f72c2792678c7fe16
|
|
| MD5 |
c838273b93942cd83ac0a9ea67ca8bef
|
|
| BLAKE2b-256 |
af7cfccda62566453a571118eba37ca2509a2193720d91c018566049df626a73
|
File details
Details for the file data360_autodoc-0.2.0-py3-none-any.whl.
File metadata
- Download URL: data360_autodoc-0.2.0-py3-none-any.whl
- Upload date:
- Size: 37.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.6
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
beca436c9933cfe2d1570555ba5351145957897757e3a0f0ef07432c0fb62b66
|
|
| MD5 |
42818b3f7be5c7fee4ada2a4fc7a62a3
|
|
| BLAKE2b-256 |
be47aef941416c7f01b31e8b9bf803aafeb5a25b9c3010807c41b01e83d912e5
|