data360-autodoc

Auto-generate human-readable documentation for Salesforce Data 360 (Data Cloud) orgs — in seconds, not days.

Point it at an org and it produces a full data dictionary (DMOs, DLOs, fields, keys), the data streams and field-level mappings behind them, the DMO relationship graph, an ERD, and a deterministic JSON snapshot.

  • 📓 Data dictionary — every DMO and DLO as clean Markdown tables, with field names, types, and keys.
  • 🌊 Data streams + field mappings — per-stream source/refresh metadata, the Stream → DLO field map, and the DLO → DMO field map (with real labels, not just API names).
  • 🔗 Relationships + ERD — DMO-to-DMO relationships with cardinality and status, plus a Mermaid graph of DLO → DMO mappings and relationship edges.
  • 🧊 JSON snapshot — a deterministic, diff-friendly export of your whole org schema (the foundation for drift detection — see below).

Who it's for

Built for Salesforce SI consultants and Data Cloud practitioners who lose days hand-writing org documentation for every engagement. It works against any Data 360 org you can authenticate to with a connected app, including Developer Edition / Data Cloud Dev orgs, so you can try it on a non-production org before pointing it at a client's.

Quick start

pip install data360-autodoc

data360-autodoc generate \
  --instance-url https://mydomain.my.salesforce.com \
  --client-id <connected-app-consumer-key> \
  --private-key ./server.pem \
  --username admin@myorg.com \
  --output ./docs \
  --format all

Wrote acme-data-cloud.md
Wrote acme-data-cloud.mmd
Wrote acme-data-cloud.json
Generated docs for 24 DMOs, 11 DLOs, 0 Identity Rulesets

Authentication uses the OAuth 2.0 JWT Bearer flow (connected app + private key — no passwords stored).
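As a rough illustration of the JWT Bearer flow mentioned above, the sketch below builds the unsigned part of the assertion and the token request body using only the standard library. The function names and the 180-second expiry are assumptions made for this example, not the tool's actual internals; signing the result with the private key (RS256) needs a crypto library and is left out.

```python
import base64
import json
import time
from typing import Optional
from urllib.parse import urlencode


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_signing_input(client_id: str, username: str,
                        sandbox: bool = False,
                        now: Optional[int] = None) -> str:
    """Return the unsigned '<header>.<claims>' part of the assertion.

    Hypothetical helper: the real tool may assemble this differently.
    """
    # The audience is the login host; sandboxes use test.salesforce.com.
    audience = ("https://test.salesforce.com" if sandbox
                else "https://login.salesforce.com")
    issued = int(time.time()) if now is None else now
    header = {"alg": "RS256"}
    claims = {
        "iss": client_id,      # connected app consumer key
        "sub": username,       # user to act as
        "aud": audience,
        "exp": issued + 180,   # short-lived; the exact value is illustrative
    }

    def encode(obj: dict) -> str:
        return b64url(json.dumps(obj, separators=(",", ":")).encode("utf-8"))

    return f"{encode(header)}.{encode(claims)}"


def build_token_request_body(assertion: str) -> str:
    """Form body to POST to <login host>/services/oauth2/token."""
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "assertion": assertion,
    })
```

The signed assertion is the signing input plus a dot and the base64url-encoded RS256 signature; no password ever appears in the request.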

Options that affect the metadata fetch:

  • --sandbox — authenticate against test.salesforce.com (sandbox / scratch orgs).
  • --api-version — the Salesforce REST API version used for the /ssot/* metadata calls (e.g. v62.0). By default the tool auto-detects your org's highest supported version (from GET /services/data/), so you normally don't set this. Force it only if auto-detection picks a version where a Data Cloud endpoint misbehaves, or to pin output to a specific version. It must be a valid Salesforce REST API version your org supports.

(The Identity Rulesets count is currently always 0 — see "Not supported yet" below.)
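The --api-version auto-detection described above can be pictured as follows. This is a minimal sketch that assumes the GET /services/data/ response is a JSON list of entries carrying a "version" string such as "62.0"; the function name is hypothetical and the tool's own selection logic may differ.

```python
def highest_api_version(versions: list) -> str:
    """Pick the newest entry from a parsed GET /services/data/ response.

    Each entry is assumed to look like:
    {"label": "...", "url": "/services/data/v62.0", "version": "62.0"}
    """
    if not versions:
        raise ValueError("no API versions returned")
    # Compare numerically so that "9.0" sorts below "62.0".
    newest = max(
        versions,
        key=lambda entry: tuple(int(part) for part in entry["version"].split(".")),
    )
    return f"v{newest['version']}"
```

The returned string has the same shape as the value you would pass to --api-version (for example v62.0).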

What you get

--format controls the output:

| Format | Files | What it is |
| --- | --- | --- |
| markdown | .md + .mmd | Data dictionary + Mermaid ERD |
| json | .json | Deterministic org-schema snapshot |
| pdf | | Coming soon |
| all | all of the above | Everything |

Example output

The Markdown data dictionary. DMO field types come from the org's relationships metadata, and fall back to the mapped DLO field type when the DMO endpoint returns a generic type (shown as (via DLO)); DLO keys come from the data streams:

## Data Model Objects (DMOs)

### Individual (`Individual__dmo`)

| Name | Type | Key |
| --- | --- | --- |
| Email__c | EmailAddress |  |
| Id__c | Text |  |

## Data Lake Objects (DLOs)

### Order (Home) (`Order_Home__dll`)

| Name | Type | Key |
| --- | --- | --- |
| Amount | Number |  |
| OrderId | Text | PrimaryKey |
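The type fallback described before the example (use the DMO type, otherwise the mapped DLO type marked "(via DLO)") could look roughly like this. Which values count as "generic" is not stated in this document, so the GENERIC_TYPES set and the function name are placeholders for illustration only.

```python
# Hypothetical set of "generic" type names; the real tool's rule is not documented here.
GENERIC_TYPES = frozenset({"", "UNKNOWN"})


def resolve_field_type(dmo_type, dlo_type, generic_types=GENERIC_TYPES):
    """Prefer the DMO's own type; otherwise fall back to the mapped DLO type."""
    if dmo_type and dmo_type not in generic_types:
        return dmo_type
    if dlo_type:
        # Mark the value so readers can see where the type came from.
        return f"{dlo_type} (via DLO)"
    return dmo_type or ""
```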

Beyond the dictionary, the document includes (in this order):

  • Data Streams — one row per stream: data source, category, primary key, schedule, refresh mode.
  • Field Mapping (Streams → DLO) — every Data Lake field with its source field, DLO label, type, and a KQ_-prefix foreign-key flag.
  • DLO → DMO Field Mappings — field-level source → target mappings, grouped by DLO → DMO pairing, with real labels joined from DLO metadata.
  • Relationships — DMO-to-DMO links with cardinality and status (inactive standard relationships stay visible, never dropped):
## Relationships

| Object | Field | Cardinality | Related Object | Related Field | Status |
| --- | --- | --- | --- | --- | --- |
| Account | ssot__PrimarySalesContactPointId__c | N:1 | ssot__ContactPointEmail__dlm | ssot__Id__c | INACTIVE |

The ERD (renders natively in GitHub). Solid arrows are DLO → DMO mappings; dashed, cardinality-labeled arrows are active DMO → DMO relationships:

graph LR
  Order_Home__dll["Order (Home)"]
  Individual__dmo["Individual"]
  Order_Home__dll --> Individual__dmo
  ContactPointEmail__dlm["Contact Point Email"] -.->|N:1| Individual__dmo
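To show how a graph like the one above can be produced deterministically, here is a small sketch that renders Mermaid text from plain Python data. The function and its input shapes are assumptions for this example; it declares every node first and then the edges, so its layout differs slightly from the tool's output shown above.

```python
def render_erd(labels, mappings, relationships):
    """Render a Mermaid 'graph LR' from labels and edges.

    labels:        {api_name: display_label}
    mappings:      iterable of (dlo_api_name, dmo_api_name)
    relationships: iterable of (from_dmo, to_dmo, cardinality)
    """
    lines = ["graph LR"]
    # Sort everything so repeated runs give identical text.
    for api_name in sorted(labels):
        lines.append(f'  {api_name}["{labels[api_name]}"]')
    for source, target in sorted(mappings):
        lines.append(f"  {source} --> {target}")          # solid: DLO to DMO mapping
    for source, target, cardinality in sorted(relationships):
        lines.append(f"  {source} -.->|{cardinality}| {target}")  # dashed: DMO relationship
    return "\n".join(lines)
```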

Output is deterministic: for an unchanged org schema, every run produces byte-identical docs (collections are sorted alphabetically). That makes the output safe to commit and easy to diff.
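One common way to get byte-identical JSON is to sort collections and object keys before serializing. The sketch below illustrates that idea under an assumed snapshot shape (top-level lists of objects, each with a "name"); it is not the tool's actual serializer.

```python
import json


def snapshot_bytes(schema: dict) -> bytes:
    """Serialize a schema so equal content always yields equal bytes.

    Assumes each top-level value is a list of dicts with a "name" key.
    """
    normalized = {
        kind: sorted(objects, key=lambda obj: obj["name"])
        for kind, objects in schema.items()
    }
    # sort_keys fixes key order; a fixed indent and trailing newline fix layout.
    text = json.dumps(normalized, sort_keys=True, indent=2, ensure_ascii=False)
    return (text + "\n").encode("utf-8")
```

Two inputs that differ only in list or key order serialize to the same bytes, which is what makes a plain text diff meaningful.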

What it reads — and what it doesn't yet

Under the hood it calls the Data 360 Connect REST API (/services/data/v…/ssot/*): data-model-objects (DMOs), data-model-object-mappings (DLO→DMO mappings + field names), …/{dmo}/relationships (DMO field types and the DMO-to-DMO relationship graph), and data-streams (DLOs + their fields + the per-stream and field-level mappings, including primary keys). Full request/response shapes are in agent_docs/api_reference.md.
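For orientation, the list endpoints named above can be turned into full URLs like this. The helper is hypothetical and covers only the three resources whose paths are spelled out in full; the relationships path is elided in this document, so it is deliberately left out, and any query parameters the tool sends are not shown.

```python
# Resource names taken from the paragraph above.
SSOT_RESOURCES = (
    "data-model-objects",
    "data-model-object-mappings",
    "data-streams",
)


def ssot_url(instance_url: str, api_version: str, resource: str) -> str:
    """Build a /services/data/<version>/ssot/<resource> URL."""
    if resource not in SSOT_RESOURCES:
        raise ValueError(f"unknown ssot resource: {resource}")
    return f"{instance_url.rstrip('/')}/services/data/{api_version}/ssot/{resource}"
```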

Not supported yet. Calculated Insights and Identity Resolution rulesets are not fetched — those sections render as empty placeholders (e.g. _No Calculated Insights found._) and the Identity Rulesets count stays 0. Documenting them is on the roadmap. (Profile and Engagement DMOs are covered — those are DMO categories, not separate entities.)

Resilient by default. If one DMO's metadata can't be read, that DMO is skipped with a warning and the rest of the document is still produced. If the org has more than 500 DMOs, the list is capped (with a warning). A failure fetching the DMO list or the data streams stops the run with a clean one-line error — never a stack trace.
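The skip-and-continue behavior described above can be sketched as a loop that warns instead of aborting. Names here are illustrative, and fetch_one stands in for whatever call reads a single DMO's metadata; the real tool's warning text and cap handling may differ.

```python
import warnings

MAX_DMOS = 500  # cap mentioned above


def collect_dmo_metadata(dmo_names, fetch_one, limit=MAX_DMOS):
    """Fetch metadata per DMO, skipping failures and capping the list."""
    names = sorted(dmo_names)
    if len(names) > limit:
        warnings.warn(f"Org has {len(names)} DMOs; documenting only the first {limit}")
        names = names[:limit]
    documented = {}
    for name in names:
        try:
            documented[name] = fetch_one(name)
        except Exception as exc:
            # One unreadable DMO should not sink the whole document.
            warnings.warn(f"Skipping {name}: {exc}")
    return documented
```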

Future: drift monitoring (paid tier)

The open-source CLI documents your org once. What actually bites consultants is an org that changes after you've documented it: a client admin adds a DLO, a field type changes, an identity rule shifts, and your beautiful docs quietly go stale.

A planned hosted tier will turn the deterministic JSON snapshot into drift monitoring: re-run on a schedule, diff today's snapshot against the last one, and get a client-ready changelog of exactly what changed. You never hand over your org credentials: drift runs execute in your own environment, and the hosted service only stores snapshots and sends alerts. The CLI stays free forever; the recurring watching, history, and multi-org dashboard are the paid layer.
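Until that tier exists, the basic idea of diffing two snapshots can be tried by hand. The sketch below assumes a simplified shape (object name mapped to a dict of field name to type), which is not the tool's real JSON layout, and the message wording is invented for the example.

```python
def diff_snapshots(old: dict, new: dict) -> list:
    """List human-readable changes between two simplified snapshots.

    old/new: {object_name: {field_name: field_type}} (hypothetical shape).
    """
    changes = []
    for name in sorted(set(old) | set(new)):
        if name not in old:
            changes.append(f"{name}: object added")
        elif name not in new:
            changes.append(f"{name}: object removed")
        else:
            before, after = old[name], new[name]
            for field in sorted(set(before) | set(after)):
                if field not in before:
                    changes.append(f"{name}.{field}: field added")
                elif field not in after:
                    changes.append(f"{name}.{field}: field removed")
                elif before[field] != after[field]:
                    changes.append(
                        f"{name}.{field}: type changed {before[field]} -> {after[field]}"
                    )
    return changes
```

Because the snapshot is deterministic, an empty result means the schema has not changed between the two runs.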

Hosted version

A hosted web UI is in the works at data360doc.com (placeholder) — same docs, plus scheduled drift alerts and a multi-org dashboard for agencies.

License

MIT
