Microsoft Purview CLI with comprehensive automation capabilities

pvw-cli — Microsoft Purview Command-Line Interface

A Python CLI and library for automating Microsoft Purview. Covers the Data Map, Unified Catalog, Collections, Search, Lineage, Scan, and Management APIs.


Install

pip install pvw-cli

For the latest development version:

git clone https://github.com/Keayoub/pvw-cli.git
cd pvw-cli
pip install -r requirements.txt
pip install -e .

Configuration

Set these three environment variables before running any command:

Variable                 Description
PURVIEW_ACCOUNT_NAME     Your Purview account name (e.g. mycompany-purview)
PURVIEW_ACCOUNT_ID       Your Azure Tenant ID (used as the Purview account ID for UC APIs)
PURVIEW_RESOURCE_GROUP   The resource group containing your Purview account

PowerShell:

$env:PURVIEW_ACCOUNT_NAME = "your-purview-account"
$env:PURVIEW_ACCOUNT_ID   = "your-tenant-id-guid"
$env:PURVIEW_RESOURCE_GROUP = "your-resource-group"

Bash / Linux / macOS:

export PURVIEW_ACCOUNT_NAME=your-purview-account
export PURVIEW_ACCOUNT_ID=your-tenant-id-guid
export PURVIEW_RESOURCE_GROUP=your-resource-group

To find your Tenant ID:

az account show --query tenantId -o tsv
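
As a quick pre-flight check, the three variables above can be verified from Python before any pvw command runs. This is a minimal sketch using only the standard library; the helper name missing_purview_vars is hypothetical and not part of pvw-cli.

```python
import os

# The three variables named in the Configuration table.
REQUIRED_VARS = (
    "PURVIEW_ACCOUNT_NAME",
    "PURVIEW_ACCOUNT_ID",
    "PURVIEW_RESOURCE_GROUP",
)


def missing_purview_vars(env=None):
    """Return the required variable names that are unset or blank.

    Hypothetical helper for illustration; not part of pvw-cli.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_purview_vars()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All Purview variables are set.")
```

Passing a plain dict instead of os.environ makes the check easy to exercise in tests or CI scripts.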

Authentication

The CLI authenticates with DefaultAzureCredential. In the Azure SDK's default credential chain, these methods are tried in this order:

  1. Service Principal: set AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_CLIENT_SECRET
  2. Managed Identity: works automatically on Azure VMs, App Service, etc.
  3. Azure CLI: run az login (easiest for local use)

Legacy tenant note: If authentication fails with AADSTS500011 (resource principal https://purview.azure.com not found), your tenant uses the older Purview service principal. Override the token scope:

export PURVIEW_AUTH_SCOPE=https://purview.azure.net/.default

Check which resource URI your tenant uses:

az ad sp show --id "73c2949e-da2d-457a-9607-fcc665198967" --query servicePrincipalNames -o json
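
The command above prints a JSON array of service principal names. The sketch below shows one way to turn that array into a PURVIEW_AUTH_SCOPE value. It assumes the array contains either https://purview.azure.com or https://purview.azure.net, and that the default scope is https://purview.azure.com/.default (inferred from the error message, not confirmed by the project). The function name choose_auth_scope is hypothetical.

```python
import json

# Assumed default resource, inferred from the AADSTS500011 message.
DEFAULT_RESOURCE = "https://purview.azure.com"
# Legacy resource, taken from the PURVIEW_AUTH_SCOPE value shown above.
LEGACY_RESOURCE = "https://purview.azure.net"


def choose_auth_scope(sp_names_json):
    """Pick a token scope from the JSON printed by `az ad sp show ... -o json`.

    Returns None when neither known resource URI is present.
    Hypothetical helper for illustration; not part of pvw-cli.
    """
    names = {name.rstrip("/").lower() for name in json.loads(sp_names_json)}
    if DEFAULT_RESOURCE in names:
        return DEFAULT_RESOURCE + "/.default"
    if LEGACY_RESOURCE in names:
        return LEGACY_RESOURCE + "/.default"
    return None
```

If the function returns the legacy scope, export it as PURVIEW_AUTH_SCOPE; if it returns None, inspect the raw output by hand.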

Command Groups

pvw account          Account management
pvw collections      Collections CRUD and permissions
pvw entity           Entity read, create, update, bulk operations
pvw glossary         Classic glossary terms
pvw lineage          Lineage creation and CSV import
pvw scan             Data source scanning
pvw search           Search and discovery
pvw types            Type definitions
pvw uc               Unified Catalog (domains, terms, data products, OKRs, CDEs, quality)
pvw workflow         Approval workflows
pvw diagnostics      Cache stats and profile info

Run pvw <command> --help for full options on any command.
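
Because every command follows the same group/command/--option shape, scripts can build the argument list programmatically. The sketch below assumes pvw is on PATH and that the chosen command supports --output json and prints only JSON to stdout. Both helper names are hypothetical and not part of pvw-cli.

```python
import json
import subprocess


def build_pvw_argv(*command, **options):
    """Build an argument list such as ["pvw", "search", "query", "--limit", "10"].

    Keyword names become --kebab-case flags; True becomes a bare flag,
    False and None are skipped. Hypothetical helper, not part of pvw-cli.
    """
    argv = ["pvw", *command]
    for name, value in options.items():
        if value is None or value is False:
            continue
        argv.append("--" + name.replace("_", "-"))
        if value is not True:
            argv.append(str(value))
    return argv


def run_pvw_json(*command, **options):
    """Run pvw with --output json and return the parsed result.

    Assumes the command accepts --output json; raises CalledProcessError
    if pvw exits with a non-zero status.
    """
    argv = build_pvw_argv(*command, output="json", **options)
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)
```

For example, run_pvw_json("search", "query", keywords="customer", limit=10) runs pvw search query --output json --keywords customer --limit 10. Passing a list to subprocess.run avoids shell quoting problems with values that contain spaces.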


📚 Quick Start & Documentation

Quick Reference Guide

For a comprehensive command reference with examples, see docs/quick-reference.md.

This guide covers:

  • All Unified Catalog commands (terms, domains, data products, CDEs, OKRs)
  • Data Quality commands and workflow examples
  • Facets, hierarchy, and relationship operations
  • Common patterns and troubleshooting tips

Examples

Search

# Search by keyword
pvw search query --keywords "customer" --limit 10

# Table output (default), JSON, or colored JSON
pvw search query --keywords "sales" --limit 5
pvw search query --keywords "sales" --limit 5 --output json
pvw search query --keywords "sales" --limit 5 --output jsonc

# Show GUIDs in output (useful for follow-up operations)
pvw search query --keywords "customer" --show-ids

# Autocomplete and suggestions
pvw search autocomplete --keywords "ord" --limit 5
pvw search suggest --keywords "prod" --limit 5

Entity

# List entities (up to 25)
pvw entity list --limit 25

# Filter by type
pvw entity list --type-name azure_sql_table --limit 10

# Read entity by GUID
pvw entity read --guid "4fae348b-e960-42f7-834c-38f6f6f60000"

# Update a single attribute
pvw entity update-attribute \
  --guid "4fae348b-e960-42f7-834c-38f6f6f60000" \
  --attribute description \
  --value "Customer address data - SalesLT schema"

# Add a classification
pvw entity add-classification \
  --guid "ea3412c3-7387-4bc1-9923-11f6f6f60000" \
  --classification "MICROSOFT.PERSONAL.EMAIL"

# Business metadata
pvw entity add-business-metadata \
  --guid "entity-guid" \
  --bm-name "Compliance" \
  --attr-name "DataOwner" \
  --attr-value "finance-team"

Collections

# List collections and hierarchy
pvw collections list
pvw collections read-hierarchy --collection-name "Data Engineering"

# Create a collection
pvw collections create \
  --name "analytics" \
  --friendly-name "Analytics Team" \
  --description "Assets for the analytics team"

# View permissions
pvw collections read-permissions --collection-name "analytics"

Unified Catalog (UC)

# Domains
pvw uc domain list
pvw uc domain create --name "Finance" --description "Financial data governance"
pvw uc domain get --domain-id "abc-123"

# Glossary terms
pvw uc term list --domain-id "abc-123"
pvw uc term list --domain-id "abc-123" --output json
pvw uc term create --name "Customer" --domain-id "abc-123" --description "A person who purchases products"
pvw uc term show --term-id "term-456"
pvw uc term update --term-id "term-456" --description "Updated definition"
pvw uc term delete --term-id "term-456" --confirm

# Bulk term import from CSV
pvw uc term import-csv --csv-file samples/csv/uc_terms_bulk_example.csv --domain-id "abc-123" --dry-run
pvw uc term import-csv --csv-file samples/csv/uc_terms_bulk_example.csv --domain-id "abc-123"

# Bulk term import from JSON
pvw uc term import-json --json-file samples/json/term/uc_terms_bulk_example.json --domain-id "abc-123"

# Sync UC terms to a classic glossary
pvw uc term sync-classic --domain-id "abc-123" --glossary-guid "gloss-guid"
pvw uc term sync-classic --domain-id "abc-123" --glossary-guid "gloss-guid" --update-existing
pvw uc term sync-classic --domain-id "abc-123" --glossary-guid "gloss-guid" --update-existing --delete-removed
pvw uc term sync-classic --domain-id "abc-123" --glossary-guid "gloss-guid" --update-existing --dry-run

# Data products
pvw uc dataproduct list --domain-id "abc-123"
pvw uc dataproduct create --name "Customer Analytics" --domain-id "abc-123" --type Analytical --status Draft
pvw uc dataproduct update --product-id "prod-789" --status Published --endorsed

# Link a data product to an entity
pvw uc dataproduct link-entity \
  --id "prod-789" \
  --entity-id "4fae348b-e960-42f7-834c-38f6f6f60000" \
  --type-name azure_sql_table

# Business metadata cleanup
pvw uc metadata list
pvw uc metadata cleanup --name "SecteursActivite" --check-only --verbose
pvw uc metadata cleanup --name "SecteursActivite" --verbose

# Delete a definition directly (definition/group name)
pvw uc metadata delete-definition --name "Glossaire" --dry-run
pvw uc metadata delete-definition --name "Glossaire"

# Objectives (OKRs)
pvw uc objective list --domain-id "abc-123"
pvw uc objective create --definition "Improve data quality score to 95%" --domain-id "abc-123"

# Critical Data Elements (CDEs)
pvw uc cde list --domain-id "abc-123"
pvw uc cde create --name "Social Security Number" --data-type String --domain-id "abc-123"
pvw uc cde link-entity --id "cde-789" --entity-id "ea3412c3-7387-4bc1-9923-11f6f6f60000"

# Facets and analytics
pvw uc term facets --output table
pvw uc dataproduct facets --domain-id "abc-123" --output json
pvw uc cde facets --output table

# Governance health
pvw uc health query
pvw uc health query --severity High
pvw uc health summary
pvw uc health update --action-id "action-guid" --status InProgress

Lineage

# Create column-level lineage
pvw lineage create-column \
  --process-name "ETL_Sales_Transform" \
  --source-table-guid "9ebbd583-4987-4d1b-b4f5-d8f6f6f60000" \
  --target-table-guids "c88126ba-5fb5-4d33-bbe2-5ff6f6f60000" \
  --column-mapping "ProductID:ProductID,Name:Name"

# Import from CSV
pvw lineage validate lineage_data.csv
pvw lineage import lineage_data.csv
pvw lineage sample output.csv --num-samples 10 --template detailed

Lineage CSV columns: source_entity_guid, target_entity_guid, relationship_type, process_name, description, confidence_score, owner, metadata
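
Before handing a file to pvw lineage validate, a script can confirm that the header row carries the columns listed above. This is a minimal sketch using the standard csv module. The README does not say which columns are mandatory, so the helper only reports what is absent; pvw lineage validate remains the authoritative check. The function name is hypothetical.

```python
import csv
import io

# Column names listed in the README for lineage CSV files.
LINEAGE_COLUMNS = [
    "source_entity_guid",
    "target_entity_guid",
    "relationship_type",
    "process_name",
    "description",
    "confidence_score",
    "owner",
    "metadata",
]


def missing_lineage_columns(csv_text):
    """Return documented columns that are absent from the CSV header row.

    Hypothetical helper for illustration; not part of pvw-cli.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = [cell.strip() for cell in next(reader, [])]
    return [name for name in LINEAGE_COLUMNS if name not in header]
```

An empty result means every documented column is present in the header.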

Classic Glossary

pvw glossary list-terms --glossary-guid "your-glossary-guid"
pvw glossary create-term --payload-file term.json

Workflows

pvw workflow list
pvw workflow get --workflow-id "workflow-123"
pvw workflow create --workflow-id "approval-1" --payload-file workflow-definition.json
pvw workflow execute --workflow-id "workflow-123"
pvw workflow executions --workflow-id "workflow-123"

Diagnostics

pvw diagnostics cache-stats
pvw diagnostics profile-info
pvw diagnostics clear-cache

Output Formats

Most list commands support --output:

Format   Use case
table    Default; human-readable Rich table
json     Plain JSON for piping to PowerShell, bash, jq
jsonc    Colored JSON for viewing in a terminal

PowerShell example:

$terms = pvw uc term list --domain-id $domainId --output json | ConvertFrom-Json
$terms | Where-Object { $_.status -eq "Draft" } | Export-Csv draft_terms.csv -NoTypeInformation

Bash / jq example:

pvw uc term list --domain-id "$DOMAIN_ID" --output json | jq '.[] | .name'
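
The same filtering can be done in Python on the captured JSON text. This sketch assumes, as the PowerShell and jq examples imply, that the output is a JSON array of objects with name and status keys. The function name is hypothetical.

```python
import json


def draft_term_names(terms_json):
    """Return the names of terms whose status is exactly "Draft".

    Assumes a JSON array of objects with "name" and "status" keys.
    Hypothetical helper for illustration; not part of pvw-cli.
    """
    terms = json.loads(terms_json)
    return [term["name"] for term in terms if term.get("status") == "Draft"]
```

Unlike PowerShell's -eq, the comparison here is case-sensitive.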

Bulk Import CSV Format (Terms)

name,description,status,acronym,owner_id,resource_name,resource_url
Customer Acquisition Cost,Cost to acquire a new customer,Draft,CAC,<entra-object-id-guid>,Metrics Guide,https://docs.example.com

Notes:

  • owner_id must be an Entra ID Object ID (GUID), not an email address
  • Terms in unpublished domains must use Draft status
  • Sample files: samples/csv/uc_terms_bulk_example.csv, samples/json/term/uc_terms_bulk_example.json
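
The first two notes can be checked locally before an import. The sketch below validates the header, verifies that any owner_id parses as a GUID, and flags rows that are not Draft when the target domain is unpublished. It assumes that an empty owner_id is acceptable and that each record occupies a single line; neither is stated in the README. The function name is hypothetical, and the --dry-run flag of pvw uc term import-csv remains the authoritative check.

```python
import csv
import io
import uuid

# Header columns from the bulk import example above.
TERM_COLUMNS = ["name", "description", "status", "acronym",
                "owner_id", "resource_name", "resource_url"]


def check_term_rows(csv_text, domain_published=True):
    """Return a list of human-readable problems found in a terms CSV.

    Hypothetical helper for illustration; not part of pvw-cli.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in TERM_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return ["missing columns: " + ", ".join(missing)]
    # Data starts on line 2, assuming one record per line.
    for line_no, row in enumerate(reader, start=2):
        owner = (row["owner_id"] or "").strip()
        if owner:
            try:
                uuid.UUID(owner)  # raises ValueError for e-mail addresses
            except ValueError:
                problems.append(f"line {line_no}: owner_id is not a GUID")
        if not domain_published and row["status"] != "Draft":
            problems.append(
                f"line {line_no}: status must be Draft in an unpublished domain")
    return problems
```

An empty list means none of the checked problems were found.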

Sample Files

Path                                           Contents
samples/csv/uc_terms_bulk_example.csv          8 sample UC terms for import
samples/json/term/uc_terms_bulk_example.json   8 data management terms (JSON format)
samples/csv/lineage_example.csv                Sample lineage relationships
samples/notebooks (basic)/                     Basic Purview CLI notebook examples
samples/notebooks (plus)/                      Advanced examples including bulk import

Requirements

  • Python 3.8+
  • Microsoft Purview account
  • Azure CLI (az login) or Service Principal credentials

License

See LICENSE for details.
