
ado-git-repo-insights


Extract Azure DevOps Pull Request metrics to SQLite and generate PowerBI-compatible CSVs.

Overview

This tool replaces the MongoDB-based ado-pull-request-metrics with a lightweight, file-based solution that:

  • Stores data in SQLite - No external database required
  • Runs as an Azure DevOps Pipeline Task - Scheduled daily extraction
  • Preserves the PowerBI CSV contract - Same filenames, columns, and ordering
  • Supports incremental + backfill extraction - Efficient daily updates with periodic convergence
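
The incremental + backfill split can be sketched as a date-window computation (a hypothetical helper for illustration, not the tool's actual internals): daily runs resume from the last successful run, while backfill runs re-extract a fixed trailing window so late-arriving PR updates eventually converge.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

def extraction_window(last_run: Optional[date], today: date,
                      backfill_days: Optional[int] = None) -> Tuple[date, date]:
    """Compute the [start, end] date window for one extraction run.

    - End defaults to yesterday so the partially-complete current day is skipped.
    - Incremental runs resume from the last successful run date.
    - Backfill runs re-extract a fixed trailing window for convergence.
    """
    end = today - timedelta(days=1)
    if backfill_days is not None:
        start = end - timedelta(days=backfill_days)
    elif last_run is not None:
        start = last_run
    else:
        start = end - timedelta(days=30)  # illustrative first-run default
    return start, end
```
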

Quick Start

Installation

pip install ado-git-repo-insights

Usage Options

This tool provides two ways to extract Azure DevOps Pull Request metrics:

Aspect              CLI (Option 1)    Extension (Option 2)
Requires Python     Yes               No (bundled)
Installation        pip install       Upload VSIX to ADO
Pipeline syntax     Script steps      Task step
Works outside ADO   Yes               No (ADO only)
Flexibility         Higher            Standard

Option 1: Python CLI

Best for users comfortable with Python/pip, custom scripts, and non-ADO CI/CD systems.

First Run (Extract Data)

ado-insights extract \
  --organization MyOrg \
  --projects "ProjectOne,ProjectTwo" \
  --pat $ADO_PAT \
  --database ./ado-insights.sqlite

Note: The end date defaults to yesterday to avoid extracting incomplete data for the current day. To include today, pass --end-date $(date +%Y-%m-%d) (Bash) or --end-date (Get-Date -Format yyyy-MM-dd) (PowerShell).

Generate CSVs

ado-insights generate-csv \
  --database ./ado-insights.sqlite \
  --output ./csv_output

Backfill Mode (Weekly Convergence)

ado-insights extract \
  --organization MyOrg \
  --projects "ProjectOne,ProjectTwo" \
  --pat $ADO_PAT \
  --database ./ado-insights.sqlite \
  --backfill-days 60

Option 2: Azure DevOps Extension

Best for teams that prefer the ADO pipeline editor UI or want a self-contained task without managing Python dependencies.

steps:
  - task: ExtractPullRequests@2
    inputs:
      organization: 'MyOrg'
      projects: 'Project1,Project2'
      pat: '$(PAT_SECRET)'
      database: '$(Pipeline.Workspace)/data/ado-insights.sqlite'
      outputDir: '$(Pipeline.Workspace)/csv_output'

Installation:

  1. Download the .vsix from GitHub Releases
  2. Install in your ADO organization: Organization Settings → Extensions → Browse local extensions

PR Insights Dashboard

Once the extension is installed and a pipeline runs successfully with the aggregates artifact published, the PR Insights hub appears in the project navigation menu. The dashboard auto-discovers pipelines that publish aggregates.

Configuration precedence:

  1. ?dataset=<url> — Direct URL (dev/testing only)
  2. ?pipelineId=<id> — Query parameter override
  3. Extension settings — User-scoped saved preference (Project Settings → PR Insights Settings)
  4. Auto-discovery — Find pipelines with 'aggregates' artifact
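
The precedence above amounts to a first-match resolver, which can be sketched as follows (hypothetical function and argument names; the extension's actual implementation is not shown here):

```python
from typing import Optional

def resolve_dataset(query: dict, saved_pipeline_id: Optional[int],
                    pipelines_with_aggregates: list) -> str:
    """Resolve which dataset the dashboard loads; first match wins."""
    if "dataset" in query:                    # 1. direct URL (dev/testing only)
        return "url:" + query["dataset"]
    if "pipelineId" in query:                 # 2. query-parameter override
        return "pipeline:" + str(query["pipelineId"])
    if saved_pipeline_id is not None:         # 3. saved user preference
        return "pipeline:" + str(saved_pipeline_id)
    if pipelines_with_aggregates:             # 4. auto-discovery
        return "pipeline:" + str(pipelines_with_aggregates[0])
    raise LookupError("no pipeline publishing an 'aggregates' artifact found")
```
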

Configuration

Create a config.yaml file:

organization: MyOrg

projects:
  - ProjectOne
  - ProjectTwo
  - Project%20Three  # URL-encoded names supported

api:
  base_url: https://dev.azure.com
  version: 7.1-preview.1
  rate_limit_sleep_seconds: 0.5
  max_retries: 3
  retry_delay_seconds: 5
  retry_backoff_multiplier: 2.0

backfill:
  enabled: true
  window_days: 60

Then run:

ado-insights extract --config config.yaml --pat $ADO_PAT
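
The retry settings in the config imply an exponential backoff schedule between attempts. A sketch of the resulting delays (illustrative only, not the tool's exact retry code):

```python
def retry_delays(max_retries: int, delay: float, multiplier: float) -> list:
    """Delay before each retry attempt: delay * multiplier**attempt."""
    return [delay * multiplier ** attempt for attempt in range(max_retries)]

# With the config above (max_retries=3, retry_delay_seconds=5,
# retry_backoff_multiplier=2.0) the schedule is 5s, 10s, 20s.
```
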

Azure DevOps Pipeline Integration

Use pr-insights-pipeline.yml for a production-ready template that includes:

  • Daily incremental extraction
  • Sunday backfill for data convergence
  • Dashboard-compatible aggregates artifact

See sample-pipeline.yml for additional reference.

Daily Schedule with Sunday Backfill

The production template uses a single daily schedule that detects Sundays for backfill:

schedules:
  - cron: "0 6 * * *"  # Daily at 6 AM UTC
    displayName: "Daily PR Extraction"
    branches:
      include: [main]
    always: true

On Sundays, the pipeline automatically performs a 60-day backfill for data convergence.
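
Sunday detection from a single daily schedule reduces to a weekday check, sketched here as a small helper (illustrative; the template's actual condition logic may differ):

```python
from datetime import date

def is_backfill_day(today: date) -> bool:
    """True on Sundays (in Python's weekday(), Monday == 0, Sunday == 6)."""
    return today.weekday() == 6
```
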

CSV Output Contract

The following CSVs are generated with exact schema and column order for PowerBI compatibility:

File               Columns (in order)
organizations.csv  organization_name
projects.csv       organization_name, project_name
repositories.csv   repository_id, repository_name, project_name, organization_name
pull_requests.csv  pull_request_uid, pull_request_id, organization_name, project_name,
                   repository_id, user_id, title, status, description, creation_date,
                   closed_date, cycle_time_minutes
users.csv          user_id, display_name, email
reviewers.csv      pull_request_uid, user_id, vote, repository_id
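
Preserving the contract means emitting columns in exactly this order. A minimal sketch for pull_requests.csv using the standard library (hypothetical row data; the real generator reads rows from SQLite):

```python
import csv

# Column order is part of the PowerBI contract and must not change.
PULL_REQUEST_COLUMNS = [
    "pull_request_uid", "pull_request_id", "organization_name", "project_name",
    "repository_id", "user_id", "title", "status", "description",
    "creation_date", "closed_date", "cycle_time_minutes",
]

def write_pull_requests(path: str, rows: list) -> None:
    """Write pull_requests.csv with the exact column order PowerBI expects."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=PULL_REQUEST_COLUMNS,
                                extrasaction="raise")
        writer.writeheader()
        writer.writerows(rows)
```
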

Security & Permissions

PR Insights Dashboard (Phase 3)

The PR Insights dashboard reads data from pipeline-produced artifacts. Users must have Build Read permission on the analytics pipeline to view dashboard data.

Requirement           Details
Permission scope      Build → Read on the pipeline that produces artifacts
No special redaction  Data is not filtered per-user; access is all-or-nothing
Artifact retention    Operators must configure retention for their desired analytics window

If a user lacks permissions, the dashboard displays: "No access to analytics pipeline artifacts. Ask an admin for Build Read on pipeline X."

Governance

This project is governed by authoritative documents in the agents/ directory.

Development

# Setup
python -m venv .venv
source .venv/bin/activate  # or .venv\Scripts\activate on Windows
pip install -e .[dev]

# Lint + Format
ruff check .
ruff format .

# Type Check
mypy src/

# Test
pytest

Contributing

Line Endings (Windows Developers)

This repo uses LF line endings for cross-platform compatibility. The .gitattributes file handles this automatically, but for best results:

# Recommended: Let .gitattributes be the source of truth
git config core.autocrlf false

# Alternative: Convert on commit (but not checkout)
git config core.autocrlf input

If you see "CRLF will be replaced by LF" warnings, that's expected behavior.
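
A .gitattributes along these lines enforces the LF policy (a minimal sketch; the repo's actual file may declare more patterns):

```
* text=auto eol=lf
*.png binary
```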

License

MIT

