
ado-git-repo-insights


Extract Azure DevOps Pull Request metrics to SQLite and generate PowerBI-compatible CSVs.

Overview

This tool replaces the MongoDB-based ado-pull-request-metrics with a lightweight, file-based solution that:

  • Stores data in SQLite - No external database required
  • Runs as an Azure DevOps Pipeline Task - Scheduled daily extraction
  • Preserves the PowerBI CSV contract - Same filenames, columns, and ordering
  • Supports incremental + backfill extraction - Efficient daily updates with periodic convergence
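
Because all state lives in a single SQLite file, downstream tooling can inspect it with nothing but the standard library. A minimal sketch of such an ad-hoc query (the table and column names here are illustrative assumptions, not the tool's documented schema):

```python
import sqlite3

# In-memory stand-in for ado-insights.sqlite; the real file lives
# wherever --database points. Schema is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pull_requests "
    "(pull_request_id INTEGER, status TEXT, cycle_time_minutes REAL)"
)
conn.executemany(
    "INSERT INTO pull_requests VALUES (?, ?, ?)",
    [(1, "completed", 120.0), (2, "active", None), (3, "completed", 60.0)],
)

# Average cycle time over completed PRs -- the kind of one-off query
# a file-based store makes trivial (AVG ignores NULLs).
(avg,) = conn.execute(
    "SELECT AVG(cycle_time_minutes) FROM pull_requests "
    "WHERE status = 'completed'"
).fetchone()
print(avg)  # 90.0
```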

Quick Start

Installation

pip install ado-git-repo-insights

Usage Options

This tool provides two ways to extract Azure DevOps Pull Request metrics:

| Aspect | CLI (Option 1) | Extension (Option 2) |
|---|---|---|
| Requires Python | Yes | No (bundled) |
| Installation | pip install | Upload VSIX to ADO |
| Pipeline syntax | Script steps | Task step |
| Works outside ADO | Yes | No (ADO only) |
| Flexibility | Higher | Standard |

Option 1: Python CLI

Best for users comfortable with Python/pip, custom scripts, and non-ADO CI/CD systems.

First Run (Extract Data)

ado-insights extract \
  --organization MyOrg \
  --projects "ProjectOne,ProjectTwo" \
  --pat $ADO_PAT \
  --database ./ado-insights.sqlite

Note: The end date defaults to yesterday to avoid capturing incomplete data for the current day. To include today, pass --end-date $(date +%Y-%m-%d) (Bash) or --end-date (Get-Date -Format yyyy-MM-dd) (PowerShell).

Generate CSVs

ado-insights generate-csv \
  --database ./ado-insights.sqlite \
  --output ./csv_output

Backfill Mode (Weekly Convergence)

ado-insights extract \
  --organization MyOrg \
  --projects "ProjectOne,ProjectTwo" \
  --pat $ADO_PAT \
  --database ./ado-insights.sqlite \
  --backfill-days 60
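
The --backfill-days flag re-extracts a trailing window so that late-arriving changes (re-opened PRs, new reviewer votes) eventually converge. A sketch of the window arithmetic, assuming the window start is simply the run date minus the backfill window:

```python
from datetime import date, timedelta

def backfill_start(run_date: date, backfill_days: int) -> date:
    """Start of the re-extraction window: run date minus the window size."""
    return run_date - timedelta(days=backfill_days)

print(backfill_start(date(2024, 3, 1), 60))  # 2024-01-01
```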

Option 2: Azure DevOps Extension

Best for teams that prefer the ADO pipeline editor UI or want a self-contained task without managing Python dependencies.

steps:
  - task: ExtractPullRequests@2
    inputs:
      organization: 'MyOrg'
      projects: 'Project1,Project2'
      pat: '$(PAT_SECRET)'
      database: '$(Pipeline.Workspace)/data/ado-insights.sqlite'
      outputDir: '$(Pipeline.Workspace)/csv_output'

Installation:

  1. Download the .vsix from GitHub Releases
  2. Install in your ADO organization: Organization Settings → Extensions → Browse local extensions

PR Insights Dashboard

Once the extension is installed and a pipeline runs successfully with the aggregates artifact published, the PR Insights hub appears in the project navigation menu. The dashboard auto-discovers pipelines that publish aggregates.

Configuration precedence:

  1. ?dataset=<url> — Direct URL (dev/testing only)
  2. ?pipelineId=<id> — Query parameter override
  3. Extension settings — User-scoped saved preference (Project Settings → PR Insights Settings)
  4. Auto-discovery — Find pipelines with 'aggregates' artifact
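
The four-step precedence above is a first-match-wins chain. A sketch in Python for illustration (the dashboard itself is a web extension, and these parameter and setting names are assumptions):

```python
def resolve_dataset_source(query_params: dict, saved_settings: dict,
                           discovered_pipelines: list):
    """Return (source, value) using first-match-wins precedence."""
    if "dataset" in query_params:          # 1. direct URL (dev/testing only)
        return ("dataset-url", query_params["dataset"])
    if "pipelineId" in query_params:       # 2. query parameter override
        return ("pipeline-id", query_params["pipelineId"])
    if saved_settings.get("pipelineId"):   # 3. saved user preference
        return ("settings", saved_settings["pipelineId"])
    if discovered_pipelines:               # 4. auto-discovery fallback
        return ("auto", discovered_pipelines[0])
    return ("none", None)

# A query-parameter override beats a saved preference:
print(resolve_dataset_source({"pipelineId": "42"}, {"pipelineId": "7"}, ["9"]))
# ('pipeline-id', '42')
```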

Configuration

Create a config.yaml file:

organization: MyOrg

projects:
  - ProjectOne
  - ProjectTwo
  - Project%20Three  # URL-encoded names supported

api:
  base_url: https://dev.azure.com
  version: 7.1-preview.1
  rate_limit_sleep_seconds: 0.5
  max_retries: 3
  retry_delay_seconds: 5
  retry_backoff_multiplier: 2.0

backfill:
  enabled: true
  window_days: 60

Then run:

ado-insights extract --config config.yaml --pat $ADO_PAT
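
With the api settings above, transient failures are retried with exponential backoff. The delay sequence implied by retry_delay_seconds and retry_backoff_multiplier can be sketched as follows (this mirrors the config semantics, not the tool's internal implementation):

```python
def retry_delays(max_retries: int, delay: float, multiplier: float) -> list:
    """Delay before each retry attempt: delay * multiplier**attempt."""
    return [delay * multiplier ** attempt for attempt in range(max_retries)]

# With the example config (max_retries: 3, retry_delay_seconds: 5,
# retry_backoff_multiplier: 2.0):
print(retry_delays(3, 5, 2.0))  # [5.0, 10.0, 20.0]
```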

Azure DevOps Pipeline Integration

Use pr-insights-pipeline.yml for a production-ready template that includes:

  • Daily incremental extraction
  • Sunday backfill for data convergence
  • Dashboard-compatible aggregates artifact

See sample-pipeline.yml for additional reference.

Daily Schedule with Sunday Backfill

The production template uses a single daily schedule that detects Sundays for backfill:

schedules:
  - cron: "0 6 * * *"  # Daily at 6 AM UTC
    displayName: "Daily PR Extraction"
    branches:
      include: [main]
    always: true

On Sundays, the pipeline automatically performs a 60-day backfill for data convergence.
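
The Sunday check that switches the daily run into backfill mode can be expressed in a few lines. A sketch (the template may implement this as a pipeline condition rather than in code):

```python
from datetime import date

def backfill_days_for(run_date: date, window: int = 60) -> int:
    """Return the backfill window on Sundays, 0 (incremental only) otherwise."""
    return window if run_date.weekday() == 6 else 0  # weekday() == 6 is Sunday

print(backfill_days_for(date(2024, 3, 3)))  # 60 (a Sunday)
print(backfill_days_for(date(2024, 3, 4)))  # 0  (a Monday)
```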

CSV Output Contract

The following CSVs are generated with exact schema and column order for PowerBI compatibility:

| File | Columns |
|---|---|
| organizations.csv | organization_name |
| projects.csv | organization_name, project_name |
| repositories.csv | repository_id, repository_name, project_name, organization_name |
| pull_requests.csv | pull_request_uid, pull_request_id, organization_name, project_name, repository_id, user_id, title, status, description, creation_date, closed_date, cycle_time_minutes |
| users.csv | user_id, display_name, email |
| reviewers.csv | pull_request_uid, user_id, vote, repository_id |
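
Because PowerBI reports bind to exact filenames and column order, a quick header check on generated output can catch contract regressions early. A sketch, using the users.csv columns from the table above:

```python
import csv
import io

# Expected header for users.csv, taken from the contract table above.
EXPECTED_USERS_HEADER = ["user_id", "display_name", "email"]

def header_matches(csv_text: str, expected: list) -> bool:
    """True if the CSV's first row is exactly the expected columns, in order."""
    return next(csv.reader(io.StringIO(csv_text))) == expected

sample = "user_id,display_name,email\n42,Ada Lovelace,ada@example.com\n"
print(header_matches(sample, EXPECTED_USERS_HEADER))  # True
```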

Security & Permissions

PR Insights Dashboard (Phase 3)

The PR Insights dashboard reads data from pipeline-produced artifacts. Users must have Build Read permission on the analytics pipeline to view dashboard data.

| Requirement | Details |
|---|---|
| Permission scope | Build → Read on the pipeline that produces artifacts |
| No special redaction | Data is not filtered per-user; access is all-or-nothing |
| Artifact retention | Operators must configure retention for their desired analytics window |

If a user lacks permissions, the dashboard displays: "No access to analytics pipeline artifacts. Ask an admin for Build Read on pipeline X."

Governance

This project is governed by the authoritative documents in agents/.

Development

# Setup
python -m venv .venv
source .venv/bin/activate  # or .venv\Scripts\activate on Windows
pip install -e .[dev]

# Lint + Format
ruff check .
ruff format .

# Type Check
mypy src/

# Test
pytest

Contributing

Line Endings (Windows Developers)

This repo uses LF line endings for cross-platform compatibility. The .gitattributes file handles this automatically, but for best results:

# Recommended: Let .gitattributes be the source of truth
git config core.autocrlf false

# Alternative: Convert on commit (but not checkout)
git config core.autocrlf input

If you see "CRLF will be replaced by LF" warnings, that's expected behavior.

License

MIT
