Skip to main content

EU AI Act-compliant scaffold and utilities for automating ingest PRs with GitHub Copilot

Project description

pr-automation-agent

EU AI Act-compliant scaffold and shared library for automating ingest PRs with GitHub Copilot.

Works with plain Python, Prefect, Airflow, Dagster, or any pipeline framework. Zero pipeline framework required for the core package.

pr-agent welcome screen

Table of contents


What you get

Component Description
pr-agent scaffold CLI Generate a correctly structured ingest file in seconds
BaseRestFetcher Subclass and implement fetch_all() — REST ingest done
BaseGraphQLFetcher Single-request GraphQL fetcher with cursor pagination variant
BaseDbReplicator Incremental DB replication to Parquet with watermark support
DevEnvSecretResolver Reads PR_AGENT__<GROUP>__<KEY> env vars — swap for AWS SSM etc. in prod
Dagster integration Optional BaseRestAsset, BaseGraphQLAsset, BaseDbReplicationAsset
log_ai_contribution() Append EU AI Act Art. 52 audit records locally after AI PRs merge
.github/ workflows CI that tests, materializes examples, enforces EU AI Act disclosure
compliance/ AI transparency notice (Art. 52/53) and human oversight policy (Art. 14)

Requirements

  • Python 3.10 or higher
  • pip or uv

No other dependencies for the core package. Dagster, pandas, sqlalchemy, and requests are only needed if you use the Dagster integration or DB replication.


Install

# Core only — plain Python / any framework (no pipeline deps)
pip install pr-automation-agent

# With Dagster integration + DB replication support
pip install "pr-automation-agent[dagster]"

# With uv
uv add pr-automation-agent
uv add "pr-automation-agent[dagster]"

Startup time: ~50 ms import, ~380 ms total CLI start. Dagster, pandas, and sqlalchemy are lazy imports — they are never loaded unless you explicitly call them. The CLI stays fast regardless of which extras are installed.

Once installed, run pr-agent with no arguments to open the welcome screen.


Quickstart — plain Python

No pipeline framework required. The generated file is a plain callable you can import from any script, Prefect flow, Airflow task, cron job, or Django management command.

Step 1 — scaffold

pr-agent scaffold rest --provider stripe --entity invoices
scaffold rest output

This writes rest/stripe/invoices_asset.py (relative to your working directory) and creates __init__.py in every new subdirectory automatically.

Step 2 — fill in the TODOs

Open the generated file and replace the placeholder URL and auth headers with your real API call. Everything else — output path, date-stamped JSON, row count — is handled by the base class.

Step 3 — run it

python rest/stripe/invoices_asset.py
# Wrote 120 rows to tmp/stripe/invoices/2026-07-01.json

Or import the function from your pipeline:

from rest.stripe.invoices_asset import fetch_stripe_invoices

fetch_stripe_invoices()   # returns the output path

Quickstart — Dagster

Only do this if your repo already uses Dagster. Add --framework dagster to the scaffold command:

pr-agent scaffold rest --provider stripe --entity invoices --framework dagster

The generated file uses BaseRestAsset and the @asset decorator instead of a plain callable. Register it in your Definitions:

# defs.py
from dagster import Definitions, load_assets_from_modules
from pr_automation_agent.integrations.dagster import dev_env_secret_resolver_resource

import myrepo.ingest.rest.stripe.invoices_asset as stripe_invoices

defs = Definitions(
    assets=load_assets_from_modules([stripe_invoices]),
    resources={"secret_resolver": dev_env_secret_resolver_resource},
)

Verify it appears:

dagster asset list -m defs

All scaffold options

pr-agent scaffold TYPE [OPTIONS]

TYPE:
  rest      REST API ingest
  graphql   GraphQL API ingest (single-request or cursor-paginated)
  db        Incremental DB replication to Parquet

OPTIONS (rest / graphql):
  --provider NAME    Provider name, e.g. stripe, github   [required]
  --entity NAME      Entity/resource name, e.g. invoices  [required]
  --framework NAME   dagster  (omit for plain Python)

OPTIONS (db):
  --engine NAME      DB engine, e.g. postgres, mysql       [required]
  --table NAME       Table to replicate                    [required]
  --framework NAME   dagster  (omit for plain Python)

SHARED:
  --output-dir PATH  Write into this directory (default: current dir)
  --dry-run          Print the file without writing it

EXAMPLES:
  pr-agent scaffold rest    --provider stripe  --entity invoices
  pr-agent scaffold graphql --provider github  --entity issues
  pr-agent scaffold db      --engine postgres  --table orders
  pr-agent scaffold db      --engine postgres  --table orders --dry-run
  pr-agent scaffold rest    --provider stripe  --entity invoices --framework dagster
scaffold db output

Every generated file includes:

  • An EU AI Act Art. 52 header (date-stamped, names the tool)
  • The correct base class import and subclass stub
  • A single abstract method to implement (all I/O is handled for you)
  • A if __name__ == "__main__": entry point (plain Python) or @asset decorator (Dagster)

Secrets & configuration

The tool never touches your credentials directly. Set environment variables before running:

# DB replication — postgres example
export PR_AGENT__POSTGRES__URI="postgresql://user:pass@localhost/mydb"
export PR_AGENT__POSTGRES__SINCE="2024-01-01"

# Any other secret group follows the same pattern:
# PR_AGENT__<GROUP>__<KEY>=<value>

DevEnvSecretResolver (the default) reads these at runtime. To use a production secret backend, implement AbstractSecretResolver:

from pr_automation_agent import AbstractSecretResolver, SecretReference

class AwsSsmSecretResolver(AbstractSecretResolver):
    def resolve_as_str(self, ref: SecretReference) -> str:
        import boto3
        return boto3.client("ssm").get_parameter(
            Name=f"/pr-agent/{ref.group_name}/{ref.key}",
            WithDecryption=True,
        )["Parameter"]["Value"]

# Pass at construction time (plain Python)
path, rows = PostgresOrders(AwsSsmSecretResolver()).run()

For Dagster, register it as a resource instead of dev_env_secret_resolver_resource.


EU AI Act compliance

Risk classification: Limited Risk (not Annex III high-risk).

Article Obligation What this package does
Art. 50 Disclose AI interaction PR template checkbox + ai-generated GitHub label
Art. 52 Label AI-generated content Header in every scaffolded file, auto-inserted
Art. 53/56 GPAI deployer transparency compliance/AI_TRANSPARENCY_NOTICE.md
Art. 14 (practice) Human oversight compliance/HUMAN_OVERSIGHT_POLICY.md + required PR review

Audit trail — after a Copilot-generated PR is merged, record it:

from pr_automation_agent import log_ai_contribution

log_ai_contribution(
    file_path="ingest/rest/stripe/invoices_asset.py",
    ai_model="GitHub Copilot",
    human_reviewer="@yourhandle",
    pr_number="123",
)

Records are appended to compliance/audit_log/contributions.jsonl — a local file, never transmitted anywhere. See compliance/ for the full policy documents.


Data & privacy (GDPR)

The tool does not collect, transmit, or store any personal data.

Question Answer
Does it require an account or authentication? No
Does it send data to Aevoxis or any third party? No
Does it have telemetry or analytics? No
Does it make network requests at startup or during scaffold? No
Where are output files written? Locally, to tmp/ in your working directory
What does the audit log contain? File path, AI model name, reviewer handle, PR number, timestamp — written locally only
Who controls the audit log? You. It is a plain JSONL file in compliance/audit_log/.

The requests library (used by BaseRestFetcher and BaseGraphQLFetcher) makes HTTP calls only to URLs you configure in your own subclass — never to Aevoxis or any default endpoint.

If your ingest files process personal data (e.g. user records from an API), you are the data controller / processor under GDPR — the tool is infrastructure, not a data processor itself.

For full GDPR and regulatory documentation see PRIVACY.md.


Run the tests

pip install -e ".[dev]"
pytest tests/ -v

31 tests covering resolvers, base classes, Dagster integration, and the scaffold CLI (plain Python and Dagster variants, error cases, --dry-run, __init__.py creation, provider sanitisation).


Connect your own repo

See CONNECTING.md for:

  • Path A — GitHub template: click "Use this template", copy examples, done
  • Path B — pip install: add to an existing repo in minutes
  • Production secret backend examples (AWS SSM, GCP Secret Manager)

Feedback & contributing

Report a bug Request a feature Request a new ingest asset

All bug reports, feature requests, and ingest asset requests are tracked in GitHub Issues. Questions and general discussion go in GitHub Discussions.


License

GNU Affero General Public License v3 (AGPL-3.0).
Copyright (C) 2025 Vinita Silaparasetty, Aevoxis Solutions.

For commercial use, enterprise deployment, or licensing enquiries: info@aevoxis.deaevoxis.de

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pr_automation_agent-0.1.0.tar.gz (34.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pr_automation_agent-0.1.0-py3-none-any.whl (29.5 kB view details)

Uploaded Python 3

File details

Details for the file pr_automation_agent-0.1.0.tar.gz.

File metadata

  • Download URL: pr_automation_agent-0.1.0.tar.gz
  • Upload date:
  • Size: 34.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.26 {"installer":{"name":"uv","version":"0.11.26","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for pr_automation_agent-0.1.0.tar.gz
Algorithm Hash digest
SHA256 51ec8175517773b05c3c2150abffa458c02e154abff01aeb6b486fea8b9eba63
MD5 71047cce71d5549e244dc248cb4dec05
BLAKE2b-256 868737586786afe4bbe752008e4886b29d99f6388e79c61d664f5ad4766870ca

See more details on using hashes here.

File details

Details for the file pr_automation_agent-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: pr_automation_agent-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 29.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.26 {"installer":{"name":"uv","version":"0.11.26","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for pr_automation_agent-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 dd50d2f64fe5fe2bd4e509207fdbd237bfa6c6537a78877cb14ab4ef94c1af72
MD5 195c64b106658342b18ff86736cd288a
BLAKE2b-256 a712e38d0b35fdc24203326aa61d3e397a352672d7950a9776451e93898c412c

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page