
Evaluate Azure Data Factory submissions from JSON/ZIP/folder inputs with comprehensive AI-driven grading and professional reporting.


ADFMentor

A Python package for evaluating Azure Data Factory (ADF) submissions with comprehensive AI-driven grading.

Provides accurate, fair assessment of ADF implementations including architecture, design quality, error handling, and best practices.

Features

  • AI-Driven Grading (Default): Comprehensive quality evaluation covering architecture, design, error handling, parameterization, and best practices
  • ADF JSON validation for ARM templates and repo-export style files
  • Safe ZIP extraction with size and file-count limits
  • Recursive discovery of ADF JSON artifacts
  • Optional rule-based grading for fast component checking (backward compatible)
  • Clear feedback with specific recommendations
  • Secure API key management via environment variables

Installation

pip install ADFMentor

Quick Start

from ADFMentor import ADFMentor

# Initialize mentor
mentor = ADFMentor(api_key="your-api-key", model_name="gemini-2.0-flash-exp")

# Evaluate with comprehensive AI grading (default - most accurate and fair)
result = mentor.evaluate_adf(
    submission_path="path/to/submission.zip",
    question="Grade this ADF assignment"
)

print(f"Score: {result['score']}/100")
print(f"Feedback:\n{result['feedback']}")
print(f"Metadata:\n{result['metadata_report']}")

# Optional: Use custom grading rubric
custom_rubric = """
Focus the evaluation on: error handling, parameterization,
security practices, and performance optimization.
Score 0-100 with detailed feedback.
"""

result_custom = mentor.evaluate_adf(
    submission_path="path/to/submission.zip",
    prompt=custom_rubric,
    question="Assess ADF quality"
)

# Optional: Use fast rule-based grading (backward compatible)
result_fast = mentor.evaluate_adf(
    submission_path="path/to/submission.zip",
    use_rule_based=True
)

Environment Setup

  1. Get a free Google API key: https://makersuite.google.com/app/apikey
  2. Copy .env.example to .env:
    cp .env.example .env
    
  3. Edit .env and add your API key:
    GEMINI_API_KEY=your_actual_key_here
    
  4. Important: Never commit .env to git (already protected in .gitignore)

Test files automatically load the API key from .env.
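As a minimal sketch of that fail-fast pattern (illustrative only, not part of the package API), the key can be read from the environment once .env has been loaded:

```python
import os

def load_gemini_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Read the API key from the environment and fail fast if it is missing."""
    key = os.getenv(env_var)
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; copy .env.example to .env and add your key"
        )
    return key
```

A library such as python-dotenv can populate the environment from .env before this runs.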

Supported Formats

  • .json - ARM templates or repo-export JSON files
  • .zip - ADF export packages (may include .txt notes)
  • folder path - ADF project directory
  • .txt - Optional written explanations (AI grading only)
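The recursive discovery of JSON artifacts for folder submissions can be sketched like this (an illustrative helper, not the package's internal function):

```python
from pathlib import Path

def find_adf_json(root: str) -> list[Path]:
    """Recursively collect candidate ADF JSON files under a project folder."""
    return sorted(p for p in Path(root).rglob("*.json") if p.is_file())
```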

AI Grading Evaluation

The system uses Gemini AI (default: gemini-2.0-flash-exp) for comprehensive evaluation including:

  • Architecture & Design: Pipeline organization, activity flow, component relationships
  • Error Handling: Exception handling, retry logic, resilience patterns
  • Parameterization: Runtime flexibility, hardcoded values, reusability
  • Best Practices: Naming conventions, schema definitions, performance, security
  • Completeness: All required components, connections, configurations
  • Code Quality: JSON structure, documentation, clarity

Returns a score (0-100) plus detailed feedback with specific improvement suggestions.

Security Validation

Safe ZIP extraction with:

  • 200 file limit
  • 50MB total size limit
  • 10MB per-file limit
  • Path traversal protection
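A minimal sketch of the kind of checks listed above (for illustration; this is not the package's actual implementation) might look like:

```python
import os
import zipfile

MAX_FILES = 200                      # 200 file limit
MAX_TOTAL_BYTES = 50 * 1024 * 1024   # 50MB total size limit
MAX_FILE_BYTES = 10 * 1024 * 1024    # 10MB per-file limit

def safe_extract(zip_path: str, dest_dir: str) -> None:
    """Extract a ZIP archive only if it passes size, count, and path checks."""
    dest_dir = os.path.realpath(dest_dir)
    with zipfile.ZipFile(zip_path) as zf:
        infos = zf.infolist()
        if len(infos) > MAX_FILES:
            raise ValueError(f"archive has {len(infos)} entries (limit {MAX_FILES})")
        if sum(i.file_size for i in infos) > MAX_TOTAL_BYTES:
            raise ValueError("archive exceeds total size limit")
        for info in infos:
            if info.file_size > MAX_FILE_BYTES:
                raise ValueError(f"{info.filename} exceeds per-file limit")
            # Path traversal protection: the resolved target must stay inside dest_dir
            target = os.path.realpath(os.path.join(dest_dir, info.filename))
            if not target.startswith(dest_dir + os.sep):
                raise ValueError(f"unsafe path in archive: {info.filename}")
        zf.extractall(dest_dir)
```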

Custom Rubrics (PowerBI-Style)

For specialized evaluations, define custom prompts and pass them to evaluate_adf():

from ADFMentor import ADFMentor
import os

mentor = ADFMentor(api_key=os.getenv("GEMINI_API_KEY"))

# Define specialized prompts (like PowerBI DAX/Visual/Text pattern)
prompts = {
    "pipeline": """Evaluate pipeline logic correctness (40 pts), activity selection (30 pts), 
                   performance optimization (20 pts), and completeness (10 pts). 
                   Provide 50-80 word feedback.""",
    
    "architecture": """Evaluate architecture design (40 pts), error handling (30 pts), 
                        security practices (20 pts), and documentation (10 pts). 
                        Provide 50-80 word feedback.""",
    
    "configuration": """Evaluate configuration accuracy (40 pts), parameter usage (35 pts), 
                         security implementation (15 pts), and best practices (10 pts). 
                         Provide 50-80 word feedback."""
}

# Use specific prompt for different evaluation types
result_pipeline = mentor.evaluate_adf(
    submission_path="submission.zip",
    prompt=prompts["pipeline"],
    question="Evaluate the pipeline logic"
)

result_arch = mentor.evaluate_adf(
    submission_path="submission.zip",
    prompt=prompts["architecture"],
    question="Evaluate the architecture"
)

# Or use default comprehensive AI grading (no prompt needed)
result_default = mentor.evaluate_adf(submission_path="submission.zip")

Key Features:

  • Define prompts in your test file (like PowerBI pattern)
  • Pass different prompts for different evaluation focuses
  • Default AI grading works without custom prompts
  • 50-80 word concise feedback

Project Structure

ADFMentor/
  __init__.py
ADFMentorInternal/
  core.py
  models/
    gemini.py
  utils/
    adf_validator.py
    adf_processor.py
    adf_evaluator.py

API

ADFMentor

  • ADFMentor(api_key, model_name="gemini-2.0-flash-exp")
  • evaluate_adf(submission_path, prompt=None, question=None, use_rule_based=False)

Returns:

{
  "score": 0,
  "feedback": "...",
  "metadata_report": "..."
}

Notes:

  • For .txt submissions, provide prompt (and optionally question) to enable AI grading.
  • For .zip/folder submissions, .txt files are treated as optional companion notes in AI mode.
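A small helper for consuming the returned dictionary might look like this (illustrative; the keys match the shape documented above):

```python
def summarize(result: dict) -> str:
    """Render an evaluate_adf result for display, tolerating missing keys."""
    score = result.get("score", 0)
    feedback = result.get("feedback", "")
    return f"Score: {score}/100\n{feedback}"
```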

License

MIT

Download files

Download the file for your platform.

Source Distribution

adfmentor-1.0.0.tar.gz (27.0 kB)

Uploaded Source

Built Distribution


adfmentor-1.0.0-py3-none-any.whl (18.5 kB)

Uploaded Python 3

File details

Details for the file adfmentor-1.0.0.tar.gz.

File metadata

  • Download URL: adfmentor-1.0.0.tar.gz
  • Upload date:
  • Size: 27.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.5

File hashes

Hashes for adfmentor-1.0.0.tar.gz
Algorithm Hash digest
SHA256 b3e651a49010ca9369d913654eeae371018305cb734bd97284e6b586ca318b0a
MD5 f5b02783db6ec2ea72d91a15d3b28cac
BLAKE2b-256 fec1aaf3d8fc8fc34a4e56238c3277def1fa01f587975b48ba20606b856d6fbd
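To verify a downloaded file against the SHA256 digest above, a standard hashlib check works (the filename is the one listed on this page):

```python
import hashlib

def sha256_of(path: str) -> str:
    """Compute the SHA256 hex digest of a file, streaming in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare against the published digest, e.g.:
# sha256_of("adfmentor-1.0.0.tar.gz") == "b3e651a4..."
```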


File details

Details for the file adfmentor-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: adfmentor-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 18.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.5

File hashes

Hashes for adfmentor-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 3fc6ef0b8b114bfe9d669c40e426732c03854b01df0c1273b68aea254281e633
MD5 eaf4750ad45a3595214e546fade8bee4
BLAKE2b-256 1747f888036b697e26e3dbcab3d1a48a57be60677e22732e481dcf79c7507ed5

