# Test Intelligence Engine

## Aston AI - Code Intelligence

Aston is a code intelligence system for parsing, analyzing, and finding test coverage gaps in your code.
## Installation

```bash
# Install from PyPI
pip install astonai

# Or install from source
git clone https://github.com/your-org/aston.git
cd aston
pip install -e .
```
## Development

When developing for Aston AI, the package structure is kept simple with automatic subdirectory discovery:

```toml
# pyproject.toml
[tool.setuptools]
packages = ["testindex"]  # All subdirectories are automatically included
```

This approach makes it easier to maintain the package structure as new modules are added, without needing to update the package configuration.
## Quick Start

```bash
# Install dependencies
pip install pytest pytest-cov  # if not already installed

# Initialize your repo
aston init

# Run tests with coverage
pytest --cov --cov-report=xml

# Find testing gaps
aston coverage
```
## Core Commands

```bash
# Initialize repository
aston init [--offline]

# Run tests with coverage
aston test

# Detect gaps (with options)
aston coverage --threshold 80 --json results.json --exit-on-gap
```
## Advanced Usage

### Coverage Analysis

```bash
# Find gaps with an 80% threshold
aston coverage --threshold 80

# Save results to JSON
aston coverage --json gaps.json

# Exit with code 1 if gaps exist (for CI)
aston coverage --exit-on-gap

# Specify a coverage file
aston coverage --coverage-file path/to/coverage.xml
```
### Debug Mode

```bash
# Enable debug logging
DEBUG=1 aston coverage
```
### Environment Variables

```bash
NEO4J_URI=bolt://localhost:7687   # Optional Neo4j connection
NEO4J_USER=neo4j                  # Optional Neo4j username
NEO4J_PASS=password               # Optional Neo4j password
```
## Repository-Centric Design

Aston uses a repository-centric approach:

- All operations are relative to the repository root
- Data is stored in the `.testindex` directory at the repository root
- Path resolution is normalized for consistent matching
- Works with both offline and Neo4j storage
## Legacy Support

If you've been using TestIndex, you can still use the legacy scripts:

```bash
./scripts/testindex.sh init
./scripts/testindex.sh coverage
```

These will automatically use the `aston` command if available.
## Benchmarks

Run the benchmarks to evaluate system performance:

```bash
# Run all benchmarks
./run_benchmarks.sh

# Run a specific benchmark
export BENCH_REPO=bench_repo
python benchmarks/ingest_throughput.py

# Run the coverage benchmark
python benchmarks/coverage_benchmark.py --threshold 50
```
The Knowledge v1 benchmark system measures the performance of key components against well-defined KPI targets:

- **Schema F1 Score**: Measures extraction accuracy (precision ≥ 0.95, recall ≥ 0.90)
- **Query Latency**: Measures P95 latency for queries (target < 80ms)
- **Ingest Throughput**: Measures code processing speed (target ≥ 200,000 LOC/min)
- **Incremental Latency**: Measures watcher processing speed for incremental updates (P95 < 30s)
- **Vector Cost**: Measures the cost efficiency of vector storage (target ≤ $15 per million vectors)
- **Coverage Detection**: Measures the performance and accuracy of test coverage gap detection
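As an illustration, the KPI targets above can be encoded as a small pass/fail check. This is a hedged sketch: the target values mirror the list above, but the `check_kpis` helper and the metric names are hypothetical, not part of Aston's API.

```python
# Hypothetical sketch: compare measured benchmark numbers against the
# Knowledge v1 KPI targets listed above. Metric names are illustrative.
KPI_TARGETS = {
    "schema_precision":          (">=", 0.95),
    "schema_recall":             (">=", 0.90),
    "query_latency_p95_ms":      ("<",  80),
    "ingest_loc_per_min":        (">=", 200_000),
    "incremental_latency_p95_s": ("<",  30),
    "vector_cost_per_m_usd":     ("<=", 15),
}

_OPS = {
    ">=": lambda actual, target: actual >= target,
    "<=": lambda actual, target: actual <= target,
    "<":  lambda actual, target: actual < target,
}

def check_kpis(results):
    """Return {metric: bool} for every target metric present in `results`."""
    return {
        name: _OPS[op](results[name], target)
        for name, (op, target) in KPI_TARGETS.items()
        if name in results
    }

# Example with two of the numbers reported by CI
print(check_kpis({"schema_precision": 1.0, "query_latency_p95_ms": 58.6}))
```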
### Running Benchmarks

```bash
# Simplified benchmark runner (recommended)
./run_benchmarks.sh           # Run all benchmarks (full mode)
./run_benchmarks.sh --smoke   # Run only schema F1 and query latency (smoke mode)

# Legacy benchmark runners
python run_benchmarks_with_setup.py
python run_benchmarks_with_setup.py --skip-clone

# Run a specific benchmark
python run_schema_f1_benchmark.py
python run_query_latency_benchmark.py
python run_incremental_latency_benchmark.py
python run_vector_cost_benchmark.py
python benchmarks/coverage_benchmark.py
```
The `run_benchmarks.sh` script is the recommended way to run benchmarks because it:

- Sets up all necessary environment variables
- Automatically uses the mock watcher for the incremental latency benchmark
- Supports both full and smoke test modes (matching CI behavior)
- Writes consistent output to the artifacts directory
- Does not scan the entire repository during incremental tests
### Testing with Multiple Repositories

You can test with different repositories by:

1. Setting the `BENCH_REPO` environment variable:

   ```bash
   export BENCH_REPO=/path/to/your/repo
   python run_benchmarks_with_setup.py
   ```

2. Using the Knowledge pod test infrastructure:

   ```bash
   python test_knowledge_pod.py --test_type benchmarks --chunks_dir /path/to/processed/chunks
   ```

3. Creating your own gold standard:

   ```bash
   # Generate a new gold standard for schema F1 testing
   python scripts/make_gold_sample.py --repo /path/to/your/repo
   ```
### Benchmark Results
Latest benchmark results from CI:
| Benchmark | Target | Actual | Status |
|---|---|---|---|
| Schema F1 | ≥ 0.95 (precision), ≥ 0.90 (recall) | 1.0 (precision), 1.0 (recall) | ✅ PASSED |
| Query Latency | < 80ms (P95) | 58.6ms (P95) | ✅ PASSED |
| Ingest Throughput | ≥ 200,000 LOC/min | 1,511,033 LOC/min | ✅ PASSED |
| Incremental Latency | < 30s (P95) | 5.0s (P95) | ✅ PASSED |
| Vector Cost | ≤ $15/M vectors | $15.00/M vectors | ✅ PASSED |
| Coverage Detection | < 1s | 0.61s | ✅ PASSED |
### Benchmark Repository Details

The benchmarks use a specific Django repository commit that is **frozen** for Knowledge v1:

- Repository: https://github.com/django/django.git
- Tag: 4.2.11
- SHA: 61a986f53d805e4d359ab61af60a2dcd55befe25
- Patch: `patches/django_1k.diff` (adds 1,000 lines of code for the benchmark)

When recreating the benchmark environment, ensure the repository is checked out at this exact commit and that the patch applies cleanly.
## CI Pipeline

The CI pipeline runs benchmarks on pull requests and on commits to main:

1. **PR Smoke Tests** (`pr_smoke.yml`):
   - Triggered on pull requests to main
   - Runs only the schema F1 and query latency benchmarks (faster)
   - Sets the `CI_SMOKE=1` environment variable
   - Equivalent to running `./run_benchmarks.sh --smoke` locally

2. **Full KPI Tests** (`full_kpi.yml`):
   - Triggered on pushes to the main branch and nightly at midnight UTC
   - Runs all benchmarks, including coverage detection
   - Equivalent to running `./run_benchmarks.sh` locally

3. **Benchmark Constants Check**:
   - Verifies that benchmark constants haven't been modified without a tag update

All benchmark results are saved in the `artifacts/` directory for analysis.
## Requirements

- Python 3.9+
- Neo4j database (set credentials via environment variables; optional for offline mode)
- Git (for cloning benchmark repositories)
- SQLite (for vector storage benchmarking)
## Environment Variables

```bash
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASS=your_password
BENCH_REPO=/path/to/repo                        # Optional; falls back to the Django repo or mini-repo
KINE_BENCHMARK_REPO=/path/to/repo               # Used for the incremental latency benchmark
KINE_WATCHER_SOCKET=/tmp/vsearch-watcher.sock   # Socket path for watcher communication
KINE_TEST_OUTPUT_DIR=/path/to/output            # Directory to save benchmark results
VECTOR_STORE_PATH=/path/to/vectors.sqlite       # Path to the vector store
MOCK_PROCESSING_TIME=2                          # Processing time for the mock watcher, in seconds
```
## Knowledge Contract Package

We provide a Knowledge Contract package that other tools can use to integrate with the Knowledge v1 system.

### Installation

```bash
# Install from GitHub release
pip install https://github.com/thusai/testindex-graph/releases/download/v0.1.0/testindex_knowledge_contract-0.1.0-py3-none-any.whl
```

For usage instructions, see the Knowledge Contract README.
### Basic Usage

```python
# Import schema constants
from testindex_knowledge_contract.schema import (
    IMPL_LABEL, GAP_LABEL, PROP_ID, PROP_PATH, PROP_START, PROP_END, PROP_COVER
)

# Import the Neo4j client
from testindex_knowledge_contract.neo4j_client import Neo4jClient

# Import benchmark constants
from testindex_knowledge_contract.bench_constants import DJANGO_SHA, PATCH_FILE
```
### Neo4j Client Usage

```python
# Connect using environment variables (NEO4J_URI, NEO4J_USER, NEO4J_PASS)
client = Neo4jClient()

# Or connect with specific credentials
client = Neo4jClient(
    uri="bolt://localhost:7687",
    username="neo4j",
    password="password"
)

# Use a context manager for automatic connection management
with Neo4jClient() as client:
    # Run a query
    results = client.run_query(
        "MATCH (n:Implementation) WHERE n.coverage < $threshold RETURN n.id, n.path",
        {"threshold": 50}
    )

    # Process the results
    for record in results:
        print(f"Low coverage: {record['n.id']} at {record['n.path']}")
```
### Coverage Map 360 Integration Example

```python
from testindex_knowledge_contract.schema import (
    IMPL_LABEL, GAP_LABEL, PROP_ID, PROP_PATH, PROP_START, PROP_END, PROP_COVER
)
from testindex_knowledge_contract.neo4j_client import Neo4jClient

# Set up environment variables before running:
# export NEO4J_URI=bolt://localhost:7687
# export NEO4J_USER=neo4j
# export NEO4J_PASS=password

# Create the client
client = Neo4jClient()

# Query uncovered implementation functions
query = f"""
MATCH (n:{IMPL_LABEL})
WHERE n.{PROP_COVER} < 10
RETURN n.{PROP_ID} as id, n.{PROP_PATH} as path,
       n.{PROP_START} as start_line, n.{PROP_END} as end_line
"""
results = client.run_query(query)

# Process results - identify coverage gaps
gaps = []
for record in results:
    gap = {
        "impl_id": record["id"],
        "path": record["path"],
        "start_line": record["start_line"],
        "end_line": record["end_line"],
        "coverage": 0  # Assuming these are 0% covered
    }
    gaps.append(gap)

# Create coverage gap nodes in Neo4j
for i, gap in enumerate(gaps):
    # Create a unique ID for the gap
    gap_id = f"GAP_{gap['impl_id']}_{i}"

    # Create a gap node in Neo4j
    query = f"""
    CREATE (g:{GAP_LABEL} {{
        {PROP_ID}: $id,
        {PROP_PATH}: $path,
        {PROP_START}: $start_line,
        {PROP_END}: $end_line,
        {PROP_COVER}: $coverage
    }})
    RETURN g.{PROP_ID}
    """
    result = client.run_query(query, {
        "id": gap_id,
        "path": gap["path"],
        "start_line": gap["start_line"],
        "end_line": gap["end_line"],
        "coverage": gap["coverage"]
    })
    print(f"Created gap node: {gap_id}")
```
### Using Gold Schema for Testing

```python
import json
import importlib.resources

from testindex_knowledge_contract import test_data

# Access the gold schema data
gold_schema_path = importlib.resources.files(test_data).joinpath('gold_schema.json')
with open(gold_schema_path, 'r') as f:
    gold_schema = json.load(f)

# Validate against your implementation
def validate_against_gold(gold_schema, your_implementation_results):
    gold_nodes = {node["id"]: node for node in gold_schema["nodes"]}
    gold_edges = [(edge["src"], edge["type"], edge["dst"]) for edge in gold_schema["edges"]]

    # Check whether your implementation's results match the gold standard
    implementation_nodes = {node["id"]: node for node in your_implementation_results["nodes"]}
    implementation_edges = [(edge["src"], edge["type"], edge["dst"])
                            for edge in your_implementation_results["edges"]]

    # Compute precision and recall
    node_matches = set(gold_nodes.keys()) & set(implementation_nodes.keys())
    edge_matches = set(gold_edges) & set(implementation_edges)

    node_precision = len(node_matches) / len(implementation_nodes) if implementation_nodes else 0
    node_recall = len(node_matches) / len(gold_nodes) if gold_nodes else 0
    edge_precision = len(edge_matches) / len(implementation_edges) if implementation_edges else 0
    edge_recall = len(edge_matches) / len(gold_edges) if gold_edges else 0

    return {
        "node_precision": node_precision,
        "node_recall": node_recall,
        "edge_precision": edge_precision,
        "edge_recall": edge_recall
    }
```
### Benchmark Integration

```python
import os
import subprocess

from testindex_knowledge_contract.bench_constants import (
    DJANGO_URL, DJANGO_TAG, DJANGO_SHA, PATCH_FILE
)

def setup_benchmark_repo():
    """Clone the benchmark repository at the exact frozen SHA."""
    repo_path = "benchmark_repo"
    if not os.path.exists(repo_path):
        # Clone the repository
        subprocess.run(["git", "clone", DJANGO_URL, repo_path], check=True)

        # Check out the exact SHA
        subprocess.run(["git", "-C", repo_path, "checkout", DJANGO_SHA], check=True)

        # Apply the benchmark patch if it exists (absolute path so it
        # resolves correctly after git changes into repo_path)
        if os.path.exists(PATCH_FILE):
            subprocess.run(
                ["git", "-C", repo_path, "apply", os.path.abspath(PATCH_FILE)],
                check=True
            )
    return repo_path

# Use the benchmark repository
repo_path = setup_benchmark_repo()
print(f"Benchmark repository set up at {repo_path}")
```
## Coverage Workflow

To check for test coverage gaps:

```bash
# Step 1: Initialize the repository
testindex init

# Step 2: Run tests with coverage
testindex test
# OR use your existing test runner
pytest --cov --cov-report=xml

# Step 3: Check coverage gaps
testindex coverage

# Or with options
testindex coverage --exit-on-gap --threshold 80 --json gaps.json
```
The `testindex coverage` command will:

- Find `coverage.xml` in the repository
- Match coverage data to implementation nodes in the knowledge graph
- Detect gaps where coverage is below the threshold
- Display results as a table or write them to a JSON file
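The find-and-match step can be sketched in plain Python. This is only an illustration, not Aston's implementation: it assumes a Cobertura-style `coverage.xml` (the format `pytest --cov --cov-report=xml` emits), and the `find_gaps` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

def find_gaps(coverage_xml_path, threshold=80):
    """Hypothetical sketch: list files whose line coverage falls below `threshold`.

    Reads Cobertura-style XML, where each <class> element carries a
    `filename` and a `line-rate` (0.0-1.0) attribute.
    """
    tree = ET.parse(coverage_xml_path)
    gaps = []
    for cls in tree.getroot().iter("class"):
        coverage_pct = float(cls.get("line-rate", "0")) * 100
        if coverage_pct < threshold:
            gaps.append({"path": cls.get("filename"),
                         "coverage": round(coverage_pct, 1)})
    return gaps
```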
## Path Resolution

TestIndex uses a robust path resolution system that handles various path formats and ensures consistent path handling throughout the application. The system:

- Normalizes paths (case-insensitive, separator normalization)
- Supports multiple matching strategies (direct, normalized, suffix, basename)
- Logs detailed path resolution information for debugging
- Handles both absolute and relative paths

If you encounter path-related issues, enable debug logging to see detailed path resolution information:

```bash
DEBUG=1 testindex coverage
```
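The matching strategies listed above (direct, normalized, suffix, basename) can be sketched as follows. The `resolve` function below is a hypothetical illustration of the idea, not TestIndex's internal resolver:

```python
import posixpath

def normalize(path):
    """Lowercase and use forward slashes, mirroring the normalization described above."""
    p = path.replace("\\", "/").lower()
    if p.startswith("./"):
        p = p[2:]
    return p

def resolve(query, known_paths):
    """Try direct, normalized, suffix, then basename matching; return the first hit or None."""
    if query in known_paths:                     # 1. direct match
        return query
    norm = normalize(query)
    by_norm = {normalize(p): p for p in known_paths}
    if norm in by_norm:                          # 2. normalized match
        return by_norm[norm]
    for p in known_paths:                        # 3. suffix match
        if normalize(p).endswith(norm):
            return p
    base = posixpath.basename(norm)
    for p in known_paths:                        # 4. basename match
        if posixpath.basename(normalize(p)) == base:
            return p
    return None
```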
## Download Files
### astonai-0.1.12.tar.gz

- Size: 228.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.1

| Algorithm | Hash digest |
|---|---|
| SHA256 | `fa17c0cdf0cf7461ef9b42923079c04e69618e9162952f722f49c1f306042600` |
| MD5 | `b3640b4b4723ccfccbc232d039de19bf` |
| BLAKE2b-256 | `62a35895ddf841c9b425c0948de45ab5707bf49841916f45068431d8414853c0` |
### astonai-0.1.12-py3-none-any.whl

- Size: 263.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.1

| Algorithm | Hash digest |
|---|---|
| SHA256 | `bd19af498b207c87eeb326d5339aea7455d1e873a4c83d0d36316c54a0cfc70a` |
| MD5 | `a7f9cd324a150ba03bb023ac4f59ecbe` |
| BLAKE2b-256 | `a51a1c7c09bb70cc5f0c5a8b34d8b71692b1f13430f4f0d7292309802aa7c943` |