AWS Service Control Policy toolkit: lint, analyze, and simulate SCP impact

NPKT - AWS Service Control Policy Toolkit

"Apply SCPs with 0% chance of breaking existing workflows"

A unified Python toolkit for linting, analyzing, and simulating AWS Service Control Policy (SCP) impact against CloudTrail logs.

Features

  • Lint - Validate SCP syntax and detect common mistakes (36+ rules)
  • Analyze - Find conflicts, shadows, duplicates across policies
  • Simulate - Test SCP impact against real CloudTrail data before deployment
  • Validate - Quick syntax check for SCP JSON files
  • Generate Logs - Generate mock CloudTrail events for all 442 AWS services for testing

Simulation Capabilities

  • 400+ AWS services with resource ARN extraction (22 hardcoded quirky services + data-driven via IAM reference)
  • 1,150+ condition key extractors (88 hand-tuned + 1,069 auto-generated from IAM reference)
  • External context enrichment - supply org ID, principal/resource tags, VPC mappings, and management account ID via --context file to resolve normally-unevaluable condition keys
  • Service-linked role filtering - automatically excludes SLR events (SCPs don't apply to them)
  • Management account filtering - excludes management account events when management_account_id is in the context file
  • Simulation confidence scoring - reports whether denial rate is exact or a lower bound based on unevaluable condition keys
  • Strict conditions mode - --strict-conditions treats unevaluable conditions as non-matching to produce an upper-bound denial rate
  • Resource-level permissions lint rule - warns when SCP statements use non-* Resource with actions that don't support resource-level permissions (W052)
  • Multi-policy hierarchy warning - warns when multiple policies are evaluated as a flat set instead of OU hierarchy
  • Data event gap detection - warns when SCPs target events not in logs
  • Mock CloudTrail generation - generate test events for all 442 AWS services with realistic requestParameters and condition key values

Installation

Prerequisites

  • Python 3.10 or higher
  • pip

Install from source

git clone <repo-url>
cd NPKT

# Install the package
pip install -e .

# Or with dev dependencies
pip install -e ".[dev]"

Quick Start

Lint an SCP

npkt lint policy.json
npkt lint ./policies/
npkt lint policy.json --format json

Analyze policies for conflicts

npkt analyze ./policies/
npkt analyze policy.json
npkt analyze ./policies/ --format json

Simulate SCP impact

npkt simulate policy.json --logs ./cloudtrail/
npkt simulate policy.json --logs events.json --days 30
npkt simulate policy.json --logs ./logs/ --context context.json
npkt simulate policy.json --logs ./logs/ --quick

Validate syntax

npkt validate policy.json
npkt validate policy.json --verbose

Generate mock CloudTrail logs

npkt generate-logs -o logs.json
npkt generate-logs -o logs.json -s ec2,s3,iam -c 100
npkt generate-logs -o logs.json --write-only --seed 42
npkt generate-logs -o logs.json -c 1000 --regions us-east-1,eu-west-1

CLI Reference

npkt lint

Lint SCP policies for errors and best practices.

npkt lint <policy_path> [options]
Option          Description
--format, -f    Output format: text (default) or json
--strict        Treat warnings as errors
--quiet, -q     Only show errors, suppress warnings

Example output:

policy.json
  [W] W050: Statement denies all S3 actions without conditions (Statement.0)
  [I] W090: Statement uses NotAction (Statement.1)

Summary: 1 warning(s), 1 info

npkt analyze

Analyze policies for conflicts, shadows, and redundancies.

npkt analyze <policy_path> [options]
Option                              Description
--format, -f                        Output format: text (default) or json
--cross-policy/--no-cross-policy    Enable cross-policy analysis (default: enabled)
--strict                            Treat warnings as errors

Detected issues:

  • DUPLICATE_STATEMENT - Identical statements across policies
  • DUPLICATE_SID - Duplicate statement IDs
  • SHADOW - Statement is overshadowed by another
  • CONFLICT - Conflicting Allow/Deny for same actions
  • UNREACHABLE - Allow statement blocked by Deny statements

npkt simulate

Simulate SCP impact against CloudTrail events.

npkt simulate <policy_path> --logs <logs_path> [options]
Option                 Description
--logs, -l             Path to CloudTrail logs (required)
--context, -c          Path to external context JSON file (org ID, tags, VPC mappings)
--format, -f           Output format: text (default) or json
--output, -o           Write output to file
--days, -d             Days to analyze (default: 90)
--quick                Quick analysis with sampling
--sample-size          Sample size for quick mode (default: 1000)
--no-details           Hide detailed denial list
--strict-conditions    Treat unevaluable conditions as non-matching (worst-case upper-bound denial rate)

Exit codes:

  • 0 - No risk or low risk
  • 1 - Medium risk
  • 2 - High or critical risk
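
In a CI pipeline, these exit codes can gate a deployment. The following is a minimal sketch and not part of NPKT: gate_on_exit_code, run_simulation, and the max_allowed threshold are hypothetical names. Only the npkt simulate invocation and the meaning of codes 0, 1, and 2 come from this README.

```python
import subprocess

# Exit-code meanings as documented above.
EXIT_CODE_MEANING = {
    0: "no or low risk",
    1: "medium risk",
    2: "high or critical risk",
}


def gate_on_exit_code(code: int, max_allowed: int = 0) -> bool:
    """Return True if a simulate exit code is within the accepted risk.

    max_allowed is a hypothetical threshold chosen by the pipeline owner.
    Codes outside 0-2 (for example, a crash) are always rejected.
    """
    return code in EXIT_CODE_MEANING and code <= max_allowed


def run_simulation(policy_path: str, logs_path: str) -> int:
    """Run `npkt simulate` and return its exit code (needs npkt on PATH)."""
    result = subprocess.run(
        ["npkt", "simulate", policy_path, "--logs", logs_path],
        check=False,  # a non-zero exit code is a result here, not an error
    )
    return result.returncode
```

A pipeline that tolerates medium risk would call gate_on_exit_code(run_simulation("policy.json", "./logs/"), max_allowed=1).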

npkt validate

Quick syntax validation for SCP files.

npkt validate <policy_path> [--verbose]

npkt generate-logs

Generate mock CloudTrail log events for SCP simulation testing. Creates realistic events for any of the 442 AWS services in the IAM reference database, with proper requestParameters derived from IAM resource ARN patterns for high resource ARN resolution rates.

npkt generate-logs -o <output_path> [options]
Option            Description
-o, --output      Output file path (required)
-s, --services    Comma-separated service prefixes or all (default: all)
-c, --count       Total number of events to generate (default: 500)
--regions         Comma-separated AWS regions (default: us-east-1,us-west-2,eu-west-1)
--account-id      AWS account ID (default: 123456789012)
--seed            Random seed for reproducible output
--write-only      Only include write/mutative operations (skip read/list)

Example workflow - generate logs then simulate:

# Generate 500 write-only events for key services
npkt generate-logs -o test_logs.json -s ec2,s3,iam,lambda,rds -c 500 --write-only --seed 42

# Simulate your SCP against the generated events
npkt simulate policy.json --logs test_logs.json

# Simulate with external context for higher accuracy
npkt simulate policy.json --logs test_logs.json --context context.json

Python API

from npkt import (
    load_policy,
    load_policies_from_dir,
    SCPLinter,
    PolicyAnalyzer,
    analyze_policies,
    ImpactAnalyzer,
    FileIngester,
)

# Load and lint a policy
policy = load_policy("policy.json")
linter = SCPLinter()
report = linter.lint(policy.to_dict())

if report.has_errors:
    for result in report.errors:
        print(f"{result.code}: {result.message}")

# Analyze multiple policies
policies = load_policies_from_dir("./policies/")
analysis = analyze_policies(*policies)

for issue in analysis.issues:
    print(f"{issue.type.value}: {issue.message}")

# Simulate SCP impact
ingester = FileIngester("./cloudtrail/")
analyzer = ImpactAnalyzer(
    scp_policies=[policy],
    cloudtrail_ingester=ingester,
)
report = analyzer.analyze()

print(f"Denial rate: {report.denial_rate:.2%}")
print(f"Risk level: {report.get_risk_level()}")

# Simulate with external context for better accuracy
from npkt import ExternalContext

ctx = ExternalContext.from_file("context.json")
analyzer = ImpactAnalyzer(
    scp_policies=[policy],
    cloudtrail_ingester=ingester,
    external_context=ctx,
)
report = analyzer.analyze()

Project Structure

NPKT/
+-- src/npkt/               # Main package
|   +-- cli/                # CLI commands
|   |   +-- main.py         # Entry point
|   |   +-- lint.py         # lint command
|   |   +-- analyze.py      # analyze command
|   |   +-- simulate.py     # simulate command
|   |   +-- validate.py     # validate command
|   |   +-- generate.py     # generate-logs command
|   +-- models/             # Data models
|   |   +-- scp.py          # SCPStatement, SCPPolicy
|   |   +-- cloudtrail.py   # CloudTrailEvent
|   |   +-- report.py       # ImpactReport, EvaluationResult, EvaluationContext
|   |   +-- external_context.py # ExternalContext (--context FILE)
|   |   +-- lint.py         # LintReport, LintResult
|   |   +-- analysis.py     # AnalysisReport, Issue
|   +-- linter/             # SCP linter
|   +-- analyzer/           # Policy and impact analysis
|   +-- engine/             # SCP evaluation engine
|   +-- parsers/            # SCP parsers
|   +-- ingest/             # CloudTrail ingesters
|   +-- reporters/          # Output formatters
|   +-- generators/         # CloudTrail log generation
|   +-- data/               # IAM reference data
+-- tests/                  # Test suite (1219 tests)
    +-- test_services/      # Per-service tests (48 files)
    +-- test_cli/           # CLI command tests
    +-- test_engine/        # Engine tests
    +-- fixtures/           # Test data

How It Works

  1. Parse SCP: Reads and validates SCP policies (JSON format)
  2. Ingest CloudTrail: Loads CloudTrail events from files (JSON/gzip)
  3. Filter: Excludes service-linked role events and management account events (SCPs don't apply)
  4. Extract Context: Resolves resource ARNs and condition key values from each event
  5. Enrich Context: If --context is provided, enriches each event with external data (org ID, principal/resource tags, VPC mappings)
  6. Evaluate: Tests each event against SCP statements (action, resource, principal, conditions)
  7. Track Confidence: Records unresolved resources and unevaluable condition keys; qualifies denial rate as exact or lower-bound
  8. Analyze: Aggregates results, calculates statistics, detects data event gaps
  9. Report: Generates output with risk assessment, confidence score, and recommendations
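
Steps 3 and 6 can be sketched in a few lines. This is an illustration under stated assumptions, not NPKT's implementation: it handles only Deny statements without Resource or Condition, derives the IAM action from the first label of eventSource (which is not correct for every service), detects service-linked roles by the /aws-service-role/ path in the session issuer ARN, and identifies the management account by recipientAccountId. All function names are hypothetical.

```python
import re
from typing import Optional


def action_matches(pattern: str, action: str) -> bool:
    """Case-insensitive action match where * and ? are wildcards."""
    regex = re.escape(pattern).replace(r"\*", ".*").replace(r"\?", ".")
    return re.fullmatch(regex, action, flags=re.IGNORECASE) is not None


def is_service_linked_role(event: dict) -> bool:
    """Assumption: SLR sessions carry an issuer ARN under /aws-service-role/."""
    identity = event.get("userIdentity") or {}
    issuer = (identity.get("sessionContext") or {}).get("sessionIssuer") or {}
    return ":role/aws-service-role/" in issuer.get("arn", "")


def simulate(
    statements: list[dict],
    events: list[dict],
    management_account_id: Optional[str] = None,
) -> tuple[int, int]:
    """Return (evaluated, denied) counts after filtering exempt events."""
    evaluated = denied = 0
    for event in events:
        # Step 3: SCPs do not apply to SLRs or the management account.
        if is_service_linked_role(event):
            continue
        if management_account_id and (
            event.get("recipientAccountId") == management_account_id
        ):
            continue
        evaluated += 1
        # Step 6 (simplified): match the action against Deny statements.
        action = event["eventSource"].split(".")[0] + ":" + event["eventName"]
        if any(
            stmt["Effect"] == "Deny"
            and any(action_matches(p, action) for p in stmt["Action"])
            for stmt in statements
        ):
            denied += 1
    return evaluated, denied
```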

Understanding Risk Levels

Level       Denial Rate    Action
NONE        0%             Safe to apply
LOW         <1%            Review denials, likely safe
MEDIUM      1-5%           Careful review needed
HIGH        5-20%          Significant impact expected
CRITICAL    >20%           Major impact, refine SCP first
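
The thresholds can be expressed as a small function. This is a sketch of the table only: risk_level is a hypothetical name (the library exposes report.get_risk_level()), and the handling of the exact boundaries (1%, 5%, 20%) is an assumption because the table does not say which side they fall on.

```python
def risk_level(denial_rate: float) -> str:
    """Map a denial rate given as a fraction (0.05 == 5%) to a level.

    Assumption: 1% counts as MEDIUM, 5% and 20% count as HIGH.
    """
    if denial_rate <= 0:
        return "NONE"
    if denial_rate < 0.01:
        return "LOW"
    if denial_rate < 0.05:
        return "MEDIUM"
    if denial_rate <= 0.20:
        return "HIGH"
    return "CRITICAL"
```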

Supported SCP Features

  • Effects: Allow, Deny
  • Actions: Wildcards (*, s3:*, s3:Delete*)
  • NotAction: Inverse action matching
  • Resources/NotResource: ARN pattern matching with wildcards
  • Principal/NotPrincipal: Principal ARN pattern matching
  • Conditions: 24 operators with IfExists and ForAllValues/ForAnyValue modifiers
    • String: StringEquals, StringLike, StringEqualsIgnoreCase, etc.
    • ARN: ArnEquals, ArnLike, ArnNotEquals, ArnNotLike
    • Numeric: NumericEquals, NumericLessThan, NumericGreaterThan, etc.
    • IP: IpAddress, NotIpAddress
    • Date: DateEquals, DateLessThan, DateGreaterThan, etc.
    • Bool, Null
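
To make the IfExists modifier concrete, here is a minimal sketch of how three string operators could be evaluated against extracted context values. It is not NPKT's engine: evaluate_condition and OPERATORS are hypothetical names, only positive string operators are covered, and condition key lookup is treated as case-sensitive for brevity.

```python
import re


def _like(pattern: str, value: str) -> bool:
    """Case-sensitive match where * and ? in the pattern are wildcards."""
    regex = re.escape(pattern).replace(r"\*", ".*").replace(r"\?", ".")
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


# Each entry takes (expected_from_policy, actual_from_request).
OPERATORS = {
    "StringEquals": lambda expected, actual: actual == expected,
    "StringEqualsIgnoreCase": lambda expected, actual: (
        actual.lower() == expected.lower()
    ),
    "StringLike": _like,
}


def evaluate_condition(
    operator: str, key: str, expected: list[str], context: dict[str, str]
) -> bool:
    """Evaluate one condition entry against the request context."""
    if_exists = operator.endswith("IfExists")
    base = operator[: -len("IfExists")] if if_exists else operator
    if key not in context:
        # A missing key fails a positive operator unless IfExists is set.
        return if_exists
    # Multiple policy values for one key are combined with OR.
    return any(OPERATORS[base](value, context[key]) for value in expected)
```

For example, StringEqualsIfExists on a key that is absent from the event evaluates to True, while plain StringEquals evaluates to False.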

Resource ARN Extraction

NPKT uses a hybrid approach for extracting resource ARNs from CloudTrail events:

Hardcoded Quirky Services (22)

Services with non-trivial extraction logic that requires hand-tuned patterns:

Category       Services
Compute        EC2 (10 resource types, nested instancesSet), EKS (sub-resources under cluster)
Storage        S3 (composite bucket/key, regionless)
Database       RDS (colon separator db:id), ElastiCache (colon separator cluster:id)
Messaging      SQS (URL parsing), EventBridge
Networking     ELBv2, Route 53 (prefix stripping, regionless), CloudFront (regionless)
Security       IAM (regionless, priority ordering), KMS (UUID/alias/ARN detection), WAFv2 (scope-based path), Organizations
Monitoring     CloudWatch, CloudTrail, AWS Config
Integration    Step Functions, SSM (leading slash stripping), CodePipeline
Data           Glue (composite database/table)
DevOps         CloudFormation

Data-Driven Services (400+)

All remaining services use IAM reference ARN patterns for automatic extraction:

  • ARN passthrough - Detects when parameters already contain valid ARNs
  • Template scoring - When multiple resource types exist, picks the best match by resolved placeholders and specificity
  • 8 parameter matching strategies - Exact, camelCase, lowercase, snake_case, abbreviation expansion, suffix stripping, aliases, name fallback
  • Regionless/accountless handling - Correctly handles services that omit region or account from ARNs
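
Template resolution can be illustrated as follows. This is a simplified sketch, not NPKT's code: it implements only four of the eight matching strategies (exact, camelCase, lowercase, snake_case), scores templates solely by the number of placeholders resolved from request parameters, and all function names are hypothetical. The ${Placeholder} syntax follows the ARN formats published in the AWS Service Authorization Reference.

```python
import re
from typing import Optional

PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def candidate_names(placeholder: str) -> list[str]:
    """Spellings a request parameter might use for a placeholder."""
    camel = placeholder[0].lower() + placeholder[1:]
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", placeholder).lower()
    return [placeholder, camel, placeholder.lower(), snake]


def fill_template(
    template: str, params: dict, base: dict[str, str]
) -> tuple[Optional[str], int]:
    """Return (arn, resolved); arn is None if a placeholder stays unfilled."""
    resolved = 0
    missing = False

    def substitute(match: re.Match) -> str:
        nonlocal resolved, missing
        name = match.group(1)
        if name in base:  # Partition, Region, Account come from the event
            return base[name]
        for candidate in candidate_names(name):
            if candidate in params:
                resolved += 1
                return str(params[candidate])
        missing = True
        return match.group(0)

    arn = PLACEHOLDER.sub(substitute, template)
    return (None if missing else arn), resolved


def best_arn(
    templates: list[str], params: dict, base: dict[str, str]
) -> Optional[str]:
    """Pick the fully resolved template that used the most parameters."""
    results = [fill_template(t, params, base) for t in templates]
    complete = [(arn, score) for arn, score in results if arn is not None]
    if not complete:
        return None
    return max(complete, key=lambda item: item[1])[0]
```

With a Lambda function template and requestParameters of {"functionName": "my-fn"}, best_arn returns the function ARN; if an alias parameter is also present, the more specific alias template wins.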

Condition Key Extraction

NPKT evaluates 1,150+ condition keys from CloudTrail events using a two-tier system:

Hand-Tuned Extractors (88 keys across 31 services)

Manually crafted extractors for keys with non-obvious mappings:

Service    Example Keys
S3         s3:prefix, s3:delimiter, s3:x-amz-acl, s3:x-amz-server-side-encryption
EC2        ec2:instancetype, ec2:imageid, ec2:region, ec2:tenancy, ec2:volumetype
RDS        rds:databaseclass, rds:databaseengine, rds:multi-az, rds:storagetype
Lambda     lambda:functionarn, lambda:layer, lambda:runtime
KMS        kms:viaservice, kms:callerarn, kms:encryptioncontext
IAM/STS    iam:permissionsboundary, sts:rolesessionname, sts:externalid

Auto-Generated Extractors (1,069 keys across 200+ services)

Derived from IAM reference condition key definitions at startup. Key name parts are converted to requestParameters field candidates (e.g., sagemaker:VolumeKmsKeyId -> volumeKmsKeyId). Hand-tuned extractors always take priority.
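
The key-to-field conversion and the priority rule can be sketched like this. It is an illustration only: field_candidates and extract are hypothetical names, and the real extractor likely tries more candidate spellings than the two shown here.

```python
from typing import Any, Callable, Optional


def field_candidates(condition_key: str) -> list[str]:
    """Derive requestParameters field names to try for a condition key."""
    _service, _, name = condition_key.partition(":")
    if not name:
        return []
    camel = name[0].lower() + name[1:]
    # dict.fromkeys removes duplicates while keeping order.
    return list(dict.fromkeys([camel, name]))


def extract(
    condition_key: str,
    event: dict,
    hand_tuned: Optional[dict[str, Callable[[dict], Any]]] = None,
) -> Any:
    """Return the value for a condition key, or None if it is not found."""
    hand_tuned = hand_tuned or {}
    # Hand-tuned extractors always take priority over derived ones.
    if condition_key in hand_tuned:
        return hand_tuned[condition_key](event)
    params = event.get("requestParameters") or {}  # may be null in CloudTrail
    for field in field_candidates(condition_key):
        if field in params:
            return params[field]
    return None
```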

External Context Enrichment

Some condition keys (aws:PrincipalOrgId, aws:PrincipalTag/*, aws:ResourceTag/*, aws:SourceVpc) require data not present in CloudTrail events. Without this data, NPKT cannot evaluate the condition: it assumes the condition matches, does not count the event as denied, and reports the key as unevaluable.

The --context flag lets you supply this data via a JSON file, turning "assumed match" into actual evaluation:

npkt simulate policy.json --logs ./logs/ --context context.json

Context file format

{
  "management_account_id": "123456789012",
  "organization": {
    "id": "o-a1b2c3d4e5",
    "paths": ["o-a1b2c3d4e5/r-ab12/ou-ab12-11111111"]
  },
  "principals": {
    "arn:aws:iam::123456789012:role/AdminRole": {
      "tags": { "Department": "Engineering", "Environment": "production" }
    },
    "arn:aws:iam::123456789012:role/*": {
      "tags": { "OrgUnit": "eng" }
    }
  },
  "resources": {
    "arn:aws:s3:::my-bucket": {
      "tags": { "Classification": "confidential" }
    },
    "arn:aws:s3:::public-*": {
      "tags": { "Classification": "public" }
    }
  },
  "vpc_map": {
    "vpce-0a1b2c3d": "vpc-11111111"
  }
}

What each section resolves

Section                  Effect
management_account_id    Excludes events from this account (SCPs don't apply to management account)
organization.id          Resolves aws:PrincipalOrgId condition key
organization.paths       Resolves aws:PrincipalOrgPaths condition key
principals.*.tags        Resolves aws:PrincipalTag/* condition keys
resources.*.tags         Resolves aws:ResourceTag/* condition keys
vpc_map                  Resolves aws:SourceVpc (via VPC endpoint ID mapping)

Principal and resource ARN patterns support * and ? wildcards. Exact matches take priority over wildcards, and more specific patterns override less specific ones.
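
One possible reading of this priority rule is sketched below. It is an assumption, not NPKT's documented behavior: the sketch merges tags from every matching pattern and lets more specific patterns overwrite less specific ones, with "specificity" measured as the number of non-wildcard characters. resolve_tags is a hypothetical name.

```python
import re


def _matches(pattern: str, arn: str) -> bool:
    """Match an ARN against a pattern where * and ? are wildcards."""
    regex = re.escape(pattern).replace(r"\*", ".*").replace(r"\?", ".")
    return re.fullmatch(regex, arn) is not None


def _specificity(pattern: str) -> int:
    """Assumed measure: count of literal (non-wildcard) characters."""
    return sum(1 for ch in pattern if ch not in "*?")


def resolve_tags(arn: str, entries: dict[str, dict]) -> dict[str, str]:
    """Merge tags from all matching patterns; the most specific wins."""
    matching = [p for p in entries if _matches(p, arn)]
    # Sort least specific first so later updates override earlier ones;
    # an exact match always sorts last.
    matching.sort(key=lambda p: (p == arn, _specificity(p)))
    tags: dict[str, str] = {}
    for pattern in matching:
        tags.update(entries[pattern].get("tags", {}))
    return tags
```

Applied to the principals section of the example context file, AdminRole would receive Department, Environment, and OrgUnit, while any other role in the account would receive only OrgUnit.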

Gathering context data

The data for the context file can be collected with a few AWS CLI commands:

# Organization ID
aws organizations describe-organization --query 'Organization.Id'

# Principal tags
aws iam list-role-tags --role-name MyRole
aws iam list-user-tags --user-name MyUser

# Resource tags
aws resourcegroupstaggingapi get-resources --resource-type-filters ec2:instance

# VPC endpoint to VPC mapping
aws ec2 describe-vpc-endpoints --query 'VpcEndpoints[].{Id:VpcEndpointId,VpcId:VpcId}'
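
The JSON printed by these commands still has to be reshaped into the context file format. A minimal sketch, assuming the outputs have already been parsed with json.load: tags_to_dict, build_vpc_map, and build_context are hypothetical helper names, and only the organization.id, principals, and vpc_map sections are assembled.

```python
import json


def tags_to_dict(cli_output: dict) -> dict[str, str]:
    """Convert {"Tags": [{"Key": k, "Value": v}, ...]} into {k: v}.

    This is the shape returned by `aws iam list-role-tags` and
    `aws iam list-user-tags`.
    """
    return {tag["Key"]: tag["Value"] for tag in cli_output.get("Tags", [])}


def build_vpc_map(endpoints: list[dict]) -> dict[str, str]:
    """Convert the describe-vpc-endpoints query result into a vpc_map."""
    return {item["Id"]: item["VpcId"] for item in endpoints}


def build_context(
    org_id: str, role_tags: dict[str, dict], endpoints: list[dict]
) -> dict:
    """Assemble a context dict; role_tags maps principal ARN to CLI output."""
    return {
        "organization": {"id": org_id},
        "principals": {
            arn: {"tags": tags_to_dict(output)}
            for arn, output in role_tags.items()
        },
        "vpc_map": build_vpc_map(endpoints),
    }


def write_context(context: dict, path: str) -> None:
    """Write the context as JSON for use with --context."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(context, handle, indent=2)
```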

Testing

# Run all tests
pytest

# Run with coverage
pytest --cov=npkt

# Run specific test file
pytest tests/test_engine/test_scp_engine.py

Development

# Install dev dependencies
pip install -e ".[dev]"

# Run linters
ruff check .
mypy src/

# Format code
ruff format .

Simulation Confidence

NPKT reports simulation confidence to help you understand result reliability:

X Would Be DENIED:  3 (>=0.3% -- lower bound, 2 unevaluable condition key(s))

  Filtered: 42 service-linked role event(s) (SCPs do not apply to SLRs)
  Filtered: 15 management account event(s) (SCPs do not apply to management account)

Simulation Confidence: MEDIUM
------------------------------------------------------------
  Resource ARN resolved: 847/1,000 events (84.7%)
  Unevaluable condition keys encountered:
    - aws:PrincipalOrgId (found in 3 evaluations)
    - aws:ResourceTag/Environment (found in 1 evaluation)

  WARNING: These conditions were assumed to MATCH (not deny).
  The actual denial rate may be HIGHER than reported.
  Supply a --context file to resolve evaluable keys, or use
  --strict-conditions to treat unevaluable conditions as denials (worst-case).

Use --context to resolve unevaluable keys and improve confidence:

npkt simulate policy.json --logs ./logs/ --context context.json

Use --strict-conditions for worst-case analysis (upper-bound denial rate):

npkt simulate policy.json --logs ./logs/ --strict-conditions

Running both modes gives a range: the normal mode shows a lower bound and strict mode shows an upper bound. The actual denial rate is somewhere in between.
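
As a worked example of that range, the sketch below combines the denied counts from the two runs. denial_rate_range is a hypothetical helper, and the strict-mode count of 12 in the usage note is invented for illustration; only the 3-of-1,000 figure mirrors the sample output above.

```python
def denial_rate_range(
    default_denied: int, strict_denied: int, total: int
) -> tuple[float, float]:
    """Return (lower, upper) denial-rate bounds as fractions.

    default_denied: denied events from a normal run (lower bound).
    strict_denied: denied events from a --strict-conditions run (upper bound).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    lower = default_denied / total
    upper = strict_denied / total
    if lower > upper:
        raise ValueError("strict mode should never deny fewer events")
    return lower, upper
```

For 3 denials in normal mode and a hypothetical 12 in strict mode over 1,000 events, the actual denial rate lies between 0.3% and 1.2%.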

Confidence Levels:

  • HIGH - Resource resolution >95%, few unevaluable keys
  • MEDIUM - Some resources unresolved or unevaluable keys present
  • LOW - Significant data gaps, results may be unreliable

Known Limitations

CloudTrail Ingestion

  • File-based only: CloudTrail logs must be downloaded locally (JSON or gzip format)
  • No S3 direct access: Cannot read logs directly from S3 buckets
  • Data events: S3 object operations, Lambda invocations, and DynamoDB item operations require explicit CloudTrail data event logging. Most trails only capture management events -- a deny rule targeting s3:GetObject would show 0% denial rate if data events weren't enabled, giving false confidence. NPKT warns when SCPs target these events but none are found in logs.
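
The data event gap check can be approximated as follows. This is a sketch, not NPKT's detection logic: DATA_EVENT_ACTIONS is a small illustrative subset, wildcards in the policy are not expanded, and the action is derived from the first label of eventSource. The function names are hypothetical.

```python
# Illustrative subset of actions that are only logged as data events.
DATA_EVENT_ACTIONS = {
    "s3:getobject",
    "s3:putobject",
    "s3:deleteobject",
    "dynamodb:getitem",
    "dynamodb:putitem",
}


def _as_list(value) -> list:
    """Policy fields may be a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def data_event_gaps(statements: list[dict], events: list[dict]) -> list[str]:
    """Return data-event actions the SCP names that never appear in logs."""
    seen = {
        (e["eventSource"].split(".")[0] + ":" + e["eventName"]).lower()
        for e in events
    }
    targeted = {
        action.lower()
        for stmt in statements
        for action in _as_list(stmt.get("Action", []))
    } & DATA_EVENT_ACTIONS
    return sorted(targeted - seen)
```

A non-empty result means a 0% denial rate for those actions says nothing about real usage, because the trail may simply not record them.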

Condition Keys Not Evaluable

Some condition keys require external context not available in CloudTrail. Most of these can be resolved by providing a --context file (see External Context Enrichment):

Key Type                 Examples                                         Resolvable via --context?
Organization context     aws:PrincipalOrgId, aws:PrincipalOrgPaths        Yes
Principal tags           aws:PrincipalTag/*                               Yes
Resource tags            aws:ResourceTag/*                                Yes
VPC context              aws:SourceVpc                                    Yes (via VPC endpoint mapping)
Service-specific keys    s3:prefix, s3:x-amz-acl, kms:ViaService, etc.    No
Multi-factor auth        aws:MultiFactorAuthAge                           No

When these keys are encountered without a context file, NPKT assumes the condition matches and does not count the event as denied, so the reported denial rate is a lower bound. The affected keys are listed in the simulation confidence report. SCPs that rely heavily on service-specific condition keys will have less accurate results. Use --strict-conditions to flip this assumption and get an upper-bound denial rate.

Resource ARN Extraction

  • 400+ services supported: 22 hardcoded quirky services + data-driven extraction via IAM reference for all others
  • 5-layer extraction: Direct ARN fields, quirky service patterns, ~190 known ARN parameter keys, IAM reference template resolution, response element scan
  • Resolution rate: Typically 80-98% depending on service mix. The simulation confidence section reports the exact resolution rate so you can assess impact.

SCP Evaluation Scope

  • SCP layer only: This tool evaluates SCPs in isolation. It does not model identity policies, resource policies, permissions boundaries, or session policies. An action the SCP allows could still be denied by other policy types; an SCP only sets the maximum available permissions and never grants access by itself.
  • Service-linked roles: Automatically filtered out (SCPs do not apply to SLRs)
  • Management account: Filtered when management_account_id is provided in the context file
  • Resource-level permissions: Linter warns (W052) when actions that don't support resource-level permissions are paired with non-* Resource restrictions. The simulator does not yet adjust Resource matching for these actions.
  • OU hierarchy: SCPs are inherited at every level (Root, OU, Account) and all must allow an action. This tool evaluates provided policies as a flat set and warns when multiple policies are provided.
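
The difference between hierarchical and flat evaluation matters most for Allow statements. The sketch below shows the two modes diverging; it is a simplified model (actions only, no Resource or Condition), the function names are hypothetical, and flat_permits is an assumption about what "flat set" means rather than NPKT's actual code.

```python
import re


def _action_matches(pattern: str, action: str) -> bool:
    """Case-insensitive action match where * and ? are wildcards."""
    regex = re.escape(pattern).replace(r"\*", ".*").replace(r"\?", ".")
    return re.fullmatch(regex, action, flags=re.IGNORECASE) is not None


def _permits(statements: list[dict], action: str) -> bool:
    """True if some Allow matches the action and no Deny does."""

    def hit(effect: str) -> bool:
        return any(
            stmt["Effect"] == effect
            and any(_action_matches(p, action) for p in stmt["Action"])
            for stmt in statements
        )

    return hit("Allow") and not hit("Deny")


def hierarchy_permits(levels: list[list[dict]], action: str) -> bool:
    """Every level (Root, OU, Account) must permit the action."""
    return all(_permits(level, action) for level in levels)


def flat_permits(levels: list[list[dict]], action: str) -> bool:
    """Treat all statements as one set, ignoring which level they sit on."""
    merged = [stmt for level in levels for stmt in level]
    return _permits(merged, action)
```

With an Allow * at the root and an Allow s3:* on an OU, hierarchy_permits rejects ec2:RunInstances while flat_permits accepts it, which is why the multi-policy warning exists.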

Troubleshooting

If you encounter issues, see TROUBLESHOOTING.md.

Common checks:

  1. Validate your SCP: npkt validate policy.json
  2. Run tests: pytest (1219 tests verify functionality)
  3. Check CloudTrail format: Ensure valid JSON with Records array
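
The third check can be scripted. This is a minimal sketch using only the standard library; load_records is a hypothetical helper and not part of the npkt package. It assumes gzip files end in .gz and that each file holds a JSON object with a top-level Records array.

```python
import gzip
import json
from pathlib import Path


def load_records(path) -> list:
    """Load a CloudTrail log file and return its Records array.

    Raises ValueError if the top-level Records array is missing, and
    json.JSONDecodeError if the file is not valid JSON.
    """
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as handle:
        data = json.load(handle)
    records = data.get("Records") if isinstance(data, dict) else None
    if not isinstance(records, list):
        raise ValueError(f"{path}: expected a top-level 'Records' array")
    return records
```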
