
AWS CDK Construct Library to automatically create CloudWatch Alarms for resources in a CDK app based on resource type.


cdk-library-cloudwatch-alarms

WIP: a library that provides constructs, aspects, and construct extensions that make it easier to set up alarms for AWS resources in CDK code, based on the AWS recommended alarms list. The project is still in early development, so your mileage may vary.

Usage

This library is flexible in its approach and there are multiple paths to configuring alarms depending on how you'd like to work with the recommended alarms.

Feature Availability

Intended feature list as of Aug 2024

  • Aspects to apply recommended alarms to a wide scope such as a whole CDK app

    • Ability to exclude specific alarms
    • Ability to define a default set of alarm actions
    • Ability to modify the configuration of each alarm type
    • Ability to exclude specific resources
  • Constructs to ease alarm configuration for individual resources at a granular scope

    • Constructs for each available alarm according to the coverage table
    • Constructs for applying all recommended alarms to a specific resource
    • Ability to exclude specific alarms from the all recommended alarms construct
  • Extended versions of resource constructs with alarm helper methods

Resource Coverage

If it's not shown, it hasn't been worked on.

Each entry lists the service, the status of each alarm, and notes.

S3
- [x] 4xxErrors
- [x] 5xxErrors
- [ ] OperationsFailedReplication
Replication errors are difficult to set up in CDK at the moment because rule properties are IResolvables and replication rules are not available on the L2 Bucket construct.

SQS
- [x] ApproximateAgeOfOldestMessage
- [x] ApproximateNumberOfMessagesNotVisible
- [x] ApproximateNumberOfMessagesVisible
- [x] NumberOfMessagesSent
All alarms except NumberOfMessagesSent require a user-defined threshold because the appropriate value is very use-case specific.

SNS
- [x] NumberOfMessagesPublished
- [x] NumberOfNotificationsDelivered
- [x] NumberOfNotificationsFailed
- [x] NumberOfNotificationsFilteredOut-InvalidAttributes
- [x] NumberOfNotificationsFilteredOut-InvalidMessageBody
- [x] NumberOfNotificationsRedrivenToDlq
- [x] NumberOfNotificationsFailedToRedriveToDlq
- [ ] SMSMonthToDateSpentUSD
- [ ] SMSSuccessRate
Some alarms require a threshold to be defined. SMS alarms are not implemented.

Lambda
- [ ] ClaimedAccountConcurrency
- [x] Errors
- [x] Throttles
- [x] Duration
- [x] ConcurrentExecutions
ClaimedAccountConcurrency is an account-wide, one-time alarm, so it is not covered by this library at this time.

RDS
For database & cluster instances
- [x] CPUUtilization
- [x] DatabaseConnections
- [x] FreeableMemory
- [x] FreeLocalStorage
- [x] FreeStorageSpace
- [x] ReadLatency
- [x] WriteLatency
- [x] DBLoad

For clusters
- [x] AuroraVolumeBytesLeftTotal
- [x] AuroraBinlogReplicaLag
Some alarms require a threshold to be defined. AuroraVolumeBytesLeftTotal and AuroraBinlogReplicaLag alarms are created only for Aurora MySQL clusters.

ECS
- [x] CPUUtilization
- [x] MemoryUtilization
- [x] EphemeralStorageUtilized
- [x] RunningTaskCount
The alarms are applied to FargateService constructs only. EphemeralStorageUtilized requires a threshold to be defined.

EFS
- [x] PercentIOLimit
- [x] BurstCreditBalance
The alarms are applied to FileSystem constructs.

ApiGateway
- [x] 4XXError
- [x] 5XXError
- [x] Count
- [x] Latency
The alarms are applied to RestApi constructs only. Count requires a threshold to be defined. Alarms are automatically created using the ApiName and Stage dimensions. To create Count or Latency alarms using the Resource and Method dimensions, the corresponding properties must be explicitly specified.

CloudFront
- [x] 5xxErrorRate
- [x] OriginLatency
- [x] FunctionValidationErrors
- [x] FunctionExecutionErrors
- [x] FunctionThrottles
The alarms are applied to Distribution constructs only. Both 5xxErrorRate and OriginLatency require a threshold to be defined. To create Function level alarms using the FunctionName dimension, the corresponding properties must be explicitly specified.

DynamoDB
Mandatory alarms
- [x] ReadThrottleEvents
- [x] SystemErrors
- [x] WriteThrottleEvents

Replication alarms (optional)
- [x] AgeOfOldestUnreplicatedRecord
- [x] FailedToReplicateRecordCount
- [x] ThrottledPutRecordCount
The alarms are applied to Table constructs only. All the mandatory alarms require a threshold to be defined.
Replication alarms are created only if the corresponding configuration is specified. Each replication alarm has a default DelegatedOperation dimension value:
- AgeOfOldestUnreplicatedRecord: StreamRecords
- FailedToReplicateRecordCount: StreamRecords
- ThrottledPutRecordCount: PutItem

EC2
- [x] CPUUtilization
- [x] StatusCheckFailed
The alarms are applied to Instance constructs.

AutoScaling
- [x] GroupInServiceCapacity
The alarms are applied to AutoScalingGroup constructs. The alarm requires a threshold to be defined, and the AutoScalingGroup should have this metric explicitly enabled.

ElastiCache
- [x] DatabaseMemoryUsagePercentage
- [x] EngineCPUUtilization
- [x] ReplicationLag
The alarms are applied to CfnCacheCluster and CfnReplicationGroup constructs. DatabaseMemoryUsagePercentage and ReplicationLag require a threshold to be defined.

PrivateLink
Endpoints
- [x] PacketsDropped

Endpoint Services
- [x] RstPacketsSent
The alarms are applied to InterfaceVpcEndpoint and VpcEndpointService constructs. Because these objects do not expose the attributes required by alarms, the alarms cannot be implemented using an Aspect. In all cases, the threshold must be defined.

VPN
- [x] TunnelState
The alarms are applied to CfnVPNConnection constructs.

ELBv2
For ApplicationLoadBalancer
- [x] RejectedConnectionCount
- [x] HTTPCode_ELB_4XX_Count
- [x] HTTPCode_ELB_5XX_Count
- [x] HTTPCode_Target_5XX_Count

For ApplicationTargetGroup
- [x] HealthyHostCount
- [x] UnHealthyHostCount

For NetworkLoadBalancer
- [x] TCP_ELB_Reset_Count
- [x] TCP_Target_Reset_Count

For NetworkTargetGroup
- [x] HealthyHostCount
- [x] UnHealthyHostCount
For target groups, the HealthyHostCount alarm triggers when the count falls below the threshold (default: 1) and the UnHealthyHostCount alarm triggers when the count exceeds the threshold (default: 0). For load balancers, all alarms trigger when the count exceeds the threshold (default: 0).
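
The threshold directions described for ELBv2 target groups can be summarized in a small, self-contained TypeScript sketch. This is only a model of the comparison logic in the note above, not the library's code: the `AlarmRule` type and the `isBreaching` function are hypothetical names introduced for illustration, and strict comparisons are assumed from the wording "falls below" and "exceeds".

```typescript
// Illustrative model only: these names are hypothetical and not part of the library.
type Direction = 'below' | 'above';

interface AlarmRule {
  direction: Direction; // which side of the threshold triggers the alarm
  threshold: number;    // default threshold taken from the coverage notes
}

// Defaults described for target groups.
const healthyHostCount: AlarmRule = { direction: 'below', threshold: 1 };
const unHealthyHostCount: AlarmRule = { direction: 'above', threshold: 0 };

// Returns true when the observed value would trigger the alarm.
// Strict comparisons are an assumption based on "falls below" and "exceeds".
function isBreaching(rule: AlarmRule, value: number): boolean {
  return rule.direction === 'below' ? value < rule.threshold : value > rule.threshold;
}

console.log(isBreaching(healthyHostCount, 0));   // true: no healthy hosts left
console.log(isBreaching(healthyHostCount, 1));   // false: one healthy host is enough
console.log(isBreaching(unHealthyHostCount, 1)); // true: one unhealthy host
```

With these defaults, a target group alarms as soon as it has no healthy hosts or at least one unhealthy host.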

Aspects

Below is an example of configuring the Lambda aspect. You must configure non-defaults for alarms, which in most cases is only a threshold. Because the aspect is applied at the app level, it applies to the Lambda functions in both TestStack and TestStack2 and creates all available recommended alarms for those functions. See the references for additional details on Aspects, which can be applied to the app, a stack, or individual constructs depending on your use case.

import { App, Stack, Aspects, aws_lambda as lambda } from 'aws-cdk-lib';
import * as recommendedalarms from '@renovosolutions/cdk-library-cloudwatch-alarms';

const app = new App();
const stack = new Stack(app, 'TestStack', {
  env: {
    account: '123456789012',
    region: 'us-east-1',
  },
});

const stack2 = new Stack(app, 'TestStack2', {
  env: {
    account: '123456789012',
    region: 'us-east-1',
  },
});

const appAspects = Aspects.of(app);

appAspects.add(
  new recommendedalarms.LambdaRecommendedAlarmsAspect({
    configDurationAlarm: {
      threshold: 15,
    },
    configErrorsAlarm: {
      threshold: 1,
    },
    configThrottlesAlarm: {
      threshold: 0,
    },
  }),
);

new lambda.Function(stack, 'Lambda', {
  runtime: lambda.Runtime.NODEJS_20_X,
  handler: 'index.handler',
  code: lambda.Code.fromInline('exports.handler = async (event) => { console.log(event); }'),
});

new lambda.Function(stack2, 'Lambda2', {
  runtime: lambda.Runtime.NODEJS_20_X,
  handler: 'index.handler',
  code: lambda.Code.fromInline('exports.handler = async (event) => { console.log(event); }'),
});
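
To make the scoping behavior concrete, here is a simplified, dependency-free model of how an aspect reaches every construct beneath the scope it is attached to. It does not use the CDK: `TreeNode`, `TreeAspect`, `applyAspect`, and `RecordFunctionsAspect` are hypothetical stand-ins. The only detail borrowed from the CDK is the shape of its `IAspect` interface, which exposes a single `visit(node)` method. In the CDK itself, aspects are invoked during synthesis rather than immediately.

```typescript
// Simplified model of a construct tree; these are not the real CDK classes.
interface TreeNode {
  id: string;
  kind: 'app' | 'stack' | 'function';
  children: TreeNode[];
}

// Mirrors the shape of CDK's IAspect: a single visit(node) method.
interface TreeAspect {
  visit(node: TreeNode): void;
}

// Walks the scope and every descendant, calling the aspect on each node.
function applyAspect(scope: TreeNode, aspect: TreeAspect): void {
  aspect.visit(scope);
  for (const child of scope.children) {
    applyAspect(child, aspect);
  }
}

// Hypothetical aspect that records which functions would receive alarms.
class RecordFunctionsAspect implements TreeAspect {
  public readonly visited: string[] = [];

  visit(node: TreeNode): void {
    if (node.kind === 'function') {
      this.visited.push(node.id);
    }
  }
}

const lambda1: TreeNode = { id: 'Lambda', kind: 'function', children: [] };
const lambda2: TreeNode = { id: 'Lambda2', kind: 'function', children: [] };
const testStack: TreeNode = { id: 'TestStack', kind: 'stack', children: [lambda1] };
const testStack2: TreeNode = { id: 'TestStack2', kind: 'stack', children: [lambda2] };
const appNode: TreeNode = { id: 'App', kind: 'app', children: [testStack, testStack2] };

// Attached to the app, the aspect reaches the functions in both stacks.
const appWide = new RecordFunctionsAspect();
applyAspect(appNode, appWide);
console.log(appWide.visited); // [ 'Lambda', 'Lambda2' ]

// Attached to a single stack, it reaches only that stack's functions.
const stackOnly = new RecordFunctionsAspect();
applyAspect(testStack2, stackOnly);
console.log(stackOnly.visited); // [ 'Lambda2' ]
```

The same reasoning explains why narrowing the scope from the app to a stack or a single construct narrows the set of resources that receive alarms.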

Recommended Alarm Constructs

You can also apply alarms to a specific resource using the recommended alarms construct for that resource type. For example, if you have an S3 Bucket, you might do something like the following. None of the S3 alarms require configuration, so no config props are needed in this case:

import { App, Stack, aws_s3 as s3 } from 'aws-cdk-lib';
import * as recommendedalarms from '@renovosolutions/cdk-library-cloudwatch-alarms';

const app = new App();
const stack = new Stack(app, 'TestStack', {
  env: {
    account: '123456789012',
    region: 'us-east-1',
  },
});

const bucket = new s3.Bucket(stack, 'Bucket', {});

new recommendedalarms.S3RecommendedAlarms(stack, 'RecommendedAlarms', {
  bucket,
});

Individual Constructs

You can also apply specific alarms from their individual constructs:

import { App, Stack, aws_s3 as s3 } from 'aws-cdk-lib';
import * as recommendedalarms from '@renovosolutions/cdk-library-cloudwatch-alarms';

const app = new App();
const stack = new Stack(app, 'TestStack', {
  env: {
    account: '123456789012',
    region: 'us-east-1',
  },
});

const bucket = new s3.Bucket(stack, 'Bucket', {});

new recommendedalarms.S3Bucket5xxErrorsAlarm(stack, 'RecommendedAlarms', {
  bucket,
  threshold: 0.10,
});

Construct Extensions

You can use extended versions of the constructs you are familiar with to expose helper methods for alarms if you'd like to keep alarms more tightly coupled to specific resources.

import { App, Stack } from 'aws-cdk-lib';
import * as recommendedalarms from '@renovosolutions/cdk-library-cloudwatch-alarms';

const app = new App();
const stack = new Stack(app, 'TestStack', {
  env: {
    account: '123456789012',
    region: 'us-east-1',
  },
});

const bucket = new recommendedalarms.Bucket(stack, 'Bucket', {});

bucket.applyRecommendedAlarms();

Alarm Actions

You can apply alarm actions through the default actions on an aspect or on an all recommended alarms construct, or you can apply individual alarm actions through the helper methods of individual constructs. The example below sets default actions and overrides the alarm action for one specific alarm so that it uses a different SNS topic.

import {
  App,
  Stack,
  Aspects,
  aws_cloudwatch_actions as cloudwatch_actions,
  aws_lambda as lambda,
  aws_sns as sns,
} from 'aws-cdk-lib';
import * as recommendedalarms from '@renovosolutions/cdk-library-cloudwatch-alarms';

const app = new App();
const stack = new Stack(app, 'TestStack', {
  env: {
    account: '123456789012',
    region: 'us-east-1',
  },
});

const stack2 = new Stack(app, 'TestStack2', {
  env: {
    account: '123456789012',
    region: 'us-east-1',
  },
});

const alarmTopic = new sns.Topic(stack, 'Topic');
const topicAction = new cloudwatch_actions.SnsAction(alarmTopic);

// Construct ids must be unique within a scope, so the second topic gets its own id.
const alarmTopic2 = new sns.Topic(stack, 'Topic2');
const topicAction2 = new cloudwatch_actions.SnsAction(alarmTopic2);

const appAspects = Aspects.of(app);

appAspects.add(
  new recommendedalarms.LambdaRecommendedAlarmsAspect({
    defaultAlarmAction: topicAction,
    defaultOkAction: topicAction,
    defaultInsufficientDataAction: topicAction,
    configDurationAlarm: {
      threshold: 15,
      alarmAction: topicAction2,
    },
    configErrorsAlarm: {
      threshold: 1,
    },
    configThrottlesAlarm: {
      threshold: 0,
    },

  }),
);

new lambda.Function(stack, 'Lambda', {
  runtime: lambda.Runtime.NODEJS_20_X,
  handler: 'index.handler',
  code: lambda.Code.fromInline('exports.handler = async (event) => { console.log(event); }'),
});

new lambda.Function(stack2, 'Lambda2', {
  runtime: lambda.Runtime.NODEJS_20_X,
  handler: 'index.handler',
  code: lambda.Code.fromInline('exports.handler = async (event) => { console.log(event); }'),
});
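
The precedence between default actions and a per-alarm override can be sketched as follows. This is a hedged, dependency-free model rather than the library's implementation: `ActionDefaults`, `AlarmConfig`, and `resolveAlarmAction` are hypothetical names, and plain strings stand in for the SNS actions. The assumed rule, taken from the description above, is that an action set on a specific alarm wins over the default.

```typescript
// Illustrative model only; the types and function below are hypothetical.
interface ActionDefaults {
  defaultAlarmAction?: string;
}

interface AlarmConfig {
  threshold: number;
  alarmAction?: string; // optional per-alarm override
}

// Assumed precedence: a per-alarm action wins; otherwise fall back to the default.
function resolveAlarmAction(defaults: ActionDefaults, config: AlarmConfig): string | undefined {
  return config.alarmAction ?? defaults.defaultAlarmAction;
}

// Strings stand in for the topicAction and topicAction2 objects from the example.
const actionDefaults: ActionDefaults = { defaultAlarmAction: 'topicAction' };
const durationConfig: AlarmConfig = { threshold: 15, alarmAction: 'topicAction2' };
const errorsConfig: AlarmConfig = { threshold: 1 };

console.log(resolveAlarmAction(actionDefaults, durationConfig)); // topicAction2
console.log(resolveAlarmAction(actionDefaults, errorsConfig));   // topicAction
```

Under this model, only the Duration alarm notifies the second topic, while the Errors alarm falls back to the default action.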

Exclusions

You can exclude specific alarms or specific resources. Alarms are identified by the available metrics enums, and resources by the string used as the resource's id. In the example below, Lambda1 will not have any alarms created, and no alarm is created for the Duration metric on either Lambda function.

import { App, Stack, Aspects, aws_lambda as lambda } from 'aws-cdk-lib';
import * as recommendedalarms from '@renovosolutions/cdk-library-cloudwatch-alarms';

const app = new App();
const stack = new Stack(app, 'TestStack', {
  env: {
    account: '123456789012', // not a real account
    region: 'us-east-1',
  },
});

const appAspects = Aspects.of(app);

appAspects.add(
  new recommendedalarms.LambdaRecommendedAlarmsAspect({
    excludeResources: ['Lambda1'],
    excludeAlarms: [recommendedalarms.LambdaRecommendedAlarmsMetrics.DURATION],
    configDurationAlarm: {
      threshold: 15,
    },
    configErrorsAlarm: {
      threshold: 1,
    },
    configThrottlesAlarm: {
      threshold: 0,
    },
  }),
);

new lambda.Function(stack, 'Lambda1', {
  runtime: lambda.Runtime.NODEJS_20_X,
  handler: 'index.handler',
  code: lambda.Code.fromInline('exports.handler = async (event) => { console.log(event); }'),
});

new lambda.Function(stack, 'Lambda2', {
  runtime: lambda.Runtime.NODEJS_20_X,
  handler: 'index.handler',
  code: lambda.Code.fromInline('exports.handler = async (event) => { console.log(event); }'),
});
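
The combined effect of the two exclusion lists can be modeled in a few lines. The sketch below is self-contained and does not call the library: `LambdaMetric`, `Exclusions`, and `plannedAlarms` are hypothetical names, and the metric list simply mirrors the Lambda alarms marked as implemented in the coverage table.

```typescript
// Illustrative model only; names here are hypothetical, not the library's API.
type LambdaMetric = 'Errors' | 'Throttles' | 'Duration' | 'ConcurrentExecutions';

const allLambdaMetrics: LambdaMetric[] = ['Errors', 'Throttles', 'Duration', 'ConcurrentExecutions'];

interface Exclusions {
  excludeResources: string[];    // construct ids to skip entirely
  excludeAlarms: LambdaMetric[]; // metrics to skip for every resource
}

// Returns the metrics that would still get an alarm for a given construct id.
function plannedAlarms(resourceId: string, exclusions: Exclusions): LambdaMetric[] {
  if (exclusions.excludeResources.some((id) => id === resourceId)) {
    return [];
  }
  return allLambdaMetrics.filter(
    (metric) => !exclusions.excludeAlarms.some((excluded) => excluded === metric),
  );
}

const exclusionConfig: Exclusions = {
  excludeResources: ['Lambda1'],
  excludeAlarms: ['Duration'],
};

console.log(plannedAlarms('Lambda1', exclusionConfig)); // []
console.log(plannedAlarms('Lambda2', exclusionConfig)); // [ 'Errors', 'Throttles', 'ConcurrentExecutions' ]
```

Resource exclusion removes every alarm for the matching id, while alarm exclusion removes one metric across all remaining resources.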

References
