CloudWatch Alarms

Alerting & Automated Actions

Alarms Overview

A CloudWatch alarm watches a single metric over a time period you specify and performs one or more actions based on the value of the metric relative to a given threshold over a number of time periods. An action can be a notification sent to an Amazon SNS topic, an Amazon EC2 Auto Scaling policy, or an operation on an EC2 instance.

Monitor

Watches a single metric over a specified time period.

Alert

Triggers when the metric breaches a defined threshold.

Act

Initiates automated actions like notifications or scaling.

Alarm States

An alarm has three possible states:

OK

The metric is within the defined threshold.

ALARM

The metric has breached the threshold for the specified number of evaluation periods.

INSUFFICIENT_DATA

The alarm has just started, the metric is not available, or not enough data is available for the metric to determine the alarm state.
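The three states can be sketched with a toy evaluator. This is only an illustration of the definitions above, not CloudWatch's actual evaluation algorithm: the function name `evaluate_state`, the use of `None` for a missing datapoint, and the fixed "greater than" comparison are all assumptions made for the example.

```python
from enum import Enum
from typing import Optional, Sequence


class AlarmState(Enum):
    OK = "OK"
    ALARM = "ALARM"
    INSUFFICIENT_DATA = "INSUFFICIENT_DATA"


def evaluate_state(window: Sequence[Optional[float]], threshold: float) -> AlarmState:
    """Toy model: `window` holds the most recent datapoints, None = missing."""
    present = [value for value in window if value is not None]
    if not present:
        # No datapoints at all, so there is nothing to compare to the threshold.
        return AlarmState.INSUFFICIENT_DATA
    if len(present) == len(window) and all(value > threshold for value in present):
        # Every evaluation period breached the threshold.
        return AlarmState.ALARM
    return AlarmState.OK


print(evaluate_state([85.0, 91.0, 88.0], threshold=80.0).value)  # ALARM
print(evaluate_state([85.0, 70.0, 88.0], threshold=80.0).value)  # OK
print(evaluate_state([None, None, None], threshold=80.0).value)  # INSUFFICIENT_DATA
```

In the real service, partially missing data is handled according to the alarm's missing data treatment, which this sketch deliberately ignores.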

Alarm Configuration

Configuring an alarm involves specifying the metric to watch and the conditions that trigger the alarm.

Key Configuration Parameters

Metric

The metric to monitor, including its Namespace, MetricName, and any Dimensions.

Statistic

The metric statistic to apply (e.g., Average, Minimum, Maximum, Sum).

Period

The length of time, in seconds, over which the statistic is applied to produce each datapoint.

Evaluation Periods

The number of most recent periods to evaluate when determining alarm state.

Datapoints to Alarm

The number of datapoints within the evaluation periods that must be breaching for the alarm to enter the ALARM state (an "M out of N" setting).

Threshold

The value to compare the metric against.

Comparison Operator

The comparison to apply between the statistic and the threshold (e.g., >=, >, <, <=).

Missing Data Treatment

How to treat missing datapoints (e.g., `breaching`, `notBreaching`, `ignore`, `missing`).
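As a rough illustration, the parameters above map onto the keyword arguments of the `put_metric_alarm` call in boto3 (the AWS SDK for Python). The alarm name, instance ID, topic ARN, and every numeric value below are hypothetical placeholders chosen for the example, and the helper names `build_cpu_alarm_params` and `create_alarm` are not part of any library.

```python
def build_cpu_alarm_params(instance_id: str, topic_arn: str) -> dict:
    """Assemble keyword arguments for CloudWatch put_metric_alarm."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",  # hypothetical naming scheme
        # Metric: namespace, name, and dimensions
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,            # seconds per datapoint
        "EvaluationPeriods": 3,   # look at the 3 most recent periods
        "DatapointsToAlarm": 2,   # 2 of those 3 must be breaching
        "Threshold": 80.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [topic_arn],
    }


def create_alarm(params: dict) -> None:
    """Send the parameters to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without it

    boto3.client("cloudwatch").put_metric_alarm(**params)


params = build_cpu_alarm_params(
    "i-0123456789abcdef0",  # placeholder instance ID
    "arn:aws:sns:us-east-1:123456789012:ops-alerts",  # placeholder topic ARN
)
# The alarm looks back over Period * EvaluationPeriods seconds in total.
print(params["Period"] * params["EvaluationPeriods"])  # 900
```

Keeping the parameter dictionary separate from the API call makes the configuration easy to inspect or unit-test before anything is sent to AWS.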

Alarm Actions

You can configure alarms to trigger a variety of automated actions when they change state.

SNS Notifications

Send a notification to an SNS topic. This can then be routed to email, SMS, Lambda, or an HTTP endpoint.

Auto Scaling Actions

Trigger an EC2 Auto Scaling policy to scale your fleet in or out in response to changing demand.

EC2 Actions

Stop, terminate, reboot, or recover an EC2 instance. The recover action in particular is useful for automated recovery from instance failures.
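Each action is attached to an alarm as an ARN, listed per target state. The sketch below builds such lists for use with `put_metric_alarm`; the helper names, region, and topic ARN are assumptions for illustration. An Auto Scaling policy would be attached the same way by adding its policy ARN to one of the lists.

```python
# EC2 actions that CloudWatch alarms can perform on an instance.
# "recover" is supported by CloudWatch in addition to stop/terminate/reboot.
EC2_ACTIONS = {"stop", "terminate", "reboot", "recover"}


def ec2_action_arn(region: str, action: str) -> str:
    """Return the ARN that tells CloudWatch to perform an EC2 action."""
    if action not in EC2_ACTIONS:
        raise ValueError(f"unsupported EC2 action: {action}")
    return f"arn:aws:automate:{region}:ec2:{action}"


def build_action_lists(topic_arn: str, region: str) -> dict:
    """Notify on every state change shown here; reboot only when entering ALARM."""
    return {
        "AlarmActions": [topic_arn, ec2_action_arn(region, "reboot")],
        "OKActions": [topic_arn],
        "InsufficientDataActions": [topic_arn],
    }


actions = build_action_lists(
    "arn:aws:sns:us-east-1:123456789012:ops-alerts",  # placeholder topic ARN
    "us-east-1",
)
print(actions["AlarmActions"][1])  # arn:aws:automate:us-east-1:ec2:reboot
```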

Composite Alarms

Composite alarms combine multiple alarms using Boolean logic (AND, OR, NOT) to create more sophisticated alerting rules. This helps reduce alarm noise, for example by triggering only when several conditions are met at the same time.

Example Rule

ALARM(HighCPUAlarm) AND ALARM(HighMemoryAlarm)

This composite alarm would only enter the ALARM state if both the HighCPUAlarm and the HighMemoryAlarm are in the ALARM state simultaneously.
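A minimal sketch of how such a rule might be assembled and passed to the `put_composite_alarm` call in boto3. The helper names, the composite alarm name, and the topic ARN are hypothetical; `would_alarm` is only a local model of the AND rule, not how CloudWatch evaluates it.

```python
from typing import Dict, List


def build_composite_params(name: str, child_alarms: List[str], topic_arn: str) -> dict:
    """Assemble keyword arguments for CloudWatch put_composite_alarm."""
    # Require every child alarm to be in the ALARM state.
    rule = " AND ".join(f"ALARM({child})" for child in child_alarms)
    return {"AlarmName": name, "AlarmRule": rule, "AlarmActions": [topic_arn]}


def would_alarm(child_states: Dict[str, str]) -> bool:
    """Local model of the AND rule: true only if all children are in ALARM."""
    return all(state == "ALARM" for state in child_states.values())


params = build_composite_params(
    "HighCPUAndMemory",  # hypothetical composite alarm name
    ["HighCPUAlarm", "HighMemoryAlarm"],
    "arn:aws:sns:us-east-1:123456789012:ops-alerts",  # placeholder topic ARN
)
print(params["AlarmRule"])  # ALARM(HighCPUAlarm) AND ALARM(HighMemoryAlarm)
print(would_alarm({"HighCPUAlarm": "ALARM", "HighMemoryAlarm": "OK"}))  # False

# With boto3 installed and AWS credentials configured, the alarm could be created with:
#   import boto3
#   boto3.client("cloudwatch").put_composite_alarm(**params)
```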

Key Takeaways

1. Alarms watch a single metric and have three states: OK, ALARM, and INSUFFICIENT_DATA.
2. Configuration involves setting the metric, statistic, period, evaluation periods, and threshold.
3. Actions include SNS notifications, EC2 Auto Scaling, and direct EC2 instance actions.
4. Composite alarms combine multiple alarms to reduce noise and create complex alerting logic.
5. Carefully configure "Datapoints to Alarm" and "Missing Data Treatment" to avoid false positives or negatives.