Oven logo

Oven

Published

The CDK Construct Library for AWS::CloudWatch

pip install aws-cdk-aws-cloudwatch

Package Downloads

Weekly DownloadsMonthly Downloads

Authors

Project URLs

Requires Python

~=3.7

Amazon CloudWatch Construct Library

---

End-of-Support

AWS CDK v1 has reached End-of-Support on 2023-06-01. This package is no longer being updated, and users should migrate to AWS CDK v2.

For more information on how to migrate, see the Migrating to AWS CDK v2 guide.


Metric objects

Metric objects represent a metric that is emitted by AWS services or your own application, such as CPUUsage, FailureCount or Bandwidth.

Metric objects can be constructed directly or are exposed by resources as attributes. Resources that expose metrics will have functions that look like metricXxx() which will return a Metric object, initialized with defaults that make sense.

For example, lambda.Function objects have the fn.metricErrors() method, which represents the amount of errors reported by that Lambda function:

# fn: lambda.Function


errors = fn.metric_errors()

Metric objects can be account and region aware. You can specify account and region as properties of the metric, or use the metric.attachTo(Construct) method. metric.attachTo() will automatically copy the region and account fields of the Construct, which can come from anywhere in the Construct tree.

You can also instantiate Metric objects to reference any published metric that's not exposed using a convenience method on the CDK construct. For example:

hosted_zone = route53.HostedZone(self, "MyHostedZone", zone_name="example.org")
metric = cloudwatch.Metric(
    namespace="AWS/Route53",
    metric_name="DNSQueries",
    dimensions_map={
        "HostedZoneId": hosted_zone.hosted_zone_id
    }
)

Instantiating a new Metric object

If you want to reference a metric that is not yet exposed by an existing construct, you can instantiate a Metric object to represent it. For example:

metric = cloudwatch.Metric(
    namespace="MyNamespace",
    metric_name="MyMetric",
    dimensions_map={
        "ProcessingStep": "Download"
    }
)

Metric Math

Math expressions are supported by instantiating the MathExpression class. For example, a math expression that sums two other metrics looks like this:

# fn: lambda.Function


all_problems = cloudwatch.MathExpression(
    expression="errors + throttles",
    using_metrics={
        "errors": fn.metric_errors(),
        "faults": fn.metric_throttles()
    }
)

You can use MathExpression objects like any other metric, including using them in other math expressions:

# fn: lambda.Function
# all_problems: cloudwatch.MathExpression


problem_percentage = cloudwatch.MathExpression(
    expression="(problems / invocations) * 100",
    using_metrics={
        "problems": all_problems,
        "invocations": fn.metric_invocations()
    }
)

Search Expressions

Math expressions also support search expressions. For example, the following search expression returns all CPUUtilization metrics that it finds, with the graph showing the Average statistic with an aggregation period of 5 minutes:

cpu_utilization = cloudwatch.MathExpression(
    expression="SEARCH('{AWS/EC2,InstanceId} MetricName=\"CPUUtilization\"', 'Average', 300)",

    # Specifying '' as the label suppresses the default behavior
    # of using the expression as metric label. This is especially appropriate
    # when using expressions that return multiple time series (like SEARCH()
    # or METRICS()), to show the labels of the retrieved metrics only.
    label=""
)

Cross-account and cross-region search expressions are also supported. Use the searchAccount and searchRegion properties to specify the account and/or region to evaluate the search expression against.

Aggregation

To graph or alarm on metrics you must aggregate them first, using a function like Average or a percentile function like P99. By default, most Metric objects returned by CDK libraries will be configured as Average over 300 seconds (5 minutes). The exception is if the metric represents a count of discrete events, such as failures. In that case, the Metric object will be configured as Sum over 300 seconds, i.e. it represents the number of times that event occurred over the time period.

If you want to change the default aggregation of the Metric object (for example, the function or the period), you can do so by passing additional parameters to the metric function call:

# fn: lambda.Function


minute_error_rate = fn.metric_errors(
    statistic="avg",
    period=Duration.minutes(1),
    label="Lambda failure rate"
)

This function also allows changing the metric label or color (which will be useful when embedding them in graphs, see below).

Rates versus Sums

The reason for using Sum to count discrete events is that some events are emitted as either 0 or 1 (for example Errors for a Lambda) and some are only emitted as 1 (for example NumberOfMessagesPublished for an SNS topic).

In case 0-metrics are emitted, it makes sense to take the Average of this metric: the result will be the fraction of errors over all executions.

If 0-metrics are not emitted, the Average will always be equal to 1, and not be very useful.

In order to simplify the mental model of Metric objects, we default to aggregating using Sum, which will be the same for both metrics types. If you happen to know the Metric you want to alarm on makes sense as a rate (Average) you can always choose to change the statistic.

Labels

Metric labels are displayed in the legend of graphs that include the metrics.

You can use dynamic labels to show summary information about the displayed time series in the legend. For example, if you use:

# fn: lambda.Function


minute_error_rate = fn.metric_errors(
    statistic="sum",
    period=Duration.hours(1),

    # Show the maximum hourly error count in the legend
    label="[max: ${MAX}] Lambda failure rate"
)

As the metric label, the maximum value in the visible range will be shown next to the time series name in the graph's legend.

If the metric is a math expression producing more than one time series, the maximum will be individually calculated and shown for each time series produce by the math expression.

Alarms

Alarms can be created on metrics in one of two ways. Either create an Alarm object, passing the Metric object to set the alarm on:

# fn: lambda.Function


cloudwatch.Alarm(self, "Alarm",
    metric=fn.metric_errors(),
    threshold=100,
    evaluation_periods=2
)

Alternatively, you can call metric.createAlarm():

# fn: lambda.Function


fn.metric_errors().create_alarm(self, "Alarm",
    threshold=100,
    evaluation_periods=2
)

The most important properties to set while creating an Alarms are:

  • threshold: the value to compare the metric against.
  • comparisonOperator: the comparison operation to use, defaults to metric >= threshold.
  • evaluationPeriods: how many consecutive periods the metric has to be breaching the the threshold for the alarm to trigger.

To create a cross-account alarm, make sure you have enabled cross-account functionality in CloudWatch. Then, set the account property in the Metric object either manually or via the metric.attachTo() method.

Alarm Actions

To add actions to an alarm, use the integration classes from the @aws-cdk/aws-cloudwatch-actions package. For example, to post a message to an SNS topic when an alarm breaches, do the following:

import aws_cdk.aws_cloudwatch_actions as cw_actions
# alarm: cloudwatch.Alarm


topic = sns.Topic(self, "Topic")
alarm.add_alarm_action(cw_actions.SnsAction(topic))

Notification formats

Alarms can be created in one of two "formats":

  • With "top-level parameters" (these are the classic style of CloudWatch Alarms).
  • With a list of metrics specifications (these are the modern style of CloudWatch Alarms).

For backwards compatibility, CDK will try to create classic, top-level CloudWatch alarms as much as possible, unless you are using features that cannot be expressed in that format. Features that require the new-style alarm format are:

  • Metric math
  • Cross-account metrics
  • Labels

The difference between these two does not impact the functionality of the alarm in any way, except that the format of the notifications the Alarm generates is different between them. This affects both the notifications sent out over SNS, as well as the EventBridge events generated by this Alarm. If you are writing code to consume these notifications, be sure to handle both formats.

Composite Alarms

Composite Alarms can be created from existing Alarm resources.

# alarm1: cloudwatch.Alarm
# alarm2: cloudwatch.Alarm
# alarm3: cloudwatch.Alarm
# alarm4: cloudwatch.Alarm


alarm_rule = cloudwatch.AlarmRule.any_of(
    cloudwatch.AlarmRule.all_of(
        cloudwatch.AlarmRule.any_of(alarm1,
            cloudwatch.AlarmRule.from_alarm(alarm2, cloudwatch.AlarmState.OK), alarm3),
        cloudwatch.AlarmRule.not(cloudwatch.AlarmRule.from_alarm(alarm4, cloudwatch.AlarmState.INSUFFICIENT_DATA))),
    cloudwatch.AlarmRule.from_boolean(False))

cloudwatch.CompositeAlarm(self, "MyAwesomeCompositeAlarm",
    alarm_rule=alarm_rule
)

A note on units

In CloudWatch, Metrics datums are emitted with units, such as seconds or bytes. When Metric objects are given a unit attribute, it will be used to filter the stream of metric datums for datums emitted using the same unit attribute.

In particular, the unit field is not used to rescale datums or alarm threshold values (for example, it cannot be used to specify an alarm threshold in Megabytes if the metric stream is being emitted as bytes).

You almost certainly don't want to specify the unit property when creating Metric objects (which will retrieve all datums regardless of their unit), unless you have very specific requirements. Note that in any case, CloudWatch only supports filtering by unit for Alarms, not in Dashboard graphs.

Please see the following GitHub issue for a discussion on real unit calculations in CDK: https://github.com/aws/aws-cdk/issues/5595

Dashboards

Dashboards are set of Widgets stored server-side which can be accessed quickly from the AWS console. Available widgets are graphs of a metric over time, the current value of a metric, or a static piece of Markdown which explains what the graphs mean.

The following widgets are available:

  • GraphWidget -- shows any number of metrics on both the left and right vertical axes.
  • AlarmWidget -- shows the graph and alarm line for a single alarm.
  • SingleValueWidget -- shows the current value of a set of metrics.
  • TextWidget -- shows some static Markdown.
  • AlarmStatusWidget -- shows the status of your alarms in a grid view.

Graph widget

A graph widget can display any number of metrics on either the left or right vertical axis:

# dashboard: cloudwatch.Dashboard
# execution_count_metric: cloudwatch.Metric
# error_count_metric: cloudwatch.Metric


dashboard.add_widgets(cloudwatch.GraphWidget(
    title="Executions vs error rate",

    left=[execution_count_metric],

    right=[error_count_metric.with(
        statistic="average",
        label="Error rate",
        color=cloudwatch.Color.GREEN
    )]
))

Using the methods addLeftMetric() and addRightMetric() you can add metrics to a graph widget later on.

Graph widgets can also display annotations attached to the left or the right y-axis.

# dashboard: cloudwatch.Dashboard


dashboard.add_widgets(cloudwatch.GraphWidget(
    # ...

    left_annotations=[cloudwatch.HorizontalAnnotation(value=1800, label=Duration.minutes(30).to_human_string(), color=cloudwatch.Color.RED), cloudwatch.HorizontalAnnotation(value=3600, label="1 hour", color="#2ca02c")
    ]
))

The graph legend can be adjusted from the default position at bottom of the widget.

# dashboard: cloudwatch.Dashboard


dashboard.add_widgets(cloudwatch.GraphWidget(
    # ...

    legend_position=cloudwatch.LegendPosition.RIGHT
))

The graph can publish live data within the last minute that has not been fully aggregated.

# dashboard: cloudwatch.Dashboard


dashboard.add_widgets(cloudwatch.GraphWidget(
    # ...

    live_data=True
))

The graph view can be changed from default 'timeSeries' to 'bar' or 'pie'.

# dashboard: cloudwatch.Dashboard


dashboard.add_widgets(cloudwatch.GraphWidget(
    # ...

    view=cloudwatch.GraphWidgetView.BAR
))

Alarm widget

An alarm widget shows the graph and the alarm line of a single alarm:

# dashboard: cloudwatch.Dashboard
# error_alarm: cloudwatch.Alarm


dashboard.add_widgets(cloudwatch.AlarmWidget(
    title="Errors",
    alarm=error_alarm
))

Single value widget

A single-value widget shows the latest value of a set of metrics (as opposed to a graph of the value over time):

# dashboard: cloudwatch.Dashboard
# visitor_count: cloudwatch.Metric
# purchase_count: cloudwatch.Metric


dashboard.add_widgets(cloudwatch.SingleValueWidget(
    metrics=[visitor_count, purchase_count]
))

Show as many digits as can fit, before rounding.

# dashboard: cloudwatch.Dashboard


dashboard.add_widgets(cloudwatch.SingleValueWidget(
    metrics=[],

    full_precision=True
))

Text widget

A text widget shows an arbitrary piece of MarkDown. Use this to add explanations to your dashboard:

# dashboard: cloudwatch.Dashboard


dashboard.add_widgets(cloudwatch.TextWidget(
    markdown="# Key Performance Indicators"
))

Alarm Status widget

An alarm status widget displays instantly the status of any type of alarms and gives the ability to aggregate one or more alarms together in a small surface.

# dashboard: cloudwatch.Dashboard
# error_alarm: cloudwatch.Alarm


dashboard.add_widgets(
    cloudwatch.AlarmStatusWidget(
        alarms=[error_alarm]
    ))

An alarm status widget only showing firing alarms, sorted by state and timestamp:

# dashboard: cloudwatch.Dashboard
# error_alarm: cloudwatch.Alarm


dashboard.add_widgets(cloudwatch.AlarmStatusWidget(
    title="Errors",
    alarms=[error_alarm],
    sort_by=cloudwatch.AlarmStatusWidgetSortBy.STATE_UPDATED_TIMESTAMP,
    states=[cloudwatch.AlarmState.ALARM]
))

Query results widget

A LogQueryWidget shows the results of a query from Logs Insights:

# dashboard: cloudwatch.Dashboard


dashboard.add_widgets(cloudwatch.LogQueryWidget(
    log_group_names=["my-log-group"],
    view=cloudwatch.LogQueryVisualizationType.TABLE,
    # The lines will be automatically combined using '\n|'.
    query_lines=["fields @message", "filter @message like /Error/"
    ]
))

Custom widget

A CustomWidget shows the result of an AWS Lambda function:

# dashboard: cloudwatch.Dashboard


# Import or create a lambda function
fn = lambda_.Function.from_function_arn(dashboard, "Function", "arn:aws:lambda:us-east-1:123456789012:function:MyFn")

dashboard.add_widgets(cloudwatch.CustomWidget(
    function_arn=fn.function_arn,
    title="My lambda baked widget"
))

You can learn more about custom widgets in the Amazon Cloudwatch User Guide.

Dashboard Layout

The widgets on a dashboard are visually laid out in a grid that is 24 columns wide. Normally you specify X and Y coordinates for the widgets on a Dashboard, but because this is inconvenient to do manually, the library contains a simple layout system to help you lay out your dashboards the way you want them to.

Widgets have a width and height property, and they will be automatically laid out either horizontally or vertically stacked to fill out the available space.

Widgets are added to a Dashboard by calling add(widget1, widget2, ...). Widgets given in the same call will be laid out horizontally. Widgets given in different calls will be laid out vertically. To make more complex layouts, you can use the following widgets to pack widgets together in different ways:

  • Column: stack two or more widgets vertically.
  • Row: lay out two or more widgets horizontally.
  • Spacer: take up empty space