Terraform Module for Datadog Monitors on AWS

Terraform module to provision Datadog Monitors on AWS.

Usage

Simple setup

Create simple Datadog monitors with default configurations.

    module "datadog" {
        source  = "app.terraform.io/ncodelibrary/datadog/aws"
        version = "0.1.3"
        system = true
        lambda = true
        k8s    = true
        ecs    = true
        rds    = true
        elb    = true
        alb    = true
        tags   = {
            Owner = "example@nclouds.com"
        }
    }

For a complete working example, see examples/simple.

Advanced Setup

To create Datadog monitors with custom configuration (e.g., custom metrics, custom trigger tags, or custom notification templates), use the module like this:

    module "datadog" {
        source  = "app.terraform.io/ncodelibrary/datadog/aws"
        version = "0.1.3"
        system = true
        elb    = true
        tags   = {
            Owner = "example@nclouds.com"
        }
        system_queries = {
            disk_total = "avg:system.disk.total"
            mem_total  = "avg:system.mem.total"
            disk_free  = "avg:system.disk.free"
            mem_used   = "avg:system.mem.used"
            load       = "avg:system.load.norm.5"
            cpu        = "avg:system.cpu.idle"
        }
        elb_queries = {
            surge_queue_length  = "avg:aws.elb.surge_queue_length"
            active_connection   = "avg:aws.elb.active_connection_count"
            unhealthy_host      = "avg:aws.elb.unhealthy_host_count"
            response_time       = "avg:aws.elb.target_response_time.average"
            request_count       = "avg:aws.elb.request_count"
            healthy_host        = "avg:aws.elb.healthy_host_count"
            spill_over          = "avg:aws.elb.spillover_count"
            count_4xx           = "avg:aws.elb.httpcode_elb_4xx"
            count_5xx           = "avg:aws.elb.httpcode_elb_5xx"
            latency             = "avg:aws.elb.latency"
        }
        system_trigger_by = "{host,Environment}"
        elb_trigger_by    = "{host,region,Environment}"
        elb_alert_message = <<EOF
        Value: {{value}}
        Name: {{host.name}}
        Region: {{region.name}}
        Environment: {{Environment.name}}

        {{#is_no_data}}Not receiving data @pagerduty{{/is_no_data}}
        {{#is_alert}}@pagerduty{{/is_alert}}
        {{#is_warning}}@pagerduty{{/is_warning}}
        {{#is_recovery}}@pagerduty{{/is_recovery}}
        @slack-alerts
        EOF

        system_alert_message = <<EOF
        Value: {{value}}
        Name: {{host.name}}
        Region: {{region.name}}
        Environment: {{Environment.name}}

        {{#is_no_data}}Not receiving data @pagerduty{{/is_no_data}}
        {{#is_alert}}@pagerduty{{/is_alert}}
        {{#is_warning}}@pagerduty{{/is_warning}}
        {{#is_recovery}}@pagerduty{{/is_recovery}}
        @slack-alerts
        EOF
    }

For more options, refer to the working example at examples/advanced.
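
The two alert messages in the advanced example are identical, so one way to avoid repeating the template is to define it once as a local value and pass it to both inputs. This is only a sketch: the local name alert_message is hypothetical and not part of the module, and the heredoc follows the same layout as the example above.

    # Hypothetical local value; "alert_message" is an illustrative name, not a module input.
    locals {
        alert_message = <<EOF
        Value: {{value}}
        Name: {{host.name}}
        Region: {{region.name}}
        Environment: {{Environment.name}}

        {{#is_no_data}}Not receiving data @pagerduty{{/is_no_data}}
        {{#is_alert}}@pagerduty{{/is_alert}}
        {{#is_warning}}@pagerduty{{/is_warning}}
        {{#is_recovery}}@pagerduty{{/is_recovery}}
        @slack-alerts
        EOF
    }

    module "datadog" {
        source  = "app.terraform.io/ncodelibrary/datadog/aws"
        version = "0.1.3"
        system = true
        elb    = true

        # Both inputs receive the same template, so it only has to be edited in one place.
        elb_alert_message    = local.alert_message
        system_alert_message = local.alert_message
    }

If the ELB and system messages need to diverge later, each input can go back to its own heredoc without any other change.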

Examples

Working examples of this module are available in examples/simple and examples/advanced.

Requirements

Name Version
terraform >= 0.12

Providers

No providers.
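
The root module declares no providers of its own, but the monitors are presumably created through the Datadog provider inside ./modules/datadog-monitor, so the calling configuration most likely has to configure that provider. This README does not say how credentials are supplied; the sketch below is one common arrangement, and the variable names datadog_api_key and datadog_app_key are assumptions chosen for illustration.

    # The required_providers source syntax needs Terraform 0.13 or later;
    # on 0.12 this block can be dropped and the provider block used alone.
    terraform {
        required_providers {
            datadog = {
                source = "DataDog/datadog"
            }
        }
    }

    # Hypothetical variable names; supply the values however your workflow prefers.
    variable "datadog_api_key" {
        type = string
    }

    variable "datadog_app_key" {
        type = string
    }

    provider "datadog" {
        api_key = var.datadog_api_key
        app_key = var.datadog_app_key
    }

The Datadog provider can also read these credentials from the DD_API_KEY and DD_APP_KEY environment variables, in which case the provider block can stay empty.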

Modules

Name Source Version
active_connection ./modules/datadog-monitor
alb_4xx ./modules/datadog-monitor
alb_5xx ./modules/datadog-monitor
alb_active_connection ./modules/datadog-monitor
alb_healthy_host_count ./modules/datadog-monitor
alb_rejected ./modules/datadog-monitor
alb_request_count ./modules/datadog-monitor
alb_target_response_time ./modules/datadog-monitor
alb_unhealthy_host_count ./modules/datadog-monitor
cpu_utilization ./modules/datadog-monitor
crashloopbackoff ./modules/datadog-monitor
current_conn ./modules/datadog-monitor
diff_in_nodes_daemonset ./modules/datadog-monitor
diff_in_replica_replicaset ./modules/datadog-monitor
diff_in_replica_statefulset ./modules/datadog-monitor
diff_in_replicas_deployment ./modules/datadog-monitor
ec_failover ./modules/datadog-monitor
ecs_cluster_cpu_reservation ./modules/datadog-monitor
ecs_cluster_cpu_utilization ./modules/datadog-monitor
ecs_cluster_mem_reservation ./modules/datadog-monitor
ecs_cluster_mem_utilization ./modules/datadog-monitor
ecs_service_cpu_utilization ./modules/datadog-monitor
ecs_service_mem_utilization ./modules/datadog-monitor
ecs_task_count ./modules/datadog-monitor
elb_4xx ./modules/datadog-monitor
elb_5xx ./modules/datadog-monitor
elb_healthy_host_count ./modules/datadog-monitor
elb_latency ./modules/datadog-monitor
elb_request_count ./modules/datadog-monitor
elb_spillover ./modules/datadog-monitor
elb_surge_queue_length ./modules/datadog-monitor
elb_target_response_time ./modules/datadog-monitor
elb_unhealthy_host_count ./modules/datadog-monitor
engine_cpu_utilization ./modules/datadog-monitor
evictions ./modules/datadog-monitor
failover ./modules/datadog-monitor
host_mem ./modules/datadog-monitor
imagepull_backoff ./modules/datadog-monitor
job_failed ./modules/datadog-monitor
lag ./modules/datadog-monitor
lambda_concurrent_executions ./modules/datadog-monitor
lambda_duration ./modules/datadog-monitor
lambda_errors ./modules/datadog-monitor
lambda_throtttle_executions ./modules/datadog-monitor
memc_mem ./modules/datadog-monitor
node_disk_pressure ./modules/datadog-monitor
node_mem_pressure ./modules/datadog-monitor
node_nw_unavailable ./modules/datadog-monitor
node_out_of_disk ./modules/datadog-monitor
pod_memory ./modules/datadog-monitor
rds_burst_balance ./modules/datadog-monitor
rds_cpu_utilization ./modules/datadog-monitor
rds_cpucredit_balance ./modules/datadog-monitor
rds_db_connections ./modules/datadog-monitor
rds_disk_queuedepth ./modules/datadog-monitor
rds_disk_usage ./modules/datadog-monitor
rds_mem_utilization ./modules/datadog-monitor
rds_read_iops ./modules/datadog-monitor
rds_read_latency ./modules/datadog-monitor
rds_read_throughput ./modules/datadog-monitor
rds_replica_lag ./modules/datadog-monitor
rds_write_iops ./modules/datadog-monitor
rds_write_latency ./modules/datadog-monitor
rds_write_throughput ./modules/datadog-monitor
redis_mem ./modules/datadog-monitor
system_cpu ./modules/datadog-monitor
system_disk ./modules/datadog-monitor
system_load ./modules/datadog-monitor
system_mem ./modules/datadog-monitor
unable_to_place_event ./modules/datadog-monitor
unscheduled_nodes ./modules/datadog-monitor

Resources

No resources.

Inputs

Name Description Type Default Required
alb Variable to enable ALB monitors creation bool false no
alb_queries Variables for defining Datadog ALB queries. map
{
"active_connection": "avg:aws.applicationelb.active_connection_count",
"count_4xx": "avg:aws.applicationelb.httpcode_elb_4xx",
"count_5xx": "avg:aws.applicationelb.httpcode_elb_5xx",
"healthy_host": "avg:aws.applicationelb.healthy_host_count",
"rejected": "avg:aws.applicationelb.rejected_connection_count",
"request_count": "avg:aws.applicationelb.request_count",
"response_time": "avg:aws.applicationelb.target_response_time.average",
"unhealthy_host": "avg:aws.applicationelb.un_healthy_host_count"
}
no
ec Variable to enable ElastiCache monitors creation bool false no
ec_alert_message Alert message to add in Elasticache monitors. string "Value: {{value}}\nCluster: {{cacheclusterid.name}}\nNode: {{cachenodeid.name}}\nRegion: {{region.name}}\nEnvironment: {{Environment.name}}\n\n{{#is_no_data}}Not receiving data @pagerduty{{/is_no_data}}\n{{#is_alert}}@pagerduty{{/is_alert}}\n{{#is_warning}}@pagerduty{{/is_warning}}\n{{#is_recovery}}@pagerduty{{/is_recovery}}\n@slack-alerts\n" no
ec_queries Variables for defining Datadog ElastiCache queries. map
{
"cpu": "avg:aws.elasticache.cpuutilization",
"current_conn": "avg:aws.elasticache.curr_connections",
"engine_cpu": "avg:aws.elasticache.engine_cpuutilization",
"evictions": "avg:aws.elasticache.evictions",
"host_mem": "avg:aws.elasticache.freeable_memory",
"lag": "avg:aws.elasticache.replication_lag",
"memc_mem": "avg:aws.elasticache.bytes_used_for_cache_items",
"redis_mem": "avg:aws.elasticache.bytes_used_for_cache"
}
no
ec_trigger_by Tags to trigger a separate alert for each tag mentioned. string "{cacheclusterid,cachenodeid,Environment}" no
ecs Enable ecs monitors creation bool false no
ecs_alert_message Alert message to add in all ecs monitors. string "Cluster: {{clustername.name}}\nService: {{servicename.name}}\nEnvironment: {{Environment.name}}\n\n{{#is_no_data}}Not receiving data @pagerduty{{/is_no_data}}\n{{#is_alert}}@pagerduty{{/is_alert}}\n{{#is_warning}}@pagerduty{{/is_warning}}\n{{#is_recovery}}@pagerduty{{/is_recovery}}\n@slack-alerts\n" no
ecs_queries Variables for defining datadog ecs queries. map
{
"cpu_reservation": "avg:aws.ecs.cpureservation",
"cpu_utilization": "avg:aws.ecs.cpuutilization",
"desired": "avg:aws.ecs.service.desired",
"mem_reservation": "avg:aws.ecs.memory_reservation",
"mem_utilization": "avg:aws.ecs.memory_utilization",
"running": "avg:aws.ecs.service.running"
}
no
ecs_trigger_by Tags to trigger a separate alert for each tag mentioned. string "{clustername,servicename,region,Environment}" no
elb variable to enable elb monitors creation bool false no
elb_alert_message Alert message to add in all AWS elb and alb monitors. string "Value: {{value}}\nName: {{host.name}}\nRegion: {{region.name}}\nEnvironment: {{Environment.name}}\n\n{{#is_no_data}}Not receiving data @pagerduty{{/is_no_data}}\n{{#is_alert}}@pagerduty{{/is_alert}}\n{{#is_warning}}@pagerduty{{/is_warning}}\n{{#is_recovery}}@pagerduty{{/is_recovery}}\n@slack-alerts\n" no
elb_queries Variables for defining Datadog ELB queries. map
{
"active_connection": "avg:aws.elb.active_connection_count",
"count_4xx": "avg:aws.elb.httpcode_elb_4xx",
"count_5xx": "avg:aws.elb.httpcode_elb_5xx",
"healthy_host": "avg:aws.elb.healthy_host_count",
"latency": "avg:aws.elb.latency",
"request_count": "avg:aws.elb.request_count",
"response_time": "avg:aws.elb.target_response_time.average",
"spill_over": "avg:aws.elb.spillover_count",
"surge_queue_length": "avg:aws.elb.surge_queue_length",
"unhealthy_host": "avg:aws.elb.unhealthy_host_count"
}
no
elb_trigger_by Tags to trigger a separate alert for each tag mentioned. string "{host,region,Environment}" no
from Datadog monitor option to evaluate data from the defined tag. map
{
"tag": "Environment",
"tag_value": "prod"
}
no
k8s variable to enable Kubernetes monitors creation bool false no
k8s_alert_message Alert message to add in k8s monitors. string "Cluster: {{kubernetes_cluster.name}} \nStatefulset: {{statefulset.name}} \nReplicaset: {{replicaset.name}}\nDeployment: {{deployment.name}} \nPod: {{pod_name.name}}\nValue: {{value}}\nName: {{host.name}}\nRegion: {{region.name}}\nEnvironment: {{Environment.name}}\n\n{{#is_no_data}}Not receiving data @pagerduty{{/is_no_data}}\n{{#is_alert}}@pagerduty{{/is_alert}}\n{{#is_warning}}@pagerduty{{/is_warning}}\n{{#is_recovery}}@pagerduty{{/is_recovery}}\n@slack-alerts\n" no
k8s_pod_trigger_by Tags to trigger a separate alert for each tag mentioned. string "{pod_name,replicaset,deployment,kube_namespace,kubernetes_cluster}" no
k8s_queries Variables for defining Datadog Kubernetes queries. These also work for EKS. map
{
"available_replicas_depl": "avg:kubernetes_state.deployment.replicas_available",
"desired_replicas_deploy": "avg:kubernetes_state.deployment.replicas_desired",
"desired_replicas_rep_set": "avg:kubernetes_state.replicaset.replicas_desired",
"desired_replicas_stateful": "sum:kubernetes_state.statefulset.replicas_desired",
"kubelet_api": "kubernetes.kubelet.check.ping",
"kuber_container_waiting": "max:kubernetes_state.container.waiting",
"kuber_desired_daemonset": "avg:kubernetes_state.daemonset.desired",
"kuber_failed_jobs": "avg:kubernetes_state.job.failed",
"kuber_node_status": "sum:kubernetes_state.node.status",
"kuber_pod_mem_limit": "avg:kubernetes.memory.limits",
"kuber_pod_mem_requests": "avg:kubernetes.memory.requests",
"kuber_pod_mem_usage": "avg:kubernetes.memory.usage",
"kuber_ready_daemonset": "avg:kubernetes_state.daemonset.ready",
"node_disk_pressure": "kubernetes_state.node.disk_pressure",
"node_mem_pressure": "kubernetes_state.node.memory_pressure",
"node_nw_unavailable": "kubernetes_state.node.network_unavailable",
"node_outof_disk": "kubernetes_state.node.out_of_disk",
"ready_replicas_rep_set": "avg:kubernetes_state.replicaset.replicas_ready",
"ready_replicas_stateful": "sum:kubernetes_state.statefulset.replicas_ready"
}
no
k8s_trigger_by Tags to trigger a separate alert for each tag mentioned. string "{statefulset,kubernetes_cluster,kube_namespace,replicaset,node}" no
lambda variable to enable lambda monitors creation bool false no
lambda_alert_message Alert message to add in AWS Lambda monitors. string "Value: {{value}}\nFunction: {{functionname.name}}\nRegion: {{region.name}}\nEnvironment: {{Environment.name}}\n\n{{#is_no_data}}Not receiving data @pagerduty{{/is_no_data}}\n{{#is_alert}}@pagerduty{{/is_alert}}\n{{#is_warning}}@pagerduty{{/is_warning}}\n{{#is_recovery}}@pagerduty{{/is_recovery}}\n@slack-alerts\n" no
lambda_queries Variables for defining Datadog AWS Lambda queries. map
{
"concurrent": "avg:aws.lambda.concurrent_executions",
"duration": "avg:aws.lambda.duration.maximum",
"errors": "avg:aws.lambda.errors",
"throttle": "avg:aws.lambda.throttles"
}
no
lambda_trigger_by Tags to trigger a separate alert for each tag mentioned. string "{functionname,Environment}" no
name Name/Title for datadog monitors map
{
"alb_4xx": "ALB: 4xx Error count is high on ALB: {{host.name}}, ENV: {{Environment.name}}",
"alb_5xx": "ALB: 5xx Error count is high on ALB: {{host.name}}, ENV: {{Environment.name}}",
"alb_active_connection": "ALB: Active connection count is high on ALB: {{host.name}}, ENV: {{Environment.name}}",
"alb_healthy_host": "ALB: Healthy host count is low on ALB: {{host.name}}, ENV: {{Environment.name}}",
"alb_rejected": "ALB: Spill over count is high on ALB: {{host.name}}, ENV: {{Environment.name}}",
"alb_request_count": "ALB: Request count is high on ALB: {{host.name}}, ENV: {{Environment.name}}",
"alb_response_time": "ALB: Response time is high on ALB: {{host.name}}, ENV: {{Environment.name}}",
"alb_unhealthy_host": "ALB: Unhealthy host count is high on ALB: {{host.name}}, ENV: {{Environment.name}}",
"clustercpu_reservation": "ECS: CPU reservation is high on Cluster: {{clustername.name}}, ENV: {{Environment.name}}",
"clustercpu_utilization": "ECS: CPU utilization is high on Cluster: {{clustername.name}}, ENV: {{Environment.name}}",
"clustermem_reservation": "ECS: Memory reservation is high on Cluster: {{clustername.name}}, ENV: {{Environment.name}}",
"clustermem_utilization": "ECS: Memory Utilization is high on Cluster: {{clustername.name}}, ENV: {{Environment.name}}",
"count": "ECS: Difference in Desired and Running tasks count on Cluster: {{clustername.name}}, Service {{servicename.name}}",
"ec_cpu": "Elasticache: CPU Utilization is high on Cluster: {{cacheclusterid.name}}",
"ec_current_conn": "Elasticache: High number of Current Connections on Cluster: {{cacheclusterid.name}}",
"ec_engine_cpu": "Elasticache: Engine CPU Utilization is high on Cluster: {{cacheclusterid.name}}",
"ec_evictions": "Elasticache: Evictions count is high on Cluster: {{cacheclusterid.name}}",
"ec_failover": "Elasticache: Elasticache failover triggered on Cluster: {{cacheclusterid.name}}",
"ec_host_mem": "Elasticache: Host memory usage is high on Cluster: {{cacheclusterid.name}}",
"ec_lag": "Elasticache: Lag is high on Cluster: {{cacheclusterid.name}}",
"ec_memc_mem": "Elasticache: Memcached memory usage is high on Cluster: {{cacheclusterid.name}}",
"ec_redis_mem": "Elasticache: Redis memory usage is high on Cluster: {{cacheclusterid.name}}",
"elb_4xx": "ELB: 4xx Error count is high on ELB: {{host.name}}, ENV: {{Environment.name}}",
"elb_5xx": "ELB: 5xx Error count is high on ELB: {{host.name}}, ENV: {{Environment.name}}",
"elb_active_connection": "ELB: Active connection count is high on ELB: {{host.name}}, ENV: {{Environment.name}}",
"elb_healthy_host": "ELB: Healthy host count is low on ELB: {{host.name}}, ENV: {{Environment.name}}",
"elb_latency": "ELB: Latency is high on ELB: {{host.name}}, ENV: {{Environment.name}}",
"elb_request_count": "ELB: Request count is high on ELB: {{host.name}}, ENV: {{Environment.name}}",
"elb_response_time": "ELB: Response time is high on ELB: {{host.name}}, ENV: {{Environment.name}}",
"elb_spill_over": "ELB: Spill over count is high on ELB: {{host.name}}, ENV: {{Environment.name}}",
"elb_surge_queue_length": "ELB: Surge queue length is high on ELB: {{host.name}}, ENV: {{Environment.name}}",
"elb_unhealthy_host": "ELB: Unhealthy host count is high on ELB: {{host.name}}, ENV: {{Environment.name}}",
"k8s_crashloopbackoff": "Kubernetes: Pod is in CrashLoopBackOff state on Cluster: {{kubernetes_cluster.name}}",
"k8s_daemonset": "Kubernetes: Difference in Desired and Running daemon pod on Cluster: {{kubernetes_cluster.name}}",
"k8s_deployment": "Kubernetes: Difference in Desired and Running replicas on Deployment: {{deployment.name}}",
"k8s_failed_job": "Kubernetes: Job failed on Cluster: {{kubernetes_cluster.name}}",
"k8s_imagepull_backoff": "Kubernetes: Pod is in ImagePullBackOff state on Cluster: {{kubernetes_cluster.name}}",
"k8s_pod_mem": "Kubernetes: Memory usage is high on Pod {{pod_name.name}} on Cluster: {{kubernetes_cluster.name}}",
"k8s_replicaset": "Kubernetes: Difference in Desired and Running replicas on Replicaset: {{replicaset.name}}",
"k8s_statefulset": "Kubernetes: Difference in Desired and Running replicas on Statefulset: {{statefulset.name}}",
"kubelet_api": "Kubernetes: Kubelet API is not available on Cluster: {{kubernetes_cluster.name}}",
"lambda_concurrent": "Lambda: High number of Concurrent executions on Function: {{functionname.name}}",
"lambda_duration": "Lambda: Duration is high on Function: {{functionname.name}}",
"lambda_errors": "Lambda: High number of errors on Function: {{functionname.name}}",
"lambda_throttle": "Lambda: Invocations are throttled on Function: {{functionname.name}}",
"node_disk_pressure": "Kubernetes: Node under disk pressure on Node {{node_name.name}} on Cluster: {{kubernetes_cluster.name}}",
"node_mem_pressure": "Kubernetes: Node under memory pressure on Node {{node_name.name}} on Cluster: {{kubernetes_cluster.name}}",
"node_nw_unavailable": "Kubernetes: Node network unavailable on Node {{node_name.name}} on Cluster: {{kubernetes_cluster.name}}",
"node_out_of_disk": "Kubernetes: Node under disk pressure on Node {{node_name.name}} on Cluster: {{kubernetes_cluster.name}}",
"rds_burst_balance": "RDS: CPU burst balance is low on RDS: {{dbinstanceidentifier.name}}, ENV: {{Environment.name}}",
"rds_cpu_credit_balance": "RDS: CPU credit balance is low on RDS: {{dbinstanceidentifier.name}}, ENV: {{Environment.name}}",
"rds_cpu_utilization": "RDS: CPU utilization is high on RDS: {{dbinstanceidentifier.name}}, ENV: {{Environment.name}}",
"rds_db_connections": "RDS: DB conections are high on RDS: {{dbinstanceidentifier.name}}, ENV: {{Environment.name}}",
"rds_disk_queue_depth": "RDS: Disk queue depth is high on RDS: {{dbinstanceidentifier.name}}, ENV: {{Environment.name}}",
"rds_disk_usage": "RDS: Disk usage is high on RDS: {{dbinstanceidentifier.name}}, ENV: {{Environment.name}}",
"rds_failover": "RDS: Failover triggered on RDS: {{dbinstanceidentifier.name}}, ENV: {{Environment.name}}",
"rds_mem_utilization": "RDS: Memory usage is high on RDS: {{dbinstanceidentifier.name}}, ENV: {{Environment.name}}",
"rds_read_iops": "RDS: Read IOPS is high on RDS: {{dbinstanceidentifier.name}}, ENV: {{Environment.name}}",
"rds_read_latency": "RDS: Read latency is high on RDS: {{dbinstanceidentifier.name}}, ENV: {{Environment.name}}",
"rds_read_throughput": "RDS: Read throughput is high on RDS: {{dbinstanceidentifier.name}}, ENV: {{Environment.name}}",
"rds_replica_lag": "RDS: Replica Lag is high on RDS: {{dbinstanceidentifier.name}}, ENV: {{Environment.name}}",
"rds_write_iops": "RDS: Write IOPS high on RDS: {{dbinstanceidentifier.name}}, ENV: {{Environment.name}}",
"rds_write_latency": "RDS: Write latency is high on RDS: {{dbinstanceidentifier.name}}, ENV: {{Environment.name}}",
"rds_write_throughput": "RDS: Write throughput is high on RDS: {{dbinstanceidentifier.name}}, ENV: {{Environment.name}}",
"servicecpu_utilization": "ECS: CPU Utilization is high on Service: {{servicename.name}}, ENV: {{Environment.name}}",
"servicemem_utilization": "ECS: Memory Utilization is high on Service: {{servicename.name}}, ENV: {{Environment.name}}",
"system_cpu": "System: CPU usage is high on Host: {{host.name}}, ENV: {{Environment.name}}",
"system_disk": "System: DISK usage is high on Host: {{host.name}}, Device: {{device.name}}, ENV: {{Environment.name}}}",
"system_load": "System: Load is high on Host: {{host.name}}, ENV: {{Environment.name}}",
"system_mem": "System: MEM usage is high on Host: {{host.name}}, ENV: {{Environment.name}}",
"unable_to_place_task": "ECS: Unable to place task on Service: {{servicename.name}}, ENV: {{Environment.name}}",
"unscheduled_nodes": "Kubernetes: High % of unscheduled nodes on Cluster: {{kubernetes_cluster.name}}"
}
no
rds Enable rds monitors creation bool false no
rds_alert_message Alert message to add in all rds monitors. string "Value: {{value}}\nRDS Name: {{dbinstanceidentifier.name}}\nRegion: {{region.name}}\nEnvironment: {{Environment.name}}\n\n{{#is_no_data}}Not receiving data @pagerduty{{/is_no_data}}\n{{#is_alert}}@pagerduty{{/is_alert}}\n{{#is_warning}}@pagerduty{{/is_warning}}\n{{#is_recovery}}@pagerduty{{/is_recovery}}\n@slack-alerts\n" no
rds_queries Variables for defining Datadog RDS queries. map
{
"burst_balance": "avg:aws.rds.burst_balance",
"cpu_credit_balance": "avg:aws.rds.cpucredit_balance",
"cpu_utilization": "avg:aws.rds.cpuutilization",
"db_connections": "avg:aws.rds.database_connections",
"disk_free": "avg:aws.rds.free_storage_space",
"disk_queue_depth": "avg:aws.rds.disk_queue_depth",
"disk_total": "avg:aws.rds.total_storage_space",
"mem_utilization": "avg:aws.rds.freeable_memory",
"read_iops": "avg:aws.rds.read_iops",
"read_latency": "avg:aws.rds.read_latency",
"read_throughput": "avg:aws.rds.read_throughput",
"replica_lag": "avg:aws.rds.replica_lag",
"write_iops": "avg:aws.rds.write_iops",
"write_latency": "avg:aws.rds.write_latency",
"write_throughput": "avg:aws.rds.write_throughput"
}
no
rds_trigger_by Tags to trigger a separate alert for each tag mentioned. string "{Environment,dbinstanceidentifier,region}" no
system variable to enable system monitors creation bool false no
system_alert_message Alert message to add in system monitors. string "Value: {{value}}\nName: {{host.name}}\nRegion: {{region.name}}\nEnvironment: {{Environment.name}}\n\n{{#is_no_data}}Not receiving data @pagerduty{{/is_no_data}}\n{{#is_alert}}@pagerduty{{/is_alert}}\n{{#is_warning}}@pagerduty{{/is_warning}}\n{{#is_recovery}}@pagerduty{{/is_recovery}}\n@slack-alerts\n" no
system_queries Variables for defining Datadog system queries. map
{
"cpu": "avg:system.cpu.idle",
"disk_free": "avg:system.disk.free",
"disk_total": "avg:system.disk.total",
"load": "avg:system.load.norm.5",
"mem_total": "avg:system.mem.total",
"mem_used": "avg:system.mem.used"
}
no
system_trigger_by Tags to trigger a separate alert for each tag mentioned. string "{host,Environment}" no
tags Tags to apply to all resources map {} no
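
As an illustration of the inputs above that the usage examples do not cover, the following sketch enables only the ElastiCache monitors and points the from input at a different tag value. It uses only inputs listed in this table; the value "staging" and the shortened ec_trigger_by are assumptions chosen for the example, and how the module turns from into a query scope is not documented here.

    module "datadog" {
        source  = "app.terraform.io/ncodelibrary/datadog/aws"
        version = "0.1.3"
        ec     = true
        tags   = {
            Owner = "example@nclouds.com"
        }

        # Evaluate data tagged Environment = staging instead of the documented default "prod".
        # "staging" is an assumed value for illustration.
        from = {
            tag       = "Environment"
            tag_value = "staging"
        }

        # Alert per cluster and environment rather than per cache node (assumed preference).
        ec_trigger_by = "{cacheclusterid,Environment}"
    }

Terraform replaces a map variable's default entirely when the caller sets it, so a partial map for from, name, or any of the *_queries inputs would likely drop the remaining default keys unless the module merges them internally, which this README does not state. Passing complete maps, as the advanced example does, is the safer assumption.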

Outputs

Name Description
output n/a

Contributing

If you want to contribute to this repository, check the guidelines specified here before submitting a new PR.