How to Fetch EC2 CPU and IOPS Data using Python (Boto3) and CloudWatch
Fetching CPU Utilization is straightforward, but fetching IOPS (Input/Output Operations Per Second) is a common stumbling block. This is because AWS splits storage metrics between the instance itself (for instance store) and the EBS service (for attached volumes). This guide will walk you through the correct way to retrieve both using Python.
Contents
- Prerequisites
- Part 1: The Easy Part (CPU Utilization)
- Part 2: The Hard Part (IOPS & The “Missing” Data)
- The Complete Script
- Critical Implementation Details
Prerequisites
- Python 3.x installed.
- Boto3 library (pip install boto3).
- AWS Credentials configured (via ~/.aws/credentials or environment variables).
- IAM Permissions: Your user/role needs cloudwatch:GetMetricStatistics and ec2:DescribeInstances.
Part 1: The Easy Part (CPU Utilization)
CPU metrics are standard for all instances and live in the AWS/EC2 namespace. We can use the get_metric_statistics API to fetch them.
Key Concept: CloudWatch returns data points. To get a single “current” number, we usually request the last few minutes of data and take the average.
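To make that concrete, here is a minimal standalone sketch of the CPU call (the complete script later in this guide wraps the same logic in a function; the region and instance ID below are placeholders you would replace with your own):
Python
import boto3
import datetime

cw = boto3.client('cloudwatch', region_name='us-east-1')  # assumed region

# Look back 10 minutes and request 5-minute averages.
end_time = datetime.datetime.utcnow()
start_time = end_time - datetime.timedelta(minutes=10)

response = cw.get_metric_statistics(
    Namespace='AWS/EC2',
    MetricName='CPUUtilization',
    Dimensions=[{'Name': 'InstanceId', 'Value': 'i-0123456789abcdef0'}],  # placeholder ID
    StartTime=start_time,
    EndTime=end_time,
    Period=300,
    Statistics=['Average']
)

# Datapoints come back unordered; sort by timestamp and take the newest one.
datapoints = sorted(response['Datapoints'], key=lambda p: p['Timestamp'])
if datapoints:
    print(f"CPU Utilization: {datapoints[-1]['Average']:.2f}%")
else:
    print("No data yet (new instance, stopped instance, or metric delay)")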
Part 2: The Hard Part (IOPS & The “Missing” Data)
If you look for IOPS metrics in the AWS/EC2 namespace, you will see DiskReadOps and DiskWriteOps.
- The Trap: These metrics only track “Instance Store” (ephemeral) volumes.
- The Reality: Most modern EC2 instances use EBS (Elastic Block Store) volumes. EBS metrics live in the AWS/EBS namespace and are reported per volume, not per instance.
To get the “Total IOPS” for an instance, your script must:
- Identify all EBS volumes attached to the instance (one way to do this is sketched below).
- Fetch VolumeReadOps and VolumeWriteOps for each volume from the AWS/EBS namespace.
- Sum them up.
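As a sketch of that first step, one way to list the attached volume IDs is to filter describe_volumes by attachment.instance-id (the complete script below takes a different route and parses BlockDeviceMappings from describe_instances; note that this variant also needs the ec2:DescribeVolumes permission, which is not in the Prerequisites list above). The instance ID and region are placeholders:
Python
import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')  # assumed region

# Find every EBS volume currently attached to the instance.
response = ec2.describe_volumes(
    Filters=[
        {'Name': 'attachment.instance-id', 'Values': ['i-0123456789abcdef0']}  # placeholder ID
    ]
)
volume_ids = [vol['VolumeId'] for vol in response['Volumes']]
print(volume_ids)  # e.g. ['vol-0aaa...', 'vol-0bbb...']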
The Complete Script
This script solves the aggregation problem. It fetches CPU utilization and then calculates the total Read/Write IOPS across all attached EBS volumes.
Python
import boto3
import datetime
from botocore.exceptions import ClientError
def get_ec2_metrics(instance_id, region='us-east-1'):
cw = boto3.client('cloudwatch', region_name=region)
ec2 = boto3.client('ec2', region_name=region)
# Time window: Last 10 minutes
end_time = datetime.datetime.utcnow()
start_time = end_time - datetime.timedelta(minutes=10)
period = 300 # 5 minute intervals
print(f"--- Metrics for {instance_id} ({region}) ---")
# 1. Fetch CPU Utilization
try:
cpu_response = cw.get_metric_statistics(
Namespace='AWS/EC2',
MetricName='CPUUtilization',
Dimensions=[{'Name': 'InstanceId', 'Value': instance_id}],
StartTime=start_time,
EndTime=end_time,
Period=period,
Statistics=['Average']
)
if cpu_response['Datapoints']:
# Get the most recent data point
latest_cpu = sorted(cpu_response['Datapoints'], key=lambda x: x['Timestamp'])[-1]
print(f"CPU Utilization: {latest_cpu['Average']:.2f}%")
else:
print("CPU Utilization: No data available")
except ClientError as e:
print(f"Error fetching CPU: {e}")
# 2. Fetch EBS IOPS (Aggregated across all attached volumes)
try:
# Find attached volumes
instance_info = ec2.describe_instances(InstanceIds=[instance_id])
volumes = instance_info['Reservations'][0]['Instances'][0].get('BlockDeviceMappings', [])
total_read_ops = 0
total_write_ops = 0
has_volumes = False
for vol in volumes:
vol_id = vol['Ebs']['VolumeId']
has_volumes = True
for metric_name in ['VolumeReadOps', 'VolumeWriteOps']:
response = cw.get_metric_statistics(
Namespace='AWS/EBS',
MetricName=metric_name,
Dimensions=[{'Name': 'VolumeId', 'Value': vol_id}],
StartTime=start_time,
EndTime=end_time,
Period=period,
Statistics=['Sum']
)
if response['Datapoints']:
# Sort to get latest, sum to aggregate
latest_point = sorted(response['Datapoints'], key=lambda x: x['Timestamp'])[-1]
value = latest_point['Sum']
if metric_name == 'VolumeReadOps':
total_read_ops += value
else:
total_write_ops += value
if has_volumes:
# CloudWatch returns "Ops" (Total count in the period).
# To get IOPS (Ops Per Second), divide by Period.
read_iops = total_read_ops / period
write_iops = total_write_ops / period
print(f"Total Read IOPS: {read_iops:.2f}")
print(f"Total Write IOPS: {write_iops:.2f}")
else:
print("No EBS volumes attached.")
except ClientError as e:
print(f"Error fetching IOPS: {e}")
# Usage
# Replace with your actual Instance ID
get_ec2_metrics('i-0123456789abcdef0')
Critical Implementation Details
- Ops vs. IOPS: CloudWatch returns the count of operations in the period (VolumeReadOps). To get the rate (IOPS), you must divide this count by the period in seconds (e.g., Count / 300).
- Latency: CloudWatch metrics are not instant. There is typically a 5-15 minute delay for standard metrics. If you need real-time data, you must enable Detailed Monitoring, which incurs extra costs (see the sketch after this list).
- Namespace Confusion: Always verify whether you are monitoring an Instance Store volume (AWS/EC2 > DiskReadOps) or an EBS volume (AWS/EBS > VolumeReadOps). 90% of the time, you want the latter.
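On the latency point: if you decide you need 1-minute granularity, Detailed Monitoring can also be enabled from Boto3. A minimal sketch (the instance ID and region are placeholders, and your role would additionally need the ec2:MonitorInstances permission):
Python
import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')  # assumed region

# Switch the instance to Detailed (1-minute) Monitoring. This incurs extra cost.
ec2.monitor_instances(InstanceIds=['i-0123456789abcdef0'])  # placeholder ID

# Once enabled, get_metric_statistics calls can use Period=60 instead of 300.
# To revert to basic 5-minute monitoring:
# ec2.unmonitor_instances(InstanceIds=['i-0123456789abcdef0'])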
