
Auto Scaling and Load Balancing

Now that you've learned about disaster recovery strategies, let's explore how to automatically scale your applications and distribute traffic across multiple resources. Auto Scaling and Load Balancing are essential services for building resilient, cost-effective applications that can handle variable workloads while maintaining performance.

Learning Goals

  • Understand Auto Scaling concepts and components
  • Configure Elastic Load Balancing for traffic distribution
  • Implement scaling policies based on metrics
  • Combine Auto Scaling Groups with Load Balancers
  • Monitor and troubleshoot scaling operations

Auto Scaling Fundamentals

Auto Scaling helps you maintain application availability by automatically adding or removing EC2 instances based on conditions you define. It ensures you have the right number of instances to handle your application's load.

Auto Scaling Components

An Auto Scaling setup consists of three main components:

  • Launch template (or legacy launch configuration): defines what to launch (instance type, AMI, security groups)
  • Auto Scaling Group: defines where to scale (subnets, minimum/maximum instances)
  • Scaling policies: define when to scale (CPU utilization, network traffic, custom metrics)

launch-template.json
{
  "LaunchTemplateName": "web-server-template",
  "LaunchTemplateData": {
    "ImageId": "ami-0c02fb55956c7d316",
    "InstanceType": "t3.micro",
    "KeyName": "my-key-pair",
    "SecurityGroupIds": ["sg-1234567890abcdef0"],
    "UserData": "IyEvYmluL2Jhc2gKc3VkbyB5dW0gdXBkYXRlIC15CnN1ZG8geXVtIGluc3RhbGwgLXkgaHR0cGQKc3VkbyBzeXN0ZW1jdGwgc3RhcnQgaHR0cGQK"
  }
}
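The UserData field in a launch template is base64-encoded, which hides what the instance actually runs at boot. As a minimal sketch, the snippet below decodes the value from the template above and shows how a script could be re-encoded; the helper names decode_user_data and encode_user_data are illustrative and not part of any AWS SDK.

```python
import base64

# UserData value copied verbatim from launch-template.json.
USER_DATA_B64 = "IyEvYmluL2Jhc2gKc3VkbyB5dW0gdXBkYXRlIC15CnN1ZG8geXVtIGluc3RhbGwgLXkgaHR0cGQKc3VkbyBzeXN0ZW1jdGwgc3RhcnQgaHR0cGQK"


def decode_user_data(encoded: str) -> str:
    """Return the plain-text boot script stored in a base64 UserData value."""
    return base64.b64decode(encoded).decode("utf-8")


def encode_user_data(script: str) -> str:
    """Encode a plain-text boot script the way the UserData field expects it."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    # Prints the shell script that each new instance runs on first boot.
    print(decode_user_data(USER_DATA_B64))
```

Decoding shows a short bash script that updates packages with yum, installs httpd, and starts it, so every instance launched from this template comes up as a basic web server.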

Elastic Load Balancing

Elastic Load Balancing (ELB) automatically distributes incoming application traffic across multiple targets, such as EC2 instances, containers, and IP addresses.

Load Balancer Types

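Elastic Load Balancing supports four load balancer types: Application, Network, Gateway, and Classic. The three you are most likely to meet in a web application stack are:

  • Application Load Balancer (ALB): Layer 7, ideal for HTTP/HTTPS traffic
  • Network Load Balancer (NLB): Layer 4 (TCP/UDP/TLS), for very high throughput and low latency
  • Classic Load Balancer (CLB): previous generation (avoid for new applications)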

cloudformation-load-balancer.yml
Resources:
  WebAppLoadBalancer:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Name: web-app-alb
      Scheme: internet-facing
      Subnets:
        - subnet-12345678
        - subnet-87654321
      SecurityGroups:
        - sg-1234567890abcdef0
      Type: application

  WebAppTargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      Name: web-app-targets
      Port: 80
      Protocol: HTTP
      VpcId: vpc-12345678
      HealthCheckPath: /health
      HealthCheckPort: "80"
      HealthCheckProtocol: HTTP

Tip

Use Application Load Balancers for web applications and microservices. Their advanced routing features, such as host-based and path-based routing, make them well suited to modern architectures.
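To make host-based and path-based routing concrete, here is a small simulation of how listener rules could be evaluated: rules are checked in priority order (lowest number first) and the first match wins, with a default target group as the fallback. This is only a conceptual sketch, not the ALB implementation; the Rule class, the example hostnames, and the api-targets and image-targets groups are hypothetical, and fnmatch wildcards only approximate ALB pattern syntax.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    priority: int        # lower numbers are evaluated first
    host_pattern: str    # e.g. "api.example.com", or "*" for any host
    path_pattern: str    # e.g. "/images/*", or "*" for any path
    target_group: str    # hypothetical target group name


# Hypothetical rules; only "web-app-targets" appears in the tutorial.
RULES = [
    Rule(10, "api.example.com", "*", "api-targets"),
    Rule(20, "*", "/images/*", "image-targets"),
]
DEFAULT_TARGET_GROUP = "web-app-targets"


def route(host: str, path: str, rules=RULES, default=DEFAULT_TARGET_GROUP) -> str:
    """Return the target group of the first matching rule, else the default."""
    for rule in sorted(rules, key=lambda r: r.priority):
        host_ok = fnmatchcase(host.lower(), rule.host_pattern)
        path_ok = fnmatchcase(path, rule.path_pattern)
        if host_ok and path_ok:
            return rule.target_group
    return default


if __name__ == "__main__":
    print(route("api.example.com", "/v1/users"))
    print(route("www.example.com", "/images/logo.png"))
    print(route("www.example.com", "/"))
```

In this model a request for api.example.com goes to the API group regardless of path, because the host rule has the lower priority number and is checked first.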

Configuring Auto Scaling Groups

Auto Scaling Groups manage the lifecycle of your EC2 instances and ensure that the desired number of instances is running.

Creating an Auto Scaling Group

terraform-auto-scaling.tf
resource "aws_autoscaling_group" "web_app" {
  name                      = "web-app-asg"
  min_size                  = 2
  max_size                  = 10
  desired_capacity          = 2
  health_check_type         = "ELB"
  health_check_grace_period = 300
  vpc_zone_identifier       = [aws_subnet.private_a.id, aws_subnet.private_b.id]
  target_group_arns         = [aws_lb_target_group.web_app.arn]

  launch_template {
    id      = aws_launch_template.web_app.id
    version = "$Latest"
  }

  tag {
    key                 = "Name"
    value               = "web-app-instance"
    propagate_at_launch = true
  }
}
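The group above lists two subnets in vpc_zone_identifier, and Auto Scaling tries to keep instances balanced across the zones those subnets belong to. The sketch below models that as a simple even split; it assumes the two subnets sit in different Availability Zones and does not reproduce the real rebalancing logic. The function name spread_across_zones is hypothetical.

```python
def spread_across_zones(desired_capacity: int, zones: list[str]) -> dict[str, int]:
    """Split desired_capacity as evenly as possible across the given zones.

    Any remainder goes to the zones listed first, one instance each.
    """
    if not zones:
        raise ValueError("at least one zone is required")
    if desired_capacity < 0:
        raise ValueError("desired_capacity must not be negative")
    base, extra = divmod(desired_capacity, len(zones))
    return {
        zone: base + (1 if index < extra else 0)
        for index, zone in enumerate(zones)
    }


if __name__ == "__main__":
    # min_size = 2 in the group above gives one instance per zone.
    print(spread_across_zones(2, ["private_a", "private_b"]))
    print(spread_across_zones(5, ["private_a", "private_b"]))
```

With a minimum of 2 and two zones, the group can lose an entire zone and still have one instance serving traffic, which is one reason to avoid a single-subnet configuration.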

Scaling Policies

Scaling policies define when and how to scale your Auto Scaling Group.

Target Tracking Scaling

target-tracking-policy.json
{
  "AutoScalingGroupName": "web-app-asg",
  "PolicyName": "cpu-target-tracking",
  "PolicyType": "TargetTrackingScaling",
  "TargetTrackingConfiguration": {
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ASGAverageCPUUtilization"
    },
    "TargetValue": 70.0
  }
}
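Target tracking works like a thermostat: the group adds or removes capacity to hold the metric near TargetValue. AWS does not publish the exact algorithm, so the following is only a simplified proportional model for building intuition. It assumes load spreads evenly across instances, and it reuses the min_size and max_size values (2 and 10) from the Auto Scaling Group shown earlier; the function name is hypothetical.

```python
import math


def estimate_desired_capacity(
    current_capacity: int,
    metric_value: float,
    target_value: float = 70.0,
    min_size: int = 2,
    max_size: int = 10,
) -> int:
    """Estimate the capacity that would bring the average metric to the target.

    Simplified model: total load = current_capacity * metric_value, and the
    group needs enough instances for that load at target_value each.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    raw = math.ceil(current_capacity * metric_value / target_value)
    # The group never goes below min_size or above max_size.
    return max(min_size, min(max_size, raw))


if __name__ == "__main__":
    print(estimate_desired_capacity(4, 90.0))   # above target: scale out
    print(estimate_desired_capacity(4, 35.0))   # below target: scale in
```

Four instances at 90% average CPU carry the load of roughly 5.1 instances at 70%, so the model rounds up to 6. The real service also applies warm-up and scale-in safeguards that this sketch ignores.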

Step Scaling Policies

step-scaling-policy.json
{
  "AutoScalingGroupName": "web-app-asg",
  "PolicyName": "step-scaling-cpu",
  "PolicyType": "StepScaling",
  "AdjustmentType": "ChangeInCapacity",
  "StepAdjustments": [
    {
      "MetricIntervalLowerBound": 0,
      "MetricIntervalUpperBound": 10,
      "ScalingAdjustment": 1
    },
    {
      "MetricIntervalLowerBound": 10,
      "ScalingAdjustment": 2
    }
  ]
}
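The bounds in StepAdjustments are offsets from the threshold of the CloudWatch alarm that triggers the policy, not absolute metric values. The policy above does not show that alarm, so the sketch below assumes a threshold of 70% CPU purely for illustration. With that assumption, a reading in the range 70 to just under 80 adds one instance, and 80 or higher adds two.

```python
ALARM_THRESHOLD = 70.0  # assumed; the alarm is not shown in the policy

# Copied from the StepAdjustments array in step-scaling-policy.json.
STEP_ADJUSTMENTS = [
    {"MetricIntervalLowerBound": 0, "MetricIntervalUpperBound": 10, "ScalingAdjustment": 1},
    {"MetricIntervalLowerBound": 10, "ScalingAdjustment": 2},
]


def scaling_adjustment(
    metric_value: float,
    threshold: float = ALARM_THRESHOLD,
    steps=STEP_ADJUSTMENTS,
) -> int:
    """Return the capacity change for a metric reading, or 0 if no step applies.

    Each step covers [lower, upper) relative to the threshold; a missing
    bound is treated as unbounded in that direction.
    """
    breach = metric_value - threshold
    for step in steps:
        lower = step.get("MetricIntervalLowerBound", float("-inf"))
        upper = step.get("MetricIntervalUpperBound", float("inf"))
        if lower <= breach < upper:
            return step["ScalingAdjustment"]
    return 0


if __name__ == "__main__":
    for cpu in (65, 75, 80, 95):
        print(cpu, scaling_adjustment(cpu))
```

Because the adjustment type is ChangeInCapacity, the returned number is added directly to the current instance count.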

Health Checks and Lifecycle Hooks

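Auto Scaling Groups use EC2 status checks by default to determine instance health. Setting the health check type to ELB makes the group also treat an instance as unhealthy when it fails the load balancer's health checks.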

health-check-config.sh
# Configure health checks for Auto Scaling Group
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name web-app-asg \
  --health-check-type ELB \
  --health-check-grace-period 300

Warning

Always set an appropriate health check grace period. If it is too short, instances may be terminated before they finish initializing. If it is too long, Auto Scaling takes longer to notice and replace instances that fail during startup.
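The effect of the grace period can be modeled as a simple rule: a failing health check is ignored until the instance has been in service for at least the grace period. The sketch below encodes that rule with the 300-second value used in this section; the function name and the 240-second boot time are hypothetical, and real behavior includes details this model leaves out.

```python
def should_replace(
    seconds_in_service: float,
    health_check_passing: bool,
    grace_period: float = 300,
) -> bool:
    """Decide whether the group would replace an instance in this model.

    Failing checks are ignored while the instance is still inside the
    grace period; afterwards a failing check marks it for replacement.
    """
    if health_check_passing:
        return False
    return seconds_in_service >= grace_period


if __name__ == "__main__":
    boot_time = 240  # hypothetical time the application needs to start

    # A 60-second grace period replaces the instance while it is still booting.
    print(should_replace(boot_time - 1, False, grace_period=60))
    # The 300-second grace period lets the same instance finish starting.
    print(should_replace(boot_time - 1, False, grace_period=300))
```

A practical starting point is to measure how long an instance takes to pass its first health check and set the grace period slightly above that.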

Integration Patterns

Complete Web Application Stack

complete-stack.yml
WebAppAutoScalingGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    AutoScalingGroupName: web-app-complete
    LaunchTemplate:
      LaunchTemplateId: !Ref WebAppLaunchTemplate
      Version: !GetAtt WebAppLaunchTemplate.LatestVersionNumber
    MinSize: 2
    MaxSize: 10
    DesiredCapacity: 2
    VPCZoneIdentifier: !Ref PrivateSubnets
    TargetGroupARNs:
      - !Ref WebAppTargetGroup
    HealthCheckType: ELB
    HealthCheckGracePeriod: 300

Common Pitfalls

  • Insufficient capacity: Setting minimum capacity too low for baseline load
  • Over-scaling: Aggressive scaling policies causing rapid instance churn
  • Missing health checks: Not configuring proper health checks leads to serving traffic from unhealthy instances
  • Incorrect subnet selection: Not distributing instances across multiple Availability Zones
  • Ignoring cooldown periods: Scaling actions triggering too frequently without allowing time for stabilization
  • Poor metric selection: Using inappropriate metrics for scaling decisions
  • Cost underestimation: Not accounting for the cost of additional instances during scaling events
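The over-scaling and cooldown pitfalls are easier to see with a toy timeline. The sketch below is a simplified model in which any scaling request arriving within the cooldown window of the last accepted action is dropped; the 300-second value and the function name are illustrative assumptions, and the actual service handles warm-up and cooldown differently depending on the policy type.

```python
def apply_with_cooldown(requests, cooldown_seconds: float = 300):
    """Filter scaling requests, dropping any that arrive during a cooldown.

    requests is an iterable of (timestamp_seconds, adjustment) pairs.
    Returns the accepted pairs in chronological order.
    """
    accepted = []
    last_action_time = None
    for timestamp, adjustment in sorted(requests):
        in_cooldown = (
            last_action_time is not None
            and timestamp - last_action_time < cooldown_seconds
        )
        if in_cooldown:
            continue  # still stabilizing after the previous action
        accepted.append((timestamp, adjustment))
        last_action_time = timestamp
    return accepted


if __name__ == "__main__":
    # A flapping metric asks to scale out, out again, then in twice.
    timeline = [(0, 1), (60, 1), (200, -1), (400, -1)]
    print(apply_with_cooldown(timeline))
    print(apply_with_cooldown(timeline, cooldown_seconds=0))
```

With no cooldown all four requests run, adding two instances and removing them again within a few minutes, which is the churn the list above warns about.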

Summary

Auto Scaling and Load Balancing work together to create resilient, scalable applications. Auto Scaling ensures you have the right number of instances running, while Load Balancers distribute traffic evenly across those instances. Remember to configure appropriate scaling policies, health checks, and monitoring to maintain optimal performance and cost efficiency.
