Auto Scaling and Load Balancing

Now that you've learned about disaster recovery strategies, let's explore how to automatically scale your applications and distribute traffic across multiple resources. Auto Scaling and Load Balancing are essential services for building resilient, cost-effective applications that can handle variable workloads while maintaining performance.

Learning Goals

Understand Auto Scaling concepts and components
Configure Elastic Load Balancing for traffic distribution
Implement scaling policies based on metrics
Combine Auto Scaling Groups with Load Balancers
Monitor and troubleshoot scaling operations

Auto Scaling Fundamentals

Auto Scaling helps you maintain application availability by automatically adding or removing EC2 instances based on conditions you define. It ensures you have the right number of instances to handle your application's load.

Auto Scaling Components

An Auto Scaling setup consists of three main components:

Launch Template (or legacy Launch Configuration): Defines what to launch (instance type, AMI, security groups)
Auto Scaling Group: Defines where to scale and within what limits (subnets, minimum/maximum instances)
Scaling Policies: Define when to scale (CPU utilization, network traffic, custom metrics)

launch-template.json
{
  "LaunchTemplateName": "web-server-template",
  "LaunchTemplateData": {
    "ImageId": "ami-0c02fb55956c7d316",
    "InstanceType": "t3.micro",
    "KeyName": "my-key-pair",
    "SecurityGroupIds": ["sg-1234567890abcdef0"],
    "UserData": "IyEvYmluL2Jhc2gKc3VkbyB5dW0gdXBkYXRlIC15CnN1ZG8geXVtIGluc3RhbGwgLXkgaHR0cGQKc3VkbyBzeXN0ZW1jdGwgc3RhcnQgaHR0cGQK"
  }
}
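The UserData value in a launch template must be base64-encoded, which makes it hard to review by eye. The following is a minimal sketch, using only the Python standard library, that decodes the value from the template above and re-encodes a script. The helper names `decode_user_data` and `encode_user_data` are illustrative and not part of any AWS tooling.

```python
import base64

# UserData value copied from launch-template.json above.
USER_DATA_B64 = (
    "IyEvYmluL2Jhc2gKc3VkbyB5dW0gdXBkYXRlIC15CnN1ZG8geXVtIGluc3RhbGwgLXkgaHR0cGQK"
    "c3VkbyBzeXN0ZW1jdGwgc3RhcnQgaHR0cGQK"
)


def decode_user_data(encoded: str) -> str:
    """Return the shell script hidden inside a base64 UserData string."""
    return base64.b64decode(encoded).decode("utf-8")


def encode_user_data(script: str) -> str:
    """Encode a shell script so it can be pasted into LaunchTemplateData."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    # Prints the bootstrap script: update packages, install httpd, start httpd.
    print(decode_user_data(USER_DATA_B64))
```

Decoding before you deploy is a cheap way to confirm that the instances will run the bootstrap commands you expect.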

Elastic Load Balancing

Elastic Load Balancing (ELB) automatically distributes incoming application traffic across multiple targets, such as EC2 instances, containers, and IP addresses.

Load Balancer Types

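ELB offers several load balancer types. This lesson covers three (the Gateway Load Balancer, used for third-party virtual appliances, is not covered here):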

Application Load Balancer (ALB): Layer 7, ideal for HTTP/HTTPS
Network Load Balancer (NLB): Layer 4 (TCP/UDP/TLS), for very high throughput and low latency
Classic Load Balancer (CLB): Previous generation (avoid for new applications)

cloudformation-load-balancer.yml
Resources:
  WebAppLoadBalancer:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Name: web-app-alb
      Scheme: internet-facing
      Subnets:
        - subnet-12345678
        - subnet-87654321
      SecurityGroups:
        - sg-1234567890abcdef0
      Type: application

  WebAppTargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      Name: web-app-targets
      Port: 80
      Protocol: HTTP
      VpcId: vpc-12345678
      HealthCheckPath: /health
      HealthCheckPort: "80"
      HealthCheckProtocol: HTTP
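
The target group above probes `/health` over HTTP, so each instance needs something that answers on that path. As a rough illustration only, here is a standard-library Python handler that returns 200 on `/health` and 404 elsewhere. The names `HealthHandler` and `make_server` and the port 8080 are assumptions for this sketch. A real application would usually expose the endpoint from its own web framework, on the port the target group uses.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answers the load balancer's health probe (hypothetical example app)."""

    def do_GET(self):
        if self.path == "/health":
            status, body = 200, b"ok"
        else:
            status, body = 404, b"not found"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Health probes arrive every few seconds; keep them out of the log.
        pass


def make_server(port: int) -> HTTPServer:
    """Build a server listening on all interfaces at the given port."""
    return HTTPServer(("0.0.0.0", port), HealthHandler)


if __name__ == "__main__":
    # Port 8080 is an assumption; binding to port 80 normally requires root.
    make_server(8080).serve_forever()
```

Keeping the health endpoint cheap and free of side effects helps avoid false failures when the instance is under load.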

Tip

Use Application Load Balancers for web applications and microservices. They provide advanced routing features, such as host-based and path-based routing, that make them a good fit for modern architectures.

Configuring Auto Scaling Groups

Auto Scaling Groups manage the lifecycle of your EC2 instances and ensure that the desired number of instances is running.

Creating an Auto Scaling Group

terraform-auto-scaling.tf
resource "aws_autoscaling_group" "web_app" {
  name                      = "web-app-asg"
  min_size                  = 2
  max_size                  = 10
  desired_capacity          = 2
  health_check_type         = "ELB"
  health_check_grace_period = 300
  vpc_zone_identifier       = [aws_subnet.private_a.id, aws_subnet.private_b.id]
  target_group_arns         = [aws_lb_target_group.web_app.arn]

  launch_template {
    id      = aws_launch_template.web_app.id
    version = "$Latest"
  }

  tag {
    key                 = "Name"
    value               = "web-app-instance"
    propagate_at_launch = true
  }
}

Scaling Policies

Scaling policies define when and how to scale your Auto Scaling Group.

Target Tracking Scaling

target-tracking-policy.json
{
  "AutoScalingGroupName": "web-app-asg",
  "PolicyName": "cpu-target-tracking",
  "PolicyType": "TargetTrackingScaling",
  "TargetTrackingConfiguration": {
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ASGAverageCPUUtilization"
    },
    "TargetValue": 70.0
  }
}
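
To build intuition for what a 70% CPU target means, the sketch below estimates the capacity needed to bring average utilization back to the target, clamped to the group's minimum and maximum. This is a simple proportional approximation for teaching purposes and not AWS's actual algorithm, which is not published in detail and is more conservative when scaling in. The function name and the default bounds of 2 and 10 (taken from the examples in this lesson) are assumptions.

```python
import math


def estimate_desired_capacity(
    current_capacity: int,
    current_metric: float,
    target_value: float = 70.0,
    min_size: int = 2,
    max_size: int = 10,
) -> int:
    """Proportional estimate of the capacity needed to hit the target metric.

    Illustrative approximation only; not the service's real algorithm.
    """
    if current_capacity <= 0 or target_value <= 0:
        raise ValueError("capacity and target must be positive")
    # If 4 instances run at 90% and the target is 70%, the same work
    # spread over ceil(4 * 90 / 70) = 6 instances lands below the target.
    raw = math.ceil(current_capacity * current_metric / target_value)
    # The group never goes outside its configured bounds.
    return max(min_size, min(max_size, raw))


if __name__ == "__main__":
    print(estimate_desired_capacity(4, 90.0))   # 6
    print(estimate_desired_capacity(4, 35.0))   # 2
    print(estimate_desired_capacity(8, 100.0))  # 10 (capped at max_size)
```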

Step Scaling Policies

step-scaling-policy.json
{
  "AutoScalingGroupName": "web-app-asg",
  "PolicyName": "step-scaling-cpu",
  "PolicyType": "StepScaling",
  "AdjustmentType": "ChangeInCapacity",
  "StepAdjustments": [
    {
      "MetricIntervalLowerBound": 0,
      "MetricIntervalUpperBound": 10,
      "ScalingAdjustment": 1
    },
    {
      "MetricIntervalLowerBound": 10,
      "ScalingAdjustment": 2
    }
  ]
}
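
The interval bounds in a step scaling policy are offsets from the threshold of the CloudWatch alarm that triggers the policy, not absolute metric values. For a high-threshold alarm, the lower bound is inclusive and the upper bound is exclusive. The sketch below applies the two steps from the policy above, assuming a hypothetical alarm threshold of 70% CPU (the alarm itself is not shown in this lesson). `Step` and `select_adjustment` are illustrative names.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    lower: Optional[float]  # MetricIntervalLowerBound (None = unbounded)
    upper: Optional[float]  # MetricIntervalUpperBound (None = unbounded)
    adjustment: int         # ScalingAdjustment


# The two steps from step-scaling-policy.json above.
STEPS = [Step(0, 10, 1), Step(10, None, 2)]


def select_adjustment(
    metric_value: float, alarm_threshold: float, steps: List[Step]
) -> int:
    """Return the capacity change for a metric value, or 0 if no step matches."""
    breach = metric_value - alarm_threshold
    for step in steps:
        above_lower = step.lower is None or breach >= step.lower
        below_upper = step.upper is None or breach < step.upper
        if above_lower and below_upper:
            return step.adjustment
    return 0


if __name__ == "__main__":
    threshold = 70.0  # assumed alarm threshold, not part of the policy JSON
    for cpu in (65.0, 75.0, 80.0, 95.0):
        print(cpu, select_adjustment(cpu, threshold, STEPS))
```

With the assumed threshold of 70, a reading of 75% adds one instance, while 80% or higher adds two.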

Health Checks and Lifecycle Hooks

Auto Scaling Groups always use EC2 status checks to determine instance health, and can additionally use Load Balancer health checks when the health check type is set to ELB.

health-check-config.sh
# Configure health checks for Auto Scaling Group
aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name web-app-asg \
    --health-check-type ELB \
    --health-check-grace-period 300

Warning

Always set an appropriate health check grace period. If it's too short, instances may be marked unhealthy and terminated before they finish initializing. If it's too long, the group is slow to replace instances that fail during startup.
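
The effect of the grace period can be sketched as a single decision: a failed health check only leads to replacement once the grace period has elapsed. The function below is a simplified model and not the service's implementation. `should_replace` and its parameter names are assumptions, and the 300-second default mirrors the value used in the examples in this lesson.

```python
from datetime import datetime, timedelta, timezone


def should_replace(
    in_service_at: datetime,
    now: datetime,
    healthy: bool,
    grace_period_seconds: int = 300,
) -> bool:
    """Return True if an unhealthy instance is past its grace period."""
    in_grace = now - in_service_at < timedelta(seconds=grace_period_seconds)
    # Failed checks are ignored while the instance is still warming up.
    return (not healthy) and (not in_grace)


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    # Unhealthy after 2 minutes: still inside the grace period, so it is kept.
    print(should_replace(start, start + timedelta(seconds=120), healthy=False))
    # Unhealthy after 6 minutes: the grace period is over, so it is replaced.
    print(should_replace(start, start + timedelta(seconds=360), healthy=False))
```

If your application takes four minutes to boot, a 300-second grace period leaves only a small margin, which is one way to reason about the value you choose.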

Integration Patterns

Complete Web Application Stack

complete-stack.yml
WebAppAutoScalingGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    AutoScalingGroupName: web-app-complete
    LaunchTemplate:
      LaunchTemplateId: !Ref WebAppLaunchTemplate
      Version: !GetAtt WebAppLaunchTemplate.LatestVersionNumber
    MinSize: 2
    MaxSize: 10
    DesiredCapacity: 2
    VPCZoneIdentifier: !Ref PrivateSubnets
    TargetGroupARNs:
      - !Ref WebAppTargetGroup
    HealthCheckType: ELB
    HealthCheckGracePeriod: 300

complete-stack.tf
resource "aws_autoscaling_policy" "web_app_cpu" {
  name                   = "web-app-cpu-policy"
  autoscaling_group_name = aws_autoscaling_group.web_app.name
  policy_type            = "TargetTrackingScaling"
  
  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = 70.0
  }
}

Common Pitfalls

Insufficient capacity: Setting minimum capacity too low for baseline load
Over-scaling: Aggressive scaling policies causing rapid instance churn
Missing health checks: Not configuring proper health checks leads to serving traffic from unhealthy instances
Incorrect subnet selection: Not distributing instances across multiple Availability Zones
Ignoring cooldown periods: Scaling actions triggering too frequently without allowing time for stabilization
Poor metric selection: Using inappropriate metrics for scaling decisions
Cost underestimation: Not accounting for the cost of additional instances during scaling events
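
To see why cooldown periods reduce instance churn, consider a burst of alarm triggers that arrive within a few minutes of each other. The sketch below is a simplified model, not the service's implementation: it accepts a scaling action only if the cooldown has elapsed since the last accepted one. The function names are illustrative, times are plain seconds, and 300 seconds is used because it is the default cooldown for an Auto Scaling Group.

```python
from typing import List, Optional

DEFAULT_COOLDOWN_SECONDS = 300.0


def cooldown_allows_scaling(
    last_scaling_at: Optional[float],
    now: float,
    cooldown_seconds: float = DEFAULT_COOLDOWN_SECONDS,
) -> bool:
    """Return True if enough time has passed since the last scaling action."""
    if last_scaling_at is None:
        return True
    return now - last_scaling_at >= cooldown_seconds


def accepted_scaling_times(
    alarm_times: List[float],
    cooldown_seconds: float = DEFAULT_COOLDOWN_SECONDS,
) -> List[float]:
    """Filter alarm timestamps down to those that would trigger scaling."""
    accepted: List[float] = []
    last: Optional[float] = None
    for t in sorted(alarm_times):
        if cooldown_allows_scaling(last, t, cooldown_seconds):
            accepted.append(t)
            last = t
    return accepted


if __name__ == "__main__":
    # Alarms at 0 s, 60 s, 120 s and 400 s: only the first and last act.
    print(accepted_scaling_times([0, 60, 120, 400]))
```

With no cooldown, all four alarms in the example would trigger a scaling action, which is the rapid churn described in the list above.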

Summary

Auto Scaling and Load Balancing work together to create resilient, scalable applications. Auto Scaling ensures you have the right number of instances running, while Load Balancers distribute traffic evenly across those instances. Remember to configure appropriate scaling policies, health checks, and monitoring to maintain optimal performance and cost efficiency.

