Combining Elastic Load Balancers with EC2 Auto Scaling helps you manage and control your AWS workloads. Together, they help your infrastructure keep up with the demand placed on it while minimizing performance degradation. With this in mind, engineers and solution architects should have a deep understanding of how to implement these features.
In this article, we’ll cover the basics of Elastic Load Balancers and EC2 Auto Scaling. To dive deeper and learn how to implement and configure load balancing and auto scaling to build a scalable, flexible architecture, check out my newest course: Using Elastic Load Balancing and EC2 Auto Scaling.
Elastic Load Balancers
The main function of an Elastic Load Balancer, commonly referred to as an ELB, is to manage and control the flow of inbound requests to a group of targets by distributing those requests evenly across the group. These targets could be a fleet of EC2 instances, AWS Lambda functions, a range of IP addresses, or even containers. The targets defined within the ELB can be spread across different availability zones (AZs) for additional resilience, or all placed within a single AZ.
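To make the idea of even distribution concrete, here is a minimal sketch in plain Python. It is a toy model, not how AWS implements an ELB: it assumes a simple round-robin rotation, and the instance IDs and AZ names are made up for illustration.

```python
from collections import Counter
from itertools import cycle

# Hypothetical targets (instance ID, availability zone); these names are
# invented for illustration and do not refer to real resources.
TARGETS = [
    ("i-app-1", "eu-west-1a"),
    ("i-app-2", "eu-west-1b"),
    ("i-app-3", "eu-west-1c"),
]


def distribute(request_ids, targets):
    """Assign each request to the next target in turn (round robin)."""
    rotation = cycle(targets)
    return {request_id: next(rotation)[0] for request_id in request_ids}


# Nine inbound requests spread over three targets in three AZs.
assignments = distribute(range(9), TARGETS)
per_target = Counter(assignments.values())
print(per_target)
```

Each target ends up with the same share of the requests, and because the targets sit in different AZs, losing one zone would still leave two targets available to serve traffic.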
Let’s look at a typical scenario. Suppose you have just created a new application that resides on a single EC2 instance within your environment and is accessed by a number of users. At this stage, your architecture can be summarized logically as follows: users connect directly to a single EC2 instance.
If you are familiar with architectural design and best practices, you will realize that a single-instance approach isn’t ideal, although it would certainly work and provide a service to your users. This infrastructure layout brings some challenges. For example, the one instance that hosts your application can fail, perhaps because of a hardware or software fault. If that happens, your application will be down and unavailable