Businesses made the right choice using spot instances for Black Friday
Black Friday is the unofficial kickoff to the holiday shopping season, when retailers announce exciting deals to attract as many customers as they can into their stores. With online shopping surging in recent years, e-commerce and AdTech businesses are also participating in Black Friday to increase their online presence and revenues. Today, millions of customers have the choice of shopping online or in brick-and-mortar stores. While these stores have their challenges managing massive amounts of shoppers, e-commerce and AdTech businesses are dealing with their own set of challenges when it comes to handling shopper traffic.
During peak shopping days such as Black Friday, businesses use significantly more capacity to process online shopper requests while trying to maintain optimal performance. In theory, as more companies draw on resources from the same cloud provider, they may run into a capacity shortage when trying to scale. When a busy online shopping day approaches, this theory can become reality as online retailers scale to meet their shoppers’ demands and cloud providers struggle to supply the capacity. However, what many cloud users may not know is that about 50 percent of the major cloud vendors’ infrastructure sits mostly idle.
To be clear, this is not because AWS, Microsoft or Google are struggling to sell their capacity; it’s because they have to be ready for rapid increases in usage, just like the one businesses experience during Black Friday.
The fear of insufficient capacity has also made users apprehensive about running their workloads on cloud providers’ excess capacity, such as spot instances, because they assume that other customers are using more cloud resources and exhausting the cloud inventory. Businesses simply do not want to take a chance competing for excess capacity, so they opt for something more predictable, such as on-demand instances. Yet because cloud providers are capable of handling the most aggressive computing demands, businesses that chose to make intelligent use of excess capacity during Black Friday saw great returns on their investments.
Events such as Black Friday, national elections and the Super Bowl are pretty extreme stress tests for any consumer-facing business. We thought it would be interesting to go through the data, see how the AWS spot market behaved during Black Friday, and find out what we can learn from it. In this blog post, I will go over the spot market statistics and lessons learned from 2018’s Black Friday.
Spot Markets and Statistics
A spot market is the supply and demand for a specific instance type at a specific spot price in an availability zone. Spotinst Elastigroup ranks spot markets for availability and cost using current and historical data in a process called Spot Market Scoring. Spot Market Scoring enables us to predict an instance interruption in a market approximately 15 minutes before it happens (sometimes more than an hour before). This advance notice of upcoming interruptions is what triggers the spot instance replacement and ensures the continuous availability of compute workloads.
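To make the idea of ranking markets concrete, here is a minimal sketch of one way a score could combine interruption history with price. This is not Elastigroup's actual Spot Market Scoring algorithm, which is not described in this post. The `SpotMarket` fields, the `score` formula, and the `availability_weight` default are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpotMarket:
    """One market: an instance type in one availability zone (hypothetical model)."""
    instance_type: str
    availability_zone: str
    spot_price: float        # current hourly spot price in USD
    on_demand_price: float   # hourly on-demand price for the same type in USD
    interruptions: int       # interruptions seen in the lookback window
    instance_hours: float    # spot instance-hours run in the same window


def score(market: SpotMarket, availability_weight: float = 0.7) -> float:
    """Return a score in [0, 1]; higher means cheaper and less interruption-prone.

    The weighting is an assumption for this sketch, not a published formula.
    """
    # Interruptions per instance-hour, capped at 1 so the score stays bounded.
    rate = min(market.interruptions / max(market.instance_hours, 1.0), 1.0)
    availability = 1.0 - rate
    # Fraction saved relative to on-demand, clamped to [0, 1].
    savings = max(0.0, min(1.0, 1.0 - market.spot_price / market.on_demand_price))
    return availability_weight * availability + (1.0 - availability_weight) * savings


def rank(markets: list[SpotMarket]) -> list[SpotMarket]:
    """Sort markets from most to least attractive."""
    return sorted(markets, key=score, reverse=True)


if __name__ == "__main__":
    # Prices and instance-hours are made-up values for the example.
    candidates = [
        SpotMarket("c4.4xlarge", "us-east-1 AZ 1", 0.25, 0.80, 29, 200.0),
        SpotMarket("c4.4xlarge", "us-west-2 AZ 1", 0.30, 0.80, 1, 200.0),
    ]
    for m in rank(candidates):
        print(f"{m.availability_zone}: {score(m):.3f}")
```

With a weight that favors availability, the market with fewer interruptions ranks first even though its spot price is slightly higher. A real scoring system would also need a prediction component to anticipate interruptions ahead of time, which this sketch does not attempt.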
Now that you have a better understanding of the spot market, let’s delve into the statistics from Black Friday. In the image below, we can see the number of spot interruptions for each Friday in the three months prior to Black Friday in the us-west-1 and us-east-1 regions. The us-east-1 region had significantly more spot interruptions during this period.
The graph below shows the number of spot interruptions that occurred in the week before and the week after Black Friday in us-east-1 and us-west-1 across all instance types. We can see that us-west-1 was more reliable for spot instances than us-east-1. Also, us-east-1 continued to experience more interruptions the week after Black Friday.
We then looked at popular instance types and sizes used before, on and after Black Friday to see how many interruptions each experienced. In the image below, we can see that c4.4xlarge had the highest number of interruptions and gradually improved afterward.
The section below shows metrics for Black Friday, compared across availability zones in us-east-1 and us-west-2:
- Total Spot Interruptions this Black Friday in the US: 462
- Breaking down the interruptions by availability zone, some zones had more spot interruptions than others:
Availability Zone | us-east-1 | us-west-2 |
AZ 1 | 29 | 1 |
AZ 2 | 3 | 7 |
AZ 3 | 24 | 3 |
- Average Spot Uptime: 2675 Minutes (1.8 days)
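The per-region totals and the uptime conversion can be checked with a few lines of arithmetic. This is a small sketch: the counts are copied from the table above, and the helper names `region_totals` and `minutes_to_days` are ours, chosen for illustration.

```python
# Interruption counts copied from the table above, in the order AZ 1, AZ 2, AZ 3.
INTERRUPTIONS = {
    "us-east-1": [29, 3, 24],
    "us-west-2": [1, 7, 3],
}

AVERAGE_UPTIME_MINUTES = 2675


def region_totals(counts: dict[str, list[int]]) -> dict[str, int]:
    """Sum the per-AZ interruption counts for each region."""
    return {region: sum(per_az) for region, per_az in counts.items()}


def minutes_to_days(minutes: float) -> float:
    """Convert minutes to days (1 day = 24 * 60 = 1440 minutes)."""
    return minutes / (24 * 60)


if __name__ == "__main__":
    for region, total in region_totals(INTERRUPTIONS).items():
        print(f"{region}: {total} interruptions across the three listed AZs")
    print(f"Average uptime: {minutes_to_days(AVERAGE_UPTIME_MINUTES):.2f} days")
```

The three listed zones add up to 56 interruptions in us-east-1 and 11 in us-west-2, and 2675 minutes is a little under two days.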
What Spot Interruptions and Autoscaling Events Look Like
Let’s examine what a spot market interruption looks like and how Elastigroup handles the situation. In the data below, you can see an actual event log from one of our customers. Early on Black Friday, the day started with Elastigroup automatically replacing terminated or soon-to-be-terminated instances. A little later, as shoppers began scouring the internet for deals, the instances were automatically scaled up to meet demand. Elastigroup scaled the instances back down the following day because demand was not as high as on Black Friday. With this type of autoscaling automation in Elastigroup, administrators can spend more time on application development and other tasks.
Takeaways
- DevOps Automation is Key – Elastigroup uses predictive algorithms to identify and drain instances that are about to be terminated. Before an at-risk instance is terminated, Elastigroup launches a new instance and swaps it in seamlessly. Elastigroup also distributes your instances across different instance types to optimize costs and longevity.
- Spot Instance Availability – Having a stable uptime is crucial for peak days such as Black Friday. With millions of shoppers visiting e-commerce sites, an outage can result in revenue loss. On Black Friday, the small number of spot instances that experienced interruptions had an average uptime of 1.8 days. With that uptime, organizations can rely on spot instances managed by Elastigroup to reduce costs while not sacrificing service availability during peak times.
- Spot Market – Spot instances allow businesses to reduce costs without sacrificing availability. Elastigroup reliably leverages the spot market to optimize your underlying infrastructure for cost, without compromising availability. Per the statistics above, we can see that some availability zones had more spot terminations than others. AZ 1 in us-east-1 had the highest number of spot interruptions.
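The replace-then-drain behavior described in the first takeaway can be sketched as a small in-memory simulation. This is an illustration under stated assumptions, not Elastigroup's implementation: the `Group` and `Instance` classes, the `replace_at_risk` function, and the `m4.4xlarge` fallback type are hypothetical, and no cloud API is called.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Instance:
    instance_id: str
    instance_type: str
    state: str = "running"  # "running" or "draining"


@dataclass
class Group:
    """A toy stand-in for a managed group of spot instances."""
    instances: list = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def launch(self, instance_type: str) -> Instance:
        inst = Instance(f"i-{next(self._ids):04d}", instance_type)
        self.instances.append(inst)
        return inst

    def running(self) -> list:
        return [i for i in self.instances if i.state == "running"]


def replace_at_risk(group: Group, at_risk_ids: set, fallback_types: list) -> list:
    """Launch a replacement first, then drain each at-risk instance.

    Launching before draining keeps the running count from dipping.
    """
    events = []
    # Snapshot the running list so replacements launched here are not revisited.
    for inst in list(group.running()):
        if inst.instance_id not in at_risk_ids:
            continue
        # Prefer a different instance type to spread risk across markets;
        # fall back to the same type if no alternative is listed.
        new_type = next(
            (t for t in fallback_types if t != inst.instance_type),
            inst.instance_type,
        )
        replacement = group.launch(new_type)
        events.append(f"launched {replacement.instance_id} ({new_type})")
        inst.state = "draining"
        events.append(f"draining {inst.instance_id} ({inst.instance_type})")
    return events


if __name__ == "__main__":
    group = Group()
    first = group.launch("c4.4xlarge")
    group.launch("c4.4xlarge")
    # Pretend a prediction flagged the first instance as about to be interrupted.
    for event in replace_at_risk(group, {first.instance_id}, ["c4.4xlarge", "m4.4xlarge"]):
        print(event)
    print("running:", [(i.instance_id, i.instance_type) for i in group.running()])
```

The ordering is the point of the sketch: because the replacement is launched before the at-risk instance starts draining, the number of running instances never falls below what it was before the predicted interruption.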