Optimizing AWS cost optimization
Erik Carlin, Co-Founder and Chief Product Officer
- Cost optimization is a journey. There is no silver bullet, but there is a 'no regrets' choice that can accelerate your progress.
- Convertible RIs offer the best balance of execution speed and savings for the majority of compute workloads. They require no rearchitecture, can be adjusted as instance needs change, and offer significant savings of 20-66%.
- Achieving the best savings outcome with Convertible RIs requires fully automated management vs. periodic and manual administration.
While building ProsperOps and talking to engineering and finance leaders about Reserved Instance management, we've found that many conversations track back to the broader topic of AWS cost optimization and the tactics being employed or considered. At ProsperOps, we have conviction around Convertible Reserved Instances (RIs), not as the only tactic to use, but as the foundation of your cost optimization strategy. If managed with the appropriate automation, we believe Convertible Reserved Instances offer the biggest bang for the buck in the majority of EC2 use cases. You should be skeptical, since we're biased, so in this post we make the case for why Convertible RIs should be at the top of your list.
No Silver Bullets
Convertible RIs, and RIs more broadly, are part of a larger cloud cost optimization problem space. Much has been written about it, and AWS has developed a "five pillars of cost optimization" framework and presented it publicly. This framework is a good summary, and we're believers in these tactics, but the question remains: with so much information available, why is cost optimization still a major challenge for AWS users of all sizes and experience levels?
Much like the journey to cloud, the path to cost optimization is a continual process. There is no silver bullet (although certain tools are helpful) because many cost optimization decisions require human judgement and are fundamentally tied to cloud architectural choices and application requirements. Given all of this, how do you eat the cost optimization elephant?
In our experience, compute averages about 60% of AWS spend (your mileage may vary), so begin there for the greatest impact on your bill. EBS, S3, RDS, DynamoDB, data transfer, etc. are other key areas that can and should be optimized, but for this post, we're zeroing in on compute. We approach the problem pragmatically based on the following two cost optimization factors:
- Which techniques have the biggest savings impact to the overall AWS bill
- Which techniques are the easiest to execute and thus yield savings the fastest
Based on these dimensions, we can map compute cost optimization techniques as follows:
Figure 1. AWS Compute Cost Optimization Matrix
Note: This is based on our experience across hundreds of AWS customers and is meant to be directional in nature. There are certainly exceptions.
Convertible RIs - The Best Balance of Speed and Savings
Techniques like cleaning up unused resources can yield savings quickly, and other approaches, like Spot Instances, can yield large savings if the workload is appropriate. The best overall balance of speed-to-execution and savings impact, however, is with Convertible RIs.
Here are 5 reasons why:
EC2 is everywhere [savings impact]. EC2 is the fundamental building block for most cloud apps whether you have just migrated to AWS or are down the road on your journey, and Convertible RIs generate material savings on EC2. There are of course other tactics to optimize compute that are workload dependent: Spot is great for stateless workloads that can tolerate interruption, Auto Scaling matters when demand is elastic, Standard RIs are good when specific instance forecasting is deterministic and fixed, and Lambda is awesome for periodically executed code. We unequivocally recommend these approaches where they make sense, but they don't apply to many common workloads.
Flexibility enables saving NOW [execution speed]. Many people still follow an antiquated model of right sizing first and then purchasing RIs, but this strategy delays the savings that could be generated from RIs. Since Convertible RIs are flexible, you can flip that model and cover an appropriate amount of EC2 spend first and then pursue other cost optimization techniques in parallel. As your EC2 usage changes with your optimization efforts, you can adapt your Convertible RI portfolio to match.
Savings magnitude [savings impact]. Depending on your commitment tolerance, Convertible RIs can deliver savings of up to 66% off EC2 on-demand rates, which makes them one of the most significant cost savings levers available. While Convertible RI discount rates are on average 9% lower than the equivalent Standard RI, the flexibility to exchange the RI when your EC2 usage shifts gives you the confidence to cover a larger percentage of your expected EC2 footprint. Similarly, 3-year Convertible RI commitments become palatable (we would generally recommend against 3-year Standard RI commitments) because the dollar commitment can be exchanged over time as EC2 needs change. Both of these drive up your Effective Savings Rate.
No application risk [execution speed]. An RI is a discount given in exchange for a term commitment. It's an abstract billing construct, not an engineering resource. Unlike tactics such as right sizing or Spot, which change the infrastructure, a Convertible RI strategy poses zero risk to your applications.
Macro vs. micro capacity planning [execution speed]. Convertible RIs open up the ability to make a “top-down” spend commitment based on aggregate expected EC2 usage, independent of specific instance forecasts. Prior to Convertible RIs, Standard RIs required “bottom-up” instance-by-instance capacity planning, which forced a level of precision that made many users hesitant to commit due to the risk of a future architecture change or a price drop. “Top-down” planning allows you to move faster with more aggressive spend commitments because you always have the ability to update your RI portfolio as your EC2 fleet evolves.
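To make the Effective Savings Rate point from the savings magnitude reason concrete, here is a minimal Python sketch. It assumes Effective Savings Rate means savings as a fraction of what the same usage would have cost on demand, and that every committed RI dollar is fully utilized. The function name and the coverage and discount figures are illustrative assumptions, not AWS pricing.

```python
def effective_savings_rate(on_demand_equivalent: float,
                           coverage: float,
                           discount: float) -> float:
    """Savings as a fraction of the on-demand cost of the same usage.

    Assumes (for illustration) that covered usage is 100% utilized and
    that uncovered usage is billed at the full on-demand rate.
    """
    covered = on_demand_equivalent * coverage
    uncovered = on_demand_equivalent - covered
    actual = covered * (1 - discount) + uncovered
    return (on_demand_equivalent - actual) / on_demand_equivalent


# Hypothetical scenario: a cautious Standard RI buyer covers half the fleet
# at a deeper discount; a Convertible RI buyer covers more of the fleet at
# a somewhat lower discount.
standard = effective_savings_rate(100_000, coverage=0.50, discount=0.40)
convertible = effective_savings_rate(100_000, coverage=0.80, discount=0.35)

print(f"Standard RI:    {standard:.0%}")
print(f"Convertible RI: {convertible:.0%}")
```

Under these assumed inputs, the lower per-RI discount is more than offset by the higher coverage: roughly 28% versus 20% of the on-demand equivalent. Different coverage or discount assumptions will shift the result, so treat this as a way to reason about the trade-off rather than a forecast.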
We're massive fans of optimizing cloud spend; it's a journey you must take as your cloud spend becomes material. Everyone is busy, so you have to be pragmatic and balance optimizing for speed of execution and savings impact. Compute is often the largest part of the AWS bill and Convertible RIs are the fastest way to make a significant savings impact. That's why we recommend them as a starting point, and on an ongoing basis, for almost all customers.
While Convertible RIs are powerful and flexible, achieving an optimal savings outcome is tricky. It requires not only making the right purchases, but also managing the Convertible RI portfolio hour after hour, day after day, as EC2 usage shifts. Tactically, this means monitoring EC2 changes as they happen, matching RIs to instances based on the greatest discount (the discount for the same Convertible RI dollar commitment can vary by as much as 33% depending on which EC2 instance it is matched to), and executing efficient exchanges across all accounts, in real time, within your commitment constraints.
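The matching step can be illustrated with a deliberately simplified Python sketch: given a fixed hourly commitment, spend it on the instance types with the deepest discount first. Everything here is hypothetical, including the `Usage` type, the `allocate` function, the instance type names, and the rates (expressed in integer cents per hour to avoid floating-point rounding). It ignores exchange rules, account boundaries, and forecasting, and is not how any particular tool works.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    instance_type: str      # hypothetical label, not a real EC2 type
    running: int            # instances running this hour
    on_demand_cents: int    # assumed on-demand rate, cents/hour
    ri_cents: int           # assumed effective RI rate, cents/hour

    @property
    def discount(self) -> float:
        return 1 - self.ri_cents / self.on_demand_cents


def allocate(commit_cents: int, fleet: list[Usage]) -> dict[str, int]:
    """Greedily apply an hourly commitment to the deepest discounts first."""
    plan: dict[str, int] = {}
    remaining = commit_cents
    for usage in sorted(fleet, key=lambda u: u.discount, reverse=True):
        # Cover as many running instances as the remaining commitment allows.
        count = min(usage.running, remaining // usage.ri_cents)
        if count > 0:
            plan[usage.instance_type] = count
            remaining -= count * usage.ri_cents
    return plan


fleet = [
    Usage("type-a", running=4, on_demand_cents=100, ri_cents=60),   # 40% off
    Usage("type-b", running=10, on_demand_cents=50, ri_cents=35),   # 30% off
    Usage("type-c", running=3, on_demand_cents=200, ri_cents=110),  # 45% off
]
plan = allocate(600, fleet)
print(plan)
```

With these assumed numbers, the commitment goes to `type-c` and then `type-a`, leaving `type-b` on demand. The hard part in practice is that the inputs change every hour, so the allocation has to be recomputed and acted on continuously.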
Operationally pulling off this complicated choreography with reporting tools and ad hoc scripts is hard, and frankly not possible at scale. While there is no magic tool for broad cost optimization, we believe RI management is one of those areas where autonomous usage forecasting and financial optimization algorithms deliver the best outcome.
Some of the most sophisticated technology companies in the world, including Stripe and Quora, have come to the same conclusion and have implemented a fully automated Convertible RI strategy.
We started ProsperOps to bring this approach to the masses. Our mission is to deliver a simple and automatic Convertible RI savings outcome for our customers. We do the heavy lifting and you enjoy the savings. We call it Autonomous Reserved Instance Management, and it's how we contribute to helping businesses prosper in the cloud.