AWS provides three storage options: block, object, and file storage. Amazon EBS is AWS's primary block storage, used for mission-critical instances. Amazon S3 is its object storage for latency-insensitive data. Amazon EFS is its simple, serverless, set-and-forget elastic file system.
When it comes to Infrastructure-as-a-Service (IaaS), organizations focus on setting up their EC2 instances in AWS. An EC2 instance is an on-demand, scalable compute resource, generally used to host applications when the customer requires more control over the computing environment.
But when it comes to cloud cost optimization, optimizing your instances does not automatically optimize your storage.
Several myths undervalue the importance of EBS optimization: some downplay organizations' dependence on EBS, while others are born purely out of a lack of visibility.
In this post, we take a look at these challenges and lay out a way to determine whether your EBS is optimized or not.
Throughout our conversations with industry leaders and enterprises, we kept running into the same pattern: organizations focus primarily on managing their compute services bill.
Yes, compute services constitute 40-50% of your cloud costs, but storage comes second with a 20-25% share; EBS alone accounts for 10-20% if your setup is IaaS/PaaS heavy.
But EBS storage charges are often hidden inside the larger, overarching compute services bill. As a result, organizations end up believing that they spend on compute, not on storage.
EBS bills can constitute 10-20% of your cloud costs.
The figure above is an excerpt of an EC2 instance bill taken from AWS Cost Explorer. Unless you drill down, you won't realize how large a share of the cost the EBS line items constitute.
Organizations are oblivious to this and continue to believe that all they pay for are the EC2 instances, and that their EBS costs are well within their control.
As a result, it becomes imperative to focus on storage optimization if you want to truly optimize your cloud.
Do you actively monitor your EBS disk utilization? Because 9/10 customers we spoke with don’t.
And the reason? AWS doesn't provide these metrics out of the box.
More often than not, organizations assume their disks are properly utilized, only to fall short of their own expectations.
Now, there are two ways to acquire disk utilization metrics:
The first is to SSH into each EC2 instance individually and check the utilization numbers (for example, with df -h). As you may agree, this gets cumbersome once you have 100+ EC2 servers; a scripted sketch of this approach follows the two options below.
9/10 customers don't track their disk utilization.
The second is to install a monitoring tool such as Datadog or New Relic on your EC2 instances to pull the disk utilization numbers. Because these tools are costly, organizations end up installing them only on production workloads and not worrying about the rest.
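If you want to script the first approach rather than SSH into machines one by one, here is a minimal sketch that fans the check out across instances via AWS Systems Manager. It assumes your instances are SSM-managed, your boto3 credentials are configured, and GNU df is available on the hosts; the region and the crude sleep-based wait are illustrative only.

```python
import time
import boto3

# Assumes the instances are SSM-managed and AWS credentials are configured.
REGION = "us-east-1"  # illustrative region
ec2 = boto3.client("ec2", region_name=REGION)
ssm = boto3.client("ssm", region_name=REGION)

# Collect the IDs of all running instances.
instance_ids = []
for reservation in ec2.describe_instances(
    Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
)["Reservations"]:
    for instance in reservation["Instances"]:
        instance_ids.append(instance["InstanceId"])

# Run `df` on every instance at once via SSM instead of SSH-ing one by one.
# (send_command accepts at most 50 instance IDs per call; batch if needed.)
command_id = ssm.send_command(
    InstanceIds=instance_ids,
    DocumentName="AWS-RunShellScript",
    Parameters={"commands": ["df -h --output=source,pcent,target"]},
)["Command"]["CommandId"]

time.sleep(5)  # crude wait; real code should poll until each invocation completes
for instance_id in instance_ids:
    result = ssm.get_command_invocation(CommandId=command_id, InstanceId=instance_id)
    print(instance_id)
    print(result["StandardOutputContent"])
```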
Based on a survey, the average disk utilization of firms is only 25%.
So we decided to test this assumption and ran a survey across 100+ organizations to see whether their disks were properly utilized. The average disk utilization came out to be only 25%, which means the remaining 75% is a buffer you pay for every day but never use.
If you look at the graph closely, you will find that this is not an exception but the norm. So it might be high time to go back and check whether you are properly utilizing your cloud storage.
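To put a number on that buffer, here is a back-of-the-envelope calculation. It assumes gp3's list price of roughly $0.08 per GB-month, which varies by region and volume type, and a hypothetical 1 TB of provisioned storage:

```python
# Back-of-the-envelope EBS waste estimate.
GP3_PRICE_PER_GB_MONTH = 0.08  # assumed gp3 list price; varies by region/type

provisioned_gb = 1024   # hypothetical 1 TB provisioned across your volumes
utilization = 0.25      # the survey's average disk utilization

monthly_cost = provisioned_gb * GP3_PRICE_PER_GB_MONTH
wasted_cost = monthly_cost * (1 - utilization)

print(f"Monthly EBS spend: ${monthly_cost:.2f}")       # $81.92
print(f"Spent on unused buffer: ${wasted_cost:.2f}")   # $61.44, month after month
```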
The introduction of Kubernetes helped change the cloud infrastructure landscape. It became one of the most efficient ways for developers to virtualize workloads, ensuring the use of all available resources while minimizing overhead.
On top of that, built-in orchestration features like auto-scaling and logical volume management meant that organizations spent less time on capacity planning and management.
However, amid all this, we tend to forget that, in the end, even managed services like EKS run on EC2 instances with EBS volumes attached to them.
In other words, all the technical and logistical challenges of provisioning AWS block storage, such as the downtime involved in shrinking a volume and the buffer you must keep provisioned, still persist.
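That asymmetry is visible in the API itself: a volume can be grown in place, but there is no shrink operation, so downsizing means creating a smaller volume and migrating the data. A minimal sketch, using an illustrative region and a hypothetical volume ID:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # illustrative region

# Growing a volume is a single in-place call (Elastic Volumes)...
ec2.modify_volume(
    VolumeId="vol-0123456789abcdef0",  # hypothetical volume ID
    Size=200,  # new size in GiB; must be LARGER than the current size
)

# ...but there is no shrink equivalent: EBS rejects a smaller Size.
# Downsizing means snapshotting, creating a smaller volume, copying the
# data over, and swapping the attachment -- with downtime along the way.
```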
AWS S3, or Amazon Simple Storage Service, is object-level storage meant for unstructured data.
Because it is used for latency-insensitive data like big data, the sheer volume S3 stores gives the impression that it dominates your storage spend. However, S3 is roughly 4x cheaper than EBS per GB, which means that if you store 1 TB of data in S3, just 250 GB of EBS storage is enough to match your S3 spend.
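The arithmetic behind that 4x, using illustrative list prices (roughly $0.023/GB-month for S3 Standard versus about $0.10/GB-month for gp2; actual prices vary by region and volume type):

```python
# Illustrative list prices; check current AWS pricing for your region.
S3_STANDARD_PER_GB_MONTH = 0.023   # S3 Standard
EBS_GP2_PER_GB_MONTH = 0.10        # gp2 general-purpose SSD

s3_spend = 1024 * S3_STANDARD_PER_GB_MONTH          # 1 TB in S3
ebs_gb_for_same_spend = s3_spend / EBS_GP2_PER_GB_MONTH

print(f"1 TB in S3 costs ${s3_spend:.2f}/month")                    # ~$23.55
print(f"The same spend buys ~{ebs_gb_for_same_spend:.0f} GB of EBS")  # ~236 GB
```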
So the question to ponder is: are you sure your root volumes, self-hosted databases, and application logs aren't adding up to an EBS spend that matches your S3 spend?
Even if you recreate your EC2 instances every week, effective provisioning of your EBS storage can significantly reduce your cloud bills. The same problem persists for stateless workloads: whenever you create an EC2 instance, you face the same constraints and end up over-provisioning. Although the instances can be recreated, the configuration used to create them is static, and each service has different storage requirements. No organization actively monitors and updates the configuration for each service every week.
The provisioned size, therefore, is set with the week's peak workload in mind, while actual utilization hovers around an average that seldom touches that peak.
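A toy example makes the peak-versus-average gap concrete. The daily usage figures here are hypothetical, with a single spike standing in for the weekly peak:

```python
# Hypothetical daily disk usage (GB) for one service over a week.
daily_usage_gb = [80, 95, 110, 400, 120, 90, 85]  # one spike on day 4

provisioned_gb = max(daily_usage_gb)                          # sized for the peak
average_usage_gb = sum(daily_usage_gb) / len(daily_usage_gb)  # what you actually use

print(f"Provisioned for peak: {provisioned_gb} GB")                              # 400 GB
print(f"Average actually used: {average_usage_gb:.0f} GB")                       # 140 GB
print(f"Buffer paid for but rarely touched: {provisioned_gb - average_usage_gb:.0f} GB")
```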