DevOps teams exist to provide continuous delivery of high-quality software to meet the needs of the business, on time and on budget. To achieve this goal, you need to understand cloud cost KPIs. They include the business value that application features provide and the cost of producing those features.
Whether or not your applications run in the cloud, KPIs you could be tracking include:
- Basic resource utilization metrics
- Workload performance metrics
- Unit economics measurements
- Revenue measurements
In the Cloud Cost Optimization Cookbook, Azul product and engineering experts look at metrics and methodologies for managing and optimizing both cloud costs and Java applications. Following are some of the critical areas they detail.
Revenue to build efficiency metrics
At the highest levels of your FinOps practice, you break down spend by units that correspond to different parts of your product offering, then combine the results with data, such as revenue, that identifies the business value of each unit.
You can track high-level strategic KPIs like revenue generated per cloud dollar spent, broken down by team. Product managers, teams with low revenue per cloud dollar spent, and the executive team can then make their own strategic decisions about the long-term and short-term direction of the company.
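As a rough illustration of this KPI, the sketch below sums tagged cloud spend per team and divides each team's revenue by that total. The team names, figures, and the `revenue_per_cloud_dollar` helper are hypothetical and are not taken from the Cookbook.

```python
from collections import defaultdict

# Hypothetical monthly figures: (team tag, cloud cost in dollars).
cloud_spend = [("checkout", 12000.0), ("search", 8000.0), ("checkout", 3000.0)]
# Hypothetical revenue attributed to each team's part of the product.
revenue = {"checkout": 90000.0, "search": 20000.0}

def revenue_per_cloud_dollar(spend_items, revenue_by_team):
    """Return revenue generated per cloud dollar spent, keyed by team."""
    spend_by_team = defaultdict(float)
    for team, cost in spend_items:
        spend_by_team[team] += cost
    # Teams with zero spend are skipped to avoid dividing by zero.
    return {
        team: revenue_by_team.get(team, 0.0) / spend
        for team, spend in spend_by_team.items()
        if spend > 0
    }

kpi = revenue_per_cloud_dollar(cloud_spend, revenue)
for team, value in sorted(kpi.items()):
    print(f"{team}: ${value:.2f} revenue per cloud dollar")
```

In practice the spend rows would come from a tagged billing export rather than a hard-coded list.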
Resource utilization metrics
Resource utilization metrics measure the underlying infrastructure, which comes in handy when performing other cost optimization tasks like right-sizing instances and splitting up shared costs. Common resource utilization metrics include:
- CPU utilization. Measure the percentage of the CPU capacity of each provisioned virtual machine (VM) or container that is utilized by Java applications.
- Memory utilization. Track memory consumption, both heap and non-heap, and compare to the total amount of memory available on the VM or container.
- Network throughput. Analyze inbound and outbound network traffic to identify potential bottlenecks or overprovisioned resources.
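The CPU and memory metrics above can feed a simple right-sizing check. The following is a minimal sketch, assuming you already collect per-instance samples; the `InstanceSample` fields, the instance names, and the 30% and 40% thresholds are illustrative assumptions, not recommended values.

```python
from dataclasses import dataclass

@dataclass
class InstanceSample:
    name: str
    cpu_used_cores: float    # average cores consumed by the Java application
    cpu_total_cores: float   # cores provisioned for the VM or container
    heap_mb: float           # heap memory in use
    non_heap_mb: float       # non-heap memory in use (metaspace, code cache, ...)
    memory_total_mb: float   # memory available on the VM or container

def utilization(sample):
    """Return (CPU %, memory %) for one instance."""
    cpu_pct = 100.0 * sample.cpu_used_cores / sample.cpu_total_cores
    mem_pct = 100.0 * (sample.heap_mb + sample.non_heap_mb) / sample.memory_total_mb
    return cpu_pct, mem_pct

def rightsizing_candidates(samples, cpu_threshold=30.0, mem_threshold=40.0):
    """Names of instances that sit below both thresholds (assumed cut-offs)."""
    names = []
    for sample in samples:
        cpu_pct, mem_pct = utilization(sample)
        if cpu_pct < cpu_threshold and mem_pct < mem_threshold:
            names.append(sample.name)
    return names

samples = [
    InstanceSample("api-1", 1.0, 4.0, 1024.0, 512.0, 8192.0),
    InstanceSample("api-2", 3.0, 4.0, 5000.0, 1000.0, 8192.0),
]
print(rightsizing_candidates(samples))
```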
Workload performance
Workload performance metrics measure what is being done with those resources and how efficiently they are operating. Common workload performance metrics include:
- Response time. Monitor the latency of Java application responses to ensure optimal user experience.
- Error rates. Track the frequency of errors and exceptions in Java applications to identify performance degradation or potential bugs.
- Application/service usage. Monitor how many users the application serves and how many requests it handles.
- Carrying capacity. Track the maximum number of requests an individual VM or container can handle while staying within the key performance boundaries of the application.
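To make these definitions concrete, here is a small sketch of how error rate, a latency percentile, and carrying capacity might be computed from raw measurements. The function names, the nearest-rank percentile method, and the load-test figures are assumptions chosen for illustration.

```python
import math

def error_rate(total_requests, failed_requests):
    """Fraction of requests that ended in an error."""
    return failed_requests / total_requests if total_requests else 0.0

def percentile(values, pct):
    """Nearest-rank percentile, e.g. pct=95 for p95 latency."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def carrying_capacity(load_steps, max_p95_ms, max_error_rate):
    """Highest load an instance sustained before leaving its performance boundaries.

    load_steps holds (requests_per_second, p95_ms, error_rate) tuples from a
    stepped load test against a single VM or container.
    """
    capacity = 0
    for rps, p95_ms, err in sorted(load_steps):
        if p95_ms > max_p95_ms or err > max_error_rate:
            break  # the first violated step ends the search
        capacity = rps
    return capacity

# Hypothetical response times in milliseconds.
latencies_ms = [120, 95, 110, 400, 105, 98, 130, 102, 99, 101]
# Hypothetical stepped load test results.
steps = [(100, 110.0, 0.001), (200, 140.0, 0.002), (300, 190.0, 0.004), (400, 320.0, 0.02)]

print(percentile(latencies_ms, 95), error_rate(2000, 10))
print(carrying_capacity(steps, max_p95_ms=200.0, max_error_rate=0.01))
```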
Establishing fully loaded costs on different units
Cost KPIs are derived by combining application usage and resource consumption data with billing metrics from the cloud provider. However, there are differing levels of maturity at which you can evaluate cloud costs and use unit economics to better tie costs to business value. The FinOps Foundation describes fully loaded costs as follows:
“Fully loaded costs are amortized, reflect the actual discounted rates a company is paying for cloud resources, equitably factor in shared costs, and are mapped to the business’s organizational structure. In essence, they show the actual costs of your cloud and what is driving them.”
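One way to read that definition is as a calculation with three ingredients: usage priced at the discounted rate, upfront commitments amortized over their term, and shared costs split between teams. The sketch below allocates the amortized and shared amounts in proportion to each team's direct cost, which is only one possible allocation rule and an assumption here; all names and figures are hypothetical.

```python
def fully_loaded_costs(usage_hours_by_team, discounted_hourly_rate,
                       upfront_commitment, commitment_months, shared_monthly_cost):
    """Return a monthly fully loaded cost per team.

    Assumption: amortized commitment fees and shared costs are spread
    in proportion to each team's direct (discounted) usage cost.
    """
    direct = {team: hours * discounted_hourly_rate
              for team, hours in usage_hours_by_team.items()}
    total_direct = sum(direct.values())
    if total_direct <= 0:
        raise ValueError("no direct usage to allocate against")
    # Spread the upfront payment evenly over the commitment term.
    amortized_monthly = upfront_commitment / commitment_months
    pool = amortized_monthly + shared_monthly_cost
    return {team: cost + pool * cost / total_direct
            for team, cost in direct.items()}

costs = fully_loaded_costs(
    usage_hours_by_team={"checkout": 600.0, "search": 400.0},
    discounted_hourly_rate=0.5,     # rate actually paid after discounts
    upfront_commitment=12000.0,     # e.g. a one-year prepaid commitment
    commitment_months=12,
    shared_monthly_cost=500.0,      # e.g. shared networking or support
)
print(costs)
```

The per-team totals add up to the full monthly bill, so no spend is left unallocated.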
Important Use and Cost Metrics
Metric | Description |
---|---|
Cost per transaction | Divide costs among transactions to correlate spending with business value and set reasonable pricing for features. |
Hourly cost per feature | Identify blind spots in your measurement, and scaling inefficiencies if costs do not increase linearly with feature usage. |
Cost per customer | Establish the baseline cost of onboarding and basic provisioning of a single customer, then segment the additional costs into low/medium/high activity customers. |
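The arithmetic behind two of these metrics can be sketched as follows. The baseline cost, the activity thresholds, the customer names, and the choice to split variable cost by transaction count are illustrative assumptions.

```python
def cost_per_transaction(fully_loaded_cost, transaction_count):
    """Average cost of serving one transaction."""
    if transaction_count <= 0:
        raise ValueError("transaction_count must be positive")
    return fully_loaded_cost / transaction_count

def cost_per_customer(baseline_cost, variable_cost_pool, transactions_by_customer):
    """Baseline onboarding/provisioning cost plus a usage-weighted share of the rest."""
    total = sum(transactions_by_customer.values())
    if total <= 0:
        raise ValueError("no transactions recorded")
    return {customer: baseline_cost + variable_cost_pool * tx / total
            for customer, tx in transactions_by_customer.items()}

def activity_segment(transactions, low_max=5000, medium_max=20000):
    """Bucket a customer as low, medium, or high activity (assumed thresholds)."""
    if transactions <= low_max:
        return "low"
    if transactions <= medium_max:
        return "medium"
    return "high"

# Hypothetical monthly transaction counts per customer.
transactions = {"acme": 30000, "globex": 9000, "initech": 1000}
total_cost = 2000.0   # fully loaded monthly cost
baseline = 100.0      # assumed fixed cost per onboarded customer

unit_cost = cost_per_transaction(total_cost, sum(transactions.values()))
per_customer = cost_per_customer(
    baseline, total_cost - baseline * len(transactions), transactions)
segments = {c: activity_segment(tx) for c, tx in transactions.items()}
print(unit_cost, per_customer, segments)
```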
A critical element is factoring in costs beyond compute. For example, you need to factor in not only the amount of compute needed to run your JDK instances, but also the licenses for any software running on those JDKs. Because a per-server license is paid for every server, reducing the total number of servers saves more with software licensed per server than with free open-source or homegrown software.
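A quick calculation shows the multiplier effect. The server counts and monthly prices below are made-up numbers, and `monthly_savings` is a hypothetical helper.

```python
def monthly_savings(servers_before, servers_after,
                    compute_cost_per_server, license_cost_per_server=0.0):
    """Savings from removing servers, counting compute plus any per-server license."""
    removed = servers_before - servers_after
    return removed * (compute_cost_per_server + license_cost_per_server)

# Hypothetical fleet shrinking from 20 to 14 servers at $300 compute per server.
with_license = monthly_savings(20, 14, 300.0, license_cost_per_server=200.0)
open_source = monthly_savings(20, 14, 300.0)  # no per-server license fee

print(with_license, open_source, with_license / open_source)
```

With these assumed prices, each removed server saves $500 instead of $300, so the same consolidation yields about 1.67 times the savings.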
Conclusion
It takes a rich toolset and coordinated effort to build a dataset of proven and valuable KPIs that can deliver real business value to your organization. In the Cloud Cost Optimization Cookbook, you can find recommendations for tools for tagging cloud resources, monitoring application performance, and breaking down cloud spend. Check out the FinOps Guide to Cloud Cost Allocation for strategy tips.
In addition, Azul’s high-performance Java platform uses fewer resources to achieve the same or even greater performance. This efficiency enables organizations to save on infrastructure costs, especially expensive cloud instances. A high-performance Java platform is one tool in your toolbox for lowering cloud costs.