In the ever-evolving landscape of digital transformation, businesses are rapidly embracing cloud technologies to drive innovation, enhance agility, and stay ahead in a competitive marketplace. However, with the myriad opportunities presented by the cloud comes the challenge of efficiently managing the associated costs. The need for precise financial control and strategic optimisation in the cloud environment has never been more critical.
Large organisations face several significant challenges when it comes to managing and controlling cloud resources and their consumption. These challenges can impact their ability to optimise costs and effectively use cloud resources. Some of the main challenges include:
Large organisations typically run extensive and complex cloud infrastructures with numerous accounts, workloads, services, and resources. Managing this complexity at scale is difficult and may lead to increased spending.
Gaining full visibility into cloud usage and spending across multiple teams and business units can be difficult. Without clear visibility, it is challenging to identify cost drivers and opportunities for optimisation.
Over time, resource sprawl increases: unused or underutilised resources accumulate and drive up costs. Finding and eliminating this sprawl can be time-consuming.
Cloud providers offer a variety of pricing models and options, including on-demand, reserved instances, and spot instances, each with its own complexities. Choosing the right pricing model for each resource can be challenging.
Data transfer costs between cloud regions, zones, and services can be significant and are often overlooked. Managing and optimising these costs can be complex.
The absence of robust cloud governance practices can lead to unchecked resource creation and configuration changes, resulting in higher costs and potential security risks.
Some legacy systems need to be integrated with cloud services. This integration can introduce complexity and costs that need to be carefully managed.
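The visibility and sprawl challenges above can be made concrete with a small sketch. It assumes a hypothetical billing export in which every resource carries a team tag, a monthly cost, and an average CPU utilisation. The field names and the 10% idle threshold are illustrative assumptions, not any cloud provider's schema or recommended value.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ResourceRecord:
    """One row of a hypothetical billing export (field names are assumptions)."""
    resource_id: str
    team: str            # cost-allocation tag; empty string if untagged
    monthly_cost: float  # any single currency
    avg_cpu_pct: float   # average CPU utilisation over the billing period


def cost_by_team(records):
    """Sum monthly cost per team; untagged resources are grouped together."""
    totals = defaultdict(float)
    for record in records:
        totals[record.team or "untagged"] += record.monthly_cost
    return dict(totals)


def underutilised(records, cpu_threshold_pct=10.0):
    """Return resources whose average CPU is below the (illustrative) threshold."""
    return [r for r in records if r.avg_cpu_pct < cpu_threshold_pct]
```

Grouping untagged spend under its own key makes gaps in cost allocation visible instead of hiding them, which is usually the first step towards assigning ownership.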
In this dynamic scenario, Celfocus’s Cloud FinOps Offer is tailored to empower businesses not only to harness the full potential of cloud resources but also to do so with a strong focus on cost-effectiveness and financial transparency.
Celfocus proposes a multi-cloud solution that will become the foundation for the existing and future cloud architectures, aligned with the FinOps Foundation Framework and compliant with the organisations' business, security, and operations requirements.
Celfocus supported the implementation of a FinOps solution framework for cloud optimisation on AWS Cloud. The solution ran the workloads on Spot Instances (a balanced model in production environments, full Spot in non-production), with the aim of shifting short-lived demand onto Spot capacity.
One of the main challenges was a lack of development best practices: applications were deployed without any information about the minimum resources they required to operate. As a result, the Kubernetes cluster became unstable, because it could not make good decisions about how applications should be accommodated. Vertical Pod Autoscaler (VPA) was introduced to assess each application's consumption baseline dynamically, and Karpenter was configured to analyse the workload requirements and decide the number and type of instances that should be deployed for a cost-effective environment. These mechanisms adapted automatically to new situations and baselines.
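The idea of deriving a consumption baseline from observed usage can be illustrated with a minimal sketch. It is not VPA's actual algorithm: it simply takes a high percentile of hypothetical CPU usage samples (in millicores) and adds a safety margin. The function names, the 90th percentile, and the 15% margin are all assumptions chosen for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers (pct in 1..100)."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def recommend_cpu_request(samples_millicores, pct=90, margin_pct=15):
    """Suggest a CPU request: a usage percentile plus a safety margin.

    Returns whole millicores, rounded up. A high percentile ignores rare
    spikes, so the request tracks the typical baseline instead of the peak.
    """
    baseline = percentile(samples_millicores, pct)
    return math.ceil(baseline * (100 + margin_pct) / 100)


# Hypothetical usage samples for one application, in millicores.
usage = [100, 120, 110, 400, 130, 125, 115, 105, 135, 140]
print(recommend_cpu_request(usage))
```

With requests of this kind declared on every application, a scheduler or node provisioner has the information it needs to pack workloads onto appropriately sized instances.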
By combining Spot Instances, right-sizing, and intelligent autoscaling, the cost of the EC2 service was reduced by more than 40% compared with the previous approach of running a single instance type fully On-Demand for all workloads, while providing a more stable and resilient platform overall.
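The arithmetic behind mixing Spot and On-Demand capacity can be sketched as follows. All rates and discounts passed to these functions are hypothetical inputs; actual Spot prices vary by instance type, region, and time, and the sketch does not attempt to reproduce the figure reported above.

```python
def blended_hourly_cost(on_demand_rate, spot_discount, spot_fraction, instances=1):
    """Hourly cost of a fleet in which a fraction of instances runs on Spot.

    on_demand_rate: hourly On-Demand price per instance (hypothetical input)
    spot_discount:  Spot discount relative to On-Demand, from 0.0 to 1.0
    spot_fraction:  share of the fleet running on Spot, from 0.0 to 1.0
    """
    if not 0.0 <= spot_discount <= 1.0:
        raise ValueError("spot_discount must be between 0 and 1")
    if not 0.0 <= spot_fraction <= 1.0:
        raise ValueError("spot_fraction must be between 0 and 1")
    spot_rate = on_demand_rate * (1.0 - spot_discount)
    per_instance = (1.0 - spot_fraction) * on_demand_rate + spot_fraction * spot_rate
    return instances * per_instance


def savings_fraction(spot_discount, spot_fraction):
    """Saving relative to a fully On-Demand fleet of the same size."""
    return spot_discount * spot_fraction
```

The saving is the product of the Spot share and the Spot discount, which is why a balanced mix in production yields a smaller saving than full Spot in non-production. Right-sizing and autoscaling reduce the number and size of instances on top of this.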
Celfocus supported the implementation of Next Generation Monitoring (NGM) on the AWS Cloud with FinOps optimisation by design. NGM is a near real-time AI operations service-monitoring solution with autonomous anomaly detection over core service events. Overall service response times were reduced by using machine learning to pinpoint probable root causes faster and by automating ticket creation and tracking.
Projects involving real-time data streaming require substantial computing resources, so keeping cloud costs low can be a challenge. By applying automatic shutdown and startup policies to non-production environments, using spot instances for project sandboxes, and right-sizing production infrastructure to avoid over-provisioning, it was possible to achieve a cost reduction of approximately 35%.
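The effect of an automatic shutdown and startup policy on a non-production environment can be estimated with simple arithmetic. The sketch below assumes compute is billed per running hour and uses a hypothetical schedule; it does not model storage or other charges that continue while instances are stopped.

```python
HOURS_PER_WEEK = 7 * 24


def shutdown_savings_fraction(hours_per_day, days_per_week):
    """Share of always-on compute hours avoided by a running schedule.

    hours_per_day: hours the environment stays up on each active day (0..24)
    days_per_week: number of active days per week (0..7)
    """
    if not 0 <= hours_per_day <= 24:
        raise ValueError("hours_per_day must be between 0 and 24")
    if not 0 <= days_per_week <= 7:
        raise ValueError("days_per_week must be between 0 and 7")
    running_hours = hours_per_day * days_per_week
    return 1 - running_hours / HOURS_PER_WEEK


# Hypothetical schedule: up 12 hours a day, Monday to Friday.
print(f"{shutdown_savings_fraction(12, 5):.1%} of compute hours avoided")
```

A schedule of 12 hours on 5 days runs 60 of the 168 hours in a week, so roughly 64% of that environment's compute hours are avoided. The overall reduction is lower because production and non-compute costs are unaffected by the schedule.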