7 Steps for managing multi-cloud expenses
Cloud computing is seen by most CIOs and senior IT execs as a key enabler of their digital strategy, providing greater flexibility, agility and faster time-to-market, and driving innovation. However, the counterbalance to organisations fully embracing the cloud is the potentially high cost of operating from the cloud.
But why would running services in the cloud be more expensive than housing those services in a private infrastructure? Surely the whole point of the cloud movement has been that it leverages economies of scale, using high density state-of-the-art computing technology with a full layer of automation wrapped around it.
In contrast, legacy IT is highly customised and lacks the massive scale and computing density of public Infrastructure as a Service (IaaS), as well as the vast levels of automation wrapped around cloud services. So it would be reasonable to expect that operating in the public cloud would be more cost effective. But that is not what we see. Most organisations struggle to achieve the Total Cost of Ownership (TCO) they anticipated when migrating their services to the cloud.
This blog attempts to summarise the key areas contributing to the high costs of running IT services in the public cloud (IaaS), and the 7 best practice steps, defined by Gartner, that help bring these expenses under control.
Optimistic cost models for transitioning to the cloud
The reality is that not all applications and legacy systems are cloud-ready. We have seen many organisations respond to tactical business demands to move services into the cloud without thoroughly assessing the transition plans for a full cloud migration. This typically results in significantly higher staff workload, as teams react to elongated project timelines and have to manage and maintain these services with little best practice or automation in place, which negates the cloud migration business case.
Cloud skill gaps
A lack of best practice and know-how is also a significant contributing factor in skewing the TCO models. The more time is spent trying to address the technology, the less time is spent driving innovation based on the new capabilities offered by the cloud. Although the situation is improving, a recent survey found that 25% of enterprises are struggling to address this skills gap.
Hidden vs Visible costs
Another contributing factor in balancing the books is the granularity of costs in running IT. Understanding the detailed costs of running a service in the cloud is made easy by the fine level of usage metering and billing that the cloud providers offer. Unfortunately, the same does not apply to legacy IT running in a data centre. Most organisations do not charge the business back for the use of IT services, so internal computing, storage, network and software costs are often heavily estimated (averaged across hundreds of services).
Whereas in the past a set of services would be operating as part of the backend IT cost centre, migrating them to the cloud suddenly incurs visible costs. Costs that were not modelled in the TCO in the first place.
In other words, the cloud option may not be costing more; it just seems much more expensive when compared with an inaccurate on-premise cost model. These discrepancies have a significant impact on the perceived high costs of running IT in the cloud.
IT organisational structure aligned with multi-clouds
Some costs are far more difficult to articulate clearly in a TCO model. For instance, driving high levels of automation across multiple IT organisation silos and varying operating tempo clearly has a direct impact on efficiencies gained by moving to the cloud.
New organisational roles such as Cloud Orchestration, Provisioning Manager, Persistence Manager, API Manager or Service Portfolio Manager need to be mapped against the IaaS (and SaaS/PaaS) operating model. Aligning these new cloud-required functions with the organisation structure will help drive automation and cloud best practice maturity, thereby helping to realise the potential cloud cost savings much faster.
Complex pricing models offered by cloud providers
The vast range of consumable services, coupled with a myriad of cloud à la carte pricing, contract-based discounts and periodic incentives, makes the task of choosing the right provider and service options difficult. IT organisations that are just venturing into the cloud may not fully appreciate the implications of these pricing models, which may contribute to unnecessary spend.
And even when best practices are well understood, applying them consistently across a large deployment often proves very tricky and time consuming.
To make matters worse, the same fine-grained cost analysis that we applaud cloud service providers for manifests itself as highly complex monthly or quarterly billing reports that are difficult to comprehend, analyse and cross-reference against internal accounting. One client we are working with is reporting 120,000 billing line items for 600 servers hosted in a public cloud every month!
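To give a sense of how a bill of that size can be made manageable, here is a minimal Python sketch that rolls line items up into one total per server. The field names (`resource_id`, `service`, `cost`) and the sample figures are assumptions for illustration only; real billing exports differ from provider to provider.

```python
from collections import defaultdict


def roll_up(line_items, key="resource_id"):
    """Sum the cost of all billing line items that share the same key."""
    totals = defaultdict(float)
    for item in line_items:
        totals[item[key]] += item["cost"]
    return dict(totals)


# Hypothetical sample: three line items covering two servers.
sample_bill = [
    {"resource_id": "server-001", "service": "compute", "cost": 1.50},
    {"resource_id": "server-001", "service": "storage", "cost": 0.25},
    {"resource_id": "server-002", "service": "compute", "cost": 2.25},
]

per_server = roll_up(sample_bill)
per_service = roll_up(sample_bill, key="service")
```

The same function can group by any field present in the export, which is what makes cross-referencing against internal accounting codes practical.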
Best practice for containing public cloud costs
Given these obstacles, what are the best practices and small-scale maturity steps that IT organisations can take to bring governance and control over their cloud-based IT services?
Gartner has gone a significant way in articulating a set of best practices to help organisations understand their cloud-related expenses and how to minimise them.
Our findings are that even a modest application of these best practices can deliver substantial savings. Over time, as these are ingrained into formal processes and day-to-day operations, the levels of maturity for cloud adoption will increase, thereby magnifying cost reduction and efficiency efforts.
1. Design and adhere to a tagging plan
Tags are used to designate environments, organisations, technical infrastructures, and projects in addition to cost centres. The amount of tagging functionality varies from provider to provider, which in turn makes it a difficult task to have a consistent understanding of what cloud assets have been deployed, what is being used and what is being billed for.
A good tagging plan is one which supports the downstream reporting, automation and governance tasks required to support a complex cloud infrastructure. Although each enterprise has its own reporting and management requirements, the following are what we typically see as best practice for managing and controlling IT assets residing in public cloud:
- Organisation/Business Unit
- Cost Centre
Applying a consistent tagging plan becomes even more complicated once IT organisations offer public cloud self-service to their internal users.
Typical activities involved in assessing the state of the deployed tags against the tagging plan must include:
- Understand which IT assets, and which of their dependencies, reside in the cloud.
- Discover the cloud tags associated with each cloud IT asset.
- Improve tagging quality through tag enrichment from the information held in the CMDB.
- Define best practice tagging and ensure the tagging plan is consistent with the Capacity Management function. This is critical as the following steps all are dependent on accurate utilisation reporting based on tags and application services.
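As a rough illustration of the discovery and enrichment activities above, the following Python sketch audits assets against a tagging plan and fills gaps from a CMDB record. The tag keys (`business_unit`, `cost_centre`), the asset IDs and the dictionary-shaped CMDB record are all hypothetical; a real implementation would read tags through each provider's own API.

```python
# Hypothetical tagging plan: every asset must carry these tag keys.
REQUIRED_TAGS = ("business_unit", "cost_centre")


def missing_tags(asset_tags, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent or blank on one asset."""
    return [key for key in required if not asset_tags.get(key, "").strip()]


def audit(assets):
    """Map each non-compliant asset ID to the list of tags it is missing."""
    report = {}
    for asset_id, tags in assets.items():
        gaps = missing_tags(tags)
        if gaps:
            report[asset_id] = gaps
    return report


def enrich(asset_tags, cmdb_record, required=REQUIRED_TAGS):
    """Fill missing required tags from a CMDB record, keeping existing values."""
    enriched = dict(asset_tags)
    for key in missing_tags(asset_tags, required):
        if cmdb_record.get(key):
            enriched[key] = cmdb_record[key]
    return enriched


# Hypothetical discovered assets and their current tags.
assets = {
    "vm-1": {"business_unit": "Retail", "cost_centre": "CC-100"},
    "vm-2": {"business_unit": "Retail"},
}
```

Here `audit(assets)` flags `vm-2` as missing its cost centre, and `enrich` can repair it if the CMDB holds that value. Existing tags are never overwritten, on the assumption that a tag set deliberately in the cloud is more current than the CMDB.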
2. Right-size resources
A key contributor to the high costs of running services in the public cloud is that as companies move apps hosted on-premise to the cloud, they simply replicate their existing VM resource allocations without first determining if the app really needs all those resources. McKinsey estimates that anywhere from 88 percent to 94 percent of public cloud capacity is wasted.
Enterprises that haven’t instituted a charge-back model for resource usage are especially prone to over-provisioning abuses, as the application owners have no incentive to scale back their infrastructure allocations.
The underlying theme is a lack of visibility into the amount of IT resources an application needs to meet its performance and service-level requirements. Many cloud management tools are quite simplistic and provide only broad information about peak loads and other cloud instance data.
To avoid resource starvation or overbuying, you must select the appropriate resource profile for your organisation’s needs and review utilisation regularly. Ideal review frequency can range from hourly to monthly depending on the size and volatility of the deployment.
A breakdown of the activities supporting this step includes:
- Know all the components and dependencies that deliver an application service.
- Gather baseline resource consumption usage for this service.
- Identify application migration priority list based on criteria such as consumption, operating systems, business service criticality and deployed environments.
- Migrate the application into the cloud sized on actual usage. Under-sizing leads to resource starvation, security vulnerabilities and a general degradation in performance, while over-sizing means buying unnecessary resources, resulting in high costs.
- Continue monitoring resource consumption post migration. That’s where the cost is.
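The sizing decision in the activities above can be sketched in a few lines of Python. This is only one possible approach, under stated assumptions: the instance catalogue is invented, the 95th percentile and 20% headroom are arbitrary defaults, and CPU is the only dimension considered (a real exercise would also look at memory, storage and network).

```python
import math

# Hypothetical instance catalogue: (name, vCPUs), smallest first.
CATALOGUE = [("small", 2), ("medium", 4), ("large", 8), ("xlarge", 16)]


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def right_size(cpu_samples, allocated_vcpus, headroom=0.2, pct=95):
    """Pick the smallest catalogue size that covers measured demand.

    cpu_samples are baseline utilisation readings as fractions (0..1)
    of the currently allocated vCPUs. Demand is taken as the pct-th
    percentile plus a safety headroom.
    """
    needed = percentile(cpu_samples, pct) * allocated_vcpus * (1 + headroom)
    for name, vcpus in CATALOGUE:
        if vcpus >= needed:
            return name
    return CATALOGUE[-1][0]  # demand exceeds the catalogue: take the largest
```

For example, a server with 16 vCPUs allocated on-premise whose utilisation never exceeds 30% needs roughly 5.8 vCPUs including headroom, so the sketch selects the 8 vCPU size instead of replicating the original allocation.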
3. Choose an appropriate pricing model
Each cloud provider has a unique bundle of services and pricing models, and different providers have price advantages for different products. Typically, pricing variables are based on the period of usage, with some providers allowing by-the-minute usage as well as discounts for longer commitments. These models are then further enhanced (read: made more complicated) through various incentive programmes.
When used properly, these models can lead to overall savings, but casual and unplanned use could lead to more unnecessary spending.
Key steps include:
- Compare costs for your entire application service across multiple cloud providers prior to migration. Remember, prices change on a regular basis and the billing mechanism may be complex, so this process needs to be as automated as possible.
- Once migrated, continue to monitor cloud expense against the original pricing model and verify if expenditure is in line with the original TCO model.
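To illustrate why usage patterns matter when choosing a pricing model, here is a simplified Python sketch comparing pay-as-you-go with a discounted commitment. The rates and discount are made-up numbers, and the model assumes the commitment is billed for every hour of the month whether the instance runs or not; actual provider terms vary and should be checked.

```python
HOURS_PER_MONTH = 730  # approximate hours in an average month


def on_demand_cost(hourly_rate, hours_used):
    """Pay only for the hours the instance actually runs."""
    return hourly_rate * hours_used


def committed_cost(hourly_rate, discount):
    """Assumed model: the commitment is billed for every hour of the month."""
    return hourly_rate * (1 - discount) * HOURS_PER_MONTH


def break_even_hours(discount):
    """Monthly usage above which the commitment becomes cheaper."""
    return HOURS_PER_MONTH * (1 - discount)


def cheaper_option(hourly_rate, hours_used, discount):
    """Return 'committed' or 'on_demand' for the expected monthly usage."""
    if committed_cost(hourly_rate, discount) < on_demand_cost(hourly_rate, hours_used):
        return "committed"
    return "on_demand"
```

With a hypothetical 40% commitment discount, the break-even point is 438 hours a month: a service running 300 hours is cheaper on demand, while one running 600 hours is cheaper on the commitment. Running this comparison per application service, with current prices, is the part worth automating.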
4. Scrutinise and limit data egress
Generally, migrating data into the cloud is free. However, data egress, that is, moving data from the cloud to another area, is almost always chargeable and can be far more expensive. Consequently, in order to bring control over cloud costs and add further levels of security governance, it is important that all data egress is scrutinised regularly.
Key steps here are:
- Discover both in-bound and out-bound application services data communications dependencies. This needs to be an automated process.
- Establish if all data transfers mapped out in the first step are required and remove those links that are unnecessary.
- Analyse total inbound and outbound network volumes to establish data transfer costs.
- Remove or throttle data egress where it is unnecessary, and continue to review on an ongoing basis.
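The volume analysis in these steps might look something like the Python sketch below, which prices each outbound flow and ranks them so the most expensive links are reviewed first. The flow names, volumes and the flat per-GB price are assumptions; providers usually apply tiered and region-specific egress rates.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    source: str
    destination: str
    gb_per_month: float
    leaves_cloud: bool  # True if the destination is outside the provider


def egress_report(flows, price_per_gb):
    """Return (flow, monthly_cost) pairs for outbound flows, costliest first."""
    billable = [(f, f.gb_per_month * price_per_gb) for f in flows if f.leaves_cloud]
    return sorted(billable, key=lambda pair: pair[1], reverse=True)


def total_egress_cost(flows, price_per_gb):
    """Total monthly charge across all flows that leave the cloud."""
    return sum(cost for _, cost in egress_report(flows, price_per_gb))


# Hypothetical discovered flows for one application service.
flows = [
    Flow("web-app", "cloud-db", 900.0, leaves_cloud=False),
    Flow("cloud-db", "on-prem-warehouse", 400.0, leaves_cloud=True),
    Flow("web-app", "partner-api", 100.0, leaves_cloud=True),
]
```

In this sample the internal flow is the largest by volume but costs nothing, while the nightly copy to the on-premise warehouse dominates the egress bill and is the first candidate for removal or throttling.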
5. Reclaim orphaned resources
Although it is quite easy to create resources in a public cloud, tracking them through their lifecycle can be quite tricky, leading to orphaned (or unused) VMs, storage or network resources which continue to be billed for. By fully understanding all the assets related to the service, organisations can bring about good data governance (regulatory requirements such as GDPR or PCI compliance), while at the same time having visibility of all required assets and their status (available/offline) in delivering the service.
Where assets are no longer needed, they can either be stopped (so you don’t pay for them anymore) or reallocated to other services.
To have full visibility of orphaned resources the following steps need to be taken:
- Identify the list of unallocated computing and storage resources.
- Use resource utilisation reports to identify orphaned resource targets (say, all resources running at <10% utilisation).
- Delete or reassign orphaned resources to other application services.
- Regular automated monitoring of unused resources is highly recommended.
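A minimal Python sketch of that check, assuming an inventory where each resource records what it is attached to and its average utilisation. The field names and sample inventory are hypothetical, and the 10% threshold simply mirrors the example figure above.

```python
def find_orphans(resources, threshold=0.10):
    """Return IDs of resources that are unallocated or running below threshold.

    Each resource is a dict with 'id', 'attached_to' (None when unallocated)
    and 'avg_utilisation' as a fraction between 0 and 1.
    """
    orphans = []
    for res in resources:
        unallocated = res.get("attached_to") is None
        idle = res.get("avg_utilisation", 0.0) < threshold
        if unallocated or idle:
            orphans.append(res["id"])
    return orphans


# Hypothetical inventory for illustration.
inventory = [
    {"id": "vm-app-01", "attached_to": "billing-service", "avg_utilisation": 0.55},
    {"id": "vm-test-07", "attached_to": "billing-service", "avg_utilisation": 0.03},
    {"id": "disk-old-12", "attached_to": None, "avg_utilisation": 0.00},
]
```

The output is a list of candidates, not a delete list: a low-utilisation resource may be a standby node, so each target should be confirmed against the service's dependency map before it is stopped or reassigned.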
6. Throttle poorly utilised resources
Another major factor in controlling the costs of running services in a public cloud is managing resources which are used during specific parts of the day and are idle at other times. For some use cases, such as development and test environments, simply turning off the service is not acceptable, as the work in progress will need to persist for the next usage period.
Since public IaaS clouds operate virtual machines, the working state of unused compute resources can be saved and the resource turned off until the next busy period, thereby significantly reducing operational costs.
Throttling underutilised resources involves:
- Regular reporting to identify poorly utilised resources.
- Action plan for throttling/minimising under-utilised resources.
- Verifying that the throttling action plan is effective.
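To put a number on an action plan like this, the Python sketch below estimates the weekly saving from running a resource only during its busy window. The hourly rate, the schedule and the cost of storing the saved state are all assumed values, and the model presumes compute is not billed while the resource is switched off.

```python
HOURS_PER_WEEK = 24 * 7


def weekly_saving(hourly_rate, hours_per_day, days_per_week, state_storage_cost=0.0):
    """Weekly saving from running a resource only in its busy window.

    state_storage_cost is the assumed weekly cost of keeping the saved
    working state while the resource is switched off.
    """
    always_on = hourly_rate * HOURS_PER_WEEK
    scheduled = hourly_rate * hours_per_day * days_per_week + state_storage_cost
    return always_on - scheduled
```

A development server needed 12 hours a day, five days a week, runs for 60 of the 168 hours in a week, so under these assumptions roughly 64% of its compute cost can be avoided. Comparing this estimate with the following month's bill is one way to verify that the plan is effective.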
7. Use the free tools
Major cloud providers offer basic, free, native tools that can help provide some visibility into resource consumption and help with cost monitoring and planning. As part of the general cloud governance maturity journey, these free tools need to be reviewed and augmented to enable full analysis of cloud costs and consumption tracking for entire application service lifecycles across both multi-cloud and legacy infrastructures.
Gartner’s 7 best practice steps are just a starting point. They are relatively simple, small-step activities that not only help reduce costs, but also bring about good IT governance.
Once these steps are in place, organisations can focus on:
- Legacy IT application migration strategy to the cloud, including SaaS, PaaS and IaaS adoption plans.
- Applying further automation for seamless service delivery and operations.
- Full utilisation of multi-cloud architectures to maximise flexibility and cost efficiencies.
Head of Managed Services