Managing Day 2 in a Hybrid Cloud Environment

A smooth experience with operations in Kubernetes, especially at scale, often comes down to managing complexity effectively. As application footprints get larger, the complexity explodes, making it challenging — or impossible — to manage manually. This is true even when operating in a single environment. But many organizations want or need to operate in multiple public clouds and/or on a mix of public cloud and on-premises infrastructure. 

Containers and Kubernetes make multi-cloud and hybrid cloud strategies possible, but they do not make them easy. Every time you add an environment, the complexity increases. This can easily become an unmanageable Day 2 nightmare if organizations don’t put the right tools into place.

Before jumping into the specific Day 2 challenges for multi-cloud set-ups, let’s address why organizations might need to operate in many environments. 

Why we adopt hybrid and multi-cloud

Organizations might need to operate across hybrid or multiple clouds for a number of reasons, including:

  • The ability to fail over to another environment in case of an outage
  • A desire to keep particularly sensitive workloads on-premises while other workloads run in the cloud
  • The ability to leverage the different pricing models at each cloud provider to find the most economical way to run each workload
  • The ability to route users to the cloud provider whose data center is closest to their physical location, since public cloud providers differ in geographical coverage
  • Competitive reasons. Companies like Walmart might not want to store their data with Amazon, whereas Apple might not want to store its data on Azure. If both are your customers, you will need to offer the ability to operate in multiple clouds
  • Avoiding lock-in to a particular cloud provider, which can be important during pricing negotiations. 

These are compelling business reasons, and indeed hybrid and multi-cloud approaches are common. However, they can lead to their own Day 2 challenges, especially if these potential issues aren’t proactively addressed at the design and implementation phase. 

Avoiding snowflakes

The core challenge when running in multiple environments is getting cross-environment consistency. It’s tempting to think that because it’s all Kubernetes, everything will be the same. But above the Kubernetes control plane, there are still environment-specific differences, including:  

  • Kubernetes plugins for networking and storage
  • Add-on services related to security, monitoring and logging
  • Governance policies and best practices
  • Resource management and optimization. 

In addition, each environment will likely have its own dashboard, creating a fragmented experience both for developers and for the teams responsible for operating the systems. The risk is that each environment becomes a snowflake, with no coordinated way to ensure organizational consistency. This increases the risk of misconfigurations, makes it more difficult for platform teams to manage centrally, and steepens the learning curve. As a result, productivity drops and the risk of operational and security incidents increases. It also becomes more difficult to recover from errors, because when every environment is different, each one is more challenging to troubleshoot.
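One way to make snowflakes visible is to compare a small inventory of each environment and flag every setting that differs. The sketch below is a minimal illustration, not any particular tool's method. The environment names, setting keys and values are all hypothetical, and a real inventory would come from each cluster rather than a hard-coded dictionary.

```python
from typing import Dict, Mapping


def find_drift(
    environments: Mapping[str, Mapping[str, str]],
) -> Dict[str, Dict[str, str]]:
    """Return each setting whose value is not identical in every environment.

    A setting that an environment does not define is reported as "<missing>".
    """
    all_keys = set()
    for settings in environments.values():
        all_keys.update(settings)

    drift: Dict[str, Dict[str, str]] = {}
    for key in sorted(all_keys):
        values = {
            env: settings.get(key, "<missing>")
            for env, settings in environments.items()
        }
        # More than one distinct value means the environments disagree.
        if len(set(values.values())) > 1:
            drift[key] = values
    return drift


# Hypothetical inventory; names and versions are illustrative only.
ENVIRONMENTS = {
    "aws-prod": {
        "network_plugin": "cilium",
        "log_agent": "fluent-bit 2.1",
        "pod_security": "restricted",
    },
    "azure-prod": {
        "network_plugin": "cilium",
        "log_agent": "fluent-bit 2.1",
        "pod_security": "baseline",
    },
    "on-prem": {
        "network_plugin": "calico",
        "log_agent": "fluent-bit 2.1",
    },
}

if __name__ == "__main__":
    for setting, values in find_drift(ENVIRONMENTS).items():
        print(setting, values)
```

Not every difference reported this way is a problem. A different network plugin may be a deliberate choice for that environment, while a weaker or missing security setting is more likely to be an oversight worth fixing.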

These Day 2 pitfalls are in addition to the challenges organizations often encounter on Day 2 with Kubernetes, even when running in a single environment. If not addressed, they threaten to derail the entire Kubernetes transition. 

Designing for consistency

The key to success with Kubernetes in a multi-cloud or hybrid cloud set-up is to build consistency into how you manage the application lifecycle, including managing things like configurations and security in a consistent way. There should be consistency throughout the design, implementation and operations phases, because Day 2 problems don’t happen in isolation; they result from oversights during earlier phases in the application lifecycle.

You won’t be able to change the inherent inconsistencies between environments: each will be optimized for its own context, using the storage, networking and operating system that work best there. But organizations can and should add an operational layer that brings management of the different environments into one platform, providing a consistent way to manage deployments and upgrades, to monitor applications and to manage security.
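The idea of an operational layer can be sketched as a common interface that every environment implements, so that a deployment or upgrade is written once and applied everywhere. The example below is a toy model under stated assumptions. `Cluster`, `InMemoryCluster` and `rollout` are hypothetical names invented for illustration, and a real adapter would call each environment's own API instead of updating a dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol, Sequence


class Cluster(Protocol):
    """Hypothetical common interface that each environment adapter implements."""

    name: str

    def deploy(self, app: str, version: str) -> None: ...

    def running_version(self, app: str) -> str: ...


@dataclass
class InMemoryCluster:
    """Stand-in adapter; a real one would talk to that environment's API."""

    name: str
    apps: Dict[str, str] = field(default_factory=dict)

    def deploy(self, app: str, version: str) -> None:
        self.apps[app] = version

    def running_version(self, app: str) -> str:
        return self.apps.get(app, "not deployed")


def rollout(clusters: Sequence[Cluster], app: str, version: str) -> Dict[str, str]:
    """Deploy one version to every cluster and report what each now runs."""
    for cluster in clusters:
        cluster.deploy(app, version)
    return {cluster.name: cluster.running_version(app) for cluster in clusters}


if __name__ == "__main__":
    fleet = [
        InMemoryCluster("aws-prod"),
        InMemoryCluster("on-prem", {"checkout": "1.0"}),
    ]
    print(rollout(fleet, "checkout", "1.1"))
```

The environment-specific details stay inside each adapter, while the people operating the system work with one set of operations regardless of where a cluster runs.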

Using one platform to manage the multiple environments reduces the cognitive load and the learning curve for the platform engineers, while also providing developers with a more seamless experience. The result is not just better operational metrics, but also higher development velocity and better productivity for both development and operations engineers. 

In addition to thinking about bringing control over multiple environments together under one platform, organizations also need to think about how their organizational structure promotes consistency — or not. A structure in which one central team is responsible for managing the Kubernetes infrastructure across environments — with the assistance of a management platform — will result in better organizational consistency and ultimately better productivity, availability and security postures. 

Day 2 Kubernetes is ultimately about taming management complexity. Multiple environments add complexity to the system, and require organizations to proactively think about ways to bring disparate environments under consistent, central control. A platform like Nirmata brings all the information into one dashboard, making it easier to understand and control.
