Successfully managing Kubernetes on Day 2 often boils down to taming complexity. The more complex the environment, the more challenging it is to monitor, update and heal. Even with only one cluster, there are multiple components that need tracking: services, agents, networking and storage. Every part of the cluster has to be monitored for security issues and patched when one is found. Multiply that by the dozens or hundreds of clusters in a typical environment, and it is unreasonable to expect humans to wrap their heads around the many moving parts without the right tools.
There’s good news and bad news for complexity management in Kubernetes. The bad news is that Kubernetes and containers in general create a more complex system than virtual machine-based environments. The good news, though, is that Kubernetes has a powerful declarative API interface that makes it possible to manage all of these components in a consistent way.
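As a minimal sketch of what that declarative model looks like in practice, the manifest below describes a desired state (three replicas of a web workload) and leaves it to Kubernetes controllers to reconcile the cluster toward that state. The names, labels and image version here are illustrative placeholders, not taken from any particular environment.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web            # hypothetical workload name
spec:
  replicas: 3          # desired state: three running copies
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25   # illustrative image and tag
          ports:
            - containerPort: 80

The same pattern applies to networking, storage and policy resources, which is what makes consistent, automated management of many clusters feasible in the first place.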
The role of complexity
In fact, Kubernetes’ complexity isn’t always a bad thing. Kubernetes’ flexibility and extensibility are among its main selling points, but both contribute to complexity. It’s also not necessarily a bad idea to install Kubernetes manually, and even to operate it manually, as part of learning how Kubernetes works.
The problem is that the more complex a manual task is, the more likely that critical steps are overlooked or done incorrectly. Manually installing Kubernetes once is a good learning experience, but a bad way to scale Kubernetes adoption throughout the organization.
Complexity during set-up
If an organization is going to bring up hundreds of clusters a day, whether virtual or physical, there’s absolutely no question that the process needs to be as simple as possible. Taking several hours to manually configure Kubernetes the first couple of times is a very good way to learn how the platform works, but it does not scale.
Not only does a complicated cluster spin-up process slow down development, it also increases the risk of mistakes and oversights. The manual steps needed to launch or operate a cluster should be as simple as possible, ideally with a policy engine like Kyverno in place so that when mistakes inevitably do happen they are automatically fixed. Taming complexity in Kubernetes at an organizational level isn’t just about controlling how one individual manages configurations, but about creating complete organization-wide consistency.
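As one hedged illustration of that kind of guardrail, a Kyverno mutation policy along the lines of the sketch below quietly repairs a common omission (a missing ownership label) instead of rejecting the workload outright. The label key and value are placeholders, not a recommended standard.

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-team-label      # illustrative policy name
spec:
  rules:
    - name: add-team-label-if-missing
      match:
        any:
          - resources:
              kinds:
                - Deployment
      mutate:
        patchStrategicMerge:
          metadata:
            labels:
              # the +() anchor adds the label only when it is absent,
              # so existing values are left untouched
              +(team): "platform"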
This becomes increasingly important at scale. Reducing unnecessary complexity also requires a level of consistency that’s nearly impossible to achieve without organization-wide guardrails and automation tools. If every cluster in the system is different, with different storage types and a different networking set-up, the environment is much more complex to operate. This type of complexity also adds no value to the system and should be reduced whenever possible.
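A validation policy is the natural counterpart to the mutation example above: applied to every cluster, a rule like this sketch blocks drift toward ad hoc storage choices before it happens. The approved storage class names are assumptions for illustration only.

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-storage-classes    # illustrative policy name
spec:
  validationFailureAction: Enforce  # reject non-compliant requests
  rules:
    - name: require-approved-storage-class
      match:
        any:
          - resources:
              kinds:
                - PersistentVolumeClaim
      validate:
        message: "PVCs must use an organization-approved storage class."
        pattern:
          spec:
            # pipe-separated values act as alternatives in a Kyverno pattern;
            # the field must be set to one of the listed classes
            storageClassName: "standard-rwo | fast-ssd"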
A smooth Day 2 experience ultimately relies on a consistent, error-free configuration at the design, development and launch phase. The more manual steps in that process, the more likely problems are to arise on Day 2.
Operational complexity
As long as the system has been set up in a way that minimizes complexity and maximizes consistency, operational complexity should be manageable with the right automation tools. It’s impossible, however, to separate operational complexity from the deployment phase. If organizations wait until an application is in production to consider how a lack of consistency impacts availability and security, they will almost inevitably be unable to run the application reliably, and that’s especially true when the problem is multiplied across the many applications in a typical organization’s portfolio.
Focus on the unpredictable
In complex environments, the challenges that come up are not always predictable, yet operations engineers have limited time and a limited ability to focus on everything at once. No team will ever be able to automate away all of the potential operational challenges in Kubernetes, but by automating the repeatable, predictable parts of the operations story, platform teams can devote more time to the unexpected. In addition, automating as much of the set-up process as possible and ensuring consistent application of best practices reduce the likelihood of unpleasant surprises at runtime.
When automation tools handle the predictable, repeatable parts of both initial configuration and operations, engineers have more time to focus on unexpected challenges as well as on making incremental improvements in availability, performance and security. They are able to track how the application is performing not just technically, but also against the organization’s business objectives. Those are tasks an automation tool can’t do.
Conclusion
Kubernetes is complex, but the experience for developers and cluster administrators using Kubernetes doesn’t have to be. Organizations should focus on minimizing Kubernetes’ inherent complexity by ensuring consistent configurations and consistent application design across clusters, while also using tools that simplify the developer and operator experience.
Nirmata helps organizations tame complexity at both the deployment and operations stage, so that Day 2 operations are as simple as possible. See how it works here.