Here’s the reality for any platform team: no single person can be an expert in all of it. The constant demand for deep, specialized knowledge across dozens of systems creates bottlenecks and pushes teams into firefighting. This specialization gap is the raison d’être for our field, and there needs to be a way to bridge it.
The Power of Anthropic SKILLs – Codifying Expertise
We recently found a potential answer in Anthropic SKILLs, and a production MongoDB alert proved their value.
Simply put, a SKILL, a concept introduced by Anthropic, is a way to give an LLM reliable, specialized capabilities that go beyond its general training data. You’re essentially providing the AI with a custom, specialized tool and clear instructions on how and when to use it. It’s the mechanism that turns a general-purpose LLM into a highly effective specialist.
At Nirmata, we recognized this potential and immediately integrated support for it into our Nirmata AI platform engineering assistant. Our assistant came pre-loaded with native SKILLs focused on our domain, such as policy conversion (OPA to Kyverno, and between Kyverno versions), Kyverno policy generation, and Chainsaw tests. But the real game-changer is the agent’s ability to discover and learn new SKILLs.
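To make "discovering" a SKILL concrete, here is a minimal Python sketch of how an agent might scan a directory for SKILL.md files and read the `name` and `description` front matter that tells it when a SKILL applies. This is an illustration only. It is not how the Nirmata assistant is implemented, the directory layout and the example skill text are assumptions, and the parser handles only flat `key: value` lines.

```python
from pathlib import Path
import tempfile


def read_front_matter(skill_md: Path) -> dict:
    """Parse flat `key: value` front matter delimited by `---` lines."""
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the front matter block
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def discover_skills(root: Path) -> dict:
    """Map skill name to description for every <root>/<skill>/SKILL.md."""
    skills = {}
    for skill_md in sorted(root.glob("*/SKILL.md")):
        meta = read_front_matter(skill_md)
        if "name" in meta and "description" in meta:
            skills[meta["name"]] = meta["description"]
    return skills


def demo() -> dict:
    """Create a throwaway skill directory (hypothetical content) and scan it."""
    with tempfile.TemporaryDirectory() as tmp:
        skill_dir = Path(tmp) / "mongodb-memory-analyzer"
        skill_dir.mkdir()
        (skill_dir / "SKILL.md").write_text(
            "---\n"
            "name: mongodb-memory-analyzer\n"
            "description: Analyze MongoDB memory usage and WiredTiger cache settings.\n"
            "---\n"
            "# Instructions\n"
            "Run the bundled script, then summarize the findings.\n",
            encoding="utf-8",
        )
        return discover_skills(Path(tmp))


if __name__ == "__main__":
    print(demo())
```

The point of the design is that only the short description needs to be read up front; the full instructions and scripts are consulted once the agent decides the SKILL is relevant.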
Real World Test – The MongoDB Firefight
The opportunity to test this capability came quickly. We started getting alerts about high memory usage in our production MongoDB cluster.
The next step had to be a deep memory usage analysis. I knew a fix would require a significant amount of time researching database internals, even with help from a standard LLM.
Then came a lightbulb moment: why not translate this effort into a reusable Anthropic SKILL? Codifying the solution would mean the rest of the team never has to repeat the research.
The Process – From Troubleshooting to Reusable SKILL
- The Knowledge Base – Use an LLM to generate a very detailed, structured MongoDB memory troubleshooting guide.
- SKILL Generation – Feed the troubleshooting guide to Claude Code to write the required SKILL.md file and generate the supporting Python scripts.
- Deployment & Testing – Copy the generated code into the discovery directory used by the Nirmata assistant. It took about three iterations to refine the logic and make sure it worked as intended.
- Execution & Results – Execute the new MongoDB Memory Analyzer SKILL on the production cluster. The SKILL provided immediate insight, pointing directly to a sub-optimal WiredTiger cache configuration and specific indexes that needed attention.
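As a rough idea of what one of the supporting Python scripts could look like, the sketch below inspects the WiredTiger cache section of a `serverStatus` document. It is not the generated script. The field names are the ones MongoDB reports under `wiredTiger.cache`, but the thresholds and the sample numbers are invented for illustration and are not the values from our cluster. In practice the input would come from running the `serverStatus` command against the database.

```python
GIB = 1024 ** 3


def analyze_wiredtiger_cache(server_status: dict,
                             low_fill: float = 0.50,
                             high_fill: float = 0.95) -> dict:
    """Summarize WiredTiger cache usage from a serverStatus-style dict.

    low_fill and high_fill are illustrative thresholds, not MongoDB guidance.
    """
    cache = server_status["wiredTiger"]["cache"]
    configured = cache["maximum bytes configured"]
    used = cache["bytes currently in the cache"]
    dirty = cache["tracked dirty bytes in the cache"]

    fill_ratio = used / configured
    findings = []
    if fill_ratio < low_fill:
        findings.append("cache may be over-provisioned for the working set")
    elif fill_ratio > high_fill:
        findings.append("cache is under pressure; expect heavy eviction")

    return {
        "configured_gib": configured / GIB,
        "used_gib": used / GIB,
        "fill_ratio": round(fill_ratio, 2),
        "dirty_ratio": round(dirty / configured, 3),
        "findings": findings,
    }


# Synthetic sample input (made-up numbers, same shape as serverStatus output).
SAMPLE_STATUS = {
    "wiredTiger": {
        "cache": {
            "maximum bytes configured": 32 * GIB,
            "bytes currently in the cache": 8 * GIB,
            "tracked dirty bytes in the cache": GIB // 2,
        }
    }
}

if __name__ == "__main__":
    for key, value in analyze_wiredtiger_cache(SAMPLE_STATUS).items():
        print(f"{key}: {value}")
```

Keeping the analysis a pure function over a dictionary makes it easy to test against saved `serverStatus` snapshots before pointing it at production.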
The recommended solution will allow us to safely save about 20 GB of memory. This enables us to switch to smaller AWS EC2 instances, representing savings of around $1,400/month.
The Unexpected Policy Guardrail
This is where the story shifts from fixing a problem to preventing future ones.
Because this specialized SKILL was running within the Nirmata AI assistant, something unexpected happened. After we applied the fix and confirmed that memory usage had dropped, the agent proactively offered to install policies that check MongoDB configuration against best practices, to prevent the issue from recurring.
We were so focused on putting out the immediate fire that we hadn’t thought about the necessary guardrails. But the agent, empowered by SKILLs and Policy-as-Code, did.
Here is the key section of the Kyverno ClusterPolicy the agent generated:
The Kyverno Policy uses complex CEL logic to automatically enforce the memory best practice discovered by the SKILL. This is the ultimate feedback loop: Reactive SKILL → Proactive Policy.
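For readers who want a feel for the kind of rule such a guardrail can encode, here is a minimal Python sketch of one plausible check. It is not the agent's policy and not CEL. It assumes the rule is "the explicitly configured WiredTiger cache must not exceed what MongoDB would pick by default for the container's memory limit", using MongoDB's documented default of the larger of 50% of (RAM minus 1 GB) or 256 MB. The function names and the choice of rule are assumptions made for illustration; `--wiredTigerCacheSizeGB` is the real mongod flag.

```python
def parse_memory_gib(quantity: str) -> float:
    """Convert a Kubernetes memory quantity to GiB (only Gi and Mi handled)."""
    units = {"Gi": 1.0, "Mi": 1.0 / 1024}
    for suffix, factor in units.items():
        if quantity.endswith(suffix):
            return float(quantity[: -len(suffix)]) * factor
    raise ValueError(f"unsupported memory quantity: {quantity!r}")


def cache_size_from_args(args: list):
    """Return the --wiredTigerCacheSizeGB value from mongod args, or None."""
    for i, arg in enumerate(args):
        if arg.startswith("--wiredTigerCacheSizeGB="):
            return float(arg.split("=", 1)[1])
        if arg == "--wiredTigerCacheSizeGB" and i + 1 < len(args):
            return float(args[i + 1])
    return None


def check_container(args: list, memory_limit: str) -> list:
    """Return a list of violation messages; an empty list means compliant."""
    limit_gib = parse_memory_gib(memory_limit)
    cache_gib = cache_size_from_args(args)
    if cache_gib is None:
        return ["--wiredTigerCacheSizeGB is not set explicitly"]
    # Assumed ceiling: MongoDB's default sizing applied to the memory limit.
    ceiling = max(0.5 * (limit_gib - 1), 0.25)
    if cache_gib > ceiling:
        return [
            f"cache size {cache_gib} GiB exceeds {ceiling} GiB "
            f"for memory limit {memory_limit}"
        ]
    return []


if __name__ == "__main__":
    print(check_container(["mongod", "--wiredTigerCacheSizeGB=3"], "8Gi"))
    print(check_container(["mongod", "--wiredTigerCacheSizeGB", "6"], "8Gi"))
```

A policy engine evaluates the same kind of predicate at admission time, which is what turns a one-off finding into a standing guardrail.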
The Takeaway – The Missing Link in Platform Engineering
The Anthropic SKILL technology is more than just a new feature; it’s potentially the missing link that platform engineering has needed for years.
- The Specialization Multiplier – SKILLs make it possible to write, share, and reuse highly specific, actionable knowledge that currently lives only in the heads of a few engineers.
- Capturing Organizational Wisdom – There is now a clean, versionable mechanism to capture the specialized skills that LLMs don’t have, like troubleshooting esoteric Kyverno issues or performing detailed performance analysis in all sorts of environments, and share it across the entire organization.
- Bridging the Heterogeneity Gap – The SKILL approach is an abstraction layer that lets a team handle the many different systems it manages (Kubernetes, MongoDB, cloud providers, etc.) without having to hire siloed specialists for every single one.
This could be the future of platform engineering – a system that empowers generalists with expert knowledge, leading to tangible cost savings, better stability, and a truly proactive posture.



