This is the second part (here is part 1) of our intern’s journey into Kubernetes, and it highlights Aastha’s project and experience at Nirmata.
Thank you, Aastha, for your contribution. We hope to see many amazing things from you in the future!
Leader Election Using Fabric8 Library
I started off by studying examples of leader election in the Fabric8 library. Having completed the first exercise of leader election which was built upon Curator, I had a proper understanding of the Curator recipe for leader election and studied the Fabric8 library with the intention of mapping the existing Curator logic to Fabric8’s code base.
Next, it was time to delve into Nirmata’s Environments service which uses leader election in the UsersChangeConsumer class and replace the existing implementation using what I had learned from the Fabric8 leader election examples. In order to understand the expected behavior, Yun taught me how to use the devtest environment to observe the results of the UsersChangeConsumer class as it currently is in the Nirmata service. Each time I added a user via the Nirmata interface after scaling the environments pod to two replicas, one pod would print a log saying that it was handling the users notification and the other would ignore the notification, thus demonstrating a successful leader election where a single instance is responsible for handling or executing the UsersChangeConsumer logic.
Testing my own code on a new GitHub branch was something we had to do in class, but being able to actually apply that skill on the Nirmata Bitbucket was really exciting for me. When an employee sets aside time in a meeting to teach you something, I would recommend taking notes or even recording the meeting for future reference. This is one huge advantage of the remote environment, where it’s easy to start a Zoom recording that you can revisit. If at any point I was stuck, our mentors made us feel very comfortable with asking questions. We could message any of them about our questions without hesitation, include our blocking point in our daily stand-up message, or bring it up in our daily check-in meetings and they would be more than happy to help us. This made my learning experience at Nirmata a lot stronger!
Looking deeper into the code for the leader election project, there were a few key changes I made to the UsersChangeConsumer class for the Fabric8 implementation.
- The constructor creates a variable called _executorService using Executors.newSingleThreadExecutor() in order to allocate a separate thread for the leader election. The constructor also calls the startLeaderElection() method.
- Added NAMESPACE and NAME private static final variables to create the lock. I used the namespace of the service pod for NAMESPACE and “userschangeconsumer” for NAME. If the program is working correctly, the resulting Lease object can be inspected with kubectl get Lease, which shows who is holding the lock.
- Changed the startLeaderElection() method to print the lockIdentity so we can cross-reference it with the holder of the Lease object and confirm which instance has leadership. Next, I used the executor service to submit a Runnable for the leader election, with the lock identity as a parameter. This leader task is responsible for configuring leader election using a DefaultKubernetesClient.
- Created a private instance variable isLeader which is updated in the callbacks: it is set to true in onStartLeading and to false in onStopLeading. This isLeader flag is used in handleNotification() in place of Curator’s leaderSelector.hasLeadership method and determines whether the instance handles user notifications (as the leader) or ignores them (as a non-leader).
- Reused the LeaderExecutor takeLeadership debug messages, “Taking leadership…” and “Giving up leadership…”, in the callbacks of the leader election configuration in the leader() method. There is no LeaderExecutor class in my implementation.
- In the stop() method, shut down _executorService if it is not null.
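To make the shape of these changes easier to picture, here is a minimal sketch using only the Java standard library. It is not the actual Nirmata code: the class name UsersChangeConsumerSketch and the Election interface are hypothetical stand-ins for the real class and for the Fabric8 election loop, and the AtomicBoolean is my own choice for sharing the flag safely between threads. In the real implementation, the two Runnables would presumably be handed to Fabric8's leader callbacks instead.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the consumer described above; not the Nirmata class.
class UsersChangeConsumerSketch {

    // Hypothetical abstraction over the election loop. It receives the two
    // callbacks and is expected to invoke them as leadership changes.
    interface Election {
        void run(Runnable onStartLeading, Runnable onStopLeading);
    }

    // Separate thread dedicated to the leader election.
    private final ExecutorService _executorService = Executors.newSingleThreadExecutor();

    // Written on the election thread, read by handleNotification().
    private final AtomicBoolean isLeader = new AtomicBoolean(false);

    UsersChangeConsumerSketch(Election election) {
        startLeaderElection(election);
    }

    private void startLeaderElection(Election election) {
        Runnable onStartLeading = () -> {
            System.out.println("Taking leadership...");
            isLeader.set(true);
        };
        Runnable onStopLeading = () -> {
            System.out.println("Giving up leadership...");
            isLeader.set(false);
        };
        Runnable electionTask = () -> election.run(onStartLeading, onStopLeading);
        _executorService.submit(electionTask);
    }

    // Returns true if this instance handled the notification, false if it
    // ignored it because it is not the leader.
    boolean handleNotification(String notification) {
        if (!isLeader.get()) {
            return false;
        }
        System.out.println("Handling users notification: " + notification);
        return true;
    }

    void stop() {
        if (_executorService != null) {
            _executorService.shutdownNow();
        }
    }

    public static void main(String[] args) {
        CompletableFuture<Void> elected = new CompletableFuture<>();
        // A fake election that grants leadership immediately.
        UsersChangeConsumerSketch consumer = new UsersChangeConsumerSketch((onStart, onStop) -> {
            onStart.run();
            elected.complete(null);
        });
        elected.join(); // wait until the election thread has run the callback
        consumer.handleNotification("user added");
        consumer.stop();
    }
}
```

The point of the sketch is that handleNotification() never talks to the election machinery directly. It only reads the flag that the callbacks maintain, which is what makes it possible to swap out the underlying leader election library without touching the notification logic.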
Finally, it was time to test the new implementation. These were the three tests I focused on:
- Scale environments to 2 replicas and confirm one pod is the leader that executes the necessary task
- Delete the leader pod and confirm the other pod assumes leadership
- Delete the lease lock object and confirm that another pod becomes the holder of the lock instantly
After several trials with various configurations, the one that matched the expected behavior used a lease duration of 8 seconds, a renewal deadline of 5 seconds, and a retry period of 2 seconds (written as the long values 8L, 5L, and 2L in the code).
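As a sanity check on those numbers, the three values should be ordered so that the lease outlives the renewal deadline and the renewal deadline leaves room for at least one retry. The sketch below expresses that ordering with java.time.Duration. The class name LeaseTimingSketch is hypothetical, and the 1.2 jitter factor is borrowed from the rules in the Kubernetes client-go leader election package; treating Fabric8 as following the same rules is an assumption on my part.

```java
import java.time.Duration;

// Hypothetical helper for checking a leader election timing configuration.
class LeaseTimingSketch {

    // Jitter factor used by Kubernetes client-go; assumed here, Fabric8 may differ.
    static final double JITTER_FACTOR = 1.2;

    // True when leaseDuration > renewDeadline and
    // renewDeadline > retryPeriod * JITTER_FACTOR.
    static boolean isConsistent(Duration leaseDuration, Duration renewDeadline, Duration retryPeriod) {
        long jitteredRetryMillis = (long) (retryPeriod.toMillis() * JITTER_FACTOR);
        return leaseDuration.compareTo(renewDeadline) > 0
                && renewDeadline.toMillis() > jitteredRetryMillis;
    }

    public static void main(String[] args) {
        // The values from the trials described above.
        Duration leaseDuration = Duration.ofSeconds(8L);
        Duration renewDeadline = Duration.ofSeconds(5L);
        Duration retryPeriod = Duration.ofSeconds(2L);
        System.out.println("Consistent: " + isConsistent(leaseDuration, renewDeadline, retryPeriod));
    }
}
```

Under these assumed rules, the 8/5/2 configuration passes, while swapping the lease duration and the renewal deadline would not.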
Ultimately, this was a proof of concept that Fabric8 can be used to replace the existing Curator-based leader election implementation, thus using Kubernetes API machinery and decoupling Nirmata services from ZooKeeper.
The distributed locks, leader election, and workflow prototypes demonstrated that the Nirmata platform can depend on Kubernetes machinery as much as possible, eliminating the need for ZooKeeper and Curator. For leader election, although we replaced the UsersChangeConsumer class as a proof of concept, the next step would be to create a plan to replace leader election throughout all of the different Nirmata services. The same goes for distributed locking: although we successfully integrated MongoDB distributed locks into one service, more work must be done to eventually apply that change across the Nirmata platform. As for distributed workflows, our prototype applications serve as a basis for a workflow library that may also eventually support functionalities such as task dependency, automatic cleanup of tasks, and error handling upon task failures.
What We Learned
This internship, which placed us in the middle of a fast-paced startup environment surrounded by cutting-edge technologies like Kubernetes, Docker, and MongoDB, had so many new experiences and valuable insights in store for us. One amazing aspect of the internship was the exposure it gave us to these tools, many of which we had never used before, especially not in our college classes. At a company that moves quickly with such a variety of technologies, we were constantly learning new things, rapidly changing focus from one project to the next. It was really cool to see the inner workings of distributed systems at such a broad scale, on so many different levels!
We learned firsthand about what it’s like to work at a tech startup. The weekly meetings gave us a bigger picture of what other employees at Nirmata were working on, which is helpful as we start to think about the roles we would like to take on as we enter the industry. We saw how meetings were structured, and how engineers communicate about and work together to solve issues. The concept of standup was also a great tool for productivity, as team members can easily gauge what others are working on and make sure they’re making progress on specific tasks. This was especially important in the fast-paced environment of a startup so that employees are prioritizing the tasks that are the most important. On a similar note, we learned about the importance of finding a balance between handling issues on your own and asking for help when you are blocked. Most likely, another employee has faced a similar issue or can point you in the right direction, so after you have spent an adequate amount of time looking for solutions, there should be no hesitation in asking for some extra guidance. We’re especially thankful to have had mentors like Yun and Shuting, who were always happy to help us with these blocking issues.
Ultimately, we are extremely grateful for the breadth of knowledge and skills we have added to our toolbox through our summer internship at Nirmata. We are excited to apply what we have learned as we continue on our journeys as software engineers.