Understanding Kubernetes’ tools/cache package: part 9—Kubernetes controllers in Java!

In part 8 of this series (you can start at the beginning if you want), we looked at some of the foundational principles and stakes in the ground we’re going to establish and plant in the ground as part of our journey towards making an idiomatic Kubernetes controller framework in Java and CDI 2.0.

Let’s look at an idiomatic implementation of these concepts centered around the fabric8 Kubernetes client implementation of Kubernetes watch functionality, and (eventually) targeted at a CDI 2.0 environment.

A Reflector‘s main purpose is to reflect a certain portion of the state of the Kubernetes universe into a cache of some kind.  It needs to do this in a fault-tolerant and thread-safe way, taking care not to block the thread it’s using to talk to Kubernetes.  As we’ve seen, this involves starting by listing objects meeting certain criteria and then setting up a watch for those objects and reacting to the logical events that occur to them by offloading them quickly into some kind of cache and scampering back to Kubernetes for more.  Then periodically a synchronization of downstream consumers with known state is performed, making sure that downstream consumers never miss any events concerning Kubernetes resources they care about, even if the listing and watching machinery suffers a hiccup.

The Go code around all this is relatively complicated.  Fortunately, though, since we’re using the excellent fabric8 Kubernetes client, we can take advantage of the fact that it already has sophisticated watch functionality that, among other things, handles reconnections for us.  So we don’t have to write code around all that: we just need to set up the listing code and provide a Watcher to be notified of watch events.  That will take care of listing and watching behavior and reduce a lot of code we have to write.  Hooray for laziness.

We can also take advantage of Java’s generics to ensure that we strongly type what objects we support (Pods, Deployments, ConfigMaps, and so on).  We can ensure that type information makes it all the way into the internals of our implementation.  The Go code needs to check this at runtime since Go has no support for generics.  That should simplify our code a little bit.  Hooray for more laziness and type safety.

A Reflector, as we said, needs to update a cache in a very specific way.  The Go code models the cache as a store of objects.  If we dig in a little more, though, it turns out that even in the Go code the Reflector only needs the mutating operations to be exposed, not the getters and listers.  Specifically, a reflector will look to add, update and delete objects in the store, and will also look to resynchronize state and to do a full replace of objects in one atomic shot.  But that’s it.

We can reduce this surface area in our idiomatic Java Reflector implementation even more by recognizing that if we start out by modeling the cache as a cache of logical events rather than a cache of arbitrary objects we only need to support one addition method that takes an event as a payload.  Then we have to support a total of three operations:

  • // Make an event from the supplied raw materials; store it
    public void add(final Object source, final Event.Type eventType, final T resource)
  • // Do a full replace of our state
    public void replace(final Collection<? extends T> objects, final Object resourceVersion)
  • // When told to synchronize, look at our known objects and
    // fire synchronization events downstream for each of them
    public void synchronize()

So from the standpoint of what a Reflector needs to do its job, we know what an event cache must look like at a minimum.  In fact, it might look like this EventCache interface from my microbean-kubernetes-controller project on Github!😀

Then our Reflector might look like my Reflector class in the same project.

We can see that it takes in something that is both Listable (and when it’s listed returns a KubernetesResourceList descendant) and VersionWatchable—which can be produced from a KubernetesClient.  Objects satisfying these requirements are roughly equivalent to the Go code’s ListerWatcher with, as we’ve noted, the added benefit that reconnections are automatically handled, and if you can express them properly, then filtering is already handled.  Hooray for even more laziness.

The Reflector constructor also takes in the writable “side” of a cache of Events.  This is where it will “mirror” the events it receives from watching the Kubernetes API server.  This corresponds roughly with the Go code’s notion of a Store, and more specifically with the notion of a DeltaFIFO, but exposes only the mutating methods actually needed by a writeable store of events, not any extras related to reading or listing.  Also, where a DeltaFIFO had to turn a generic “add an arbitrary object” function into an “add an event recording an addition” invocation, we start with the notion that everything being stored is an event, so it’s one fewer step we have to take and one fewer transformation we have to make.

So, implement an EventCache, call kubernetesClient.configMaps() or something equivalent, pass them in to a new Reflector, start it, and you can begin filling your cache with events!

Of course, you’ll want to drain that cache, and also allow it to periodically inform its downstream customers of its state for synchronization purposes.  And a good implementation will sort incoming events into queues, where each queue is effectively the events that have occurred to a particular keyed Kubernetes resource over time.

Fortunately, there is a good implementation.  The stock implementation of this kind of EventCache is EventQueueCollection, which models all this and another part of the Go Store type.  EventQueueCollection lets you pass in a Consumer and start it siphoning off events in a careful, thread-safe manner, much like the Pop() function in the Go code.  Events are stored in queues per object, keyed by an overridable method that determines how keys for a given Kubernetes resource are composed.

The stock Consumer normally used here, an implementation of ResourceTrackingEventQueueConsumer, is a simple Consumer that takes a Map of objects indexed by keys at construction time.  These “known objects” are updated as events are siphoned off the upstream EventQueueCollection, and then those individual events are passed on to the abstract accept() method.  Once an event arrives here, it can be processed directly, or rebroadcast, or whatever; all housekeeping has been taken care of.  We’ll come back to this.

Back to those known objects.  EventQueueCollection takes in the same kind of Map, and this is no coincidence.  Your consumer of events and your EventQueueCollection should share this cache: when your EventQueueCollection gets instructed to synchronize(), it will send synchronization events to any downstream consumers…

…like the ResourceTrackingEventQueueConsumer we were just talking about, who will update that very same Map of known objects appropriately.

This is all tedious boilerplate to set up, so just like the Go code, we put all of this behind a Controller façade.  A Controller marries a Reflector and an EventQueueCollection together with a simple Consumer of events, and arranges for them all to dance together.  If you start the Controller, it will start the Consumer, then start the Reflector, and you will start seeing events flow through the pinball machine.  If you close the Controller, it will close the Reflector, then close the Consumer and you will stop seeing events.  All overridable methods in Reflector andResourceTrackingEventQueueConsumer are “forwarded” into the Controller class, so you can customize it in one place.

So: create a Controller, give it an expression of the “real” Kubernetes resources you want to watch and list, hand it a consumer of mirrored versions of those events, and a Map representing the known state of the world, and you can now process Kubernetes resources just like Kubernetes controllers written in Go.

This is, of course, still too much boilerplate, since in all of this we still haven’t talked about business logic.

Therefore, in a CDI world, this is all a candidate for parking behind a portable extension.  That will be the topic of the next post.  Till then, have fun looking at the microbean-kubernetes-controller project on Github as it develops further.


2 thoughts on “Understanding Kubernetes’ tools/cache package: part 9—Kubernetes controllers in Java!

  1. Pingback: Understanding Kubernetes’ tools/cache package: part 10—Designing Kubernetes controllers in CDI 2.0 | Blame Laird

  2. Pingback: Understanding Kubernetes’ tools/cache package: part 8 | Blame Laird

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s