In the previous post in this long-running series, we looked at some of the foundations underlying Kubernetes controllers, and started looking into the concepts behind the tools/cache package, specifically ListerWatcher, Store and Reflector. In this post we’ll look at the actual concept of a Controller.
So far, I’ve been careful about capitalization. I’ve written “Kubernetes controller”, never “Kubernetes Controller”, whether as a capitalized term or as a type name. That’s been on purpose, in part because the tools/cache package has a Controller type, which logically sits in front of Reflector. You can see for yourself:
    // Controller is a generic controller framework.
    type controller struct {
        config         Config
        reflector      *Reflector
        reflectorMutex sync.RWMutex
        clock          clock.Clock
    }

    type Controller interface {
        Run(stopCh <-chan struct{})
        HasSynced() bool
        LastSyncResourceVersion() string
    }
Note, interestingly, that strictly speaking all you have to do to implement the Controller type is to supply three functions: a Run() function (for other Java programmers, the stopCh “stop channel” is Go’s way of (essentially) allowing interruption); a HasSynced() function that returns true if synchronization has been accomplished (we’ll look into what that means later); and a LastSyncResourceVersion() function, which returns the resourceVersion of the Kubernetes list of resources being watched and reflected.
An interesting point here is that whereas Reflector was a generic interface completely decoupled from Kubernetes, this interface is conceptually coupled to Kubernetes (note the inclusion of the term Resource and the concept of a resource version, both of which are concepts from the Kubernetes ontology). This observation can help us later on with refining our Java model.
Next, look at the controller struct, which is the state-bearing portion of a Controller implementation. It takes in a Config representing its configuration, and has a “slot” for a Reflector, along with bits related to testing and thread safety that are unimportant to the end user.
So what does a Controller do, exactly, that a Reflector doesn’t already do?
A clue to the answer lies in the Config type, which is used as an implementation detail by only one particular implementation of the Controller type:
    // Config contains all the settings for a Controller.
    type Config struct {
        // The queue for your objects; either a FIFO or
        // a DeltaFIFO. Your Process() function should accept
        // the output of this Queue's Pop() method.
        Queue

        // Something that can list and watch your objects.
        ListerWatcher

        // Something that can process your objects.
        Process ProcessFunc

        // The type of your objects.
        ObjectType runtime.Object

        // Reprocess everything at least this often.
        // Note that if it takes longer for you to clear the queue than this
        // period, you will end up processing items in the order determined
        // by FIFO.Replace(). Currently, this is random. If this is a
        // problem, we can change that replacement policy to append new
        // things to the end of the queue instead of replacing the entire
        // queue.
        FullResyncPeriod time.Duration

        // ShouldResync, if specified, is invoked when the controller's reflector determines the next
        // periodic sync should occur. If this returns true, it means the reflector should proceed with
        // the resync.
        ShouldResync ShouldResyncFunc

        // If true, when Process() returns an error, re-enqueue the object.
        // TODO: add interface to let you inject a delay/backoff or drop
        // the object completely if desired. Pass the object in
        // question to this interface as a parameter.
        RetryOnError bool
    }

    // ShouldResyncFunc is a type of function that indicates if a reflector should perform a
    // resync or not. It can be used by a shared informer to support multiple event handlers with custom
    // resync periods.
    type ShouldResyncFunc func() bool

    // ProcessFunc processes a single object.
    type ProcessFunc func(obj interface{}) error
(As you read the excerpt above, bear in mind that it is in the controller.go file, but nevertheless contains lots of documentation forward references to types and concepts from other files we haven’t encountered yet.)
So not all Controller implementations have to use this. In fact, Controller is completely undocumented! But realistically, the only Controller implementation that matters, the one returned by the tersely-named New() function and backed by a controller struct, does use it, so we’d better understand it thoroughly.
The first thing to notice is that a Config struct contains a Queue. We can track down the definition for Queue in fifo.go:
    // Queue is exactly like a Store, but has a Pop() method too.
    type Queue interface {
        Store

        // Pop blocks until it has something to process.
        // It returns the object that was process and the result of processing.
        // The PopProcessFunc may return an ErrRequeue{...} to indicate the item
        // should be requeued before releasing the lock on the queue.
        Pop(PopProcessFunc) (interface{}, error)

        // AddIfNotPresent adds a value previously
        // returned by Pop back into the queue as long
        // as nothing else (presumably more recent)
        // has since been added.
        AddIfNotPresent(interface{}) error

        // Return true if the first batch of items has been popped
        HasSynced() bool

        // Close queue
        Close()
    }
So loosely speaking any Queue implementation must also satisfy the Store contract. In Java terms, this means a hypothetical Queue interface would extend Store. Let’s file that away for later.
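The “Queue extends Store” relationship is expressed in Go by embedding one interface in another. The sketch below uses deliberately trimmed, hypothetical versions of Store and Queue (the real ones have more methods, and the real Pop blocks and takes a PopProcessFunc); sliceQueue and countItems are invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a trimmed, hypothetical stand-in for the real Store interface.
type Store interface {
	Add(obj interface{}) error
	List() []interface{}
}

// Queue embeds Store, so every Queue is also usable as a Store.
type Queue interface {
	Store
	Pop(process func(obj interface{}) error) (interface{}, error)
}

// sliceQueue is a hypothetical, non-thread-safe Queue backed by a slice.
type sliceQueue struct {
	items []interface{}
}

func (q *sliceQueue) Add(obj interface{}) error {
	q.items = append(q.items, obj)
	return nil
}

// List returns a copy so callers cannot mutate the queue's contents.
func (q *sliceQueue) List() []interface{} {
	return append([]interface{}(nil), q.items...)
}

// Pop removes the oldest item and hands it to process. Unlike the real
// Pop, this sketch returns an error instead of blocking when empty.
func (q *sliceQueue) Pop(process func(obj interface{}) error) (interface{}, error) {
	if len(q.items) == 0 {
		return nil, errors.New("queue is empty")
	}
	obj := q.items[0]
	q.items = q.items[1:]
	return obj, process(obj)
}

// countItems accepts any Store; a Queue can be passed without conversion.
func countItems(s Store) int {
	return len(s.List())
}

func main() {
	var q Queue = &sliceQueue{}
	q.Add("pod-a")
	q.Add("pod-b")
	fmt.Println(countItems(q)) // prints "2"

	obj, err := q.Pop(func(interface{}) error { return nil })
	fmt.Println(obj, err, countItems(q)) // prints "pod-a <nil> 1"
}
```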
We can also see some concept leakage here: recall that Controller implementations must have a HasSynced function, but its purpose and reason for being are undocumented. When we look back at the controller-struct-backed implementation of Controller (one possible implementation of the Controller type), we can see that its HasSynced function merely delegates to that of the Queue contained by its Config. So there is a tacit assumption that a Controller implementation will most likely be backed by a Queue, since that would be the easiest way to implement the HasSynced function, though this is not strictly speaking required. This also serves as the only documentation we’re going to get about what that function is supposed to do: Return true if the first batch of items has been popped.
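One way to read “the first batch of items has been popped” is sketched below. This is an assumption-laden toy, not the tools/cache code: syncQueue, its fields, and this controller type are hypothetical, and the bookkeeping is only one plausible way to track the initial batch. It does show the delegation described above, with the controller answering HasSynced by asking its queue.

```go
package main

import "fmt"

// syncQueue is a hypothetical, simplified queue that tracks only what is
// needed to answer HasSynced. It is not thread-safe.
type syncQueue struct {
	items          []string
	populated      bool // true once the first batch has been delivered
	initialPending int  // items from the first batch not yet popped
}

// Replace stands in for the initial list delivered by a Reflector.
func (q *syncQueue) Replace(batch []string) {
	q.items = append([]string(nil), batch...)
	if !q.populated {
		q.populated = true
		q.initialPending = len(batch)
	}
}

// Pop removes the oldest item; ok is false when the queue is empty.
func (q *syncQueue) Pop() (item string, ok bool) {
	if len(q.items) == 0 {
		return "", false
	}
	item, q.items = q.items[0], q.items[1:]
	if q.initialPending > 0 {
		q.initialPending--
	}
	return item, true
}

// HasSynced reports whether the first batch of items has been popped.
func (q *syncQueue) HasSynced() bool {
	return q.populated && q.initialPending == 0
}

// controller is a hypothetical stand-in that delegates to its queue.
type controller struct {
	queue *syncQueue
}

func (c *controller) HasSynced() bool {
	return c.queue.HasSynced()
}

func main() {
	c := &controller{queue: &syncQueue{}}
	fmt.Println(c.HasSynced()) // prints "false": nothing delivered yet

	c.queue.Replace([]string{"a", "b"})
	fmt.Println(c.HasSynced()) // prints "false": first batch not yet popped

	c.queue.Pop()
	c.queue.Pop()
	fmt.Println(c.HasSynced()) // prints "true"
}
```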
Back to the Config. It also contains a ListerWatcher. Hey! We’ve seen one of these before. We had realized that it is a core component of a Reflector. So why doesn’t a Config merely have a Reflector? Why is encapsulation broken here? Isn’t ListerWatcher basically an implementation detail of a Reflector? Yes, and there doesn’t seem to be a good reason. We can tell from some source code later on that when the Run function of the controller-struct-backed implementation of Controller is called (again, only one possible implementation of such a function), it creates a Reflector just-in-time using the ListerWatcher housed by the Config. Why the Reflector isn’t passed in as part of the Config is an open question. At any rate, logically speaking, part of a controller-struct-backed implementation of the Controller interface is a Reflector.
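The just-in-time construction described above can be sketched as follows. Every type here is a hypothetical, heavily simplified stand-in that merely borrows the tools/cache names: this ListerWatcher only lists, this Reflector only copies a list into a slice, and the Sink field and ListOnce method do not exist in the real package.

```go
package main

import "fmt"

// ListerWatcher is a hypothetical, list-only stand-in for the real interface.
type ListerWatcher interface {
	List() []string
}

// staticLister is a made-up ListerWatcher that returns a fixed list.
type staticLister []string

func (s staticLister) List() []string { return s }

// Reflector here is a toy: it copies whatever its ListerWatcher lists
// into a sink slice.
type Reflector struct {
	lw   ListerWatcher
	sink *[]string
}

func NewReflector(lw ListerWatcher, sink *[]string) *Reflector {
	return &Reflector{lw: lw, sink: sink}
}

// ListOnce appends the current listing to the sink.
func (r *Reflector) ListOnce() {
	*r.sink = append(*r.sink, r.lw.List()...)
}

// Config carries the ListerWatcher rather than a ready-made Reflector.
type Config struct {
	ListerWatcher ListerWatcher
	Sink          *[]string // hypothetical destination for listed objects
}

type controller struct {
	config    Config
	reflector *Reflector // nil until Run is called
}

// Run builds the Reflector just in time, lists once, then blocks until
// stopCh is closed.
func (c *controller) Run(stopCh <-chan struct{}) {
	c.reflector = NewReflector(c.config.ListerWatcher, c.config.Sink)
	c.reflector.ListOnce()
	<-stopCh
}

func main() {
	var sink []string
	c := &controller{config: Config{
		ListerWatcher: staticLister{"pod-a", "pod-b"},
		Sink:          &sink,
	}}
	fmt.Println(c.reflector == nil) // prints "true"

	stopCh := make(chan struct{})
	close(stopCh) // already closed, so Run returns right after listing
	c.Run(stopCh)
	fmt.Println(c.reflector != nil, sink) // prints "true [pod-a pod-b]"
}
```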
Next up is the first actually interesting part that we really haven’t encountered before, which gives us a better idea of what a Controller is supposed to do: the ProcessFunc. A ProcessFunc appears, from its documentation, to do something to a Kubernetes resource:
ProcessFunc processes a single object.
So even from this little bit of documentation we can see that ultimately a Controller implementation that happens to use a Config (remember, it’s not required to) will not only cause caching of Kubernetes resources into a Store (again, not required), but will presumably work on objects found in that Store as well. These are shaky assumptions, and not enforced by any contract, but they turn out to be quite important, not just to the de facto standard implementation of the Controller type (the controller-struct-backed one that uses a Config), but to the implied, but not technically specified, contract of the Controller type itself.
Summed up, most Controller implementations probably should use a Reflector to populate a Queue (a particular kind of Store) with Kubernetes resources of a particular kind, and then also process the contents of that Queue, presumably by Popping objects off of it.
Again, it is worth noting that this summary is based on one particular implementation of the type (the de facto standard controller-struct-backed one), and not on the type’s contract itself, but elsewhere in the package you will be able to see that this is expected, if not enforced, behavior of any Controller implementation.
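The populate-then-process flow summarized above can be sketched in a few lines. This is a toy under stated assumptions: fifo, fill, processAll, and errEmpty are hypothetical names, the queue is neither blocking nor thread-safe, and fill merely stands in for what a Reflector would do. Only the ProcessFunc shape is taken from the excerpt quoted earlier.

```go
package main

import (
	"errors"
	"fmt"
)

// ProcessFunc has the same shape as the type quoted earlier.
type ProcessFunc func(obj interface{}) error

// errEmpty is a hypothetical sentinel; the real Pop blocks instead.
var errEmpty = errors.New("queue is empty")

// fifo is a hypothetical, non-blocking, non-thread-safe queue.
type fifo struct {
	items []interface{}
}

func (f *fifo) Add(obj interface{}) {
	f.items = append(f.items, obj)
}

// Pop removes the oldest item, runs process on it, and returns the item
// together with the result of processing.
func (f *fifo) Pop(process ProcessFunc) (interface{}, error) {
	if len(f.items) == 0 {
		return nil, errEmpty
	}
	obj := f.items[0]
	f.items = f.items[1:]
	return obj, process(obj)
}

// fill stands in for a Reflector: it copies listed objects into the queue.
func fill(q *fifo, listed []string) {
	for _, name := range listed {
		q.Add(name)
	}
}

// processAll pops until the queue is empty and returns how many objects
// were processed without error.
func processAll(q *fifo, process ProcessFunc) int {
	ok := 0
	for {
		_, err := q.Pop(process)
		if errors.Is(err, errEmpty) {
			return ok
		}
		if err == nil {
			ok++
		}
	}
}

func main() {
	q := &fifo{}
	fill(q, []string{"pod-a", "pod-b", "pod-c"})

	var seen []interface{}
	n := processAll(q, func(obj interface{}) error {
		seen = append(seen, obj)
		return nil
	})
	fmt.Println(n, seen) // prints "3 [pod-a pod-b pod-c]"
}
```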
We might visually model all this like so:
Here, a Controller uses a Reflector to populate a Queue, and has a protected process method that knows how to do something with a given object. It also has a protected shouldResync() method, a public hasSynced() method, and the ability to report what the last Kubernetes resource version was after a synchronization. We’ll greatly refine and refactor this model over time.
In the next post, we’ll look at informers and shared informers, which build on top of these foundations, again with an eye towards modeling all this idiomatically in Java.