CreationalContext Deep Dive

What is a CreationalContext in CDI?

A perfect example of how naming things is the hardest problem in software engineering, really. The only method that it exposes that anyone should really be concerned with, release(), is used at destruction time, and has nothing to do with creating anything.

Here’s how I would describe it:

A CreationalContext is a bean helper that automatically stores @Dependent-scoped contextual instances on behalf of some other bean that has references to them, and ensures that they are cleaned up when that bean goes out of scope.

That’s basically it.

Consider a bean, B, that has a reference to a @Dependent-scoped bean, D. In Java terms, it might have a field like this:

@Inject
private D d;

Now, if the Bean implementation that created this B contextual instance has its destroy(T, CreationalContext<T>) method called, then B will be destroyed. B, in other words, will be the first argument.

When B is destroyed, you want to make sure that its dependent objects are also destroyed. But you also don’t want to burden the programmer with this. That is, you don’t want to make the programmer jump through hoops inside B’s logic somewhere to say “oh, when I’m destroyed, make sure to arrange for destruction to happen on my d field”. That should just happen automatically.

To allow this to happen automatically, CDI locates this logic inside a CreationalContext (yes, a “creational” context, even though we’re doing destruction. Naming is hard.). At the moment a CreationalContext is released, it has “in” it:

  • A bean that it is helping (usually)
  • A set of dependent instances that belong to the bean that it is helping
  • For each of those dependent instances, some way to tie it to the Contextual (the Bean) that created it

When release() is called, the CreationalContext iterates over that set of dependent instances and their associated Beans and calls destroy() on each of those Beans.

So the programmer didn’t have to do any of this work. She just @Injected a D into a field and even if D is some sort of custom object it gets destroyed properly.

(It all follows from this that release() implementations must be idempotent and pretty much should never be called except from within a destroy() method of a custom bean. They also must work on only the bean that is being helped, and not on every dependent object “known” to the CreationalContext.)
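To make the bookkeeping concrete, here is a minimal sketch of the idea in plain Java, with no CDI on the classpath. Everything in it is an assumption made for illustration: SketchCreationalContext and addDependent are made-up names, and a Consumer stands in for the Bean that knows how to destroy a given dependent instance. It is not how any particular CDI implementation does it; it only shows a release() that walks the stored dependents and is idempotent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a vendor's CreationalContext implementation.
// Only the name release() is borrowed from the real CDI interface.
final class SketchCreationalContext {

  // One dependent instance paired with whatever knows how to destroy it
  // (in real CDI, the Contextual/Bean that created it).
  private static final class Dependent<T> {
    private final T instance;
    private final Consumer<? super T> destroyer;

    Dependent(T instance, Consumer<? super T> destroyer) {
      this.instance = instance;
      this.destroyer = destroyer;
    }

    void destroy() {
      this.destroyer.accept(this.instance);
    }
  }

  private final List<Dependent<?>> dependents = new ArrayList<>();
  private boolean released;

  // Made-up hook: stands in for however a dependent instance gets registered.
  <T> void addDependent(T instance, Consumer<? super T> destroyer) {
    this.dependents.add(new Dependent<>(instance, destroyer));
  }

  // Destroys every stored dependent exactly once; later calls do nothing.
  void release() {
    if (this.released) {
      return;
    }
    this.released = true;
    for (Dependent<?> dependent : this.dependents) {
      dependent.destroy();
    }
    this.dependents.clear();
  }

  public static void main(String[] args) {
    SketchCreationalContext cc = new SketchCreationalContext();
    cc.addDependent("d", d -> System.out.println("destroying " + d));
    cc.release(); // prints "destroying d"
    cc.release(); // prints nothing: already released
  }
}
```

The boolean flag is the simplest way I can think of to get idempotency; a real implementation presumably has to worry about thread safety and serialization too, which this sketch ignores.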

OK, that’s all fine, but how did all these objects get “into” the CreationalContext in the first place?

When B was created, via a Bean’s create(CreationalContext<T>) method, the container supplied that method with a new, empty CreationalContext that is associated with the Bean doing the creating. That is, prior to the create call, the container called beanManager.createCreationalContext(beanThatIsCreatingBInstances), and the resulting CreationalContext was supplied to beanThatIsCreatingBInstances’s create method as its sole argument.

What does a custom Bean author here need to do with this CreationalContext as she implements the create method? The answer is: ignore it completely. That’s easy.

(More to the point: push(Object) does not, as you might be tempted to believe, stuff a @Dependent-scoped object into the CreationalContext such that release() will have any effect on it. The two methods are completely orthogonal. Around this point you should start getting suspicious: how does a dependent object get “into” an arbitrary CreationalContext anyway? An excellent question.)

In the case of managed beans—ordinary CDI beans, with @Inject annotations and whatnot processed by the container without any special funny business—remember that the container will take care of satisfying the injection points. So in the case of B with a D-typed d field injection point, the container will arrange for a D-type-producing bean to be invoked and then will automatically arrange for that dependent object to be stuffed into the CreationalContext.

That’s a lot to take in. Let’s try to break it down.

Recall that the container created a brand new CreationalContext to serve as the bean helper for B when it was about to call the create method of the Bean that makes B instances.

In order to “make” a B, the container is going to have to satisfy its D-typed injection point (the d field in our example).

To satisfy the D-typed injection point, the container will need to find out what scope is in effect. It will discover that the scope is @Dependent (since our example says so; presumably D is annotated with @Dependent which allows the container to call BeanManager#getContext(Class<? extends Annotation>)).

With the right Context in hand, the container will ask it for an appropriate D instance. The method that the container uses here is Context#get(Contextual<T>, CreationalContext<T>). Here, the container does not create a new CreationalContext. It passes the CreationalContext it has made for creating the B instance. That’s important.

A Context can do basically whatever it wants in order to return an instance, so long as any new instance it needs results (ultimately) from a call to Contextual#create(CreationalContext<T>).

The @Dependent-scoped Context is obliged to create a new instance with every request, so it will dutifully invoke Contextual#create(CreationalContext<T>) and get back a new D instance. (If we pretend for just a moment that D is made by some kind of custom bean, the custom bean author never had to touch the CreationalContext when she implemented the create method. She probably just returned new D() or something similar.)

OK, so now the Context has possession of a new D object. But before it hands it back to the caller, it is going to stuff it in the supplied CreationalContext as a dependent instance. After all, the @Dependent-scoped Context always produces dependent objects that are tied to some “higher-order” bean, and we know that this is what CreationalContext instances are for: to store such dependent objects together with their referencing bean.

So it’s fine to say all this, but how does the @Dependent-scoped Context implementation actually add a dependent object to the CreationalContext? We’ve already seen the push(Object) method is not for this purpose.

It does it via proprietary means.

Weld, for example, does it via casting. This can get interesting since a user can supply her own @Dependent-scoped Context implementation: in order to add dependent objects herself she must tie her Context implementation to Weld.

You might think you could have your own @Dependent-scoped Context implementation that arranges for a CreationalContext to be used as a key into a Map of this kind of state. Then you wouldn’t be bound to a particular implementation of CDI. But of course if someone calls release() on a CreationalContext, you would have to somehow arrange to be notified of such a call, and that’s impossible.

So the upshot is that the CDI vendor, who returns CreationalContext implementations from BeanManager#createCreationalContext(Contextual<T>), is the only one who can supply any @Dependent-scoped Context implementations, no matter what the specification says.

Returning to what the end user should “do” with a CreationalContext: the answer is basically ignore it. If you are writing a custom Bean implementation, then as the last operation in your destroy implementation you can do this:

if (cc != null) {
  cc.release();
}

Otherwise, just leave the thing alone.

Decoding the Magic in Weld’s Instance Injection

I went down this rathole today and wanted to write it down.

In CDI, let’s say you have an injection point like this:

@Inject
@Flabrous // qualifier, let's say
private Instance<Frobnicator> flabrousFrobnicators;

The container is obligated to provide a built-in bean, whatever that is, that can satisfy any injection point whose raw type is Instance.

If you think about this for a moment or two you’ll realize that this is really weird. The container cannot possibly know “in advance” what injection points there will be, and so can’t actually create one bean for a @Default Instance<Frobnicator> and another for a @Flabrous Instance<Frobnicator>. So somehow its built-in bean has to be findable and appropriate for any possible combination of parameterized type (whose raw type is Instance) and sets of qualifiers.

Weld solves this problem by rewriting your injection point quietly on the fly (or at least this is one way to look at it). This was quite surprising and I was glad to finally find out how this machinery works.

For example, in the code above, as part of my injection point resolution request I have effectively said: “Hey, Weld, find me a contextual reference to a contextual instance of the appropriate bean found among all beans that are assignable to an Instance<Frobnicator>-typed injection point and that have the @Flabrous qualifier among their qualifiers.” Of course, Weld cannot actually issue the bean-finding part of this request as-is, because there is no such bean (how could it possibly pre-create an Instance<Frobnicator>-typed bean with @Flabrous among its qualifiers?). So how does this work, exactly? Something must be going on with @Any but it’s nowhere to be seen here and isn’t applied by default to injection points.

It turns out Weld recognizes a class of beans that they call façade beans for which all injection requests are effectively rewritten (during the bean-finding part of the resolution process). Instance is one kind; Event is another; Provider is another and so on—you can see why they’ve decided these are special sorts of things.

At any rate, when you ask for a façade bean, the request that is made for the bean itself uses only the @Any qualifier, no matter what you’ve annotated your injection point with. All beans, including built-in ones, have the @Any qualifier, so the one true container-provided Instance bean will be found. And there’s our answer.

OK, that’s fine, but in the example above now we have a qualifier, @Flabrous, that we actually want to use, or we wouldn’t have gone to all this trouble. How does that get applied, given that it is ignored in the bean sourcing part of the injection resolution request?

Weld has tricked its own innards into supplying what is technically an inappropriate bean—it pretended that we asked for an @Any-qualified Instance<Frobnicator> bean even though we didn’t—but now that it has it, it can ignore whatever qualifiers the bean bears (@Default and @Any, as it turns out, and none other) because they’re no longer relevant once the bean is found. All that matters now is contextual instance mechanics.

Because Instance and Event and Provider and other façade beans are required to be in @Dependent scope, it turns out that the current injection point is available and can be used internally by the bean itself to find out what qualifiers are in effect so that it can create an appropriate contextual instance. And that’s exactly what happens: the bean supplied by Weld is an extension of AbstractFacade which uses the injection point to determine what qualifiers are in effect.
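Here is a toy model of that two-step trick in plain Java, just to pin down the mechanics. It is a sketch under loud assumptions: types and qualifiers are plain strings (real CDI uses Type and annotation instances), the façade list contains only the three types named in this post, and none of the class or method names below exist in Weld.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy container: hypothetical names throughout, not Weld's API.
final class FacadeResolutionSketch {

  static final class Bean {
    final String type;
    final Set<String> qualifiers;

    Bean(String type, String... qualifiers) {
      this.type = type;
      this.qualifiers = new HashSet<>(Arrays.asList(qualifiers));
      this.qualifiers.add("@Any"); // every bean has @Any
    }
  }

  // Assumed list, taken only from the facade types mentioned in this post.
  private static final Set<String> FACADE_TYPES =
    new HashSet<>(Arrays.asList("Instance", "Event", "Provider"));

  private final List<Bean> beans = new ArrayList<>();

  void register(Bean bean) {
    this.beans.add(bean);
  }

  // Ordinary lookup: same type, and the bean bears every required qualifier.
  List<Bean> find(String type, Set<String> requiredQualifiers) {
    List<Bean> found = new ArrayList<>();
    for (Bean bean : this.beans) {
      if (bean.type.equals(type) && bean.qualifiers.containsAll(requiredQualifiers)) {
        found.add(bean);
      }
    }
    return found;
  }

  // Lookup as performed for an injection point: for a facade type the
  // injection point's own qualifiers are swapped for @Any.
  List<Bean> findForInjectionPoint(String rawType, Set<String> injectionPointQualifiers) {
    if (FACADE_TYPES.contains(rawType)) {
      return find(rawType, Collections.singleton("@Any"));
    }
    return find(rawType, injectionPointQualifiers);
  }
}
```

With one built-in "Instance" bean (qualified @Default) and one @Flabrous "Frobnicator" bean registered, find("Instance", {@Flabrous}) comes back empty, while findForInjectionPoint("Instance", {@Flabrous}) finds the built-in bean. The façade instance can then use the injection point’s qualifiers for its own lookup, find("Frobnicator", {@Flabrous}), which is the second step described above.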

This whole process is of course deeply weird and I’d imagine that it or derivative effects rely on a hard-coded list of façade beans somewhere. Sure enough, here’s an example of the sort of thing I mean.

Another way to approach this sort of thing might be to introduce a super-qualifier or something instead that says, hey, if a bean is qualified with this super-qualifier then it matches all qualifier comparison requests (which is really what’s going on here).

Anyway, I hate magic and am glad to have found out how this works!

ByteBuddy and private static final fields

Boy is this amazingly difficult. I’m writing it here so I won’t forget. I hope this helps someone else. Hopefully, too, there is a less verbose way to accomplish this.

The excerpt below does the equivalent of private static final MethodHandle gorp = MethodHandles.lookup().findStatic(TestPrivateStaticFinalFieldInitialization.class, "goop", MethodType.methodType(void.class)); in ByteBuddy. When the test runs, *** goop shows up on the console at the end. I have a StackOverflow post in case this changes.

Awful formatting courtesy of your friends at WordPress:

// Excerpt from JUnit Jupiter unit test whose class is named
// TestPrivateStaticFinalFieldInitialization:

  @Test
  final void testAll() throws Throwable {

    final MethodDescription findStaticMethodDescription = new TypeDescription.ForLoadedType(MethodHandles.Lookup.class)
      .getDeclaredMethods()
      .filter(ElementMatchers.named("findStatic"))
      .getOnly();
    
    final MethodDescription methodHandlesLookupMethodDescription = new TypeDescription.ForLoadedType(MethodHandles.class)
      .getDeclaredMethods()
      .filter(ElementMatchers.named("lookup"))
      .getOnly();

    final MethodDescription methodTypeMethodTypeMethodDescription = new TypeDescription.ForLoadedType(MethodType.class)
      .getDeclaredMethods()
      .filter(ElementMatchers.named("methodType")
              .and(ElementMatchers.isStatic()
                   .and(ElementMatchers.takesArguments(Class.class))))
      .getOnly();
    
    final ByteBuddy byteBuddy = new ByteBuddy();
    DynamicType.Builder<?> builder = byteBuddy.subclass(Object.class);
    builder = builder
      .defineField("gorp", MethodHandle.class, Visibility.PRIVATE, Ownership.STATIC, SyntheticState.SYNTHETIC, FieldManifestation.FINAL)
      .invokable(ElementMatchers.isTypeInitializer())
      .intercept(MethodCall.invoke(findStaticMethodDescription)
                 .onMethodCall(MethodCall.invoke(methodHandlesLookupMethodDescription))
                 .with(new TypeDescription.ForLoadedType(TestPrivateStaticFinalFieldInitialization.class))
                 .with("goop")
                 .withMethodCall(MethodCall.invoke(methodTypeMethodTypeMethodDescription)
                                 .with(new TypeDescription.ForLoadedType(void.class)))
                 .setsField(new FieldDescription.Latent(builder.toTypeDescription(),
                                                        "gorp",
                                                        ModifierContributor.Resolver.of(Visibility.PRIVATE,
                                                                                        Ownership.STATIC,
                                                                                        SyntheticState.SYNTHETIC,
                                                                                        FieldManifestation.FINAL).resolve(),
                                                        TypeDescription.Generic.OfNonGenericType.ForLoadedType.of(MethodHandle.class),
                                                        Collections.emptyList())));
    final Class<?> newClass = builder.make().load(Thread.currentThread().getContextClassLoader()).getLoaded();
    final Field gorpField = newClass.getDeclaredField("gorp");
    gorpField.setAccessible(true);
    final MethodHandle methodHandle = (MethodHandle)gorpField.get(null);
    assertNotNull(methodHandle);
    methodHandle.invokeExact();
  }

  public static final void goop() {
    System.out.println("*** goop");
  }
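For comparison, here is roughly what the hand-written Java looks like, as a sketch. GorpHolderSketch and the goopCalls counter are my own additions for illustration (the counter is only there so the effect can be checked without reading the console), and unlike the ByteBuddy version the field and the target method live in the same class here.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical hand-written counterpart of the generated class.
final class GorpHolderSketch {

  // Not in the original test; lets callers observe that goop() ran.
  static int goopCalls;

  private static final MethodHandle gorp;

  // The type initializer: this is the part the ByteBuddy DSL has to spell out.
  static {
    try {
      gorp = MethodHandles.lookup()
        .findStatic(GorpHolderSketch.class, "goop", MethodType.methodType(void.class));
    } catch (ReflectiveOperationException e) {
      throw new ExceptionInInitializerError(e);
    }
  }

  public static void goop() {
    goopCalls++;
    System.out.println("*** goop");
  }

  // Invokes the handle stored in the private static final field.
  static void callGorp() {
    try {
      gorp.invokeExact();
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }

  public static void main(String[] args) {
    callGorp(); // prints "*** goop"
  }
}
```

Seen this way, most of the ByteBuddy verbosity comes from having to describe each method call in that one static initializer expression as data.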

On Terminology

The hardest thing in software engineering is naming things.

We have some conventions, but not a lot. Many of those conventions come from design patterns. For example, we have builders and adapters and factories and visitors and so on.

But there are strikingly few conventions about how to name other things. For example, when implementing an interface that consists of a single method that can return something either new or old, what should we call it? The JDK has settled on the term Supplier, which maybe is fine, but then the method is called get, rather than supply. Does get really capture what a Supplier does? Again, naming things is hard.

As another example, sometimes factories assemble things out of raw materials—and then simply return what they’ve assembled, over and over. Is that actually what a factory does? No, it is not. Naming things is hard.

My own personal dictionary includes these concepts and I try to use them very carefully in my own software:

  • A supplier may create something or return a single instance of a prefabricated something, or switch arbitrarily. I avoid producer or provider since they don’t really convey why something is being retrieved or made: when something is supplied, by contrast, it is because there is a need, and the supplying fulfills the need.
  • A factory always creates something.
  • If something is backed, then it is implemented in terms of something else. This lets me do things with the adapter pattern but adapter is a terrible word that doesn’t tell you what is being adapted to what and hence which aspect is more primal.
  • If something is default, then it is usually a straightforward something that can be extended or overridden or otherwise made more complicated or performant or interesting. I try to avoid simple, since simplicity should be an emergent property, not something legislated.
  • I try to avoid the word provision since for very strange reasons in the computer industry it often means to create something out of thin air, rather than its English meaning, which is to stock. (When you provision your pantry, you don’t build the pantry, you put cans on its shelves.)
  • Priority is always always always largest-number-wins. Unlike most bug systems I’ve worked with, in English the highest priority problem is the one deserving the most attention. (If you want smallest-number-wins, you’re probably looking for rank. Avoid the use of ordinal entirely since many projects use it to mean priority, and others use it to mean something roughly akin to an indicator of which item should come out of an array first.)
  • An arbiter is something that takes in two or more inputs that may have some ambiguity (or not) and performs arbitration on them, selecting a single output for further processing.
  • If I am tempted to use the word module in any way, shape or form, then I know I have failed spectacularly in every possible way. Something that is a module is inscrutable, which is a fancy way of saying that you really have no idea what it is. Component is rarely any better. Feature is even worse.
  • Name is usually an indication that I haven’t thought the problem domain through well enough. Same goes for description. Both have inherent localization issues as well.
  • A facet is a selection of notional attributes of some other thing that go together. This is a nice terminology pattern to use to keep your classes small and to encourage composition.
  • Helper is right out. Same goes for util or utils or utility. If I am tempted to write any of these, I write crap instead so that it is quite clear that that is what I am creating.
  • In the realm of configuration, a setting is a name. A value or a setting value is a value for a setting. When you have many settings, you have many names, not many values, or at least you’re not talking about the values. (I deliberately try to avoid configuration and property since these are massively overloaded and confusing: is configuration a bunch of something, or just one thing? Is a property a name-value pair, or just a name? Or just a value?)

I’m sure there’s more where this came from. What are some of your terminology systems?

ByteBuddy and Proxies

Here’s another one that I am sure I’m going to forget how to do so I’m writing it down.

ByteBuddy is a terrific little tool for working with Java bytecode.  It, like many tools, however, is somehow both exquisitely documented and infuriatingly opaque.

ByteBuddy works with a domain-specific language (DSL) to represent the world of manipulating Java bytecode at runtime.  For a, uh, seasoned veteran (yeah, let’s go with that) like me, grappling with the so-called fluent API is quite difficult.  But I’ve figured out that everything is there if you need it.  You just need the magic recipe.  Sometimes even with the help of an IDE the magic recipe is akin to spellcasting.

So here is the magic recipe for defining a runtime proxy that forwards certain method invocations to the return value of a method that yields up the “real” object being proxied:


import net.bytebuddy.description.modifier.Visibility;
import net.bytebuddy.dynamic.DynamicType;
import net.bytebuddy.implementation.FieldAccessor;
import net.bytebuddy.implementation.MethodCall;
import static net.bytebuddy.implementation.MethodCall.invoke;
import static net.bytebuddy.matcher.ElementMatchers.named;

DynamicType.Builder<?> builder = //… acquire the builder, then:
  .defineField("proxiedInstance", theClassBeingProxied, Visibility.PRIVATE) // (1)
  .implement(new DefaultParameterizedType(null, Proxy.class, theClassBeingProxied)) // (2)
  .intercept(FieldAccessor.ofBeanProperty()) // (3)
  .method(someMatcher) // (4)
  .intercept(invoke(MethodCall.MethodLocator.ForInstrumentedMethod.INSTANCE) // (5)
             .onMethodCall(invoke(named("getProxiedInstance")))
             .withAllArguments());

// 1: Adds a field to the proxy class named proxiedInstance. It will hold the "real" object.
// 2: Proxy.class is a made-up interface defining getProxiedInstance()/setProxiedInstance(T),
// where T is the type of the thing being proxied; e.g. Proxy<Frob>.
// DefaultParameterizedType is a made-up implementation of java.lang.reflect.ParameterizedType.
// 3: Magic ByteBuddy incantation to implement the Proxy<Frob> interface by making two methods
// that read from and write to the proxiedInstance field just defined
// 4: Choose what methods to intercept here; see the net.bytebuddy.matcher.ElementMatchers class
// in particular
// 5: The serious magic is here. It means, roughly, "whatever the method the user just called,
// turn around and invoke it on the return value of the getProxiedInstance() method with all
// of the arguments the user originally supplied". That INSTANCE object is not documented
// anywhere, really; you just have to know that it is suitable for use here in this DSL
// "sentence".
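If it helps to see the target, here is a hand-written class that does what I understand the generated proxy to do. This is a sketch with made-up types: Frob stands in for whatever is being proxied (an interface here, purely to keep the example short), and ProxySketch stands in for the made-up Proxy interface from note 2.

```java
// Hypothetical type being proxied.
interface Frob {
  String frob(String input);
}

// Stand-in for the made-up Proxy<T> interface described in note 2.
interface ProxySketch<T> {
  T getProxiedInstance();
  void setProxiedInstance(T instance);
}

// What the ByteBuddy recipe is generating, written out by hand.
final class HandWrittenFrobProxy implements Frob, ProxySketch<Frob> {

  // (1) the field that holds the "real" object
  private Frob proxiedInstance;

  // (2) and (3): bean-property accessors over that field
  @Override
  public Frob getProxiedInstance() {
    return this.proxiedInstance;
  }

  @Override
  public void setProxiedInstance(Frob instance) {
    this.proxiedInstance = instance;
  }

  // (4) and (5): a matched method forwards to the return value of
  // getProxiedInstance(), passing along all of the original arguments
  @Override
  public String frob(String input) {
    return this.getProxiedInstance().frob(input);
  }

  public static void main(String[] args) {
    HandWrittenFrobProxy proxy = new HandWrittenFrobProxy();
    proxy.setProxiedInstance(s -> s.toUpperCase());
    System.out.println(proxy.frob("abc")); // prints "ABC"
  }
}
```

The point of the ByteBuddy version is that you get the forwarding for every matched method without writing each one by hand.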

Configuring Narayana

I always forget how to do this so I’m writing it down.

First, Narayana fundamentally accesses its properties from instances of environment beans, which are simple Java objects.  Here are all of the non-testing ones (the last five are the most relevant for most JTA situations):

narayana/XTS/WS-C/dev/src/org/jboss/jbossts/xts/environment/RecoveryEnvironmentBean.java
narayana/XTS/WS-C/dev/src/org/jboss/jbossts/xts/environment/XTSEnvironmentBean.java
narayana/XTS/WS-C/dev/src/org/jboss/jbossts/xts/environment/WSCEnvironmentBean.java
narayana/XTS/WS-C/dev/src/org/jboss/jbossts/xts/environment/WSCFEnvironmentBean.java
narayana/XTS/WS-C/dev/src/org/jboss/jbossts/xts/environment/WSTEnvironmentBean.java
narayana/ArjunaJTS/jts/classes/com/arjuna/ats/jts/common/JTSEnvironmentBean.java
narayana/ArjunaJTS/orbportability/classes/com/arjuna/orbportability/common/OrbPortabilityEnvironmentBean.java
narayana/ArjunaJTA/jta/classes/com/arjuna/ats/jta/common/JTAEnvironmentBean.java
narayana/ArjunaJTA/jdbc/classes/com/arjuna/ats/jdbc/common/JDBCEnvironmentBean.java
narayana/ArjunaCore/txoj/classes/com/arjuna/ats/txoj/common/TxojEnvironmentBean.java
narayana/ArjunaCore/arjuna/classes/com/arjuna/ats/internal/arjuna/objectstore/hornetq/HornetqJournalEnvironmentBean.java
narayana/ArjunaCore/arjuna/classes/com/arjuna/ats/arjuna/common/RecoveryEnvironmentBean.java
narayana/ArjunaCore/arjuna/classes/com/arjuna/ats/arjuna/common/ObjectStoreEnvironmentBean.java
narayana/ArjunaCore/arjuna/classes/com/arjuna/ats/arjuna/common/CoreEnvironmentBean.java
narayana/ArjunaCore/arjuna/classes/com/arjuna/ats/arjuna/common/MetaObjectStoreEnvironmentBean.java
narayana/ArjunaCore/arjuna/classes/com/arjuna/ats/arjuna/common/CoordinatorEnvironmentBean.java

To instantiate them, it grabs a source of property information, one of the environment bean classes, and something called a BeanPopulator.  The BeanPopulator is sort of like a crippled version of java.beans.Introspector.  It instantiates a given environment bean class, and then calls relevant setter methods on the resulting instance with values sourced from whatever the source of property information is.

The source of property information has several restrictions.

First, it has to be in java.util.Properties XML format.  Elements are named entry and have key attributes; their content is that key’s value.
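For reference, here is what that format looks like when read back with the JDK’s own Properties#loadFromXML. This is a sketch: the key is the one discussed in this post, the class name in the value is made up, and this shows only the file format, not how Narayana itself loads it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class PropertiesXmlSketch {

  // A one-entry document in java.util.Properties XML format.
  // com.example.HypotheticalTransactionManager is a made-up value.
  static final String XML =
    "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
    + "<!DOCTYPE properties SYSTEM \"http://java.sun.com/dtd/properties.dtd\">\n"
    + "<properties>\n"
    + "  <entry key=\"com.arjuna.ats.jta.transactionManagerClassName\">"
    + "com.example.HypotheticalTransactionManager</entry>\n"
    + "</properties>\n";

  // Parses the XML form into a Properties object using only the JDK.
  static Properties load(String xml) {
    Properties properties = new Properties();
    try {
      properties.loadFromXML(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return properties;
  }

  public static void main(String[] args) {
    System.out.println(load(XML).getProperty("com.arjuna.ats.jta.transactionManagerClassName"));
  }
}
```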

Second, if you do nothing else it will be named jbossts-properties.xml.  Weirdly, this most-default-of-all-possible-defaults is set during the build.

Third, if you want to rename it then you have to set a system property named com.arjuna.ats.arjuna.common.propertiesFile.

Fourth, there is a lookup algorithm.  First it treats this thing as an absolute path.  If it doesn’t exist, it treats it as a path relative to the current directory.  If it doesn’t exist, it treats it as a path relative to user.dir, user.home (!) and java.home (!).  If it doesn’t exist in any of those places, then it treats it as a classpath resource before giving up.

Fifth, there is no merging.

As you can see, if you’re going to bother specifying this thing you should probably specify it as an absolute file path.  Hey, Narayana is old.

When you’re all done with this (or maybe you punt and decide to just let the defaults ride), you can selectively override certain properties by specifying them as System properties.

The environment beans each have a prefix defined via annotations and not in any other documentation that I can find, so to understand how to configure them you have to look at the Narayana source code (!).  For example, JTAEnvironmentBean’s @PropertyPrefix annotation sets its prefix to com.arjuna.ats.jta.  So an entry with a key attribute of com.arjuna.ats.jta.transactionManagerClassName will be used as the value of an invocation of the JTAEnvironmentBean#setTransactionManagerClassName(String) method.
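The prefix-to-setter mapping can be sketched in a few lines of reflection. To be clear about what is assumed here: this is not Narayana’s BeanPopulator, SketchEnvironmentBean is a made-up bean with a single property, only String-typed setters are handled, and I pass the prefix with a trailing dot so that stripping it leaves a bare property name.

```java
import java.lang.reflect.Method;
import java.util.Properties;

final class PrefixPopulatorSketch {

  // Made-up stand-in for an environment bean such as JTAEnvironmentBean.
  static final class SketchEnvironmentBean {
    private String transactionManagerClassName;

    public String getTransactionManagerClassName() {
      return this.transactionManagerClassName;
    }

    public void setTransactionManagerClassName(String name) {
      this.transactionManagerClassName = name;
    }
  }

  // For every key that starts with the prefix, derive a setter name from the
  // remainder of the key and invoke it if a String-typed setter exists.
  static void populate(Object bean, String prefix, Properties source) {
    for (String key : source.stringPropertyNames()) {
      if (!key.startsWith(prefix) || key.length() == prefix.length()) {
        continue; // not ours, or nothing left after the prefix
      }
      String property = key.substring(prefix.length());
      String setterName =
        "set" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
      try {
        Method setter = bean.getClass().getMethod(setterName, String.class);
        setter.invoke(bean, source.getProperty(key));
      } catch (NoSuchMethodException ignored) {
        // no such String-typed setter on this bean; skip the key
      } catch (ReflectiveOperationException e) {
        throw new IllegalStateException(e);
      }
    }
  }

  public static void main(String[] args) {
    Properties source = new Properties();
    source.setProperty("com.arjuna.ats.jta.transactionManagerClassName", "com.example.Tm");
    SketchEnvironmentBean bean = new SketchEnvironmentBean();
    populate(bean, "com.arjuna.ats.jta.", source);
    System.out.println(bean.getTransactionManagerClassName()); // prints "com.example.Tm"
  }
}
```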

Lastly, about the only thing you usually want to do is set the default timeout for the transaction manager.  To do this, set a system property named com.arjuna.ats.arjuna.coordinator.defaultTimeout to a numeric value denoting the timeout value in seconds.

Make H2 Log Via SLF4J

If you want your in-memory H2 database to log via SLF4J so you can control its logging output using your logging framework of choice, add this canonical string to your JDBC URL:

INIT=SET TRACE_LEVEL_FILE=4

The actual property you are setting in this case is the INIT property.  Here, you have set it to be equal to exactly one statement.  You can of course set it to be equal to several statements:

INIT=SET TRACE_LEVEL_FILE=4;SET DB_CLOSE_DELAY=-1;

The examples above are canonical strings, free of escape characters.  If you are setting these inside a Java String you’ll need to escape things properly.  Here’s a sample JDBC URL written as a double-quoted Java String that makes H2 log using SLF4J and runs some DDL on every connection:

"jdbc:h2:mem:chirp;INIT=SET TRACE_LEVEL_FILE=4\\;RUNSCRIPT FROM 'classpath:foo.ddl'"

Note the double backslashes: the first backslash escapes the next one, which, in turn, escapes the semicolon, because the semicolon is used as a delimiter in the actual JDBC URL itself.

The important thing in all of these cases is that SET TRACE_LEVEL_FILE=4 must come first in the INIT property value.
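Since the escaping is easy to get wrong by hand, here is a small sketch of a helper that builds such a URL. H2UrlSketch and inMemoryUrl are made-up names, and the helper only concatenates strings: it puts SET TRACE_LEVEL_FILE=4 first and joins any further statements with a backslash-escaped semicolon.

```java
final class H2UrlSketch {

  // Builds an in-memory H2 JDBC URL whose INIT property starts with the
  // SLF4J trace setting, followed by any additional statements, each
  // separated by a backslash-escaped semicolon.
  static String inMemoryUrl(String databaseName, String... additionalInitStatements) {
    StringBuilder url = new StringBuilder("jdbc:h2:mem:")
      .append(databaseName)
      .append(";INIT=SET TRACE_LEVEL_FILE=4");
    for (String statement : additionalInitStatements) {
      url.append("\\;").append(statement); // one backslash, then the semicolon
    }
    return url.toString();
  }

  public static void main(String[] args) {
    // Prints: jdbc:h2:mem:chirp;INIT=SET TRACE_LEVEL_FILE=4\;RUNSCRIPT FROM 'classpath:foo.ddl'
    System.out.println(inMemoryUrl("chirp", "RUNSCRIPT FROM 'classpath:foo.ddl'"));
  }
}
```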

Introducing microBean™ Jersey Netty Integration

I’m really excited about my latest personal side project.  After a couple of false starts, I’ve put together microBean™ Jersey Netty Integration, available on Github.

This project inserts Jersey properly and idiomatically into a Netty pipeline without the flaws usually encountered by other projects that attempt to do this (including the experimental one actually supplied by the Jersey project).  Let’s take a look at how it works.

The Pipeline

First, a little background.

The first construct to understand in Netty is the ChannelPipeline.  A ChannelPipeline is an ordered collection of ChannelHandlers that each react to a kind of event or message and ideally do a single thing as a result.  Events flow in from the “left” in a left-to-right direction, are acted upon, and flow back “out” in a right-to-left direction.  (In the ChannelPipeline Javadocs, events flow “in” and “up” from the “bottom”, and are written from the “top” and flow “down”.)

ChannelHandlers in the pipeline can be inbound handlers or outbound handlers or both.  If a ChannelHandler is an inbound handler, then it participates in the event flow as the events come in left-to-right from the socket.  If a ChannelHandler is an outbound handler, then it participates in the event flow as the events go out right-to-left to the socket.

Used normally, ChannelHandlers are invoked by only a single thread, so you are insulated from threading gymnastics when you’re writing one.  However, the thread that invokes them is usually the Netty event loop: a thread whose main job is ferrying bytes to and from a socket.  So it’s critical that any work you do that might block this thread gets offloaded elsewhere.
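The offloading idea itself has nothing Netty-specific about it, so here is a sketch using only java.util.concurrent. The two executors are stand-ins I made up: a single thread named event-loop plays the part of Netty’s event loop, and a thread named worker plays the part of wherever the blocking work goes. Real Netty code would use its own executors and handler contexts instead.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class OffloadSketch {

  // Runs the potentially blocking part on the workers, then hops back to
  // the event loop for the (cheap) write of the result.
  static CompletableFuture<String> handle(String request, Executor eventLoop, Executor workers) {
    return CompletableFuture
      .supplyAsync(() -> Thread.currentThread().getName() + " handled " + request, workers)
      .thenApplyAsync(result -> result + ", " + Thread.currentThread().getName() + " wrote response",
                      eventLoop);
  }

  public static void main(String[] args) {
    ExecutorService eventLoop = Executors.newSingleThreadExecutor(r -> new Thread(r, "event-loop"));
    ExecutorService workers = Executors.newSingleThreadExecutor(r -> new Thread(r, "worker"));
    try {
      // Prints: worker handled GET /, event-loop wrote response
      System.out.println(handle("GET /", eventLoop, workers).join());
    } finally {
      eventLoop.shutdown();
      workers.shutdown();
    }
  }
}
```

The event-loop thread never runs the first stage, so it stays free to keep ferrying bytes while the slow part is in flight.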

Events are basically messages that are read, and messages that are written.  There are other kinds of events, but that’s a good start.

A ChannelPipeline is hooked up to a Channel, which is an abstraction over sockets.  So you can see that a socket read ends up flowing left-to-right through the pipeline, and is transformed at some point along the way by a ChannelHandler into a socket write.

I am painting with a very broad brush and only so I can talk about plugging Jersey in to this machinery properly.  For more, you really should buy Netty in Action from your favorite local bookstore.

The Pipeline Philosophy

This elegant architecture is very much in the spirit of Unix’s “do one thing and do it well” philosophy, and I’m sure it is not unintentional that a Netty pipeline resembles a Unix pipeline.

In a well-behaved Netty pipeline, a handler near the head of the pipeline is usually performing translation work.  It takes handfuls of bytes, and turns them into meaningful objects that can be consumed as events by handlers further on up the pipeline.  This act is called decoding, and Netty ships with lots of decoders.

Decoder

One such decoder is the HttpRequestDecoder, which converts bytes into HttpRequest and HttpContent objects.  When this is at the head of the pipeline, then every other inbound handler upstream from it can now wait to receive HttpRequest and HttpContent objects without worrying about how they were put together.

What’s important to notice here is that the HttpRequestDecoder does just one thing: it takes in bytes and transforms them into another message, fires it up the pipeline, and that’s it.

On the writing front, there is, unsurprisingly, an HttpResponseEncoder that accepts requests to write HttpResponses and HttpContent objects, and turns them into bytes suitable for supplying (eventually) to the socket layer.  Like its decoding sister, this handler just does translation, but in the other direction.

Of HttpRequests and ContainerRequests

So now we have a pipeline that deals in HttpObjects (HttpRequests and HttpContent objects on the way in, and HttpResponses and HttpContent objects on the way out).  That’s nice.

Let’s say we want to put Jersey “into” this pipeline in some fashion.  Clearly we’d put it somehow after the decoder on the way in, so it can read HttpObjects, and before the encoder on the way out, so it can also write HttpObjects.

Sadly, however, Jersey does not natively know about Netty objects such as HttpObject, HttpRequest, HttpContent and HttpResponse.  To put Jersey in this pipeline, we will have to adapt these Netty objects into the right kind of Jersey objects using other decoders.

Additionally, of course, Jersey exists to run JAX-RS (or Jakarta RESTful Web Services) applications, and we don’t know what those applications are going to do.  Remember the bit above where I said that we shouldn’t block the event loop?  That comes into play here.

So what are we to do?

Jersey has a container integration layer.  In this parlance, a container is the Thing That Hosts Jersey, whatever “hosts” might mean.  Many times this is a Servlet engine, such as Jetty, or a full-blown application server, such as Glassfish.

But it doesn’t have to be.  (Some people aren’t aware that JAX-RS (or Jakarta RESTful Web Services) does not require Servlet at all!  It’s true!)

As it turns out, all you need to get started with Jersey integration is a ContainerRequest.  A ContainerRequest is the bridge from whatever is not Jersey, but is hosting it in some way, to that which is Jersey, which will handle it.  So as you can see from its constructor, you pass in information that Jersey needs to do its job from wherever you got it, and Jersey takes it from there.  In this case, we’ll harvest that information from HttpRequest and HttpContent objects.

Combining Messages

The proper and idiomatic way to do this sort of thing is to further massage our pipeline.  Just as Netty ships relatively small handlers that do one thing and do it well, we’re not going to try to cram Jersey integration into a single class.  Instead, we want to turn a collection of HttpRequest and HttpContent objects into a ContainerRequest object first.  Do one thing and do it well.  We’ll worry about what comes after that in a moment.

This decoding is a form of message aggregation.  In some cases, we’ll need to combine an initial HttpRequest with a series of HttpContent objects that may follow it into a single ContainerRequest.

Accordingly, microBean™ Jersey Netty Integration ships with a decoder that consumes HttpRequest and HttpContent messages, and emits ContainerRequest messages in their place.  This forms the first part of idiomatic Netty-and-Jersey integration.

HttpObjectToContainerRequestDecoder

Creating a ContainerRequest from an HttpRequest is relatively simple.  The harder part is deciding whether to let the ContainerRequest under construction fly up the pipeline or not.

For example, some incoming HttpRequests represent simple GET requests for resources, and have no accompanying payload and therefore a Content-Length header of 0.  These are easy to deal with: there’s no further content, so translating such an HttpRequest into a ContainerRequest is a one-for-one operation.  Grab some fields out of the HttpRequest, use them to call the ContainerRequest constructor, and you’re done.  That case is easy.

On the other hand, a POST usually features an incoming payload.  The ContainerRequest we’re building will need an InputStream to represent this payload.  This case is a little trickier.

Specifically, in Netty, a complete HTTP request is deliberately represented in several pieces: the initial HttpRequest, and then several “follow-on” HttpContent items representing the incoming payload, terminated with a LastHttpContent message indicating the end of the payload.  Netty does things this way to avoid consuming lots of memory, and for speed reasons.

You could wait for all such messages to arrive, and only then combine them together, create an InputStream that somehow represents the whole pile of HttpContents, install it, and fire the resulting ContainerRequest up the pipeline.

But that’s a lot of waiting around, and therefore isn’t very efficient: Jersey is going to have to read from the ContainerRequest before it starts writing, so you might as well give it the ability to do that as soon as possible, even if all the readable content isn’t there yet.  Remember too that ideally Jersey will ultimately be running on a thread that is not the event loop!

Really what you need to do is hold the ContainerRequest being built for a moment, specifically only until the first HttpContent describing its payload shows up.  At that point, you can create a special InputStream that will represent the yet-to-completely-arrive inbound payload, and install it on the ContainerRequest.  Then you can let the ContainerRequest fly upstream, attached to this InputStream pipe, even though strictly speaking it’s still incomplete, and wait for incoming HttpRequests to start the process all over again.  The special InputStream you install can read the remaining and yet-to-arrive HttpContent pieces later, on demand.  We’ll discuss this special InputStream shortly.

This is, of course, what microBean™ Jersey Netty Integration does.  This approach means that more things can happen at the same time than would otherwise be the case, and that keeps the event loop from being blocked.  In many ways, the special InputStream is the heart of the microBean™ Jersey Netty Integration project.
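To make the idea of an InputStream that is handed out before its content has arrived a little more concrete, here is a minimal sketch using only the JDK.  It is not the project’s actual code: PendingPayloadInputStream, offer() and finish() are names invented for illustration, plain byte arrays stand in for HttpContent messages, and an ordinary monitor does the waiting.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the "special InputStream": it can be handed to a
// reader before any of the payload has shown up.
final class PendingPayloadInputStream extends InputStream {

  private final Deque<byte[]> chunks = new ArrayDeque<>(); // guarded by this
  private int offset;       // read position within the first chunk
  private boolean finished; // true once the final chunk has been offered

  // Called by the producer (think: the decoder) for each piece of content.
  synchronized void offer(byte[] chunk) {
    if (chunk.length > 0) {
      chunks.addLast(chunk);
      notifyAll();
    }
  }

  // Called by the producer when the last piece of content has arrived.
  synchronized void finish() {
    finished = true;
    notifyAll();
  }

  // Blocks until a byte is available or the payload is known to be complete.
  @Override
  public synchronized int read() throws IOException {
    while (chunks.isEmpty()) {
      if (finished) {
        return -1;
      }
      try {
        wait();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new InterruptedIOException();
      }
    }
    byte[] head = chunks.peekFirst();
    int b = head[offset++] & 0xFF;
    if (offset == head.length) {
      chunks.removeFirst();
      offset = 0;
    }
    return b;
  }
}

final class PendingPayloadDemo {
  // The reader gets the stream first; the content arrives later from another thread.
  static String run() {
    PendingPayloadInputStream in = new PendingPayloadInputStream();
    Thread producer = new Thread(() -> {
      in.offer("Hello, ".getBytes(StandardCharsets.UTF_8));
      in.offer("world".getBytes(StandardCharsets.UTF_8));
      in.finish();
    });
    producer.start();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
      int b;
      while ((b = in.read()) != -1) {
        out.write(b);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }
}
```

The important property is that the reader can start before the producer has finished, and blocks only when it has genuinely run out of bytes to read.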

Encoding

Now we can happily leave all the translation work behind.  Because of the beauty of the Netty pipeline architecture, we can now simply trust that at some point there will be a ContainerRequest delivered via a Netty pipeline.

What are we going to do with it?

The quickest possible answer is: we’re going to hand it to an ApplicationHandler via its handle(ContainerRequest) method.  That kicks off the Jersey request/response cycle, and we’re done, right?

No.  We haven’t discussed writing yet.

It is true that we’re basically done with the reading side of things.  We have relied upon HttpRequestDecoder to turn bytes into HttpRequests and HttpContents.  We’ve posited a decoder that reads those messages and turns them into a ContainerRequest and emits it.  And we know that the final reader in our pipeline will be some sort of inbound ChannelHandler that will accept ContainerRequest instances.  Now we need to handle that ContainerRequest and write something back to the caller.

On the Jersey end of things, ContainerRequest contains everything Jersey needs to know about a JAX-RS (or Jakarta RESTful Web Services) Application.  Jersey will use its InputStream to read incoming payloads, and will use its ContainerResponseWriter to write outgoing payloads (by way, of course, of following the setup of the user’s Application).  We haven’t talked about ContainerResponseWriter yet, but we will now.


ContainerResponseWriter

Once a ContainerRequest gets a ContainerResponseWriter installed on it, it is then possible to actually write messages back to the caller from within Jersey.  A Jersey application typically relies on finding (by well-established rules) a MessageBodyWriter to encode some kind of Java object relevant to the user’s Application into byte arrays that can then be written by Jersey to an OutputStream.  Once the writeTo method has been called, Jersey considers its writing job done.

Of course our job is not done, as now we have to somehow hook this OutputStream up to the outbound part of the Netty channel pipeline.  And, of course, recall that our decoder, following the “do one thing and do it well” philosophy, deliberately never installed a ContainerResponseWriter on the ContainerRequest it sent on up the pipeline.

microBean™ Jersey Netty Integration tackles this problem by having an inbound ChannelHandler implementation that is itself a ContainerResponseWriter.  It is set up to consume ContainerRequest objects, and, when it receives one, it immediately installs itself as that ContainerRequest’s ContainerResponseWriter implementation.

To do this, this handler-that-is-also-a-ContainerResponseWriter will need an OutputStream implementation that it can return from the writeResponseStatusAndHeaders method it is required to implement for those cases where a payload is to be sent back to the caller.

The OutputStream implementation that is used functions as its own kind of mini-decoder.  It “accepts” byte arrays, as all OutputStreams do, and it “decodes” them into appropriate HttpObjects, namely HttpResponse and HttpContent objects.  Along the way, this requires first “decoding” byte arrays into Netty’s native collection-of-bytes container: ByteBuf.

Without spending much time on it, a ByteBuf is in some ways the lowest-level object in Netty.  If you have a ByteBuf in your hand, decoding it into some other Netty-supplied object is usually pretty trivial.  In this case, creating a DefaultHttpContent out of a ByteBuf is as simple as invoking a constructor.

Getting a ByteBuf from an array of bytes is also straightforward: just use the Unpooled#wrappedBuffer method!  So on every write to this OutputStream implementation you are effectively emitting HttpContent objects of a mostly predictable size.

Next, the OutputStream implementation does not, obviously, “point at” a file or any other kind of traditional destination you might be used to.  Instead, it wraps a ChannelOutboundInvoker implementation.  A ChannelOutboundInvoker implementation, such as a ChannelHandlerContext, is the “handle” that you use to send a write message up the Netty channel pipeline.  So every OutputStream#write operation becomes a ChannelOutboundInvoker#write operation.

Finally, you want the OutputStream implementation that “consumes” byte arrays and writes HttpObjects of the right kind to do so without necessarily waiting for all the content that Jersey might write over the stream before sending it up the pipeline.  So the OutputStream implementation in question automatically calls its own flush() method after a configurable number of bytes have been written.  The OutputStream, in other words, is auto-flushing.  (Unless you don’t want it to be!)

About that flush() method: it’s mapped to (you guessed it) ChannelOutboundInvoker#flush.
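Here is a rough sketch of that write-and-auto-flush behavior, using only the JDK.  OutboundSink is a made-up stand-in for the write and flush half of ChannelOutboundInvoker, AutoFlushingOutputStream is likewise an invented name, and raw byte arrays take the place of the ByteBuf-backed HttpContent objects the real thing would emit.

```java
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the write/flush portion of ChannelOutboundInvoker.
interface OutboundSink {
  void write(byte[] message);
  void flush();
}

// Every stream write becomes a sink write; once enough bytes have gone by,
// the stream flushes itself.
final class AutoFlushingOutputStream extends OutputStream {

  private final OutboundSink sink;
  private final int flushThreshold; // values <= 0 turn auto-flushing off
  private int bytesSinceFlush;

  AutoFlushingOutputStream(OutboundSink sink, int flushThreshold) {
    this.sink = sink;
    this.flushThreshold = flushThreshold;
  }

  @Override
  public void write(int b) {
    write(new byte[] { (byte) b }, 0, 1);
  }

  @Override
  public void write(byte[] bytes, int offset, int length) {
    if (length <= 0) {
      return;
    }
    // Copy, because the caller is free to reuse its array after we return.
    sink.write(Arrays.copyOfRange(bytes, offset, offset + length));
    bytesSinceFlush += length;
    if (flushThreshold > 0 && bytesSinceFlush >= flushThreshold) {
      flush();
    }
  }

  @Override
  public void flush() {
    sink.flush();
    bytesSinceFlush = 0;
  }
}

final class AutoFlushDemo {
  // Records what the sink sees for three writes with a threshold of 8 bytes.
  static List<String> run() {
    List<String> events = new ArrayList<>();
    OutboundSink recording = new OutboundSink() {
      @Override public void write(byte[] message) { events.add("write:" + message.length); }
      @Override public void flush() { events.add("flush"); }
    };
    AutoFlushingOutputStream out = new AutoFlushingOutputStream(recording, 8);
    out.write(new byte[5], 0, 5);
    out.write(new byte[5], 0, 5); // crosses the threshold, so a flush follows
    out.write(new byte[2], 0, 2);
    return events;
  }
}
```

Passing a threshold of zero or less in this sketch corresponds to the “unless you don’t want it to be” case: nothing is flushed until someone calls flush() explicitly.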

So now we have connected the dots: a ContainerRequest goes to Jersey, which processes it.  Jersey writes to an OutputStream we provide that itself bridges to the actual Netty channel pipeline.  And downstream in the pipeline we have a Netty-supplied HttpResponseEncoder that accepts the HttpResponse and HttpContent objects we emit.


Threading

Now let’s talk about threads.  In Netty, as noted, events coming up the pipeline (events being read, events being processed by ChannelInboundHandlers) are executed by the event loop: a thread that is devoted to processing what amount to socket events.  It is very important to let these event loop threads run as fast and free as possible.  We’ve already talked about how a Jersey application should not, therefore, be run on the event loop, because you don’t know what it is going to do.  Will it sleep?  Will it run a monstrous blocking database query?  You don’t know.  More concretely, this means that our ChannelInboundHandler-that-is-also-a-ContainerResponseWriter must not execute ApplicationHandler#handle(ContainerRequest) on the event loop.

Most other projects that integrate Jersey with Netty in some way use a thread pool or Jersey’s own schedulers to do this.  But they overlook the fact that Netty lets you do this more natively.  This native approach is the one that microBean™ Jersey Netty Integration has taken.

First, let’s just note that the only mention that we’ve made so far of anything in the integration machinery that could block is the special InputStream that is supplied to a ContainerRequest as its entity stream.  We mentioned that this InputStream gets installed on a ContainerRequest as a kind of empty pipe before (potentially) the entire incoming payload has been received.  Therefore, Jersey might start reading from it before there are bytes to read, and indeed, that InputStream implementation will block in that case, by contract, until the downstream (or “leftstream”) decoder catches up and fills the pipe with other HttpContent messages.

But you’ll note that otherwise we haven’t made mention of anything like a ConcurrentHashMap or a LinkedBlockingQueue or anything from the java.util.concurrent.locks package.  That’s on purpose.  To understand how microBean™ Jersey Netty Integration gets away with this minimal use of blocking constructs, we have to revisit the ChannelPipeline.

When you add a ChannelHandler to a pipeline—when, as a Netty developer, you build your pipeline in the first place—you typically use various flavors of the ChannelPipeline#addLast method.  This appends your ChannelHandler in question to the tail of the pipeline as you might expect.  And then all the event handling we’ve talked about flows through the pipeline in the way that we’ve talked about it.

But note that there’s another form of the ChannelPipeline#addLast method that takes an EventExecutorGroup!

In this form, if you supply a new DefaultEventExecutorGroup as you add a ChannelHandler, then its threads will be those that run your ChannelHandler’s event-handling methods, and not those of the event loop!  So all you have to do to get the Jersey ApplicationHandler#handle(ContainerRequest) method to be run on a non-event loop thread is to set up your pipeline using this variant of the ChannelPipeline#addLast method, supplying a DefaultEventExecutorGroup.  Then whatever the JAX-RS or Jakarta RESTful Web Services Application does (slow database access, Thread.sleep() calls…) will not block the event loop.

Now, another tenet of the Netty framework is that Thou Shalt Not Write to the Pipeline Except on the Event Loop.  So if our ApplicationHandler#handle(ContainerRequest) method is being run on a non-event-loop thread, then don’t we have to do something to “get back on” the event loop thread when our OutputStream implementation calls ChannelOutboundInvoker#write(Object, ChannelPromise)?

As it turns out, no: since a ChannelOutboundInvoker’s whole job is to “do” IO, it always ensures that these operations take place on the event loop.  In other words, even though our Jersey application is correctly running its workload on a non-event-loop thread, when our special OutputStream invokes ChannelOutboundInvoker#write(Object, ChannelPromise), the implementation of that method will ensure that the write takes place on the event loop by enqueuing a task on the event loop for us.
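The shape of that thread hand-off can be modeled with plain JDK executors.  This is only an analogy, not Netty code: a single-threaded executor named “event-loop” plays the event loop, a second one named “application” plays the DefaultEventExecutorGroup, and the explicit execute() call back onto the event loop mimics what Netty does for you inside the write.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadHopDemo {

  // Returns { name of the thread that ran the application,
  //           name of the thread that performed the write }.
  static String[] run() {
    // Stand-in for Netty's event loop: one thread with a task queue.
    ExecutorService eventLoop =
      Executors.newSingleThreadExecutor(r -> new Thread(r, "event-loop"));
    // Stand-in for the executor group supplied when the handler is added.
    ExecutorService applicationGroup =
      Executors.newSingleThreadExecutor(r -> new Thread(r, "application"));
    try {
      CompletableFuture<String[]> result = new CompletableFuture<>();
      // An inbound event shows up on the event loop...
      eventLoop.execute(() ->
        // ...but the handler's work is run by the other group...
        applicationGroup.execute(() -> {
          String applicationThread = Thread.currentThread().getName();
          // ...and the eventual write is enqueued back onto the event loop.
          eventLoop.execute(() ->
            result.complete(new String[] {
              applicationThread, Thread.currentThread().getName()
            }));
        }));
      return result.join();
    } finally {
      applicationGroup.shutdown();
      eventLoop.shutdown();
    }
  }
}
```

In the sketch, the application work happens on the “application” thread while the write happens on the “event-loop” thread, and at no point does the event loop wait for the application to finish.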

To put it one final other way, if you have introduced queues of any kind or homegrown thread pools into your Netty integration machinery—other than the minimal amount of blocking necessary for adapting incoming entity payloads into InputStreams as previously discussed—you’re doing it wrong, because Netty already has them.

Conclusion

There is a lot more to this library than I’ve covered here, including HTTP/2 support.  I encourage you to take a look at its GitHub repository and get involved.  Thanks for reading!

Jersey and Netty Together Again For The First Time Once More

I had some time and put together a Jersey-and-Netty integration that I think is pretty interesting.

Its GitHub repository is here: https://github.com/microbean/microbean-jersey-netty.  Its website is here: https://microbean.github.io/microbean-jersey-netty/.

It lets you run Jersey as a Java SE application using Netty to front it.  It supports HTTP and HTTP/2, including upgrades.

Jersey itself ships with a Netty integration, but it seems to have some problems and parts of it are designated by its author as very experimental.  I wanted to see if I could do a decent integration myself, both to learn more about Netty and to solve a real problem that I’ve had.

The main challenge with Netty is to ensure that its event loop is never blocked.  But the very nature of JAX-RS, with its InputStreams supplying inbound payloads, means that some blocking in general is necessary, so immediately you’re talking about offloading that blocking onto extra threads or executors to free up the event loop, and therefore coordination of IO activity between the Jersey side of the house and the Netty side of the house.

This in itself is not terribly difficult on the face of it and can be addressed in many different ways.  The Jersey-supplied integration accomplishes this by passing InputStreams to Jersey using blocking queues.  This is fine, but now the integration author has to deal with queue capacity, and rejecting submissions, and so forth.  As you might expect there is at least one issue around this that turns out to be somewhat severe (apparently).  This also involves creating a fair number of objects.

But of course Netty already has a really nice system of queues that it uses to do this same sort of thing, and you can easily get access to it: it’s the event loop itself, which lets you submit things to it.

Netty also has its ByteBuf construct, which is a better, cheaper, faster ByteBuffer.  ByteBufs are sort of Netty’s “coin of the realm”.  Netty goes to extraordinary lengths to ensure a minimum of object allocations and garbage generation occur when you’re working with ByteBufs, so they seem like a good thing to center any integration strategy around.  They are not thread-safe, but if you mutate them only on the event loop, you’re good.

So the general approach I take is: instead of making extra queues to shuttle byte arrays or other Java-objects-wrapping-byte-arrays back and forth between Jersey land and Netty land, I use a CompositeByteBuf that gets the ByteBuf messages that Netty supplies in its HTTP message constructs added to it as they come in on the Netty event loop, and use the Netty event loop task queue to ensure that all ByteBuf operations of any kind always take place on the event loop.

This means that I can take advantage of everything Netty gives me under the covers in terms of memory efficiency and potential transparent usage of off-heap buffers and such, while also gleefully punting any task queue saturation issues to Netty itself, which already has customizable strategies for dealing with them.  A lower chance of bugs for you, since Netty has to deal with this sort of problem all day every day, and a lower chance of bugs for me, since it is their code, not mine, doing this coordination.  Win-win!

On the outbound side, I take advantage of Netty’s ChunkedWriteHandler, which, contrary to how it might appear at first, has nothing to do with Transfer-Encoding: chunked.  Instead, it is a nice little handler that deals with user-supplied-but-Netty-managed hunks of arbitrarily-typed data, writing them when their contents are available, and doing other things when it can’t.  The upshot: your Jersey OutputStreams are chunked up and enqueued on the event loop using a single ByteBuf implementation that is optimized for IO as data is written to them.

The net effect is a nice little as-nonblocking-as-JAX-RS-can-get tango that Netty and Jersey perform together, coordinated by Netty’s and Jersey’s own constructs.

microBean™ Jersey Netty has extensive Javadocs and is available on Maven Central.

A CDI Primer: Part 4

In the previous post, we learned that the only Contextuals that matter, really, are Beans, and we learned a little bit about injection points and loosely how they’re filled.

In this post we’ll look at how CDI actually discovers things, and how it normalizes them into Bean implementations, and how it marries them with Context implementations to figure out what their lifecycles are going to be.

The Container Lifecycle

CDI has a very well-specified lifecycle.  There is a general startup phase during which all the raw materials in the world are discovered, arranged and pared down to form a network of Bean implementations that produce and consume each other.  Once that startup phase has completed, the actual application—whatever it might be, that makes use of all these producers and consumers—starts, and runs, and does its thing.  Then there is a shutdown free-for-all where all Bean implementations can, indirectly, get a chance to clean up, and that’s all there is to it.

But what are you, the end-user, the ordinary developer, supposed to do with all this?  In your hand, you have a class that has some injection points in it (fields or methods annotated with Inject), and you want them filled.  What does this have to do with Bean implementations?  After all, your class doesn’t implement Bean.  We’ll see how internally, in a way, it sort of does.

Discovering Things

As part of the startup phase, CDI performs type discovery and bean discovery.

Type discovery is a fancy name for taking stock of what classes exist on the classpath.  Plugins, called portable extensions, can affect this; that’s a subject for another day.  For now, know that CDI will scan the classpath for particular classes and will add them to a set of such classes that represents its type world.

Once types have been discovered, CDI performs bean discovery, where some of those types it found lying around are normalized into Bean implementations internally.

It’s important to realize what’s going on here.  For the typical developer scenario, most classes that are discovered are turned into producers.  This can be a little counterintuitive—I don’t know about you, but that’s not usually what I think of when I have a class in my hand named with a noun, like Person or Customer.  Really?  Customer is a producer?  A producer of what?  Well, of instances of itself, it turns out.  Which of course we all actually know, because it has a constructor.

Managed Beans

So consider our ongoing stupid example of a Backpack class, with a simple zero-argument constructor, annotated with Singleton, and a Gorp-typed field annotated with Inject.  Let’s say that CDI discovers it.  Internally during bean discovery CDI creates a Bean to represent it:
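As a rough sketch of what such a Bean amounts to, here is a JDK-only approximation.  SketchBean is a much-reduced, hypothetical stand-in for the real Bean interface (whose create method also takes a CreationalContext), the Singleton annotation is a local placeholder for the real one, and injection is faked by hand inside create().

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Type;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Local placeholder for the real Singleton scope annotation.
@Retention(RetentionPolicy.RUNTIME)
@interface Singleton {}

class Gorp {}

@Singleton
class Backpack {
  Gorp gorp; // annotated with Inject in the real class
  Backpack() {}
}

// Hypothetical, much-reduced stand-in for the real Bean interface.
interface SketchBean<T> {
  Class<?> getBeanClass();                // the class "hosting" the producer
  Set<Type> getTypes();                   // the bean types: what it can make
  Class<? extends Annotation> getScope(); // which Context manages instances
  T create();                             // the real method takes a CreationalContext
}

// Roughly what the container might build internally for Backpack.
final class BackpackBean implements SketchBean<Backpack> {

  @Override
  public Class<?> getBeanClass() {
    return Backpack.class;
  }

  @Override
  public Set<Type> getTypes() {
    return new LinkedHashSet<>(Arrays.<Type>asList(Backpack.class, Object.class));
  }

  @Override
  public Class<? extends Annotation> getScope() {
    return Singleton.class;
  }

  @Override
  public Backpack create() {
    Backpack backpack = new Backpack(); // the zero-argument constructor is the producer
    backpack.gorp = new Gorp();         // a real container fills this from another Bean
    return backpack;
  }
}
```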

A Bean implementation built internally like this is called a managed bean, and the class itself is frequently called a managed bean as well.

It’s important to realize that for a given managed bean one of its bean types and its bean class are always identical.

Producer Methods

Another very common kind of bean is a producer method.  In this case, CDI has done its type discovery, and has found a class called CandyShop also annotated with Singleton (I’m picking Singleton because it is a scope that everyone is intuitively familiar with).  Let’s say this class has a method in it declared like this:

@Produces
@Singleton
public Candy produceCandy() {
  return new Candy();
}

This time, CDI will create a Bean internally for CandyShop that looks a lot like the one above.

But then it will also create a Bean internally that represents the producer method:
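Approximated with JDK-only code, the Bean for the producer method might look something like the following.  SketchBean is a hypothetical, much-reduced stand-in for the real Bean interface, the Singleton annotation is a local placeholder, and the CandyShop instance is passed in by hand where a real container would obtain it as a contextual instance.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Type;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Local placeholder for the real Singleton scope annotation.
@Retention(RetentionPolicy.RUNTIME)
@interface Singleton {}

class Candy {}

@Singleton
class CandyShop {
  // Annotated with Produces and Singleton in the real class.
  public Candy produceCandy() {
    return new Candy();
  }
}

// Hypothetical, much-reduced stand-in for the real Bean interface.
interface SketchBean<T> {
  Class<?> getBeanClass();
  Set<Type> getTypes();
  Class<? extends Annotation> getScope();
  T create();
}

// Roughly what the container might build for the produceCandy() method.
final class ProduceCandyBean implements SketchBean<Candy> {

  private final CandyShop shop; // the instance whose method gets invoked

  ProduceCandyBean(CandyShop shop) {
    this.shop = shop;
  }

  @Override
  public Class<?> getBeanClass() {
    return CandyShop.class; // where the producer method lives
  }

  @Override
  public Set<Type> getTypes() {
    // What the producer method makes: note the absence of CandyShop.class.
    return new LinkedHashSet<>(Arrays.<Type>asList(Candy.class, Object.class));
  }

  @Override
  public Class<? extends Annotation> getScope() {
    return Singleton.class;
  }

  @Override
  public Candy create() {
    return shop.produceCandy();
  }
}
```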

Do you see that the bean class here is CandyShop but the getTypes() method returns a Set of bean types that includes Candy.class, not CandyShop.class?  The takeaway here is that usually when you’re considering a true CDI bean you are, as a normal developer, interested in one of its bean types, not its bean class.  In other words, you’re interested in (in this example) Candy.class, and normally not so interested in the fact that the producer method that creates its instances happens to be housed in a class called CandyShop.  The terminology in the documentation certainly does its best to make this about as clear as mud.

Producer Fields

Just as you can have a producer method whose bean class is one thing while its bean types are another, you can have a producer field.  Let’s say our CandyShop class from the example above also has a field in it declared like this:

@Produces
@Singleton
private PeanutButter peanutButter;

Once again, CDI will create a Bean for CandyShop, just as it did before, and here it will also create another Bean:
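A JDK-only approximation of the producer field’s Bean follows.  As before, SketchBean and the Singleton annotation are hypothetical stand-ins for the real types, and the field is package-private here (rather than private) purely so that the sketch can read it without reflection.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Type;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Local placeholder for the real Singleton scope annotation.
@Retention(RetentionPolicy.RUNTIME)
@interface Singleton {}

class PeanutButter {}

class CandyShop {
  // Annotated with Produces and Singleton in the real class, and typically
  // set by some other machinery; initialized inline here to keep things short.
  PeanutButter peanutButter = new PeanutButter();
}

// Hypothetical, much-reduced stand-in for the real Bean interface.
interface SketchBean<T> {
  Class<?> getBeanClass();
  Set<Type> getTypes();
  Class<? extends Annotation> getScope();
  T create();
}

// Roughly what the container might build for the peanutButter field.
final class PeanutButterFieldBean implements SketchBean<PeanutButter> {

  private final CandyShop shop;

  PeanutButterFieldBean(CandyShop shop) {
    this.shop = shop;
  }

  @Override
  public Class<?> getBeanClass() {
    return CandyShop.class; // where the producer field lives
  }

  @Override
  public Set<Type> getTypes() {
    return new LinkedHashSet<>(Arrays.<Type>asList(PeanutButter.class, Object.class));
  }

  @Override
  public Class<? extends Annotation> getScope() {
    return Singleton.class;
  }

  @Override
  public PeanutButter create() {
    return shop.peanutButter; // "creation" is just reading the field
  }
}
```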

In practice, producer fields are comparatively rare.  They exist primarily to bridge the worlds of Java EE, where you might have a field that is annotated with Resource (so is being set by some kind of JNDI-aware machinery), and CDI where you can have the very same field annotated with Produces.  This lets CDI set up a Bean implementation under the covers that creates instances from Java EE-land without any other aspects of the CDI ecosystem really being aware that Java EE is even in the picture.

Decorators and Interceptors

Two other primary Bean implementations are Decorators and Interceptors.  I’m not going to talk about them because, although it may look like it by this point, I’m not writing a book.  Suffice it to say they are represented in the CDI ecosystem by Beans, and so are inherently producers of things, which can also be counterintuitive.

Custom Beans

Finally, without digging deep into portable extensions, we can at least say that portable extensions have a variety of ways to cause a Bean implementation to be created by hand at container startup time.  When Beans are added into the system this way, the portable extension has absolute full control over them.

Putting It Together

The biggest takeaway here is that everything in CDI is a Bean.  In the documentation, almost everything is a little-b bean.  A bean, represented by a Bean, is fundamentally a producer of something.  There are recipes that we’ve just enumerated that translate ordinary Java constructs into Bean implementations when bean discovery is performed: a creator and a destroyer (Contextual) together with bean types and a scope (BeanAttributes), a hosting bean class and a set of injection points (Bean).

There’s another thing that falls out of this.  Bean is a parameterized type: you have a Bean<T>.  The <T> is, however, always usage-specific, because it comes by way of Contextual, whose usage of its type parameter is the return type from its create method.  That is, <T> represents one of a Bean’s bean types (one of the kinds of things it can make), not its bean class (the class hosting the producer method or constructor).

Consider a Bean<Person>.  Because a Bean<Person> is a Contextual<Person>, it follows that you can call its create method and get a Person back.  But it does not follow that calling its getBeanClass() method will return Person.class!  Perhaps the Bean<Person> you are looking at represents a producer method, or a producer field, for example.

Finally

So finally we can see how CDI makes @Inject work, which was what we set out to do in part 0:

  • CDI discovers types
  • For each type found during type discovery, CDI creates a Bean to represent it, or at least tries to
  • For each producer method and producer field (and other cases), CDI creates another Bean to represent it
  • For every InjectionPoint found in this pile of Beans, CDI performs typesafe resolution to see what Beans have bean types that “match” the InjectionPoint’s type
  • Assuming that all slots get matched up to producers, whenever CDI needs to obtain an instance of something, it does so through the right Context which is in the business of handing out contextual instances
  • CDI can find the right Context by checking out the scope available from each Bean
  • Ultimately the Context, when asked to produce an instance of something, will call through to a Bean’s create method
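The last few steps of that list can be sketched in a few lines of JDK-only Java.  Everything here is a deliberately crude model: SuppliedBean and SketchContainer are invented names, qualifiers are ignored, and a single map plays the part of one singleton-like Context, where a real container would pick the Context from each Bean’s scope.

```java
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

class Candy {}

// Hypothetical, much-reduced stand-in for a Bean: some bean types plus a way
// to create an instance.
final class SuppliedBean<T> {

  private final Supplier<? extends T> producer;
  private final Set<Type> types;

  SuppliedBean(Supplier<? extends T> producer, Type... types) {
    this.producer = producer;
    this.types = new LinkedHashSet<>(Arrays.asList(types));
  }

  Set<Type> getTypes() {
    return types;
  }

  T create() {
    return producer.get();
  }
}

final class SketchContainer {

  private final List<SuppliedBean<?>> beans = new ArrayList<>();

  // Plays the part of one singleton-like Context: one instance per Bean.
  private final Map<SuppliedBean<?>, Object> instances = new HashMap<>();

  void add(SuppliedBean<?> bean) {
    beans.add(bean);
  }

  // Typesafe resolution, minus qualifiers: exactly one Bean must match.
  SuppliedBean<?> resolve(Type requiredType) {
    List<SuppliedBean<?>> matches = new ArrayList<>();
    for (SuppliedBean<?> bean : beans) {
      if (bean.getTypes().contains(requiredType)) {
        matches.add(bean);
      }
    }
    if (matches.size() != 1) {
      throw new IllegalStateException(matches.size() + " beans match " + requiredType);
    }
    return matches.get(0);
  }

  // The "Context" hands out the contextual instance, calling through to the
  // Bean's create method only the first time.
  <T> T getInstance(Class<T> requiredType) {
    SuppliedBean<?> bean = resolve(requiredType);
    Object instance = instances.get(bean);
    if (instance == null) {
      instance = bean.create();
      instances.put(bean, instance);
    }
    return requiredType.cast(instance);
  }
}
```

With a single SuppliedBean registered for Candy.class, asking the container for a Candy twice yields the same instance both times, while asking for a type that no Bean produces fails resolution with an exception.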

I’ve deliberately left out qualifiers, which are extremely important, but are deserving of their own article later.

There’s much more to say about CDI, but I think I’ll stop there for now.

Thanks for reading this primer!