MicroProfile Config and Other Configuration Musings

Disclaimer: I work for Oracle but never speak for them when I can help it, and especially not on this blog.

Here are some unstructured musings on MicroProfile Config and configuration in Java in general.

Extremely Quick MicroProfile Config Overview

MicroProfile Config is a Java-centric, read-only configuration library. At its most fundamental level, you ask it for a named textual value, tell the system what Java type in your program the textual value should have for this request, and get that typed value back. For example, you might ask the system for a value corresponding to the name wheel.circumference, and specify that it should be returned as an Integer:

final Integer wheelCircumference = config.getValue("wheel.circumference", Integer.class);

Then, unbeknownst to the requesting application developer, MicroProfile Config uses at least three different abstractions to hide where this named scalar value comes from and how it is muscled into its final typed form.

Incidentally, by “scalar”, I mean, very loosely, something like very simple textual values. That is, the atomic nugget of information in MicroProfile Config is individual “flat” strings named by opaque names. At the lowest level, you are always asking MicroProfile Config for individual text nuggets, not anything more complicated. Everything else is built on top of that.

For example, MicroProfile Config does layer some Java Beans-like object building facilities on top of this scalar value manufacturing system. You can sprinkle some annotations around and cause a Wheel object to come into being with getCircumference() and getSpokes() accessors, but this Wheel object will in such a case always be produced out of the scalar values for (say) wheel.circumference and wheel.spokes with appropriate data type conversion applied (a primitive form of part of the concept of object binding). I’ll term these kinds of values made up of multiple other values composites: the Wheel type is a composite; its individual properties here all happen to be scalars.

So much for a quick overview of MicroProfile Config.

Of Names And (Typed) Values

One of the problems you have when you have any scalar-centric key-value system like MicroProfile Config is: if you ask for something by name, then you have to determine what the (Java) type of that thing is in order to use it. Either there is exactly one type (for example, System.getenv("foobar") always yields a String), or you have to somehow know what the type of the named thing is, potentially applying appropriate type conversions along the way (for example, System.getProperties().get("foobar") returns an Object; it is now up to you to figure out what to do with that value).

MicroProfile Config forces the application developer to specify the type during lookup (e.g. config.getValue("wheel.circumference", Integer.class)). (There’s no way to learn if it makes sense to type a wheel.circumference property as an Integer; you just kind of have to know that it is. Or if you want to look at it the other way, equivalently, the configuration author has to use common sense to figure out what kind of usage her textual configuration nugget will have within any arbitrary application. This is of course a problem, and we’ll talk about it later.)

One of the questions that immediately arises is: suppose the configuration information is fundamentally textual in nature. Who converts "27" into (effectively) Integer.valueOf(27)? (What if the configuration value is "27 inches"?)

MicroProfile Config has the concept of a stateless Converter that accepts a String and converts it to a particular type if possible. There can be user-supplied Converters and built-in ones. All of them are overridable. So somewhere in the system there will be a Converter that can convert Strings to Integers, so it will kick in and the application developer will get back a value equal to (say) Integer.valueOf(27). So far so good.
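To make the conversion step concrete, here is a minimal sketch of a type-keyed converter lookup using only the JDK. The class and method names (ConverterSketch, register, convert) are hypothetical and are not MicroProfile Config's API; the sketch only illustrates the idea of "find the function that turns text into the requested type, then apply it".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// A deliberately tiny, hypothetical stand-in for a converter registry.
// None of these names come from MicroProfile Config itself.
final class ConverterSketch {

    // Maps a target type to a function that turns text into that type.
    private final Map<Class<?>, Function<String, ?>> converters = new HashMap<>();

    <T> void register(Class<T> type, Function<String, ? extends T> converter) {
        converters.put(type, converter);
    }

    <T> T convert(String text, Class<T> type) {
        Function<String, ?> converter = converters.get(type);
        if (converter == null) {
            throw new IllegalArgumentException("No converter for " + type.getName());
        }
        // Safe because register() only ever pairs a type with a matching converter.
        return type.cast(converter.apply(text));
    }

    public static void main(String[] args) {
        ConverterSketch sketch = new ConverterSketch();
        sketch.register(Integer.class, Integer::valueOf);

        System.out.println(sketch.convert("27", Integer.class)); // 27

        try {
            sketch.convert("27 inches", Integer.class);
        } catch (NumberFormatException e) {
            // The text does not fit the requested type, and this only surfaces at runtime.
            System.out.println("Cannot convert: " + e.getMessage());
        }
    }
}
```

Note that the "27 inches" case is not a registry problem at all: a converter exists, it just cannot make sense of the text.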

Of course, if the application developer calls config.getValue("wheel.circumference", java.awt.Point.class), there may be no such Converter so she may be out of luck, but won’t learn this until runtime. So, of course, she can supply her own Converter that converts Strings to Points, and hope that it is used (and not overridden deliberately or accidentally by some other component’s Converter that also converts Strings to Points but in a different way, by reading a different kind of textual format, for example).
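Here is a rough sketch of how that accidental override might play out. Everything in it is hypothetical: the nested Point record stands in for java.awt.Point, the two textual formats are invented, and the rule that the highest priority wins is an assumption made for the sketch rather than a statement about any particular implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

final class CompetingConverters {

    // Hypothetical stand-in for java.awt.Point, so the sketch has no desktop dependency.
    record Point(int x, int y) {}

    // A converter plus the priority it was registered with (hypothetical type).
    record Registration(int priority, Function<String, Point> converter) {}

    private final List<Registration> registrations = new ArrayList<>();

    void register(int priority, Function<String, Point> converter) {
        registrations.add(new Registration(priority, converter));
    }

    Point convert(String text) {
        // Assumption for this sketch: the highest priority wins.
        return registrations.stream()
            .max(Comparator.comparingInt(Registration::priority))
            .orElseThrow(() -> new IllegalStateException("No Point converter registered"))
            .converter()
            .apply(text);
    }

    // Reads "x,y".
    static Point commaSeparated(String text) {
        String[] parts = text.split(",");
        return new Point(Integer.parseInt(parts[0].trim()), Integer.parseInt(parts[1].trim()));
    }

    // Reads "y;x": a different textual format from some other component.
    static Point semicolonReversed(String text) {
        String[] parts = text.split(";");
        return new Point(Integer.parseInt(parts[1].trim()), Integer.parseInt(parts[0].trim()));
    }

    public static void main(String[] args) {
        CompetingConverters sketch = new CompetingConverters();
        sketch.register(100, CompetingConverters::commaSeparated);
        System.out.println(sketch.convert("3,4"));

        // Another library quietly registers its own converter with a higher priority.
        sketch.register(200, CompetingConverters::semicolonReversed);
        try {
            sketch.convert("3,4");
        } catch (RuntimeException e) {
            System.out.println("Same text, different converter: " + e);
        }
    }
}
```

The application developer's call site is identical both times; only the set of registered converters changed underneath her.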

This is where we start talking about the fact that Converters can have priorities and are sorted in a particular way, can be packaged with any library or container, and discovered from various locations, and—

But let’s back way up.

Why is the application developer looking up values by arbitrary names in the first place? Is that really what she wants to do? Isn’t this just an unfortunate means to an end? Take some time to think about this.

Suppose she were magically able to call a typed method on a hypothetical configuration object sourced from a hypothetical configuration system instead:

final Integer circumference = HypotheticalConfiguration.get().getWheel(FRONT).getCircumference();

From her point of view, how the circumference of the front wheel is found or produced or manufactured by the configuration system is entirely immaterial. Maybe it was deserialized. Maybe it was a Java Record that got assembled out of its record components somehow. Maybe it fell from the sky. Maybe Jackson kicked in and did object binding from JSON. Maybe it’s a JPA entity retrieved, already constructed and converted, from the database. Maybe it sprang full-fledged from an underlying directory system. Maybe it was built by a JNDI ObjectFactory. Maybe it resulted from a storm of Java Beans PropertyEditor operations. She doesn’t know, and she doesn’t care. Nor does she have to worry about accidentally making typos in configuration value names, or about supplying the wrong data conversion type. If you look carefully, her use case does not mention the production side of things at all. There’s a wheel circumference, you get it by calling getCircumference() on a Wheel object, and it returns an Integer. The end.

Remember, she is an application developer, so she can (and ultimately must) express her requirements using the Java language—as she does here. Here, she defines the composite Wheel type to feature whatever properties of whatever type she wishes. There is no need to “go after” those properties individually as textual scalars by some kind of name. It is up to whatever “back end” of the hypothetical configuration system to make whatever is needed, however it wants to, which may involve textual scalars and object binding or may not. Those kinds of problems are now pushed off entirely into the domain where they properly belong: that of production, object binding and instantiation, which are none of her concerns. But more on that aspect a little later.
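As a sketch of what that could look like from the consumption side, here is a small JDK-only example. All of the type names (BicycleConfiguration, Wheel, Position) are made up for illustration, and the in-memory back end is just one arbitrary way the objects might come into being.

```java
import java.util.Map;

final class TypedConfigurationSketch {

    enum Position { FRONT, REAR }

    // The application developer's requirements, expressed as plain Java types.
    interface Wheel {
        Integer getCircumference();
        Integer getSpokes();
    }

    interface BicycleConfiguration {
        Wheel getWheel(Position position);
    }

    // Application code: no names, no conversion, no knowledge of where values come from.
    static String describeFrontWheel(BicycleConfiguration configuration) {
        Integer circumference = configuration.getWheel(Position.FRONT).getCircumference();
        return "Front wheel circumference: " + circumference;
    }

    // One of many possible back ends; here the objects are simply built in memory.
    static BicycleConfiguration inMemoryBackEnd() {
        Map<Position, Wheel> wheels = Map.of(
            Position.FRONT, wheel(27, 32),
            Position.REAR, wheel(27, 36));
        return wheels::get;
    }

    private static Wheel wheel(int circumference, int spokes) {
        return new Wheel() {
            @Override public Integer getCircumference() { return circumference; }
            @Override public Integer getSpokes() { return spokes; }
        };
    }

    public static void main(String[] args) {
        System.out.println(describeFrontWheel(inMemoryBackEnd()));
    }
}
```

Misspelling getCircumference() or expecting a String from it is a compile error here, not a runtime surprise.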

So why are there these arbitrarily-typed key-value Java configuration systems, if there are so many opportunities to make mistakes with them?

For environment variables, of course, the answer is simple: environment variables can be passed to programs of any type, not just Java, so they are expressed in the lowest common denominator of the operating system: simple humble strings, so that’s what you ask for as an application developer, and exactly what you receive.

And something like System.getProperty("foobar") is a convenient wrapper around (String) Hashtable.get("foobar") because system properties can be supplied on the command line where, again, they must be strings. (However, note that it is possible for someone to install a Glorp-typed object into the Properties object returned by System.getProperties() under the name “foobar”, in which case System.getProperty("foobar") will of course fail to hand it back: it returns null, as though nothing were there. There’s our old friend type conversion again and all the problems it entails.)
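A small sketch of that Glorp situation, using a private java.util.Properties instance rather than the real system properties so nothing global is disturbed. Glorp is of course a made-up type. The point is that the lookup does not throw; the string-typed accessor simply acts as though the entry were absent.

```java
import java.util.Properties;

final class NonStringProperty {

    // Hypothetical type standing in for the "Glorp" in the text.
    static final class Glorp {}

    public static void main(String[] args) {
        // A private Properties instance, so the real system properties stay untouched.
        Properties properties = new Properties();

        // Legal, because Properties is a Hashtable<Object, Object> underneath.
        properties.put("foobar", new Glorp());

        // The untyped lookup sees the object...
        System.out.println(properties.get("foobar") instanceof Glorp); // true

        // ...but the String-typed lookup ignores values that are not Strings.
        System.out.println(properties.getProperty("foobar")); // null
    }
}
```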

And for systems like JNDI or java.util.Preferences that allow for reading and binding, you need to make sure that the application developer can write things into the abstracted directory that other programs written in other languages can read. So scalars are the way to go here.

But for Java-only, read-only configuration? Where the application developer doesn’t have to write anything “back” to the configuration system? There is nothing in her use cases that requires untyped lookups by arbitrary name. It is, to be sure, a pattern that we’ve all gotten used to because we all look up system properties and environment variables by arbitrary name, and system properties and environment variables are some of the things that configuration systems usually abstract over, but, again, there’s no requirement that this be so of a configuration system that supports application developer use cases.

Back to MicroProfile Config for a moment.

In MicroProfile Config, you can inject a composite object like this (though I don’t think you can look one up programmatically?):

// Somewhere in your program
@Inject
@ConfigProperties
@Front
private Wheel wheel;

// …provided that somewhere else you also have something like:
@ConfigProperties(prefix = "wheel")
@Front
public interface Wheel {
  public Integer getCircumference();
  public Integer getSpokes();
}

So they recognize the utility of composite configuration objects even if it’s clumsy for them to make one.

Now wheel in the example above will receive an instance of Wheel whose getCircumference() method will return an Integer created from the wheel.circumference textual property as converted by some Converter<Integer> somewhere (likewise, the getSpokes() method will return an Integer created from the wheel.spokes textual property as converted by some Converter<Integer> somewhere).
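One way to picture that kind of binding is with a JDK dynamic proxy. This is a sketch under loud assumptions: the bind method is hypothetical, it handles only Integer-returning getters, and it is not a description of how any MicroProfile Config implementation actually builds such objects. It only shows the shape of "prefix plus decapitalized getter name gives the scalar name, then convert".

```java
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.function.Function;

final class PrefixBindingSketch {

    interface Wheel {
        Integer getCircumference();
        Integer getSpokes();
    }

    // Hypothetical binder: NOT how any MicroProfile Config implementation is known to work.
    static <T> T bind(Class<T> type, String prefix, Map<String, String> scalars) {
        // Only Integer conversion is supported in this sketch.
        Function<String, Integer> toInteger = Integer::valueOf;
        Object proxy = Proxy.newProxyInstance(
            type.getClassLoader(),
            new Class<?>[] { type },
            (instance, method, arguments) -> {
                String name = method.getName();
                if (!name.startsWith("get") || name.length() == 3
                        || method.getReturnType() != Integer.class) {
                    throw new UnsupportedOperationException(name);
                }
                // getCircumference -> circumference -> wheel.circumference
                String property = Character.toLowerCase(name.charAt(3)) + name.substring(4);
                String key = prefix + "." + property;
                String text = scalars.get(key);
                if (text == null) {
                    throw new IllegalStateException("Missing configuration value: " + key);
                }
                return toInteger.apply(text);
            });
        return type.cast(proxy);
    }

    public static void main(String[] args) {
        Map<String, String> scalars = Map.of("wheel.circumference", "27", "wheel.spokes", "32");
        Wheel wheel = bind(Wheel.class, "wheel", scalars);
        System.out.println(wheel.getCircumference() + " " + wheel.getSpokes()); // 27 32
    }
}
```

Notice how the naming convention lives inside the binder, yet both the interface author and the configuration author have to know it for the pieces to line up.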

This is nicer for the application developer, because now she doesn’t have to worry about spelling out scalar names (typos!) and (often non-discoverable) data conversions (oops, wrong type) herself.

But even here it strikes me as bizarre that MicroProfile Config leaks concerns from one domain into another. Specifically, it leaks object binding concerns (the fiddly bits about how a Wheel is produced, including the prefix element of the strangely-named ConfigProperties annotation) into the application developer’s usage domain. She doesn’t give a flying hoot about how a Wheel is produced, but oddly here she has to indicate binding information herself even though binding isn’t her concern. On top of that, one of two things has to happen: either the configuration author has to make sure that her named textual configuration nuggets have exactly the names dictated by the object binding requirements (e.g. drop “get”, lowercase the first character of what follows), or the application developer has to make sure that her composite object has accessors that line up with the configuration author’s textual configuration nugget names.

Why not leave binding concerns where they belong (on the “production” side), and usage concerns where they belong (on the “consumption” side)?

On DropWizard

Let’s take a hopefully relevant detour into DropWizard land for a moment. Disclaimer: I have no affiliation with or opinion about DropWizard; it is just an example of a larger point I want to make.

In DropWizard, a master configuration object is handed to you. You can’t escape it. You must define such a configuration object. Here’s what I mean; look for where HelloWorldConfiguration shows up:

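import io.dropwizard.Application;
import io.dropwizard.setup.Bootstrap;
import io.dropwizard.setup.Environment;

public class HelloWorldApplication extends Application<HelloWorldConfiguration> {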
    public static void main(String[] args) throws Exception {
        new HelloWorldApplication().run(args);
    }

    @Override
    public String getName() {
        return "hello-world";
    }

    @Override
    public void initialize(Bootstrap<HelloWorldConfiguration> bootstrap) {
        // nothing to do yet
    }

    @Override
    public void run(HelloWorldConfiguration configuration,
                    Environment environment) {
        // nothing to do yet
    }
}

When your HelloWorldApplication starts (via its main method), the run method will eventually be called, and you can see that it is supplied with a master configuration object (in this case a HelloWorldConfiguration object). How did it get there? It doesn’t matter.

Here’s what the HelloWorldConfiguration object might look like (I’ve cribbed this whole thing from the DropWizard documentation itself):

import io.dropwizard.Configuration;
import com.fasterxml.jackson.annotation.JsonProperty;
import javax.validation.constraints.NotEmpty;

public class HelloWorldConfiguration extends Configuration {
    @NotEmpty
    private String template;

    @NotEmpty
    private String defaultName = "Stranger";

    @JsonProperty
    public String getTemplate() {
        return template;
    }

    @JsonProperty
    public void setTemplate(String template) {
        this.template = template;
    }

    @JsonProperty
    public String getDefaultName() {
        return defaultName;
    }

    @JsonProperty
    public void setDefaultName(String name) {
        this.defaultName = name;
    }
}

If you’ve been around the block a few times, you can see that there are Jackson annotations all over it. (You’ll also know that the setters are not necessary!) But apart from that (and even the annotations aren’t necessary depending on how Jackson is used), it’s a POJO. As it should be!

Now look back at the HelloWorldApplication. The application developer has no idea where the HelloWorldConfiguration object came from or how it was made, when you look at the actual application code. That is, the use case made manifest by the application code itself has no object binding or production concerns in it whatsoever. At an abstract level, the application developer has:

  • Defined what her configuration looks like (by virtue of creating the HelloWorldConfiguration type and defining the getters she’ll call on the HelloWorldConfiguration object in her run method)
  • Advertised this configuration to whoever and/or whatever is going to supply her with such a thing

By advertising, I mean an abstract notion of: has put this class somewhere where people (such as configuration authors) know to look for it. Advertising can take many forms, from placing a file in a well-known spot to listing the file’s location in a file that is itself in a well-known spot (e.g. the Java ServiceLoader contract), or a variety of other means. In DropWizard, some of the advertising is baked into the application itself: if you were, for example, to javap the HelloWorldApplication.class file, you would be able to figure out what the master configuration object is.
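To illustrate the "baked into the application itself" flavor of advertising, here is a sketch that recovers the configuration type from an application class's generic superclass, which is roughly the information javap would show you. The Application, HelloWorldConfiguration and HelloWorldApplication types below are empty, hypothetical stand-ins and not DropWizard's own classes, and the helper only handles direct subclasses.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

final class AdvertisedConfiguration {

    // Hypothetical stand-ins for a framework base class and an application's types.
    abstract static class Application<C> {}
    static final class HelloWorldConfiguration {}
    static final class HelloWorldApplication extends Application<HelloWorldConfiguration> {}

    // Reads the configuration type out of the application's generic superclass.
    // Limitation of this sketch: it only looks at the direct superclass.
    static Class<?> configurationTypeOf(Class<? extends Application<?>> applicationClass) {
        Type superclass = applicationClass.getGenericSuperclass();
        if (superclass instanceof ParameterizedType) {
            Type argument = ((ParameterizedType) superclass).getActualTypeArguments()[0];
            if (argument instanceof Class<?>) {
                return (Class<?>) argument;
            }
        }
        throw new IllegalArgumentException(
            "Cannot determine configuration type of " + applicationClass);
    }

    public static void main(String[] args) {
        // Prints: HelloWorldConfiguration
        System.out.println(configurationTypeOf(HelloWorldApplication.class).getSimpleName());
    }
}
```

Once a tool (or a person) has that class in hand, every property name and type the application needs is discoverable from it.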

Anyway, the master configuration object here is just Java code. There are no name-based lookups involved. There’s no data conversion. There are no conventions to follow. The application developer will call configuration.getTemplate(), for example, and get exactly what she is expecting as an instance of exactly the Java type she’s expecting it. If she added getComplicatedHairball() to it, then she’d get a ComplicatedHairball object back to work with. She can’t make typos, or her class won’t compile. She can’t specify the wrong data types for conversion. Namespace clashes between components are no longer a thing. All of this is because use case concerns are in the right place and don’t leak.

Back to DropWizard specifics. On the production and unmarshaling side, DropWizard is opinionated: it turns out that the configuration author must write her configuration in YAML, and Jackson is used to muscle this YAML into an instance of the master configuration object, following its usual object binding conventions. (Many of you will know that this of course does not require that there be any annotations on the HelloWorldConfiguration object at all, and that any customizations can be made using mixin annotations or Jackson ObjectMapper incantations. The point here is that even the Jackson bits are not required to sully the HelloWorldConfiguration if you really want to keep things pure.)

They’ve also left themselves some wiggle room with this sensible architecture: even though DropWizard is currently opinionated, if you squint you can see that if it decided to change its configuration-object-producing machinery from YAML-and-Jackson to XML-and-JAXB, the application developer’s code would not really change.

Fantasyland

DropWizard is opinionated on how configuration objects are unmarshaled, but as noted there’s some architectural wiggle room, and so now let’s travel to fantasyland for a moment.

Let’s pretend there exists in the world some kind of facility like DropWizard’s configuration machinery, but instead of requiring the configuration author to write YAML (or JSON) in a specific file or classpath resource, and instead of requiring Jackson to be used for object binding—in fact, instead of requiring object binding at all—let’s pretend this “back end” production mechanism is pluggable. Let’s pretend a hypothetical configuration system can locate what back end it’s going to use to make configuration objects. Seriously, let’s pretend.

Oh look, nothing really changes from the application developer’s use cases or standpoint!

Specifically, she doesn’t have to change any names or data type conversion code. She doesn’t have to make sure that composites have names that “work with” the configuration system. She still can’t make typos and still can’t provide incorrect typing information. Component namespaces are still safe. All of those problems are now (properly!) the concern of whatever concrete back end provider is in the picture, and the concrete back end provider doesn’t have to deal in scalars at all.
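A minimal sketch of that pluggability, assuming the simplest possible contract: a back end is anything that can supply the advertised configuration object. The names here (WheelConfiguration, scalarBackEnd, objectBackEnd) are invented for the example, and java.util.function.Supplier is just a convenient stand-in for whatever a real provider interface would look like.

```java
import java.util.Map;
import java.util.function.Supplier;

final class PluggableBackEndSketch {

    // The application developer's advertised configuration type.
    interface WheelConfiguration {
        Integer getCircumference();
    }

    // Back end 1: assembles the object from textual scalars, conversion and all.
    static Supplier<WheelConfiguration> scalarBackEnd(Map<String, String> scalars) {
        return () -> {
            Integer circumference = Integer.valueOf(scalars.get("wheel.circumference"));
            return () -> circumference;
        };
    }

    // Back end 2: hands over an object that already exists
    // (say, one loaded by some persistence layer).
    static Supplier<WheelConfiguration> objectBackEnd(WheelConfiguration alreadyBuilt) {
        return () -> alreadyBuilt;
    }

    // Application code: identical no matter which back end is plugged in.
    static int doubledCircumference(Supplier<WheelConfiguration> backEnd) {
        return backEnd.get().getCircumference() * 2;
    }

    public static void main(String[] args) {
        System.out.println(
            doubledCircumference(scalarBackEnd(Map.of("wheel.circumference", "27")))); // 54
        System.out.println(doubledCircumference(objectBackEnd(() -> 27)));            // 54
    }
}
```

Only the first back end ever touches a scalar name or a text-to-Integer conversion; the second never deals in text at all.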

Let’s further pretend that the back end provider is not one that uses object binding, Jackson- or MicroProfile Config-based or otherwise. Perhaps it is something like Hibernate that loads an entity straight out of a database. In this example, the data conversion happens at the native database level so there’s no need for data conversion to surface at the configuration system level at all, because it’s already being handled by a combination of the database, JDBC and Hibernate.

Or let’s say that MicroProfile Config is the back end provider, and uses a whole bunch of config.getValue("someScalar", SomeType.class) invocations to assemble a composite, namely the master configuration object. Or maybe somehow it uses the same mechanism that it uses to produce @ConfigProperties-annotated objects today. The usual problems that MicroProfile Config has with things like “which Converter will be used” and “which ConfigSource will produce scalar value x” and “how do I set an empty string” and “what do we do about nondeterministic ConfigSource ordering” and so on are still in play, but now they are isolated where they belong, namely in the domain of object production and binding, and do not impact the application developer use cases in any way. The configuration author—the person responsible for authoring the scalar textual values that MicroProfile Config will fundamentally consume—also now has the benefit that she can at least have the possibility of discovering what configuration she must supply since the master configuration object type is advertised, and contains all relevant naming and typing information.

Or maybe the master configuration object itself simply is a MicroProfile Config Config object, if the application developer desperately wants to work fundamentally with it for some reason, in which case the back end is absurdly simple. Who am I to stop her? Or maybe it’s a Properties object. Or a JNDI Context. Or a File. Or anything else, really. Fantasyland handles this too because all that is required is that it be able to produce a Java object of some kind that the application developer can work with.

On Configuration in General

This starts to get into: now we’re just talking about plain Java objects, so just what is configuration anyway? That of course is an enormous question and I don’t have the time, space or inclination to deal with that here. But! I can say that as far as a configuration system is concerned: configuration objects should probably be able to be whatever you want them to be. On the reading end, it’s just POJOs. On the production/writing end, it’s just how those objects get built.

Oftentimes there is an unspoken assumption that configuration systems that produce “rich” Java objects (composites) should tread carefully, as there are many other subsystems that deal with the production and management of such objects. For example, CDI lets you write producer methods that produce rich objects out of other objects. So shouldn’t CDI be in charge of producing rich objects? In some vague unspecified way you want configuration to not really tread on these subsystems’ toes.

Here I tend to think that a configuration object is whatever the application developer wants it to be. If she wants a configuration system to produce a ferociously complicated master configuration object singleton, then that is her (possibly ill-advised) right. It may not be a very good choice, but it isn’t really something that the configuration system should have an opinion about.

An application developer will probably want to work with a master configuration object that is relatively simple and that can then serve as raw material for higher-order Java objects. “Relatively” simple is of course fuzzy. A Bicycle CDI bean, for example, might be produced by a CDI producer method, not by configuration, but using two Wheel objects sourced from the master configuration object and injected into the producer method’s parameter list.
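Stripped of CDI, the division of labor might look something like the sketch below. The record types are hypothetical, and the static produceBicycle method merely plays the role that a CDI producer method would play; no injection machinery is involved.

```java
final class BicycleSketch {

    record Wheel(int circumference, int spokes) {}

    // The "relatively simple" master configuration object.
    record MasterConfiguration(Wheel frontWheel, Wheel rearWheel) {}

    // A higher-order object that the configuration system does not know how to build.
    record Bicycle(Wheel front, Wheel rear) {
        boolean hasMatchingWheels() {
            return front.circumference() == rear.circumference();
        }
    }

    // Plays the role a CDI producer method would play; plain Java here.
    static Bicycle produceBicycle(MasterConfiguration configuration) {
        return new Bicycle(configuration.frontWheel(), configuration.rearWheel());
    }

    public static void main(String[] args) {
        MasterConfiguration configuration =
            new MasterConfiguration(new Wheel(27, 32), new Wheel(27, 36));
        System.out.println(produceBicycle(configuration).hasMatchingWheels()); // true
    }
}
```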

At some point I hope to prototype such a system so it’s easier to talk about; stay tuned.