Error Handling Done Right, Part 1

So I had this idea that error handling has been done badly for years (maybe forever (cf. awk’s “bailing out at line 1”)).

There are a couple camps here.

The first camp worships at the church of error codes and says, well, just as with BASIC line numbers, you define some error codes with some gaps in them—100 should be enough—and then you shoehorn every possible error condition into what amounts to a really poor compression scheme.

“I couldn’t find the X because the Y was unavailable even though the Z was responding” becomes, simply, lossily, 404.

This is the way that error handling has been done from the late 1960s till (through?) now. Wanna find out what really happened? Start digging, bearing in mind that a “not found”-type error is what you’re really looking for.

The second camp says rather officiously: well, clearly that’s not enough; we need to augment this poor lossy compression scheme with some other poor lossy compression schemes. That will solve all our problems. And so you get ANSI and X/Open dueling on exactly what a SQLState is. Anyway, this extra cryptic number, which never tells you what standards body coined it, conjoined with the equally inscrutable error code number, paired with the vendor’s own set of error codes, makes everything crystal clear. Right? No?

Look, we can all describe the process that we go through when we get a Java stack trace (I’m a Java guy; the rest of this is about Java error handling). It goes something like this:

  1. The developer gets a stack trace (because, er, someone else’s code died, not his).
  2. The developer says, “WTF?” and then begins combing through the piles of garbage, scanning from the top towards the bottom. The top is generally irrelevant; the closer he gets to the bottom, the more interested he becomes.
  3. At this point the details of the stack trace are entirely irrelevant. They may very well become relevant later, but for now the developer wants to know: was it a SQLException? Was it a JNDI lookup exception? Was it one of those wrapping the other kind? Are there other things that stand out as the elevator goes lower? Scroll, scroll, scroll.
  4. The developer stumbles across the problem, which may very well not be the root cause, but which I bet you is near the bottom. We’re getting closer to the actual carnage.
  5. The developer may notice some other accreted state hanging off the Throwable chain as he descends; often without realizing it, he files it away (“Blah blah blah…foreign key…oh, hmm; interesting….”).
  6. After this quick skim, which typically takes a second or two, the developer can construct a pseudo-literate, pseudo-native-language sentence about what went wrong. “Ahhhh, OK, the frobnicator blew up because it couldn’t find the data source during user login—huh, that’s funny; the user name has some Unicode characters in it—which happened with that crufty old SSO ‘solution’ we purchased from YoYoDyne a while back; I bet they might have something to say here; Joe told me he thought they didn’t use foreign keys…hmm.”

Let’s look at step six.

Step six has the message that anyone remotely technical is going to need to begin to figure out how to fix the problem.

You’ll note that the message in step six did not come from one of the Throwables in the stack trace directly, nor any of its embedded messages. It came from bits of information scattered throughout the stack.

You’ll also note that step six does not feature an error code, though perhaps an error code of some vendor variety was involved (typically in step five). Nor does it feature a so-called disambiguating error code, or an error code that is intended to disambiguate the disambiguating error code. That’s in part because we all ignore error codes as a matter of course because they’re largely useless. That’s because they’re bad, lossy compression schemes that lose the very information you need. But I digress (sorry!).

You’ll notice that step six makes reference to some state that probably occurred higher up the Throwable chain (the fact that the user id had some Unicode characters in it).

What the developer did instead was pattern match: he matched the Throwable chain against patterns and, once a pattern he didn’t even know he was looking for turned up, constructed the appropriate message for that chain (in his head).

So why don’t error messages work this way? Why do we make humans do this?

Because we’re lazy, that’s why, and also because at the point that we throw a Throwable we have no idea what’s going on in the system as a whole. It’s only when we catch a Throwable that we have the tools available to figure out what happened.

See, the thrower only knows that his little piece of the puzzle has failed. Usually this is because some inscrutable bit of machinery beneath him has failed. So he dutifully does the following:

try {
  frobnicator.frobnicate();
} catch (final FrobnicationException kaboom) {
  throw new BorkificationException("Encountered problem frobnicating", kaboom);
}

Ye gods. No wonder technologists accrue a reputation for being hopelessly out of touch. By the time this thing has passed up through other layers, we get some godawful stack trace like:

com.yourstartup.EnbratzificationException: Error in enbratzifying
    at com.yourstartup.Enbratzifier.enbratzify(Enbratzifier.java:120)
    [and so on]
Caused by: com.yourstartup.BorkificationException: Borkification died
    at com.yourstartup.Borkifier.borkify(Borkifier.java:307)
    [and so on]
Caused by: com.yourstartup.FrobnicationException: Encountered problem frobnicating
    at com.yourstartup.Frobnicator.frobnicate(Frobnicator.java:566)
    [and so on]
Caused by: com.yourstartup.CaturgiationException: caturgiating didn't work
    at com.yourstartup.Caturgiator.caturgiate(Caturgiator.java:1544)
    [and so on]

Quick! What happened? Oh, no! The whole system…it’s f***ed! We’re all going to die!

But, see, you know. You honed this ability a long time ago.

Because if you’re experienced, it will take you less than a quarter of a second to realize that the Enbratzifier couldn’t borkify because the Frobnicator exploded while caturgiating. And that will probably start you immediately thinking about frobnication in the context of enbratzification, but only when the Borkifier is properly configured, and how come the…. Congratulations; you’ve just engaged in some sophisticated pattern matching, and you’ve come up with a useful error message.

Computers are really good at pattern matching. So how come we don’t use pattern matching for error handling?

Part of the reason is architectural. We’ve all been trained that a layer is supposed to be ignorant of the layers above it, and should know only about the layer beneath it.

But when you surfed through the stack trace above, you didn’t pay any heed to these architectural principles—nor should you have. When errors are involved, architecture goes out the window. That’s because by the time the error has been encountered, the architecture has failed: something died, so you need to go pick up the bodies, and you’ll be damned if some ivory tower notion of isolation is going to stand in your way. Good for you.

So you busted through the layers like a hot knife through butter. You dug down the stack, glossed over and yet somehow retained bits of state (messages) involved in lower layers, and by mining this sludge you came to your conclusion.

You need an error handling library that does the same thing. If a big hairy Throwable chain matches a sophisticated pattern, then the library should let you construct a message based off the information encoded throughout the chain, and in its very structure.
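
To make that concrete, here is a minimal sketch of the idea. It is not the library promised for part 2, and it uses a plain ordered match over the cause chain rather than anything automata-based. ChainPattern, ChainPatternDemo and the message text are hypothetical names invented for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch: none of these names come from a real library.
final class ChainPattern {

  // Exception types that must appear, in this order, somewhere in the cause chain.
  private final List<Class<? extends Throwable>> expectedTypes;

  // Builds the human-readable message from the Throwables that matched.
  private final Function<List<Throwable>, String> messageBuilder;

  ChainPattern(final List<Class<? extends Throwable>> expectedTypes,
               final Function<List<Throwable>, String> messageBuilder) {
    this.expectedTypes = List.copyOf(expectedTypes);
    this.messageBuilder = messageBuilder;
  }

  // Walks the chain from the outermost Throwable downwards and
  // returns a message only if every expected type was found in order.
  Optional<String> describe(final Throwable top) {
    final List<Throwable> matched = new ArrayList<>();
    int next = 0;
    int depth = 0; // guards against a (pathological) cyclic cause chain
    for (Throwable t = top;
         t != null && next < expectedTypes.size() && depth < 100;
         t = t.getCause(), depth++) {
      if (expectedTypes.get(next).isInstance(t)) {
        matched.add(t);
        next++;
      }
    }
    return next == expectedTypes.size()
        ? Optional.of(messageBuilder.apply(matched))
        : Optional.empty();
  }
}

final class ChainPatternDemo {
  public static void main(final String[] args) {
    final Throwable chain = new IllegalStateException("Error in enbratzifying",
        new RuntimeException("Borkification died",
            new IOException("data source not found")));
    final ChainPattern pattern = new ChainPattern(
        List.of(IllegalStateException.class, IOException.class),
        matched -> "Enbratzification failed because of an I/O problem: "
            + matched.get(1).getMessage());
    System.out.println(pattern.describe(chain).orElse("no pattern matched"));
  }
}
```

The point of the sketch is only that the message is assembled by the catcher from state scattered along the chain, which is exactly what you did in your head in step six.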

Stay tuned for part 2, in which our hero descends into non-deterministic finite automata theory to bring this to fruition. It’s also where our hero takes a lot of Advil.

JAXB and interfaces

My readership (hi, both of you) may notice that my previous article on JAXB interfaces has been taken down. I did it on purpose, so there.

I had posted it, and I think it was substantially correct. Then I started messing with it.

After a few experiments, I managed to leave behind an annotated package-info.java file that led me to think I was seeing results I actually wasn't seeing. That led me to question my own post and to edit and update it twice. By the time I realized my mistake, the damage was done, so I took the post down to make sure I didn't further muddy the already polluted JAXB waters out there.

Thanks to Blaise Doughan for a pointer to his own document on working with interfaces and JAXB.

One area I'm still interested in is late binding of implementation classes to interfaces. In my project, I have a one-to-one correspondence of implementation to interface, but it is not known which implementation will be used until .ear file packaging time. The interface packages don't have compile-time visibility into the implementation packages (if yours do, you're Doing It Wrong), so specifying XmlAdapters with good type information is impossible. I can use XmlAdapter<Object, Object> to break the compile-time annotation dependency as recommended in the inscrutable Unofficial JAXB Guide, but that leads to some very funky XML output featuring lots of namespace declarations and things like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<person>
    <age>61</age>
    <habit xsi:type="habitImplementation" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <name xsi:type="xs:string" xmlns:xs="http://www.w3.org/2001/XMLSchema">Staying up late coding</name>
    </habit>
    <name xsi:type="xs:string" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Fred</name>
</person>

(How come the <name> element needs its type specified?)

I'd like to use the excellent Swiss Army knife approach of @XmlJavaTypeAdapters applied to a package-info.java file, but I don't actually want to specify this until the moment when I'm packing up my .ear file. At that point I will know (or can know, anyhow) what single implementation of a hypothetical Person interface, for example, will be used.

I know I'm ignoring lots of complexities and whatnot and surely there are valid reasons why the following is a bad idea, but it all sure would be simpler if it were possible to have the type attribute of @XmlElement take a String, not a Class, and get resolved at runtime. After all, how many times does an interface actually have a compile time dependency on its implementation class? Zero, in almost every case I can think of.
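
To be clear, JAXB does not work this way; the following is only a sketch of what "a String that gets resolved at runtime" could amount to in plain Java. LateBinding and resolve are hypothetical names, and java.util.List and java.util.ArrayList merely stand in for a Person interface and whatever implementation ends up in the .ear file.

```java
import java.util.List;

// Hypothetical helper: turns an implementation class name into a Class at runtime.
final class LateBinding {

  static <T> Class<? extends T> resolve(final Class<T> interfaceType,
                                        final String implementationClassName) {
    // Assumption: the implementation is visible to the context class loader,
    // as it typically would be inside a deployed application.
    final ClassLoader loader = Thread.currentThread().getContextClassLoader();
    try {
      final Class<?> candidate = Class.forName(implementationClassName, false, loader);
      // asSubclass throws ClassCastException if the named class
      // does not actually implement the interface.
      return candidate.asSubclass(interfaceType);
    } catch (final ClassNotFoundException notFound) {
      throw new IllegalArgumentException(
          "No such implementation class: " + implementationClassName, notFound);
    }
  }

  public static void main(final String[] args) {
    // The interface never mentions its implementation at compile time;
    // only the String does.
    final Class<?> implementation = resolve(List.class, "java.util.ArrayList");
    System.out.println(implementation.getName());
  }
}
```

The interface package stays ignorant of the implementation package, and the one place that knows the class name can be generated or configured at packaging time.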

I am currently musing over a nasty but intriguing stew of a Maven plugin, Javassist and programmatically generated package-info.class files to basically produce a big list of XmlAdapter specifications. The pro is this should get me exactly what I want. The con is this shouldn't be this difficult.

Zeroing out times in java.util.Date the right way

File this one under The Mob Is Coming For java.util.Date Author Alan Liu With Flaming Torches And Pitchforks (one in a long series).

Folks who have worked with Java’s java.util.Date and java.util.Calendar classes know that when you create a new Date object without any arguments it is initialized to the current instant, stored as a long representing the number of milliseconds since the epoch.

Sometimes you need a Date that just has its date fields initialized. Here is the only way to do it properly, ensuring that all date fields are set to their defaults, and all time fields are set to their minimums. Anything else runs the risk of missing a time or date field or two, or uses a deprecated constructor, or is not performant, or all three.

import java.util.Calendar;
import java.util.Date;

final Calendar calendarNow = Calendar.getInstance();
assert calendarNow != null;
calendarNow.set(Calendar.HOUR_OF_DAY, calendarNow.getMinimum(Calendar.HOUR_OF_DAY));
calendarNow.set(Calendar.HOUR, calendarNow.getMinimum(Calendar.HOUR)); // for maximum correctness and safety you need to set both (!)
calendarNow.set(Calendar.MINUTE, calendarNow.getMinimum(Calendar.MINUTE));
calendarNow.set(Calendar.SECOND, calendarNow.getMinimum(Calendar.SECOND));
calendarNow.set(Calendar.MILLISECOND, calendarNow.getMinimum(Calendar.MILLISECOND));
calendarNow.set(Calendar.AM_PM, calendarNow.getMinimum(Calendar.AM_PM)); // this makes it "really correct" for future modifications
final Date now = calendarNow.getTime();
assert now != null;
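
If you need this in more than one place, the same steps can be wrapped in a small helper. This is just a sketch: the class name Dates, the method name atStartOfDay and the explicit TimeZone parameter are my own additions for illustration (the time zone is passed in so that the result does not silently depend on the machine's default).

```java
import java.util.Calendar;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical utility class wrapping the field-by-field reset shown above.
final class Dates {

  // Returns the first instant of the day that contains the given instant,
  // as seen in the given time zone.
  static Date atStartOfDay(final Date instant, final TimeZone zone) {
    final Calendar calendar = Calendar.getInstance(zone);
    calendar.setTime(instant);
    calendar.set(Calendar.HOUR_OF_DAY, calendar.getMinimum(Calendar.HOUR_OF_DAY));
    calendar.set(Calendar.HOUR, calendar.getMinimum(Calendar.HOUR));
    calendar.set(Calendar.MINUTE, calendar.getMinimum(Calendar.MINUTE));
    calendar.set(Calendar.SECOND, calendar.getMinimum(Calendar.SECOND));
    calendar.set(Calendar.MILLISECOND, calendar.getMinimum(Calendar.MILLISECOND));
    calendar.set(Calendar.AM_PM, calendar.getMinimum(Calendar.AM_PM));
    return calendar.getTime();
  }

  public static void main(final String[] args) {
    // Prints the start of the current day in the default time zone.
    System.out.println(atStartOfDay(new Date(), TimeZone.getDefault()));
  }
}
```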

Getting Jenkins Running On A Mac

I wanted to blog about how to get Jenkins running on a Mac using its installer.

Jenkins is a great product, but its frenetic and crazed (but cheerful and enthusiastic) development process often shows through.

Case in point: the default Mac installer, which you can download from the jenkins-ci.org website, sets up Jenkins to run as a Mac LaunchDaemon running as user daemon.

Now, there's nothing inherently wrong with this; indeed, it can be quite nice. You'll only have one instance of Jenkins running, no user needs to be logged on for it to do its thing, and if Jenkins ever got hacked you're running as a low-privilege user rather than as some kind of full-fledged user with the ability to ruin your day.

However, this caused some weird problems, nullifying the entire intent of a one-click installer. These problems manifest themselves the moment you try to run a Maven build, which suggests to me that this (simple) smoke test is simply not run before new versions of the installer are released. Oh well, time to roll up our sleeves and turn the one-click installation process into an exercise in Mac system administration. 🙂

What's wrong with daemon?

The first thing to know about user daemon is that his home directory is /var/root. That should start to give you a funny feeling.

The reason that should give you a funny feeling is that Maven looks for its settings.xml file in $HOME/.m2, which of course does not exist in /var/root.

So when Jenkins launches, it appears to come up fine. But if you try to run a Maven build, you'll get a lovely stack trace about how the directory /var/root/.m2 couldn't be created.

When I first encountered this error, I just wanted to get the stupid thing working, so I did:


sudo mkdir -p /var/root/.m2

…and:


sudo chmod a+rwx /var/root/.m2

So this gets Jenkins-running-as-daemon past this problem, but now it wants to create temporary files in /Users/Shared/Jenkins/Home, which it doesn't own, and can't write to.

At any rate, I now realized that I didn't want this thing running as user daemon anyway, because I didn't want him doing anything to /var/root. And even if I could somehow tell him to use a different user directory so that $HOME/.m2/settings.xml would be resolved somewhere else, it was clear that I was going to have to edit .plist files. So, so much for the installer. And as long as the installer wasn't going to work, I decided that I wanted to make Jenkins run as a different kind of daemon user anyway.

This turned out (for this rookie Mac system administrator) to be quite difficult.

The steps involved are:

  1. Create a daemon user (I called mine _jenkins)
  2. Create a daemon group (I called mine _jenkins, too; surprise!)
  3. Put the daemon user in the newly-created daemon group
  4. Create the home directory for the new daemon user (/Users/_jenkins in my case)
  5. chown the /Users/Shared/Jenkins directory so that its hierarchy is owned by your new user
  6. Edit /Library/LaunchDaemons/org.jenkins-ci.plist so that it reflects all this information

Creating the user is a task that should not be accomplished through the usual Mac GUI methods. You need to use dscl instead. This is because you want to create a daemon user. I snooped around for a bit and came up with this lovely tutorial: http://www.minecraftwiki.net/wiki/Tutorials/Create_a_Mac_OS_X_startup_daemon#The_hard_.28and_correct.29_way. It walked me through steps 1-4 above.

Then I did:


sudo chown -R _jenkins:_jenkins /Users/Shared/Jenkins

Finally, my /Library/LaunchDaemons/org.jenkins-ci.plist looks like this:


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
    <dict>
        <key>EnvironmentVariables</key>
        <dict>
            <key>JENKINS_HOME</key>
            <string>/Users/Shared/Jenkins/Home</string>
            <key>_JAVA_OPTIONS</key>
            <string>-Dfile.encoding=UTF-8</string>
        </dict>
        <key>GroupName</key>
        <string>_jenkins</string>
        <key>KeepAlive</key>
        <true/>
        <key>Label</key>
        <string>org.jenkins-ci</string>
        <key>ProgramArguments</key>
        <array>
            <string>/bin/bash</string>
            <string>/Library/Application Support/Jenkins/jenkins-runner.sh</string>
        </array>
        <key>RunAtLoad</key>
        <true/>
        <key>UserName</key>
        <string>_jenkins</string>
    </dict>
</plist>

I added the _JAVA_OPTIONS environment variable to force UTF-8 encoding. This is because no matter what kind of encoding you might specify in your Java code, Java-on-the-Mac's character encoding for what gets put out to the terminal is MacRoman by default (?!). You have to get the file.encoding property passed into the JVM early enough so that it is picked up by the rest of the JVM internals, and the only way to do that is to use the special _JAVA_OPTIONS environment variable picked up by all the Java tools in $JAVA_HOME/bin. The only unfortunate side effect of all this is that you get a warning printed to the screen on every JVM startup that says, effectively and incomprehensibly, I am using the environment variable you told me to.
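
If you want to confirm that the setting actually reached the JVM, a tiny program like the following will tell you. It is only a sketch (the class name EncodingCheck is made up); it prints the file.encoding system property alongside the charset the JVM reports as its default.

```java
import java.nio.charset.Charset;

// Hypothetical one-off diagnostic; run it the same way Jenkins runs its JVM.
final class EncodingCheck {

  // Reports both the raw system property and the charset the JVM resolved as its default.
  static String describe() {
    return "file.encoding=" + System.getProperty("file.encoding")
        + ", default charset=" + Charset.defaultCharset().name();
  }

  public static void main(final String[] args) {
    System.out.println(describe());
  }
}
```

Run it once with and once without _JAVA_OPTIONS set in the environment and compare the two lines of output.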

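Once you've done all this, restart the job. Because launchd reads a job's .plist when the job is loaded, unload and reload it so that the new values are picked up:


sudo launchctl unload /Library/LaunchDaemons/org.jenkins-ci.plist
sudo launchctl load /Library/LaunchDaemons/org.jenkins-ci.plist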

I hope that helps other Jenkins Mac users out.

jpa-maven-plugin Released

I'm pleased to announce to my enormous reading audience (both of you) the jpa-maven-plugin project.

Please peruse the documentation, the Javadocs and then finally try using it and let me know what you think.

Thread safety and PermissionCollection

I just made an interesting discovery while implementing the PolicyConfiguration#addToUncheckedPolicy(PermissionCollection) method.

Notice anything unusual about the contract? No?

Consider this: PermissionCollection subclasses need to be thread-safe, as the Javadoc says:

Subclass implementations of PermissionCollection should assume that they may be called simultaneously from multiple threads, and therefore should be synchronized properly.

OK, so when implementing this method, you might take the naïve approach, as I did initially. The incoming PermissionCollection is thread-safe, so there's no need to synchronize on anything. Hopefully you're cringing at my haste:

private final PermissionCollection uncheckedPermissions = new Permissions();

@Override
public void addToUncheckedPolicy(final PermissionCollection permissionCollection) {
  if (permissionCollection != null && this.uncheckedPermissions != null) {
    // Look, ma, no synchronization!

    final Enumeration<Permission> elements = permissionCollection.elements();
    if (elements != null) {
      while (elements.hasMoreElements()) {
        final Permission permission = elements.nextElement();
        if (permission != null && !this.uncheckedPermissions.implies(permission)) {
          this.uncheckedPermissions.add(permission);
        }
      }
    }
  }
}

You could probably get away with this. After all, you know that the uncheckedPermissions member variable is guaranteed to be thread-safe (java.security.Permissions instances, like all PermissionCollection subclasses, must be "synchronized properly"). Surely there's nothing more to worry about?

Wrong. JACC makes no guarantees one way or the other about what is happening to the incoming PermissionCollection that you're adding in this method. This PermissionCollection could be being modified by some other thread while you're processing it. So your elements() call (your enumeration of the individual Permission instances "inside" that PermissionCollection) will be inconsistent and broken.

OK, you think (I think), I'll just synchronize on the incoming PermissionCollection object. But there's nothing in the PermissionCollection documentation that indicates that PermissionCollection objects must synchronize on themselves during modification operations (such as add()). They can synchronize on whatever they want and are under no obligation to tell you what that mutex is going to be. Nine times out of ten you're going to be handed a Permissions instance, of course, and if you go look at the source, yes, indeed, the PermissionCollection implementation inside that synchronizes on itself. But it certainly doesn't have to, and you shouldn't rely on it.

In the absence of further guarantees, all you can do is add the incoming PermissionCollection to a set of such PermissionCollections, and then, when you actually need to check permissions, walk through the set one by one and call implies() on each PermissionCollection.
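
Here is a minimal sketch of that approach. UncheckedPolicy is a hypothetical holder class, not a full PolicyConfiguration implementation, and the choice of CopyOnWriteArraySet is my assumption about a reasonable thread-safe set for a collection that is written rarely and read often.

```java
import java.security.Permission;
import java.security.PermissionCollection;
import java.security.Permissions;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

// Hypothetical holder; a real PolicyConfiguration has many more methods.
final class UncheckedPolicy {

  // Thread-safe set, so adding and checking need no external lock.
  private final Set<PermissionCollection> uncheckedCollections = new CopyOnWriteArraySet<>();

  // Stores the incoming collection as-is; it is never enumerated here.
  void addToUncheckedPolicy(final PermissionCollection permissionCollection) {
    if (permissionCollection != null) {
      this.uncheckedCollections.add(permissionCollection);
    }
  }

  // Delegates to each stored collection's own (thread-safe) implies() method.
  boolean implies(final Permission permission) {
    if (permission == null) {
      return false;
    }
    for (final PermissionCollection collection : this.uncheckedCollections) {
      if (collection.implies(permission)) {
        return true;
      }
    }
    return false;
  }

  public static void main(final String[] args) {
    final Permissions granted = new Permissions();
    granted.add(new RuntimePermission("frobnicate"));
    final UncheckedPolicy policy = new UncheckedPolicy();
    policy.addToUncheckedPolicy(granted);
    System.out.println(policy.implies(new RuntimePermission("frobnicate")));
  }
}
```

The only operations ever invoked on someone else's PermissionCollection are the ones its contract says are safe to call concurrently; whatever lock it uses internally stays its own business.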

The takeaway here is that in any code that you're using that enumerates a PermissionCollection, you're probably doing it wrong unless you have control of "both sides" of the PermissionCollection–unless you control both when Permissions are added to it and when they are enumerated.

Javadocs

One thing I believe very strongly in is good Javadocs.

It's one of the first places that I go when I need to understand how to use a library. I don't attempt to understand how to use the library through my IDE's autocomplete statements or the source code. I often end up in the source code, but that's usually because the API has not been well-designed or well-documented.

Good Javadocs are very hard to put together because at the same time you begin to put them together you run face-to-face with the weaknesses in your API. So you change the API to correct the problem, and now your Javadocs need to be torn up. Many developers stop there and throw the Javadocs out, or turn up their nose at them because they're not a very agile thing to do. But to me, they're absolutely essential: with them and unit testing, you can put together some decent APIs pretty quickly (or, I guess, more accurately, as quickly as you should be putting them together). :-) Good APIs take time, and the best ones explain themselves.

Here are some of the things that I do when writing Javadocs:

  • For any class name, I wrap it with the {@link} tag, whenever and wherever it appears. Think about that. This helps keep the reader in the flow: they can always turn down the styles in their browser or take other measures to get rid of blue underlines (I don't worry about too many links), but having the ability to quickly link to the class under discussion is invaluable.
  • For any identifier or keyword, I wrap it with the {@code} tag. This is just following good typographical style.
  • For every parameter, I document whether null is permitted or not. This forces me to figure out what to do with nulls in every situation. Sometimes I will realize that accepting nulls is a perfectly reasonable thing to do (more often than not, though this is a stylistic question that every developer has a strong opinion on).
  • For every @return tag, I document whether the method can return null or not. Every single one.
  • For personal software, I use the @since tag and supply the month, day and year. That helps keep me honest and shows what order I've developed things in.
  • I always try to use the @author tag with my hyperlinked email address.
  • I try to use (and wish I were better at using) the @see tag to establish a rough trail through the API.
  • I try to hyperlink portions of explanatory documentation to methods and fields that the documentation refers to {@linkplain Object#equals(Object) as I do here in this example about the <tt>Object#equals(Object)</tt> method}. Note the use of the {@linkplain} tag and the nested <tt>s.
  • I have a standard boilerplate Subversion-friendly JavaScript hairball that I use for the @version tag (I use keyword expansion of $Revision$):

@version <script type="text/javascript"><!--
document.write("$Revision: 0.0 $".match(/\d+\.\d+/)[0]);
--></script><noscript>$Revision: 0.0 $</noscript>
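
Put together, the conventions in the list above look roughly like this. The Habit class, the author name, the email address and the date are all placeholders invented for this example.

```java
import java.util.Objects;

/**
 * A hypothetical value class used only to illustrate the documentation
 * conventions described in this post.
 *
 * @author <a href="mailto:someone@example.com">Some One</a>
 *
 * @since January 1, 2011
 *
 * @see Object#equals(Object)
 */
final class Habit {

  /** The name of this {@link Habit}; may be {@code null}. */
  private final String name;

  /**
   * Creates a new {@link Habit}.
   *
   * @param name the name of the new {@link Habit}; may be {@code null}
   */
  Habit(final String name) {
    this.name = name;
  }

  /**
   * Returns the name of this {@link Habit}.
   *
   * @return the name of this {@link Habit}, or {@code null} if no name
   * was supplied at construction time
   */
  String getName() {
    return this.name;
  }

  /**
   * Compares this {@link Habit} to the supplied {@link Object},
   * {@linkplain Object#equals(Object) following the usual
   * <tt>equals</tt> contract}.
   *
   * @param other the {@link Object} to compare against; may be {@code null}
   *
   * @return {@code true} if {@code other} is a {@link Habit} with an
   * equal name; {@code false} otherwise
   */
  @Override
  public boolean equals(final Object other) {
    return other instanceof Habit && Objects.equals(this.name, ((Habit) other).name);
  }

  /**
   * Returns a hash code consistent with {@link #equals(Object)}.
   *
   * @return a hash code for this {@link Habit}
   */
  @Override
  public int hashCode() {
    return Objects.hashCode(this.name);
  }
}
```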

Hope this helps you think about how you document your own code.