EntityManager#contains(Object) Rolls Back the Transaction

A note for posterity: if you call EntityManager#contains(Object) on an object whose class is not a known entity type, it will throw an IllegalArgumentException. (A merely detached entity just gets you false.) When it does so, it also irrevocably marks the transaction for rollback.

Instead, use something like this:
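A minimal sketch of the idea (the helper name, isManaged, is mine; it assumes that Metamodel#entity(Class), which is not an EntityManager method, can reject a non-entity type without the rollback side effect):

import javax.persistence.EntityManager;

public final class EntityManagers {

  private EntityManagers() {
    super();
  }

  /**
   * Returns true if the supplied object is managed by the supplied
   * EntityManager, and false in every other case, without ever passing a
   * non-entity type to contains().
   */
  public static boolean isManaged(final EntityManager em, final Object object) {
    if (em == null || object == null) {
      return false;
    }
    try {
      // Metamodel#entity(Class) throws IllegalArgumentException for a class
      // that is not a known entity type, but it is not an EntityManager
      // operation, so the transaction should survive it.
      em.getMetamodel().entity(object.getClass());
    } catch (final IllegalArgumentException notAnEntityType) {
      return false;
    }
    return em.contains(object);
  }

}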

Making EclipseLink Logging Play Nice With GlassFish 3.1.2.2

To get logging working properly with EclipseLink 2.3.2 and GlassFish 3.1.2.2, you want to configure the actual logging values in GlassFish’s logging.properties file, not in your META-INF/persistence.xml file.  You have to set two levels (a bit mysterious, as one is a child of the other; setting values on the parent logger should cause them to flow downhill, but for some reason they do not).

Then, to be able to see SQL parameters in the output, you have to set a property in your META-INF/persistence.xml file.

Assuming you have a domain called domain1:

  • Edit $GLASSFISH_HOME/glassfish/domains/domain1/config/logging.properties and add the following lines:
    1. org.eclipse.persistence.level = FINE
    2. org.eclipse.persistence.sql.level = FINE
      • The first allows you to see SQL statements.  The second must be set in order for SQL parameters to be seen, but it is not sufficient on its own.
  • In your META-INF/persistence.xml, add the following element as a child of the <properties> element:
    <property name="eclipselink.logging.parameters" value="true"/>

TableGenerators and Sequencing

So I learned today that your persistence.xml may have both a <jta-data-source> and a <non-jta-data-source> specified alongside each other. I’m not sure why I thought these were mutually exclusive but I figured someone else probably had this misunderstanding as well.

(I also was reminded of the fact that JDBC does not support nested transactions in a slightly related tangent.)

I also learned that some JPA providers can take advantage of a double listing: your run-of-the-mill JTA environment (such as an EJB server) will use the <jta-data-source> by default, but the other one might be used by your JPA provider. The JPA specification doesn’t really say anything about this; section 8.2.1.5 is about as close as it gets:

8.2.1.5  jta-data-source, non-jta-data-source
In Java EE environments, the jta-data-source and non-jta-data-source elements are used to specify the global JNDI name of the JTA and/or non-JTA data source to be used by the persistence provider. If neither is specified, the deployer must specify a JTA data source at deployment or a JTA data source must be provided by the container, and a JTA EntityManagerFactory will be created to correspond to it.

These elements name the data source in the local environment; the format of these names and the ability to specify the names are product specific.

In Java SE environments, these elements may be used or the data source information may be specified by other means—depending upon the requirements of the provider.

Specifically, EclipseLink can use your <non-jta-data-source> element if you instruct it—for example—to use a separate connection for @TableGenerator-based identity generation. Otherwise, your sequence table operations will be only as granular as the (potentially long, hairy) transactions that wrap them.

Obviously, your <jta-data-source> and <non-jta-data-source> had better be pointing at the same data source in this case.
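For illustration, here is a sketch of such a persistence.xml (the JNDI names are hypothetical, and eclipselink.jdbc.sequence-connection-pool is, as I understand it, the EclipseLink property that requests a separate connection pool, fed by the <non-jta-data-source>, for sequencing):

<persistence-unit name="example" transaction-type="JTA">
  <jta-data-source>jdbc/exampleDataSource</jta-data-source>
  <!-- Must point at the same database as jdbc/exampleDataSource. -->
  <non-jta-data-source>jdbc/exampleNonJtaDataSource</non-jta-data-source>
  <properties>
    <!-- Sketch: ask EclipseLink to allocate @TableGenerator values on its
         own non-JTA connections, outside the surrounding transaction. -->
    <property name="eclipselink.jdbc.sequence-connection-pool" value="true"/>
  </properties>
</persistence-unit>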

File this under “Huh”.

The Bag

Here is my usual explanation of the JPA notion of a persistence context.  I have used this analogy with great success before so finally decided to write it down.  For total accuracy in all edge cases, you’ll want to read the JPA specification.  Of course if you want to do that, you’re not the target audience here anyway, so off you go.

I will restrict myself to talking about the most common usage of JPA—inside a stateless session bean with transactional semantics.

So then, here goes.

First, let’s make a distinction between a normal object and an entity.

A normal object is just a Java object in your hand.  It might have JPA annotations on it or be governed by a JPA deployment descriptor, but the point is that from holding it you can’t tell whether it is somehow hooked up to the JPA machinery or not.

An entity on the other hand (a “managed entity” in JPA parlance) is a normal object that is managed by an EntityManager.  “Managed” means “snooped on by”, or “known to”, or “watched”, or “surrounded by”, or “tracked” or any of a whole host of other metaphors that might work for you.

(The JPA specification calls normal objects “new entity instances” and “detached entities”, where “detached” means, simply, not snooped on by an EntityManager.  For this little post I’ll just call them normal objects and will try not to use the word entity in order to emphasize the fact that when they aren’t being snooped on by an EntityManager they are just like any other Java object.)

So how does an EntityManager snoop on or manage a normal object?  It doesn’t, actually.  Instead, the EntityManager keeps track of a bag of entities called the persistence context.  If a normal object gets into the bag, it thereby becomes a managed entity.

So by definition a persistence context is a notional bag of JPA entities.  At the end of a (successful, about-to-be-committed) transaction, whatever is in this bag will be flushed to the backing database, assuming its total state is not the same as that of its database representation, regardless of how it got in the bag.

So how does something get into the bag?  You can get an entity into the bag in these ways:

  • You call some EntityManager retrieval method, like find.  The object that is returned is silently put into the bag for you and is therefore an entity.  As an exercise, see if you can predict what will happen at transaction commit time if you call a setter on that entity after you get it back from a query.
  • You create a query of some kind and run it, and it returns an object or a List of objects.  These objects will be silently placed into the bag and are therefore managed entities.
  • You call persist(), in which case the object you supply must not already have an analog in the bag.  If indeed there is no matching entity in the bag (we’ll talk about matching in a moment), then your object becomes known to the EntityManager as a managed entity.
  • You call merge(), passing in a normal object or an entity, in which case various complicated things will happen—more on that in a moment—and an entity will be returned by the method, which you must use in place of your original object.  (Note that it follows that calling merge() on a brand new normal object will accomplish almost the same thing as persist(): the object it hands you back will be an entity that is now in the bag that wasn’t there before, so at commit time it will be flushed to the database.)
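To make the list concrete, here is a sketch.  Person is a hypothetical entity, and the method is assumed to run inside a transaction with an injected EntityManager:

import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

@Entity
class Person {
  @Id
  @GeneratedValue
  Long id;
  String name;
}

class BagExamples {

  void waysIntoTheBag(final EntityManager em, final Person detachedPerson) {
    // 1. A retrieval method: the returned object is silently in the bag.
    final Person found = em.find(Person.class, 1L);

    // 2. A query: every object it returns is silently in the bag.
    final Person queried =
      em.createQuery("SELECT p FROM Person p WHERE p.name = 'Fred'", Person.class)
        .getSingleResult();

    // 3. persist(): the very object you pass becomes a managed entity.
    final Person brandNew = new Person();
    em.persist(brandNew);

    // 4. merge(): a managed copy comes back; use it, not the original.
    final Person managed = em.merge(detachedPerson);
  }

}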

Merging is the most complicated thing here so we’ll spend some time on it.

The most important takeaway, first, is that merge() does not cause anything at all database-related to happen.  It is not a save operation or an update operation.  It is not like a Save menu item.  At all.

When you merge an object into the bag that the bag has never seen before, a new instance of that object is created in the bag, the whole state of the object you supplied is copied over onto it (replacing any default state it might have had as a result of its default constructor being called), and this newly-created-and-populated entity is returned to you.  You throw away the object you passed.  (“Merge” is a horrible, horrible word to describe what is happening here, as nothing is actually being merged in any normal sense of the word.  The method should have been called something akin to becomeAwareOf or track or monitor or manage—how interesting that the EntityManager object does not have a manage method!  For that matter, there should be a PersistenceContext object, which would make things a lot simpler, but I digress.)

If you merge an object into the bag, and the bag already has an entity in it with the same persistent identifier and type as your object (but different state), then the state of your incoming object replaces the state of the entity in the bag.  Once again, “merge” is the wrong word: nothing is being merged, but something is being overwritten or replaced!  Note in particular that “merge” here does not mean that any kind of partial operation happens—in all cases, the full state of your object always overwrites whatever state was present on a managed entity in the bag.  Once again, at the termination of all this, you need to effectively discard the object you passed in and use the entity that was returned to you.

If you somehow get your hands on an entity and then merge it again (i.e. it’s in the bag already and has never left the bag and you were just working with it and then you called merge() for no good reason) then nothing happens—it was already in the bag and therefore being tracked so there’s nothing more to be done.

If your normal object refers to other normal objects and you call merge(), then your normal object is merged as I’ve described above—and, apparently whether you have cascade settings set or not, the normal objects it refers to via @ManyToOne, @OneToOne, @OneToMany and @ManyToMany annotations are merged into the bag as well: everything behaves as though each of these normal objects has been cloned into the bag, and as though all references to them have been replaced with references to their newly minted entity clones.  This is sort of like a poor man’s cascade, and it can’t be turned off; nor would you want it to be: if your object graph consists solely of normal objects governed by JPA annotations, then a merge on the root of the graph will conveniently shove them all into the bag, and the returned graph, which you must use in place of the graph you passed in, will consist entirely of entities.

Think hard about that: if you have a parent entity and a reference in your code to one of its children, then after merging the parent, the reference you have to the child is probably not what you want to work with.  You’ll be looking at the (now stale, untracked, un-snooped-on, never-flushed-to-disk) old reference, and any changes you make to it won’t get shoved to the database at transaction commit time.  You’ll want to reacquire a reference to that child, or, better yet, do your merge() early and then get a reference to the child after the merge has returned you a managed entity graph.
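Here is a sketch of the pitfall (Parent and Child are hypothetical entities):

import java.util.ArrayList;
import java.util.List;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.ManyToOne;
import javax.persistence.OneToMany;

@Entity
class Parent {
  @Id
  @GeneratedValue
  Long id;
  @OneToMany(mappedBy = "parent")
  List<Child> children = new ArrayList<Child>();
}

@Entity
class Child {
  @Id
  @GeneratedValue
  Long id;
  @ManyToOne
  Parent parent;
  String name;
}

class MergePitfall {

  void demo(final EntityManager em, final Parent detachedParent) {
    final Child detachedChild = detachedParent.children.get(0);

    final Parent managedParent = em.merge(detachedParent);

    // WRONG: detachedChild is the stale, un-snooped-on object; this change
    // will never be flushed to the database.
    detachedChild.name = "ignored at commit time";

    // RIGHT: navigate the returned graph to find the managed clone.
    final Child managedChild = managedParent.children.get(0);
    managedChild.name = "flushed at commit time";
  }

}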

If your normal object also refers to entities (which by definition are already in the bag), and you call merge(), and assuming you haven’t done anything with cascade settings, then the bare minimum is done to ensure that everything referenced by JPA annotations or deployment descriptor bits turns into an entity.  So any combination of normal objects and entities linked together into a graph will become a graph of managed entities when the root of the graph is merged.  But there’s no state copying that goes on.  This is the one part of the specification that says things about as clearly as is possible, and as you’ll see, even that isn’t very clear.  Remember that the spec uses “entity” to mean both “managed entity” and “detached entity” (normal object):

If X is an entity merged to X’, with a reference to another entity Y, where cascade=MERGE or cascade=ALL is not specified, then navigation of the same association from X’ yields a reference to a managed object Y’ with the same persistent identity as Y.

The upshot is that if you put a graph into the bag, all the objects it references of any kind get put into the bag as well.

So now you’ve got entities in the bag.

If you make any change to an entity that is in the bag—even one that got in there as the result of a query or a find operation—then at transaction commit time it will be flushed back to the database.

Commit time in most cases happens automatically in the scenarios I’m talking about—your EJB’s transactional method will cause a flush() to happen, followed by a transaction commit() at the end of the method.

flush() takes all the entities in the bag and compares their state with their state as it exists in their database representation.  Any differences are encoded in either INSERT or UPDATE statements.  So you can see that if you do a query, and then call a setter on one of the entities you get back, your harmless-looking query operation will end up resulting in an UPDATE statement behind your back.
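This, incidentally, is the answer to the exercise from earlier.  A sketch, reusing the hypothetical Person entity from before:

import javax.persistence.EntityManager;

class QuietUpdate {

  // Assume this is called from inside a transactional EJB method.
  void demo(final EntityManager em) {
    final Person fred = em.find(Person.class, 1L); // fred goes into the bag
    fred.name = "Frederick"; // no save or update call anywhere...
    // ...yet at commit time, flush() spots the difference between fred and
    // his database row and issues something like:
    //   UPDATE PERSON SET NAME = ? WHERE ID = ?
  }

}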

Sometimes you need to take entities out of the bag.  This is known as detaching them, turning them into normal objects (“detached entities”).  In my opinion the notion of a persistence context should have been made more explicit in the API.  Unfortunately, as the API exists, detach() is a method on EntityManager, and takes an entity you wish to detach.  This is unfortunate because the specification does not indicate that an entity is somehow attached to an EntityManager—it constantly talks about entities being “in” a persistence context which in turn is managed by an EntityManager. One of the frustrations of the JPA API is the mixing together of all of these concepts (merging, attaching, detaching, adding to the persistence context, removing from the persistence context, managing and unmanaging entities). Anyhow, whenever you see detach(), know that this means “remove the supplied entity from the bag”.

You can also empty the whole bag by calling EntityManager#clear().

Taking entities out of the bag is not some sort of advanced operation best left to experts. It’s often exactly what you want to do. Managing entities takes a lot of work. Keeping the number of things in the bag to a bare minimum at all times is a good thing.
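One last sketch (Person as before), showing things coming out of the bag:

import javax.persistence.EntityManager;

class OutOfTheBag {

  void demo(final EntityManager em) {
    final Person p = em.find(Person.class, 1L); // p is in the bag
    p.name = "Fred";
    em.detach(p); // p is out of the bag: the name change will NOT be flushed
    em.clear();   // or: empty the whole bag in one go
  }

}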

I hope this article helps you out in your work with JPA. Please feel free to leave comments and tell me where this analogy could be improved.

jpa-maven-plugin Released

I’m pleased to announce to my enormous reading audience (both of you) the jpa-maven-plugin project. Please peruse the documentation, the Javadocs and then finally try using it and let me know what you think.


Running JPA tests, part 2


(This is part 2.  Have a look at part 1 before you continue, or this won’t make much sense.)

Now it turns out that all of the JPA providers except Hibernate (this is going to sound familiar after a while) really really really really want you to enhance or instrument or weave your entity classes.

First we’ll cover what this is, and then mention the different ways you might go about it.  Then I’ll pick one particular way and show you how to do it.

JPA entities need to be enhanced to enable things like lazy loading and other JPA-provider-specific features.  The JPA provider might, for example, need to know when a particular property of your entity has changed.  Unless the specification were to have mandated things like PropertyChangeListeners on all properties (which, thankfully, it didn’t), there isn’t any way for the provider to jump in and be notified when a given property changes.

Enter weaving or enhancement (I’ll call it weaving, following EclipseLink’s term).  Weaving is the process where–either at build time or runtime–the JPA provider gets into your classes, roots around, and transforms them using a bytecode processor like Javassist or CGLIB.  Effectively, the JPA provider rewrites some of your code in bytecode so that the end result is a class that can now magically inform the JPA provider when certain things happen to its properties.
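To make that concrete, here is a purely illustrative sketch, in Java source form, of the kind of transformation a weaver performs on a setter.  Real weavers operate on bytecode, and the callback name here is made up; each provider has its own internal hooks:

// Before weaving:
public void setName(final String name) {
  this.name = name;
}

// After weaving (conceptually):
public void setName(final String name) {
  final String old = this.name;
  this.name = name;
  // Hypothetical hook; this is how the provider learns of the change.
  providerPropertyChangeHook("name", old, name);
}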

Weaving can be done at build time, as I said, or at runtime.

Now, if you’re like most Java EE developers, the reason you’ve never had to deal with weaving is that the Java EE specification requires all JPA providers in a Java EE container to do weaving silently in the background during deployment (if it hasn’t been done already).  So in a JPA 2.0 container like Glassfish or the innards of JBoss, weaving happens automatically when your persistence unit is discovered and deployed.

But the specification does not mandate that such automatic weaving take place when you’re not in a Java EE container.  And if you’re a good unit testing citizen, you want to make sure that your unit test has absolutely no extra layers or dependencies in it other than what it absolutely requires.

So in unit test land, you have to set this up (unless you want to drag in the testing machinery yourself, which is, of course, a viable option, but here we’re focusing on keeping the number of layers to a minimum).

When you go to set up weaving, you have to choose whether you want to do it at build time or at runtime.  I’ve chosen in all three cases to focus on build time weaving.  This has some happy side effects: if you do build-time weaving correctly, then not only do you get faster, more accurate unit tests, but if you perform that weaving in the right place then you can have Maven also deploy JPA-provider-specific versions of your entity classes for you automatically.  That, in turn, means you can install those jar files in your Java EE container and skip the dynamic weaving that it would otherwise have to perform, thus shortening startup time.

Now, all three JPA providers approach build-time weaving in a different way (of course).  All three offer Ant tasks, but EclipseLink and OpenJPA also provide command-line tools.  So we’ll make use of those where we can to avoid the Ant overhead wherever possible.

Regardless of which provider we’re talking about, weaving at build time involves the same necessary inputs:

  • A persistence.xml file somewhere.  This usually lists the classes to be woven, as well as provider-specific properties.  It isn’t (for this purpose) used to connect to any database.
  • The raw classes to be woven.

Now, wait a minute.  If weaving alters the bytecode of your classes, then what happens if you try to use a Hibernate-woven class in an EclipseLink persistence unit?

Things blow up, that’s what.

This is where things get regrettably quite complicated.

Before we start weaving, we’re going to need to set up areas for each persistence provider where the weaving may take place.  To set these up, we’re going to step in after compilation and copy the output of plain compilation to each provider’s area.

In Maven speak, anytime you hear the word “copy” you should be thinking about the maven-resources-plugin.  We’re going to have to add that to our pom.xml and configure it to take the classes that result from compiling and copy them to an area for EclipseLink, an area for Hibernate and an area for OpenJPA.

Here is the XML involved for the EclipseLink copy.  This goes in the <plugins> stanza as per usual:

<plugins>
  <plugin>
    <artifactId>maven-resources-plugin</artifactId>
    <executions>
      <execution>
        <id>Copy contents of build.outputDirectory to EclipseLink area</id>
        <goals>
          <goal>copy-resources</goal>
        </goals>
        <phase>process-classes</phase>
        <configuration>
          <resources>
            <resource>
              <filtering>false</filtering>
              <directory>${project.build.outputDirectory}</directory>
            </resource>
          </resources>
          <outputDirectory>${project.build.directory}/eclipselink/classes</outputDirectory>
          <overwrite>true</overwrite>
        </configuration>
      </execution>
      <!-- and so on -->
    </executions>
  </plugin>
</plugins>

So during the process-classes phase–which happens after compilation has taken place–we copy everything in ${project.build.outputDirectory}, without filtering, to ${project.build.directory}/eclipselink/classes.  Most commonly, this means copying the directory tree target/classes to the directory tree target/eclipselink/classes.  This area will hold EclipseLink-woven classes that we can later, if we choose, pack up into their own jar file and distribute (with an appropriate classifier).
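As a sketch of that packaging step (assuming you want to attach the woven classes as a secondary artifact with an eclipselink classifier), something like the following maven-jar-plugin execution would do it; we won’t pursue packaging further here:

<plugin>
  <artifactId>maven-jar-plugin</artifactId>
  <executions>
    <execution>
      <id>Package EclipseLink-woven classes as a classified jar</id>
      <phase>package</phase>
      <goals>
        <goal>jar</goal>
      </goals>
      <configuration>
        <classesDirectory>${project.build.directory}/eclipselink/classes</classesDirectory>
        <classifier>eclipselink</classifier>
      </configuration>
    </execution>
  </executions>
</plugin>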

We’ll repeat this later for the other providers, but for now let’s just stick with EclipseLink.

Before we get to the actual weaving, however, there’s (already) a problem.  Most of the time in any reasonably large project your JPA entities are split up across .jar files.  So it’s all fine and good to talk about weaving entities in a given project, but what about other entities that might get pulled in?  What happens when weaving only happens on some classes and not others?  Unpredictable things, that’s what, so we have to make sure that at unit test time all our entities that are involved in the test–whether they come from the current project or are referred to in other .jar files–somehow get woven.  This gets tricky when you’re talking about .jar files–how do you weave something in a .jar file without affecting the .jar file?

The answer is you don’t.  You have Maven unpack all your (relevant) dependencies for you, then move the component classes into an area where they, too, can be weaved, just like the entity classes from the current project.  Let’s look at how we’ll set those pom.xml fragments up.  You want to be careful here that these dependencies are only woven for the purposes of unit testing.

The first thing is to make use of the maven-dependency-plugin, which, conveniently enough, features the unpack-dependencies goal.  We’ll configure this to unpack dependencies into ${project.build.directory}/dependency (its default output location):

<plugins>
  <plugin>
    <artifactId>maven-dependency-plugin</artifactId>
    <version>2.2</version>
    <executions>
      <execution>
        <id>Unpack all dependencies so that weaving, instrumentation and enhancement may run on them prior to testing</id>
        <phase>generate-test-resources</phase>
        <goals>
          <goal>unpack-dependencies</goal>
        </goals>
        <configuration>
          <includeGroupIds>com.someotherpackage,${project.groupId}</includeGroupIds>
          <includes>**/*.class</includes>             
        </configuration>
      </execution>
    </executions>
  </plugin>
</plugins>

Here you can see we specify which “group ids” get pulled in–this is just a means of filtering the dependency list.  You can of course alter this any way you see fit.  You’re trying to pull in any JPA entities that are going to be involved in your tests and make sure they get woven, so choose your group ids accordingly, and see the unpack-dependencies documentation for more tweaking you can do here.

So if you were to run mvn clean generate-test-resources at this point, the following things would happen:

  • Your regular classes would get compiled into target/classes.
  • Your regular resources would get copied into target/classes.
  • The entire contents of that directory would then get copied into target/eclipselink/classes.
  • Classes from certain of your dependencies would get extracted into target/dependency, ready for further copying.

Now we’ll copy the unpacked dependency classes into the test weaving area.  This little configuration stanza goes in our prior plugin declaration for the maven-resources-plugin:

<executions>
  <!-- other executions -->
  <execution>
    <id>Copy dependencies into EclipseLink test area</id>
    <goals>
      <goal>copy-resources</goal>
    </goals>
    <phase>process-test-resources</phase>
    <configuration>
      <resources>
        <resource>
          <filtering>false</filtering>
          <directory>${project.build.directory}/dependency</directory>
        </resource>
      </resources>
      <outputDirectory>${project.build.directory}/eclipselink/test-classes</outputDirectory>
      <overwrite>true</overwrite>
    </configuration>
  </execution>
</executions>

This is so that the dependencies can be woven with everything else–remember that you’ve got to make sure that all the entities in your unit tests (whether they’re yours or come from another jar involved in the unit test) are woven.

We have two more bits of copying to do to get our classes all in the right place.  Fortunately they can be combined into the same plugin execution.

The first bit is that we have to take the classes that will have been woven in target/eclipselink/classes and copy them unmolested into the test area so that they can reside there with all the unpacked dependency classes.  This is to preserve classpath semantics.  That is, we’ve already laid down the dependencies inside target/eclipselink/test-classes, so now we need to overlay them with our woven entity classes (obviously once they’ve already been woven) to make sure that in the event of any naming collisions the same semantics apply as would apply with a normal classpath in a normal environment.  At the end of this we’ll have a target/eclipselink/classes directory full of our entity classes that are waiting to be woven, and a target/eclipselink/test-classes directory that will ultimately contain our woven classes as well as those from our dependencies.

The second bit is that since sometimes unit tests define their own entities, we have to make sure that the regular old target/test-classes directory gets copied into the EclipseLink test weaving area as well, and, moreover, we have to make sure this happens last so that any test entities “shadow” any “real” entities with the same name.

As I mentioned, we can accomplish both of these goals with one more execution in maven-resources-plugin:

<execution>
  <id>Copy contents of testOutputDirectory and contents of EclipseLink area to EclipseLink test area</id>
  <phase>process-test-classes</phase>
  <goals>
    <goal>copy-resources</goal>
  </goals>
  <configuration>
    <resources>
      <resource>
        <filtering>false</filtering>
        <directory>${project.build.directory}/eclipselink/classes</directory>
      </resource>
      <resource>
        <filtering>false</filtering>
        <directory>${project.build.testOutputDirectory}</directory>
      </resource>
    </resources>
    <outputDirectory>${project.build.directory}/eclipselink/test-classes</outputDirectory>
    <overwrite>true</overwrite>
  </configuration>
</execution>

Finally, you’ll recall that I said that there are two inputs needed for weaving:

  1. A persistence.xml file somewhere
  2. The raw classes to be woven

We’ve abused the maven-dependency-plugin and the maven-resources-plugin to get (2).  Now let’s look at (1).

The persistence.xml file that is needed by the EclipseLink weaver is really just used for its <class> elements and its <property> elements.  Pretty much everything else is ignored.  This makes a certain amount of sense: EclipseLink will use it as the definitive source for what classes need to be woven if you don’t tell it anything else, and a particular property (eclipselink.weaving) will instruct EclipseLink that indeed, weaving is to be done and is to be done at build time.

So we’ll put one of these together, and store it in src/eclipselink/resources/META-INF/persistence.xml:

<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.0" xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_2_0.xsd">
  <persistence-unit name="instrumentation" transaction-type="RESOURCE_LOCAL">
    <class>com.foobar.SomeClass1</class>
    <class>com.foobar.SomeClass2</class>
    <class>com.foobar.SomeClassFromSomeDependency</class>
    <properties>
      <property name="eclipselink.weaving" value="static"/>
    </properties>
  </persistence-unit>
</persistence>

…and, back in our monstrous maven-resources-plugin stanza, we’ll arrange to have it copied:

<execution>
  <id>Copy EclipseLink persistence.xml used to set up static weaving</id>
  <goals>
    <goal>copy-resources</goal>
  </goals>
  <phase>process-classes</phase>
  <configuration>
    <outputDirectory>${project.build.directory}/eclipselink/META-INF</outputDirectory>
    <overwrite>true</overwrite>
    <resources>
      <resource>
        <filtering>true</filtering>
        <directory>src/eclipselink/resources/META-INF</directory>
      </resource>
    </resources>
  </configuration>
</execution>

It’s finally time to configure the weaving.  For EclipseLink, we’ll use the exec-maven-plugin, and we’ll go ahead and run the StaticWeave class in the same Maven process.  We will run it so that it operates in place on all the classes in the EclipseLink classes area (a later execution, not shown here, will do the same for the test area).

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <configuration>
    <includePluginDependencies>true</includePluginDependencies>
    <includeProjectDependencies>true</includeProjectDependencies>
  </configuration>
  <dependencies>
    <dependency>
      <groupId>org.eclipse.persistence</groupId>
      <artifactId>org.eclipse.persistence.jpa</artifactId>
      <version>2.2.0</version>
    </dependency>
  </dependencies>
  <executions>
    <execution>
      <id>Statically weave this project's entities for EclipseLink</id>
      <phase>process-classes</phase>
      <goals>
        <goal>java</goal>
      </goals>
      <configuration>
        <arguments>
          <argument>-persistenceinfo</argument>
          <argument>${project.build.directory}/eclipselink</argument>
          <argument>${project.build.directory}/eclipselink/classes</argument>
          <argument>${project.build.directory}/eclipselink/classes</argument>
        </arguments>
        <classpathScope>compile</classpathScope>
        <mainClass>org.eclipse.persistence.tools.weaving.jpa.StaticWeave</mainClass>
      </configuration>
    </execution>
    <!-- there will be other executions -->
  </executions>
</plugin>


This stanza simply runs the StaticWeave class, supplies it with (effectively) -persistenceinfo target/eclipselink as its first argument, and then tells it to work in place on the target/eclipselink/classes directory.

In part 3, we’ll put all of this together.

Running JPA tests


I’ve been trying to get to a place where I can achieve all the following goals:

  • Have my domain entities be pure JPA @Entity instances.
  • Run JUnit tests against those entities using the “big three” JPA providers (Hibernate, EclipseLink and OpenJPA).
  • Set up and tear down an in-memory database in a predictable way.
  • Run the whole mess with Maven without any special JUnit code.

I’m going to talk a bit about the second and last points.

To run a JPA test, you’re going to need an EntityManager.  And to get an EntityManager in a non-EJB environment, you’re going to need a JPA provider on the classpath.

You basically have three JPA providers to choose from: EclipseLink, Hibernate and OpenJPA.  These are the ones that are in wide use today, so they’re the ones I’m going to focus on.  You want to be able to back up any claim you make that your JPA entities will run under these big three providers, so to do that you need to make sure you’re unit testing them.

We’d like our tests to exercise our entities using each of these in turn.  Further, we’d like our tests to be run in their own process, with an environment as similar to the end environment as possible, while not including any extra crap.

So to begin with, we’re going to have to get Surefire (Maven’s test runner plugin) to run three times in a row, with a different JPA provider each time.

The first time I attempted this, I thought I’d use the @Parameterized annotation that comes with JUnit.  This annotation lets you set up a test class so that JUnit will run it multiple times with different input data.  I had set it up so that the input data was a String that identified the persistence unit and the JPA provider.  This worked fine, except that the multiple runs of your test do not each take place in their own process.  As a result, you end up having all three JPA providers on the classpath at the same time, and various problems can result.  The whole solution was rather brittle.

Instead, we want to have Maven control the test execution, not some looping construct inside JUnit.

The first insight I had was that you can set up a plugin in a pom.xml file to run several times in a row.  This is, of course, blindingly obvious once you see it (if you’re used to staring at Maven’s XML soup), but it took me a while to realize it’s possible.

Here, for example, is a way to configure the maven-surefire-plugin to run three times (with no other configuration):

<build>
  <plugins>
  <plugin>
    <artifactId>maven-surefire-plugin</artifactId>
    <version>2.7.2</version>
    <configuration>
      <skip>true</skip>
    </configuration>
    <executions>
      <execution>
        <id>First Surefire run</id>
        <goals>
          <goal>test</goal>
        </goals>
        <phase>test</phase>
        <configuration>
          <skip>false</skip>
        </configuration>
      </execution>
      <execution>
        <id>Second Surefire run</id>
        <goals>
          <goal>test</goal>
        </goals>
        <phase>test</phase>
        <configuration>
          <skip>false</skip>
        </configuration>
      </execution>
      <execution>
        <id>Third Surefire run</id>
        <goals>
          <goal>test</goal>
        </goals>
        <phase>test</phase>
        <configuration>
          <skip>false</skip>
        </configuration>
      </execution>
    </executions>
  </plugin>
  <!-- other plugins here -->
  </plugins>
<!-- other build info -->
</build>

Run mvn test and you’ll see Surefire run three times.  (There’s actually a MUCH shorter way to accomplish this trivial three-run configuration, but it won’t aid our ultimate cause, so I’ve opted to stay verbose here.)

We’ve told Surefire that by default it should skip running.  Then we’ve provided it with three executions (identified with <id> elements so we can tell them apart).  Each execution is structured to run the test goal during the test phase, and tells Surefire that it should not skip.

It’s important to note that the <configuration> element, when contained in an <execution>, applies only to that <execution>, and overrides any related settings in the default <configuration> (housed immediately under the <plugin> element), if there is one.  We’re going to make heavy, heavy use of this fact.

So hopefully you can start to see that the looping of the test cases can be controlled by Maven.  We know that we’re going to run Surefire three times–one time for each JPA provider–and now we want to make sure that each iteration is in its own process.  Let’s enhance the default configuration of this skeletal block a bit:

<plugin>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <skip>true</skip>
    <forkMode>always</forkMode>
    <useFile>false</useFile>
  </configuration>

We’ve told Surefire to always fork, and (purely for convenience) told it to spit any errors to the screen, not to a separate file.  So this solves the test looping problem.  We have our skeleton.  On to part 2.