JBPM Process Migrations

We are using the JBPM workflow library (version 3.2.2) on a project here at Intelliware. After some analysis, we chose JBPM as our process modeling tool because it was open source and it was easy to integrate into our technology stack (Java 4, Hibernate for persistence).

Over time, a process definition needs to change. Usually these changes reflect new business requirements, but they can also be related to a bug fix or an improvement in the existing process. So when we release a new version of a software product, we may need to update the older process instances in the database. Rather than provide a mechanism to migrate process instances, the JBPM library supports multiple versions of a process definition simultaneously:

Process instances always execute to the process definition that they are started in. But JBPM allows for multiple process definitions of the same name to coexist in the database. So typically, a process instance is started in the latest version available at that time and it will keep on executing in that same process definition for its complete lifetime. When a newer version is deployed, newly created instances will be started in the newest version, while older process instances keep on executing in the older process definitions.

— JBPM User Guide

This wasn’t very appealing to us. Our application has processes that can be resumed at any point in the future (potentially years later), so by following the approach JBPM prescribes, our developers would have to support (and the QA folks would have to test) outdated process instances for years to come.

What we wanted was the ability to migrate outdated process instances to the current process definition. The JBPM documentation addresses this approach:

An alternative approach to changing process definitions might be to convert the executions to a new process definition. Please take into account that this is not trivial due to the long-lived nature of business processes. Currently, this is an experimental area so for which there are not yet much out-of-the-box support.

As you know there is a clear distinction between process definition data, process instance data (the runtime data) and the logging data. With this approach, you create a separate new process definition in the JBPM database (by e.g. deploying a new version of the same process). Then the runtime information is converted to the new process definition. This might involve a translation cause tokens in the old process might be pointing to nodes that have been removed in the new version. So only new data is created in the database. But one execution of a process is spread over two process instance objects…

— JBPM User Guide

Unfortunately, JBPM doesn’t provide a tool to do this. Fortunately, the JBPM code is open source and well documented, so we built a migrator ourselves.

The Migrator

Given an old process instance, the migrator is responsible for transferring its data to a new process instance of the latest process definition. The migrator transfers:

    • All of the tokens. This is facilitated through the use of mappings.
    • All persistent and transient variables.
    • In addition, the migrator adds a migration memo (a String in the persistent variables map) to the new process instance, recording details of the migration (the current date, the old process definition version number, the old process instance id, etc.).

The migration is performed recursively, so each sub-process is migrated according to the above steps.

Mapping Token Nodes

One of the key challenges when migrating a process instance is the renaming or removal of wait state nodes. Wait state nodes (the green boxes in the diagram below) are where tokens reside when a process instance is persisted. A migration determines where each token should be placed: it contains a map that tells the Migrator where in the current process to put a token that sits on a deprecated wait state node. Consider the following three versions of a Process called ‘Application’:

For the Application Process, two migrations would be written [1]:

    • Migration #1 (maps tokens from version #1 to version #2):
{'init' => 'start', 'invalid' => 'Requires Review', 'end' => 'application completed'}
    • Migration #2 (maps tokens from version #2 to version #3):
{'start' => 'application received'}

Our migrator takes these individual migration maps and creates a composite map. With the composite map, the migrator can map a token from any outdated wait state node directly to the appropriate node in the current process. The composite map of these two migrations would be written as:

{'init' => 'application received', 'start' => 'application received', 'invalid' => 'Requires Review', 'end' => 'application completed'}
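To make the composition step concrete, here is a minimal sketch of how per-version maps could be folded into one composite map. It assumes plain java.util.Map values stand in for the migration maps; the class name CompositeMapSketch and the method compose are hypothetical and are not the migrator's actual code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: plain maps stand in for the real migration maps.
final class CompositeMapSketch {

    /**
     * Folds per-version node maps (oldest first) into one map that sends every
     * deprecated node name straight to its name in the current definition.
     */
    static Map<String, String> compose(List<Map<String, String>> migrations) {
        Map<String, String> composite = new LinkedHashMap<>();
        for (Map<String, String> step : migrations) {
            // Re-point targets already collected if this step renames them again.
            for (Map.Entry<String, String> entry : composite.entrySet()) {
                String renamed = step.get(entry.getValue());
                if (renamed != null) {
                    entry.setValue(renamed);
                }
            }
            // Add this step's own mappings. A key that is already present would
            // mean a deprecated name was reused, so the older mapping is kept.
            for (Map.Entry<String, String> entry : step.entrySet()) {
                composite.putIfAbsent(entry.getKey(), entry.getValue());
            }
        }
        return composite;
    }

    public static void main(String[] args) {
        Map<String, String> v1ToV2 = new LinkedHashMap<>();
        v1ToV2.put("init", "start");
        v1ToV2.put("invalid", "Requires Review");
        v1ToV2.put("end", "application completed");

        Map<String, String> v2ToV3 = new LinkedHashMap<>();
        v2ToV3.put("start", "application received");

        // Both 'init' and 'start' end up pointing at 'application received'.
        System.out.println(compose(Arrays.asList(v1ToV2, v2ToV3)));
    }
}
```

Each later migration is applied to the targets collected so far, which is why 'init' skips over 'start' and lands directly on 'application received'.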

Note that the map only needs to explain what to do with tokens on deprecated nodes (e.g. init, start, invalid, and end). No mapping is required for non-deprecated nodes (e.g. managerial audit, application completed, and Requires Review). By default, if no mapping exists for a wait state node (i.e. it is not deprecated), the migrator will attempt to move the token to a node with the same name in the new version.
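The fallback rule can be sketched in a few lines. This is an illustrative sketch under the assumption that the composite map is a plain java.util.Map; TargetNodeResolver and resolve are hypothetical names, not part of the migrator or of JBPM.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the "same name unless mapped" rule.
final class TargetNodeResolver {
    private final Map<String, String> compositeMap;

    TargetNodeResolver(Map<String, String> compositeMap) {
        // Defensive copy so later changes to the caller's map have no effect.
        this.compositeMap = new HashMap<>(compositeMap);
    }

    /** Returns the node name a token should occupy in the current definition. */
    String resolve(String oldNodeName) {
        String mapped = compositeMap.get(oldNodeName);
        // No mapping means the node is not deprecated: keep the same name.
        return mapped != null ? mapped : oldNodeName;
    }

    public static void main(String[] args) {
        Map<String, String> composite = new HashMap<>();
        composite.put("init", "application received");
        composite.put("invalid", "Requires Review");

        TargetNodeResolver resolver = new TargetNodeResolver(composite);
        System.out.println(resolver.resolve("init"));             // mapped node
        System.out.println(resolver.resolve("managerial audit")); // unchanged name
    }
}
```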

The example I am using includes a discrete migration for each version of the Application process, but this will not always be the case. Depending on the changes being made to the Process Definition, it is possible that the developer will not be required to include a migration at all. This is a good thing. It means that we don’t have to write a migration for every single process definition change, and the same set of migrations can be applied in the test environment and in the production environment (where the number of definitions is sure to differ).

But there is a cost to this approach. Deprecated nodes can never be used in future definitions of your process (perhaps terminated would have been a more accurate term than deprecated). So with our Application Process, the init, start, invalid, and end nodes can never be used again in the process definition (at least, not as wait states). Doing so would break the migrator.

Defining a Migration

Migrations are written as Java classes. The class must implement the Migration interface, it must not be abstract, and it must contain a default constructor. The Migration interface declares one method that must be implemented:

public StateNodeMap createNodeMap();

Here is how you would express the first migration for the ‘Application’ Process Definition example:

public class ApplicationProcessMigration001 implements Migration {
    // Maps deprecated version #1 wait states to their version #2 replacements.
    public StateNodeMap createNodeMap() {
        return new StateNodeMap(new String[][] {
            {"init", "start"}, {"invalid", "Requires Review"}, {"end", "application completed"}});
    }
}

Defining a Migrator

How do we create the migrator and use it to perform a migration? Like this:

Migrator migrator = new Migrator("ApplicationProcess", jbpmContext, "com.foobar.ApplicationProcessMigration");
ProcessInstance newProcess = migrator.migrate(oldProcess);

The parameters used to create the Migrator instance are:

    • The name of the Process Definition that it will be migrating.
    • A JbpmContext instance. The migrator requires this to look up the latest Process Definition.
    • The Migration base class name. The migrator assumes that your migrations use the pattern package.ClassName{migration#}. For the base class name "com.foobar.ApplicationProcessMigration", the migrator will attempt to load and instantiate classes named "com.foobar.ApplicationProcessMigration001", "com.foobar.ApplicationProcessMigration002", etc., until it can’t find any more valid classes.

The third point is another example of convention over configuration, resulting in less maintenance and tedium for the developer.
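The naming convention could be implemented with plain reflection. The sketch below is a guess at the general shape, not the migrator's real code: the class MigrationDiscoverySketch, the method discover, the simplified Migration interface (returning String[][] instead of StateNodeMap), and the two demo classes are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of loading numbered migration classes by convention.
final class MigrationDiscoverySketch {

    /** Simplified stand-in for the Migration interface described above. */
    interface Migration {
        String[][] createNodeMap();
    }

    /** Loads baseClassName001, baseClassName002, ... until one cannot be used. */
    static List<Migration> discover(String baseClassName) {
        List<Migration> found = new ArrayList<>();
        for (int number = 1; ; number++) {
            String className = String.format("%s%03d", baseClassName, number);
            try {
                Class<?> candidate = Class.forName(className);
                // Needs an accessible no-argument constructor on a concrete class.
                found.add((Migration) candidate.getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException | ClassCastException e) {
                break; // missing, abstract, or not a Migration: stop looking
            }
        }
        return found;
    }

    static final class DemoMigration001 implements Migration {
        public String[][] createNodeMap() {
            return new String[][] {{"init", "start"}};
        }
    }

    static final class DemoMigration002 implements Migration {
        public String[][] createNodeMap() {
            return new String[][] {{"start", "application received"}};
        }
    }

    public static void main(String[] args) {
        // Derive the base name by dropping the "001" suffix of the first demo class.
        String first = DemoMigration001.class.getName();
        String base = first.substring(0, first.length() - 3);
        System.out.println("migrations found: " + discover(base).size());
    }
}
```

Stopping at the first gap keeps the loop simple, but it also means a skipped number (001, 003) would silently hide later migrations, so a real implementation might want to guard against that.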

Unit and Integration Testing

I’m putting this section last, but it was one of our top concerns when considering an approach to migrations. We debated a number of approaches to testing and most of them were deemed to be too complex and error prone.

We already unit tested our JBPM process definitions to make sure that transitions point to valid nodes and that all actions declared in the process are available on the classpath. With regard to the migrations, we have a base test that asserts that:

    • A developer has not introduced a deprecated node into the current process definition.
    • All current nodes in the composite map exist in the process definition.
    • All current nodes in the composite map are valid wait state nodes.
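The three assertions above can be expressed as simple set checks. The following is a sketch under stated assumptions: the process definition is reduced to two sets of node names, the composite map is a plain java.util.Map, and MigrationMapChecks, findProblems, and the node sets in main are hypothetical, chosen only to mirror the Application example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of the base test's three checks.
final class MigrationMapChecks {

    /** Returns one message per violated rule; an empty list means the map is valid. */
    static List<String> findProblems(Map<String, String> compositeMap,
                                     Set<String> currentNodes,
                                     Set<String> currentWaitStates) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> entry : compositeMap.entrySet()) {
            String deprecated = entry.getKey();
            String target = entry.getValue();
            if (currentNodes.contains(deprecated)) {
                problems.add("deprecated node present in current definition: " + deprecated);
            }
            if (!currentNodes.contains(target)) {
                problems.add("target node missing from current definition: " + target);
            } else if (!currentWaitStates.contains(target)) {
                problems.add("target node is not a wait state: " + target);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, String> composite = new LinkedHashMap<>();
        composite.put("init", "application received");
        composite.put("start", "application received");
        composite.put("invalid", "Requires Review");
        composite.put("end", "application completed");

        // Assumed node names for the current version of the Application process.
        Set<String> nodes = new HashSet<>(Arrays.asList(
                "application received", "managerial audit",
                "Requires Review", "application completed"));

        // Here every node is treated as a wait state, so no problems are reported.
        System.out.println(findProblems(composite, nodes, nodes));
    }
}
```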

Our application is deployed several times a week to a test environment where QA folks test functionality and validate stories. This provides a great opportunity to discover problems with our application early, and the migration code is no exception. We write migrations throughout the entire development cycle (not just when we release), so if a QA person loads up an outdated process from last week, the migrator will be run. This gives us a chance to find runtime bugs that the unit tests can’t locate.

  [1] I’m using the Ruby literal syntax for a Hash because Java has no equivalent map literal.

This post was originally published on Google Docs
