BPM in a Microservice World — Part 3

Many BPM practitioners are used to working with a software suite whose Process Manager component controls the transaction as it progresses through activities. The process is generally authored and visualized graphically in BPMN or BPEL. When applying BPM in the microservice world, we don’t have that visibility or control.

A microservice architecture, more or less, forms a web where many services can call each other in an ad hoc manner. Such an architecture is rarely designed visually the way we are used to in BPM. That will likely change as MSA tools and frameworks mature, but for now each service is relatively independent and less attention is given to how the entire solution behaves as a whole.

Business processes realized with a traditional architecture or a microservice architecture can still benefit from the practice of Process Management. There can still be resource constraints, rework, SLA violations, lack of auditing, and so on. The problem is that we can’t easily see and understand what is happening visually, as we would with a traditional solution.

To solve this, we can apply a concept called Process Mining. By collecting event logs from the MSA and applying discovery algorithms to the events, we can actually recreate the kind of process diagrams we are used to in BPM.

The logs can be in any format; however, there is a standard called XES (eXtensible Event Stream) that can be used to represent the data needed to produce process diagrams. Generally we need to know the resources that were involved in an activity, its start and stop times, and some kind of identifier that can be used to correlate related activities. The identifier is the hard part, since you likely won’t want to force microservice designers to accommodate this need. There are some creative ways to impute such an identifier from the proximity of execution times combined with some other datum, as in the sketch below.
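
To give a feel for the time-proximity heuristic, here is a minimal sketch in Java. The LogEvent shape, the window parameter, and the case-numbering scheme are all illustrative assumptions, not part of any standard:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.*;

// A minimal sketch of the time-proximity heuristic: events whose start
// times fall close enough to the previous event's end are assumed to
// belong to the same case. A real implementation would combine this
// with some other datum (a shared key, host, payload hash, etc.).
public class CaseCorrelator {

    record LogEvent(String resource, String activity, Instant start, Instant end) {}

    public static Map<LogEvent, String> assignCaseIds(List<LogEvent> events, Duration window) {
        // Sort by start time so temporally adjacent events become neighbors.
        List<LogEvent> sorted = new ArrayList<>(events);
        sorted.sort(Comparator.comparing(LogEvent::start));

        Map<LogEvent, String> caseIds = new LinkedHashMap<>();
        Instant lastEnd = null;
        int caseCounter = 0;

        for (LogEvent e : sorted) {
            // Open a new case when the gap since the previous event exceeds
            // the window; otherwise attribute the event to the current case.
            if (lastEnd == null || Duration.between(lastEnd, e.start()).compareTo(window) > 0) {
                caseCounter++;
            }
            caseIds.put(e, "case-" + caseCounter);
            lastEnd = e.end();
        }
        return caseIds;
    }
}
```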

Once the logs are accumulated, they can be transformed into XES format (sketched below) so that they can be imported into existing process mining tools for analysis. I’ve used two such tools: an open source tool called ProM and a commercial one called Disco. ProM isn’t very easy to learn, but once you do, it is quite powerful. It can produce a BPMN diagram that you can then import into your traditional BPM suite so that process simulation can be run against the transaction logs.
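
As an illustration, here is a rough sketch of that transformation, emitting only the XES attributes discussed above: a case identifier per trace, plus the activity name, resource, and timestamp per event. The LogEvent shape is the same illustrative assumption as before, and XML escaping and error handling are omitted:

```java
import java.io.PrintWriter;
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.Map;

// A minimal sketch of serializing correlated events into XES so they can
// be imported into ProM or Disco. The concept:name, org:resource, and
// time:timestamp keys come from the standard XES extensions.
public class XesWriter {

    record LogEvent(String resource, String activity, Instant start, Instant end) {}

    public static void write(Map<String, List<LogEvent>> traces, PrintWriter out) {
        out.println("<log xes.version=\"1.0\" xmlns=\"http://www.xes-standard.org/\">");
        for (Map.Entry<String, List<LogEvent>> trace : traces.entrySet()) {
            out.println("  <trace>");
            // The imputed case id names the trace.
            out.printf("    <string key=\"concept:name\" value=\"%s\"/>%n", trace.getKey());
            for (LogEvent e : trace.getValue()) {
                out.println("    <event>");
                out.printf("      <string key=\"concept:name\" value=\"%s\"/>%n", e.activity());
                out.printf("      <string key=\"org:resource\" value=\"%s\"/>%n", e.resource());
                out.printf("      <date key=\"time:timestamp\" value=\"%s\"/>%n",
                        DateTimeFormatter.ISO_INSTANT.format(e.start()));
                out.println("    </event>");
            }
            out.println("  </trace>");
        }
        out.println("</log>");
    }
}
```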

In doing this you may find that the solution could benefit from more instances of a particular microservice. You may see that there are many messages traveling through just a few services and perhaps they can be broken down more. You may also find that human resources are causing a backlog. Perhaps transactions that originate in Europe are being processed in the US and could benefit from having a node in the cluster local to the originator.

This is all stuff that we traditionally do in process optimization. By applying Process Mining, we can now do the same with processes running over microservices.

BPM in a Microservice World — Part 2

Back in the early days of “workflow” we had control of the transaction, usually a document, from the start of the process to the end. As IT evolved into the SOA/ESB era, we had a little bit less control but for the most part the process engine orchestrated everything.

There were frequent hand-offs to message queues but normally the message would come back to the process engine so it would continue to orchestrate the process.

The microservice world is different. Instead of having a process engine or an ESB controlling a small number of large services, we have many small services that can potentially send and receive messages or respond to events from any of the other services.

It’s more like a web. One initiating message or event to a particular service could affect the exchange of many hundreds of messages between the microservices before the initial request is considered complete. That can make BPM practitioners a bit uneasy due to the loss of control.

We may no longer have control, but we can still have visibility into the process. We can still apply our usual patterns for SLA and exception management, and for human and compensating workflows. This can be accomplished through what I call a “tracking” process.

I have a process running today that interacts with microservices written with Vert.x, a microservices framework.

Vert.x includes an Event Bus and a cluster manager, among other features. A Vert.x cluster is made up of one or more nodes. A microservice is packaged as a jar module that includes a number of what they call Verticles (British spelling, I guess). The verticles are deployed to any number of Vert.x nodes.

Once the verticles are deployed, the Event Bus manages the flow of messages and responses throughout the cluster. This all happens asynchronously, so there is no way for us to control that flow from the process manager.
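
For readers unfamiliar with Vert.x, a verticle that participates in such a flow can be as small as the following sketch (assuming Vert.x 3.x; the event bus address is an illustrative name):

```java
import io.vertx.core.AbstractVerticle;

// A minimal verticle sketch. It registers a consumer on an illustrative
// event bus address; the Event Bus delivers messages to it asynchronously,
// on whichever node in the cluster the verticle happens to be deployed.
public class PersistOpportunityVerticle extends AbstractVerticle {

    @Override
    public void start() {
        vertx.eventBus().consumer("opportunity.persist", message -> {
            // Persist the opportunity here (omitted), then reply so the
            // sender knows this step finished.
            message.reply("persisted");
        });
    }
}
```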

We can still create a process in BPMN that looks like the traditional process. Here is an example.

Opportunity Intake Process

This is a simplified version of a real process that’s been running for a couple of years on Vert.x. It receives business opportunities from an outside source. Once one is received, we need to save it locally. Then we run it through a machine learning classifier to see if it is the type of opportunity the client might be interested in. If it is, then a human needs to have a look at it. Otherwise, it is rejected.

We receive thousands of these every day. Due to the parallel nature of Vert.x we are able to spawn many requests over the cluster and get this work done quickly.

The persistence part is quite performant, so we don’t need many instances of that verticle in the Vert.x cluster. The classification part is slow and requires more resources, so we have many instances of that verticle across the cluster.

The process above looks like a traditional process but in fact, we are not in control of the transaction here. In each activity, we are sending a message using the Vert.x Event Bus and then waiting until an event happens at a future time. Once that event is received, we move on to the next activity, which does the same.

Unfortunately, the classification activity doesn’t always complete in a timely manner. In this example we added a boundary timer so that if the classification takes too long, we notify a user and then terminate the process.

The activities that involve microservices in the main process are modeled as subprocesses. Here is an example of the Persist Opportunity subprocess.

Persist Message that calls Vert.x

The first activity is a custom work item handler I created for Vert.x. It will send a message to the Vert.x cluster using the Event Bus.
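
A minimal sketch of what such a handler can look like, assuming a jBPM-style WorkItemHandler interface (the parameter names and the way the Vertx instance is obtained are illustrative):

```java
import io.vertx.core.Vertx;
import org.kie.api.runtime.process.WorkItem;
import org.kie.api.runtime.process.WorkItemHandler;
import org.kie.api.runtime.process.WorkItemManager;

// A sketch of a work item handler that fires a message into the Vert.x
// cluster. The "address" and "payload" parameter names are illustrative.
public class VertxSendHandler implements WorkItemHandler {

    private final Vertx vertx;

    public VertxSendHandler(Vertx vertx) {
        this.vertx = vertx;
    }

    @Override
    public void executeWorkItem(WorkItem workItem, WorkItemManager manager) {
        String address = (String) workItem.getParameter("address");
        Object payload = workItem.getParameter("payload");
        // Hand the message to the Event Bus; everything after this point
        // happens asynchronously inside the Vert.x cluster.
        vertx.eventBus().send(address, payload);
        manager.completeWorkItem(workItem.getId(), null);
    }

    @Override
    public void abortWorkItem(WorkItem workItem, WorkItemManager manager) {
        // Nothing to clean up in this sketch.
    }
}
```

The work item completes as soon as the message is sent; the subprocess then sits at the catch signal described next until Vert.x reports completion.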

That message may cause a number of other services to be called within Vert.x. We don’t care about that; all we need to know is when it’s all finished. I created a customization for Vert.x so that the process manager will be sent a signal when a particular Vert.x service is complete. When that happens, the Catch Signal will be executed. At that point, control will be returned to the calling process, which can move on to the next activity.
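
A rough sketch of that Vert.x-side customization, assuming (purely for illustration) that a jBPM-style KieSession is reachable from the verticle and that the process instance id is carried in a message header:

```java
import io.vertx.core.AbstractVerticle;
import org.kie.api.runtime.KieSession;

// A sketch of a verticle that listens on an illustrative "completed"
// address and signals the waiting process. The signal name and header
// key below are illustrative assumptions.
public class ProcessSignalVerticle extends AbstractVerticle {

    private final KieSession ksession;

    public ProcessSignalVerticle(KieSession ksession) {
        this.ksession = ksession;
    }

    @Override
    public void start() {
        vertx.eventBus().consumer("opportunity.persist.completed", message -> {
            long pid = Long.parseLong(message.headers().get("processInstanceId"));
            // Fire the signal the subprocess's catch node is waiting for.
            ksession.signalEvent("PersistComplete", message.body(), pid);
        });
    }
}
```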

So, there you go! We can model processes as we are accustomed to even though we are not in control of the transaction as it moves through the various microservices. You can definitely use this pattern to combine microservice activities with traditional ones and apply our usual process management patterns to all of it.