Mastering JBoss Drools 6

By: Mariano De Maio, Mauricio Salatino, Esteban Aliverti

Overview of this book

Mastering JBoss Drools 6 will provide you with the knowledge to develop applications involving complex scenarios. You will learn how to use KIE modules to create and execute Business Rules, and how the PHREAK algorithm internally works to drive the Rule Engine decisions. This book will also cover the relationship between Drools and jBPM, which allows you to enrich your applications by using Business Processes. You will be briefly introduced to the concept of complex event processing (Drools CEP) where you will learn how to aggregate and correlate your data based on temporal conditions. You will also learn how to define rules using domain-specific languages, such as spreadsheets, database entries, PMML, and more. Towards the end, this book will take you through the integration of Drools with the Spring and Camel frameworks for more complex applications.

Why do we use rules?


At this point, you might still be a bit puzzled about why rules are useful. If we think in terms of one rule, or just a few, it might seem better to implement the logic directly in imperative code such as Java. As developers, we're used to breaking requirements down into a list of steps to follow, and giving away that control can be intimidating.

However, the main strength of business rules doesn't come from one rule or a small group of rules; it comes from a large, ever-changing set of rules that defines a system so complex that maintaining it with regular code would require extensive work.

Many rules can work together to define complex systems, and the business rules code base grows organically. Whether we need to implement new requirements, modify existing ones, replace parameters, or change the behaviour of our system in new, unexpected ways, all we need to do is add the rules that now apply and remove the ones that no longer do. This is possible because business rules work on the following principles:

  • They're independent

  • They can be easily updated

  • Each rule controls the minimal amount of information needed

  • They allow more people of different backgrounds to collaborate

Rules independence

A Business Rule, all by itself, can't do much. The biggest strength of a business rule-based system comes from having a lot of rules interacting with each other. This interaction, however, is not something a rule should be directly aware of most of the time. That's what we mean when we say rules should be independent: each rule should be able to detect a particular set of circumstances and act upon it, without needing anything other than the data of its domain.

When we think about it, this is the usual way rules exist once we start formalizing them. Take any law book and you will see laws represented as a group of rules, each one in the form of a clause. Most of them simply present a scenario and a specific action or interpretation for that scenario. Most of these clauses won't mention any other clauses; a few do, but they tend to be the exception. There is a reason for this: it makes the rules easier to understand and define, and less prone to misinterpretation.

The same principle applies when we define business rules for an organization. Each rule should try not to depend on any other specific rule. Instead, rules should depend only on the data provided by the domain. This allows a rule to make sense by itself, without needing any explanation beyond its own content.
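
As a minimal sketch of what this looks like in Drools' rule language (the Customer fact and its Category values are hypothetical, not taken from the book), an independent rule references nothing but domain data:

  rule "Flag high-value customers"
  when
      // the condition only looks at domain data
      $c : Customer( totalPurchases > 10000 )
  then
      // the action only works on what this rule matched
      $c.setCategory( Category.GOLD );
      update( $c );
  end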

However, sometimes rules do depend on each other in an indirect way: the conclusion drawn by one rule can be used in the conditions of another. These new pieces of information, derived by the Business Rule engine from the existing data, are called inferences, and they are of great use in extending the reach of our rules.

Rule execution chaining

As we mentioned in the previous section, a good Business Rule is an independent entity that depends on nothing but the domain data to make sense. This doesn't mean that each rule should work on completely different data structures; if that were required, you might end up with very complex rules that would be hard to maintain.

If a rule is too complex, it can be divided into smaller rules; even then, however, the independence of rules is still important, and you shouldn't explicitly invoke one rule from another. That would imply controlling the sequence flow, and we've already stated that declarative programming doesn't allow this.

Instead, we can split the complexity by defining rules that draw conclusions from the base domain data and add that new information back to the domain. These conclusions are called inferences. Later on, other rules can use this new information as part of their conditions, regardless of how it was determined. Let's look at the following example to fully understand this splitting of rules:

  • When we get a signal from a fire alarm, we infer that there is a fire

  • When there is a fire, we call the fire department

  • When the fire department is present, we let them in to do their work

These three rules could be condensed into a single, more complex rule: when we get a signal from a fire alarm, we call the fire department and let them in to do their work. However, by splitting the rules into simpler components, we can easily extend the abilities of our rule engine. We could reuse the first inference that we make (that there is a fire) to trigger other actions, such as activating the emergency sprinklers, disabling the elevators, or calling our insurance company.
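
In DRL, these three rules might look roughly like the following sketch (AlarmSignal and FireDepartment are hypothetical domain facts; Fire is declared inline as the inferred fact):

  // Fire is the inferred fact; it carries no fields in this sketch
  declare Fire
  end

  rule "Infer a fire from an alarm signal"
  when
      AlarmSignal( active == true )
  then
      // the inference: new information added to the domain
      insert( new Fire() );
  end

  rule "Call the fire department when there is a fire"
  when
      Fire()
  then
      // placeholder action; a real rule would call a service or update a fact
      System.out.println( "Calling the fire department" );
  end

  rule "Let the fire department in"
  when
      Fire()
      FireDepartment( onPremises == true )
  then
      System.out.println( "Letting the fire department in to do their work" );
  end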

When a rule no longer makes sense, we can remove it from the rule engine. If a new rule is required, we can create it and take advantage of the already available inferred data. As the sequence flow will be controlled by the engine, we don't have to worry about the order in which things are going to be executed or where the new rules fit among the rest of the existing rules.

Atomicity of rules

As we can create more rules that take advantage of already established inferences, the simpler our rules are, the more extensible they become. Therefore, another principle of good rule writing is that we should make our rules as simple as possible, to the point where they cannot be divided into anything smaller that could still be considered a rule. This principle is called Rule Atomicity.

Atomic rules are simple to understand. They are usually designed with the minimal set of conditions needed to take an action or infer the occurrence of a situation. As they are independent, they still make sense by themselves. Rule atomicity, rule independence, and inference capabilities together make business rules the simplest components we can use to define the behaviour of our systems. Simplicity allows a clear understanding of why decisions are made in the system, making rules self-explanatory and allowing us to keep track of every rule that intervened in a specific decision. This is the reason why laws have been the building blocks of society's internal regulations for thousands of years.
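
As a rough illustration (with hypothetical Order and Customer facts), a rule that bundles an inference and an action can usually be split into one atomic rule that only infers the situation and another that only acts on it:

  // the inferred situation, declared as its own fact type
  declare PreferredOrder
      orderId : long
  end

  rule "A large order from a gold customer is a preferred order"
  when
      $o : Order( total > 1000, $id : customerId )
      Customer( id == $id, category == Category.GOLD )
  then
      PreferredOrder po = new PreferredOrder();
      po.setOrderId( $o.getId() );
      insert( po );
  end

  rule "Preferred orders ship for free"
  when
      PreferredOrder( $oid : orderId )
      $o : Order( id == $oid )
  then
      $o.setFreeShipping( true );
      update( $o );
  end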

Ordering of rules

We've already mentioned that rules don't follow one specific order. The sequence flow is determined by the rule engine, which has to decide, based on the available data from the domain, which rules should fire and in what order. This means that the order in which the rules are defined is not important; all that matters is whether the data matches a rule's conditions.

There are ways of ordering rules that compete for execution when the same conditions are met in the domain. This ordering works as a second level of prioritization; the data in the domain model remains the primary factor that determines whether a rule is activated at all. These ordering mechanisms, which we will discuss in later, more technical chapters, should be reserved for special cases: exceptions to the common way we define rules, rather than the norm. If we find ourselves controlling every single rule and the order in which it should fire, we should rethink the way we're writing our rule definitions.
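
One such mechanism in Drools is the salience rule attribute, which we will revisit in later chapters. As a quick, hypothetical sketch (reusing an imaginary Order fact), a higher salience value wins when two rules compete for the same data:

  rule "Run the fraud check before any discount"
      salience 10   // higher salience fires first; the default is 0
  when
      $o : Order( total > 5000 )
  then
      System.out.println( "Checking order for fraud" );
  end

  rule "Apply a discount to large orders"
      // default salience (0), so this fires after the fraud check
  when
      $o : Order( total > 5000 )
  then
      System.out.println( "Applying a discount" );
  end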

This is something that developers getting their first glance at declarative programming find difficult to absorb. Nonetheless, it brings a lot of improvements to both our runtime and our development efforts, mainly because, if the order doesn't matter, we can add rules wherever we prefer:

  • Collaboration between rules becomes simpler to manage

  • Conflict avoidance is simpler

  • More people can work on the development of rules, which makes involving other areas of the organization a very real possibility

Rule execution life cycle

The rule engine optimizes the evaluation of conditions and makes sure that the rules to fire are determined in the fastest way possible. However, the rule engine doesn't execute our business rules the moment a condition is detected, unless we tell it to. When a rule's conditions evaluate to true for a group of data, the rule and the triggering data are added to a list. This is part of an explicit rule life cycle, with a clear split between rule evaluation and rule execution. Rule evaluation adds rule actions, together with the data that triggered them, to a component that we will call the Agenda. Rule execution is done on command: the moment we notify the rule engine, it fires all the rules present in the Agenda.

As we stated earlier, we don't control which rules are going to be fired; it's the engine's responsibility to determine this, based on the business rules that we create and the data that we feed to it. However, once the engine determines which business rules it should fire, we have control over when they are fired. This is done through a method invocation on the rule engine.
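
In the Drools 6 KIE API, that invocation is the fireAllRules method. The following is a minimal sketch; the session name "rulesSession" and the Order fact are assumptions that would come from your own kmodule.xml and domain model:

  import org.kie.api.KieServices;
  import org.kie.api.runtime.KieContainer;
  import org.kie.api.runtime.KieSession;

  public class RuleRunner {
      public static void main(String[] args) {
          KieServices ks = KieServices.Factory.get();
          // loads the KIE module (rules plus kmodule.xml) found on the classpath
          KieContainer kContainer = ks.getKieClasspathContainer();
          // "rulesSession" is a hypothetical session name defined in kmodule.xml
          KieSession kSession = kContainer.newKieSession("rulesSession");
          try {
              // evaluation: inserting facts creates matches in the Agenda,
              // but no rule action runs yet (Order is a hypothetical fact class)
              kSession.insert(new Order(1L, 1500.0));
              // execution: fire everything currently in the Agenda
              int fired = kSession.fireAllRules();
              System.out.println(fired + " rules fired");
          } finally {
              kSession.dispose();
          }
      }
  }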

Once we fire the rules, each match present in the Agenda is executed. Rule execution might modify the data in our domain; if these modifications cause a rule to match the new data, new matches are added to the Agenda, and if they cause an existing match to no longer be true, that match is cancelled. This cycle continues until no more matches are available in the Agenda for the current data, or until the rule engine execution is forced to stop. The following diagram shows how this workflow is executed:

This execution life cycle will continue firing all the rules that the rule engine has decided to add to the Agenda based on the rule definitions and the domain data we feed to it. Some rules might not fire and some rules might fire multiple times.

In the following chapters, we will learn how to control which rules should fire; however, we will always maintain the principles of Business Rule writing that we have already established: independence and atomicity. The more we learn about the configuration of the rule engine, the more we will trust it to do its job. For the moment, it will be a leap of faith; with every step, however, we will learn how to control the rule engine until we can be 100% sure that it will do exactly what we expect of it.

Collaboration with Rules

As the sequence flow is beyond our direct control when creating business rules, one main advantage we gain is that we don't have to worry about code placement. As all rules are independent and the sequence flow is determined by the engine at runtime, it doesn't matter where we place a rule.

With common imperative programming languages such as Java, each instruction happens at a specific moment in the program's execution, and finding the specific point in the code where we need to add our modifications can involve reviewing a large part of the code base. Entire design patterns have been created around managing this limitation so that developers can collaborate while working on the same system. Most of them work by splitting the code base into units such as modules, classes, and methods, to make this collaboration between developers easier to manage.

However, the main limitation of imperative code is that once the system has been designed, we cannot easily break out of the boundaries along which we split the code base. When we create the design, we are forced to foresee the changes that are likely to be needed in the future, something that can be very difficult to achieve. If we fail to do so and many developers have to modify the same code sections for different requirements, their work will be prone to conflicts.

Declarative programming avoids this limitation because the specific order of the rules doesn't matter. People defining different aspects of the same domain model can collaborate without conflicts, as a good place to add another Business Rule is anywhere among the existing business rules. The execution outcome will be essentially the same, regardless of the order.

Let's compare imperative and declarative code. When we have to modify an imperative block of code, we cannot just do it anywhere: there is a specific place for each specific change, and if we put it somewhere else, the code either doesn't work as expected or doesn't perform as well as it could.
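
A rough, hypothetical sketch of the imperative side (not the book's original comparison): each new requirement has to be wired into exactly the right point of an existing sequence of steps:

  public void processOrder(Order order) {
      validate(order);          // step 1
      applyDiscounts(order);    // step 2
      // a new fraud check has to go exactly here, between discounts and
      // invoicing; two developers editing this method will likely conflict
      createInvoice(order);     // step 3
      notifyCustomer(order);    // step 4
  }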

Business rules, on the other hand, define each rule as an isolated block of code, so people can add their work anywhere without any problem. This makes application development with business rules easier in collaborative environments, as it is far less prone to conflicts.
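
On the declarative side, a new rule is just another self-contained block; in a sketch like the following (again with hypothetical Order facts), it can be added before or after the existing rules, or in a separate DRL file, without touching them:

  rule "Invalid orders are flagged"
  when
      $o : Order( total <= 0 )
  then
      $o.setInvalid( true );
      update( $o );
  end

  // a new rule can be added here, above, or in another DRL file entirely;
  // the engine decides when it applies
  rule "Large first purchases require review"
  when
      $o : Order( total > 5000, firstPurchase == true )
  then
      $o.setRequiresReview( true );
      update( $o );
  end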

With less chance of conflict, we can concentrate our time and energy on the solution we are trying to build, instead of worrying about merging work across different components or within the same component.

Having more places where code can be added without conflicts opens the door to involving more people in the development life cycle. This can dramatically speed up the development and maintenance of our software solutions.

Involving more people with Rules using a BRMS

Thanks to the increased collaboration that business rules provide during development, we can have more people working on defining the decisions of our systems. The next bottleneck we usually face at this point is finding more people who understand how to write the rules.

Writing rules is, at least in the beginning, a technical task. It requires a certain level of knowledge about how to define conditions and actions—topics that we will cover in detail in the next chapters—and getting more people to learn how to write these rules takes a little time.

Even if we get technical people to learn how to write rules quickly, it is usually not enough. This is not because of a technical limitation, but because the people who hold the practical knowledge we need to capture as business rules are usually not the most available, or the most tech-savvy, group. They might be, of course, in which case you have probably found one of the best groups to work with on Business Rule-based systems. In most cases, however, they will have the practical knowledge but not the time or the desire to learn how to write technical rules.

For these groups of business experts, there are platforms that make rule writing accessible in a more user-friendly way. These platforms combine user-friendly editors with versioning and publishing capabilities, and are called Business Rule Management Systems (BRMS). Basically, business experts can create rules using the same everyday language they are familiar with and already use to think about decisions. You will learn more about these user-friendly ways of writing rules in Chapter 5, Human Readable Rules. For now, let's just say that we can define business rules in natural language, using editors that allow business experts to work directly on the rules at a speed very similar to that of technical experts.

The following is a small screenshot where we can see one of these editors in the KIE Workbench, a Drools-based BRMS:

Letting the rule engine do its job

So far, we've covered an introductory explanation of the structure of business rules. Whenever we had to explain how the rules are executed, we simply said that the rule engine takes care of it. When we use business rules, we trust the rule engine to determine which rules should fire, based on the domain data that we send to it. At this stage, we will try to explain how the rule engine decides which rules should be fired, and when.

In the previous sections, we saw briefly how rules can be translated into execution trees, where decisions are made based on the data, following a declarative paradigm. In this section, we will explain how this structure helps create the most efficient execution possible from our rule definitions.

Rule engine algorithm

The rule engine transforms the business rules that we define into an executable decision tree through a specific algorithm. The performance of the execution tree depends on the optimizations the algorithm can generate. The Drools 6 framework defines its own algorithm, focused on higher performance, called PHREAK and created by Mark Proctor. It is based on a series of optimizations and redesigns of a pre-existing algorithm called RETE, created by Charles Forgy. PHREAK is one of the most efficient and best-performing open source rule engine algorithms to date.

In the generated execution tree, every condition in our rules is transformed into a node, and the way the different conditions connect to each other in our rules determines the way these nodes are connected. As we add data to the rule engine, it is evaluated in batches, flowing through the network along the most optimized paths possible. The evaluation finishes when the data reaches a leaf node, which represents a rule to be fired. These rule matches are added to a list, and a command is later invoked to fire all of them, or a subgroup.

Due to this continuous, live evaluation of rule conditions, the rule engine bases its performance on having all the data needed for rule evaluation available in memory. The details of how the algorithm builds the decision tree will be introduced later in this book.

Each time we add more data to the rule engine, it enters through the root of the execution tree. Every optimization of this execution tree follows two main focus points:

  • It breaks down all the conditions into the smallest possible units, so that the execution tree can be shared and reused between rules as much as possible

  • It tries to perform only one operation to move to the next level down, until it reaches either a condition that evaluates to false or a leaf node, where a rule is marked for execution

Every piece of data is evaluated in the most efficient way possible; optimizing these evaluations is the main focus of the rule engine. In the following chapters, we will discuss how to write rules that take advantage of these optimizations in order to make our business rules as fast as possible.
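
For instance, when two rules start with the same condition, the engine builds that condition as a single shared node in the tree, so each piece of data is evaluated against it only once. A hypothetical DRL sketch of that sharing (with the same imaginary Customer fact as before):

  // both rules share the Customer( category == Category.GOLD ) condition,
  // so the corresponding node in the execution tree is built only once
  rule "Gold customers get a yearly gift"
  when
      $c : Customer( category == Category.GOLD )
  then
      $c.setGiftPending( true );
      update( $c );
  end

  rule "Long-standing gold customers get a personal agent"
  when
      $c : Customer( category == Category.GOLD, yearsAsCustomer > 5 )
  then
      $c.setPersonalAgentAssigned( true );
      update( $c );
  end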