Hypercube Graph Decomposition for Boolean Simplification: An Optimization of Business Process Verification

This paper deals with the optimization of business process (BP) verification by simplifying the equivalent algebraic expressions of the processes. Current approaches to business process verification use formal methods such as automated theorem proving and model checking to verify the accuracy of the business process design. The processes are abstracted to mathematical models in order to make the verification task possible. However, the structure of those mathematical models is usually a Boolean expression of the business process variables and gateways, which leads to a combinatorial explosion when the number of literals is above a certain threshold. This work aims at optimizing the verification task by managing the problem size. A novel algorithm of Boolean simplification is proposed. It uses hypercube graph decomposition to find the minimal equivalent formula of a business process model given in its disjunctive normal form (DNF). Moreover, the optimization method is fully automated and can be applied to any business process having the same formula, because the Boolean simplification rules are independent of the studied processes. This new approach has been numerically validated by comparing its performance against the state-of-the-art Quine-McCluskey (QM) method through the optimization of several processes with various types of branching.

Keywords: Business process verification; minimal disjunctive normal form; Boolean reduction; hypercube graph; Karnaugh map; Quine-McCluskey


I. INTRODUCTION
Business processes are key assets of any organization or information system [1], [2]. They are the communication interface and the medium of exchange between the organization's stakeholders [3].
BP describe the core business and govern the operation of a system. Business Process Model and Notation (BPMN) is the widely used standard for modeling BP in view of its simplicity and usability [4], [5]. Nevertheless, BP may contain structural flaws [5] due to poor design or human errors. Hence, the verification task is a crucial step between the modeling and the execution phases of any BP. The complexity of real-life BP and the use of automated modeling tools often lead to complex models called "spaghetti" process models [6], [7], where manual verification is difficult to perform [8]. Therefore, automated formal methods are used instead. Automatic verification includes Model Checking (MC) [5], [9] and Automated Theorem Proving (ATP) [10], [11].
The MC approach uses software called a model checker to exhaustively check whether an equivalent abstraction of the BP structure satisfies some properties expressed in temporal logics. Simple Promela INterpreter (SPIN) is a widely used model checker that verifies whether a model written in a C-like modeling language called Process Meta LAnguage (Promela) meets properties expressed as Linear Temporal Logic (LTL) formulas [12], [13], [14]. Although this method has the advantage of indicating the counterexample violating the checked property, it suffers from the state explosion problem [12], since its complexity is high and the number of states grows exponentially.
ATP (or automated deduction) is a subfield of mathematical logic dealing with the automatic (or semi-automatic) proving of mathematical theorems. The computer programs that perform this task are called theorem provers [15]. First-order theorem proving is one of the most mature subfields of ATP thanks to its expressivity, which allows the specification of arbitrary problems [16]. However, some statements are undecidable [17] in the theory used to describe the model. Therefore, current research [18], [17], [19] deals with the challenge of finding subclasses of first-order logic (FOL) that are decidable and suitable for mapping such models.
Higher-order logics are more expressive and can map a wider range of problems than FOL, but theorem proving for these logics is not as developed as for FOL [20].
Regardless of the approach used to verify a BP, its logical structure is deduced as a propositional logic formula written in Disjunctive Normal Form (DNF) [2], [7]. The DNF can be reduced to a minimal form so that its manipulation and practical implementation become more efficient. Thus, an optimization of the BP verification is achieved.
Since the simplification of Boolean expressions is extensively used in the analysis and design of algorithms and logical circuits, several methods were developed to perform this task:
− The algebraic manipulation of Boolean expressions aims at finding an equivalent expression by applying the laws of Boolean algebra. However, there is no fixed algorithm for minimizing a given expression with such methods. Thus, choosing which Boolean theorems to apply is left to the expert's ability.
− The Karnaugh map is a pictorial and straightforward method [21]. First, a grid of the truth table of the function to minimize has to be drawn. The minterms of this grid have to be arranged in Gray code order, so that any two adjacent cells differ in the value of exactly one variable. The problem is then converted into finding rectangular groups of adjacent cells containing ones; these groups should have an area that is a power of two (i.e., 1, 2, 4, 8, ...). Consequently, unwanted variables are eliminated. This method is easy to understand; however, it is a manual process that is not practical when dealing with more than six variables [22].
− The tabulation method (also known as the Quine-McCluskey algorithm) [23] is a useful minimization algorithm when dealing with more than four variables. It has a tabular form that makes it easy to implement in computer programs. It consists of finding all prime implicants of the function to minimize and then selecting the necessary ones that cover the function. Although this method is more practical than the previous ones, it is impaired by redundancy during the search for prime implicants. Moreover, the application of Petrick's method [24] in a second phase is required to define essential prime implicants and resolve the cyclic covering problem.
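To make the first phase of the tabulation method concrete, the following is a minimal Python sketch of prime implicant generation by repeated pairwise combining. It is an illustration under our own assumptions, not the reference implementation used in the comparison: implicants are encoded as strings over `0`, `1`, and `-`, and the helper names `combine` and `prime_implicants` are hypothetical. The cover selection phase (including Petrick's method) is not shown.

```python
from itertools import combinations

def combine(p, q):
    """Merge two implicant patterns that differ in exactly one non-dash position, else None."""
    diff = [i for i, (x, y) in enumerate(zip(p, q)) if x != y]
    if len(diff) == 1 and "-" not in (p[diff[0]], q[diff[0]]):
        i = diff[0]
        return p[:i] + "-" + p[i + 1:]
    return None

def prime_implicants(minterms, n):
    """Repeat the pairwise combining pass until no pattern can be merged further."""
    current = {format(m, f"0{n}b") for m in minterms}
    primes = set()
    while current:
        merged, used = set(), set()
        for p, q in combinations(sorted(current), 2):
            r = combine(p, q)
            if r is not None:
                merged.add(r)
                used.update((p, q))
        primes |= current - used   # patterns that merged with nothing are prime
        current = merged
    return primes

print(sorted(prime_implicants({4, 5, 12, 13}, 4)))  # ['-10-']
```

The same merged pattern is typically produced by several different pairs, which is the redundancy mentioned above.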
This article introduces a novel technique to optimize the verification of a BP by simplifying its equivalent logical formula written in Disjunctive Normal Form (DNF). This new simplification algorithm searches for the largest hypercubes of lower dimensions (called elements) that are sufficient to cover all vertices in a partial cube graph mapping of the BP. A minimal equivalent DNF is then expressed as a disjunction of the abstractions of the hypercubes needed in this coverage.
The rest of this paper is structured as follows: Section II describes how the BP is modeled in BPMN. Section III presents the main Boolean algebra simplification rules as well as the hypercube properties that are used in the developed algorithm. Section IV explains in detail the simplification algorithm and goes through the speedup tweaks used. Our findings are presented and discussed in Section V. Finally, a conclusion is given.

II. BUSINESS PROCESS MODELING AND NOTATION
The most widely used business process modeling standard is Business Process Model and Notation (BPMN). It is a specification of the Object Management Group (OMG) [25]. The modeling is done by interconnecting standard graphical symbols grouped in five categories. The Swimlanes and Artifacts categories are used to group objects into lanes and to provide additional descriptions. The Data category is used to describe the flow of data through the process.
The main role of the three categories above is to increase the readability of the model without affecting its execution. Therefore, the whole BP flow can be described with the remaining two categories: Flow Objects and Connecting Objects [25].
The Event elements indicate the various incidents that can occur during the process execution. Three main types of events can be distinguished according to their trigger time: 1) Start Events, 2) End Events, and 3) Intermediate Events. They indicate the beginning or the end of a process, or any event that may arise in between.
The Activity elements are used to indicate any task performed in a process. Depending on the level of abstraction, an Activity may be compound or atomic.
The Sequence Flows are the arcs connecting related events and activities. They define the chronological order of the elements within a process. If the activation of a sequence flow depends on some condition, then a Boolean variable is defined above it. Thus, the immediate successor element is activated only if this condition is true.
The Gateway elements are used to indicate any divergence or convergence in a Sequence Flow. Depending on their behavior, the five types of Gateways are: Exclusive, Inclusive, Parallel, Event-Based, and Complex. They determine the branching, forking, merging, and joining of paths.
The graph composed of Flow Objects and their Sequence Flow connections describes the possible executions of a BP. Each path of the graph going from the start event to the end event indicates a single execution scenario. As an example, Fig. 2 shows a simplified payment/delivery BP.
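The idea that each start-to-end path is one execution scenario can be sketched in a few lines of Python. The graph below is a hypothetical process invented for illustration (it is not the model of Fig. 2), the function name `execution_paths` is ours, and the sketch assumes an acyclic flow graph; a process with loops would need a cycle guard.

```python
def execution_paths(graph, start, end):
    """Enumerate every start-to-end path of an acyclic flow graph by depth-first search."""
    stack = [(start, [start])]
    paths = []
    while stack:
        node, path = stack.pop()
        if node == end:
            paths.append(path)
            continue
        for nxt in graph.get(node, []):
            stack.append((nxt, path + [nxt]))
    return paths

# Hypothetical process: a gateway after "order" branches to two payment tasks.
graph = {
    "start": ["order"],
    "order": ["pay_card", "pay_cash"],
    "pay_card": ["deliver"],
    "pay_cash": ["deliver"],
    "deliver": ["end"],
}
for p in execution_paths(graph, "start", "end"):
    print(" -> ".join(p))
```

In this toy graph the two printed paths correspond to the two branches of the gateway.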
Once the modeling of the BP is done, the designer must choose which verification method to apply. The structure of the BP model is then extracted as a mathematical expression that depends on the gateways used and the branching of the sequence flows. The next section presents the elements needed to map the logical structure of a BP and the main rules used to simplify its equivalent formula.

III. BINARY REPRESENTATION AND REDUCTION RULES
A. Definitions
1) Boolean variable: A Boolean variable is a variable that takes only one of the logical values: either 1 (meaning True) or 0 (meaning False). The complement of a variable $A$ is denoted $\overline{A}$ and has the opposite value of $A$. A literal is either the logic variable $A$ or its complement $\overline{A}$.
For example, the minterm $m_6$ is the shorthand notation of $AB\overline{C}$ because $(110)_2 = 6$.
3) Disjunctive Normal Form (DNF): A logical formula is considered to be in Disjunctive Normal Form (DNF) if and only if it is a disjunction (sum) of one or more conjunctions (products) of one or more literals [26]. A DNF formula is in full disjunctive normal form if each of its variables appears exactly once in every conjunction (minterm). The only propositional operators in DNF are and (denoted with $.$ or $\wedge$), or (denoted with $+$ or $\vee$), and not (denoted with $\neg A$ or $\overline{A}$).
The not operator can only be used as part of a literal, which means that it can only precede a propositional variable. A full DNF formula of three variables $A$, $B$, and $C$ can also be written in shorthand notation as a sum of its minterms $m_d$.

B. Boolean Algebra
1) Boolean algebra identities: In Boolean algebra, there are four basic identities for addition (logical or) and four for multiplication (logical and) that hold true for all possible values of the variables of a Boolean statement. Table I gives a summary of those identities.
2) Boolean algebra properties: In Boolean algebra, there are three basic properties: commutative, associative, and distributive. Table II gives a summary of those properties.
3) Boolean simplification rules: By using the identities and properties of Boolean algebra, a Boolean statement can be simplified by reducing the number of literals using the following rules:
$$A + AB = A \quad (4)$$
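A rule such as (4) can be checked exhaustively, since a formula over $n$ variables has only $2^n$ assignments. The following Python sketch does this by brute force; the helper name `equivalent` is our own and is only meant as an illustration.

```python
from itertools import product

def equivalent(f, g, n):
    """True when f and g agree on every assignment of n Boolean variables."""
    return all(bool(f(*v)) == bool(g(*v)) for v in product((0, 1), repeat=n))

# Rule (4): A + AB = A
print(equivalent(lambda a, b: a or (a and b), lambda a, b: a, 2))  # True
```

The same check applies to any candidate rule, at a cost that doubles with each added variable.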

C. The Hypercube Graph Representation
A Boolean statement of $n$ variables can be written in DNF with at most $2^n$ minterms of $n$ literals. By creating a vertex for each minterm $m_i$ and linking every two vertices whose binary representations differ in a single digit (the Hamming distance between their minterms is one), a hypercube graph (denoted $n$-cube or $Q_n$) is created [27]. Fig. 3 gives a flat representation of the hypercube graph $Q_4$.
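The construction of $Q_n$ from minterm indices can be sketched directly: two indices are adjacent when their bitwise XOR has exactly one bit set. The function name `hypercube_edges` is an illustrative choice, and the quadratic pair enumeration is only suitable for small $n$.

```python
from itertools import combinations

def hypercube_edges(n):
    """Edges of Q_n: pairs of minterm indices at Hamming distance one."""
    return [
        (u, v)
        for u, v in combinations(range(2 ** n), 2)
        if bin(u ^ v).count("1") == 1
    ]

edges = hypercube_edges(4)
print(len(edges))  # 32, i.e. n * 2**(n-1) for n = 4
```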
A hypercube graph $Q_n$ can be viewed as the disjoint union of two hypercubes $Q_{n-1}$ in which an edge is added from each vertex (minterm) in one copy of $Q_{n-1}$ to the corresponding vertex in the other copy. As shown in Fig. 4, the joining edges form a perfect matching between the blue and black vertices. In fact, every hypercube $Q_n$ with $n > 0$ is composed of elements, or cubes of a lower dimension, on the $(n-1)$-dimensional surface of the parent hypercube. The smallest elements are the vertices (points). There are $2^n$ of them.
In general, the number of $m$-cubes on the boundary of a given $n$-cube is $2^{n-m}\binom{n}{m}$.

A partial cube is an isometric subgraph of a hypercube: the distance between any two vertices in the subgraph is the same as the distance between those vertices in the hypercube.
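The standard count of $m$-dimensional elements of $Q_n$, $2^{n-m}\binom{n}{m}$, can be cross-checked by enumeration: choose which $m$ bit positions are free and fix the remaining $n-m$ bits. The generator `subcubes` below is a hypothetical helper written for this check only.

```python
from itertools import combinations, product
from math import comb

def subcubes(n, m):
    """Yield each m-dimensional subcube of Q_n as a frozenset of vertex indices."""
    for free in combinations(range(n), m):
        fixed = [p for p in range(n) if p not in free]
        for fixed_bits in product((0, 1), repeat=len(fixed)):
            base = sum(b << p for p, b in zip(fixed, fixed_bits))
            yield frozenset(
                base | sum(b << p for p, b in zip(free, free_bits))
                for free_bits in product((0, 1), repeat=m)
            )

# Enumerated count next to the closed-form count, for every m in Q_4.
for m in range(5):
    found = set(subcubes(4, m))
    print(m, len(found), 2 ** (4 - m) * comb(4, m))
```

For $Q_4$ this gives 16 vertices, 32 edges, 24 squares, 8 cubes, and the 4-cube itself.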
Lemma III.1 Let $Q_n$ be a hypercube graph of dimension $n > 0$ with minterms $m_i$, where $i \in [0, 2^n[$. Let $f$ be a DNF formula given by the disjunction of all minterms of $Q_n$. Then $n$ variables of $f$ can be simplified. The abstracted equivalent formula is easily obtained by identifying the common literals between the minterm with the maximum shorthand notation value (denoted $m_{max}$) and the one with the minimum shorthand notation value (denoted $m_{min}$). This abstraction is called: abstraction $m_{max}$ with filter $m_{min}$.
Proof: If $n = 1$, then $Q_1$ is composed of two minterms $m_0$ and $m_1$ of one variable $v_0$. By applying the identity $v_0 + \overline{v_0} = 1$, an abstraction of the variable $v_0$ is given (abstraction $m_1$ with the filter $m_0$).
If $n = 2$, then $Q_2$ is composed of four minterms $\{m_0, m_1, m_2, m_3\}$, each composed of two variables $v_0$ and $v_1$. By applying the same identity to two opposite sides of $Q_2$, an abstraction of the variables $v_0$ and $v_1$ is given (the abstraction $m_3$ with the filter $m_0$). In fact:
$$m_0 + m_1 + m_2 + m_3 = \overline{v_1}(\overline{v_0} + v_0) + v_1(\overline{v_0} + v_0) = \overline{v_1} + v_1 = 1$$
Let us assume that Lemma III.1 holds for some $n > 0$. Let $Q1_n$ and $Q2_n$ be two hypercubes whose disjoint union, completed with the joining edges, forms the hypercube $Q_{n+1}$. Each minterm $m_x = m_{(V_n V_{n-1} \ldots V_2 V_1 V_0)_2}$ in $Q1_n$ is matched with another minterm $m_y = m_{(\overline{V_n} V_{n-1} \ldots V_2 V_1 V_0)_2}$ in $Q2_n$. $m_x$ and $m_y$ can be abstracted to $m_x$ because they differ by the value of the single variable $V_n$. In fact:
$$m_x + m_y = (V_n + \overline{V_n})\, V_{n-1} \ldots V_1 V_0 = V_{n-1} \ldots V_1 V_0$$
which gives an abstraction of the variable $V_n$. As a result, the hypercube $Q_{n+1}$ gives an abstraction of $n + 1$ variables: $n$ variables with the hypercube $Q1_n$ plus that of $V_n$.
In the next section, we explain how Lemma III.1 can be used as a keystone to perform the simplification of any formula written in DNF.
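The "abstraction $m_{max}$ with filter $m_{min}$" of Lemma III.1 amounts to a bitwise comparison of the two extreme minterms: a variable is kept when both agree on its value and is eliminated otherwise. The Python sketch below illustrates this reading; the function name `abstraction` and the `-` placeholder for an eliminated variable are our own conventions, and the result is only meaningful when the vertex set really is a hypercube element.

```python
def abstraction(m_max, m_min, n):
    """Pattern of a subcube from its extreme minterms: '-' marks an eliminated variable."""
    out = []
    for k in range(n - 1, -1, -1):      # most significant variable first
        hi, lo = (m_max >> k) & 1, (m_min >> k) & 1
        out.append(str(hi) if hi == lo else "-")
    return "".join(out)

print(abstraction(15, 1, 4))   # ---1
print(abstraction(13, 4, 4))   # -10-
print(abstraction(3, 0, 2))    # --
```

The last call corresponds to the full $Q_2$ of the proof, where both variables are abstracted.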

IV. SIMPLIFICATION ALGORITHM
In order to simplify a Boolean expression written in DNF, the expression is represented as a partial cube $PQ_n$ of the hypercube graph $Q_n$, where $n$ is the number of variables in the DNF formula. The developed algorithm consists in finding the largest elements (hypercubes) $Q_m$, with $m \le n$, such that their union covers all vertices of the partial cube $PQ_n$. The fewer hypercubes $Q_m$ are needed, the more abstract the equivalent formula is. As an example, the following DNF formula can be considered:
$$f = m_1 + m_3 + m_4 + m_5 + m_7 + m_9 + m_{11} + m_{12} + m_{13} + m_{15}$$
This formula is represented as a partial cube $PQ_4$ with vertices $m_1$, $m_3$, $m_4$, $m_5$, $m_7$, $m_9$, $m_{11}$, $m_{12}$, $m_{13}$, and $m_{15}$. Fig. 5 shows that the vertices of $PQ_4$ (green and yellow vertices) can be covered with the union of two hypercubes $Q_3$ and $Q_2$.

[...] methods verification algorithms suffer from high complexity, since the problem they try to solve is NP-hard; hence the necessity to reduce the problem size by minimizing the number of literals.
In this paper, a novel technique of business process simplification has been presented. A simplification tool that performs literal reduction using hypercube decomposition has been built. Moreover, the simplification algorithm is entirely automated, which makes the optimization task accessible to regular BP designers. Promising research directions can be explored in further depth, such as how machine learning algorithms could be used to accelerate the simplification algorithm, how the algorithm can be modified to reduce its space complexity, and finally, the possibility of adapting the algorithm, in view of its characteristics, for quantum computing.

Fig. 2. An Example of a Simple Payment/Delivery BP.

2) Minterm: A minterm is a conjunction of the literals of all the variables. For instance, for three Boolean variables $A$, $B$, and $C$, the expressions $AB\overline{C}$, $A.B.\overline{C}$, and $A \wedge B \wedge \neg C$ denote the same minterm. It means that $C$ has the value 0 and both $A$ and $B$ have the value 1. By assigning a power of 2 to each variable of a minterm, the shorthand notation is $m_d$, where $d$ denotes the decimal value of the binary expression.
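The shorthand notation is a plain binary-to-decimal conversion, as the following Python sketch illustrates. The function names are hypothetical, the first variable is assumed to be the most significant bit, and a trailing apostrophe is used here in place of the overbar to mark a complemented literal.

```python
def minterm_index(bits):
    """Return d such that the minterm is m_d; bits[0] is the most significant variable."""
    d = 0
    for b in bits:
        d = (d << 1) | b
    return d

def minterm_literals(d, names):
    """Render m_d as a product of literals; a trailing apostrophe marks a complement."""
    n = len(names)
    return "".join(
        name if (d >> (n - 1 - k)) & 1 else name + "'"
        for k, name in enumerate(names)
    )

print(minterm_index([1, 1, 0]))      # 6
print(minterm_literals(6, "ABC"))    # ABC'
```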

Fig. 5. Reduction of a full DNF of 4 variables to hypercubes $Q_2$ and $Q_3$.

Using Lemma III.1, three variables $A$, $B$, and $C$ can be reduced with the hypercube $Q_3$ composed of vertices $\{m_1, m_3, m_5, m_7, m_9, m_{11}, m_{13}, m_{15}\}$. Thus $Q_3$ is reduced to $m_{15}$ with the filter $m_1$, which is equivalent to the expression $D$, since it is the only variable that keeps the same value in all minterms of $Q_3$ (we have $m_{max} = m_{15} = m_{(1111)_2}$ and $m_{min} = m_1 = m_{(0001)_2}$; the abstraction is $(---1)_2$). The hypercube $Q_2$, composed of $\{m_4, m_5, m_{12}, m_{13}\}$, gives an abstraction of two variables $A$ and $D$. Thus $Q_2$ is reduced to $m_{13}$ with the filter $m_4$, which is equivalent to the expression $B\overline{C}$ (we have $m_{max} = m_{13} = m_{(1101)_2}$ and $m_{min} = m_4 = m_{(0100)_2}$; the abstraction is $(-10-)_2$).
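The cover described for Fig. 5 can be checked mechanically. The sketch below is only a verification of this one example, not the simplification algorithm itself: it expands the two abstraction patterns back into minterm indices and compares the result with the truth table of $D + B\overline{C}$. The helper `expand` is a hypothetical name, and $A$ is assumed to be the most significant bit.

```python
from itertools import product

minterms = {1, 3, 4, 5, 7, 9, 11, 12, 13, 15}

def expand(pattern):
    """All minterm indices matched by a pattern such as '-10-' (A is the leftmost position)."""
    choices = [(0, 1) if c == "-" else (int(c),) for c in pattern]
    return {int("".join(map(str, bits)), 2) for bits in product(*choices)}

q3, q2 = expand("---1"), expand("-10-")
print(sorted(q3))            # [1, 3, 5, 7, 9, 11, 13, 15]
print(sorted(q2))            # [4, 5, 12, 13]
print(q3 | q2 == minterms)   # True

# Cross-check against the simplified expression D + B.C'
simplified = {
    8 * a + 4 * b + 2 * c + d
    for a, b, c, d in product((0, 1), repeat=4)
    if d or (b and not c)
}
print(simplified == minterms)  # True
```

Note that the two hypercubes share the vertices $m_5$ and $m_{13}$, which is harmless because $X + X = X$.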