Improved-Node-Probability Method for Decision Making in Priority Determination of Village Development Proposed Program

This research proposes a new method that combines the node probability (NP) and the cumulative frequency of indicators within the framework of Bayesian networks to calculate participation weights. The method uses the PLS-PM approach to examine the relationship structure of the participation factors and to estimate latent variables. Data were collected using questionnaires from the participants who submit proposals, namely the village residents themselves. The participation factors identified in this research were divided into two categories: internal factors (abilities) and external factors (motivation). The internal factors included gender, age, education, occupation, and income, while the external factors included motivation relating to economic, political, sociocultural, norm-related, and knowledge-related issues. In addition, three factors directly affect the level of participation: attendance at meetings, participation in giving suggestions, and involvement in decision making. The test results showed that applying the participation weight when prioritizing proposed village development programs changes the final ranking of decisions, with a recall of 50%, a precision of 80%, and an accuracy of 50%.

Keywords: Bayesian networks; PLS-PM; participation weight; decision making; village


I. INTRODUCTION
In recent decades, there has been increasing interest among communities in decision making [1]. Community participation has become part of the implementation of decisions in various sectors, such as government [2], integrated watershed management [3], [4], agricultural development [5], environmental management [6], forest management [7], and planning [8].
The significance of participation is asserted by Conyers [9], who states that first, community participation is a tool to collect information about the conditions, needs, and attitudes of the local community, without which development programs and projects will end in failure; second, people tend to have a higher level of confidence in particular development projects or programs if they feel involved in the process of preparation and planning of such projects or programs as this makes them know more about the project and develop a sense of belonging towards the project; third, engaging the community in the development of their own community constitutes a right acknowledged in democracy.
Determining which proposed village development programs take priority is a form of group decision making. In group decision making, community participation can be seen in the process of proposing programs and making decisions. In practice, however, participatory decision making does not work properly. The role of the government in implementation remains centralized, with top-down planning. As a result, the aspirations and the resulting proposals lack quality, decision making is dominated by the village elite, the process becomes an annual routine, and the needs of the community are not accommodated.
This research aims to identify factors affecting participation in village development program planning and to quantify them in the form of participation scores. These participation scores are then used in decision making to rank the proposed development programs in order of priority.

A. Factors Affecting Participation
There are many factors affecting community participation in the village development process. Factors classified as internal factors or abilities include gender, age, education level, income, and occupation. The participation of men and women in development differs because of the established social system that differentiates the positions of men and women. Such differences in position and degree lead to differences in rights and duties between men and women, where men have a number of privileges compared to women. Thus, men tend to contribute more [10]-[15].
Age is a factor that influences one's attitude towards existing social activities. The middle-to-upper age group, whose moral attachment to the values and norms of the community is more stable, tends to have a higher level of participation than the other age groups [10], [15]-[18].
Education is considered to affect the way one behaves towards his/her environment, an attitude necessary for improving the welfare of the whole community [10], [14], [16]-[19].

B. Bayesian Networks
Bayesian networks [27] are a state-of-the-art model for reasoning under uncertainty in the machine learning field. They are especially useful in real-world problems composed of many different variables with a complex dependency structure. Examples of areas where these models have been successfully applied include genomics, text classification, automatic robot control, and fault diagnosis.
Every Bayesian network has a qualitative part and a quantitative part. The qualitative part (i.e., the structure of the Bayesian network) consists of a directed acyclic graph (DAG) where the nodes correspond to the variables in the domain problem and the edges between two variables correspond to direct probabilistic dependencies. On the other hand, the quantitative part consists of the specification of the conditional probability distributions that are stored in the nodes of the network [28].
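The two parts of a Bayesian network can be illustrated with a minimal Python sketch. The four binary nodes, the structure (A and B are parents of C, and C is the parent of D), and all probability values are hypothetical and chosen only for illustration; they are not taken from this study. The dictionaries play the role of the quantitative part, while the way `joint` conditions each node on its parents encodes the qualitative part.

```python
from itertools import product

# Hypothetical binary network: A -> C <- B, C -> D.
# All probabilities are made-up illustration values.
P_A = {True: 0.3, False: 0.7}          # P(A)
P_B = {True: 0.6, False: 0.4}          # P(B)
P_C_GIVEN_AB = {                       # P(C = True | A, B)
    (True, True): 0.9,
    (True, False): 0.7,
    (False, True): 0.5,
    (False, False): 0.1,
}
P_D_GIVEN_C = {True: 0.8, False: 0.2}  # P(D = True | C)


def bernoulli(p_true, value):
    """Probability of `value` for a binary node with P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(a, b, c, d):
    """P(a, b, c, d) as the product of each node's CPD given its parents."""
    return (
        P_A[a]
        * P_B[b]
        * bernoulli(P_C_GIVEN_AB[(a, b)], c)
        * bernoulli(P_D_GIVEN_C[c], d)
    )


def marginal_d(d=True):
    """P(D = d), obtained by summing the joint over A, B and C."""
    return sum(joint(a, b, c, d) for a, b, c in product([True, False], repeat=3))
```

Summing the joint distribution over the remaining variables, as `marginal_d` does, is the simplest form of inference; it becomes expensive for large networks, where dedicated inference algorithms are normally used.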
A DAG describes the relationship between attributes and consists of nodes and arcs, where each arc describes a probabilistic dependence. If an arc is drawn from A to B, then A is the parent or immediate predecessor of B, and B is a descendant of A. An illustration of a DAG can be seen in Fig. 1.
In the illustration, the arcs carry information about causal relationships. For example, the node (attribute) C is influenced by whether the attribute A is present and, likewise, by whether the attribute B is present. It can also be seen that the attribute D is conditionally independent of the attributes A and B given C. This implies that once the value of the attribute C is known, the attributes A and B provide no additional information about whether the attribute D occurs. Suppose $X = (x_1, \dots, x_n)$ is a data tuple described by the attributes $Y_1, \dots, Y_n$. To calculate the joint probability of the variables, (1) is used:

$$P(x_1, \dots, x_n) = \prod_{i=1}^{n} P\left(x_i \mid \mathrm{Parents}(Y_i)\right) \tag{1}$$

To calculate $P(B \mid A)$, Bayes' theorem is used, which gives the probability of one attribute conditional on another. The formula of Bayes' theorem can be seen in (2):

$$P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)} \tag{2}$$

The weight (level of importance) of participation is calculated using the Partial Least Squares Path Modeling (PLS-PM) method and Bayesian networks. The PLS-PM method is used to estimate the values of latent variables. A latent variable is a variable that cannot be measured directly and is instead measured through indicator variables. In addition, PLS-PM is used to examine the relationship structure of the factors that influence participation, which is built on expert opinion. The Bayesian network method is used to construct the DAG structure and to calculate the node probability of each indicator variable. The procedure consists of the following steps:

1) Collect data using questionnaires from the participants of a particular community, which in this case is the village community.
2) Then, identify parameters consisting of indicators in each of the factors affecting participation.
3) Afterwards, build a model illustrating the relationship between the factors affecting participation in the form of a Directed Acyclic Graph (DAG) structure of Bayesian networks. The initial DAG structure was developed based on experts' views derived from previous research and interviews with participants.
4) Estimate the score of latent variables and test the structure of the DAG model already built using PLS-PM. The test results will determine whether the constituent parameters of the model structure built will change or not.
5) The DAG model structure, together with the complete data set, was then used to calculate the Bayesian network inference. The results of the NP and f calculation were saved as "a reference value" used as a guideline in calculating the participation interest score of each participant. The participation score was calculated from two variables, namely the NP and f of the indicators for each participation factor, using (3), where $W_p$ refers to the participation score, NP refers to the indicator's node probability score, and $f_i$ refers to the indicator's frequency score.
After the participation score had been obtained, the score was normalized. Normalization is a technique to standardize the data, or make the data ranges equal, so that no attribute is too dominant over the other attributes. The normalization process was undertaken using (4). The normalized participation score was then used in the calculation to determine which proposed village development programs take priority.
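The scoring and normalization steps can be sketched in Python under two loudly stated assumptions, since the exact forms of (3) and (4) are not reproduced in this text: the per-indicator products NP × f are assumed to be summed, and the normalization is assumed to be min-max scaling between the lowest and highest attainable scores. The reference values, category names, and function names below are hypothetical and are not the values of Table 1.

```python
from collections import defaultdict

# Hypothetical reference values: (indicator, category) -> (NP, f).
# These numbers are illustrative and are NOT taken from Table 1.
REFERENCE = {
    ("age", "18-40"): (0.40, 0.50),
    ("age", "41-60"): (0.60, 0.50),
    ("education", "primary"): (0.30, 0.25),
    ("education", "secondary"): (0.70, 0.75),
}


def participation_score(answers, reference=REFERENCE):
    """Combine NP * f over one participant's indicator categories.

    ASSUMPTION: the per-indicator products are summed.
    """
    return sum(reference[a][0] * reference[a][1] for a in answers)


def score_bounds(reference=REFERENCE):
    """Lowest and highest attainable scores, one category per indicator."""
    products = defaultdict(list)
    for (indicator, _category), (np_score, freq) in reference.items():
        products[indicator].append(np_score * freq)
    lowest = sum(min(values) for values in products.values())
    highest = sum(max(values) for values in products.values())
    return lowest, highest


def normalize(score, lowest, highest):
    """ASSUMPTION: min-max normalization onto the interval [0, 1]."""
    if highest == lowest:
        raise ValueError("score range is empty")
    return (score - lowest) / (highest - lowest)


# One hypothetical participant with two indicators.
answers_p1 = [("age", "18-40"), ("education", "secondary")]
w_p1 = participation_score(answers_p1)
w_min, w_max = score_bounds()
w_p1_normalized = normalize(w_p1, w_min, w_max)
```

If the original formula combines the products differently, only `participation_score` and the two sums in `score_bounds` would need to change; the normalization step is independent of that choice.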

A. Establishing the Structure for the Relationship between Factors Affecting Participation
This study used questionnaire data from 130 participants, with a model consisting of 3 latent variables and 13 manifest variables (indicators). The parameters identified in the study are divided into two types: internal parameters (ability) and external parameters (motivation). The internal parameters are gender, age, education, occupation, and income, while the external parameters are economic, political, sociocultural, norm-related, and knowledge-related motivation. In addition, three parameters directly affect the level of participation: attendance at meetings (meeting), giving suggestions (proposal), and involvement in making decisions (decision).
The first step was to establish the structure of the relationship between the factors affecting participation based on experts' views, as illustrated in Fig. 2. Afterwards, the model illustrating this structure was tested using PLS-PM.
The outer model evaluation specifies the relationship between latent variables and their indicators; in other words, the outer model defines how each manifest variable (indicator) corresponds to its latent variable. The tests on the outer model for formative indicators are:
- Significance of weights. The weight of each formative indicator on its construct should be significant.
- Multicollinearity. A multicollinearity test is performed to examine the relationship between indicators. Whether the formative indicators exhibit multicollinearity is determined from the VIF value. VIF values between 5 and 10 suggest that the indicator exhibits multicollinearity.
The test results show that almost all indicator variables produce a significant weight, that is, not less than 0.1, which is the allowed limit [29]. Only one indicator variable, gender, has a value below 0.1 (0.048), so the gender variable can be excluded from the model. The path coefficient test results can be seen in Fig. 3. The manifest variables in a formative block must be tested for multicollinearity. Multicollinearity among indicators in a formative block is tested using the variance inflation factor (VIF). A VIF value > 10 indicates collinearity between the indicators in that formative block [30]. The test results show that all indicator VIF values are less than 10 (Fig. 3), so it can be concluded that there is no collinearity between indicators.
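The VIF check can be sketched in plain Python. The sketch relies on the standard identity that the VIF of indicator j equals the j-th diagonal element of the inverse of the indicators' correlation matrix. The function names and the sample data are hypothetical, and the sketch makes no claim about which software produced the values in Fig. 3.

```python
from math import sqrt


def correlation_matrix(columns):
    """Pearson correlation matrix of equally long indicator columns."""
    n = len(columns[0])
    unit = []
    for col in columns:
        mean = sum(col) / n
        dev = [x - mean for x in col]
        norm = sqrt(sum(d * d for d in dev))
        unit.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(u, v)) for v in unit] for u in unit]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("indicators are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF of each indicator: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical indicator scores for five respondents.
indicator_1 = [1, 2, 3, 4, 5]
indicator_2 = [2, 1, 4, 3, 5]
vif_values = vif([indicator_1, indicator_2])
```

For two indicators the result reduces to 1 / (1 - r^2), where r is their correlation, so uncorrelated indicators give a VIF of 1 and the value grows without bound as r approaches 1. Columns with zero variance are not handled by this sketch.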
After assessing the quality of the measurement model, the next step was to assess the structural model. To examine the results of each regression in a structural equation, it is necessary to display the results contained in the inner model. In addition to the results of the regression equations, the quality of the structural model was evaluated by examining three quality indexes or metrics, of which the coefficient of determination $R^2$ is reported here.
$R^2$ is the coefficient of determination of the endogenous latent variables. For each regression in the structural model, $R^2$ was interpreted in the same way as in multiple regression analysis: it indicates the amount of variance in an endogenous latent variable that is explained by its independent latent variables. The $R^2$ value obtained in this research is 0.849.

B. Calculating the NP and Frequency of Each Indicator
The results of testing with the questionnaire data show that the gender factor has no significant correlation, so the DAG structure used in the inference calculation involved only 12 indicators. Afterwards, a DAG structure was developed based on the data set obtained from the test results and the PLS estimation of the latent variables. The DAG structure was built using an expert approach, as shown in Fig. 4.
The DAG structure (Fig. 4) is a graphical representation of the joint probability P(age, education, occupation, income, politics, economy, socioculture, norms, knowledge, proposals, meetings, decisions, motivation, abilities, participation), which can be factored as a set of conditional independence relations in the form of (1). From Fig. 4, it can be seen that twelve nodes are nodes with a conditional independence relation: age, education, occupation, income, politics, economy, socioculture, norms, knowledge, proposals, meetings, and decisions. The score of each node can be calculated based on its indicator, which in this research is called the node probability (NP). As an example, for the node age with the indicator age between 18 and 40 years, the NP is calculated as follows: P(Ag=18-40) = P(Ag=18-40 ...
The prior probability score, or confidence value, of a participation variable is the resulting score that explains the level of confidence in that variable. Bayesian network inference was then performed with the DAG structure built from the data that had been tested and estimated using PLS. The inference determined the node probability (NP) and the probability of occurrence, or frequency (f), of each indicator, as shown in Table 1. These values were then used as a guideline in calculating the participation interest scores of the participants.
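As a minimal illustration of the frequency component, the sketch below assumes that f is the relative frequency with which each category of an indicator appears in the questionnaire data. This reading of f, the function name, and the sample answers are assumptions made for illustration; the values in Table 1 come from the study's own inference and are not reproduced here.

```python
from collections import Counter


def relative_frequencies(answers):
    """Share of respondents in each category of one indicator."""
    counts = Counter(answers)
    total = len(answers)
    return {category: count / total for category, count in counts.items()}


# Hypothetical answers of four respondents for the indicator "age".
age_answers = ["18-40", "18-40", "41-60", ">60"]
age_frequency = relative_frequencies(age_answers)
```

The resulting shares sum to 1 across the categories of one indicator, which makes them directly comparable between indicators that have different numbers of categories.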

C. Calculating the Participation Score of the Participants
Example data on the indicators of the participation factors for Proposal Maker 1 ($P_1$) are presented in Table 2. The participation score was calculated by referring to the "data reference" of the NP and frequency scores generated from the calculation in Table 1.
As an example, the participation interest score is calculated using (3): the indicators of the participation factors of $P_1$ in Table 2 are matched to the NP and f scores in Table 1, from which the participation score ($W_p$) is obtained.

1) Score normalization
The normalization process was done by calculating the highest participation score $W_{p\,max}$ using (5) and the lowest participation score $W_{p\,min}$ using (6). The data on the lowest and highest NP and frequency scores can be seen in Table 3. $W_{p\,max}$ is calculated by multiplying the NP by the frequency of each indicator with the highest score. Conversely, $W_{p\,min}$ is calculated by multiplying the NP by the frequency of each indicator with the lowest score. The calculation of the lowest participation score uses (5), and the calculation of the highest participation score uses (6). After normalization, the participation interest score of the first proposal maker ($W_{p1}$) equals 0.12. The participation interest scores of the subsequent proposal makers can be calculated in the same way.
The participation interest score of each proposal maker ($W_p$) was used to calculate the score of the DM's preference for the alternatives proposed by that proposal maker.
The participation interest scores can thus be used to determine the ranking of decisions relating to village development planning programs. The $W_p$ calculation results were then tested by applying them to the current decision-making model.

2) Implementation of participation scores in multiple-criteria decision making
The current decision-making model for determining which proposed village development programs take priority involves many criteria and decision makers. The decision makers form a group of 7 to 11 persons, commonly referred to as the team of 7 or the team of 11. These teams are considered representatives of all stakeholders in the village. All decision makers use the same criteria in making a decision, namely felt by many people (C1), extremely serious (C2), better income (C3), the number of occurrences (C4), and potential support resources (C5). These criteria are used to assess the programs proposed by the community. To help illustrate the problem, its attributes can be represented by the following notation:
a) $DM = \{dm_1, \dots, dm_n\}$ refers to the decision makers, i.e. the persons who will make decisions.
b) $A = \{a_1, \dots, a_n\}$ with $n \geq 2$ refers to the programs proposed by the community, which form the group of alternatives to be ranked.
c) $C = \{c_1, \dots, c_n\}$ with $n \geq 2$ refers to the group of criteria taken into account in the decision-making process.
d) $T = \{t_1, \dots, t_n\}$ refers to the final goal, which is the resulting ranking in the form of a sequence of alternatives decided by the decision makers.
The hierarchy of the decision making relating to determination of proposed village development programs taking priority is illustrated in Fig. 5.
In the decision-making process with the hierarchy shown in Fig. 5, decision makers use the same criteria without considering the score of each criterion. The data used to test the proposed model consisted of the data of nine decision makers, namely $dm_1, \dots, dm_9$, associated with the factors influencing participation. The data on proposed programs consisted of 10 proposals, namely $a_1, \dots, a_{10}$, where each alternative had its own participation score of the proposal maker ($W_p$). The participation scores of the proposal makers used are presented below.
Each criterion has the same score, and thus the total participation score ($T_{model}$) was calculated by multiplying each participation score $W_p$ by the total initial score ($T_{initial}$), as given in (7):

$$T_{model} = T_{initial} \times W_p \tag{7}$$

The scores calculated using the participation score ($T_{model}$) were compared with program realization, as shown in Table 4.
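The effect of (7) on the final ranking can be sketched in Python. The three proposals, their initial totals, and their participation scores are hypothetical values chosen to show how weighting can reorder alternatives; they are not the data of this study, and the function name is likewise an illustrative choice.

```python
def rank_with_participation(initial_scores, participation_scores):
    """Return T_model per proposal and the proposals ordered by T_model."""
    t_model = {
        proposal: initial_scores[proposal] * participation_scores[proposal]
        for proposal in initial_scores
    }
    ranking = sorted(t_model, key=t_model.get, reverse=True)
    return t_model, ranking


# Hypothetical inputs for three proposals (not the study's data).
t_initial = {"a1": 30, "a2": 40, "a3": 35}
w_p = {"a1": 0.9, "a2": 0.5, "a3": 0.8}

t_model, ranking = rank_with_participation(t_initial, w_p)
initial_ranking = sorted(t_initial, key=t_initial.get, reverse=True)
```

In this made-up case the proposal with the highest initial total drops to last place after weighting, because its proposal maker has the lowest participation score.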
Afterwards, testing was done using a confusion matrix to calculate accuracy, precision, and recall. Results of the calculation are presented in Table 4 and summarized in Table 5.
The accuracy, precision, and recall were calculated from the confusion matrix. A model is deemed good if it has high precision and recall values. The confusion-matrix test yielded a recall, precision, and accuracy of 50%, 80%, and 50%, respectively. These results are less than ideal for a model because the decision to realize a program within the government depends not only on whether the program will facilitate development but also on various interests other than the objectives of development.
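The three measures can be computed from the four cells of a confusion matrix, as in the Python sketch below. The cell counts used in the example are hypothetical: they were chosen because, over ten proposals, they are consistent with the reported percentages, but they are not taken from Table 4 or Table 5.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Precision, recall and accuracy from confusion-matrix counts."""
    total = tp + fp + fn + tn
    if total == 0 or tp + fp == 0 or tp + fn == 0:
        raise ValueError("metrics are undefined for these counts")
    return {
        "precision": tp / (tp + fp),   # share of predicted positives that are correct
        "recall": tp / (tp + fn),      # share of actual positives that are found
        "accuracy": (tp + tn) / total, # share of all cases classified correctly
    }


# Hypothetical counts over ten proposals (not taken from Table 4 or 5).
metrics = confusion_metrics(tp=4, fp=1, fn=4, tn=1)
```

With these counts the precision is 0.8, while the recall and the accuracy are both 0.5, which shows how a model can be right about most of the programs it selects and still miss half of the programs that were actually realized.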

V. CONCLUSION
Research conclusions are presented as follows:
1) Community participation in development planning programs is influenced by the interest factors of the respective participants. The model structure of the relationship between those participation factors can be constructed using the PLS-PM approach with latent variables.
2) The interest factors affecting participation can be quantified in the form of a participation score. This participation score can be calculated using the DAG structure and Bayesian network inference, namely from the node probability and the cumulative frequency of each indicator.
3) The participation interest score can be used to represent participants' interests with regard to decision making. In the case of determining the priority of proposed village development programs, the confusion matrix testing yields an accuracy of 0.5, a precision of 0.8, and a recall of 0.5.