Assessment of Surface Water Quality on the Upper Watershed of Huallaga River, in Peru, using Grey Systems and Shannon Entropy

The assessment of surface water quality is a complex issue that requires the comprehensive analysis of several parameters altered by natural or man-made causes. In this sense, the Grey Clustering method, which is based on Grey Systems theory, and Shannon Entropy, an approach used within artificial intelligence, provide an alternative for evaluating water quality in an integral way while considering the uncertainty within the analysis. In the present study, the water quality of the upper watershed of the Huallaga river was evaluated using the monitoring results from twenty-one points, collected by the National Water Authority (ANA), and analyzing nine parameters of the Prati index. The results showed that all the monitoring points of the Huallaga river were classified as not contaminated, which suggests that the discharges generated by economic activities pass through treatment plants and meet the quality parameters. Finally, the results obtained can be of great help to the ANA and the regional and local authorities of Peru in making decisions to improve the management of the Huallaga river watershed.

Keywords: Grey clustering; Huallaga river; Prati index; Shannon entropy; water quality


I. INTRODUCTION
The Huallaga river watershed is one of the main watersheds of Peru and has great water resource potential, owing to the existence of a large number of lagoons, rivers, streams and springs. It is an important source of natural resources, food and work for the native communities and populated centers of the area, where the main economic activities are agriculture, industry, energy, mining and fishing for direct human consumption [1]. However, the benefits provided by the watershed have diminished because its waters have been polluted by domestic wastewater, municipal wastewater, solid waste and mining environmental liabilities, which poses a risk to public health [2]. This watershed, which belongs to the Atlantic drainage basin, is one of the largest tributaries of the Marañon River and is made up of the Lower Huallaga Inter-watershed, Paranapura, Middle Lower Huallaga Inter-watershed, Mayo Watershed, Middle Huallaga Inter-watershed, Biabo Watershed, Middle Upper Huallaga Inter-watershed, Huyabamba Watershed and Upper Huallaga Inter-watershed [1]. The assessment of the surface water quality is carried out in the upper part of the watershed due to its environmental importance.
For the assessment we use the Grey Clustering method together with Shannon Entropy. Grey Clustering is a method based on Grey Systems theory, an approach within what is called artificial intelligence, and therefore has a wide variety of applications [3]. For the case studied, we use the center-point triangular whitenization weight functions (CTWF) method, which has been applied in studies on water [4] and in the assessment of urban transport [5]. The entropy weight method, based on Shannon Entropy, is also an approach within artificial intelligence and was initially developed by Claude E. Shannon [6]. In this study it was used to calculate the objective weights of the assessment criteria within the CTWF method.
Therefore, our specific objective in this study is the classification of 21 monitoring points on the upper watershed of the Huallaga river according to the water quality criteria, using the Grey Clustering method and Shannon Entropy.
In the present study, Section II details the CTWF method. Section III describes the case study, followed by the results and discussion in Section IV. The conclusions are presented in Section V.

II. METHODOLOGY
In this section, we describe the CTWF method, which can be summarized as follows: first, suppose there is a set of m objects, a set of n criteria, and a set of s Grey classes, according to the sample value $x_{ij}$ (i = 1, 2, …, m; j = 1, 2, …, n). Then, the steps of the CTWF method can be developed as follows, according to previous research [3], [7] and [8].

A. Step 1: Determination of Center Points
The ranges of the criteria are divided into 5 Grey classes, and their center points $\lambda_1, \lambda_2, \ldots, \lambda_5$ are then determined from the Prati index.

C. Step 3: Determination of the Triangular Functions and their Values
The Grey classes are extended in both directions for each parameter, using the Prati index as a reference. The Prati index provides five (5) levels of quality for each parameter, so there are five (5) functions for each parameter. The new sequence of center points is $\lambda_1, \lambda_2, \ldots, \lambda_5$. For class k = 1, 2, 3, 4 and 5, and criterion j = 1, 2, …, n, the CTWF for an observed value $x_{ij}$ is calculated by means of Equations 2, 3 and 4; Fig. 1 shows the graph of the triangular functions.
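The triangular functions of this step can be illustrated with a short sketch. Equations 2 to 4 are not reproduced in this text, so the code below assumes the commonly used piecewise-linear CTWF form: the first class equals 1 below its center point, the last class equals 1 above its center point, and every intermediate class rises and falls linearly between neighboring center points. The function name `ctwf` and the center values in the example are hypothetical and are not taken from the Prati index tables.

```python
from typing import List


def ctwf(x: float, centers: List[float]) -> List[float]:
    """Evaluate an observed value x in every Grey class.

    centers holds the center points lambda_1..lambda_s in ascending order.
    Assumed form (not copied from the paper's Equations 2-4): piecewise-linear
    triangular functions with flat shoulders on the first and last class.
    """
    s = len(centers)
    values = []
    for k in range(s):
        lam = centers[k]
        if k == 0 and x <= lam:
            f = 1.0  # left shoulder of the first class
        elif k == s - 1 and x >= lam:
            f = 1.0  # right shoulder of the last class
        elif k > 0 and centers[k - 1] < x <= lam:
            # rising edge between the previous center and this one
            f = (x - centers[k - 1]) / (lam - centers[k - 1])
        elif k < s - 1 and lam < x < centers[k + 1]:
            # falling edge between this center and the next one
            f = (centers[k + 1] - x) / (centers[k + 1] - lam)
        else:
            f = 0.0
        values.append(f)
    return values


# Hypothetical center points for one parameter (illustration only).
example_centers = [1.0, 2.0, 3.0, 4.0, 5.0]
print(ctwf(2.5, example_centers))  # [0.0, 0.5, 0.5, 0.0, 0.0]
```

With this form, a value lying between two center points is shared only between the two adjacent classes, and its memberships add up to 1.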

D. Step 4: Determination of the Weight for each Criterion
In this step, the Shannon Entropy weight method is used. For every $p_i$ within a probability distribution, Shannon developed the measure H, which satisfies a set of properties [3], [7] and [9], among them:
• H is a positive continuous function.
Shannon showed that the only function satisfying these properties is the one given by Equation 5.
The entropy weight methodology can be described according to the following definition [3], [7] and [9]. As stated above, there are m objects for assessment and n assessment criteria, which form the matrix $x = \{x_{ij};\ i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n\}$. After that, the following stages are carried out.
2) The entropy of each criterion is calculated by Equation 7, which was constructed based on Equation 6.
where k is a constant, $k = (\ln m)^{-1}$.
3) The degree of divergence of the intrinsic information in each criterion is calculated by Equation 8.
4) The entropy weight of each criterion is calculated by Equation 9,
where $w_j$ is equal to the clustering weight $\eta_j$.
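The stages of the entropy weight method can be sketched in a few lines of code. Equations 6 to 9 are not reproduced in this text, so the sketch assumes the usual formulation: each column is normalized to proportions, the entropy of criterion j is $H_j = -k \sum_i P_{ij} \ln P_{ij}$ with $k = (\ln m)^{-1}$, the degree of divergence is $1 - H_j$, and the weight is the divergence divided by the sum of divergences. The function name `entropy_weights` and the sample matrix are hypothetical.

```python
import math
from typing import List


def entropy_weights(matrix: List[List[float]]) -> List[float]:
    """Return one entropy weight per criterion (column).

    matrix[i][j] is the value of object i under criterion j; values are
    assumed non-negative and at least two objects are required.
    """
    m = len(matrix)
    n = len(matrix[0])
    k = 1.0 / math.log(m)  # constant k = (ln m)^-1
    divergences = []
    for j in range(n):
        column = [matrix[i][j] for i in range(m)]
        total = sum(column)
        # Assumed normalization step: proportion of each object in column j.
        p = [value / total for value in column]
        # Entropy of criterion j; terms with p = 0 contribute nothing.
        h = -k * sum(pij * math.log(pij) for pij in p if pij > 0)
        divergences.append(1.0 - h)  # degree of divergence
    total_divergence = sum(divergences)
    if total_divergence == 0:
        raise ValueError("all criteria are uniform; weights are undefined")
    return [d / total_divergence for d in divergences]


# Hypothetical 3 objects x 2 criteria matrix (illustration only).
sample = [[0.2, 0.9], [0.3, 0.1], [0.5, 0.4]]
print(entropy_weights(sample))
```

A criterion whose values are nearly identical across objects has entropy close to 1 and therefore receives a weight close to 0, which is why the method needs no expert judgment to rank the criteria.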

E. Step 5: Determination of the Clustering Coefficient
The clustering coefficient $\sigma_i^k$ for object i, i = 1, 2, …, m, with respect to Grey class k, k = 1, 2, …, s, is calculated by Equation 10:
$$\sigma_i^k = \sum_{j=1}^{n} f_j^k(x_{ij}) \cdot \eta_j \qquad (10)$$
where $f_j^k(x_{ij})$ is the CTWF of the k-th Grey class of the j-th criterion, and $\eta_j$ is the weight of criterion j. These weights are established using the Shannon Entropy method.

F. Step 6: Results using the Maximum Clustering Coefficient
Finally, if $\max_{1 \le k \le s} \{\sigma_i^k\} = \sigma_i^{k^*}$, we decide that object i belongs to Grey class k*. When there are several objects in Grey class k*, these objects can be ordered according to the magnitudes of their integral clustering coefficients.
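Steps 5 and 6 can be illustrated together with a minimal sketch: the clustering coefficient of one object is the weighted sum of its CTWF values for each class, and the object is assigned to the class with the largest coefficient. The function names and the numbers in the example are hypothetical and are not taken from the case study tables.

```python
from typing import List, Tuple


def clustering_coefficients(
    ctwf_values: List[List[float]], weights: List[float]
) -> List[float]:
    """Compute sigma_i^k for one object i and every Grey class k.

    ctwf_values[j][k] is f_j^k(x_ij), the CTWF value of criterion j in class k.
    weights[j] is eta_j, the clustering weight of criterion j.
    """
    n = len(weights)
    s = len(ctwf_values[0])
    return [
        sum(ctwf_values[j][k] * weights[j] for j in range(n)) for k in range(s)
    ]


def classify(sigma: List[float]) -> Tuple[int, float]:
    """Return the 1-based Grey class k* with the maximum coefficient."""
    best = max(range(len(sigma)), key=lambda k: sigma[k])
    return best + 1, sigma[best]


# Hypothetical object with 2 criteria and 3 Grey classes (illustration only).
f_values = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0]]
eta = [0.6, 0.4]
sigma = clustering_coefficients(f_values, eta)
k_star, max_sigma = classify(sigma)
print(sigma, k_star)
```

Objects that fall into the same class k* can then be ranked by the returned maximum coefficient, as described above.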

III. CASE STUDY
The analysis of the surface water quality was carried out in the upper part of the Huallaga river watershed, which is located in the central zone of Peru, has an area of 89,416 km² and a length of 1,168 km running from south to north [1], and is represented in Fig. 2.
438 | P a g e www.ijacsa.thesai.org

A. Definition of Study Objects
For the assessment of the surface water quality of the upper watershed of the Huallaga river, information was collected from 21 monitoring points obtained from the seventh monitoring of surface water quality, carried out from November 19 to December 20, 2019 by the Huallaga Water Administrative Authority and the Local Authorities of Alto Huallaga, Tingo María, Huallaga Central, Alto Mayo and Tarapoto [1]. These points are detailed in Table I and represented in Fig. 3.

B. Definition of Assessment Criteria
The assessment criteria for the present study are determined by the water quality parameters which are presented in Table II.

C. Definition of the Grey Classes
The classes for assessment are five and are based on the levels of water quality according to the Prati index, which are presented in Table III.

D. Calculations using the CTWF Method
The calculations based on the Grey Clustering method are presented below:
1) Step 1: Based on the Prati quality index, the central values of the parameters to be analyzed were obtained. These values are shown in Table IV.
2) Step 2: The non-dimensional standard values for each parameter, according to the Prati index, were determined through (1). These values are presented in Table V. Similarly, based on the results of the participatory monitoring report on surface water quality in the Huallaga river watershed, prepared by the National Water Authority (ANA), the non-dimensional values were obtained for each parameter of the 21 selected monitoring points. These values are presented in Table VI.
3) Step 3: Using the values of Table III in (2)-(4), the triangular whitenization functions of the five Grey classes were obtained for each parameter. As an example, the functions corresponding to the second parameter (BOD) are shown in (11)-(15) and Fig. 4. Then, the values in Table VI were evaluated in the triangular whitenization functions of the five Grey classes for each parameter. The results obtained for the first five monitoring points are shown in Table VII.
4) Step 4: The weight of each criterion was determined using the Shannon Entropy weight method.
a) Substep 4.1: ... Table VIII.
b) Substep 4.2: The entropy Hj of each criterion Cj was calculated through (7). The results are presented in Table IX.
c) Substep 4.3: The degree of divergence of each criterion Cj was calculated through (8). The results are shown in Table X.
d) Substep 4.4: Finally, the entropy weights $w_j$ were calculated according to (9) and equated to the clustering weights $\eta_j$ of each parameter. The values are presented in Table XI.

5) Step 5: The values of the clustering coefficients ($\sigma_i^k$) were calculated using (10). The results for the first five monitoring points are shown in Table XII.

6) Step 6: Finally, the following condition was applied for each monitoring point: if $\max_{1 \le k \le s} \{\sigma_i^k\} = \sigma_i^{k^*}$, it is decided that object i belongs to Grey class k*.

IV. RESULTS AND DISCUSSION
A. Results on the Case Study
Table XIII shows that all 21 monitoring points were classified as having uncontaminated water quality. However, a comparison of quality levels can be made according to the maximum clustering coefficient (max. $\sigma_i^k$), as shown in Table XIV and Fig. 5.
It is observed that monitoring point P1 presents the best water quality and point P8 the lowest. This is likely because point P1 is at the beginning of the Huallaga river and point P8 is in a lower zone, which suggests that water quality decreases along the river depending on the activities that take place, such as mining and hydroelectric plants [1]. The reason why the water is uncontaminated may be that the mining and hydroelectric companies operating in the upper part of the Huallaga river treat their industrial effluents adequately, in accordance with national regulations. This water body is classified as suitable for the irrigation of vegetables and animal drinking [1], but according to the results of the water quality assessment, it can also be considered as water that can be made drinkable with conventional treatment.
In relation to other studies, Fu and Zou [10] applied the Grey Clustering method to evaluate the water quality of the Yellow River, and their results also showed good river quality. Liping et al. [11] applied the Grey Clustering method to assess the quality of the Fenchuan river in the Yan'an Baota area in China; however, they used the arithmetic mean to determine the clustering weights. In that case, Shannon Entropy could have been applied as an alternative, as was done in the present study, to calculate these weights in an objective and precise way. Similarly, in the study carried out by Wang et al. [12], the clustering weights could be obtained through the Shannon Entropy method and complemented by the Single Factor method used in that study.

B. Discussion on the Methodology
The Grey Clustering method is well suited to problems with high uncertainty [3], such as the assessment of surface water quality, where each parameter varies depending on environmental conditions. This contrasts with classic multicriteria assessment methods such as Delphi [13] or the Analytical Hierarchy Process (AHP) [14], which do not consider uncertainty in their analysis. In Peru, the Grey Clustering method is not as widespread as other methods based on Aristotelian logic or statistical models [7], which limits its application with the national water quality standards, since these are not determined on the basis of any quality index.
Finally, the Shannon Entropy method is well suited for evaluating water quality because it allowed the clustering weights ($\eta_j$) to be determined for each parameter in an objective way, without the need to consult an expert, which reduces assessment costs. In addition, this method has multiple applications, for example in studies of social conflicts or social impact assessments [15], owing to its capacity to process information and reduce subjectivity in assessments.

V. CONCLUSIONS
The surface water quality of the upper watershed of the Huallaga river could be evaluated using the Grey Clustering method and Shannon Entropy, so it was possible to classify the 21 monitoring points in this area. The results obtained in this study can be useful to the regional and local authorities of Peru, as well as to the National Water Authority, to make better decisions regarding the management of this important watershed.
Regarding the methodology, the Grey Clustering method can be more effective than other classical methods because it considers the uncertainty within the analysis, while Shannon Entropy allows the weights of the criteria to be calculated objectively, without resorting to expert judgments. Another important point is that the Prati index, because of the parameters it uses, offers an advantage when we need to determine whether the water quality is affected by the activities carried out in the watershed. Finally, in future research, the efficacy of the Grey Clustering method should be compared with the methodology established by the National Water Authority (ANA) for the determination of the Environmental Quality Index of Surface Water Resources (ICARHS). If the results are equivalent or very similar, the use of the Grey Clustering method could be extended to those rivers where the data are insufficient to apply the ANA methodology.