Evaluation of Water Quality in the Lower Huallaga River Watershed using the Grey Clustering Analysis Method

The evaluation of water quality is a topic of global interest because of its socio-cultural, environmental and economic importance. In recent years, however, water quality has been deteriorating due to inadequate management of the conservation, disposal and use of water by the competent authorities, private and state entities, and the population itself. One alternative for determining the quality of a water body in an integrated manner is the Grey Clustering method, which was used in this study with the Prati Quality Index as the indicator, in order to obtain an objective analysis of the water bodies under study. The case study is the Lower Watershed of the Huallaga River, located between the regions of Loreto and San Martin, along which 12 monitoring stations were established to evaluate surface water quality through the analysis of 7 parameters: pH, BOD, COD, Total Suspended Solids (TSS), Ammonia Nitrogen, Substrates and Nitrates. The water quality at eleven monitoring stations in the Lower Huallaga River Watershed falls within the "Uncontaminated" category of the Prati Index, while one monitoring station falls within the "Highly Contaminated" category because of its proximity to a landfill. The results could be useful to the authorities responsible for the protection and sustainable conservation of the Huallaga River Watershed when proposing measures to improve its water quality. This study could also serve as a reference for future studies, since the proposed method made it possible to rank the quality level of the water bodies and identify critical areas.
Keywords: Water quality; Prati index; grey clustering method; protection and sustainable conservation


I. INTRODUCTION
The various activities carried out in the lower watershed of the Huallaga River have motivated the evaluation of its water quality [1], since this watershed supplies water to indigenous communities and the surrounding urban areas [2].
The present study is carried out in the lower Huallaga River Watershed, located in the regions of Loreto and San Martin. Twelve monitoring points were taken from the "Participatory Monitoring Report on Surface Water Quality in the Huallaga River Watershed" [2]; they were selected because of their proximity to activities carried out within the area of the lower watershed. The Prati index [3] was chosen to calculate the water quality index, and in the discussion the results were compared with the national standard ECA agua [4].
The water quality evaluation was done using the Grey Clustering methodology, which is based on grey system theory [5] and uses an artificial intelligence approach suited to scarce data [6]. Because it solves problems with scarce data, the methodology can also be applied to other fields of research [7].
The objective of this study is to evaluate the water quality in the lower Huallaga River watershed using the Grey Clustering methodology [5], which allows the 7 parameters of the Prati index [3] to be assessed jointly across 12 monitoring points that cover the area of the lower Huallaga River watershed [2]. The study is structured as follows: Section I is the introduction, Section II details the literature review, Section III presents the Grey Clustering methodology, and Section IV describes the case study. Section V presents the results and discussion. Finally, Section VI gives the conclusions.

II. LITERATURE REVIEW
In the research work entitled "Evaluation of water quality in a watershed in Cusco, Peru using the Grey Clustering method", the authors analyzed water quality in a zone of mining influence located in the Chonta and Milos micro-watershed using the Grey Clustering method, for which they established six monitoring stations. The parameters evaluated were pH, DO, TSS, iron and manganese. It was concluded that only one monitoring station was contaminated, even though it corresponds to a discharge of treated industrial water from a cyanide destruction plant [8].
In the article entitled "Application of fuzzy logic to determine the quality of water bodies in the Rimac River Watershed", the authors analyzed the quality of five water bodies in the Rimac River Watershed, which belong to Category 1 A2-Population and Recreation, using the Grey Clustering method. To select their monitoring points they took data from a Technical Report on Water Quality Monitoring in the Rimac River Watershed prepared by the ANA in 2013. After evaluating the parameters of pH, %O2, BOD, COD, TSS, NH3 [...]
In the research paper entitled "Application of Grey Clustering Method Based on Improved Analytic Hierarchy Process in Water Quality Evaluation", the authors proposed a Grey Clustering method based on an improved analytic hierarchy process to evaluate the water quality of the Qingshui River in Duyun City, by sampling three water periods (periods of abundant, normal and deficient flow) in 4 sections of the river. It was concluded that the water quality of the river belongs to the superclass III according to its regulations, and that on this basis the contamination is not serious [10].
In the article entitled "Research on Comprehensive Evaluation of Air Quality in Beijing Based on Entropy Weight and Grey Clustering Method", the authors proposed a Grey Clustering method with entropy weight to evaluate air quality in Beijing, in order to obtain more objective results. The parameters evaluated were PM2.5, PM10, NO2 and SO2 for three consecutive quarters. It was concluded that Beijing air quality in the first quarter is better than in the second and third quarters, and that the entropy weight enriched and improved the Grey Clustering method [7].
In the research work entitled "Environmental conflict analysis using an integrated grey clustering and entropy-weight method: A case study of a mining project in Peru", the authors proposed an approach for environmental conflict analysis using the Grey Clustering and entropy-weight method to evaluate the social impact of a mining project in northern Peru. Information was collected through interviews with three groups: rural population, urban population and specialists. Three levels of social impact were established in the surveys: positive, negative and normal. It was concluded that the project would have a positive impact for the urban population, a negative impact for the rural population and a normal impact for the specialists. In addition, it was concluded that the proposed method showed practical results and potential for application to other types of projects [11].

III. METHODOLOGY
A. Choice of Index
The Prati Water Quality Index was chosen because it provides criteria for evaluating physicochemical parameters that are relevant to determining contamination in water bodies. Seven of the 13 parameters included in the Prati Index are evaluated. The index also defines ranges for each variable. Table V shows the Prati index ranges of the seven parameters evaluated, and Table IV shows the levels of contamination according to the Prati scale [3].

B. Grey Clustering Analysis Methodology
This methodology focuses on problems with small samples and scarce, and therefore uncertain, information. The method is based on developing whitenization functions for the grey clusters [5].
It has been applied in several areas of research; in this case, the methodology is used to determine water quality through the following steps.
Step 1: Determine the center points of the grey classes.
Step 2: Determine the dimensionless values of the sampling data and the standard criteria data.
Step 3: Determine the triangular whitenization functions for each criterion. The number of triangular functions corresponds to the number of water quality index levels. In this case, five functions are proposed, since there are five classes on the Prati scale (λ1, λ2, ..., λ5); they are obtained from Eq. 1, 2 and 3, and Fig. 1 shows the graph of the triangular functions.
Step 4: Determine the weight of the criteria using the harmonic mean, calculated with Eq. 4.
Step 5: Find the clustering coefficients using Eq. 5.
Step 6: Find the maximum clustering coefficient to define which class each station belongs to, applying Eq. 6.
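As a sketch of Eqs. 1 to 6, the following assumes the center-point triangular whitenization form that is commonly used in grey clustering. The notation is an assumption introduced here and may differ from the original typesetting: x_ij is the dimensionless value of criterion j at station i, λ_j^k is the center point of class k for criterion j, m is the number of criteria and s = 5 is the number of classes.

```latex
% Eq. 1: first class (k = 1)
f_j^{1}(x_{ij}) =
\begin{cases}
1, & x_{ij} \in [0, \lambda_j^{1}] \\
\dfrac{\lambda_j^{2} - x_{ij}}{\lambda_j^{2} - \lambda_j^{1}}, & x_{ij} \in (\lambda_j^{1}, \lambda_j^{2}) \\
0, & x_{ij} \in [\lambda_j^{2}, +\infty)
\end{cases}
\tag{1}

% Eq. 2: intermediate classes (k = 2, ..., s-1)
f_j^{k}(x_{ij}) =
\begin{cases}
\dfrac{x_{ij} - \lambda_j^{k-1}}{\lambda_j^{k} - \lambda_j^{k-1}}, & x_{ij} \in (\lambda_j^{k-1}, \lambda_j^{k}] \\
\dfrac{\lambda_j^{k+1} - x_{ij}}{\lambda_j^{k+1} - \lambda_j^{k}}, & x_{ij} \in (\lambda_j^{k}, \lambda_j^{k+1}) \\
0, & \text{otherwise}
\end{cases}
\tag{2}

% Eq. 3: last class (k = s)
f_j^{s}(x_{ij}) =
\begin{cases}
0, & x_{ij} \in [0, \lambda_j^{s-1}] \\
\dfrac{x_{ij} - \lambda_j^{s-1}}{\lambda_j^{s} - \lambda_j^{s-1}}, & x_{ij} \in (\lambda_j^{s-1}, \lambda_j^{s}) \\
1, & x_{ij} \in [\lambda_j^{s}, +\infty)
\end{cases}
\tag{3}

% Eq. 4: harmonic-mean weight of criterion j for class k
\eta_j^{k} = \frac{1 / \lambda_j^{k}}{\sum_{j=1}^{m} 1 / \lambda_j^{k}}
\tag{4}

% Eq. 5: clustering coefficient of station i for class k
\sigma_i^{k} = \sum_{j=1}^{m} f_j^{k}(x_{ij}) \, \eta_j^{k}
\tag{5}

% Eq. 6: station i is assigned to the class k* with the largest coefficient
\max_{1 \le k \le s} \{ \sigma_i^{k} \} = \sigma_i^{k^{*}}
\tag{6}
```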
IV. CASE STUDY
This study focuses on the evaluation of the surface water quality of the lower Huallaga watershed, located between the regions of Loreto and San Martín, as shown in Fig. 2. In recent years, its water quality has deteriorated significantly, generating conflicts between the population and the responsible authorities. Given this situation, since 2013 the ANA has been identifying the main polluting sources for subsequent water quality monitoring with the active participation of the population [3].
The main sources of contamination are associated with the discharge of domestic sewage, industrial wastewater and untreated municipal wastewater, and with the poor disposal of solid waste resulting from agricultural, energy and industrial activities and from Wastewater Treatment Plants (WWTP) [14].

A. Definition of Study Objects
The monitoring points were obtained from the "Participatory Monitoring Report of the quality of surface water in the Huallaga river watershed", carried out in February-March 2019 by the National Water Authority [2]. Fig. 2 shows the location of the monitoring stations.
For this case study, 12 of the monitoring points in that report [2], all located in the lower Huallaga river watershed, were chosen; they are listed in Table I.

B. Definition of Evaluation Criteria or Parameters
This study evaluates 7 water quality parameters of the lower Huallaga river watershed at the monitoring stations previously identified by the National Water Authority (ANA, by its Spanish acronym). Table II describes these parameters.
The field data were obtained from the "Participatory Monitoring Report of the quality of surface water in the Huallaga river watershed" [2] according to the monitoring stations mentioned in Table I. Table III details the data of the seven parameters in each monitoring station.

C. Definition of Grey Classes
The water quality of the lower Huallaga river watershed is evaluated against the Environmental Quality Standards for Water (ECA, by its Spanish acronym) established in DS 004-2017-MINAM [4], in category 3 and category 4, which correspond to irrigation of vegetables and animal drinking water, and to conservation of the aquatic environment, respectively.

L-Azul6 | Approximately in the center of the Blue Lagoon | 365497 | 9259275
R-Hual36 | Río Huallaga, approximately 200 m downstream from the embarkation port of the town of Papaplaya | 413176 | 9313998

In this sense, the Prati Index is used; it originally includes 13 parameters, but only 7 parameters are used in this study.
In addition, this index establishes six levels of water contamination, but due to lack of information only five levels are used in this study; they are described in Table IV, and the range of each criterion is given in Table V.

D. Calculations using Grey Clustering
1) STEP 1: Determination of center points:
The center point of each class is the half-sum (midpoint) of its range in the Prati Index. It is determined for the five classes (Uncontaminated, Acceptable, Moderately contaminated, Contaminated and Highly contaminated) of the parameters pH, BOD, COD, NH3, SS, NO3 and Cl (see Table VI).

2) STEP 2: Data dimensioning:
As the evaluated parameters are in different units, the data must be made dimensionless (normalized) so that they are comparable. The standard Prati data are dimensioned first, and then the sampling data.
a) For Prati Standard Data: The mean of the standard data is obtained for each parameter, as detailed in Table VII. Each value is divided by its respective mean, obtaining Table VIII.
b) For Sample Data: The sample data are dimensioned in the same way. In this case, the data of each parameter are divided by the corresponding mean calculated when dimensioning the Prati data, obtaining Table IX and Table X.
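Steps 1 and 2 can be illustrated with a minimal Python sketch for a single parameter. The function names and all numbers are hypothetical and chosen only to make the arithmetic easy to follow; they are not the Prati ranges of Table V or the measurements of Table III.

```python
def center_points(class_ranges):
    """Step 1: center point of each class = half-sum of its lower and upper bound."""
    return [(low + high) / 2 for low, high in class_ranges]


def dimension(values, reference_mean):
    """Step 2: make values dimensionless by dividing by the mean of the standard data."""
    return [v / reference_mean for v in values]


# Hypothetical class ranges for ONE parameter, five classes (illustrative only).
ranges = [(0, 2), (2, 4), (4, 8), (8, 12), (12, 16)]

centers = center_points(ranges)               # [1.0, 3.0, 6.0, 10.0, 14.0]
standard_mean = sum(centers) / len(centers)   # mean of the standard data

# Step 2a: dimensionless standard data (their mean is 1 by construction).
centers_dimless = dimension(centers, standard_mean)

# Step 2b: hypothetical raw measurements at two stations, divided by the SAME mean.
samples_dimless = dimension([1.7, 13.6], standard_mean)
```

Dividing the samples by the mean of the standard data, rather than by their own mean, keeps the samples and the class center points on the same scale, which the whitenization functions of Step 3 require.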

3) STEP 3: Determination of triangular functions:
The Grey Clustering method is applied to analyze the different criteria and evaluate the water body comprehensively. The whitenization functions are triangular and are defined for five classes: λ1, λ2, λ3, λ4 and λ5. Fig. 3 shows the triangular functions of the criterion "pH", with their respective correspondence rules and graph.
The functions and graphs for the remaining criteria are constructed in the same way. Table XI shows the result of evaluating the data in each of the triangular functions of the criteria, for each sampling station.
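The evaluation of one dimensionless value in the triangular functions can be sketched as follows. This is a minimal illustration that assumes the usual center-point form (full membership in the first class below λ1 and in the last class above λ5, linear interpolation between neighbouring center points); the function name and the center points are hypothetical.

```python
def whitenization(x, centers):
    """Return the membership of x in each grey class.

    centers: strictly ascending center points (lambda_1 ... lambda_s) of one criterion.
    """
    memberships = [0.0] * len(centers)
    if x <= centers[0]:
        memberships[0] = 1.0          # at or below the first center point
    elif x >= centers[-1]:
        memberships[-1] = 1.0         # at or above the last center point
    else:
        # x lies between two neighbouring center points: split membership linearly.
        for k in range(len(centers) - 1):
            low, high = centers[k], centers[k + 1]
            if low <= x <= high:
                memberships[k] = (high - x) / (high - low)
                memberships[k + 1] = (x - low) / (high - low)
                break
    return memberships


# Hypothetical center points for one criterion (illustrative only).
centers = [1.0, 3.0, 6.0, 10.0, 14.0]
print(whitenization(2.0, centers))   # [0.5, 0.5, 0.0, 0.0, 0.0]
print(whitenization(7.0, centers))   # [0.0, 0.0, 0.75, 0.25, 0.0]
```

With this form, the memberships of a value always sum to 1 and at most two adjacent classes are non-zero.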

4) STEP 4: Determination of the weight of the criteria or parameters:
Objective weights are assigned using the harmonic mean.
The dimensionless standard data are inverted (the reciprocal of each value is taken) and summed for each class (λ1, λ2, λ3, λ4 and λ5), which generates Table XII; the weights of each criterion are shown in Table XIII.
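The weighting step can be sketched as follows, assuming that the weight of a criterion for a given class is the reciprocal of its dimensionless center point divided by the sum of the reciprocals over all criteria. The function name and the two-criterion, two-class input are hypothetical and kept small so the result can be checked by hand.

```python
def harmonic_weights(centers_by_criterion):
    """Harmonic-mean weights.

    centers_by_criterion[j][k]: dimensionless center point of class k for criterion j
    (all values must be non-zero).
    Returns weights[j][k]; for each class k the weights sum to 1 over the criteria.
    """
    n_classes = len(centers_by_criterion[0])
    weights = [[0.0] * n_classes for _ in centers_by_criterion]
    for k in range(n_classes):
        # Sum of reciprocals over all criteria for class k.
        total = sum(1.0 / row[k] for row in centers_by_criterion)
        for j, row in enumerate(centers_by_criterion):
            weights[j][k] = (1.0 / row[k]) / total
    return weights


# Hypothetical input: 2 criteria (rows) and 2 classes (columns).
w = harmonic_weights([[1.0, 2.0],
                      [3.0, 2.0]])
# Class 1: reciprocals 1 and 1/3 give weights of about 0.75 and 0.25.
# Class 2: equal center points give equal weights of 0.5 each.
```

A criterion whose center point for a class is small relative to the other criteria therefore receives a larger weight for that class.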

5) STEP 5: Determination of the clustering coefficient:
The value of each parameter in each triangular function is multiplied by its respective weight for each class (λ1, λ2, λ3, λ4 and λ5), and the products are summed. For each monitoring point, 5 values are obtained, one per function.

V. RESULTS AND DISCUSSION
The evaluation of water quality in the lower Huallaga river watershed shows that monitoring points P1, P2, P3, P4, P5, P7, P8, P9, P10, P11 and P12, listed in Table XVI, fall in the "Uncontaminated" category, the lowest contamination level of the Prati index. However, point P6 in Table XVI falls in the "Highly contaminated" class, which makes it the monitoring point with the worst quality according to the Prati scale.
In addition, according to the "Participatory Monitoring Report on the quality of surface water in the Huallaga River Basin" [2], point P6 exceeds the ECA category 3, subcategory D1 [4] in two parameters (BOD and COD), which is consistent with the classification obtained and supports the reliability of the methodology used. The order of contamination of the 12 monitoring points, from highest to lowest, according to the results in Table XVI is: P6 > P12 > P4 > P9 > P2 > P5 > P10 > P8 > P11 > P7 > P3 > P1. Furthermore, a study conducted in Cusco [8] mentions that anthropogenic activity contaminated the surrounding banks through the effluents emitted; in the present case study, there is a landfill at "Fundo 3 hermanitos" near point P6, which is probably the cause of the contamination in that area.
Finally, the evaluation of water bodies in the Rimac River basin [9] used 5 representative monitoring points located downstream; in the present case study, 12 representative points were evaluated, since a larger number of monitoring points considerably reduces the uncertainty about water quality in the study area and gives a closer approximation of the water quality conditions in the section of interest.
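The clustering coefficients and class assignments discussed above (Steps 5 and 6) can be sketched as follows for one monitoring station. The function names, membership values and weights are hypothetical; only the five class names follow the Prati scale used in this study.

```python
CLASSES = [
    "Uncontaminated",
    "Acceptable",
    "Moderately contaminated",
    "Contaminated",
    "Highly contaminated",
]


def clustering_coefficients(memberships, weights):
    """Step 5: for each class k, sum over the criteria of membership * weight.

    memberships[j][k] and weights[j][k] refer to criterion j and class k.
    """
    return [
        sum(f[k] * w[k] for f, w in zip(memberships, weights))
        for k in range(len(CLASSES))
    ]


def assign_class(sigma):
    """Step 6: the station belongs to the class with the largest coefficient.

    If several classes tie, the first (least contaminated) one is returned.
    """
    best = max(range(len(sigma)), key=lambda k: sigma[k])
    return CLASSES[best]


# Hypothetical station with 2 criteria and equal weights (illustrative only).
memberships = [[0.5, 0.5, 0.0, 0.0, 0.0],
               [1.0, 0.0, 0.0, 0.0, 0.0]]
weights = [[0.5] * 5,
           [0.5] * 5]

sigma = clustering_coefficients(memberships, weights)   # [0.75, 0.25, 0.0, 0.0, 0.0]
print(assign_class(sigma))                              # Uncontaminated
```

Repeating this for every station gives one row of five coefficients per monitoring point, from which the class of each point can be read directly.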

A. About the Methodology
Multi-criteria analysis approaches such as Delphi [12] [13] and AHP [14][15] do not account for a degree of uncertainty, because they focus on the importance of the criteria taken into account in the analysis.
In addition, the assessment of water bodies in the Rimac River [9] mentions that the monitoring points belong to Category 1 A2-Population and Recreation and were evaluated using the Grey Clustering method, so its methodology is comparable with the Peruvian ECA [4]. In our study, when the results of the Grey Clustering application were compared with the results of the participatory monitoring at the 12 monitoring points, 11 of the points complied with the ECA for water [4], while P6 did not comply with the national standards, which indicates that the results obtained are reliable. Finally, this method was used because, according to the study of the water quality of the Qingshui River in the city of Duyun, China [10], it is more scientific and reasonable than other methods and can provide a basis for the evaluation of water quality and the management of the water environment wherever it is applied.

VI. CONCLUSION
The water quality at eleven monitoring stations in the Lower Huallaga River Watershed is in the "Uncontaminated" category of the Prati Index, in the following order from highest to lowest contamination: P-12 > P-4 > P-9 > P-2 > P-5 > P-10 > P-8 > P-11 > P-7 > P-3 > P-1. This suggests that the economic activities surrounding these water bodies are not significantly affecting their quality. The P-6 monitoring station, in contrast, is in the "Highly contaminated" category because of its proximity to a dump where domestic, municipal and industrial waste is generally disposed of, the most contaminating being hospital waste.
In this work, the Grey Clustering methodology was used because the evaluation of water quality depends on multiple criteria and is often carried out with little data and little information. Classical statistical methods are not well suited to this type of evaluation because of the degree of uncertainty involved, whereas the Grey Clustering method prioritizes the criteria, that is, it calculates a weight for each criterion.
Grey Clustering gives reliable results, so it is beneficial to apply this methodology in studies of water quality, air, soil, biodiversity and landscape, as well as in other fields of study such as economics and sociology. However, the calculations become laborious as the number of monitoring points increases, so they could be simplified using a programming language.