Information Retrieval Method of Natural Resources Data based on Hash Algorithm

www.ijacsa.thesai.org


INTRODUCTION
Natural resource data has not yet been organized into a complete and clear hierarchical directory with corresponding data items, and it has not been labeled according to business scenarios. This affects the business departments' grasp of the actual availability of data results, which can easily lead to data duplication and low reuse rates of results. Comprehensive data mining analysis is not sufficient to effectively meet the needs of decision support [1][2][3][4]. Upper-layer application development is "strongly coupled" to the lower-layer data platform. Instead of data services being provided as "standard components", each upper-layer application customizes its own data service platform from top to bottom, which disrupts the normal formation of the basic data directory service system. Any change in the underlying data environment, such as the entry and exit of data (that is, the definition of data lineage and data flow), directly affects the normal use of other business applications. Therefore, research on the automatic retrieval of data resources is of great significance [5].
Currently, there are many studies related to automatic retrieval methods. Study [6] proposes a mental health management model for college students based on wireless sensor networks, which uses wireless sensor systems to collect statistics on college students' mental health data, thereby achieving automatic retrieval of information. Study [7] applies a clustering analysis algorithm to the analysis of college students' mental health education, with which the relevant data can be calculated, providing a foundation for automatic information retrieval.
To solve the problems in the above methods, this paper proposes a natural resource data information retrieval method based on a hash algorithm. Data center technology is used to solve problems such as information source localization, data catalog organization, data semantic definition and expression, and relationship construction between data entities in natural resource data. Combining the distribution of resource data streams, hash algorithms are used to restructure the information structure and encrypt data in the natural resource data center, and information quality control model parameters are established for the data center. The data are organized and managed through data governance rules and stored in the publishing library, and various data processing tools are used to process, clean, and reconstruct the data [8]. Through hash algorithms and data aggregation processing, information detection in natural resource data is achieved. Experimental results show that the method achieves a high recall rate in natural resource data retrieval.

A. Information Storage Structure Model of Natural Resources Data Center
To design a hash-based information retrieval system for natural resource data, a block-by-block control model of information retrieval is constructed by combining bidirectional reference control and fuzzy retrieval, and the information processing terminal is established by combining semantic similarity fusion with a fast natural-language database retrieval method [9][10][11]. Semantic detection and comprehensive data management are realized with an expert system identification method, which yields the overall structure of information retrieval in natural resource data (see Fig. 1). The overall structure mainly includes a semantic detection module, a data management module, a report management module and a visual management module. Using the embedded B/S framework, program control of the information retrieval system is carried out, and an optimized storage structure model is constructed by combining the dictionary ordering storage mechanism [12]. The distribution of storage nodes of information in natural resources data is shown in Fig. 2. According to the storage node model in Fig. 2, and based on data storage and distributed clustering, information retrieval in natural resources data is divided into five layers:
1) Data source layer. All kinds of heterogeneous databases are compatible and used as data sources. In the process of data warehousing, they are divided into an intermediate database and a publishing library. The source data are organized by standardizing the data collection rules and stored in the intermediate database; through data governance rules, the data are structured and managed, and stored in the publishing library. This layer provides complete and effective data support for the whole data center.
2) Data resource layer. Various data processing tools are used to process, clean and reconstruct the data, finally forming the data formats and data structures required by data services, applications and tools.
3) Platform service layer. It provides a unified data scheduling service, supports all kinds of services, applications and tools, and manages all information of the service layer in the form of a service bus.
4) Application resource layer. It provides various data-related applications, such as data query, visual analysis, label management, knowledge map and knowledge search. At the same time, a secondary development interface is provided according to the relevant service specifications to lay the foundation for future expansion.
5) Portal layer. According to the user's rights and responsibilities, it provides a quick and user-friendly mode of operation, and provides corresponding data management tools and applications for people with different rights. The management end likewise provides a simple and effective management mode [13].
According to the above-mentioned rule set distribution and the optimized design of the retrieval system, the structural model of the information retrieval system in natural resources data is constructed [14].
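The flow through the data source layer (collection rule, intermediate database, governance rule, publishing library) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `DataCenterStore`, the field list `REQUIRED_FIELDS`, and the particular collection and governance rules are hypothetical stand-ins chosen only to make the two-stage flow concrete.

```python
from dataclasses import dataclass, field

# Hypothetical collection rule: a source record must carry these fields.
REQUIRED_FIELDS = ("id", "category", "value")


@dataclass
class DataCenterStore:
    """In-memory stand-in for the intermediate database and publishing library."""
    intermediate: list = field(default_factory=list)
    publishing: list = field(default_factory=list)


def collect(store, record):
    """Store a source record in the intermediate database if it meets the rule."""
    if all(key in record for key in REQUIRED_FIELDS):
        store.intermediate.append(dict(record))
        return True
    return False


def govern(store):
    """Hypothetical governance rule: normalize the category, drop duplicate ids."""
    seen = set()
    for rec in store.intermediate:
        if rec["id"] in seen:
            continue
        seen.add(rec["id"])
        cleaned = {**rec, "category": rec["category"].strip().lower()}
        store.publishing.append(cleaned)
    return store.publishing
```

In this sketch, records that fail the collection rule never reach the intermediate database, and only governed records are visible to the layers above.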

B. Data Fusion of Information Retrieval in Natural Resources Data
The dictionary storage mechanism is used to construct the optimized distribution structure model of the information in natural resources data [15]. Combined with the analysis of the storage structure, the distribution order of the information in natural resources data is obtained through fuzzy matching and the hashing algorithm. Similarly, in the conversation component, the conversation protocol of information retrieval in natural resources data is constructed, and the distribution set of related features is obtained under the guidance of the retrieval mode. This distribution set is parameterized by the dimension of fuzzy comprehensive clustering, the sample length $L$ of information retrieval in natural resources data, and the sampling sample sequence $n$. The matching model of information retrieval in natural resources data is constructed by using an Observer coprocessor, and the element combination parameters are obtained by semantic degree analysis in file $x_t$, where $x$ is the length of the hash and a regression distribution parameter is defined alongside it. In the storage node set $S$ of data center information in natural resource data, the edge feature distribution set is satisfied, and the semantic correlation dimension of natural resource data is defined. Based on the fusion of large data sets, the feature quantities of the natural resource data association rules are obtained as a combination of the vectors $q_i$. By identifying natural resource data information, the association of natural resource data information is carried out in the cluster center. Combined with the fuzzy comprehensive decision method, the retrieval control of natural resource data information is carried out, and the association rule items and the dense subgraph of natural resource data information are obtained.
In the dense subgraph, $g(q_i, d_{ij})$ is the nearest neighbor feature distribution set, and $f(d_{ij})$ is the joint autocorrelation matching set of data center information in natural resources data, thus realizing the data fusion of data center information retrieval in natural resources data.
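The role of hashing in placing records on storage nodes can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's distribution model: it uses plain modulo hashing with SHA-256 from Python's standard `hashlib`, and the function names `node_for` and `distribute` as well as the record keys are hypothetical.

```python
import hashlib


def node_for(key: str, num_nodes: int) -> int:
    """Map a record key to a storage node index with a stable hash.

    SHA-256 is used instead of Python's built-in hash() because the latter
    is randomized per process for strings and would not be reproducible.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def distribute(keys, num_nodes):
    """Group record keys by the storage node that the hash assigns them to."""
    nodes = {i: [] for i in range(num_nodes)}
    for key in keys:
        nodes[node_for(key, num_nodes)].append(key)
    return nodes
```

Because the mapping depends only on the key and the node count, any component can recompute where a record lives without consulting a central directory. A drawback of plain modulo hashing is that changing `num_nodes` remaps most keys.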

A. Characteristic Clustering of Information Retrieval in Natural Resources Data
The association rule set of information distribution in natural resources data is constructed, and the control time $S$ of information retrieval in natural resources data is obtained by using the hash algorithm. By using multi-table connection and semantic matching, the optimal feature solution set of information in natural resources data is obtained.

In this solution, the two max-subscripted quantities are the correlation attribute set of information retrieval in natural resource data and the fuzzy matching coefficient, $Q_i$ is the correlation rule coefficient, $K$ is the detection statistical feature quantity, and $T$ is the semantic adjacent parameter. The attribute value of the distribution set $a_i$ in natural resource data is obtained from the association rule information of information retrieval in $m \times n$-order natural resource data, where $h_j$ is the dimension of the second-order hash algorithm and $s_j$ is the statistical distribution set. The characteristic clustering model of information retrieval in natural resource data is then constructed, with the autocorrelation coefficient of information in natural resources data as a parameter. According to the above analysis, the clustering analysis of information retrieval in natural resources data is realized by using the hash algorithm and rough set clustering.

B. Retrieval and Output of Information in Natural Resources Data
The hash algorithm is adopted to realize iterative fusion and adaptive control in the process of information retrieval in natural resources data, and the fuzzy parameter distribution domain of information retrieval in natural resources data is constructed, where $N_{node}$ is the dimension of information retrieval nodes in natural resources data, and $N_{node}^{r}(i)$ is the detection statistical component corresponding to the information retrieval nodes [16][17][18].
Combined with the correlation of information retrieval in natural resources data [19][20][21][22][23], the fuzzy decision model of information retrieval in natural resources data is constructed, and the Transport/Session transmission protocol and session management protocol of information retrieval in natural resources data are designed [23][24][25][26]. The output of the optimized retrieval model is then obtained, where $\hat{g}_{ji}^{c}$ is the input joint parameter of information retrieval in natural resource data, $T$ is the sampling sample width of information in natural resource data, and the coefficient indexed by $c$ and $i$ is the ambiguity detection coefficient. The ambiguity detection is carried out on the bidirectional index channel of information in natural resource data. According to the above analysis, the association rule set of information distribution in natural resource data is constructed, and the iterative fusion and adaptive control in the process of information retrieval in natural resource data are realized by using the hash algorithm.

Fig. 3. Implementation process of information retrieval in natural resources data center (components: Hadoop; database retrieval tasks 1 to n; JPA; Java Database Connectivity (JDBC); database).
According to Fig. 3, the training data and test data stored in HDFS are input to Hadoop, which dispatches them to database retrieval tasks 1 through n. The retrieval tasks access the database through JPA and Java Database Connectivity (JDBC), thereby completing the information retrieval of the natural resource data center.
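The fan-out of one query to retrieval tasks 1 through n, followed by a merge of the partial results, can be sketched in plain Python. This is only an analogy for the process in Fig. 3: a thread pool from the standard library stands in for Hadoop, in-memory lists stand in for the database partitions, and the names `retrieval_task` and `run_retrieval` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def retrieval_task(partition, predicate):
    """One retrieval task: scan a single partition and keep matching records."""
    return [rec for rec in partition if predicate(rec)]


def run_retrieval(partitions, predicate):
    """Run one task per partition in parallel and merge the partial results.

    Executor.map yields results in the order of the input partitions, so the
    merged list is deterministic regardless of which task finishes first.
    """
    workers = max(1, len(partitions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(retrieval_task, partitions, [predicate] * len(partitions))
        return [rec for part in partial for rec in part]
```

For example, `run_retrieval([[1, 2, 3], [4, 5], [6]], lambda r: r % 2 == 0)` scans three partitions concurrently and returns the even records in partition order.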

A. Experimental Environment Settings
In the test platform, according to the overall structure, the data center includes seven modules: catalogue information system, data resource management, electronic license management, interface management, system management, system monitoring and system tools. It supports specific functions such as metadata management, catalogue management, atlas management, automatic generation of electronic licenses, resource statistics and application monitoring. The Natural Resources Department has more than 66.93 million data records, covering 23 business offices; other departments and bureaus hold about 22 million data records, covering 19 industries; and 163.9 TB of aggregated spatial data are shared. The distribution set of statistical features of data center information retrieval in natural resources data numbers 1,206, and 29 categories of thematic data and 1,600 element layers are aggregated. See Table I for the statistical characteristics of data center information distribution in natural resource data.

B. Discussion of the Experimental Methods and the Results
According to the parameter distribution in Table I, the information of natural resources data is searched and stored in key-value form to assist label management and quick resource positioning. At the same time, an index of conditional retrieval labels is added to support full-text retrieval. On the basis of the category system of natural resources data labels, complete data query and guidance functions are established to provide support for data sharing and data trading. Queries over entities, labels, relationships and portraits are supported, and functions such as data type inference, conditional retrieval, paging query and expression query are added. The intelligent guidance concept is introduced and combined with the hash algorithm, and feature points are marked to analyze the reliability of information retrieval in the data center. According to the sample test, the sample feature distribution of information in the natural resources data center is shown in Fig. 4.
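Key-value storage with a label index, conditional retrieval and paging, as described above, can be sketched with Python's built-in hash tables. This is a minimal sketch under stated assumptions rather than the platform's actual data model: the class `LabelIndex`, its method names and the record layout are hypothetical.

```python
from collections import defaultdict


class LabelIndex:
    """Key-value store with an inverted index from labels to record keys."""

    def __init__(self):
        self._records = {}                 # key -> record (hash table lookup)
        self._by_label = defaultdict(set)  # label -> set of keys

    def put(self, key, record, labels):
        """Store a record under its key and register it under each label."""
        self._records[key] = record
        for label in labels:
            self._by_label[label].add(key)

    def get(self, key):
        """Direct lookup by key; returns None when the key is unknown."""
        return self._records.get(key)

    def query(self, labels, page=1, page_size=10):
        """Return one page of the records that carry all of the given labels."""
        if labels:
            label_sets = [self._by_label.get(label, set()) for label in labels]
            keys = set.intersection(*label_sets)
        else:
            keys = set(self._records)
        ordered = sorted(keys)             # stable order so paging is repeatable
        start = (page - 1) * page_size
        return [self._records[k] for k in ordered[start:start + page_size]]
```

Lookup by key and lookup by label both run in expected constant time per hash-table access, and the sorted key order keeps page boundaries stable between calls.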
According to the sample distribution structure of natural resources information in Fig. 4 (test data and training data), the information of natural resources data is searched and the confidence level distribution is shown in Fig. 5. According to the distribution characteristics of the frequency domain, a knowledge map of the relationships among data clouds, networks, systems, events and data resources is established, and the use of resources is monitored through distribution and exchange. A visual display mode is adopted to show the overall situation of resources, the real-time status of data sharing, and the ranking of resource applications, to compile statistics and measurements of data usage, and to carry out early warning and monitoring of important indicators. The clustering of the information retrieval in natural resources data by this method is good. The recall rate of different methods is also tested, and the comparison results are shown in Fig. 6. The analysis of Fig. 6 shows that the method in this paper has a high recall rate and good retrieval performance. At the same time, a data inflow and outflow ledger is established to help resource managers understand the weak links in individual work and to promote the resource construction of the platform. According to the data security level, the data are managed hierarchically and desensitized, and the abnormal behavior of users, data access, operations on data resources and metadata in the data center are monitored, counted and analyzed for risk to ensure the security of government data of natural resources.
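The text does not define the recall rate compared in Fig. 6. Assuming the standard information retrieval definition (the fraction of relevant items that the method actually retrieves), it can be computed as in the following sketch; the function name `recall` and the example identifiers are hypothetical.

```python
def recall(retrieved, relevant):
    """Recall = |relevant ∩ retrieved| / |relevant| (standard IR definition).

    Returns 0.0 when there are no relevant items, to avoid dividing by zero.
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(relevant & set(retrieved)) / len(relevant)
```

For instance, if four records are relevant to a query and the method returns two of them plus one irrelevant record, `recall(["a", "b", "x"], ["a", "b", "c", "d"])` evaluates to 0.5.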

V. CONCLUSION
Systematic management of natural resources data is a long-term process, which needs to be carried out in terms of methodology, standards and technical realization. The construction of a data platform can connect the links of data acquisition, data processing, data service and so on through the double model of business and data, which can further improve the way of data sharing and exchange, and form an open, flexible and extensible unified natural resources data management mode. At present, many factors restrict the governance of government data of natural resources based on data center technology. First, the industry lacks norms and standards for the construction of data centers, and the specific form, sharing mode and service standard of data centers for the future development and construction of natural resources are not defined. Second, the data return mechanism has not been improved. Limited by network, security, technology and other factors, the data generated and output at the same level, at higher levels and at lower levels are not effectively shared. For example, local governments use superior systems to report data, and in the process of reporting, a large amount of real, effective and structured data has been sorted out and produced.
However, in actual work, due to the lack of a data return mechanism and sharing channels, these reported data are managed and used only by the superior departments. If local departments need to use the data or connect them to their own systems, the work has to be done a second time. Next, this study will further upgrade and optimize the data center on the basis of in-depth exploration of the types, nature, collection cycle and collection mode of business data and continuous improvement of the data management and service mechanism. At the same time, on the basis of data resource service convergence, big data analysis technology will be fully used to explore the construction of a knowledge map of natural resources, find out the relationships between entities, better analyze the problems in natural resource management, and provide a practical and valuable reference for administrative decision-making.