A Bird’s Eye View of Natural Language Processing and Requirements Engineering

Natural Language Processing (NLP) has demonstrated effectiveness in many application domains, and it can assist software engineering by automating various activities. This paper examines the interaction between software requirements engineering (RE) and NLP. We reviewed the current literature to evaluate how NLP supports RE and to examine research developments. The review indicates that NLP is being employed in all phases of RE. This paper focuses on the elicitation and analysis phases, because RE communication issues such as ambiguity, inconsistency, and incompleteness are primarily associated with these two phases. Many of these problems stem from a lack of stakeholder participation in both phases. Thus, we address the application of NLP during requirements elicitation and analysis, discuss its limitations in these two phases, and examine potential future directions for the domain. This paper asserts that, despite progress in the development of NLP systems, the RE process still needs the involvement of humans with knowledge of the domain and the specific project.
Keywords: Automated text understanding; natural language processing; requirements engineering; requirements elicitation


I. INTRODUCTION
Natural Language Processing (NLP) has significant functional value in many application fields and is especially useful in the requirements engineering (RE) domain. RE is a vital part of software engineering and is considered the first phase in the software development life cycle. It consists of several activities, including elicitation, analysis, documentation, validation, and management of requirements [1]. RE is a complicated process that is both time-consuming and error-prone, especially for large projects [2]. The RE process defines all of the requirements that a new system must satisfy to be successful. In addition, the RE process collects the necessary and appropriate domain knowledge that comprises the requirements of the stakeholders (customers, end-users, domain experts). The requirements elicitation and requirements analysis tasks are performed incrementally and iteratively to accomplish this goal. These tasks use both informal natural language (NL) and formal modeling languages. Using NL helps communication with the stakeholders because it is the one format understood by end-users and stakeholders from all involved disciplines [3]. However, NL can be ambiguous and can result in misunderstandings concerning the definition of requirements.
NLP can improve the communication between requirements engineers and domain experts (i.e., the users) by creating suitable RE specification documents [3]. NLP can also improve computer understanding of natural language text written by humans [4].
This paper presents a survey of how NLP supports current RE approaches. Specifically, the following research questions are addressed:
RQ1: What is the current state of the practice for elicitation and analysis phases in RE using NLP as the platform?
RQ2: What are the activities for the requirements elicitation and analysis in RE using NLP as the platform?
RQ3: Are NLP systems improving the requirements elicitation and analysis for RE?
RQ4: What are the current limitations of using NLP in requirements elicitation and analysis?
A literature review was conducted to address these research questions. The review summarized the research reporting the use of NLP to support RE activities and was divided into data preparation, data collection, and data analysis stages. First, the literature search criteria were developed based on the research questions. Second, literature searches were conducted over a predefined collection of databases, including SpringerLink, Scopus, IEEE Xplore, Google Scholar, ScienceDirect, and ACM Digital Library. The search results were evaluated by title and abstract for all four research questions. This literature review was not exhaustive. Instead, it provides a snapshot of the state-of-the-art practices based on the Kitchenham guidelines [5] for conducting systematic literature reviews.
This paper provides an overview of the current state of practices and challenges associated with the elicitation and analysis phases of RE employing NLP techniques. This study focused on three important aspects:
• Providing an overview of the challenges facing the requirements elicitation and the requirements analysis (i.e., ambiguity, inconsistency, incompleteness, and requirements classification).
• Providing an overview of the state-of-the-art approaches to NLP support of current RE operations. Specifically, this study focused on how the available NLP tools and techniques support requirements elicitation and requirements analysis.
• Providing an overview of the limitations of NLP use in requirements elicitation and requirements analysis.
This paper is organized as follows. Sections II and III provide background discussion about NLP and RE. Section IV provides an overview of RE activities where NLP was used to support the requirements elicitation and analysis processes. Section V discusses current requirements elicitation and analysis practices in NLP. Section VI discusses the current limitations of using NLP in requirements elicitation and analysis. Section VII provides general discussions. Finally, Section VIII concludes the paper.

II. REQUIREMENTS ENGINEERING
The elicitation, analysis, and management of requirements on the basis of semantics during the system development process are difficult because large system engineering projects involve a large number of requirements. Experts normally face numerous constraints during the processes of elicitation and analysis. These constraints include limited time, insufficient human cognitive capacity to understand the full scope of the processes, and the volume of data to be processed [6]. These considerations make it difficult to manage and maintain the quality of the software requirements specification (SRS) document.
Requirements elicitation involves understanding the objectives and motivation for the proposed software system. This phase usually begins with fundamentally informal knowledge and involves people who are unfamiliar with the processes for developing a software system. Thus, data interpretation is influenced by misunderstandings between the analysts and the consumers [7]. In RE, it is essential to have an excellent semantic understanding of the situation before beginning the process of requirements elicitation [8]. Traditionally, requirements were communicated to the elicitation team using NL text to prevent misunderstandings related to variations in terminology. However, ambiguity is a problem related to NL [9]. The elicitation team may misinterpret or misunderstand requirements specified in NL because of the method used to communicate them. Additionally, problems can occur that are associated with the elicitation of functional requirements (FRs) and the numerous subcategories of the non-functional requirements (NFRs) [10]. These problems stem from differences between the stakeholders and the requirements engineers in the computer jargon and terminology used to describe the requirements [11]. The lack of consistency in the requirements documentation process makes the requirements classification process difficult and prone to error [12]. Cordes and Carver (1989) [13] stated that requirements are not created by a single individual but are the result of common needs from multiple communities. These requirements can introduce uncertainty and inconsistencies. They further state that different participants in the requirements elicitation process have different interpretations of the meaning of the requirements. The participants have different opinions about the design of the new system based on their interpretation of the requirements.
The resulting requirements will be ambiguous, contradictory, and incomplete if the participants in an elicitation process do not have a common semantic understanding of the requirements.
Requirements analysis involves understanding and assessing the documented requirements. This phase is concerned with checking the set of elicited requirements for conflict, omission, duplication, ambiguity, and inconsistency. Common requirements analysis practices involve using checklists for analysis, prioritization, and sorting of requirements, using interaction matrices to define differences and overlaps, and developing a risk evaluation of the requirements [14].
Requirements elicitation and requirements analysis are usually interlinked processes. Requirements are identified during elicitation, and then analyses are performed. If issues are identified, they are addressed and solved using the source of the requirements [14]. Once a requirement has been elicited, modeled, and analyzed, it should be documented clearly and unambiguously in the SRS [15]. The SRS is part of a contract and must simply, accurately, and unambiguously define the requirements of the user and the system. An SRS that is inconsistent, unmanaged, vague, incorrect, or unclear inevitably leads to cost overruns and missed deadlines [16], [17], and [18]. A noteworthy research issue in RE is ambiguity, which is described as "a statement having more than one meaning." Ambiguities may be lexical, syntactic, semantic, pragmatic, vague, generic, or linguistic [19].
The use of NLP in RE is critical because requirement specifications are developed in collaboration between the software analysts and the end users. End users, consumers, and customers will not sign a contract if the requirements are written in a formal language [20].
Several projects have demonstrated that the RE process can be automated or semi-automated by using NLP [21], [22], and [23]. Furthermore, NLP can support requirements elicitation and requirements analysis by helping to reduce the ambiguity barrier.

III. NATURAL LANGUAGE PROCESSING
NLP is a sub-field of artificial intelligence. NLP applies computational and linguistic techniques to help computers understand human language. Additionally, NLP can generate human language in the form of text and speech [24]. The processing of NL is difficult and involves different techniques from those used for artificial languages [25]. NLP approaches are usually based on machine learning (ML). For NLP, the ML process is composed of two tasks: natural language understanding (NLU), which is the task of understanding the text, and natural language generation (NLG), which is the task of generating text with a syntax that is widely used by humans [26] and [27]. Another study [28] identified three types of NLP technologies: NLP techniques, NLP tools, and NLP resources. An NLP technique is a functional method, approach, process, or procedure for conducting a specific NLP task. NLP techniques include part-of-speech (PoS) tagging, parsing, and tokenization. An NLP tool is a software system or software library that supports one or more NLP techniques. Examples of NLP tools include Stanford CoreNLP, NLTK, and OpenNLP. An NLP resource is a linguistic data resource that assists NLP techniques or tools. An NLP resource can be a language lexicon (i.e., a dictionary) or a corpus (i.e., a collection of texts).
Lexical analysis can be applied to a requirements document with the help of pre-built dictionaries, databases, and rules. The goal of lexical analysis is to analyze the meaning of specific words. Five key techniques can be used in a lexical analysis: sentence splitting, tokenization, PoS tagging, morphological analysis, and parsing. Sentence splitting involves separating the text into different sentences. During this process, the NL text is evaluated to determine sentence boundaries.
Tokenization involves dividing a sentence into meaningful components, called tokens. Depending on the form of the text, which is partially determined by the sentence splitting, each token is assigned to a category, such as punctuation mark, number, symbol, or word. PoS tagging involves tagging each token with its grammatical category, depending on its meaning and context. Each token is designated with a tag, such as noun, verb, adjective, or determiner [29].
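The sentence splitting and tokenization steps described above can be sketched in a few lines of Python. This is a minimal illustration only: the regular expressions, the function names, and the four token categories are assumptions chosen for the example, and production tools such as those named earlier handle abbreviations, quotations, and many other cases that this sketch ignores.

```python
import re

# Illustrative patterns (assumptions, not a standard): a sentence boundary is
# whitespace that follows ., ! or ? and precedes a capital letter.
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")
# A token is a number, a run of word characters, or a single other character.
TOKEN = re.compile(r"\d+(?:\.\d+)?|\w+|[^\w\s]")


def split_sentences(text):
    """Split raw text into sentences at the assumed boundaries."""
    return [s.strip() for s in SENTENCE_BOUNDARY.split(text) if s.strip()]


def categorize(token):
    """Assign a token to one of four coarse categories."""
    if re.fullmatch(r"\d+(?:\.\d+)?", token):
        return "number"
    if re.fullmatch(r"\w+", token):
        return "word"
    if token in ".,;:!?()\"'":
        return "punctuation"
    return "symbol"


def tokenize(sentence):
    """Return (token, category) pairs for one sentence."""
    return [(t, categorize(t)) for t in TOKEN.findall(sentence)]


text = "The system shall respond within 2 seconds. Users must log in first!"
sentences = split_sentences(text)
tokens = tokenize(sentences[0])
```

Here `sentences` holds the two requirement sentences, and `tokens` pairs each token of the first sentence with its category, for example `("2", "number")`.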
Morphological analysis is the initial stage in syntactic analysis. The goal of morphological analysis is to determine the root of a complex word, which is done by stemming and lemmatization. Stemming reduces an inflected or derived word to its stem, typically by stripping affixes, so the result is not necessarily a valid word. Lemmatization searches for the base (dictionary) form of a word [29].
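The difference between the two operations can be made concrete with a toy sketch. The suffix list and the small lemma dictionary below are assumptions invented for this example; real stemmers and lemmatizers rely on much richer rule sets and lexicons.

```python
# Toy resources, assumed for illustration only.
SUFFIXES = ("ing", "ed", "es", "s")
LEMMAS = {
    "was": "be",
    "were": "be",
    "stored": "store",
    "requirements": "requirement",
}


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def lemmatize(word):
    """Look the word up in the lexicon; fall back to the word itself."""
    word = word.lower()
    return LEMMAS.get(word, word)
```

With these definitions, `stem("stored")` yields the truncated form `"stor"`, whereas `lemmatize("stored")` yields the dictionary form `"store"`; an irregular form such as `"was"` is left unchanged by the stemmer but mapped to `"be"` by the lexicon lookup.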
Parsing is a process that analyzes a sentence by taking each word and determining the structure of the sentence based on its constituent sections. Two components are required to parse a piece of text: a parser and a grammar. An ambiguous sentence may admit several different analyses under the grammar of an NL [30]. There are two main parsing approaches: dependency parsing and phrase structure parsing. Dependency parsing focuses on the connections between the words in a sentence. Phrase structure parsing involves construction of a parse tree using a probabilistic context-free grammar.
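To illustrate how a parser and a grammar together expose ambiguity, the following sketch counts the phrase structure trees that a tiny grammar assigns to a sentence. It is a simplification under stated assumptions: the grammar and lexicon are invented for the example, the grammar is a plain (non-probabilistic) context-free grammar in Chomsky normal form, and the chart algorithm is the standard CYK procedure rather than the method of any particular tool.

```python
from collections import defaultdict

# Assumed toy grammar in Chomsky normal form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("PP", "P", "NP"),
]
# Assumed toy lexicon mapping each word to its possible PoS tags.
LEXICON = {
    "the": ["Det"],
    "operator": ["N"],
    "pump": ["N"],
    "sensor": ["N"],
    "monitors": ["V"],
    "with": ["P"],
}


def count_parses(words, start="S"):
    """Count the distinct parse trees for `words` using a CYK chart."""
    n = len(words)
    # chart[(i, j)][A] = number of trees rooted at A that span words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for tag in LEXICON.get(word, []):
            chart[(i, i + 1)][tag] += 1
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    chart[(i, j)][parent] += (
                        chart[(i, k)][left] * chart[(k, j)][right]
                    )
    return chart[(0, n)][start]


ambiguous = "the operator monitors the pump with the sensor".split()
unambiguous = "the operator monitors the pump".split()
```

Under this grammar, `count_parses(ambiguous)` finds two trees, because "with the sensor" can attach either to the verb phrase (the operator uses the sensor) or to "the pump" (the pump has the sensor), while `count_parses(unambiguous)` finds one. A probabilistic grammar would additionally rank the competing trees.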
The output of a lexical analysis serves as the input for a syntactic analysis. This method performs a sentence analysis of the words to determine the grammatical structure of the sentence. It requires both grammar and a parser. This level of processing results in a representation of the sentence that shows the structural relationship of the dependence between words [4].
Semantic processing defines the potential meaning of a sentence by focusing on associations between word-level meanings in a sentence [4]. Semantic processing builds a description of the objects and actions identified in a sentence and includes the details given by the adjectives, adverbs, and prepositions [31].
The goal of categorization is to automatically assign new documents to categories that are already defined [32]. In RE, the NLP method is used to collect requirements from a text; analyze the consistency, linkages, similarities, and ambiguities in the text; and automatically group the text. It also classifies requirements for specific purposes that may be useful during software development. Work associated with the classified requirements may be split between different software development teams, with each team assigned a different class of requirements [33].
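As a rough illustration of assigning requirements to predefined categories, the sketch below uses hand-written cue words instead of a trained model. The categories, the cue words, and the function name are all assumptions made for this example; the approaches surveyed here typically learn such associations with ML rather than hard-coding them.

```python
import re

# Hypothetical cue words per NFR category, chosen only for illustration.
NFR_CUES = {
    "performance": {"seconds", "latency", "throughput", "fast"},
    "security": {"encrypt", "encrypted", "authenticate", "password"},
    "usability": {"intuitive", "accessible", "easy"},
}


def classify(requirement):
    """Label a requirement as FR, or as NFR with its best-matching category."""
    words = set(re.findall(r"[a-z]+", requirement.lower()))
    scores = {label: len(words & cues) for label, cues in NFR_CUES.items()}
    best = max(scores, key=scores.get)
    # No cue word matched: treat the sentence as a functional requirement.
    return f"NFR/{best}" if scores[best] > 0 else "FR"
```

For example, `classify("The system shall respond within 2 seconds.")` returns `"NFR/performance"`, and a sentence without any cue word, such as `"The user shall be able to export reports as PDF."`, falls back to `"FR"`. The resulting labels could then be used to route each class of requirements to a different team.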

IV. NLP FOR REQUIREMENTS ELICITATION AND ANALYSIS
Traditionally, requirements elicitation and requirements analysis are manually processed, expensive, time-consuming, and resource-demanding [17]. Researchers have applied NLP techniques and tools to support a range of linguistic analysis tasks performed at various phases to produce complete requirements documents without ambiguity and inconsistency [34]. The requirements can be illustrated for the stakeholders in a semi-automated or automated way [35], [36], and [37]. Requirements may appear in different forms, including lists of single words, phrases, paragraphs, short texts, and special characters.
Generally, requirements engineering problems are primarily caused by heavy dependence on the human use of NL [38]. NL is syntactically ambiguous and semantically inconsistent. A systematic analysis of literature from 1995 to 2016 indicates that the gathering of ambiguous requirements remains one of the most critical problems in software engineering [39]. In response, researchers have attempted to use NLP systems to solve the ambiguity challenges of NL. NLP systems have also been used to support the communication process between system users and stakeholders during the development stages of a system [15]. Communication techniques may focus on pre-selected tools (e.g., Stanford Parser), preferred methods (e.g., rule-based and ontology-based), or degree of automation. The work of [40] provides a detailed discussion about the current approaches to ambiguity in the field of requirements and evaluates empirical work on NLP tools and techniques for dealing with different types of requirement ambiguity. These studies indicate that a significant number of current software implementations rely solely on ambiguity recognition, whereas resolving the detected ambiguities remains the responsibility of the stakeholders.
An interesting research area is using NLP for eliciting and analyzing domain requirements based on developed domain ontologies. Ontologies provide a standardized means of organizing information among stakeholders in RE. Thus, ontologies may significantly enhance the quality of the elicited requirements [41]. For example, [42] used a domain ontology and meta-model requirements to generate and elicit requirements. Similarly, [43] describes three core features of domain ontologies that are ideal for requirements elicitation: explicit relational expression, competent relationship recognition, and explicit temporal and spatial expressions. A rule-based approach is recommended for the creation of certain domain ontologies from NL technical documents [44]. The research in [45] used NLP to derive formal representations of the requirements based on object-oriented designs using intermediate models.
83 | P a g e www.ijacsa.thesai.org
NLP is recognized as general assistance in analyzing requirements for ambiguity detection [46]. NLP techniques were used to retrieve information and synthesize models. For example, [47] produced unified modeling language (UML) templates (e.g., use-case, analysis class, collaboration, and design class diagrams) from natural language requirements using a collection of syntactic reconstruction rules. In addition, [48] proposed a tool-supported approach, based on NLP and domain ontology techniques, to promote the process of requirements analysis and the retrieval of class diagrams from textual requirements.
Emerging software paradigms, including social networks, mobile computing, and cloud computing, have prompted a growing interest in using NLP techniques. Additionally, NLP techniques are being explored for extensive data analysis to enable data-driven RE [49] and crowd-based RE [50]. Requirements articulated in user stories have been presented as an interesting application of NLP to support agile methodology [51].

V. CURRENT STATE OF PRACTICE OF NLP IN RE
NLP for RE is an area of research and development that applies NLP tools, techniques, and resources to a range of requirements documents to facilitate various linguistic analysis activities performed at different RE phases. These tasks include detecting language problems, defining core domain terms, and creating traceability links between requirements [28].
Currently, most NLP tools are used for solving problems in the elicitation phase. NLP tools are also used to extract process models from NL text, based on parsers and tagging [52]. For example, [53] used PoS tagging for preprocessing during the development of conceptual models. That paper proposed an automated, NLP-based solution called Visual Narrator, which derives a conceptual model from user story requirements. In this process, PoS tagging is used to define the linguistic pattern of the sentence. If a tagged sentence is determined to be a requirement, it is collected from the text corpus and gathered for the next steps in the methodology. The automated approach enables identifying dependencies, redundancies, and inconsistencies between requirements based on a comprehensive and understandable view created from long textual requirements.
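A first preprocessing step for user story requirements can be sketched as follows. The sketch assumes the widely used template "As a <role>, I want <goal>, so that <benefit>", which is not spelled out in this paper, and it uses a regular expression rather than PoS tagging, so it should not be read as the Visual Narrator algorithm. The pattern and the function name are hypothetical.

```python
import re

# Assumed template: "As a/an <role>, I want (to) <goal>[, so that <benefit>]".
USER_STORY = re.compile(
    r"^As an? (?P<role>.+?), I want (?:to )?(?P<goal>.+?)"
    r"(?:,? so that (?P<benefit>.+?))?\.?$",
    re.IGNORECASE,
)


def parse_user_story(story):
    """Return the role, goal, and optional benefit, or None if no match."""
    match = USER_STORY.match(story.strip())
    return match.groupdict() if match else None


story = "As a visitor, I want to search events by date, so that I can plan my trip."
parsed = parse_user_story(story)
```

The extracted role and goal are natural candidates for the entities and relationships of a conceptual model, while sentences that do not fit the template return `None` and can be set aside for manual review.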
The latest trend in requirements elicitation using NLP is to mine accessible data sources (e.g., social media, requirement documents, or Apple Store feedback). The mining process is carried out with the help of ML techniques, NLP, and text mining [54]. Recently, a growing body of research has assessed the use of NLP techniques to extract requirements based on different types of user feedback for the requirements elicitation process [55]. For example, [56] created a tool for detecting ambiguous words in translated SRS documents.
Another current approach automates requirements elicitation and classification by means of an intelligent conversational chatbot. For example, [57] used ML and artificial intelligence to develop a chatbot that interacts with stakeholders using NL and creates formal system requirements based on the conversation. This chatbot then classifies the elicited requirements into functional and non-functional system requirements. Additionally, chatbots are widely used in web applications to provide help or information requested by users. For example, CORDULA is a framework that uses chatbot technology to establish contact with end-users for requirements elicitation and to understand users' needs. CORDULA guides the users to their desired outcome with minimal effort required by the end-user [58].
Domain ontology has been widely used to improve the elicitation and analysis of FRs and NFRs. For example, [59] used NLP to extract NFRs from natural language documents. Furthermore, [60] used an ontology-based approach to support the collection of knowledge to identify possible solutions for eliciting NFRs. Additionally, NLP has been used in similarity analyses to identify FRs and NFRs from user app reviews [61] and [62]. The study of [63] proposed an ontology-based approach to support software requirements traceability, which makes it possible for a development team to effectively manage the evolution of the requirements for a software product.
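One simple way to compute the textual similarity that such analyses build on is the Jaccard coefficient over word sets, sketched below. The measure, the threshold, and the function names are assumptions chosen for brevity; the cited studies may use different representations (for example, weighted term vectors or embeddings).

```python
import re


def tokens(text):
    """Lower-case the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard coefficient: shared words divided by all distinct words."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def similar_pairs(requirements, threshold=0.4):
    """Return (i, j, score) for every pair whose score reaches the threshold."""
    pairs = []
    for i in range(len(requirements)):
        for j in range(i + 1, len(requirements)):
            score = jaccard(requirements[i], requirements[j])
            if score >= threshold:
                pairs.append((i, j, round(score, 2)))
    return pairs


reqs = [
    "The app shall let users reset their password",
    "Users shall be able to reset their password",
    "The app shall export reports as PDF",
]
```

Calling `similar_pairs(reqs)` flags the first two statements as near duplicates and leaves the third one unpaired. The same pairwise scores could be used to match a user review against known requirement statements.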
New requirements analysis tools based on NLP are emerging. These tools should significantly reduce the cost of fixing requirement errors by identifying them faster, thus freeing domain experts from tedious, time-consuming tasks. For example, QuARS (Quality Analyzer for Requirements Specifications) is a tool that analyzes NL requirements in a comprehensive and automated manner using NLP techniques. QuARS emphasizes detecting potential linguistic weaknesses (i.e., ambiguity) that can create issues with interpretation at the next stage of software development [64]. This tool partially assists with the analysis of accuracy and completeness by grouping requirements based on specific concerns. However, the need for user interaction is recognized as a crucial factor that adversely affects the performance and acceptance of the entire process. The study of [65] proposed an NLP technique that uses a classification method to automatically handle redundancy and inconsistency problems in a requirements document.
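The idea of flagging potential linguistic weaknesses can be illustrated with a lexicon lookup. The sketch below is not the QuARS implementation: the category names, the term lists, and the function name are assumptions made for this example, and real tools rely on curated dictionaries and syntactic analysis.

```python
import re

# Hypothetical lexicon of weak terms, grouped by the kind of weakness.
WEAK_TERMS = {
    "vagueness": ["adequate", "appropriate", "user-friendly", "fast", "easy"],
    "optionality": ["possibly", "if needed", "optionally"],
    "weakness": ["may", "could", "might"],
}


def find_weak_phrases(requirement):
    """Return (category, term) pairs for every weak term found."""
    findings = []
    lowered = requirement.lower()
    for category, terms in WEAK_TERMS.items():
        for term in terms:
            # Word boundaries prevent matches inside longer words.
            if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                findings.append((category, term))
    return findings
```

A sentence such as `"The interface should be user-friendly and may possibly cache results."` is flagged three times, once per category, whereas `"The system shall store logs for 30 days."` produces no findings. As the discussion of ambiguity recognition in Section IV notes, a check like this only detects candidate problems; deciding how to rewrite the sentence remains a human task.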
An annual workshop called the NLP4RE (Natural Language Processing for Requirements Engineering) was established to explore interests in NLP applications related to RE issues [66]. The goal of NLP4RE is to help requirements analysts perform multiple linguistic analysis activities for RE phases. This workshop has produced numerous publications and gained broad interest from diverse cultures. About 42.7% of NLP4RE studies focused on the analysis phase. These analysis phase studies used detection as the core linguistic analysis activity and requirements specifications as the processed document type [28].
In the industrial sector, many companies have begun developing NLP tools for RE. For example, Qualicen developed Requirements Scout, a tool that analyzes requirement specifications and detects requirement ambiguity and requirement "smells." The ThingsThinking system offers several tools under the brand name Semantha®. This system includes a tool that classifies requirements and identifies the associated risks. It also has a tool that performs document comparison on a semantic level, which can be used for analysis of requirements documents created by multiple stakeholders. QRA Corp developed QVscribe, a tool that checks the quality and consistency of requirements documents. OSSENO Software developed ReqSuite, which is a tool to support the writing and review of specifications. IBM recently developed the IBM Engineering Requirements Quality Assistant, which is an application that leverages the advanced NLP capabilities of IBM Watson for automated requirements analysis and management.

VI. CURRENT LIMITATIONS OF NLP IN REQUIREMENTS ELICITATION AND REQUIREMENTS ANALYSIS
An automated means of enabling software engineers and project managers to develop and refine their NL requirements is needed in RE. NLP can reduce the human effort in making NL requirements clear, consistent, unambiguous, and easy to understand by all stakeholders before moving into modeling and design phases. Despite research developments on NLP for RE, there are still numerous limitations on the capabilities of NLP techniques and on the rationales for using them. Various challenges for NLP use within requirements elicitation and analysis still exist, including the following:
• Coreference resolution: Coreference resolution is the task of identifying the expressions in a sentence or text that refer to the same entity or actor in a requirements document. It is especially relevant at the semantic/pragmatic level, when two nouns are treated as the same [67]. Coreference resolution is a key challenge of NLP, not only in English but also in all other languages [68].
• Emotion Detection (ED): While the use of NLP systems in requirements elicitation and feedback techniques is well defined, there are no current state-of-the-art techniques that combine emotionally driven features and the capture of user feedback on these features [69]. Emotion recognition may also be used to evaluate social media data or to spot fake news [70]. A major challenge in ED is that the cultural affiliations of an individual may significantly impact their expressed feelings in a situation [69]. However, progress is being made, as several methods have been developed to solve this problem. These methods include the use of a knowledge-enriched transformer [71], focusing on latent representation [72], and building new datasets that focus on emotions [73] and [74].
• Unimodal NLP: Current NLP systems are primarily unimodal. Thus, they are limited to the processing and analysis of linguistic inputs [75]. However, humans are multimodal. They use diverse combinations of visual, auditory, tactile, and other inputs. Humans do not handle each sensory modality in isolation but rather simultaneously. This process incorporates each sensory modality to enhance the quality of awareness and understanding [76]. Therefore, from a computational point of view, NLP needs to have these same abilities to achieve human-level grounding and understanding in a variety of AI tasks. NLP must be assisted by multimodal control interfaces, identification and understanding of human behavior, and collaborative decision-making between the system and individuals or groups to understand the requirements of the customer and other stakeholders [77]. Visual question answering is a method that addresses the challenging unimodal aspect of NLP systems [78]. Many other methods are used to integrate multimodality into NLP structures, including declarative learning-based programming [79], multimodal datasets [80], procedural reasoning networks [81], and unified attention networks [82].
• Contextual information: NLP systems still struggle to recognize requirement sentences that contain contextual information rather than merely describing the process steps [28]. The inherent ambiguity of NL can lead to differing interpretations of the same sentence [83].
• Domain ontologies approach: [43] found that requirement analysts were more likely to misidentify concepts and relationships when using a domain ontologies approach. Thus, domain ontologies need to be investigated to develop a deeper understanding of the requirements and their respective relationships [84].
• NLP accuracy in extracting the correct requirements must be improved. NLP must be enhanced by other methods (e.g., ML) to eliminate errors. Accuracy must be substantially improved if NLP is to be seriously considered for use with RE [85].
• The algorithms for detecting ambiguity need improvement and fine-tuning while simultaneously avoiding over-fitting. These improvements are needed to evaluate whether the use of domain ontologies can lead to a deeper understanding of the requirements and their relationships [86].
• PoS detection is generally considered a challenge that has been resolved. However, there are still issues with incorrect PoS tags [87].
Attempting to resolve all ambiguities in a requirements specification is a time-consuming process that cannot be fully automated. Human interaction is needed to overcome dynamic ambiguities that are dependent on domain knowledge. Controlled language is helpful in identifying or avoiding ambiguities in SRS. However, the input must be written in the controlled language, and lexical and syntactic ambiguities must still be addressed. Furthermore, methods that use knowledge-based, ML, and ontology techniques may produce precise outcomes by detecting semantic ambiguities in the requirements specifications [88], [89], [90].

VII. DISCUSSION
Our research focused on a specific field and evaluated a range of trends that are explained and summarized in this section. The objective of this research was to provide a state-of-the-art summary of NLP performed in various RE activities. This research is intended to be an overview for domain experts and to serve as an entry point for researchers in this area. We present results based on the conducted literature review. As previously discussed, the literature review was not entirely systematic. Thus, our findings may be revised and/or expanded by future studies within this domain. Table I and Section IV provide answers to RQ1 ("What is the current state of the practice for elicitation and analysis phases in RE using NLP as the platform?") and RQ2 ("What are the activities for the requirements elicitation and analysis in RE using NLP as the platform?"). Twenty-five articles (i.e., the "Contributions" column in Table I) were closely analyzed to assess the state of the art in the use of NLP in RE activities. We assume that most studies are preliminary proposals because there are more academic research papers than actual software projects for industrial applications. As listed in Table I, the "NLP Tasks" and "RE Support Activities" columns address RQ2 and describe the various RE tasks that can be assisted by NLP techniques. These RE tasks include traceability, ambiguity detection, and requirements classification. The available techniques and tools developed to support each RE task are presented (e.g., PoS tagging and tokenization). The "Contributions" column in Table I also includes a partial response to RQ3 ("Are NLP systems improving the requirements elicitation and analysis for RE?"). The answer to RQ3 appears to be preliminary, as indicated by the lack of comparison with the state of the art in Table I. Table I also includes information related to the quantity of the NLP data and to user expectations, emotions, and experiences.
This rich data set may be used by software developers to better assess the needs, experiences, and sentiments of their product users. Mining NL data, especially user opinions, can yield valuable information for product upgrades by software development organizations. However, it is often difficult to extract user requirements from massive amounts of data. Opinions are often shared without regard to grammar or style. This issue has recently caused problems with corpus processing. As a result, we assume that the collected data are unstructured. Software developers focus on user feedback for requirements elicitation and analysis. However, the trustworthiness of comments, tweets, or feedback remains a major problem for the software development community. In Section VI, we discussed a variety of limitations in this domain. Other challenges or limitations that the RE community faces by using NLP as the source for eliciting user requirements include user privacy and personalization [55]. These findings address RQ4 ("What are the current limitations of using NLP in requirements elicitation and analysis?"). Table I lists the NLP methods used in the articles reviewed in this paper. Most researchers use NLP techniques for the identification of ambiguity in RE. Furthermore, due to its wide role in RE, the analysis phase received the most attention in the research. We observed that NLP was primarily used in the preprocessing phase to convert data into a format that was consumable by all stakeholders. Most of the papers in our survey claim that the vast amount of imprecise data generated by NL users may provide tremendous benefits to software development organizations if processed with an NLP system. Most of the articles also indicate that NLP use with RE is still in its early stage; however, this research topic is rapidly expanding.
Although NLP can compensate for much of the ambiguity, inconsistency, and incompleteness in requirements, there are still circumstances where interaction with end-users is required for clarification [58]. This finding is supported by entries in the "User Interaction" column in Table I. Most of the papers about the use of NLP in requirements elicitation and analysis indicate that parsing requirement texts and classifying the information stored in them are difficult for humans. Thus, these activities should be automated as much as possible.
Furthermore, we discovered that most of the analyzed studies obtained their requirement datasets from external sources (e.g., Twitter or an app store) rather than from existing requirements elicitation documents. We found that the current state of the art in this area indicates that researchers and practitioners prioritize the first two phases (elicitation and analysis). Table I shows that most of the NLP techniques employed parsers and taggers as their text-processing tools. This finding is supported by the results from [53]. In that study, NLP techniques are applied to sentence segmentation, tokenization, PoS tagging, shallow parsing, dependency parsing, word stemming, lemmatization, and role labeling.
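To make the first steps of such a pipeline concrete, the following is a minimal sketch of sentence segmentation, tokenization, and stemming using only the Python standard library. It is not taken from [53] or from any tool in Table I; the function names, the regular expressions, and the short suffix list are illustrative assumptions, and a real system would rely on an established NLP toolkit for these steps as well as for PoS tagging and parsing.

```python
import re

# Toy suffix list for illustration only (an assumption, not a real stemmer);
# production systems would use an established stemmer or lemmatizer.
SUFFIXES = ("ing", "ed", "es", "s")


def split_sentences(text):
    """Naive sentence segmentation: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Naive tokenization: words (allowing internal hyphens/apostrophes) or single punctuation marks."""
    return re.findall(r"[A-Za-z0-9]+(?:[-'][A-Za-z0-9]+)*|[^\w\s]", sentence)


def stem(token):
    """Crude suffix stripping that keeps at least three characters of the stem."""
    lower = token.lower()
    for suffix in SUFFIXES:
        if lower.endswith(suffix) and len(lower) - len(suffix) >= 3:
            return lower[: -len(suffix)]
    return lower


def preprocess(text):
    """Segment, tokenize, drop punctuation tokens, and stem each remaining token."""
    return [
        [stem(t) for t in tokenize(s) if t[0].isalnum()]
        for s in split_sentences(text)
    ]
```

Calling `preprocess` on a short requirements text returns one list of stemmed tokens per sentence, which is the kind of normalized input that later steps such as classification or ambiguity detection would consume.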
Within the RE field, NLP is primarily used to analyze requirements and prepare them for further processing. This work is primarily focused on developing models from elicited requirements and improving the consistency of the SRS. Both of these uses rely on three common NLP tasks: PoS tagging, rule-based analysis, and syntactic parsing. Two areas that lag behind the other RE (sub-)phases in scope are requirements documentation (e.g., drafting of the SRS) and requirement prioritization. We assume that NLP can support the writing of requirements, including enforcing a specific template, spell checking, and explicitly resolving any possible ambiguities. Such writing support may be beneficial to requirements engineers. However, since these tools necessitate live contact with requirements engineers, this research is performed less regularly than research on less advanced tools (e.g., tools solely for ambiguity checking).
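As a rough illustration of what template support during requirements writing could look like, the sketch below checks a sentence against a single hypothetical template of the form "The <subject> shall <action>." The template, the regular expression, and the function name `check_template` are assumptions made for this example; they do not correspond to a specific tool discussed in the reviewed papers.

```python
import re

# Hypothetical template (an assumption for illustration):
#   "<The|A|An> <subject> <shall|must|should> <action>."
TEMPLATE = re.compile(
    r"^(?:The|A|An)\s+(?P<subject>[\w\s-]+?)\s+"
    r"(?P<modal>shall|must|should)\s+(?P<action>.+)\.$"
)


def check_template(requirement):
    """Return the parsed parts if the sentence fits the template, otherwise None."""
    match = TEMPLATE.match(requirement.strip())
    return match.groupdict() if match else None
```

A writing assistant could call `check_template` as the engineer types and prompt for a rewrite whenever it returns `None`, which is the kind of live interaction with requirements engineers noted above.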
Classification, model extraction, and detection appear to be more advanced than other areas of research, based on the number of publications and reported NLP tasks. This assumption is also valid for ambiguity detection.
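The simplest form of ambiguity checking can be sketched as a lexicon lookup that flags vague terms in a requirement. The word list and the function name `flag_vague_terms` below are illustrative assumptions rather than a lexicon from the reviewed studies, and this approach only captures lexical vagueness; it does not address the syntactic, semantic, or pragmatic ambiguities that more advanced tools target.

```python
import re

# Illustrative lexicon only (an assumption, not drawn from the reviewed papers).
VAGUE_TERMS = {
    "fast", "easy", "user-friendly", "appropriate", "adequate",
    "flexible", "etc", "some", "several",
}


def flag_vague_terms(requirement):
    """Return the vague terms found in a requirement, in order of appearance."""
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", requirement.lower())
    return [t for t in tokens if t in VAGUE_TERMS]
```

A requirement for which `flag_vague_terms` returns a non-empty list would be passed back to an analyst or end-user for clarification, consistent with the need for human involvement discussed above.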

VIII. CONCLUSION
This paper reviewed the current status of NLP use and its limitations in the requirements elicitation and analysis processes. With the need for faster speed, lower cost, and higher quality in software engineering, there is an increasing need for automated support for processing RE artifacts throughout elicitation and analysis. While the pressure from industrial customers is obvious, extensive work is still needed to create automated NLP-based processes in RE. NLP tools and techniques have been proposed to automatically or semi-automatically detect syntactic, semantic, and pragmatic ambiguities in requirements. Many solutions have been proposed by academia and industry to advance the use of NLP in RE. Despite this progress, NLP systems still have limitations. There is extreme pressure from industry to improve the accuracy of NLP used to extract system requirements. Therefore, NLP must be enhanced to eliminate residual errors, and its accuracy must be substantially increased if it is to be seriously considered for use in RE. Based on the results of this review, and because of these limitations, NLP systems cannot be considered a solution that fixes all RE issues. Nevertheless, NLP can be used to assist RE analysts. Findings from this study indicate that NLP can be used in real-world applications. Future research may produce more specialized NLP tools that can help consolidate the domain model and serve as translators between different RE documents and structured models.
These findings provide an understanding of the state of the art in this field of study and are useful for developing an analytical framework for complete systematic literature reviews. From the standpoint of RE, it may be important to investigate how combinations of NLP tasks can be streamlined to fully perform additional tasks. The same is true for NLP in requirements management, where requirements are managed within a software system from elicitation to implementation to reuse. The NLP tasks described in this paper are not a comprehensive list. A possible extension of this research may examine NLP activities that have not yet been used in the field and how they may be used in future applications.