Question Answering from an Answer Retrieval Point of View: A Survey

Arabic Question Answering (QA) is gaining importance because of the significance of the language and the dramatic increase in online Arabic content. The goal of this article is to review state-of-the-art Arabic QA methods, to classify them into categories from an answer retrieval viewpoint, and to present their applications, issues and new trends. The main components of question answering systems are also presented. Finally, this survey provides a comparative study of the systems of each type of QA based on several criteria.
Keywords—Question answering; Information retrieval; Answer retrieval; Arabic NLP


I. INTRODUCTION
Nowadays, the Internet has become the main source of information, and search is a daily activity for many people throughout the world. The need to retrieve information relevant to a request has become increasingly important, and it has become necessary to find useful and accurate information within large amounts of data. Current Information Retrieval (IR) techniques allow a user to retrieve only the relevant documents that match a given query; users must then look for the information they need within those documents. The possibility of obtaining a brief and accurate answer instead has therefore motivated interest in question answering (QA) systems.
QA is a specialized and sophisticated form of information retrieval. It was created to automatically satisfy a specific information need expressed by users who are looking for an answer to a natural language question [1]. QA systems are composed of three main subtasks: question analysis, passage retrieval, and answer extraction. Most QA systems follow these tasks but may differ in how each subtask is implemented.
Arabic is the sixth most important language in the world, with more than 300 million speakers [2]. Moreover, it is the language of the Quran, so it receives great attention from Muslims all over the world, which means that about 1 billion people may be interested in it. Arabic QA is gaining more and more attention due to the dramatic increase of Arabic content on the Internet.
QA systems are classified into two main categories. The first is open-domain QA, which deals with questions about nearly everything. The second is closed-domain QA, which deals with questions in a specific domain (Fatwa, weather forecasting, medical applications, etc.) [3].
From an answer retrieval viewpoint, QA can be divided into two subcategories. The first is QA by extracting an answer from unstructured documents such as web pages, or by generating it, where the answer is drawn from multiple sentences or multiple documents [4]. The second is QA based on Frequently Asked Questions (FAQ). In this type, a user query is matched against existing questions that are stored with their answers in a database, in order to retrieve the closest possible answer to the given question.
Few surveys investigating Arabic QA have been published, such as [1], [5]. Unlike those surveys, this survey presents a new taxonomy for QA that covers both types of QA (QA where the answer is generated and QA based on FAQ). Moreover, it provides a comparative study of the systems of each type of QA based on several criteria, as shown in the following sections.
A. Challenges of Arabic QA
Designing a QA system for the Arabic language is a considerable challenge because of the nature of the language. Some characteristics of Arabic slow down progress in Arabic Natural Language Processing (NLP) [2], [6], [7]:
• Arabic has a very complex morphology (both inflectional and derivational).
• The absence of diacritics in written text creates many ambiguity problems for question analysis and answer extraction.
• The absence of capitalization causes problems in Named Entity Recognition (NER).
The major challenges faced by Arabic QA are:
• The lack of access to Arabic linguistic resources such as WordNet.
• The lack of basic NLP tools (tokenizers, morphological analyzers, information extraction tools).
The rest of this paper is organized as follows: Section 2 introduces the state of the art of both types of Arabic QA. Section 3 presents the applications of Arabic QA. Section 4 discusses some issues and expected future trends of Arabic QA. Finally, the conclusion is given in Section 5.
www.ijacsa.thesai.org

II. THE STATE OF THE ART
As mentioned above, QA from an answer retrieval perspective can be divided into two subcategories: QA based on FAQ and QA where the answer is generated from raw text. The state of the art of the two types is reviewed in the following sections.

A. QA where answer is generated from raw text
In this type, the answer is generated and formulated from unstructured documents such as web pages. The answer is drawn from a single document or from multiple documents. This section discusses the general architecture of QA and the approaches used to implement each task in this architecture.
1) The general architecture of a typical question answering system: A typical QA system consists of three distinct modules: question analysis, passage retrieval, and answer extraction. Figure 1 shows the general architecture of a typical QA system.
Fig. 1: Generic architecture of a typical question answering system
Question Analysis: Question analysis is the first module in QA. It identifies the aim of the question, classifies the question type, derives the expected answer type and performs query expansion. Moreover, the question analysis module determines the named entities that appear in the question [8]. Question analysis can be considered the most important step in a QA system, since a correct classification of the question limits the candidate answers to be considered.
In AQuASys [9], the question analysis module identifies the type of the expected answer by defining a number of rules for identifying question types and recognizing their expected answers. Additionally, it classifies the question words into three groups: interrogative noun, question verb and question keywords. Additional keywords are then generated and added to the question keywords. Finally, the system performs stemming on the words of the question and the documents. Kanaan et al. [10] use a set of Natural Language Processing (NLP) tools in this module, which tokenize and tag the text, identify some features of the tokens and identify proper names in the question. In DefArabicQA [11], question analysis is performed by identifying the topic of the question and determining the expected answer type. The question topic is identified using two lexical question patterns: (Who+be+<topic>) and (What+be+<topic>). The expected answer type is deduced from the interrogative pronoun of the question.
In QARAB [6], question analysis is achieved by performing tokenization and stop-word removal. The remaining words are then tagged for part of speech. This module also includes the following tasks: identifying proper names, identifying the type of the expected answer, applying query expansion to achieve better results, classifying the question and tagging the question keywords for part of speech.
In ArabicQA [12], the question analysis module determines the type of the given question, the question keywords and the named entities in the question. For example, if the question is (When did Sudan become independent?), the question type is Time, the question keyword is (Become independent: Verb) and the named entity is (Sudan). In this work the authors used an NER system that they had built themselves.
In JAWEB [13], the question analysis module identifies the type of the question. For example, if the question begins with the Arabic word for "where", the question type is location. In addition, the module has five sub-modules: tokenizer, answer-type detector, question keyword extractor, extra keywords generator and question words stemmer.
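The rule-based mapping from a leading interrogative word to an expected answer type, as described for several of the systems above, can be sketched as follows. This is a minimal illustration only: the lookup table, the answer-type labels and the function name detect_answer_type are assumptions made for this example and do not reproduce the rule set of any surveyed system.

```python
# Hypothetical lookup table: leading interrogative -> expected answer type.
# The entries are illustrative and are not taken from any surveyed system.
ANSWER_TYPE_BY_INTERROGATIVE = {
    "أين": "LOCATION",  # where
    "متى": "TIME",      # when
    "من": "PERSON",     # who
    "كم": "NUMBER",     # how many / how much
}


def detect_answer_type(question: str) -> str:
    """Return the expected answer type based on the first token of the question."""
    tokens = question.strip().split()
    if not tokens:
        return "UNKNOWN"
    return ANSWER_TYPE_BY_INTERROGATIVE.get(tokens[0], "UNKNOWN")
```

A real question analysis module would also handle interrogatives that do not appear in the first position and would combine this step with keyword extraction and stemming.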
The question analysis module in QASAL [14] starts by applying a set of linguistic resources in NooJ 1 to the given question in order to analyze and annotate it. Then, NooJ's graph editor is used to build local grammars. These grammars translate each question into one or more regular expressions, which helps to represent the pattern of the answer corresponding to the question. The question analysis module thus generates all the regular expressions for candidate answer patterns.
In the Yes/No Arabic QA system [15], the question analysis module performs the following tasks: removing the question mark, removing the interrogative particle, tokenizing, normalizing the (Alef) letter, removing stop words and removing negation particles. In addition, it applies tagging to determine the type of each word and obtain its root. This module also applies parsing and query expansion; the query expansion retrieves a list of synonyms and antonyms. Finally, the system represents the question using a logical representation. The authors created 12 logical representations for nominal and verbal sentences, for both affirmative and negated questions. Figure 2 shows an example of the logical representation of a verbal sentence.
Fig. 2: An example of logical representation by [15]
Passage Retrieval: Passage retrieval is considered the core of a QA system. In this module, the passages of documents that are relevant to the answer are retrieved. Usually, a distance between the question and the documents is calculated. Based on this distance and on weighting schemes such as tf-idf [16], document retrieval systems supply a set of ranked documents. Most QA systems are based on IR methods that have been adapted to work on passages instead of whole documents [8].
1 NooJ is an NLP environment. For more information: http://www.nooj4nlp.net
Kanaan et al. [10] implement this module with an information retrieval system that uses Salton's vector space model to measure the similarity between the query and the document. They use a database to store the information related to each word, the query weights and the query similarity. Then, a ranked list of documents that may contain the answer is generated. In AQuASys [9], by contrast, the passage retrieval module filters sentences based on the number of question keywords they contain.
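Ranking passages with a vector space model and tf-idf weights can be illustrated with the short sketch below. It is not the implementation of any surveyed system: the function names, the smoothed idf formula and the assumption that passages are already tokenized (and, for Arabic, stemmed) are choices made for this example.

```python
import math
from collections import Counter


def tf_idf_vectors(tokenized_docs):
    """Build one sparse tf-idf vector (dict: term -> weight) per document."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))
    # Smoothed idf (an assumption of this sketch) keeps every weight positive.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = []
    for tokens in tokenized_docs:
        tf = Counter(tokens)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_passages(query_tokens, tokenized_passages):
    """Return passage indices ordered from most to least similar to the query."""
    vectors, idf = tf_idf_vectors(tokenized_passages)
    # Query terms that never occur in the collection are ignored.
    q_tf = Counter(t for t in query_tokens if t in idf)
    q_vec = {t: count * idf[t] for t, count in q_tf.items()}
    scored = [(cosine(q_vec, vec), i) for i, vec in enumerate(vectors)]
    return [i for _, i in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]
```

Ties are broken by the original passage order, which keeps the ranking deterministic.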
In DefArabicQA [11], this module returns the top-n snippets retrieved by a Web search engine. The system then performs a definition extraction task, in which it identifies and extracts candidate definitions from the snippets using lexical patterns. Finally, the candidate definitions are filtered using heuristic rules. The passage retrieval system in QARAB [6] is based on Salton's vector space model. First, an inverted file system is constructed from the text collection. Then, a relational database management system (RDBMS) is used to hold and retrieve the passages.
In ArabicQA [12], the passage retrieval module (JIRS) retrieves the passages that are most likely to contain the answer. JIRS relies on an n-gram model. This module applies three steps. First, it searches for the relevant passages and assigns a weight to each of them. Second, it extracts the necessary n-grams from each passage. Finally, it compares the question n-grams and the passage n-grams using the density distance model.
In JAWEB [13], the passage retrieval module retrieves candidate answers from the corpus by retrieving a set of passages. These passages contain sentences with a pattern that matches words from the question keywords and the extra keywords.
In QASAL [14], the passage retrieval module first selects one or more of the regular expressions generated in the previous module. It then applies those expressions to the answer text in order to identify potential answers. Finally, the detected answers are displayed in a concordance table to be used later.
In the Yes/No Arabic QA system [15], the passage retrieval module applies two levels of IR techniques: the first operates on documents and the second on paragraphs. The paragraph technique splits the documents into paragraphs and retrieves the top 5 paragraphs, regardless of which document they come from, according to some indexing scheme. The document technique, in turn, retrieves the top 5 documents after ranking them, and then uses the first indexing scheme to retrieve the top 5 paragraphs.
Answer Extraction: Answer extraction is the final module in most QA systems. This module is what distinguishes QA systems from text retrieval systems in the usual sense [3].
In the answer extraction module, the final answer to the question is extracted from the passages or documents retrieved in the previous module. The search in this module uses information obtained in the first module (question analysis), including the focus and the target of the question. The answer extraction module extracts a ranked list of relevant answers and finally returns the most probable one(s) [8].
In AQuASys [9], the answer extraction module uses sophisticated scoring formulas based on a large number of parameters. The aim of these formulas is to calculate the similarity between the given question and the candidate sentences. Finally, the answer with the highest score is selected.
Kanaan et al. [10] choose the most appropriate document according to the similarity values calculated by the IR system; the system then generates the answer. Trigui et al., in DefArabicQA [11], instead rank the candidate definitions using a statistical approach. They use a global score that combines three criteria: pattern weight, snippet position and word frequency. Finally, the top 5 candidate definitions are ranked according to their global scores.
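A ranking by a global score that combines several criteria can be sketched as follows. The survey only states that three criteria (pattern weight, snippet position and word frequency) are combined; the weighted sum, the equal default weights, the assumption that each criterion is already normalized to [0, 1], and all names below are hypothetical choices made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class CandidateDefinition:
    text: str
    pattern_weight: float    # assumed normalized to [0, 1]
    snippet_position: float  # assumed normalized to [0, 1], higher = earlier snippet
    word_frequency: float    # assumed normalized to [0, 1]


def global_score(candidate, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three criteria (the combination rule is an assumption)."""
    w_pattern, w_position, w_frequency = weights
    return (
        w_pattern * candidate.pattern_weight
        + w_position * candidate.snippet_position
        + w_frequency * candidate.word_frequency
    )


def top_definitions(candidates, k=5, weights=(1.0, 1.0, 1.0)):
    """Return the k candidates with the highest global score, best first."""
    ranked = sorted(candidates, key=lambda c: global_score(c, weights), reverse=True)
    return ranked[:k]
```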
In QARAB [6], the input to the answer extraction module consists of the question words and the top-ranked relevant documents. The authors assume that the best answer usually includes most of the words that appear in the query. Therefore, the system retrieves the answer that contains most of the words appearing in the question, in addition to the proper nouns that should appear in the final answer.
To extract a list of candidate answers from the relevant passages in ArabicQA [12], this module takes into consideration the type of the expected answer and performs the following steps. First, it tags all named entities within the relevant passage. Second, it performs a pre-selection of the candidate answers. Finally, it decides the final list of candidate answers by means of a set of patterns. An additional module, the answer validation module, is applied in this system; it estimates the probability of correctness of each candidate answer and ranks the candidates.
In JAWEB [13], the answer extraction module consists of three sub-modules: answer keywords stemmer, answer similarity checker and answers ranker. The answer keywords stemmer returns the roots of the keywords in the retrieved answer. The answer similarity checker measures similarity by counting the number of matching keywords between the question and the retrieved answer. The answers ranker sorts the answers according to their relevance and returns the top relevant answer.
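Counting matching keywords and sorting candidates by that count, as in the similarity checker and ranker just described, can be sketched as below. The function names and the data layout are assumptions for this example, and stemming is assumed to have been applied to both keyword lists beforehand.

```python
def keyword_overlap(question_keywords, answer_keywords):
    """Number of distinct keywords shared by the question and a candidate answer."""
    return len(set(question_keywords) & set(answer_keywords))


def rank_answers(question_keywords, candidates):
    """Sort candidate answers by descending keyword overlap.

    candidates is a list of (answer_text, answer_keywords) pairs; candidates
    with the same overlap keep their original order because sorted() is stable.
    """
    ranked = sorted(
        candidates,
        key=lambda pair: keyword_overlap(question_keywords, pair[1]),
        reverse=True,
    )
    return [text for text, _ in ranked]
```

The first element of the returned list plays the role of the top relevant answer.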
In QASAL [14], the previous module generates a concordance table of the potential answers. The answer extraction module uses this concordance table to automatically extract the answer to the given question.
In the Yes/No Arabic QA system [15], the answer extraction module applies the following steps. It splits the paragraphs into their sentences. The module focuses on the topic when dealing with nominal sentences and on the subject when dealing with verbal sentences. Then, the module looks for the remaining terms derived from the question in the logical representation and assigns indexes to them according to their position in the sentence, so that each sentence has its own rank. After that, it looks for negation particles in the selected answer. Finally, it uses the selected answer and the logical representation of the question to generate yes or no as follows: it returns "Yes" if the question and the answer are both affirmative or both negated, and "No" if the question is affirmative and the answer is negated, or the question is negated and the answer is affirmative.
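The final decision rule amounts to comparing the polarity of the question with the polarity of the selected answer. A minimal sketch is given below; the small set of negation particles and the function names are assumptions for illustration, and the logical representation used in [15] is not modeled here.

```python
# Hypothetical, deliberately small set of Arabic negation particles.
NEGATION_PARTICLES = {"لا", "لم", "لن", "ليس"}


def is_negated(tokens):
    """Return True if any token is one of the assumed negation particles."""
    return any(token in NEGATION_PARTICLES for token in tokens)


def yes_no_answer(question_is_negated, answer_is_negated):
    """'Yes' when question and answer share the same polarity, 'No' otherwise."""
    return "Yes" if question_is_negated == answer_is_negated else "No"
```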
A comparative study of these eight systems is presented in Table I.

B. QA based on FAQ
The previous section gave an overview of approaches that retrieve the documents containing the answer and then extract and formulate this answer. This section sheds some light on approaches that depend on FAQ. In these approaches there is a bank of questions and associated answers; the system receives a user query, looks for the most appropriate answer(s) and retrieves them.
According to the prediction of Boris Katz et al. [17], the next generation of search engines will be based on question answering, in which users receive explicit answers extracted from documents in response to their natural language queries, instead of the current engines, which retrieve only the relevant pages.
A tremendous number of questions and associated answers are available online, and one can get an answer to almost any question from previously answered ones. Therefore, this type of question answering system is one of the important areas of research in IR. Several research studies have been conducted and reported in the literature to facilitate this type of IR. Unfortunately, the majority of these studies were not oriented to the Arabic language. However, in the past few years several researchers have tried to address this problem in Arabic. Two types of research can be distinguished for this task. The first and most common is called answer selection, and the second is answer extraction based on FAQ.
The former task is concerned with identifying pertinent answers from a pool of user-generated comments related to a question. This means that all answers in the pool are related to the question to different degrees. Moreover, the pool of candidate answers is very small. Although this type has received significant interest in several studies ([18], [19] in CLIF 2012, [20] in CLIF 2013, and [21]-[23]), most real-life QA-based-on-FAQ applications do not work like that. Instead, they look for an answer in a huge flat collection of thousands, tens of thousands or maybe millions of question-answer pairs. Several studies that address this problem using different technologies are described below:
• Using Textual Case-based Reasoning in Intelligent Fatawa QA System [24]
Elhelwany et al. proposed an Arabic Fatwa intelligent system based on textual case-based reasoning, which was first used in [25]. In their system, they start by extracting a representative term for each cluster, later called the cluster attractor. Then, the cases are clustered around these attractors. Finally, they use the Jensen-Shannon divergence to assign a newly posed question to its appropriate cluster and, subsequently, to find the closest possible question among the questions in that cluster.
• Enhancements to knowledge discovery framework of SOPHIA textual case-based reasoning [26]
In [26], the authors of the previous system enhanced their earlier study by adding an intermediate tier. This tier is created by manually clustering the dataset into several groups, so that SOPHIA is applied to each group separately; according to the authors, this is what produces the improvement.
• A Case Based Tool As Intelligent Assistance To Mufti [27]
Nabila Nouaouria et al. [27] designed El Bayane. In their system, the authors start by representing cases manually in the structure <product features, exceptions, product type>, in hierarchical order (most general to most specific). Their system is restricted to the field of drinking and smoking in Islamic legislation. To answer a new question, the system requests the selection of specific predefined parameters as a representation of the question: <question type, action, product name, product type, features, exceptions>. Finally, it looks for the most similar cases and returns the associated answer. This system is highly domain-dependent: it is limited to a subdomain of drinking legislation in Islamic fatwa and to factoid questions, requires exhaustive manual work, and is designed for situations where all cases have similarly structured content.
• Intelligent Tool for Mufti Assistance [28]
In [28], Amari et al. start by organizing already answered cases in a case memory, later referred to as the case base. They represent a case using two dimensions: the case description <action type, product name, product type, features, exceptions> and the case solutions.
As reported, they use a new way of representing cases, based on constructing a problem neighborhood, to ease the later retrieval of cases. In the test phase they introduce a system of five modules to retrieve the cases similar to a query: neighborhood computation, associative access, adaptation, validation and storage. Like the previous system, this system constrains the user to formulate predefined questions and requires exhaustive manual work to build the case base. Unfortunately, none of the previously mentioned works reported the evaluation approach used or any results.
• Answer Extraction System Based on Latent Dirichlet Allocation [29]
In [29], Ali et al. proposed a new system based on Latent Dirichlet Allocation (LDA) [30] and word-to-vector space word representations [31]. In their work, the authors first cluster the cases (documents containing questions and their associated answers) into similar thematic groups. Then, to reply to a new query, they assign the query to the appropriate cluster and subsequently retrieve the most suitable answer to the question. As reported, they achieved an accuracy of 83.6%.
TABLE I (rows for [13], [14] and [15])
[13]. Aim: provide short answers for Arabic natural language questions. Domain: closed domain. Dataset: an extended version of the Arabic corpus developed by [9]. Results: the system provided 15-20% higher recall. Limitations: the system focused only on factoid questions.
An Arabic Question-Answering system for factoid questions [14]. Aim: provide short answers for Arabic natural language questions. Domain: closed domain. Dataset: a collection of Arabic text documents containing factoid questions as well as their different answers. Results: not presented. Limitations: the system focused only on factoid questions.
Development of Yes/No Arabic Question Answering System [15]. Aim: design a formal model for a semantic-based yes/no Arabic question answering system based on paragraph retrieval. Domain: open domain. Dataset: 20 documents used to test the system and a collection of 100 different yes/no questions. Results: 85% with the document technique and 88% with the paragraph technique, when 20 documents are used. Limitations: the system focused only on yes/no questions, and the corpus is small (20 documents).
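The cluster assignment step described above for the textual case-based reasoning system [24] relies on the Jensen-Shannon divergence between term distributions. The sketch below shows that computation in isolation. Representing each cluster by the bag of terms of its cases, the base-2 logarithm and the function names are assumptions of this example; the attractor-based representation of [24] is not reproduced.

```python
import math
from collections import Counter


def term_distribution(tokens, vocabulary):
    """Relative frequency of each vocabulary term in tokens (uniform if none occur)."""
    counts = Counter(tokens)
    total = sum(counts[t] for t in vocabulary)
    if total == 0:
        n = len(vocabulary)
        return [1.0 / n] * n if n else []
    return [counts[t] / total for t in vocabulary]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL divergence to the midpoint distribution."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def assign_cluster(question_tokens, clusters):
    """Return the name of the cluster whose term distribution is closest to the question.

    clusters maps a cluster name to the list of tokens found in its cases.
    """
    vocabulary = sorted(set(question_tokens).union(*clusters.values()))
    q = term_distribution(question_tokens, vocabulary)
    return min(
        clusters,
        key=lambda name: js_divergence(q, term_distribution(clusters[name], vocabulary)),
    )
```

With base-2 logarithms the divergence lies between 0 (identical distributions) and 1 (disjoint supports), which makes scores comparable across clusters.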
A comparative study of these five systems is presented in Table II.

III. APPLICATIONS OF ARABIC QUESTION ANSWERING
Question answering has many applications. Arabic is the sixth most important language in the world, with more than 300 million speakers [2]. Due to the dramatic increase of Arabic content on the Internet, the growing number of Arab Internet users, and the increasing demand for information that traditional information retrieval methods cannot satisfy, there is an inevitable need for an effective information retrieval system.
Distance education is also gaining a lot of attention and has become a popular research topic. Wherever and whenever the teacher and student are, communication between students and teachers is very important, yet face-to-face communication is not possible. A question answering system based on FAQ is a solution in this case: the student asks a question and the answer is retrieved if the question-answer pair is already in the database; otherwise, the question is answered by the teacher later and saved in the corpus as well. Arabic question-answer pairs are also available on many Arabic websites, such as Islamic Fatwa websites, Arabic medical websites and distance education systems.
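The distance education workflow described above (answer from the stored pairs when possible, otherwise defer to the teacher and store the new pair) can be sketched as follows. The class name FaqStore, its methods and the exact-match lookup after whitespace and case normalization are simplifying assumptions; a practical system would match questions by similarity rather than by exact text.

```python
class FaqStore:
    """Hypothetical in-memory store of question-answer pairs."""

    def __init__(self):
        self.answers = {}  # normalized question -> answer
        self.pending = []  # normalized questions waiting for a teacher

    @staticmethod
    def _normalize(question):
        """Lowercase the question and collapse runs of whitespace."""
        return " ".join(question.lower().split())

    def ask(self, question):
        """Return the stored answer, or None after queuing the question for a teacher."""
        key = self._normalize(question)
        if key in self.answers:
            return self.answers[key]
        if key not in self.pending:
            self.pending.append(key)
        return None

    def teacher_answers(self, question, answer):
        """Save the teacher's answer so that later identical questions are answered directly."""
        key = self._normalize(question)
        self.answers[key] = answer
        if key in self.pending:
            self.pending.remove(key)
```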

IV. ISSUES AND FUTURE TRENDS OF ARABIC QUESTION ANSWERING
According to the prediction of Boris Katz et al. [17], the next generation of search engines will be based on question answering, in which users receive explicit answers extracted from documents in response to their natural language queries, instead of the current engines, which retrieve only the relevant pages. Moreover, we will have to deal with queries posed by voice. This means that we will need more sophisticated speech recognition algorithms and techniques to deal with the different Arabic dialects. Accordingly, handling spoken user queries in different accents will pose further challenges. We recommend [32] and [33] for further information on speech-based question answering.
Additionally, a tremendous amount of information is available in the form of video, images, maps and sounds. The next generation of question answering needs new tools to deal with this multimedia content. These tools have to organize, search, understand and extract answers from these diverse representations of information.
In spite of this promising future, QA still has several limitations and issues that could be addressed in the future. Some of these issues are due to the challenges of the Arabic language mentioned in Section 1. Others are associated with the systems themselves. Information retrieval models, for instance, still fail to return an appropriate set of answers at an acceptable level of precision and recall. Current systems cope with this problem by performing query expansion for queries that return too many candidate answers, or by removing some words from queries that return too few candidate answers [34]. Another problem is that question answering systems are still limited in their ability to capture the semantic content of queries and answers.
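The second coping strategy mentioned above, removing words from a query that returns too few candidates, can be illustrated with a short sketch. The conjunctive matching rule, the assumption that keywords are ordered from most to least important, and the function names are hypothetical and do not describe the method of [34].

```python
def retrieve(keywords, passages):
    """Return the passages (token lists) that contain every keyword."""
    return [p for p in passages if all(k in p for k in keywords)]


def relax_query(keywords, passages, min_candidates=1):
    """Drop trailing keywords until at least min_candidates passages match.

    Assumes keywords are ordered from most to least important, so the least
    important keyword is removed first. Returns (remaining_keywords, hits).
    """
    current = list(keywords)
    while current:
        hits = retrieve(current, passages)
        if len(hits) >= min_candidates:
            return current, hits
        current.pop()
    return [], []
```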

V. CONCLUSION
In this article, an extensive survey of Arabic QA is presented and a new taxonomy for QA is introduced. This taxonomy is based on an answer retrieval perspective. QA is categorized into two groups: QA where the answer is generated and QA based on frequently asked questions. Note that both groups involve retrieving the answer to a newly posed natural language question. From the analysis of the relevant literature, we have organized and compared the studies of the first group based on the three tasks of the general structure of a typical QA system (question analysis, passage retrieval and answer extraction). In this survey, the studies have been investigated and compared based on several criteria, such as the aim, domain, datasets and performance results of the systems.
Despite the efforts that have been made in the field of Arabic QA, there are still some issues and limitations that have not yet been addressed.Some of these important issues have been presented above.Finally, applications and the expected future directions of Arabic QA have been discussed.

TABLE I: Comparative study of eight QA systems when the answer is generated

TABLE II: Comparative study of five QA-based-on-FAQ systems