Arabic Dialogue Processing and Act Classification using Support Vector Machine

Abstract: Text classification is the task of grouping documents into classes according to their content. Given the vast amount of textual material available online, this task is becoming increasingly important. The primary challenge in text categorization is improving classification accuracy. The task is receiving growing attention because of its importance in building dialogue systems and in processing Arabic dialogue. This research addresses dialogue act classification: it concentrates on classifying the words used in a dialogue into act types such as greeting, farewell, thanks, confirmation, and apology. The words are used without context. The proposed approach recovers the properties of function words by replacing collocations with standard number tokens and each substantive keyword with a numerical approximation token. Classification is performed with a linear support vector machine (SVM), and its accuracy is compared against that of alternative algorithms. The study covers Arabic dialogue act corpora, annotation schemas, and classification problems, and describes the outcomes of contemporary approaches to classifying Arabic dialogue acts. A custom database covering the banking, chat, and airline ticket domains is used to assess the effectiveness of the proposed solution. The linear SVM approach produced the best results.


I. INTRODUCTION
Arabic is regarded as a challenging language, both for non-native speakers and in the field of automated language processing. Recognized dialogue acts, which provide accurate information about when the user asks questions or remains silent, are typically used as input to the Dialogue Manager component to help the system decide what to do next. The Dialogue Act taxonomy varies with the domain of the dialogue system [1]. A variety of techniques have been employed to identify speech acts in dialogues, ranging from rule-based approaches to machine learning and deep learning techniques. To overcome the lack of appropriate training data, novel language interpretation methods have been created [2].
Artificial intelligence has rarely been applied to Arabic. Several computer scientists and researchers are engaging with the humanities in order to create computer programs that accurately scan Arabic writings and convert them into digital form, because Arabic and artificial intelligence are closely related fields of study. AI was affected by the introduction of sophisticated methods for representing the semantic meaning of texts, such as InferSent, ELMo, word2vec, and USE [3]. In text extraction and processing, conventional machine learning (ML) techniques are typically employed. These include Naïve Bayes, Stochastic Gradient Descent (SGD), and SVM with a linear kernel, and they have been used to address the semantic textual similarity (STS) problem. Several features are extracted using statistical, string-based, character-based, and distance-based similarity metrics [4]. The different realizations of a specific speech sound have typically been represented by statistical methods such as Gaussian mixture models [5]. Because these methods rest on a rich mathematical foundation, they can be employed in a variety of tasks.
Dialogue acts (DAs) are the simplest units of linguistic communication; they convey the intention of the speaker. Automated dialogue act identification is a crucial component of many tasks, including topic detection, machine translation, summarization, dialogue systems, and the interpretation of human conversation. Segmentation and annotation are the two key subtasks in dialogue act detection [6]. The two can be completed separately, with segmentation first and annotation second, or joined in a single stage. The annotation task is crucial for recognizing dialogue acts: a label describing the user's intention is assigned to every segmented utterance in the dialogue. ML techniques have been tested in studies to detect new DAs. Linear methods and vector-based methods are examples of supervised modeling techniques that are often employed [7]. In speech recognition, limited vocabularies are used to achieve high recognition rates. In practice, this outcome is considered adequate for the majority of voice activation devices, and both the learning time and the error rate are reduced as a result. However, the recognition rate is variable and is influenced by the lexicon and the language. The creation of advanced automatic speech recognition (ASR) systems with higher recognition rates has therefore become a topic of interest for researchers in the field, and a number of classification and parametric techniques have emerged for the task. Various classification techniques, including Gaussian mixture models, decision trees, artificial neural networks (ANN), K-Nearest Neighbor, Dynamic Time Warping (DTW), and hidden Markov models (HMM), have been used in a number of publications for isolated-word recognition systems [8]. However, these techniques for recognizing spoken words are inefficient and time-consuming.
Because of the considerable advances made in this area, Automatic Speech Recognition (ASR) is now widely used. ASR is a voice recognition technology designed to help individuals communicate better with one another, and it makes it possible to overcome the challenges posed by the world's various dialects. As a result, a number of strategies have been developed in the ASR field. The Support Vector Machine (SVM) is an effective method for dialogue processing [9]. Because it has strong predictive and discriminative capacity, the SVM is used as an estimator of posterior probabilities to improve the performance of identification systems. SVMs are also founded on structural risk minimization (SRM), where the goal is to build a classifier that minimizes a bound on the expected risk rather than the empirical risk [5]. The SVM as a discriminative classifier has several practical applications. SVM classification is a machine learning algorithm for binary categorization [8]; for real-life multiclass tasks, the decision-making part of the system must be changed.
SVM was originally intended for situations in which the data has only two classes; in other words, it solves binary classification problems. The goal of the multiclass SVM problem is to label instances with labels drawn from a finite set of several elements. The conventional way to use SVM for this problem is to reduce the single multiclass problem to multiple binary classification problems. The most widely used strategy in practice is to build one-versus-all classifiers and select the class whose classifier assigns the test example the largest margin [10]. The presented method first transforms the reader's sound waves into Mel-Frequency Cepstral Coefficient (MFCC) features and then generates a feature vector matrix. Portions of the extracted features are used to train the modified SVM-based technique. The trained SVM performed very well when tested twice with the remaining portion of the collected data. The suggested SVM-based technique was examined on real-world data, and the tests gave very good results. Using the identical datasets of gathered waveforms, the outcomes of the presented SVM-based technique were compared with those of other methods. The developed SVM-based method surpasses competing methods in terms of accuracy.
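The one-versus-all decision rule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class labels, weight vectors, and biases in `MODELS` are hypothetical, hand-picked values, and each binary classifier is assumed to be linear.

```python
from typing import Dict, List, Sequence, Tuple

LinearModel = Tuple[List[float], float]  # (weights, bias) of one class-vs-rest classifier


def decision_value(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Signed score w.x + b of one binary (class-vs-rest) linear classifier."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict_one_vs_rest(models: Dict[str, LinearModel], x: Sequence[float]) -> str:
    """Return the label whose binary classifier gives the largest score."""
    return max(models, key=lambda label: decision_value(*models[label], x))


# Hypothetical parameters for three dialogue-act classes (illustration only).
MODELS: Dict[str, LinearModel] = {
    "greeting": ([1.0, 0.0, 0.0], -0.5),
    "thanking": ([0.0, 1.0, 0.0], -0.5),
    "apology": ([0.0, 0.0, 1.0], -0.5),
}
```

With one binary classifier per class, the number of models grows with the number of dialogue acts, which is the cost the hierarchical scheme in the methodology tries to avoid.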
Background: The situation is complicated, nevertheless, by the degraded language used in speeches delivered by professionals. As a result, it becomes vital to create new tools to aid in the comprehension of flawed Arabic discourse, so that ill-formed speech can be corrected automatically [11]. Arabic has several varieties. Modern Standard Arabic (MSA) is used in newspapers, books, broadcasts, documentaries, and formal situations, or when the reader or audience is from a different Arabic-speaking nation. Dialectal Arabic (DA) is the variety of everyday life. Both are still in use today, whereas Ancient Arabic is no longer in use [12]. MSA is used in formal situations such as journalism, presentations, and courtrooms, and is widely taught in schools and colleges. All Arabic-speaking countries acknowledge MSA as their primary language. Additionally, people frequently use their own dialects in everyday conversation. These dialects are taught in schools, but there is no organized written record of them. Arabic dialects differ markedly from MSA in morphology, vocabulary, phonology, and syntax. Dialectal Arabic is the language that is naturally spoken in daily life. It varies from one country to the next, and variation can also be seen within a country [13]. The user's questions and silences are often used as input to the Dialogue Manager component to help the system decide what to do next. Recognized dialogue acts give accurate information about when the user asks questions and when the user remains silent.
The remainder of this paper is organized as follows. Section II presents the related works and provides a comprehensive analysis of them. Section III describes the problem statement. Section IV gives a thorough account of the proposed method for act classification. Section V presents, examines, and assesses the experimental results, along with a comparison. Finally, the paper is concluded.

II. RELATED WORKS
The paper [14] used a deep learning (DL) based method to show whether an original document and a suspect document have the same meaning. The word2vec method identified the pertinent information by relating each word to its neighbors. Sentence representations could then be produced by combining the acquired dimensions. A CNN was then employed to gather more contextual information and measure the degree of similarity; the CNN model, with various statistical regularities, was used for measuring semantic similarity and for document modelling. Because of the dearth of publicly available Arabic paraphrased material, a word2vec-based Arabic paraphrased corpus was created for the trials using the skip-gram approach: every word from the OSAC source corpus was replaced with the vocabulary item that was most similar to it and shared the same grammatical class. With a precision of 85% and a recall of 86.8%, the work improved the recognition of contextual relations between Arabic documents compared to earlier studies.
The paper [15] addresses the automatic comprehension of Arabic dialogues in the Egyptian dialect at the utterance level, using a machine learning approach called YOSR. The extractor was tested on a dataset of spontaneous conversations and instant messages in the Egyptian dialect. The results of the YOSR classification are encouraging and, as far as the authors are aware, they are the first recorded results for comprehension of the Egyptian dialect. As future work, the authors intend to enhance YOSR by including standard call-center service information, context-based features, regional phrase variations, and morphological information such as the initial verb form and lemma, in order to improve the categorization. Modern Arabic natural language processing (ANLP) solutions are being developed using machine learning techniques. ML algorithms are commonly employed in NLP because of their high accuracy, regardless of how strong the input signal is, and because they are simple to use. On the other hand, the method used in ML-based ANLP implementations entails a number of phases. The paper [16] clarifies the concept in detail, illustrates how ML techniques were used to develop these tools, and identifies popular ANLP techniques. It also covers the importance and requirements of ANLP as well as the characteristics and challenges of the Arabic language. Arabic sentiment expressions in tweets can be difficult to distinguish in sentiment categorization software. The complexity of the Arabic script and the unstructured nature of Twitter usage may both contribute to this problem; however, the evaluation of textual data usability still needs improvement.
The paper [17] used a DNN framework for sentiment classification, text translation, and text analysis. Text categorization is assessed using a sequence-to-sequence encoder-decoder paradigm consisting of neural networks trained jointly on inputs and outputs. DNNs make use of huge datasets to enhance their accuracy. The method supports these linkages and, by identifying focal areas of the text, may also be able to handle long texts more successfully. The work re-implements the basic text summarization model and adapts the sequence-to-sequence structure to Arabic, because this approach had not previously been used for Arabic text summarization. The data set comprises about 300,000 elements, each consisting of a title and the corresponding article text. After applying standard summarization techniques to this data set, the results are compared using the ROUGE metric. However, although the AHS dataset that was created is comparable in volume to the Gigaword collection, it is not much larger than the CNN/Daily Mail database. Additionally, the work might make use of other strategies that would benefit the sequential design, such as the comprehensive approach demonstrated by See et al., which uses a different performance vector. A promising direction would be to devise new designs that exploit this vocabulary knowledge across multiple languages, considering that Arabic is written and read from right to left. The research does not take the possibility of enlarging the data set into account.
The work in [18] used a vision-based Arabic Alphabet Sign Language Recognition System (AArSLRS). The process consists of four distinct stages. The technique can be used with three possible database types: data with hands bound and alongside a dark tiled wood, data with bare hands and a white background, and data with face buried in a dark-coloured glove. Using one of the methods offered in AArSLRS, the hand is first segmented from the image and separated from the background, and the hand features are then extracted using the chosen search technique. The work used supervised learning methods to classify the 28 letters of the Arabic alphabet using 9240 pictures. It also focused on categorizing the 14 special characters and on using Quranic sign language to sign the first verses of the Qur'an. It was shown that picking a resource allocation problem is the optimum choice and that the neighborhood k value of the K-mean clustering algorithm affects rating dependability. However, the method needs more time for the processing phase than other methods.
The paper [19] experimentally tested the connection between traditional ASR corpus compensation approaches (feature vectors, data selection, gender-dependent acoustic models) and dialect-dependent and register-dependent variance across Arabic ASR samples. The first interaction examined in the paper was that between discrete pronunciation variance and acoustic capture performance. Discrete dialect-specific variation can be compensated for by removing speakers with inadequate training data and by switching from phone-based to grapheme-based acoustic models. The latter method also helps to make up for poor capture performance, which is further compensated for by removing delta-delta acoustic features. Together, the three methods decrease Word Error Rate (WER) by 3.24 to 5.35 percentage points. The second aspect of dialectal and register variation to be taken into account is variation in the fine-grained acoustic realization of each phoneme in the phrase. Building gender-specific and dialect-specific models contributes to significant reductions in WER, since empirical findings show that gender and dialect are the primary contributors to variation in speech. Cross-dialect investigations are carried out to gauge how far apart different Arabic dialects are in terms of the acoustic differences between the phone models needed for each of them.
Natural Language Understanding (NLU) has been vastly enhanced by DL methodologies such as word representations and DNN architectures. The work in [20] provides a method for Arabic home automation using DL approaches for text categorization and recognition. To this end, it provides an NLU component that can additionally be connected with ASR, a dialogue manager, and a natural language generator component to develop a fully functional dialogue system. The study covers the procedure of gathering and categorizing the data, constructing the intent classification and concept extraction models, and finally evaluating these techniques against benchmark datasets. The benchmark results showed that the LSTM, with an F-score of 92.0, was marginally superior to the CNN for intent classification. The paper employed a hybrid of word representations and character language models, which is fed to a Bidirectional LSTM (BiLSTM) network to retrieve the user's goals and intentions from the input. A high F-score of 94.0 for the BiLSTM with character embeddings suggests that its performance is comparable to the most recent English Named Entity Recognition benchmarks. However, to create a task-oriented dialogue system, the NLU module still has to be combined with automatic speech recognition and natural language generation modules.

III. PROBLEM STATEMENT
The classification of Arabic sentiment expressions may be made harder by the intricacy of the Arabic script and the disorganised nature of Twitter usage, and the evaluation of the usability of textual data still needs work. The potential for expanding the data set is not considered in the current research. Compared to alternative approaches, the processing step of the current method takes more time. Cross-dialect examinations are conducted to determine how far apart Arabic dialects are in terms of the acoustic differences between the phone models required for each of them. However, the Natural Language Understanding module can be integrated with the automatic speech recognition and natural language generation modules to form a complete dialogue system. To develop a classification model for sentential act prediction, the existing technique applied a selected group of features extracted from annotated utterances in YOSR, an ML technique based on SVM. SVM designs should therefore be used for improved processing.

IV. METHODOLOGY
The suggested approach is based on Arabic conversation processing and employs a machine learning algorithm to classify dialogue acts. The subject of the study is the interaction between operators and clients in various industries. The data was gathered from the Jana Corpus in three distinct domains: banking, travel, and chat. The major objective of the research is to connect the speaker and the operator and help them comprehend the dialogue, whether an utterance is a request, an obligation, or something else entirely. Previous studies have used a variety of approaches, including deep learning, gradient boosting, convolutional neural networks, and natural language processing, but they have not been able to achieve higher accuracy rates. The research therefore focuses on interpreting Arabic dialogue without cues and makes use of chatbot technology and machine learning algorithms to increase productivity. The research is conducted in the stages listed below, and Fig. 1 shows the flow diagram of the suggested method.

A. Dataset Collection
The data set used in this study was compiled from a variety of sources, covering chat, banks, and flights. The vast amount of material on the website is useful for any chatbot application. The CAMeL Tools module for Python was used to parse Arabic content and retrieve data. CAMeL Tools is a user-friendly Python toolkit for Arabic and Arabic dialect pre-processing, morphological modelling, dialect identification, named entity recognition, and sentiment analysis. It offers application programming interfaces (APIs) and command-line interfaces (CLIs) that cover these functions. Here it is used to process Arabic dialogue data from the chat, airline, and banking domains. The bot can be trained on a dataset, which is a collection of input expressions and their outputs. This effort will allow Arabic programmers to use more dialects of Arabic. The gathered data were split into training and testing samples, with training data making up 70% of the samples and testing data making up 30%. The division of the samples for the training and testing processes is shown in Table I and Fig. 2.
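The 70/30 division described above can be sketched as a simple shuffled split. This is only an illustration of the ratio: the function name `split_train_test`, the fixed seed, and the shuffling step are assumptions, since the paper does not state how the samples were assigned to each part.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_train_test(
    samples: Sequence[T], train_fraction: float = 0.7, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of the samples and cut it into train and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]
```

For example, `split_train_test(utterances)` on 100 utterances yields 70 training and 30 testing samples, with every utterance appearing in exactly one part.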

B. Preprocessing
The function of preprocessing is to add morphological and other contextual information to the input of the lemmatizer (and of the components that follow it) in order to simplify the lemmatization process. The preprocessing performed for lemmatization is quick and light in comparison to cutting-edge tools. The tokenized input text is first enriched by part-of-speech (POS) tagging, performed by a fast machine-learning-based sequence labeler. A very basic word segmentation component is then used to segment the text progressively, reducing the complexity and ambiguity of the words. The simplicity of the procedure can be attributed to the wealth of morphological features output by the POS tagger, which makes word segmentation straightforward [21].
For each article in the database, the algorithm performs tokenization, function word elimination, and lemmatization. The analysis of dialect and vernacular terminology follows. There are numerous Arabic dialects, and assessments are often written by their authors in their native dialects. Depending on the dialect, different terms are used to convey the same meaning. The solution to this problem was to create and use a dictionary of dialectal terms and their equivalents in standard Arabic [22].
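The dialect dictionary mentioned above amounts to a token-level lookup. The sketch below assumes such a dictionary is a plain mapping; the three Egyptian-dialect entries are illustrative examples chosen for this sketch and are not taken from the paper's dictionary.

```python
from typing import Dict, List

# Illustrative entries only (Egyptian dialect -> standard Arabic).
DIALECT_TO_MSA: Dict[str, str] = {
    "عايز": "أريد",  # "I want"
    "فين": "أين",  # "where"
    "إزاي": "كيف",  # "how"
}


def standardize(tokens: List[str], lexicon: Dict[str, str] = DIALECT_TO_MSA) -> List[str]:
    """Replace each dialectal token by its standard equivalent when one is known."""
    return [lexicon.get(token, token) for token in tokens]
```

Tokens that are not in the dictionary pass through unchanged, so the step is safe to apply to text that is already in standard Arabic.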

C. Morphology Analyzer
Morphological analysis is the first stage of text preparation. Morphological analysis, also known as structural analysis, breaks morphologically complex words down into their individual morphemes (the meaningful parts of a word). The morphological analysis of the words is carried out using the Alkhalil analyzer. As a result, the researchers obtain all potential lemmas and the associated morpho-syntactic information for each individual word in the text, analysed out of context [23].

D. Normalization
Experts in Arabic dialogue processing typically employ orthographic normalization as a fundamental technique for reducing noise in the data. This is true whether the intention is to produce parallel texts for machine learning, information retrieval data, or computational linguistics text. Examples of normalization include Tatweel removal (eliminating the Tatweel sign), diacritic removal, and alphabet normalization (converting variant forms to one form) [24]. These normalizations help in the search or discovery phase, but they can make tokenization more ambiguous. As an illustration, if Ta-Marbuta (ة) and Ha (ه) are normalized, the latter could be tokenized as a pronoun. Because of this, normalizations are treated in the research as a stepping stone for querying, finding, and other operations.
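The three normalizations named above can be sketched with regular expressions over Unicode code points. This is a minimal sketch under stated assumptions: the diacritic range U+064B to U+0652 and the choice of Alef variants are this example's assumptions, and Ta-Marbuta is deliberately left untouched because of the tokenization ambiguity just described.

```python
import re

TATWEEL = "\u0640"  # elongation sign
DIACRITICS = re.compile("[\u064B-\u0652]")  # fathatan through sukun
ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625]")  # Alef with madda / hamza above / hamza below
BARE_ALEF = "\u0627"


def normalize(text: str) -> str:
    """Remove Tatweel and diacritics, and map Alef variants to bare Alef."""
    text = text.replace(TATWEEL, "")
    text = DIACRITICS.sub("", text)
    return ALEF_VARIANTS.sub(BARE_ALEF, text)
```

Keeping the normalized string alongside the original, rather than overwriting it, matches the text's view of normalization as a stepping stone for search rather than a replacement for the source text.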

E. Tokenization
Tokenization reduces the raw text into smaller, more manageable chunks. It is not an easy problem to solve, since tokenization is "closely related to the morphology". This is especially true for languages like Arabic, which have rich yet complex morphologies. Tokenizers divide running text into tokens so that they can be processed further by morphological transducers or POS taggers. The tokenizer recognizes numbers, word boundaries, clitics, multi-word phrases, and abbreviations. In this way the original text is divided into tokens, that is, words and sentences. These tokens support modelling techniques for various processes or for contextual comprehension. Tokenization makes it easier to understand the meaning of the text by looking at the word order. Stop words are typically eliminated in the initial phases of tokenization, even though there is not yet a defined list of Arabic stop words [25]. Several levels of Arabic tokenization can be built, depending on how complex the linguistic analysis involved is. Researchers worked with the Arabic grammar to develop three distinct strategies, or frameworks, for Arabic tokenization. These systems differ greatly from one another in robustness, adherence to the modular concept, and ability to remove unnecessary ambiguity. The tokenizer uses white space and punctuation to distinguish among main tokens. However, when defining sub-tokens, the tokenizer needs more morphological information. This information can be provided either deterministically by a morphological transducer or non-deterministically by a token predictor. In the end, both main tokens and sub-tokens are delimited by the same token boundary, which is denoted by the symbol "@" throughout the entire article. Separating the text into main tokens using punctuation and white space is considered a straightforward strategy [26].
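The main-token step described above, which relies only on white space and punctuation, can be sketched as follows. This is an assumption-laden illustration rather than the tokenizer used in the study: it does not split clitics into sub-tokens, and it expects diacritics to have been removed first, because combining marks are not matched by `\w`.

```python
import re
from typing import List

# A run of word characters, or any single character that is neither word nor space.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenize(text: str) -> List[str]:
    """Split text into main tokens on white space and punctuation."""
    return TOKEN_PATTERN.findall(text)


def mark_boundaries(text: str) -> str:
    """Join main tokens with the '@' boundary symbol used in the text."""
    return "@".join(tokenize(text))
```

Sub-token boundaries for clitics would be added by a later, morphology-aware step that writes the same "@" symbol inside a main token.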

1) Utterance feature:
The smallest unit of speech in spoken language analysis is termed an utterance. It is a continuous chunk of speech with a distinct pause at the start and at the end. In spoken language, utterances are typically, but not always, bounded by silence. Utterances do not exist in written language; only their representations do. Table II shows the extraction of utterances.
The utterances' metadata can be used to help with dialogue act classification. Additionally, the classification procedure can benefit from knowing what happened before the current utterance. The researchers made use of the following features.
Utterance Speaking Style: Knowing who speaks the current utterance may help to discern the dialogue act. The act "Service-Question", for instance, relates to the customer, who contacts customer service to ask about a service on offer, whereas the acts "Other-Question" and "Choice-Question" relate to the operator, who offers a choice among the services or asks for the client's identity.
Previous Utterance Act: The classification may anticipate the act of the present utterance by understanding the sequencing of previous statements in the dialogue. For example, the act "Confirm Question" almost immediately follows the acts "Agree" and "Disagree".
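The two metadata features above can be sketched as a small feature builder. The field names, the `<START>` placeholder for the first utterance, and the inclusion of the first word are assumptions of this sketch; during training the previous act comes from the annotation, whereas at prediction time it would have to be the classifier's own previous output.

```python
from typing import Dict, List, Optional, Tuple

Utterance = Tuple[str, List[str], str]  # (speaker, tokens, annotated act)


def utterance_features(tokens: List[str], speaker: str, previous_act: Optional[str]) -> Dict[str, str]:
    """Build the metadata features for one utterance."""
    return {
        "first_word": tokens[0] if tokens else "",
        "speaker": speaker,
        "previous_act": previous_act or "<START>",
    }


def dialogue_features(dialogue: List[Utterance]) -> List[Dict[str, str]]:
    """Walk a dialogue in order, carrying the previous act forward."""
    rows: List[Dict[str, str]] = []
    previous_act: Optional[str] = None
    for speaker, tokens, act in dialogue:
        rows.append(utterance_features(tokens, speaker, previous_act))
        previous_act = act
    return rows
```

Each resulting dictionary can be turned into a numeric vector (for example by one-hot encoding) before it is passed to the classifier.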

F. Lemmatization
The main element of our lemmatization strategy is a classifier that uses machine learning. It accepts m-word segments as input along with their matching POS tags, taking context (utterances and labels) into consideration. The learning-based method is justified by the inherent ambiguity of Arabic words without diacritics, whose meanings are often inferred from context by both people and computers. One disadvantage is that the introduction of new cases necessitates retraining the analyzers. Another is that classifiers, like the one from OpenNLP that the researchers used, commit to a single output that may or may not be accurate. In such cases, additional NLP processing over the entire set of probable lemmas may be able to yield reliable results. To handle these situations, the investigators supplement the learning-based lemmatizer with a dictionary-based lemmatizer. Fig. 3 illustrates the results of the lemmatization. The researchers first use the analyzer to look at the vowelized version of the words in the database. They then keep only the lemmas whose lexical tags (clitics, root, and stem) coincide with the clitics, root, and stem of the word as it appears in the document. In this way the analyzer locates and examines the words of the corpus, and the potential lemmas of each word are collected in this initial step. After the likely lemmas had been identified for each word in the corpus, the researchers asked two linguists to select the appropriate lemma from the available options. If the correct lemma is not among the potential lemmas, the linguist provides the proper lemma for the word.
After being lemmatized, the text is processed again to produce the class feature. From the clean text, the first word is acquired, the first verb is filtered, and the class feature is extracted; the class feature is then filtered again to obtain the final features and final labels, as shown in Fig. 4 and 5. As part of this procedure, the data were divided into training and testing sets, which were used to classify the data using machine learning.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 14, No. 1, 2023, www.ijacsa.thesai.org

G. Multi-output and Three-Layer Hierarchical Classifications
Multi-output classification is a kind of machine learning that predicts several outcomes at once. When generating a prediction in multi-output classification, the system produces two or more outcomes, whereas in other classification settings the system usually predicts just one result. The suggested technique is built primarily from three models, one for each layer. As a result, the researchers believe that the proposed method of classification is quicker than existing classification methods and may be a more effective dialogue act classification model for real-time systems. The second distinction is that the number of models in this approach does not change with the number of dialogue acts or classes, whereas it does change when binary classification is used, as in the strategy of [27].

1) Three layer hierarchical:
Unlike hierarchical classification, traditional classification requires the construction of 26 classifiers, one for each dialogue act. The fundamental tenet of a hierarchical organization is that every utterance in a dialogue is passed on to succeeding layers, each of which determines the next step. The recommended method classifies dialogue utterances using a three-layer hierarchical framework. In the first layer, the utterance is assigned to one of three fundamental categories: request, answer, or other. In the second layer, the utterance is categorized into a group of dialogue acts based on the primary category assigned in the first layer. These groups include question, turn management, social responsibility, social courtesy, argumentation, response, and discourse structure. Fig. 6 illustrates the hierarchical classification that can be built using the multi-class classification technique. It is a trim, efficient categorization device.
Local Classifier per Node:  Each node should have a binary classifier. www.ijacsa.thesai.org  Multi-labeling comes easy to this strategy.
Local Classifier per Parent Node:  Every parent node receiving a single multi-class classifier.
 This is the most logical method.
 This strategy is far more frugal than the prior one.
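The "local classifier per parent node" strategy can be sketched as follows: every parent node in the hierarchy owns one multi-class model that chooses among its children, and prediction walks from the root down to a dialogue act. This is a minimal sketch under stated assumptions; the `TokenVoteClassifier` base model, the English stand-in utterances, and the label paths are hypothetical and do not reproduce the study's features or annotation.

```python
from collections import Counter, defaultdict


class TokenVoteClassifier:
    """Toy multi-class model (hypothetical): each token votes for the
    labels it co-occurred with during training."""

    def fit(self, texts, labels):
        self.votes = defaultdict(Counter)
        for text, label in zip(texts, labels):
            for tok in text.split():
                self.votes[tok][label] += 1
        self.default = Counter(labels).most_common(1)[0][0]
        return self

    def predict_one(self, text):
        tally = Counter()
        for tok in text.split():
            tally.update(self.votes.get(tok, {}))
        return tally.most_common(1)[0][0] if tally else self.default


class LocalClassifierPerParentNode:
    """One multi-class model per parent node; a node is identified by the
    tuple of labels on the path from the root (the root is the empty tuple)."""

    def fit(self, texts, paths):
        groups = defaultdict(lambda: ([], []))
        for text, path in zip(texts, paths):
            for depth in range(len(path)):
                xs, ys = groups[path[:depth]]  # parent node at this depth
                xs.append(text)
                ys.append(path[depth])         # child label to predict
        self.models = {
            parent: TokenVoteClassifier().fit(xs, ys)
            for parent, (xs, ys) in groups.items()
        }
        return self

    def predict_one(self, text):
        path = ()
        while path in self.models:  # descend until a leaf is reached
            path += (self.models[path].predict_one(text),)
        return path


# Illustrative data only: (primary, secondary, dialogue act) paths.
texts = ["what time is the flight", "the flight is at nine", "hello", "thank you"]
paths = [
    ("Request", "Question", "Service question"),
    ("Response", "Answer", "Service response"),
    ("Other", "Social Obligation", "Greet"),
    ("Other", "Social Courtesy", "Thanking"),
]

hierarchy = LocalClassifierPerParentNode().fit(texts, paths)
greeting_path = hierarchy.predict_one("hello")
thanks_path = hierarchy.predict_one("thank you")
```

Because the second- and third-layer models only see utterances that reached their parent node, each model solves a smaller problem than a flat classifier over all dialogue acts.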

H. Classification
The classification process is carried out following lemmatization. In earlier studies, the majority of researchers classified text using NLP, deep learning, SVM classifiers, gradient boosting, etc. A linear support vector machine is used in this study to classify the provided data. The given dataset was divided into training and test data, and the classification method was applied to the data.
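The division into training and test data can be sketched as below. The 80/20 ratio, the fixed seed, and the `train_test_split` helper name are assumptions made for illustration; the split proportion is not taken from the text.

```python
import random


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and hold out a fraction for testing.

    test_fraction and seed are illustrative defaults, not values from the study.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder (utterance, label) pairs standing in for the annotated corpus.
dataset = [(f"utterance {i}", "Inform") for i in range(10)]
train_set, test_set = train_test_split(dataset)
```

Fixing the seed keeps the split reproducible, so that different classifiers are compared on the same held-out utterances.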
1) Linear support vector machine: The support vector machine (SVM) is one of the most popular and well-known machine learning methods. SVMs are supervised learning models used to analyze and classify text data; they are also frequently used for regression. SVM was introduced and applied to text classification and categorization in 1998. The basic purpose of the SVM training method is to create a model that assigns new documents to one of a number of predetermined groups. SVMs can be employed as either linear or non-linear classifiers. Based on the structural risk minimization principle from machine learning theory, SVM seeks a decision surface that divides the training data points into two classes, and it bases its decisions on the support vectors, the training examples that are selected as the only effective elements of the training dataset. The SVM classifier is used for the classification of the input. The stages involved in the procedure begin with pre-processing, which in natural language processing describes the conversion of raw input into a form that is easier to process. Fig. 7 depicts the pre-processing stages that are involved.
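To make the idea of a linear decision surface concrete, the following from-scratch sketch trains a binary linear SVM by subgradient descent on the regularized hinge loss. It is an illustration only and not the implementation used in the study: the bag-of-words features, the hyperparameters `lam`, `lr`, and `epochs`, and the English stand-in utterances are all assumptions, and a multi-class setting would need an extension such as one-vs-rest.

```python
def featurize(text, vocab):
    """Bag-of-words count vector over a fixed vocabulary."""
    tokens = text.split()
    return [float(tokens.count(word)) for word in vocab]


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam/2 * ||w||^2 + hinge loss with per-sample subgradient steps.

    Labels must be +1 or -1. Hyperparameter values are illustrative.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Regularization step: shrink the weights toward zero.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # Hinge loss is active: move the boundary toward this example.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(text, vocab, w, b):
    """Return +1 or -1 according to the side of the decision surface."""
    score = sum(wj * xj for wj, xj in zip(w, featurize(text, vocab))) + b
    return 1 if score >= 0 else -1


# Illustrative data only: +1 stands for "Greet", -1 for "Thanking".
train_texts = ["hello", "hi there", "thank you", "thanks a lot"]
train_labels = [1, 1, -1, -1]
vocab = sorted({tok for t in train_texts for tok in t.split()})
X = [featurize(t, vocab) for t in train_texts]
w, b = train_linear_svm(X, train_labels)
```

Only examples whose margin falls below 1 change the weights, which mirrors the statement that the decision surface is determined by a subset of the training examples.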

V. RESULTS AND DISCUSSIONS
The experimental findings are presented for three datasets: the Banks dataset, the Flights dataset, and the Chat dataset. SVM, Naïve Bayesian, and logistic regression approaches are applied independently to each dataset. Moreover, the performance evaluation of each dataset for both multioutput classification and the 3-layer hierarchical structure is also presented. Multioutput and local hierarchical classification are applied to a 3-layer (3-level) structure: primary categories, secondary categories, and dialogue acts. The primary categories are Request, Response, and Other; the secondary categories are Question, Turn Management, Social Obligation, Social Courtesy, Argumentation, Answer, and Dialogue Structure. The first classifier assigns the primary category, the second classifier assigns the secondary category based on the first classification, and the final dialogue act classification depends on both the first and second classifications.

A. Performance Evaluation
Accuracy: Accuracy measures how often the model's predictions are correct. It is the proportion of correctly predicted observations to all observations, as expressed in eq. (1),
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
where $TP$, $TN$, $FP$, and $FN$ denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
Precision: Precision is the fraction of positive predictions that are correct, computed using eq. (2),
$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$
Recall: Recall is defined as the ratio of the true positives to the sum of the true positives and false negatives, as shown in eq. (3),
$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$
F1-Score: The F1-score combines precision and recall into a single measure, as given in eq. (4),
$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}$$
In this research, Arabic dialogue processing is carried out with machine learning algorithms, and the resulting accuracy is compared with that of the existing method.
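The four measures described above can be computed for a multi-class dialogue act problem as in the following minimal sketch, which uses macro averaging (an unweighted mean over classes) for precision, recall, and F1-score. The `evaluate` helper and the small label lists are hypothetical and serve only to show the arithmetic.

```python
def evaluate(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1-score."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Illustrative labels only, not results from the study.
y_true = ["Greet", "Greet", "Thanking", "Inform", "Inform", "Inform"]
y_pred = ["Greet", "Thanking", "Thanking", "Inform", "Inform", "Greet"]
scores = evaluate(y_true, y_pred)
```

In this toy case four of six predictions are correct, so the accuracy is 4/6; the per-class precision values are 0.5, 1.0, and 0.5, giving a macro precision of 2/3.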

B. Logistic Regression
Table III displays the logistic regression's performance in terms of the three measures of recall, precision, and F1-score for various dialogue acts, including Confirm inquiry, Correct, Disagree, Greet, Inform, Misunderstanding sign, Pause, Self-introduce, Service response, Service question, Suggest, Accepting request, Thanking, Other answer, Other question, Warning, and Turn assignment. Table IV presents the results using local support vector machines. Fig. 8 illustrates the results for data without cue using a support vector machine. For all datasets, support vector machine classification produced better results in terms of recall, precision, accuracy, and F1-score for data without cue. Without employing cue data, the support vector machine achieved an accuracy of 0.83 and a recall of 0.85.
A baseline system employing the suggested feature set but without the hierarchical features (the major category of the present utterance and the major category of the prior utterance) was developed in order to assess the efficacy of local hierarchical classification in categorizing dialogue acts in the chat, flights, and banks domains. The proposed systems' performance, in terms of macro-averaged recall, F-measure, precision, and accuracy, is shown in the table and figure, with multi-output three-layer classification producing the highest results.
Tables V and VI present the multioutput classification and local hierarchical structure results for data without cue. For the bank dataset, accuracy is higher using multioutput classification and the hierarchical structure for enhanced data without cue. Fig. 9 displays multioutput classification with enhanced data without cue. Fig. 10 illustrates data without cue using the hierarchical structure.

D. Performance Comparison
Compared to the logistic regression approach, the proposed support vector machine method achieves better performance, as shown in Table VII. The support vector machine also produces greater accuracy than logistic regression, as shown in Fig. 11. Using the support vector machine model, an accuracy of 0.83 was achieved in this case. This suggests that a support vector machine is more effective at classifying dialogue acts than the logistic regression approach.

VI. CONCLUSION
The structure that supports Arabic dialogue processing has been modified for better machine comprehension. The proposed classifier was tested using the JANA corpus of real-world spoken dialogues, and the results are quite positive. The researchers made progress in this work by developing an Arabic dialogue processing system that allows users to interact in the chat, banking, and flights domains using textual documents. Additionally, the researchers provide the JANA corpus, a multi-genre collection of spontaneous Arabic dialogues that has been tagged at the utterance level for Arabic dialogue act understanding. A technique for classifying utterances according to whether or not they contain function words was created in this work. Among the attributes of this methodology are a database and a set of function words. On the selected dataset, a number of classifiers were tried to see which one would be most useful over the long term. The support vector machine technique is employed in the study to classify the customer and operator responses. As reported in the results section, the support vector machine attains better accuracy than the logistic regression approach, with an accuracy of 0.83. The researchers eventually want to enable clients to make reservations over the phone by connecting the dialogue processing system with an active booking system, banking, and chat.

[Figure: Comparison of Accuracy; legend: Logistic Regression, Linear SV]
The corpus is also intended to serve as a standard annotated corpus for testing and assessing several Arabic software applications in the future. Finding the appropriate support vector machine parameters can increase accuracy; therefore, future research will examine the impact of each parameter. Future studies may also focus on determining the best approach to integrating various dialogue act classification techniques in order to enhance the usefulness of any technique, even the most basic.