Rider Driven African Vulture Optimization with Multi Kernel Structured Text Convolutional Neural Network for Classifying e-Commerce Reviews

Opinion mining is a natural language processing technique based on sentiment classification that determines the sentiment of reviews. Most existing text Convolutional Neural Network (CNN) algorithms are built on 3 × 3 kernels, which extract review text features ineffectively and lead to lower classification accuracy. Moreover, most traditional CNN versions output only three classes: positive, negative, and neutral. Hence, a novel algorithm, 'RAVO driven Multi-Size Kernel structured Text CNN for classifying e-commerce reviews (MSK-TCNN-RAVO)', is proposed in this work. The proposed approach utilizes five multi-size kernels (3 × 7, 5 × 7, 1 × 3, 1 × 5, 1 × 7), multi-dimensional kernels (1D and 2D), and integrates varying-size kernels to extract text features effectively. In addition, the performance of the multi-kernel CNN is enhanced by the RAVO algorithm, which is based on rider optimization. The proposed approach also performs 'review-stop-words removal', which decreases the complexity and time consumption of the opinion mining process. Most existing systems use a single pooling operation, which reduces feature map processing performance; hence, dual pooling operations (both Max and Average pooling) are employed in this research. Furthermore, it is configured to generate five classification outputs (bad, fair, neutral, good, and excellent) to support better decision-making, with 95.5% accuracy. The method is evaluated using different quality metrics and five review databases, and the results reveal that the proposed method outperforms the other existing review classification algorithms.
Keywords: Natural language processing; opinion mining; convolutional neural network; text sentiment classification; e-commerce review


I. INTRODUCTION
Social media plays a very important role in almost everybody's day-to-day life. It allows people to convey what they think and feel about the products on an e-commerce website. This is called an opinion or review. Online shopping is a form of e-commerce which allows consumers to directly buy goods or services from a seller over the internet [1]. Most e-commerce websites incorporate provisions which enable consumers to post their opinions about products, companies, or their experiences.
Online shopping is performed by millions of users every day, and as a result a huge number of reviews is generated constantly. Handling these reviews manually to extract knowledge is a tedious task; hence, companies use classification tools to categorize customer reviews and understand the customer's mindset. Opinion Mining, also known as Sentiment Analysis (SA), is the automated process of identifying opinions in text and labeling them as positive, negative, or neutral based on the emotions expressed by the customer. It uses text analysis and computational linguistics to identify and extract subjective information in source materials [2].
The rest of the article is structured as follows: A summary of traditional methodologies for review classification is provided in Section II. Section III covers the proposed approach in detail with diagrammatic representations. Results and discussions are included in Section IV. The work has been concluded in Section V.

II. RELATED WORK
The existing review classification algorithms discussed in the literature are summarized in this section. Anam et al. [3] proposed a voting classifier (LR-SGD) model that classifies textual tweets as happy or unhappy for emotion recognition. The dataset used for evaluation contains many contrary tweets. The weakness of this work is that a combination of different models would have to be employed to increase the performance. Ren and Wu [4] proposed a sentence-based analysis model to identify investor herd behaviour. The data is taken from blue-chip stocks in the Chinese stock market. The work's limitation is that it uses only a small amount of data for experimentation, which raises doubt about the algorithm's efficacy. Sasikala and Sheela [5] presented a sentiment analysis technique called Deep learning modified neural network to process online product reviews. A food review dataset is taken as input for the technique. The drawback is that the keyword processing only detects the sentiment expressed in a single word and frequently fails to provide all the information needed to interpret the context. Zeeshan et al. [6] proposed an opinion mining methodology using a lexicon and neural networks for classifying online movie reviews. The reviews are collected from the IMDB database. The disadvantage is that the resultant vectors have a large dimension and contain a large number of null values.
Gupta et al. [10] proposed a feature-based supervised model to identify extremist reviewers who target a whole brand. The datasets are created by crawling reviews from the Amazon website. The weakness of the work is that it requires a long training time even for small datasets. Madbouly et al. [11] proposed a hybrid classification approach for tweets based on user ranking in online social networks. The dataset consists of tweets with their features used as input. Tweets were filtered manually to exclude non-English tweets, which increases the execution time of the algorithm. Bhalla and Bagga [12] proposed an RB-Bayes method based on the Naive Bayes theorem for prediction that removes the problem of zero likelihood. The algorithm is evaluated on a small dataset of text data, so the efficiency of the model is not proved on large-scale databases. Zhang and Zhong et al. [13] proposed an e-commerce review mining method called sentiment similarity analysis to explore users' similarity and trust. The experimental dataset is collected from Amazon.com. The essential parameters involved in calculating the trust between users are not considered in this work, which leads to lower accuracy.
Aziz et al. [14] proposed a contextual analysis mechanism that finds the relationship between words and sources to predict the performance of supervised machine learning models. The experiment is conducted on four datasets from different domains collected from Amazon. The prediction performance is lower when real-time analysis is performed for individual changes in the dataset. Iqbal et al. [15] proposed a hybrid framework for sentiment analysis which bridges the gap between lexicon-based and machine learning approaches. The reviews dataset is collected from the UCI ML repository. The proposed framework works only for a specific domain and does not support other domains such as cyber-intelligence and the law enforcement sector.
Liu et al. [16] proposed a modified fuzzy approach for cyber hate classification. This model uses four datasets collected from Twitter covering four types of hate speech. Because the intersectionality of different types of hate speech is not addressed, more diversified features are extracted for hate speech detection. Fang et al. [17] proposed a multi-strategy sentiment analysis method with semantic fuzziness to extract the customers' opinions expressed in Chinese sentiment phrases. The input reviews are taken from a search forum website. Since emotional phrases are not included in this work, calculation errors arise, which results in lower accuracy.

A. Problem Statement
In the field of SA, machine learning methods such as Decision Tree [18], Logistic Regression [19], Support Vector Machine [20], Naive Bayes [21], and others have shown promising results, but most of them rely heavily on handcrafted features and necessitate a lot of manual design and adjustment, which is time-consuming and costly. Deep learning methods have achieved excellent performance with the help of large-scale corpora in many research fields and have become a research hotspot in SA [22]. However, in the majority of traditional text sentiment classification algorithms, sentence-level sentiment classification remains difficult for various reasons, including a lack of semantic understanding and low classification accuracy. Moreover, most existing methodologies use only 3 × 3, two-dimensional kernels, which cannot extract the review-text features effectively. Most current classification methods preprocess only the basic stop-words, numbers, and symbols, and retain words which do not reveal any sentiment information (e.g., around, surround, door). Keeping those words in the classification phase increases the complexity and time consumption of the method. In some existing literature, when the length of a word in an input review is less than the length of the sentence matrix, a padding operation pads a sufficient amount of NULL data to meet the original length of the sentence matrix. This padded NULL data reduces the impact of the real data and dominates over it, which results in less effective feature representation and negatively impacts the classification accuracy. Finally, many classification methods classify the reviews into two (happy, unhappy) or three (positive, negative, neutral) classes, which results in an ineffective decision-making process.
The main contribution of this paper is a novel e-commerce review classification method, namely RAVO based Multi-Size Kernel structured Text CNN based Review Classification (MSK-TCNN-RAVO), which is based on a new variant of CNN. This method utilizes a convolutional operation that employs multi-size, multi-dimensional kernels to generate enriched features, which yields high review classification accuracy. This research demonstrates the concept of varying kernels applied to 1D kernels in order to extract features exclusively from accurate data rather than padded data, which is one of the key processes of this work. This work introduces a new concept called review-stop-words removal, in which the words that do not convey any sentiment information are eliminated. Keeping these review-stop-words in classification increases the complexity and the time consumption of both the training and testing processes. To our knowledge, there are no studies in the available literature that address the removal of non-sentimental words during the classification stage. Subsequently, it is configured to generate five classification outcomes, which will help companies make better business decisions. Finally, to improve the performance of the classifier, the proposed work is driven by a Rider driven African vulture optimization algorithm based on the overtaker's strategy of the Rider optimization algorithm.

III. METHODOLOGY
The proposed review-classifier network is designed based on Convolutional Neural Network (CNN). The design of this new MSK-TCNN-RAVO method is one-of-a-kind, allowing for correct classification of review data. The following are the unique aspects of the proposed work:
• Input processing for 7 × 7 size sentence matrix rather than a traditional 5 × 5 size.
• Design of multi-size Convolutional kernels.
• Integration of both 1D and 2D kernels during convolution process.
• Introducing the concept of varying kernel based convolution process to extract features exclusively from accurate data rather than padded data.
• Pooling layer constructed by dual pooling operations.
• Configured to generate five classification results viz. Bad, Fair, Neutral, Good, and Excellent.
The aforementioned modifications make the MSK-TCNN-RAVO method a novel work that is more effective than existing Text-based CNN classifiers at classifying review data. The proposed work is divided into three parts: the MSK-TCNN-RAVO review training process, the MSK-TCNN-RAVO review testing process, and optimization using the Modified African Vulture Algorithm.
The workflow of the proposed work is shown in Fig. 1. The input reviews from three domains, Laptop, Camera, and Mobile, are taken from the Amazon website. The dataset as a whole has a lot of long sentences, with an average of 11.44 sentences per review. The proposed method has both training and testing sections. The basic stop-words, numbers, and symbols are removed from the input reviews in the preprocessing section, and words which do not reveal any sentiment information are removed in the review-stop-words removal section. The sentence matrix (SM) is generated by extracting each word from the enhanced reviews and placing it in matrix form. The above process is common to both the training and testing sections. Each review from the SM is fed as input to the proposed algorithm's training process, and the trained network is created. Similarly, in the testing section, each review from the SM is fed as input to the proposed algorithm's testing process, which continues until all the reviews are processed. Finally, the review classification report is produced from the testing section.
A. MSK-TCNN-RAVO Review Training Process
1) Pre-processing:
Reviews are usually composed of incomplete expressions, a variety of noise, and poorly structured sentences. Noise and unstructured Twitter data will affect the performance of tweet sentiment classification [23], [24]. Prior to feature selection, a series of preprocessing steps are performed on reviews to reduce the meaningless data in the review sentences.
Stop-words are words that carry no value (positive or negative) in a sentiment analysis system and are meaningless in information retrieval; hence, they must be eliminated from the data set. Examples of stop-words include "the", "as", "of", "and", "or", "to", etc. Stop-word removal is a substantial part of preprocessing: it reduces the size of the stored data set and improves the overall efficiency and effectiveness of the analysis system [25]. The proposed system uses a list of stop-words obtained from the Onix Text Retrieval Toolkit website [26].
In this research work, three crucial preprocessing steps are carried out using the text processing functions of the MATLAB research tool to remove (i) basic stop-words, (ii) numbers, and (iii) symbols. The resultant preprocessed review is stored in the review-array.
2) Review-stop-words removal:
The input of this section is the pre-processed review data. In the pre-processing section, the basic stop-words, numbers, and symbols are removed. However, the remaining review still contains words which do not reveal any sentiment information; these words are called "review-stop-words". Keeping them increases the time consumption and the complexity of the classification process. A tool is developed to assist in the generation of review-stop-words. Any sample review dataset can be fed as input to this generalized tool, which comprises two steps, viz. basic stop-words removal and unique words extraction. Afterwards, the review-stop-words are selected manually and the review-stop-word list is generated. Sample review-stop-words are shown in Fig. 2. The review-stop-words are loaded into the review-stop-words array. The review-stop-words are then removed from the pre-processed review based on this list. The resultant review after this removal process is stored in the review-array.
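The two cleaning stages described above can be sketched in a few lines. The sketch below is in Python rather than the MATLAB tool used in the paper, and everything in it is an illustrative assumption: the word lists are tiny hypothetical samples (not the Onix list or the list in Fig. 2), and the function names are invented for this example.

```python
import re

# Hypothetical, abbreviated word lists for illustration only.
BASIC_STOP_WORDS = {"the", "as", "of", "and", "or", "to", "is", "a", "it"}
REVIEW_STOP_WORDS = {"around", "surround", "door"}


def preprocess(review):
    """Lower-case the review, then drop numbers, symbols, and basic stop-words."""
    tokens = re.findall(r"[a-z]+", review.lower())  # keeps alphabetic runs only
    return [t for t in tokens if t not in BASIC_STOP_WORDS]


def remove_review_stop_words(tokens):
    """Drop words that carry no sentiment according to the curated list."""
    return [t for t in tokens if t not in REVIEW_STOP_WORDS]


def candidate_words(reviews):
    """Unique words left after basic cleaning; a person then picks the
    review-stop-words from this list by hand."""
    return sorted({t for r in reviews for t in preprocess(r)})
```

For instance, `remove_review_stop_words(preprocess("The door is good, 5 stars!"))` keeps only `good` and `stars`. The manual selection step stays manual here, since the text does not describe an automatic criterion for it.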

3) Sentence matrix generation:
The Sentence Matrix SM represents a sentence in matrix form. The dimension of SM is fixed as 7 × 7, a choice that maintains a trade-off between complexity and accuracy. A sentence of the enhanced review (after removing basic stop-words, numbers, symbols, and review-stop-words) is extracted and each word of it is placed in the matrix.
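A possible layout of the 7 × 7 sentence matrix is sketched below in Python. The paper does not state how a word is turned into numbers, so the encoding here is purely an assumption: each row holds one word, each column one letter code ('a' = 1 ... 'z' = 26), 0 stands in for the NULL padding, and words or sentences longer than seven entries are truncated. The function names are hypothetical.

```python
SIZE = 7  # the paper fixes the sentence matrix at 7 x 7


def encode_word(word):
    """Map up to seven lower-case letters to codes 1..26 and pad with 0."""
    codes = [ord(c) - ord("a") + 1 for c in word[:SIZE]]
    return codes + [0] * (SIZE - len(codes))


def sentence_matrix(words):
    """Build a 7 x 7 matrix: one row per word, zero rows for missing words."""
    rows = [encode_word(w) for w in words[:SIZE]]
    rows += [[0] * SIZE for _ in range(SIZE - len(rows))]
    return rows
```

With this layout, the amount of padding in a row is visible directly from the trailing zeros, which is the property the varying-kernel step relies on.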
In this research work, the convolution process is carried out by taking SM as input, using a combination of 2D kernels (3 × 7 and 5 × 7) and 1D kernels (1 × 3, 1 × 5, and 1 × 7) in order to extract the review text feature representation effectively.

4) Training of MSK-TCNN-RAVO Algorithm:
This algorithm is comprised of multi-size, multi-dimensional kernels. The varying kernel adopted in this technique retrieves features only from the exact data and not from the padded data. The dual pooling operations applied in this algorithm increase the efficiency of the feature map processing. The training process is carried out in order to produce five classification results. The architecture of the algorithm is depicted in Fig. 3.

a) 3 × 7 Size Kernel based Convolution:
This convolution process is performed using six 3 × 7 size kernels, k0, k1, k2, k3, k4, and k5, which are represented in Fig. 4. The column width of these kernels is fixed as seven because the length of the Sentence Matrix SM is seven. First, the kernel k0 is used to convolute the sentence matrix SM to produce the convoluted matrix CM_k0. This process is performed using (1) to (5).
Herein, A, B, and C represent the convoluted results obtained by applying convolution at rows 0, 2, and 4 respectively in SM, and CM_k0 indicates the matrix convoluted by the k0 kernel. Equation (1) defines the kernel k0. Equation (2) computes the convoluted value by projecting k0 over the element (0,0) of SM. That is, the convolution is performed at the 0th row and 0th column of SM, and the result is stored in A. Likewise, Equations (3) and (4) compute B and C, and CM_k0 with size 3 × 1 is constructed by placing the A, B, and C values in order using (5). Similarly, the other 3 × 7 size kernels k1, k2, k3, k4, and k5 are convoluted to find the convoluted matrices CM_k1, CM_k2, CM_k3, CM_k4, and CM_k5 respectively.
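As an illustration of the row-wise projection described above, the following Python sketch places a 3 × 7 kernel on a 7 × 7 matrix at rows 0, 2, and 4 and stacks the three sums into a 3 × 1 column. It assumes that "convolution" here means an element-wise multiply-and-sum of the kernel with the rows it covers; the function name and any kernel values passed in are hypothetical, and the paper's equations (1) to (5) are not reproduced.

```python
def conv_3x7(sm, kernel, row_starts=(0, 2, 4)):
    """Project a 3 x 7 kernel onto a 7 x 7 matrix at the given start rows.

    Returns [A, B, C], i.e. the 3 x 1 convoluted column.
    """
    out = []
    for r in row_starts:
        total = 0.0
        for i in range(3):          # kernel rows
            for j in range(7):      # kernel columns span the full width
                total += sm[r + i][j] * kernel[i][j]
        out.append(total)
    return out
```

Because the kernel is as wide as the matrix, it can only move down the rows; starting at rows 0, 2, and 4 covers rows 0 to 6 with one row of overlap between neighbouring positions.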
The kernel k6 is used to compute the convoluted matrix CM_k6 with size 6 × 1 using (6) and (7).
In Equation (7), the kernel k6 is projected over the rows of the padded SM from row = 0 to row = 5, and the convolution process is performed. Similarly, the other convoluted matrices CM_k7, CM_k8, and CM_k9 are computed.
c) One Dimensional Varying Kernel based Convolution:
In the Sentence Matrix SM, the length of a word may be less than the matrix length of 7. In this case, the padding operation pads the necessary quantity of NULL data to meet the length of SM. However, the padded data diminishes the effect of the real data in the feature representation, lowering review classification accuracy. Hence, this research employs the concept of 1D varying kernels of sizes 1 × 3, 1 × 5, and 1 × 7, whose lengths are bounded by 1 × 7 as shown in Fig. 6. The words in SM whose length is less than 7 undergo the convolution operation with either the 1 × 3 or the 1 × 5 varying kernel. The words in SM whose length is equal to 7 undergo the convolution process using the kernel of 1 × 7 size. This concept extracts the features from the exact data and not from the padded data, which is one of the key processes of this research. The convoluted matrix computation for the kernel k10 is performed using (8) to (14).
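The kernel-selection idea can be sketched as follows in Python. Several details are assumptions, since the text does not give them: padding is represented by 0, a word of length 5 or 6 uses the 1 × 5 kernel and anything shorter uses 1 × 3, and each word yields one value by applying the chosen kernel once at the start of the row. All names and kernel weights are hypothetical.

```python
def word_length(row, pad=0):
    """Number of real (non-padding) entries at the start of a row."""
    n = 0
    for v in row:
        if v == pad:
            break
        n += 1
    return n


def pick_kernel_size(length):
    """Choose 1 x 7 for full-length words, otherwise 1 x 5 or 1 x 3."""
    if length >= 7:
        return 7
    if length >= 5:
        return 5
    return 3


def varying_conv(sm, kernels):
    """kernels maps a size (3, 5, 7) to its weights; returns a 7 x 1 column."""
    out = []
    for row in sm:
        weights = kernels[pick_kernel_size(word_length(row))]
        # zip stops at the kernel length, so trailing padding is not visited
        out.append(sum(w * x for w, x in zip(weights, row)))
    return out
```

One consequence of this particular choice is that a four-letter word is only partly covered by the 1 × 3 kernel, and a word shorter than three letters still overlaps padding (which contributes nothing when the padding is 0). A sliding variant would cover more of the word, but the text does not say which behaviour the authors intend.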
Here, Equation (13) computes the convoluted value of each word of SM using the varying kernel k10, applied in its 1 × 3, 1 × 5, or 1 × 7 form depending on the length of the word. Equation (14) gives the convoluted matrix CM_k10 formed by merging the seven values. Similarly, the convoluted matrices CM_k11, CM_k12, and CM_k13 are computed.
d) ReLU Operation and Feature Matrix Generation:
In this research, the ReLU function is used as the activation function after the convolution process. The function returns 0 if it receives any negative input, but for any positive value x, it returns the same value [27]. The ReLU activation for the convolution matrix CM_k10 is based on Equation (15).
where i ∈ [0, n − 1] and n is the height of the convoluted matrix CM_k0 (let it be 3 for 3 × 7 size kernels). Similarly, the other convolution matrices ranging from CM_k1 to CM_k13 are computed based on (15). The feature matrix is generated in order to reduce the number of convoluted matrices from 14 to 6. This helps to improve the speed of the review analysis. Herein, the bias value plays the same role as in a traditional neural network. The first bias value is set to 1 and the second bias value is set to 0. The illustration of feature matrix generation on the Sentence Matrix SM is shown in Fig. 7.
Similarly, feature matrices 4 and 5 undergo both pooling operations as shown in Fig. 9. The illustration of the pooling layer computation is depicted in Fig. 10. The resultant matrix after the pooling operations has a dimension of 2 × 1.
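A minimal Python sketch of the activation and dual pooling steps follows. It assumes that ReLU is applied element-wise to a convoluted column and that the 2 × 1 pooled result simply stacks the maximum and the average of a feature column; the feature matrix generation step with its bias values is not modelled, and the function names are hypothetical.

```python
def relu(column):
    """Element-wise ReLU: negative entries become 0, others are unchanged."""
    return [max(0.0, v) for v in column]


def dual_pool(feature_column):
    """Return [max, average] of a feature column as a 2 x 1 result."""
    return [max(feature_column), sum(feature_column) / len(feature_column)]
```

For example, `dual_pool(relu([-3.0, 2.0, 4.0]))` gives `[4.0, 2.0]`: the maximum keeps the strongest response while the average retains some information about the rest of the column.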
f) Feature Map Vector Generation:
The feature map vector generation collects the features of same-size kernels into small groups. The size-reduced 3 × 7 kernel results are grouped in order to generate feature map vector 0 using (16).
Similarly, the size-reduced 5 × 7 and one-dimensional kernel oriented pooled feature maps are grouped to generate feature map vectors 1 and 2 respectively using (17) and (18). In particular, feature map vector 2 is formed from the one-dimensional kernel oriented pooled feature maps 4 and 5 based on (18).
g) Softmax Function:
The resultant feature map vectors 0, 1, and 2 hold four elements each. The output of the convolution layer is flattened into a 1D array called F, which is given as input to the fully connected layer. The output layer of an artificial neural network produces the final outputs of the network. In the training section of the proposed MSK-TCNN-RAVO algorithm, the manually marked category information BD, FR, NL, GD, and EX, with target values 1, 2, 3, 4, and 5 respectively, is fed to support the learning process. The extracted flattened features of the given training review sample are converted into a probability distribution by the softmax function. These probabilities are compared against the target values of the training categories, and the network is updated by back propagation until it converges. Thus the training on one review progresses. Multiple reviews are trained using the MSK-TCNN-RAVO algorithm, and the trained network is generated.
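The conversion of output scores into a probability distribution over the five categories can be illustrated with the standard softmax formula. The Python sketch below assumes five raw scores coming from the fully connected layer; the score values and function names are hypothetical, while the class labels are the ones named in the text.

```python
import math

CLASSES = ["BD", "FR", "NL", "GD", "EX"]  # bad, fair, neutral, good, excellent


def softmax(scores):
    """Standard softmax; the maximum is subtracted for numerical stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Return the most probable class label together with all probabilities."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs
```

Calling `predict([0.1, 0.2, 0.0, 2.5, 1.0])` selects `GD`, since the fourth score is the largest, and the returned probabilities sum to 1.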

B. MSK-TCNN-RAVO Review Testing Process
The testing process of the proposed algorithm classifies the review data with the aid of the trained network. Fig. 11 depicts the architecture of the testing process of the proposed MSK-TCNN-RAVO network. The basic stop-words, numbers, and symbols are removed from the test reviews in the preprocessing section, and words which do not reveal any sentiment information are removed in the review-stop-words removal section.
The enhanced reviews are fed as input to the testing section of the MSK-TCNN-RAVO algorithm, and the convolution operations (3 × 7, 5 × 7, and one-dimensional varying kernel based convolution) are performed on SM to generate the corresponding convoluted matrices. The ReLU function is used as the activation function after the convolution process, and the feature matrix is generated in order to reduce the number of convolution matrices from 14 to 6, as shown in Fig. 7. The dual pooling operation (Max and Average pooling) is applied, and finally the feature map vector is generated. The extracted feature map is given to the MSK-TCNN-RAVO network as a feature, which is then classified into five classes as shown in Fig. 11. This process is repeated until all the test reviews are processed. The classification report generated by the proposed algorithm can be used for a better decision-making process.

Algorithm 1: Review testing process
Step 1: The test review is given as input.
Step 2: Remove basic stop-words from the given test review.
Step 3: Remove the review-stop-words from the pre-processed test reviews.
Step 4: Convert the processed test review into a Sentence Matrix (SM) of size 7 × 7.
Step 5: Design six 3 × 7 size kernels such as k0, k1, k2, k3, k4, and k5.
Step 9: Convolute the Sentence Matrix (SM) by the k6 kernel and find the Convolution Matrix CM_k6.
Step 10: Repeat Step 9 with the other kernels k7, k8, and k9 to obtain the Convolution Matrices CM_k7, CM_k8, and CM_k9 respectively.
Step 16: Activate the convoluted matrices CM_k0 to CM_k13 using the ReLU activation function.
Step 17: Construct the feature matrices 0 to 5 using the convoluted matrices CM_k0 to CM_k13.
Step 18: Apply the max pooling and average pooling operations on the feature matrices 0 to 5 and compute the corresponding pooled matrices 0 to 5 of 2 × 1 size.
Step 20: Feed the output of the fully connected layer to the Softmax activation function.
Step 21: Output classification results for test reviews based on the RAVO driven MSK-TCNN network.

C. RAVO Driven MSK-TCNN
The proposed MSK-TCNN-RAVO network is driven by the Rider driven African vulture optimization algorithm, which is based on the overtaker's strategy of the rider optimization algorithm. The parameters of MSK-TCNN-RAVO are tuned by considering the significant parameters of RAVO, namely multi-kernel size, pooling type, and feature map vector. These parameters are analyzed using (19).
In (19), the hidden layers to be optimized are expressed through the initialization parameters of RAVO taken from MSK-TCNN-RAVO. In addition to the hidden layers, the number of epochs, learning rate, and batch size are considered as hyperparameters for obtaining maximum efficiency in terms of performance metrics such as accuracy, precision, recall, F-Score, time taken, MSE, misclassification ratio, ranking index, etc. Moreover, these parameters are taken from MSK-TCNN to evaluate the performance based on Mean Square Error (MSE) and Logarithmic Loss. The MSE is considered as the fitness value of MSK-TCNN-RAVO, and the optimization proceeds until the minimal MSE is reached. Equation (20) is used for fitness evaluation.
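The MSE fitness can be illustrated with its usual definition, the mean of the squared differences between target and predicted values. The Python sketch below assumes that targets are the numeric category values 1 to 5 mentioned for training; the function name is hypothetical and the exact form of Equation (20) is not reproduced.

```python
def mse(targets, predictions):
    """Mean squared error between target values and predicted values."""
    n = len(targets)
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / n
```

A candidate configuration with a lower `mse` on the training reviews would be treated as fitter, so the optimizer searches for the configuration that minimizes this value.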
After the initialization of the fitness value, Equation (21) is utilized for determining the best solution.
The best solution is selected by the Roulette wheel selection strategy as shown in Equation (22).
Here, the two quantities in (22) are, respectively, the selection probability and the fitness of each individual. This process is repeated until the n-th iteration is reached. This strategy enhances the diversity in AVOA. The second strategy of AVOA is the rate of starvation, which is determined using (23) and (24).
Here, V_s denotes the satiated vultures, and the remaining terms denote the current iteration, the maximum number of iterations, a random number in the range −1 to 1, and a random value in the range 0 to 1. The starvation of a vulture is determined by the < 0 and > 0 conditions, which indicate that the vulture is starved and not starved, respectively. Based on this, RAVO executes the exploration strategy.
Here, P(i + 1) is the new position vector, and the satiation term denotes the number of vultures gratified, obtained using (24) in the current iteration. X is a coefficient vector that increases the random motion, measured using X = 2 × rand, in which rand is a random number in the range of 0 and 1. P(i) is the position vector of the current iteration.
In Equation (26), rand2 is a random value in the range of 0 and 1, and lw_b and up_b are the lower bound and upper bound of the variables, respectively. Equation (26) is utilized to generate a random solution within the range of lw_b and up_b. The term rand3 increases the coefficient of randomness. When rand3 approaches 1, it distributes the solutions in a similar pattern, which adds random motion based on lw_b. Therefore, the diversity is increased and the search space is explored more widely. The model then moves on to the exploitation strategy if the magnitude of the satiation rate falls below 1.
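The selection and starvation steps can be sketched in Python under some explicit assumptions. Roulette wheel selection is shown in its standard form, where the probability of an individual is its fitness divided by the sum of all fitness values (this presumes positive fitness values where larger is better, so an MSE would first need to be transformed). The satiation rate follows the core term of the original AVOA formulation, (2 × rand1 + 1) × z × (1 − iteration / max_iterations), and omits any additional perturbation term; whether Equations (22) to (24) match these forms exactly is an assumption. All names are hypothetical.

```python
import random


def roulette_probabilities(fitness):
    """Selection probability of each individual: fitness / sum of fitness."""
    total = sum(fitness)
    return [f / total for f in fitness]


def roulette_select(fitness, rng=random):
    """Pick an index with probability proportional to its fitness."""
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(roulette_probabilities(fitness)):
        cumulative += p
        if r < cumulative:
            return i
    return len(fitness) - 1  # guard against floating-point round-off


def satiation_rate(iteration, max_iterations, z, rand1):
    """Core AVOA satiation term; z is in [-1, 1] and rand1 in [0, 1]."""
    return (2 * rand1 + 1) * z * (1 - iteration / max_iterations)
```

The factor `1 - iteration / max_iterations` shrinks the satiation rate as the run progresses, which is what moves the search from exploration (magnitude of at least 1) towards exploitation (magnitude below 1).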

D. Exploitation
The exploitation stage incorporates two phases, where P2 and P3 are the parameters that determine which strategy is executed. These parameters are initialized in the range of 0 to 1 before executing the search process.
(i) Phase 1: In the first phase, exploitation is carried out only when the magnitude of the satiation rate lies between 1 and 0.5. Here, determining the P2 value, which lies in the range between 0 and 1, is highly significant before the execution of the search process. rand_P2 is a random number generated between 0 and 1. If rand_P2 > P2, the siege-fight strategy is executed. Conversely, if rand_P2 < P2, rotating flight is performed. The mathematical formula for this is shown in (27).

1) Competition for food:
If the magnitude of the satiation rate is greater than 0.5, the vultures in the environment are highly satiated and have considerable energy for surviving. This is modelled by Equation (28).
Here, one term of (28) is evaluated using (27), another denotes the rate of vulture satiation, and rand4 is a random number in the range 0 to 1 that helps to generate a random coefficient.
The next strategy is rotating flight, which is based on a spiral movement between all vultures and one of the two best vultures. The mathematical formulation of this strategy is shown in (29) and (30).
www.ijacsa.thesai.org (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 13, No. 7, 2022
Here, the remaining term is obtained with the same strategy as mentioned earlier. The cos and sin terms denote the cosine and sine functions. rand5 and rand6 are random numbers generated in the range of 0 and 1. The two spiral terms are generated using (29) and (30), and Equation (31) updates the newly generated solution.
(ii) Phase 2: In the second phase of exploitation, the movements of the two vultures gather several types of vultures over the food source, and the siege and aggressive strife to find food are carried out. If the magnitude of the satiation rate is below 0.5, this phase relies on P3, which lies in the range 0 to 1. If rand_P3 > P3, various vultures accumulate over the food. Otherwise, if rand_P3 < P3, the aggressive siege-flight strategy is executed. The mathematical formulation is shown in Equation (32).

a) Aggressive Competition for Food
If the magnitude of the satiation rate is below 0.5, the vultures are repositioned using Equation (33). Here, the updating strategy of Rider Optimization is executed, specifically the overtaker's strategy. This strategy helps to enhance the performance of AVOA, so the Rider based enhanced AVOA can achieve highly effective performance. The success-rate term represents the relative success rate of the i-th vulture at the given time and lies in the interval of 0 and 1. Finally, the coordinate selection is determined from the normalized distance vector, which is obtained by gauging the difference between the i-th vulture and the head vulture using Equation (37).
The value of is selected for ( , ) ℎR values are less than its fitness value. Therefore, the Rider's overtaking strategy improves the performance by replacing the Levy flight strategy. RAVO can thus provide higher performance than the traditional African Vulture Optimization algorithm.
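The branching between exploration and the two exploitation phases can be summarized in a short Python sketch. It only returns the name of the strategy that would run and does not implement any position update. The thresholds (1 and 0.5) and the direction of the comparisons with P2 and P3 are taken as worded in this section; how exact equality is handled is an assumption, and all names are hypothetical.

```python
def choose_strategy(satiation, p2, p3, rand_p2, rand_p3):
    """Return the name of the search strategy selected for one vulture."""
    magnitude = abs(satiation)
    if magnitude >= 1:
        return "exploration"
    if magnitude >= 0.5:
        # Phase 1 of exploitation
        return "siege-fight" if rand_p2 > p2 else "rotating flight"
    # Phase 2 of exploitation
    if rand_p3 > p3:
        return "accumulation over food"
    return "aggressive siege-flight"
```

In RAVO, the last branch is where the overtaker's update of Rider Optimization is used in place of the Levy flight step, according to the description above.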

IV. ANALYSIS AND DISCUSSION
This section portrays the efficiency of the proposed method relative to other existing methods. The proposed MSK-TCNN-RAVO classification task is compared with three existing methods mentioned below:
1) Weakly supervised deep embedding for review Sentence sentiment classification (WS_RSSC) [28].
2) Sentiment similarity analysis based mining of user's trust from e-commerce reviews (SSA_TER) [29].
In order to test the proposed MSK-TCNN-RAVO classifier, the training and testing reviews are taken from five benchmark databases. The product reviews from different categories such as Mobile, Laptop, and Camera are taken from the Amazon website. The names of the different datasets are given below:
1) Dataset of Laptop Product Review (DS1_LPR) [28].
From each dataset, 7000 reviews are extracted, of which 4000 reviews are utilized for training, and the remaining 3000 reviews, which are completely new to the MSK-TCNN-RAVO classifier, are used for testing to verify the classifier's efficiency. Following that, 2000 more reviews are extracted from the training samples to act as test reviews, totaling 5000 test reviews for each dataset.

A. Classification Accuracy
The accuracy of the classifier is computed using (38), and its values are presented in Table I. The accuracy of the proposed model is higher than that of the other models on all datasets. The highest accuracy, 95.5%, is obtained for the DS4_MPRSE4G dataset after optimization via RAVO. The result shows that the proportion of correctly classified results produced by the proposed work is high when compared to the other models.
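As an illustration of the metric, the sketch below computes accuracy as the fraction of reviews whose predicted class matches the ground truth. This is the standard definition and is assumed to correspond to (38); the label lists are toy values, not data from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy labels drawn from the five output classes (illustrative only).
y_true = ["good", "bad", "excellent", "neutral", "fair", "good", "good", "bad"]
y_pred = ["good", "bad", "good", "neutral", "fair", "good", "excellent", "bad"]
acc = accuracy(y_true, y_pred)  # 6 of 8 predictions match
```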

B. Precision
Precision measures the exactness of a classifier. It can be calculated using (39).
The average precision values are presented in Table II. The precision is computed for each method on all five datasets. The average precision of a method is obtained by averaging its precision values over the five datasets, and the precision value for a dataset is computed by testing 5000 test review samples. The proposed methodology has a higher precision value, indicating that the proposed model produces fewer false positives and that its predictions are therefore more reliable. The average precision of the proposed method is 0.89.
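A minimal sketch of a multi-class precision computation is given below, assuming the usual per-class definition (true positives divided by all samples predicted as that class) and a simple mean over classes. How the paper aggregates precision over its five classes is not stated, so the macro average and the helper names are assumptions; the confusion matrix is a toy example.

```python
def per_class_precision(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    n = len(confusion)
    precisions = []
    for j in range(n):
        predicted_j = sum(confusion[i][j] for i in range(n))  # TP + FP for class j
        true_positive = confusion[j][j]
        precisions.append(true_positive / predicted_j if predicted_j else 0.0)
    return precisions


def mean(values):
    """Arithmetic mean, used here for averaging over classes."""
    return sum(values) / len(values)


# Toy 3-class confusion matrix (illustrative numbers, not from the paper).
confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 3, 7],
]
precisions = per_class_precision(confusion)
macro_precision = mean(precisions)
```

The same `mean` helper could be applied to the five per-dataset precision values of a method to obtain the average reported per method.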

C. Recall
Recall measures the completeness, or sensitivity, of a classifier. Higher recall means fewer false negatives, while lower recall means more false negatives. Recall can be calculated using (40).
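For reference, the standard definition of recall for a class c is written out below. It is assumed to correspond to (40); the symbols TP_c (true positives of class c) and FN_c (false negatives of class c) are notation introduced here for illustration.

```latex
\mathrm{Recall}_c = \frac{TP_c}{TP_c + FN_c}
```

Under this definition, every review of class c that is assigned to another class increases FN_c and lowers the recall of class c.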
The average recall values are depicted in Fig. 12. The MSK-TCNN-RAVO method achieves the highest recall value of 0.895, which indicates that the proportion of actual positives correctly classified by the proposed model is high compared to the other existing models.

D. F-Score
The F-Score combines precision and recall into a single metric, also known as the F-measure, which is the weighted harmonic mean of precision and recall. It can be calculated using (41).
The F-Score analysis is presented in Fig. 13. The F-Score value for a given method and database is computed by processing the 5000 test review samples. The highest F-Score value, 0.947, is achieved by the MSK-TCNN-RAVO method on the DS4_MPRSE4G dataset.
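The weighted harmonic mean can be sketched as below, using the common F-beta form. The weight `beta` and its default of 1 (equal weight on precision and recall) are assumptions, since the weighting used in (41) is not given in this text; the input values are illustrative.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F-beta score)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative inputs: the harmonic mean is pulled toward the smaller value.
f1 = f_beta(0.5, 1.0)
```

Because the harmonic mean penalizes imbalance, a method needs both high precision and high recall to reach a high F-Score.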

E. Mean Square Error
In machine learning, the main goal is to minimize the error defined by the loss function. The mean square error (MSE) can be computed using (42). The MSE analysis values are listed in Table III.
Here, N denotes the number of samples tested, and the remaining two terms denote the predicted class and the ground truth class of the i-th sample, respectively.
It can be observed from Table III that the MSE value for the proposed method is drastically reduced after the RAVO optimization. The lowest error value after optimization is 426, which indicates that the prediction error of the proposed method is very low when compared to the other three existing models. The WS_RSSC method produces the highest error value in this analysis, 1090, for the DS1_LPR dataset.
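A minimal sketch of a mean square error over class predictions follows. It assumes the five classes are encoded as ordinal integers (0 for bad up to 4 for excellent) and that the squared differences are averaged over the N tested samples; the encoding and any scaling used for Table III are not stated in the text, so both are assumptions.

```python
def mean_square_error(y_true, y_pred):
    """Average squared difference between predicted and true class indices."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    squared_errors = [(p - t) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(y_true)


# Assumed ordinal encoding: 0=bad, 1=fair, 2=neutral, 3=good, 4=excellent.
y_true = [4, 3, 0, 2, 1]
y_pred = [4, 2, 0, 4, 1]
mse = mean_square_error(y_true, y_pred)  # squared errors: 0, 1, 0, 4, 0
```

With an ordinal encoding, predicting "excellent" for a "neutral" review costs more than predicting "good" for it, which a plain accuracy count does not capture.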

F. Cross Entropy
This metric is used in neural networks when a classifier's output consists of multiclass prediction probabilities. In general, minimizing categorical cross-entropy gives greater accuracy for the classifier [33]. It can be computed using (43), and its computed values are depicted in Fig. 14. Here, the indicator term equals 1 if the sample belongs to the given class and 0 otherwise, and the probability term is the classifier's predicted probability that the sample belongs to that class. Since the SSA_TER method is not based on a neural network, this analysis is performed for the other three methods on all datasets. The proposed method gives the lowest value, 0.0619, after RAVO optimization on the DS4_MPRSE4G dataset, which means that it assigns high probability to the correct classes and does so more stably.
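The computation can be sketched as below, using the standard categorical cross-entropy with one-hot targets. The natural logarithm, the averaging over samples, and the clipping constant `eps` are assumptions of this sketch rather than details taken from (43); the probabilities are toy values.

```python
import math


def categorical_cross_entropy(one_hot_targets, predicted_probs, eps=1e-12):
    """Mean negative log-probability assigned to the true class of each sample."""
    total = 0.0
    for target, probs in zip(one_hot_targets, predicted_probs):
        for indicator, prob in zip(target, probs):
            if indicator:  # only the true class contributes
                total -= math.log(max(prob, eps))  # clip to avoid log(0)
    return total / len(one_hot_targets)


# Two toy samples over five classes (illustrative probabilities only).
targets = [
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0],
]
probs = [
    [0.05, 0.05, 0.10, 0.70, 0.10],
    [0.90, 0.025, 0.025, 0.025, 0.025],
]
loss = categorical_cross_entropy(targets, probs)
```

The loss approaches zero as the probability assigned to the true class approaches 1, which is why a lower value indicates more confident correct predictions.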

G. Time Taken
In the time-taken analysis, the time consumed by each method to produce the classification results is evaluated. The time taken by all methods is listed in Table IV. In this analysis, the WS_RSSC method takes the least execution time for all datasets. However, given the classification accuracy of the proposed method, the additional time it takes is acceptable. The difference in time consumption between WS_RSSC and the MSK-TCNN-RAVO classifier is small, i.e., 1.26 seconds. Similarly, the difference between CNN_MCP_TSC and the proposed classifier is only 0.76 seconds.
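One simple way to take such a measurement is sketched below with a wall-clock timer. The helper `timed_call` and the placeholder `dummy_classifier` are hypothetical; the text does not describe how the reported times were measured.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def dummy_classifier(reviews):
    """Hypothetical stand-in for a review classifier."""
    return ["neutral" for _ in reviews]


labels, seconds = timed_call(dummy_classifier, ["r1", "r2", "r3"])
```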

H. Misclassification Ratio
This analysis determines the ratio of incorrect classification results generated by the classifier. It can be computed using (44). The ratios of misclassified results generated by all the methods are depicted in Fig. 15. From the analysis, the misclassification ratio produced by the proposed method is very low for the DS4_MPRSE4G dataset, at 0.095. This shows that the proposed classifier produces very few incorrect classification results. The second-best method is CNN_MCP_TSC, with a misclassification ratio of 0.160 for the same dataset.
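A minimal sketch of such a ratio is shown below, assuming it is the number of off-diagonal entries of the confusion matrix divided by the total number of tested samples. Whether (44) uses exactly this form is not stated in the text; the matrix is a toy example.

```python
def misclassification_ratio(confusion):
    """Share of samples that fall outside the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total


# Toy 3-class confusion matrix (illustrative numbers, not from the paper).
confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 3, 7],
]
ratio = misclassification_ratio(confusion)  # 9 of 30 samples are misclassified
```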

I. Performance based Ranking Index
This analysis ranks the review classification methods based on a ranking index. The ranked values are depicted in Fig. 16. The ranking index of each method is decided according to its classification accuracy, precision, F-Score, recall, mean square error, cross entropy, and misclassification results. The best of the four methods is assigned rank 4, i.e., the highest rank, the second-best method is assigned rank 3, and so on. As Fig. 16 shows, the proposed method achieves the highest rank of 4, which indicates that it is the best of the compared methods for review classification.
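One possible way to derive such an index is sketched below: each method is ranked per metric (with lower-is-better metrics such as MSE inverted), and the per-metric ranks are averaged and ranked again. The aggregation rule, the function names, and the generic method labels A to D with their scores are all assumptions for illustration; the text does not specify how the individual metrics are combined.

```python
def rank_methods(scores, higher_is_better=True):
    """Return {method: rank}; the best method receives the highest rank."""
    # Sort from worst to best so that list position + 1 is the rank.
    ordered = sorted(scores, key=scores.get, reverse=not higher_is_better)
    return {method: position + 1 for position, method in enumerate(ordered)}


def ranking_index(metric_tables):
    """Combine per-metric ranks; metric_tables holds (scores, higher_is_better)."""
    totals = {}
    for scores, higher_is_better in metric_tables:
        for method, rank in rank_methods(scores, higher_is_better).items():
            totals[method] = totals.get(method, 0) + rank
    mean_rank = {m: total / len(metric_tables) for m, total in totals.items()}
    return rank_methods(mean_rank, higher_is_better=True)


# Hypothetical scores for four placeholder methods (not values from the paper).
accuracy_scores = {"A": 0.80, "B": 0.82, "C": 0.84, "D": 0.95}
error_scores = {"A": 0.20, "B": 0.18, "C": 0.16, "D": 0.05}
index = ranking_index([(accuracy_scores, True), (error_scores, False)])
```

Ties are broken by input order in this sketch; a real comparison would need an explicit tie rule.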

V. CONCLUSION
In this work, a classification method named MSK-TCNN-RAVO is proposed for e-commerce reviews. The proposed algorithm is evaluated on five datasets from three domains, with 7000 review samples per database used for training and testing the proposed classifier. The MSK-TCNN-RAVO classifier achieves the highest classification accuracy of 95.5%, compared with the three existing algorithms WS_RSSC, SSA_TER, and CNN_MCP_TSC, which achieve accuracies of 81.8%, 82.6%, and 84%, respectively. The proposed method also achieves the highest F-Score value of 0.947 among the four methods. The proposed MSK-TCNN-RAVO classifier outperforms the other three approaches in all classification quality metrics, and it consistently performs well on the DS4_MPRSE4G dataset. When compared to the WS_RSSC technique, the proposed classifier improves classification accuracy by 8.54% before RAVO and 12.86% after RAVO optimization. Furthermore, the MSK-TCNN-RAVO classifier achieves the best values on all evaluation metrics, namely classification accuracy, precision, recall, F-Score, MSE, cross entropy, and misclassification ratio, which indicates that the proposed method is well suited for opinion mining. When emoticons in the input reviews are processed, the computational complexity of the proposed work grows. A slight modification to the method is therefore planned to address the increased CPU utilization observed when emojis are used in training the classifier. In future work, it is intended to extend the MSK-TCNN-RAVO classifier into a hepta classifier that assigns reviews to seven classes. Furthermore, in order to further enhance the classification accuracy and reduce processing time, an