The Science and Information (SAI) Organization

IJACSA Volume 12 Issue 3

Copyright Statement: This is an open access publication licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially as long as the original work is properly cited.


Paper 1: Texture Classification using Angular and Radial Bins in Transformed Domain

Abstract: Texture is generally recognized as fundamental to perception, yet no precise definition or characterization is available in practice. Texture recognition has many applications in areas such as medical image analysis, remote sensing, and robotic vision. Various approaches, such as statistical, structural, and spectral, have been suggested in the literature. In this paper we propose a method for texture feature extraction. We transform the image with the two-dimensional Discrete Cosine Transform (DCT) and extract features using ring and wedge bins in the DCT plane. These features capture texture properties such as coarseness, smoothness, graininess, and directivity of the texture pattern in the image. We develop a model to classify texture images using the extracted features with three classifiers: decision tree, Support Vector Machine (SVM), and Logarithmic Regression (LR). To test our approach, we use the Brodatz texture image dataset, consisting of 111 images of different texture patterns. Classification results such as accuracy and F-score obtained from the three classifiers are presented in the paper.

Author 1: Arun Kulkarni
Author 2: Aavash Sthapit
Author 3: Ashim Sedhain
Author 4: Bishrut Bhattarai
Author 5: Saurav Panthee

Keywords: Texture; discrete cosine transform; radial and angular bins; decision tree; support vector machine; logarithmic regression
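
The ring-and-wedge binning in the DCT plane can be outlined in a few lines of NumPy. This is an illustrative sketch only: the bin counts, the corner-anchored geometry, and the orthonormal DCT-matrix construction are assumptions for demonstration, not the authors' configuration.

```python
import numpy as np

def dct2(img):
    """2-D DCT-II via the orthonormal DCT matrix (square images, no SciPy)."""
    n = img.shape[0]
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    C = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    C[0, :] = np.sqrt(1.0 / n)
    return C @ img @ C.T

def ring_wedge_features(img, n_rings=4, n_wedges=4):
    """Sum DCT energy inside radial (ring) and angular (wedge) bins."""
    n = img.shape[0]
    D = np.abs(dct2(img))
    D[0, 0] = 0.0                       # drop the DC term (mean intensity)
    y, x = np.mgrid[0:n, 0:n]
    r = np.hypot(x, y)                  # radius from the (0,0) DCT corner
    theta = np.arctan2(y, x)            # angle in [0, pi/2]
    r_edges = np.linspace(0, r.max() + 1e-9, n_rings + 1)
    t_edges = np.linspace(0, np.pi / 2 + 1e-9, n_wedges + 1)
    rings = [D[(r >= r_edges[i]) & (r < r_edges[i + 1])].sum()
             for i in range(n_rings)]
    wedges = [D[(theta >= t_edges[i]) & (theta < t_edges[i + 1])].sum()
              for i in range(n_wedges)]
    return np.array(rings + wedges)
```

Rings separate coarse from fine texture energy; wedges separate directional energy, matching the coarseness/directivity intuition in the abstract.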

Paper 2: Intellectual Singularity of Quasi-Holographic Paradigm for a Brain-like Video-Component of Artificial Mind

Abstract: On the basis of a new post-Shannon information approach (quantitative and qualitative together), a hierarchical process for evaluating video-information by an intellectual brain-like video-component of an artificial mind is considered. Developing the classical (Shannon) informational approach to the level of the new (post-Shannon) approach made it possible to formulate an important additional "bonus" in the form of a differential holographic principle (DHP). The DHP makes it possible to present video-information on a dualistic basis, considering its physical and structural components together. An integral quasi-holographic principle (IQHP) is then built on the basis of the DHP. In contrast to the DHP, however, this principle is supra-physical (abstract), uses a long-range action template, and is realized instantly (i.e., with infinitely high speed). In the joint tandem of the physical (quantitative) and structural (qualitative) components of video-information evaluation, the structural component is dominant. Because of this, the technology of the video-component of an artificial mind based on the IQHP always takes the form of an ascending hierarchy of structured (abstract) evaluations of video-information. This technology also includes a hierarchy of self-learning stages, through which macro-objects of video-information, in the form of video-thesauruses serving as high-quality measuring scales, are continuously developed. This maintains the relevance, efficiency, and instantaneousness of the video-component of the artificial mind in evaluating video-information. Based on the ideas and principles of the new (post-Shannon) information approach, the structural and functional architecture of the video-component of an artificial mind is built. This architecture is not biologically inspired, but it turns out to coincide surprisingly closely with the known structure of the human neocortex (in the number of levels of the ascending hierarchy, the presence of a hierarchy in both direct and feedback paths, the method of structuring and collecting input elementary video-data, etc.). A new theorem for a complete sample of video-data, considered jointly in physical and structural form, is formulated. The direct version of this theorem corresponds to an ascending hierarchy of video-information evaluations based on the IQHP and bundles of video-information evaluations; the inverse version characterizes the global hierarchical feedback, which takes the form of a descending hierarchy of "service" video-information evaluations.

Author 1: Yarichin E. M
Author 2: Gruznov V.M
Author 3: Yarichina G.F

Keywords: Differential holographic principle; video-data; structuring; super-saccades; integral quasi-holographic principle; long-range action; bundles; ascending hierarchy; singularity; video-component; video-thesaurus; video-intelligence; architecture; artificial intelligence; artificial mind; full sampling theorem; descending hierarchy; hierarchical feedback

Paper 3: Adopting Vulnerability Principle as the Panacea for Security Policy Monitoring

Abstract: Despite the adoption of information security policies, many industries continue to suffer harm from non-compliance. Such harm includes illegal disclosure of customers' sensitive data, leakage of business trade secrets, and various kinds of cyber-attacks, and its impact can be enormous. To avert this, monitoring compliance with information security policies (otherwise known as use policies) has been adopted as a strategy for enhancing security policy compliance. A main purpose of use-policy monitoring is to enhance security policy compliance so as to prevent harm. Ironically, the consequences of use-policy monitoring can themselves be detrimental. While proponents draw on utilitarian ethics to argue that monitoring enhances security policy compliance, opponents lean on deontological ethics to argue against it: deontological ethics holds that monitoring intrudes on employees' privacy and tends to hamper their work performance. No clear solution to this discourse has emerged. A survey was conducted to understand the extent of security policy monitoring, and the vulnerability principle was then explored as a panacea for enhancing use-policy monitoring in a way that satisfies all the stakeholders involved.

Author 1: Prosper K. Yeng
Author 2: Stephen D. Wolthusen
Author 3: Bian Yang

Keywords: Information security; vulnerability principle; ethics; security policy monitoring

Paper 4: Feeder Reconfiguration in Unbalanced Distribution System with Wind and Solar Generation using Ant Lion Optimization

Abstract: This paper proposes an approach to distribution system (DS) feeder reconfiguration (FRC) for balanced and unbalanced networks that minimizes the total cost of operation. Network reconfiguration is a feasible technique for enhancing system performance in low-voltage distribution systems. In this work, wind and solar photovoltaic (PV) units are selected as distributed energy resources (DERs) and are considered in the proposed FRC approach; the uncertainties related to DERs are modeled using probability analysis. In most cases the distribution system is unbalanced, and 3-phase transformers play a vital role since they have different configurations. This paper therefore proposes efficient power flow models for unbalanced distribution systems with various 3-phase transformer configurations. The proposed FRC approach is solved using the evolutionary Ant Lion Optimization (ALO) algorithm and is implemented on a 17-bus test system, considering balanced and unbalanced distribution systems with and without the renewable sources.

Author 1: Surender Reddy Salkuti

Keywords: Distributed energy resources; evolutionary algorithms; feeder reconfiguration; operational cost; optimization algorithms; three-phase transformers
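
Wind-generation uncertainty of the kind mentioned in the abstract is commonly modeled by sampling wind speeds from a Weibull distribution and mapping them through a piecewise turbine power curve. The cut-in/rated/cut-out speeds, rated power, and Weibull parameters below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def wind_power_mw(v, v_ci=3.0, v_r=12.0, v_co=25.0, p_rated=2.0):
    """Piecewise wind-turbine power curve (MW) for wind speeds v (m/s)."""
    v = np.asarray(v, dtype=float)
    p = np.zeros_like(v)
    ramp = (v >= v_ci) & (v < v_r)           # linear ramp: cut-in to rated
    p[ramp] = p_rated * (v[ramp] - v_ci) / (v_r - v_ci)
    p[(v >= v_r) & (v < v_co)] = p_rated     # rated output up to cut-out
    return p                                 # zero below cut-in / above cut-out

rng = np.random.default_rng(0)
v = 8.0 * rng.weibull(2.0, size=10_000)      # Weibull(k=2) speeds, scale 8 m/s
expected_mw = wind_power_mw(v).mean()        # Monte-Carlo expected output
```

The Monte-Carlo mean is one simple way such probabilistic DER models feed an expected-cost objective in a reconfiguration problem.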

Paper 5: Determinants of e-Commerce Use at Different Educational Levels: Empirical Evidence from Turkey

Abstract: The rapid spread of the internet has made e-commerce an essential and effective tool for commercial transactions. The purpose of this study is to investigate differences in e-commerce use between individuals in Turkey according to their educational levels and to specify the relationship between individuals' demographic characteristics and e-commerce use. Cross-sectional data obtained from the Household Information Technologies Usage Survey were used, and binary logistic regression analysis was applied to determine the factors associated with individuals' e-commerce use. The study concludes that income level, age, gender, occupation, region, social media use, use of internet banking, use of e-government, the number of information devices in a household, and the number of people in a household are all related to e-commerce use. In addition, these relationships differ according to individuals' educational levels: as education level increases, the tendency toward online shopping increases. A higher education level implies a higher income level, at both state and private institutions, and greater receptiveness to innovations, which naturally has a positive effect on individuals' online shopping behavior.

Author 1: Seyda Ünver
Author 2: Ömer Alkan

Keywords: Electronic commerce; online shopping; educational level; e-commerce; Turkey; binary logistic regression
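
Binary logistic regression, the method used in the study, can be sketched with plain gradient descent. The single "income-like" feature and the toy data points below are hypothetical, purely to show the mechanics.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Binary logistic regression fitted by batch gradient descent."""
    X = np.hstack([np.ones((len(X), 1)), X])     # prepend intercept column
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))         # predicted P(y = 1)
        w -= lr * X.T @ (p - y) / len(y)         # gradient of the log-loss
    return w

def predict(w, X):
    X = np.hstack([np.ones((len(X), 1)), X])
    return (1.0 / (1.0 + np.exp(-X @ w)) >= 0.5).astype(int)

# toy data: e-commerce use (1/0) driven by one hypothetical income feature
X = np.array([[1.0], [2.0], [3.0], [8.0], [9.0], [10.0]])
y = np.array([0, 0, 0, 1, 1, 1])
w = fit_logistic(X, y)
```

In the actual study each fitted coefficient (or its odds ratio) would be read as the association between a demographic factor and e-commerce use.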

Paper 6: Fuzzy based Techniques for Handling Missing Values

Abstract: Time series data usually suffers from a high percentage of missing values, which is related to its nature and collection process. This paper proposes a data imputation technique for imputing missing values in time series data. The fuzzy Gaussian and fuzzy triangular membership functions are used in a data imputation algorithm to identify the best imputation for the missing values: the membership functions calculate weights for the data values of the nearest neighbors, which are then used during the imputation process. The evaluation results show that the proposed technique outperforms traditional data imputation techniques, with the triangular fuzzy membership function showing higher accuracy than the Gaussian membership function.

Author 1: Malak El-Bakry
Author 2: Farid Ali
Author 3: Ayman El-Kilany
Author 4: Sherif Mazen

Keywords: Time series data; fuzzy logic; membership functions; machine learning; missing values
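
The membership-weighted imputation idea can be sketched as follows. The neighbor count, the membership parameters, and the use of time-index distance are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def triangular(d, width):
    """Triangular membership: 1 at distance 0, linearly down to 0 at width."""
    return np.maximum(0.0, 1.0 - d / width)

def gaussian(d, sigma):
    """Gaussian membership centered at distance 0."""
    return np.exp(-(d ** 2) / (2 * sigma ** 2))

def fuzzy_impute(series, idx, k=2, member=triangular, param=3.0):
    """Impute series[idx] as a membership-weighted mean of the k nearest
    observed neighbours (nearest in time index)."""
    obs = [(abs(j - idx), v) for j, v in enumerate(series)
           if j != idx and not np.isnan(v)]
    obs.sort(key=lambda t: t[0])
    d = np.array([t[0] for t in obs[:k]], dtype=float)
    v = np.array([t[1] for t in obs[:k]])
    w = member(d, param)                 # closer neighbours weigh more
    return float((w * v).sum() / w.sum())

ts = [1.0, 2.0, np.nan, 4.0, 5.0]
filled = fuzzy_impute(ts, 2)             # neighbours 2.0 and 4.0, equal weight
```

Swapping `member=gaussian` changes only the weighting shape, which is the comparison the abstract reports.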

Paper 7: Change Detection Method with Multi-temporal Satellite Images based on Wavelet Decomposition and Tiling

Abstract: A change detection method for multi-temporal satellite images based on wavelet decomposition with the Daubechies wavelet function (Multi-Resolution Analysis, MRA) and tiling is proposed. The method detects changes in time-series analysis and is not sensitive to geometric distortions in the satellite images. The author proposes MRA as the basis for extracting change points from satellite images acquired over many periods. Experimental results with a simulation image and a Landsat Thematic Mapper (TM) image show that the proposed method detects changes more appropriately than the existing subtraction method. When applied to simulated and real satellite images, the method was confirmed to be robust against minute nonlinear geometric distortion.

Author 1: Kohei Arai

Keywords: Daubechies wavelet; multi-resolution analysis: MRA; change detection; multi-temporal satellite image; geometric distortion; Landsat Thematic Mapper (TM) image
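
The decompose-then-compare-tiles idea can be sketched as below. For brevity this uses the Haar wavelet (Daubechies-1) rather than the higher-order Daubechies filter of the paper, and the tile size and threshold are illustrative assumptions.

```python
import numpy as np

def haar_approx(img):
    """One level of 2-D Haar decomposition; return the approximation band."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # average row pairs
    return (a[:, 0::2] + a[:, 1::2]) / 2.0    # then column pairs

def changed_tiles(img1, img2, tile=4, thresh=0.5):
    """Compare approximation coefficients tile by tile; coarse coefficients
    make the test less sensitive to small geometric distortions."""
    a1, a2 = haar_approx(img1), haar_approx(img2)
    n = a1.shape[0] // tile
    flags = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(n):
            d = np.abs(a1[i*tile:(i+1)*tile, j*tile:(j+1)*tile]
                       - a2[i*tile:(i+1)*tile, j*tile:(j+1)*tile]).mean()
            flags[i, j] = d > thresh
    return flags
```

Because each tile is judged on averaged (approximation) coefficients, a one-pixel shift between acquisitions barely moves the per-tile statistic, which is the robustness property the abstract claims.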

Paper 8: Comprehensive Analysis of Resource Allocation and Service Placement in Fog and Cloud Computing

Abstract: The voluminous data produced and consumed by digitalization needs resources that offer compute, storage, and communication facilities. To meet such demands, Cloud and Fog computing architectures are viable solutions, owing to their utility-style provisioning and broad accessibility. The success of any computing architecture depends on how efficiently its resources are allocated to service requests. Among existing survey articles on Cloud and Fog, issues like scalability and the time-critical requirements of the Internet of Things (IoT) are rarely addressed, and the proliferation of IoT also leads to energy crises. The proposed survey aims to build a Resource Allocation and Service Placement (RASP) strategy that addresses these issues. The survey recommends techniques like Reinforcement Learning (RL) and Energy Efficient Computing (EEC) in Fog and Cloud to increase the efficacy of RASP: while RL meets the time-critical requirements of IoT with high scalability, EEC empowers RASP by saving cost and energy. As most early works were carried out using reactive policies, the way is open to build RASP solutions using alternative policies. The findings of the survey help researchers focus their attention on the research gaps and devise a robust RASP strategy in Fog and Cloud environments.

Author 1: A. S. Gowri
Author 2: P.Shanthi Bala
Author 3: Immanuel Zion Ramdinthara

Keywords: Cloud; fog; reinforcement learning; energy-efficient computing; resource allocation; service placement

Paper 9: Performance Analysis of Deep Neural Network based on Transfer Learning for Pet Classification

Abstract: Deep learning frameworks have progressed beyond human recognition capabilities, and now is the time to optimize them for implementation on embedded platforms. Present deep learning architectures support learning, but they lack the flexibility to apply learned knowledge to tasks in unfamiliar domains. This work tries to fill this gap with a deep neural network-based solution for object detection in unrelated domains, with a focus on a reduced footprint for the developed model. Knowledge distillation provides efficient and effective teacher-student learning for a variety of visual recognition tasks: a lightweight student network can easily be trained under the guidance of high-capacity teacher networks. A teacher-student implementation on binary classes shows a 20% improvement in accuracy within the same training iterations using the transfer learning approach. The scalability of the student model is tested on binary, ternary, and multiclass problems, and performance is compared on the basis of inference speed. The results show that inference speed does not depend on the number of classes; for similar recognition accuracy, the inference speed is about 50 frames per second, or 20 ms per image. Thus, this approach can be generalized to the application requirement with minimal changes, provided the dataset formats are compatible.

Author 1: Bhavesh Jaiswal
Author 2: Nagendra Gajjar

Keywords: Machine learning; knowledge distillation; transfer learning; domain adaptation
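
The teacher-student objective behind knowledge distillation can be sketched as a blend of hard-label cross-entropy and KL divergence to the teacher's temperature-softened distribution (Hinton-style distillation). The temperature and mixing weight below are illustrative defaults, not the paper's training settings.

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax with temperature T."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, y_true, T=4.0, alpha=0.5):
    """alpha * hard cross-entropy + (1-alpha) * T^2 * KL(teacher || student)."""
    hard = -np.log(softmax(student_logits)[y_true] + 1e-12)
    p_t = softmax(teacher_logits, T)          # softened teacher distribution
    p_s = softmax(student_logits, T)          # softened student distribution
    soft = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)))
    return alpha * hard + (1 - alpha) * (T ** 2) * soft   # T^2 rescales grads
```

When the student matches the teacher the KL term vanishes, so the loss reduces to the weighted hard-label term.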

Paper 10: Multi-objective based Optimal Network Reconfiguration using Crow Search Algorithm

Abstract: This paper presents an optimal network reconfiguration (ONR)/feeder reconfiguration (FRC) approach that takes the minimization of total operating cost and system power losses as its objectives. ONR/FRC is a feasible approach for enhancing system performance in distribution systems (DSs): it alters the topological structure of feeders by changing the open/closed status of the tie and sectionalizing switches in the system. Apart from the power received from the main grid, this paper considers power from distributed generation (DG) sources such as wind energy generators (WEGs), solar photovoltaic (PV) units, and battery energy storage (BES) units. The proposed multi-objective ONR/FRC problem is solved using the multi-objective crow search algorithm (MO-CSA), and the methodology is implemented on two distribution systems (14-bus and 17-bus) with three feeders.

Author 1: Surender Reddy Salkuti

Keywords: Battery storage; distributed generation; evolutionary algorithms; network reconfiguration; renewable energy; uncertainty
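
The crow search metaheuristic named in the abstract can be sketched in its basic single-objective form (Askarzadeh's formulation); the paper uses a multi-objective variant over discrete switch states, while the sphere test function, awareness probability, and flight length below are illustrative assumptions.

```python
import numpy as np

def crow_search(f, bounds, n_crows=20, iters=200, ap=0.1, fl=2.0, seed=0):
    """Minimal single-objective Crow Search Algorithm sketch."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    dim = len(lo)
    x = rng.uniform(lo, hi, (n_crows, dim))       # current positions
    mem = x.copy()                                # each crow's best-known spot
    fmem = np.array([f(p) for p in mem])
    for _ in range(iters):
        for i in range(n_crows):
            j = rng.integers(n_crows)             # crow i follows crow j
            if rng.random() >= ap:                # j unaware: move toward m_j
                x[i] = x[i] + rng.random() * fl * (mem[j] - x[i])
            else:                                 # j aware: random relocation
                x[i] = rng.uniform(lo, hi, dim)
            x[i] = np.clip(x[i], lo, hi)
            fx = f(x[i])
            if fx < fmem[i]:                      # update memory on improvement
                mem[i], fmem[i] = x[i].copy(), fx
    return mem[fmem.argmin()], fmem.min()

best_x, best_f = crow_search(lambda p: float((p ** 2).sum()),
                             (np.array([-5.0, -5.0]), np.array([5.0, 5.0])))
```

A multi-objective version would replace the single memory value with Pareto-dominance bookkeeping, but the position-update rule is the same.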

Paper 11: Applying Synthetic Minority Over-sampling Technique and Support Vector Machine to Develop a Classifier for Parkinson’s disease

Abstract: As the number of Parkinson’s disease patients increases in the elderly population, it has become a critical issue to understand the early characteristics of Parkinson’s disease and to detect it as soon as possible during normal aging. This study minimized the class-imbalance issue by employing the Synthetic Minority Over-sampling Technique (SMOTE), developed eight Support Vector Machine (SVM) models for predicting Parkinson’s disease using different kernel types {(C-SVM or Nu-SVM)×(Gaussian, linear, polynomial, or sigmoid kernel)}, and compared the accuracy, sensitivity, and specificity of the developed models. The study evaluated 76 senior citizens with Parkinson’s disease (32 males and 44 females) and 285 healthy senior citizens without Parkinson’s disease (148 males and 137 females). The analysis showed that the linear kernel-based Nu-SVM had the highest sensitivity (62.0%), specificity (81.6%), and overall accuracy (71.3%). The major negative relationship factors of the Parkinson’s disease prediction model were MMSE-K, Stroop Test, Rey Complex Figure Test (RCFT), verbal memory test, ADL, IADL, being 70 years old or older, middle school graduation or below, and being a woman. When the influence of the variables was compared using “functional weight”, the RCFT was identified as the most influential variable for distinguishing Parkinson’s disease from healthy elderly. The results imply that a prediction model using linear kernel-based Nu-SVM would be more accurate than other kernel-based SVM models for handling imbalanced disease data.

Author 1: Haewon Byeon
Author 2: Byungsoo Kim

Keywords: Kernel type; Rey complex figure test; support vector machine; SMOTE; Parkinson’s disease
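
The SMOTE step used in the study can be sketched in a few lines: each synthetic sample is an interpolation between a minority sample and one of its k nearest minority neighbors. The neighbor count and the toy points below are illustrative, not the study's settings.

```python
import numpy as np

def smote(X_min, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples by interpolating each
    chosen sample toward one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        nbrs = np.argsort(d)[1:k + 1]            # skip the sample itself
        j = rng.choice(nbrs)
        gap = rng.random()                       # random point on the segment
        out.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(out)

X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_syn = smote(X_min, n_new=8)                    # 8 new minority points
```

Because new points lie on segments between existing minority samples, the minority class is enlarged without duplicating exact copies.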

Paper 12: FishDeTec: A Fish Identification Application using Image Recognition Approach

Abstract: Underwater imagery processing is in high demand, especially for fish species identification, an activity important not only for biologists, scientists, and fishermen but also for education. It has been reported that there are more than 200 species of freshwater fish in Malaysia. Many attempts have been made to develop fish recognition and classification via image processing; however, most existing work targets saltwater fish species and specific groups of users. This research focuses on the development of a prototype system named FishDeTec to detect freshwater fish species found in Malaysia through an image processing approach. The proposed predictive model of FishDeTec is developed using VGG16, a deep Convolutional Neural Network (CNN) model for large-scale image classification. The experimental study indicates that the proposed model yields promising results.

Author 1: Siti Nurulain Mohd Rum
Author 2: Fariz Az Zuhri Nawawi

Keywords: Freshwater fish; fish species recognition; FishDeTec; Convolutional Neural Network (CNN); VGG16

Paper 13: Predicting the Anxiety of Patients with Alzheimer’s Dementia using Boosting Algorithm and Data-Level Approach

Abstract: Overfitting due to imbalanced data can cause prediction errors during machine learning and degrade a model's prediction performance (e.g., sensitivity), so in addition to selecting a machine learning algorithm suitable for the data, it is necessary to add a data sampling technique in the model development step. This study examined Alzheimer's patients living in South Korea to understand the predictors of anxiety, using boosting algorithms (AdaBoost and XGBoost) together with data-level approaches (raw data, undersampling, oversampling, and SMOTE), and identified the configuration with the best prediction performance. We analyzed 253 elderly people diagnosed with Alzheimer's disease (aged 60 to 74) who visited rehabilitation hospitals for early dementia screening. The study developed models for predicting the anxiety of Alzheimer's dementia patients using AdaBoost and XGBoost and compared their prediction performance (accuracy, sensitivity, and specificity). XGBoost based on SMOTE (accuracy = 0.84, sensitivity = 0.85, specificity = 0.81) was identified as the model with the best prediction performance. Consequently, a SMOTE-XGBoost model may provide higher accuracy than a SMOTE-AdaBoost model when developing prediction models on outcome-imbalanced data such as disease data.

Author 1: Haewon Byeon

Keywords: Anxiety; AdaBoost; patients with Alzheimer's dementia; SMOTE; XGBoost
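
The three measures the study compares (accuracy, sensitivity, specificity) come straight from the binary confusion counts; the toy labels below are illustrative.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on positives), and specificity
    (recall on negatives) from binary confusion counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {"accuracy": (tp + tn) / len(y_true),
            "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
            "specificity": tn / (tn + fp) if tn + fp else 0.0}

m = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 1, 0])
# accuracy 6/8, sensitivity 2/3, specificity 4/5
```

On imbalanced disease data, sensitivity is the measure that collapses first, which is why the study pairs the boosting models with SMOTE.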

Paper 14: Deep Learning Hybrid with Binary Dragonfly Feature Selection for the Wisconsin Breast Cancer Dataset

Abstract: Breast cancer is the world’s top cancer affecting women, and its risk factors vary with place, lifestyle, and diet. Treatment procedures after a confirmed cancer case is discovered can reduce the risk of the disease. Unfortunately, breast cancers arising in low- and middle-income countries are diagnosed at a very late stage, at which the chances of survival are greatly reduced. Early detection is therefore required, not only to improve the accuracy of discovering breast cancer but also to increase the chances of choosing a successful treatment plan. Several studies have built software models utilizing machine learning and soft computing techniques for cancer detection. This research aims to build a model scheme that facilitates the detection of breast cancer and provides an exact diagnosis; improving the accuracy of the proposed model has therefore been one of the key aims of the study. The model is based on deep learning and intends to provide a framework that accurately separates benign and malignant breast tumors. The study optimizes the learning algorithm by applying the Dragonfly algorithm to select the best features and ideal parameter values for the deep learning model, and it compares the deep learning results against support vector machine (SVM), random forest (RF), and k-nearest neighbor (KNN) classifiers, chosen as among the most reliable algorithms in clinical data classification. The hybrid model of deep learning combined with the binary Dragonfly algorithm accurately classified benign and malignant breast tumors with fewer features; moreover, the deep learning model achieved better accuracy in classifying the Wisconsin Breast Cancer Database using all available features.

Author 1: Marian Mamdouh Ibrahim
Author 2: Dina Ahmed Salem
Author 3: Rania Ahmed Abdel Azeem Abul Seoud

Keywords: Breast cancer; Wisconsin data set; classifiers; deep learning; feature selection; dragonfly

Paper 15: Using Machine Learning Technologies to Classify and Predict Heart Disease

Abstract: Data mining techniques are used widely in the healthcare sector to predict and diagnose various diseases, and the diagnosis of heart disease is one of the most important applications of such systems. Large amounts of data are collected today, and in recent years heart disease has increased excessively, becoming one of the deadliest diseases in many countries. Most datasets suffer from extreme values that reduce classification accuracy; extreme values here mean irrelevant or incorrect data, missing values, and erroneous values in the dataset. Data transformation is another very important preprocessing step that converts data into a form suitable for mining models, using aggregation and filtering methods such as eliminating duplicate features via correlation and one of the wrapper methods, and applying repeated feature elimination. Missing values are handled through value-removal methods and imputation methods. Classification methods such as Naïve Bayes (NB) and Random Forest (RF) are applied both to the original datasets and to the datasets produced by the feature selection methods. All of these operations are carried out on three different heart disease datasets to analyze the effect of preprocessing in terms of accuracy.

Author 1: Mohammed F. Alrifaie
Author 2: Zakir Hussain Ahmed
Author 3: Asaad Shakir Hameed
Author 4: Modhi Lafta Mutar

Keywords: Classification; Naive Bayes (NB); Support Vector Machine (SVM); Random Forest; machine learning
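
Of the classifiers the abstract names, Naïve Bayes is the simplest to sketch from scratch; below is a minimal Gaussian NB on hypothetical two-feature data (the dataset, features, and smoothing constant are illustrative, not the paper's).

```python
import numpy as np

def fit_gnb(X, y):
    """Per-class feature means/variances and priors for Gaussian Naive Bayes."""
    model = {}
    for c in np.unique(y):
        Xc = X[y == c]
        model[c] = (Xc.mean(axis=0), Xc.var(axis=0) + 1e-9, len(Xc) / len(X))
    return model

def predict_gnb(model, x):
    """Pick the class maximizing log prior + sum of log Gaussian likelihoods."""
    best, best_lp = None, -np.inf
    for c, (mu, var, prior) in model.items():
        lp = np.log(prior) - 0.5 * np.sum(np.log(2 * np.pi * var)
                                          + (x - mu) ** 2 / var)
        if lp > best_lp:
            best, best_lp = c, lp
    return best

X = np.array([[1.0, 1.2], [0.9, 1.1], [1.1, 0.9],      # class 0 cluster
              [3.0, 3.2], [2.9, 3.1], [3.1, 2.9]])     # class 1 cluster
y = np.array([0, 0, 0, 1, 1, 1])
model = fit_gnb(X, y)
```

The "naive" independence assumption is what makes the per-feature Gaussian terms simply add in log space.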

Paper 16: Towards Natural Language Processing with Figures of Speech in Hindi Poetry

Abstract: Poems have always been an excellent way of expressing emotions in any language, and Hindi poetry in particular enjoys wide popularity among native and non-native speakers all over the world. A typical poem in Hindi is characterized by meter (“Chhand”), emotion (“Rasa”), and figure of speech (“Alankaar”). The present research work is the first of its kind in Hindi Natural Language Processing (NLP) to touch on the Hindi figure of speech. The authors have created a systematic hierarchical structure of Hindi “Alankaar” types and sub-types and have attempted to identify several of them automatically. A taxonomical list of 58 Hindi figures of speech is presented, along with their nearest mappings to English equivalents. The paper also presents the distinct rules for each type and sub-type needed for the NLP classification task. The authors report first results with 97% efficiency and an average execution time of 0.002 seconds.

Author 1: Milind Kumar Audichya
Author 2: Jatinderkumar R. Saini

Keywords: “Alankaar”; figure of speech; Hindi; Natural Language Processing (NLP); poetry

Paper 17: Formal Verification of an Efficient Architecture to Enhance the Security in IoT

Abstract: The Internet of Things (IoT) is one of the world's newest intelligent communication technologies. Several novel IoT architectures exist, but they still suffer from security and privacy challenges. Formal verification is a vital method for detecting potential weaknesses and vulnerabilities at an early stage. In this paper, a framework in the Event-B formal method is used to produce a formal description of a secure IoT architecture covering its security properties, and various Event-B facilities such as formal verification, functional checks, and model checking are used to model different spoofing attacks on the IoT environment. Additionally, the accuracy of the IoT architecture is established by executing different Event-B runs such as simulation, proof obligation, and invariant checking. By applying formal verification, functional checks, and model checking, the verified models of the IoT-EAA architecture automatically discharged 82.35% of the proof obligations through the different Event-B provers. Finally, the paper introduces a well-defined IoT security infrastructure to address and reduce the security challenges.

Author 1: Eman K. Elsayed
Author 2: L. S. Diab
Author 3: Asmaa. A. Ibrahim

Keywords: Internet of things (IoT); IoT architecture; IoT security; formal modeling and verification; Event-B

Paper 18: Conceptual Model with Built-in Process Mining

Abstract: Process mining involves discovering, monitoring, and improving real processes by extracting knowledge from event logs in information systems. Process mining has become an important topic in recent years, as evidenced by a growing number of case studies and commercial tools. Current studies in this area assume that event records are created separately from a conceptual model (CM). Techniques are then used to discover missing processes and conformance with the CM, as well as for checks and enhancements. By contrast, in this paper we focus on modeling events as part of a tight multilevel CM that includes a static description, dynamics, events-log scheme, and monitoring and control system. If there is an out-of-model event log, it is treated as a requirement needed to build or enrich the CM. The motivation for such a unified system is our thesis that process mining is an essential component of a CM with built-in mining capabilities to perform self-process mining and attain completeness. Accordingly, our proposed conceptual model facilitates collecting data generated about itself. The resultant framework emphasizes an integrated representation of systems to include process-mining functionalities. Case studies that start with event logs are recast to evolve around a model-first approach that is not limited to the initial event log. The result presents a framework that achieves the aims of process mining in a more comprehensive way.

Author 1: Sabah Al-Fedaghi

Keywords: Process-mining techniques; event log; conceptual modeling; static model; events model; behavioral model

Paper 19: Modeling a Functional Engine for the Opinion Mining as a Service using Compounded Score Computation and Machine Learning

Abstract: The ever-growing use of digital platforms across many kinds of applications, primarily the collaborative platforms of e-commerce, e-learning, social media, and blogging, produces a large corpus of unstructured text data. Many potential strategic solutions require accurate and fast classification of the hidden patterns in this opinion text corpus, and on-premise applications face various real-time feasibility constraints; offering Opinion Mining as a Service on cloud platforms is therefore a new research domain. This paper proposes a design framework for the evolution of a classification engine for opinion mining using score-based computation with a customized VADER algorithm. For scalability, a machine learning model supporting classification of a large corpus of unstructured text is added. The models are validated on various complex, unstructured text datasets using performance metrics of cumulative score, learning rate, loss function, and specificity. These metrics indicate the models' stability and scalability as well as their accuracy and robustness across different datasets.

Author 1: Rajeshwari D
Author 2: Puttegowda. D

Keywords: Text mining; opinion; sentiments; machine learning; unstructured data; cloud services

PDF
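The compounded-score idea can be illustrated with a toy lexicon-based scorer in the spirit of VADER's compound score. The tiny lexicon, negation rule, and normalization constant below are illustrative assumptions, not the paper's customized algorithm.

```python
import math

# Invented mini-lexicon: word -> valence (VADER-style, roughly -4..+4).
LEXICON = {"good": 1.9, "great": 3.1, "bad": -2.5, "terrible": -3.4}

def compound(text):
    """Sum word valences (flipping after 'not'), then squash into [-1, 1]."""
    words = text.lower().split()
    total = 0.0
    for i, w in enumerate(words):
        score = LEXICON.get(w, 0.0)
        if i > 0 and words[i - 1] == "not":  # naive negation handling
            score = -score
        total += score
    # Normalization similar in shape to VADER's compound normalization.
    return total / math.sqrt(total * total + 15)
```

The real VADER additionally handles intensifiers, punctuation, and capitalization; a scalable service would batch such scoring behind a cloud API.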

Paper 20: Model for Predicting Customer Desertion of Telephony Service using Machine Learning

Abstract: This study addresses customer desertion of telephony services, where subscribers leave for different reasons. A model based on decision trees is proposed to predict potential dropouts among customers of a telecommunications company's telephone service. To verify the results, several algorithms were compared, including neural networks, support vector machines, and decision trees. The predictive models were designed in the KNIME software, and quality was evaluated as the percentage of correct predictions of the target variable. The model's results enable proactive customer retention and improvement of the services provided. A data set with 21 predictor variables that influence customer churn was used, along with a dependent variable (churn) indicating whether the customer left (1) or did not leave (0) the company's service. On a test data set, the results reach a precision of 91.7%, which indicates that decision trees are an attractive alternative for developing customer attrition prediction models on this type of data, owing to the simplicity of interpreting the results.

Author 1: Carlos Acero-Charaña
Author 2: Erbert Osco-Mamani
Author 3: Tito Ale-Nieto

Keywords: Software KNIME; Support Vector Machine (SVM); neural networks; decision trees

PDF
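Why tree-based rules are easy to interpret can be shown with a minimal single-split "decision stump" for churn (1 = left, 0 = stayed). The data and the single predictor below are invented, not the paper's 21-variable dataset.

```python
def fit_stump(xs, ys):
    """Find the one-feature threshold rule that best separates churners."""
    best = (None, 0.0)
    for t in sorted(set(xs)):
        preds = [1 if x >= t else 0 for x in xs]          # churn at/above t
        acc = sum(p == y for p, y in zip(preds, ys)) / len(ys)
        preds_rev = [1 if x < t else 0 for x in xs]       # churn below t
        acc_rev = sum(p == y for p, y in zip(preds_rev, ys)) / len(ys)
        for rule, a in ((("ge", t), acc), (("lt", t), acc_rev)):
            if a > best[1]:
                best = (rule, a)
    return best

# Hypothetical feature: support calls in the last month, with churn labels.
calls = [0, 1, 1, 2, 5, 6, 7, 8]
churn = [0, 0, 0, 0, 1, 1, 1, 1]
rule, acc = fit_stump(calls, churn)
# Yields the readable rule "churn if calls >= 5" on this separable toy set.
```

A full decision tree recursively applies such splits, which is what makes the resulting model explainable to retention teams.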

Paper 21: Evaluating Software Quality Attributes using Analytic Hierarchy Process (AHP)

Abstract: The use of quality software is important to stakeholders, and demand for it is increasing. This work focuses on meeting software quality from the users' and developers' perspectives. After a review of some existing software-quality models, twenty-four software quality attributes addressed by ten models (McCall's, Boehm's, ISO/IEC, FURPS, Dromey's, Kitchenham's, Ghezzi's, Georgiadou's, Jamwal's, and Glibb's) were identified. We further categorized the twenty-four attributes into a group of eleven (11) main attributes and another group of thirteen (13) sub-attributes. Thereafter, questionnaires were administered to twenty experts from fields including cybersecurity, programming, software development, and software engineering. The Analytic Hierarchy Process (AHP) was applied as a multi-criteria decision-making assessment of the questionnaire responses to select the suitable software quality attributes for the development of the proposed quality model, meeting both users' and developers' software quality requirements. The results of the assessment showed Maintainability to be the most important quality attribute, followed by Security, Testability, Reliability, Efficiency, Usability, Portability, Reusability, Functionality, Availability, and finally, Cost.

Author 1: Botchway Ivy Belinda
Author 2: Akinwonmi Akintoba Emmanuel
Author 3: Nunoo Solomon
Author 4: Alese Boniface Kayode

Keywords: Analytic Hierarchy Process (AHP); software quality; quality attribute; quality model; sub-attributes

PDF
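The AHP weighting step can be sketched with the standard normalized-column-average approximation of the priority vector. The 3x3 pairwise matrix below, comparing three of the attributes for illustration, contains invented Saaty-scale judgments, not the study's expert responses.

```python
def ahp_weights(m):
    """Approximate AHP priorities: normalize each column, average each row."""
    n = len(m)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [sum(m[i][j] / col_sums[j] for j in range(n)) / n for i in range(n)]

# Hypothetical judgments: maintainability vs. security vs. usability.
# m[i][j] = how much more important attribute i is than attribute j.
matrix = [
    [1,     3,   5],
    [1 / 3, 1,   2],
    [1 / 5, 1 / 2, 1],
]
w = ahp_weights(matrix)
# Weights sum to 1 and rank maintainability first, matching the matrix.
```

In a full AHP study the consistency ratio of each matrix would also be checked before accepting the weights.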

Paper 22: SVM Machine Learning Classifier to Automate the Extraction of SRS Elements

Abstract: The extraction of software entities such as system, use case, and actor from an English natural-language description of a user's software requirements is a linguistic and semantic process within natural language processing. Entity extraction is known among researchers in linguistics and computation to be a complicated and challenging problem, due to the ambiguities of natural languages. This paper presents a named entity recognition method called SyAcUcNER (System Actor Use-Case Named Entity Recognizer) for extracting the system, actor, and use case entities from unstructured English descriptions of users' software requirements. SyAcUcNER uses one of the machine learning (ML) approaches, the Support Vector Machine (SVM), as an effective classifier. SyAcUcNER also uses a semantic role labeling process to tag the words in the text of user software requirements. SyAcUcNER is the first work to define the structure of an NER specialized for requirements engineering, the first to use a specialized NER model to extract actor and use case entities from English-language requirements descriptions, and the first to use an SVM to specify the semantic meanings of words in a certain domain of discourse, namely the Software Requirements Specification (SRS). The performance of SyAcUcNER, which utilizes WEKA's SVM, is evaluated using a binomial technique, and the results gained from running SyAcUcNER on text corpora from assorted sources give weighted averages of 76.2% for precision, 76% for recall, and 72.1% for the F-measure.

Author 1: Ayad Tareq Imam
Author 2: Aysh Alhroob
Author 3: Wael Jumah Alzyadat

Keywords: Information extraction; named entity recognition; machine learning; support vector machine; software requirement specification; WEKA; I-CASE

PDF

Paper 23: Artificial Intelligence: Machine Translation Accuracy in Translating French-Indonesian Culinary Texts

Abstract: The use of machine translation as artificial intelligence (AI) keeps increasing, and the world's most popular translation tool is Google Translate (GT). The tool is not merely used for learning and obtaining information from foreign languages through translation; it has also been used as a medium of interaction and communication in hospitals, airports, and shopping centers. This paper explores machine translation accuracy in translating French-Indonesian culinary texts (recipes). The samples of culinary text were taken from the internet. The research results show that the semiotic model of machine translation in GT translates from the signifier (forms) of the source language to the signifier (forms) of the target language by emphasizing the equivalence of the concept (signified) between the source and target languages. GT helps translate existing French-Indonesian culinary text concepts through words, phrases, and sentences. A problem encountered in machine translation of culinary texts is cultural equivalence: GT cannot accurately identify the cultural context of the source and target languages, so its results take the form of literal translation. However, the accuracy of GT can be improved by refining the translation of cultural equivalents through words, phrases, and sentences from one language to another.

Author 1: Muhammad Hasyim
Author 2: Firman Saleh
Author 3: Rudy Yusuf
Author 4: Asriani Abbas

Keywords: Machine translation; Google translation; accuracy; culinary texts; artificial intelligence

PDF

Paper 24: A Generic Approach for Allocating Movement Permits During/Outside Curfew Period during COVID-19

Abstract: During the coronavirus disease (COVID-19) pandemic, new solutions, technical components, smartphone applications, and novel wireless services are needed to adapt to the new lifestyle standard that emerged from the crisis. In this context, social distancing was imposed to prevent, or decrease significantly, further transmission of COVID-19. Research results have shown that slowing the spread of COVID-19 saves lives and relieves the burden on health-care systems, and social distancing can be tracked using cell-phone movement data. This paper presents a new approach/algorithm for allocating and optimizing movement permits during and outside curfew periods inside workplaces, buildings, companies, and institutions. The approach is an effective tool to reduce the spread of COVID-19 by promoting health safety during the pandemic, especially in places where social distancing can be difficult. Consequently, this paper presents a technological solution to automate the process of granting movement permits in workplaces. The results showed that the proposed strategy of social distancing inside buildings is effective enough to flatten the curve, without health authorities having to recommend staying home to slow the spread of COVID-19. The paper thus introduces a solution for the resource-sharing (resource allocation) problem, in which multiple agents (people or robots) of a system attempt to move reliably in their environment while avoiding collisions (infections). The experiments carried out in this paper showed the high performance of the designed algorithm in complying with COVID-19 social distancing regulations.

Author 1: Yaser Chaaban

Keywords: COVID-19 pandemic; social distancing; resource allocation problem; movement permits

PDF

Paper 25: Internet of Things Security: A Review of Enabled Application Challenges and Solutions

Abstract: The Internet of Things (IoT) has been widely used in every aspect of life. The rapid development of IoT technologies raises concerns regarding security and privacy. IoT security is critical to preserving the privacy and reliability of users' private information, and the privacy concern has become the biggest barrier to further adoption of IoT technology. This paper reviews IoT application areas in smart cities, smart homes, and smart healthcare from the point of view of security and privacy and presents the relevant challenges. In addition, we present potential tools to ensure the security and preservation of privacy for IoT applications. Furthermore, a review of relevant research studies is carried out, discussing the security of IoT infrastructure, the protocols, the challenges, and the solutions. Finally, we provide insight into challenges in current research and recommendations for future work. The reviewed IoT applications have made life easier, but IoT devices that use unencrypted networks are increasingly coming under attack by malicious hackers, leading to access to sensitive personal data. There is still time to protect devices better by pursuing security solutions for this technology. The results illustrate several technological and security challenges, such as malware, secure privacy management, and non-secure infrastructure for cloud storage, that still require effective solutions.

Author 1: Mona Algarni
Author 2: Munirah Alkhelaiwi
Author 3: Abdelrahman Karrar

Keywords: Internet of things; internet of things application; internet of things privacy; internet of things architecture; internet of things security; challenges; security protocol

PDF

Paper 26: Customer Retention: Detecting Churners in Telecoms Industry using Data Mining Techniques

Abstract: Customers are increasingly concerned with the quality of services that companies can provide. Customer churn is the percentage of subscribers who stop their subscriptions, or the proportion of customers who discontinue using a firm's product or service within a given time frame. Services offered by various providers or sellers are not very distinct, which raises rivalry between firms to maintain and upgrade the quality of their services. This paper aims at manifesting the effect of service quality on customer satisfaction and at churn prediction, to reveal customers who intend to leave a service. Predictive models can quantify the effect of service quality on customer satisfaction for the correct, timely determination of possible churners, so that a retention solution can be provided. This paper analyses the impact of service quality and prediction models that depend on data mining (DM) techniques. The present model comprises the following steps: data pre-processing, feature selection, data sampling, classifier training, testing for prediction, and prediction output. A data set with 17 attributes and 5000 records was used, randomly split into 75% for training the model and 25% for testing. The DM techniques applied in this paper are the Boruta algorithm, C5.0, neural networks, Support Vector Machines, and random forests, via the open-source software R and WEKA.

Author 1: Mahmoud Ewieda
Author 2: Essam M Shaaban
Author 3: Mohamed Roushdy

Keywords: Quality of service; churn prediction; classification; data mining; prediction model; customer retention

PDF

Paper 27: Speech-to-Text Conversion in Indonesian Language Using a Deep Bidirectional Long Short-Term Memory Algorithm

Abstract: Nowadays, speech is also used for communication between humans and computers, which requires conversion from speech to text. Nevertheless, few studies have been performed on speech-to-text conversion in the Indonesian language, and most studies on speech-to-text conversion were limited to speech datasets with incomplete sentences. In this study, speech-to-text conversion of complete sentences in Indonesian is performed using a deep bidirectional long short-term memory (LSTM) algorithm. Spectrograms and Mel frequency cepstral coefficients (MFCCs) were utilized as features of a total of 5000 speech samples spoken by ten subjects (five males and five females). The results showed that the deep bidirectional LSTM algorithm successfully converted speech to text in Indonesian. The accuracy achieved with the MFCC features was higher than that achieved with the spectrograms: the MFCCs obtained the best accuracy, with a word error rate of 0.2745%, compared with 2.0784% for the spectrograms. Thus, MFCCs are more suitable than spectrograms as features for speech-to-text conversion in Indonesian. The results of this study will help in the implementation of communication tools in Indonesian and other languages.

Author 1: Suci Dwijayanti
Author 2: Muhammad Abid Tami
Author 3: Bhakti Yudho Suprapto

Keywords: Speech-to-text; Deep Bidirectional Long Short-Term Memory (LSTM); spectrogram; Mel frequency cepstral coefficients (MFCC); word error rate

PDF
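The word error rate reported above is conventionally computed as the word-level Levenshtein distance divided by the reference length. A self-contained sketch of that metric (the test sentences elsewhere are invented, not the study's Indonesian corpus):

```python
def wer(reference, hypothesis):
    """Word error rate: edit distance over words / reference word count."""
    r, h = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance table over word sequences.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(r)][len(h)] / len(r)
```

One substituted word in a four-word reference yields a WER of 0.25, i.e. 25%.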

Paper 28: Smart Internet of Vehicles Architecture based on Deep Learning for Occlusion Detection

Abstract: Nowadays, the cyber world is developing rapidly due to the revolution of smart cities and machine learning technologies, and the Internet of Things constitutes the essential background of cyber technology. As a case study, the Internet of Vehicles (IoV) is one of its leading applications and is developing quickly. Studies focus on resolving issues related to real-time constraints and privacy leakage. Uploading data to the cloud during the data collection step is the origin of delay issues and decreases the level of privacy. The objective of the present paper is to ensure a high level of privacy and accelerated data collection. We propose an advanced Internet of Vehicles architecture to conduct the data collection step, and an occlusion detection application based on a deep learning technique is performed to evaluate the IoV architecture. Training data at the Distributed Intelligent layer not only ensures the privacy of the data but also reduces the delay.

Author 1: Shaya A. Alshaya

Keywords: Internet of vehicle; deep learning; collaborative technologies; cloud; edge computing

PDF

Paper 29: Smart Home Energy Management System

Abstract: The home energy management system has been selected as an attractive research issue due to its ability to enhance energy security by including devices, entertainment systems, security systems, environmental controls, etc. Home automation is incorporated as a potential technology to ensure efficient, uninterrupted electricity performance, solve power-demand problems, and coordinate devices with innovative technologies. In this context, our proposal seeks to implement an accurate home energy management system. The proposed approach aims to improve uninterrupted electricity production and provide comfortable services to families. To implement correct system operations and meet each device's power demand, a Real-Time Energy Management System (RT-EMS) is implemented and discussed through the required tasks using a Multi-Agent System (MAS). Each agent is determined according to criteria chosen to implement the appropriate design and meet each device's power demand. The obtained results show that the proposed system meets the general objectives of the RT-EMS.

Author 1: Yasser AL Sultan
Author 2: Ben Salma Sami
Author 3: Bassam A. Zafar

Keywords: Home energy management; multi-agent system; real-time system; energy recovery

PDF

Paper 30: Trade-off between Energy Consumption and Transmission Rate in Mobile Ad-Hoc Network

Abstract: Mobile Ad-Hoc Networks are decentralized systems of mobile nodes where each node is responsible for computing and retaining the routing and topology details. The autonomous nature of the nodes puts the emphasis on how power consumption is handled, which raises the question of how to improve power efficiency and thus extend the battery life of the mobile nodes. It is important to balance power and transmission rates to improve the network lifetime and reduce sudden node failures. In this paper, a new transmission power control-based scheme is proposed that allows mobile nodes to achieve a trade-off between transmission rate and power consumption. Each node defines and updates two tables containing the average transmission rate of its neighboring nodes and the number of times each neighbor has been used for data transmission. We validate the proposed scheme using several test cases.

Author 1: Ashraf Al Sharah
Author 2: Mohammad Alhaj
Author 3: Firas Al Naimat

Keywords: Coalition; MANETs; power-aware routing; power consumption; transmission power control-based; transmission rate

PDF
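The two per-node tables can be sketched as follows: a running average transmission rate per neighbor and a usage counter, combined into a score that steers traffic away from heavily used (energy-depleted) neighbors. The scoring formula and the sample values are illustrative assumptions, not the paper's scheme.

```python
rate = {}   # neighbor id -> running average transmission rate (e.g. kbps)
usage = {}  # neighbor id -> times this neighbor was used for forwarding

def record(neighbor, observed_rate):
    """Update the two tables after using a neighbor for one transmission."""
    n = usage.get(neighbor, 0)
    avg = rate.get(neighbor, 0.0)
    rate[neighbor] = (avg * n + observed_rate) / (n + 1)
    usage[neighbor] = n + 1

def pick_neighbor():
    """Favor high rate but penalize heavy prior use to spread energy drain."""
    return max(rate, key=lambda v: rate[v] / (1 + usage[v]))

record("A", 100.0)
record("B", 90.0)
record("A", 100.0)      # A is faster but has now been used twice
best = pick_neighbor()  # the penalty shifts the choice toward B
```

Varying the penalty term would tune the scheme toward throughput or toward battery longevity.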

Paper 31: Real-Time Intelligent Thermal Comfort Prediction Model

Abstract: Real-time prediction model of indoor thermal comfort depending on Momentum Back Propagation (MBP) function is established by using Arduino hardware and mobile application. The air temperature indoor, air velocity, and relative humidity are gathered via temperature sensor and transferred via Bluetooth to the mobile application to predicate thermal comfort. A significant challenge in designing MBP is to decide the best architecture and parameters as the number of layers and nodes, and number of epochs for the network given the data for the AI issues. These parameters are usually selected on heuristic and fine-tuned manually, which could be as boring as the performance assessment may take hours to test the output of a single MBP parameterization. This paper tends to the issue of determining appropriate parameters for the MBP by applying chicken swarm optimalization (CSO) algorithm. The CSO algorithm simulates the chicken swarm searching for the best parameter employs the Fitness function of these parameters which yielding minimum error and high accuracy. The proposed accuracy approximately equals 98.3% when using the best parameters obtained from Chicken Swarm Optimization (CSO). The proposed methodology performance is assessed on the collected dataset from weather archive and in the context of thermal comfort prediction, that mapping relations between the indoor features and thermal index.

Author 1: Farid Ali Mousa
Author 2: Heba Hamdy Ali

Keywords: Thermal comfort; chicken swarm optimization; momentum back propagation; neural network; bio-inspired optimization algorithm

PDF

Paper 32: Potential Data Collections Methods for System Dynamics Modelling: A Brief Overview

Abstract: System Dynamics (SD) modelling is a highly complex process. Although the SD methodology has been discussed extensively in both seminal and present literature, data collection methods for SD modelling are not explained in detail in most studies. To date, comprehensive descriptions of knowledge extraction for SD modelling remain scarce in the literature. In an attempt to fill this gap, the three primary groups of data sources proposed by Forrester, (1) the mental database, (2) the written database, and (3) the numerical database, were reviewed, including the potential data collection methods for each database, taking into account the advancement of current computer and information technology. The contributions of this paper are threefold. First, it highlights potential data sources that deserve to be acknowledged and reflected in the SD domain. Second, it provides insights into the appropriate mix and match of data collection methods for SD development. Third, it provides a practical synthesis of potential data sources and their suitability according to SD modelling stage, which can serve as a modelling practice guideline.

Author 1: Aisyah Ibrahim
Author 2: Hamdan Daniyal
Author 3: Tuty Asmawaty Abdul Kadir
Author 4: Adzhar Kamaludin

Keywords: System dynamics modelling; data collection methods; data source; system dynamics methodology

PDF

Paper 33: Novel Modelling of the Hash-based Authentication of Data in Dynamic Cloud Environment

Abstract: A datacenter in a cloud environment houses a massive quantity of data in a distributed manner. However, with the increasing number of threats, such as data deduplication attacks, over the cloud environment, it is quite challenging to ascertain full-fledged data security, so data integrity and security are highly questionable. A review of existing literature shows that existing solutions are not well suited to meet the security demands of distributed storage systems concerning data integrity, owing to inferior authentication mechanisms. Moreover, the most frequently used public-key encryption is not well suited to resource-constrained devices. Therefore, this manuscript presents a model of data authentication in which a simplified hashing scheme schedules a distributed chain of data. The idea is to perform dynamic authentication in the presence of any form of adversary. The proposed scheme is lightweight and offers a cross-verifiable hash-based challenge-matching scheme, with non-repudiation of transactions provided through the inclusion of a cloud auditor unit. The experiment was carried out on a numerical computing tool, considering data volume, verification count, and verification delay as the prime performance metrics. The simulation outcomes show that the proposed system achieves better security performance and is more flexible than the existing system.

Author 1: Anil Kumar G
Author 2: Shantala C.P

Keywords: Cloud computing; data deduplication; data integrity; data privacy; data security

PDF
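The general idea of a hash-based chain of data can be sketched with Python's `hashlib`: each link commits to the previous digest, so tampering with any record breaks verification from that point on. This is a generic sketch of chained hash authentication, not the paper's proposed scheme.

```python
import hashlib

GENESIS = b"\x00" * 32  # fixed starting digest

def build_chain(blocks):
    """Chain data blocks: each digest covers the previous digest + the data."""
    prev, chain = GENESIS, []
    for data in blocks:
        digest = hashlib.sha256(prev + data).digest()
        chain.append((data, digest))
        prev = digest
    return chain

def verify_chain(chain):
    """Recompute every link; any mismatch means tampering."""
    prev = GENESIS
    for data, digest in chain:
        if hashlib.sha256(prev + data).digest() != digest:
            return False
        prev = digest
    return True

chain = build_chain([b"record-1", b"record-2", b"record-3"])
ok = verify_chain(chain)           # intact chain verifies

tampered = list(chain)
tampered[1] = (b"record-X", tampered[1][1])  # alter the middle record
bad = verify_chain(tampered)       # verification fails
```

An auditor holding only the final digest can detect modification of any earlier record, which is the non-repudiation property the abstract alludes to.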

Paper 34: Cybersecurity Awareness Level: The Case of Saudi Arabia University Students

Abstract: Cybersecurity plays an important role in our reliance on digital equipment and programs to manage daily chores, including the transmission and storage of personal information. It is therefore a global issue in our growing society, and it becomes increasingly important to measure and analyze awareness of it. In this paper, a questionnaire was designed to measure the current level of cybersecurity awareness (CSA) among Saudi university students, adapted from a few previous cybersecurity awareness campaigns. A total of 136 students participated in the survey. The questionnaire measured students' cybersecurity awareness through their knowledge, culture, and surrounding environment, and through their behavior, across three factors: gender, location, and study department. The findings reveal that students' awareness is average overall; there is no significant difference in cybersecurity awareness level between male and female students, although females show slightly more concern about cybersecurity. However, students of computer and information technology departments show clearly higher awareness compared to others. Moreover, urban students outperformed students from remote areas in cybersecurity awareness. The survey results indicate that the study model is effective in measuring students' awareness.

Author 1: Wejdan Aljohni
Author 2: Nazar Elfadil
Author 3: Mutsam Jarajreh
Author 4: Mwahib Gasmelsied

Keywords: Cybersecurity; awareness; protection; internet; students; higher education; security awareness; survey; APAT

PDF

Paper 35: Comparative Analysis of Secured Hash Algorithms for Blockchain Technology and Internet of Things

Abstract: Cryptography algorithms play a vital role in information security and management. To test the credibility and reliability of metadata exchanged between the sender and the recipient in IoT applications, different algorithms must be used. Hashing is also used for electronic signatures, and various algorithms offer different safety levels based on how hard they are to break. SHA-1, SHA-2, SHA-3, MD4, and MD5 are still the most widely accepted hash protocols. This article discusses the relevance of hash functions and presents a comparative study of different cryptographic techniques using blockchain technology. Cloud storage poses some of the most daunting issues, such as guaranteeing the confidentiality of encrypted data on virtual machines. Several protection challenges exist in the cloud, including encryption, integrity, and secrecy, and different encryption strategies seek to solve these data-protection problems to an immense degree. This article focuses on a comparative analysis of the SHA family and MD5 based on speed of operation and security concerns, and on the need for using a Secure Hash Algorithm.

Author 1: Monika Parmar
Author 2: Harsimran Jit Kaur

Keywords: Blockchain Technology; IoT; Secured Hash Algorithms; IoT Security; SHA; MD5

PDF
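One visible difference between MD5 and the SHA family, their digest sizes, can be checked directly with Python's `hashlib` (a minimal sketch; the paper's speed comparison is omitted here, and the message is arbitrary):

```python
import hashlib

message = b"comparative analysis of secured hash algorithms"

# Hex digests of the algorithms discussed in the paper.
digests = {
    "md5": hashlib.md5(message).hexdigest(),         # 128-bit digest
    "sha1": hashlib.sha1(message).hexdigest(),       # 160-bit digest
    "sha256": hashlib.sha256(message).hexdigest(),   # 256-bit (SHA-2 family)
    "sha3_256": hashlib.sha3_256(message).hexdigest(),  # 256-bit (SHA-3)
}
```

MD5's shorter digest (and known collision attacks) is why the SHA-2/SHA-3 families are preferred for blockchain and IoT integrity checks, despite MD5's speed.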

Paper 36: Proof-of-Review: A Review based Consensus Protocol for Blockchain Application

Abstract: Blockchain is considered one of the most disruptive technologies of the last two decades and has drawn attention from research and industrial communities. Blockchain is basically a distributed ledger with immutable records, mostly utilized to perform transactions across various nodes after achieving mutual consensus among all associated nodes. The consensus protocol is a core component of Blockchain technology, playing a vital role in Blockchain's success, global emergence, and disruption capability. Many consensus protocols, such as PoW, PoS, and PoET, have been proposed to make Blockchain efficient enough to meet real-time application requirements. However, these protocols have their respective limitations of low throughput and high latency, and they sacrifice scalability. These limitations motivated this research team to introduce a novel review-based consensus protocol called Proof-of-Review, which aims to establish an efficient, reliable, and scalable Blockchain. The "review" in the proposed protocol refers to the community's trust in a node, which depends entirely on the node's previous behavior within the network, including its previous transactions and interactions with other nodes. These reviews eventually become the trust value gained by the node: the more positive the reviews, the more trustworthy the node is considered in the network, and vice versa. The most trustworthy node is selected to become the round leader and is allowed to publish a new block. The architecture of the proposed protocol is based on two parallel, linked chains: a Transaction Chain and a Review Chain. The transaction chain stores the transactions, whereas the review chain stores the reviews, which are analyzed with an NLP algorithm to find the round leader for the next round.

Author 1: Dodo Khan
Author 2: Low Tang Jung
Author 3: Manzoor Ahmed Hashmani

Keywords: Blockchain; consensus protocol; transaction chain; review chain; proof-of-review; PoW; PoS

PDF
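The leader-selection step described above can be sketched as a fold over the review chain: each node's trust value is the net of its positive and negative reviews, and the most trusted node publishes the next block. The +1/-1 scoring below is an illustrative assumption standing in for the paper's NLP-based review analysis.

```python
def trust_scores(reviews):
    """reviews: (node_id, is_positive) pairs taken from the review chain."""
    scores = {}
    for node, positive in reviews:
        scores[node] = scores.get(node, 0) + (1 if positive else -1)
    return scores

def round_leader(reviews):
    """The node with the highest net trust becomes the round leader."""
    scores = trust_scores(reviews)
    return max(scores, key=scores.get)

# Hypothetical review chain contents.
review_chain = [("n1", True), ("n2", True), ("n2", True),
                ("n1", False), ("n3", True)]
leader = round_leader(review_chain)  # n2, with net trust +2
```

A production protocol would additionally need tie-breaking and defenses against review spam, which the trust model itself is meant to discourage.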

Paper 37: Movement Control of Smart Mosque’s Domes using CSRNet and Fuzzy Logic Techniques

Abstract: Mosques are places of worship of Allah and must be kept clean and immaculate and provide every comfort to the worshippers in them. The Prophet's Mosque in Medina, Saudi Arabia, is one of the most important mosques for Muslims, occupying second place after the Sacred Mosque in Mecca, Saudi Arabia, and it is constantly crowded with Muslims visiting the Prophet Mohammad's tomb. This paper proposes a smart dome model that preserves fresh air and allows sunlight to enter the mosque using artificial intelligence techniques. The proposed model controls dome movements based on the weather conditions and the overcrowding rates in the mosque. The data were collected from two different resources: the historical weather database of Saudi Arabia and the Shanghai Technology Database. The Congested Scene Recognition Network (CSRNet) and fuzzy logic techniques were applied using the Python programming language to control the domes, opening and closing them for specific periods to renew the air inside the mosque. The model consists of several connected parts that control the mechanism of opening and closing the domes according to the weather data and the crowding situation in the mosque. Finally, the main goal of this paper has been achieved: the proposed model works efficiently and specifies the exact duration for keeping the domes open automatically for a few minutes at the start of each hour.

Author 1: Anas H. Blasi
Author 2: Mohammad Awis Al Lababede
Author 3: Mohammed A. Alsuwaiket

Keywords: Artificial intelligence; CNN; CSRnet; fuzzy logic; fuzzy control

PDF
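The fuzzy-control side of such a system can be sketched with triangular membership functions for "hot" temperature and "crowded" mosque, combined by a min (AND) rule into a dome-opening degree. The membership breakpoints below are invented for illustration, not the paper's calibrated values.

```python
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def open_degree(temp_c, crowd_ratio):
    """Fuzzy rule: IF hot AND crowded THEN open the dome (degree in [0, 1])."""
    hot = tri(temp_c, 25, 35, 45)           # degrees Celsius
    crowded = tri(crowd_ratio, 0.3, 0.7, 1.1)  # occupancy fraction
    return min(hot, crowded)

d_peak = open_degree(35, 0.7)   # both memberships at their peak -> 1.0
d_cool = open_degree(20, 0.1)   # cool and nearly empty -> 0.0
```

In the full system, the crowd ratio would come from CSRNet's crowd-counting output, and the resulting degree would be mapped to an opening duration.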

Paper 38: Correlating Crime and Social Media: Using Semantic Sentiment Analysis

Abstract: Crimes occur all over the world, and with regularly changing criminal strategies, law enforcement agencies need to manage them adequately and productively. If these agencies had prior data on a crime, or an early indication of eventual felonious activity, they would gain a strategic advantage: they could deploy their limited, elite assets at the location of a suspected crime, or better still investigate it to the point of anticipating it. Integration of social media content can act as a catalyst in bridging this gap, since almost the entire population uses social media, and their lives, thoughts, and mindsets are available digitally through their profiles. In this paper, an attempt has been made to predict crime patterns using geo-tagged tweets from five regions of India. We hypothesized that publicly available Twitter data may include features that portray a correlation between tweets and crime patterns, discoverable using data mining. We further applied semantic sentiment analysis, using a Bidirectional Long Short-Term Memory (BiLSTM) network and a feed-forward neural network, to the tweets to determine crime intensity across a region. The performance of our proposed approach is 84.74 for each class of sentiment. The results showed a correlation between the crime pattern predicted from tweets and actual reported crime incidents.

Author 1: Rhea Mahajan
Author 2: Vibhakar Mansotra

Keywords: Crimes; social media; Twitter; BiLSTM; semantic sentiment analysis

PDF

Paper 39: A Hybrid Model for Documents Representation

Abstract: Text representation is a critical issue for exploring the insights behind the text. Many models have been developed to represent the text in defined forms such as numeric vectors where it would be easy to calculate the similarity between the documents using the well-known distance measures. In this paper, we aim to build a model to represent text semantically either in one document or multiple documents using a combination of hierarchical Latent Dirichlet Allocation (hLDA), Word2vec, and Isolation Forest models. The proposed model aims to learn a vector for each document using the relationship between its words’ vectors and the hierarchy of topics generated using the hierarchical Latent Dirichlet Allocation model. Then, the isolation forest model is used to represent multiple documents in one representation as one profile to facilitate finding similar documents to the profile. The proposed text representation model outperforms the traditional text representation models when applied to represent scientific papers before performing content-based scientific papers recommendation for researchers.

Author 1: Dina Mohamed
Author 2: Ayman El-Kilany
Author 3: Hoda M. O. Mokhtar

Keywords: Document representation; latent dirichlet allocation; hierarchical latent dirichlet allocation; Word2vec; Isolation Forest

PDF

Paper 40: Predicting Internet Banking Effectiveness using Artificial Model

Abstract: This research aims at building a prediction model for the effectiveness of internet banking (IB) in Qatar. The proposed model employs a hybrid approach that combines regression and neural network models. This study is one of the few to evaluate the effectiveness of IB by adopting two data mining approaches, regression and neural networks. Regression analysis is used to optimize and minimize the input dataset metrics by excluding insignificant attributes. The study builds a dataset of 250 records of internet banking quality metrics, where each instance includes 8 metrics. Moreover, the study uses the RapidMiner application to build and validate the proposed prediction model. The results indicate that the proposed model predicts 88.5% of IB effectiveness, and that the input attributes influence customer satisfaction. The results also show that the prediction model correctly predicted 68% of the test dataset of 50 records using neural networks without regression optimization. However, after applying regression, the prediction accuracy of satisfaction improved by 12% (i.e., to 78%). Finally, it is recommended to test the proposed model on predictions for other online services such as e-commerce.

Author 1: Ala Aldeen Al-Janabi

Keywords: Artificial Neural Network (ANN); internet banking (IB); Artificial Intelligence (AI); e-banking effectiveness; regression model; rapidminer

PDF

Paper 41: Developing a Framework for Data Communication in a Wireless Network using Machine Learning Technique

Abstract: The emergence of the Internet of Things (IoT) has become a major innovation for harnessing the enormous power of wireless media. The adoption of smart devices with intelligent networking has greatly increased the traffic of the IoT environment. Present security mechanisms focus primarily on specific areas such as content filtering, monitoring techniques, and anomaly detection. A vulnerability reflects a weakness of a network that allows an attacker to probe the limits of its existing security mechanisms. Existing techniques focus on specific attacks rather than monitoring the whole network; however, there is a demand for a framework to govern and protect data and services in IoT networks. An anomaly detection framework is a resource-intensive mechanism for protecting the data and services of IoT / Wireless Sensor Networks (WSN). It supports the application layer of an IoT network and traces it frequently to find malicious activity. In this study, the researchers propose an anomaly detection framework to safeguard against wireless attacks. The proposed framework employs a machine learning technique to detect the traces of wireless attacks and supports IoT-based networks in monitoring the functionality of their resources. In addition, the paper discusses open challenges in IoT networks along with possible solutions. The researchers employed a test bed to evaluate the proposed framework. The outcome of the study shows that the proposed framework provides better services with more security.

Author 1: Somya Khidir Mohmmed Ataelmanan
Author 2: Mostafa Ahmed Hassan Ali

Keywords: Anomaly detection; internet of things; wireless attacks; artificial intelligence; machine learning

PDF

Paper 42: Floating Content: Experiences and Future Directions

Abstract: Floating content is a promising communication paradigm based on pure ad hoc communications. It has huge usage potential for various context-aware applications. In this paper, recent research related to the floating content communication paradigm is presented. The paper focuses on vital experiences ranging from analytical models to simulations to real-world implementations. Important results on the performance of floating content based on analytical models, simulations, and real-world implementations are presented. These results not only show the usefulness of the existing analytical models but also explain ways of extending them to incorporate new communication technologies and mobility models. The paper also highlights the energy consumption of smartphone applications based on floating content and explains how new communication technologies impact the feasibility of using floating content as a communication service for different applications. Based on these experiences, new future directions are highlighted that can prove to be very beneficial for researchers investigating this area.

Author 1: Shahzad Ali

Keywords: Floating content; opportunistic communications

PDF

Paper 43: A Collision-aware MAC Protocol for Efficient Performance in Wireless Sensor Networks

Abstract: Both the IEEE 802.11 and IEEE 802.15.4 standards adopt the CSMA-CA algorithm to manage contending nodes’ access to the wireless medium. CSMA-CA utilizes the Binary Exponential Backoff (BEB) scheme to reduce the probability of packet collisions over the communication channel. However, BEB suffers from unfairness and degraded channel utilization, as it usually favors the last node that succeeded in capturing the medium to send its packets. Also, BEB updates the size of the contention window in a deterministic fashion, without taking into consideration the level of collisions over the channel. The latter factor has a direct impact on channel utilization, and therefore incorporating it in the computation of the contention window’s size can have a positive impact on the overall performance of the backoff algorithm. In this paper, we propose a new adaptive backoff algorithm that overcomes the shortcomings of BEB and outperforms it in terms of channel utilization, power conservation, and reliability, while preserving fairness among nodes. We model our algorithm using a Markov chain and validate our system through extensive simulations. Our results show promising performance for an efficient backoff algorithm.
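For contrast with the adaptive scheme the paper proposes (whose exact update rule appears in the paper, not the abstract), the standard BEB update and one hypothetical collision-rate-aware variant can be sketched as:

```python
CW_MIN, CW_MAX = 8, 256  # contention window bounds (illustrative values)

def beb_update(cw, collided):
    """Standard Binary Exponential Backoff: double on collision, reset on success."""
    return min(cw * 2, CW_MAX) if collided else CW_MIN

def adaptive_update(cw, collision_rate):
    """Hypothetical collision-aware update: grow the window in proportion
    to the measured collision rate instead of a fixed doubling."""
    factor = 1 + collision_rate  # 1.0 (idle channel) .. 2.0 (every slot collides)
    return max(CW_MIN, min(int(cw * factor), CW_MAX))

cw = CW_MIN
for _ in range(5):           # five consecutive collisions under BEB
    cw = beb_update(cw, True)
print(cw)                    # 256 (capped at CW_MAX)
print(adaptive_update(64, 0.25))  # 80
```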

Author 1: Hamid Hajaje
Author 2: Mounib Khanafer
Author 3: Zine El Abidine Guennoun
Author 4: Junaid Israr
Author 5: Mouhcine Guennoun

Keywords: Wireless sensor networks; beacon-enabled IEEE 802.15.4; binary exponential backoff; adaptive backoff; fairness; power consumption; reliability; channel utilization

PDF

Paper 44: Distinctive Context Sensitive and Hellinger Convolutional Learning for Privacy Preserving of Big Healthcare Data

Abstract: The collection and usefulness of sensitive Big Data have grown with the development of Information Technology (IT). When using sensitive Big Data to acquire relevant information, it becomes indispensable to reduce irrelevant sensitive data in order to safeguard personal information in the healthcare sector. Many privacy-preserving strategies based on quasi-identifiers (QI) have been applied in recent years for applications such as health services. However, privacy preservation over quasi-identifiers is still challenging in the context of Big Data, because most datasets are of huge volume. Existing methods suffer from high time consumption and low data utility on dynamically growing datasets. In this paper, an efficient Distinctive Context Sensitive and Hellinger Convolutional Learning (DCS-HCL) method is introduced to ensure privacy preservation and achieve high data utility for big healthcare datasets. First, a Distinctive Impact Context Sensitive Hashing model is designed for the given input Big Dataset, where both the distinctive and impact values are identified and applied to Context Sensitive Hashing. With this, similar QI-classes are mapped to produce computationally efficient anonymized data. Second, a Hellinger Convolutional Neural Privacy Preservation model is presented to preserve the privacy of sensitive unstructured data. This is performed by hashing QI-class values and updating the weights and biases of a CNN to increase accuracy and reduce information loss. Evaluation results demonstrate that the proposed method significantly improves run time, data utility, information loss, and accuracy on large-volume unstructured datasets compared to existing methods.
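The Hellinger ingredient of DCS-HCL is the standard Hellinger distance between discrete probability distributions; a minimal stdlib sketch (the CNN integration is the paper's own and is not reproduced here):

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions.
    Ranges from 0 (identical) to 1 (disjoint support)."""
    return math.sqrt(sum((math.sqrt(pi) - math.sqrt(qi)) ** 2
                         for pi, qi in zip(p, q))) / math.sqrt(2)

print(hellinger([0.5, 0.5], [0.5, 0.5]))  # 0.0 (identical distributions)
print(hellinger([1.0, 0.0], [0.0, 1.0]))  # 1.0 (disjoint support)
```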

Author 1: Sujatha K
Author 2: Udayarani V

Keywords: Big data; information technology; distinctive; impact; context sensitive hashing; quasi-identifier; Hellinger; convolutional neural

PDF

Paper 45: Big Data Analytics Framework for Childhood Infectious Disease Surveillance and Response System using Modified MapReduce Algorithm

Abstract: Tanzania, like most East African countries, faces a great burden from the spread of preventable infectious childhood diseases. Diarrhea, acute respiratory infections (ARI), pneumonia, malnutrition, hepatitis, and measles are responsible for the majority of deaths among children aged 0-5 years. Infectious disease surveillance and response is the foundation of public healthcare practice, and it is increasingly being undertaken using information technology. Tanzania, however, due to challenges in information technology infrastructure and public health resources, still relies on paper-based disease surveillance. Thus, only traditional clinical patient data is used; nontraditional and pre-diagnostic infectious disease report case data are excluded. In this paper, the development of a Big Data Analytics Framework for a Childhood Infectious Disease Surveillance and Response System is presented. The framework was designed to guide healthcare professionals in tracking, monitoring, and analyzing infectious disease report cases from sources such as social media for the prevention and control of infectious diseases affecting children. The proposed framework was validated through use-case scenarios and performance-based comparison.
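The framework's modified MapReduce algorithm is detailed in the paper itself; the plain MapReduce pattern it builds on, here counting hypothetical disease report cases per disease, can be sketched with the standard library:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical report cases (disease, region) gathered by the surveillance system.
reports = [
    ("diarrhea", "Arusha"), ("measles", "Dodoma"),
    ("diarrhea", "Mwanza"), ("diarrhea", "Arusha"),
]

def map_phase(reports):
    """Emit a (key, 1) pair per report case, keyed by disease."""
    return [(disease, 1) for disease, _region in reports]

def reduce_phase(pairs):
    """Group pairs by key (the shuffle/sort step) and sum each group."""
    pairs = sorted(pairs)
    return {key: sum(v for _, v in group)
            for key, group in groupby(pairs, key=itemgetter(0))}

print(reduce_phase(map_phase(reports)))  # {'diarrhea': 3, 'measles': 1}
```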

Author 1: Mdoe Mwamnyange
Author 2: Edith Luhanga
Author 3: Sanket R. Thodge

Keywords: Big data analytics; childhood infectious diseases; infectious disease surveillance system; infectious disease report cases; framework; Hadoop; healthcare big data; map reduce

PDF

Paper 46: Recognizing Human Emotions from Eyes and Surrounding Features: A Deep Learning Approach

Abstract: The need for an efficient intelligent system to detect human emotions is imperative. In this study, we propose an automated convolutional neural network-based approach to recognize the human mental state from the eyes and their surrounding features. We applied deep convolutional neural networks from the Keras applications library with the help of transfer learning and fine-tuning. We worked with six universal emotions (happiness, disgust, sadness, fear, anger, and surprise) using a dataset containing 588 unique double-eye images. In this study, we considered the eyes and their surrounding areas (upper and lower eyelids, glabella, and brow) to detect the emotional state. The state and movement of the iris and pupil vary with different mental states, and the common features found within the eyes during different mental states can help capture human expression. The dataset was trained with pre-trained weights, and a confusion matrix was used to analyze the predictions and achieve better accuracy. The highest accuracy, 91.78%, was achieved by DenseNet-201, whereas VGG-16 and Inception-ResNet-v2 achieved 90.43% and 89.67%, respectively. This study provides an insight into the current state of research toward better facial emotion recognition.

Author 1: Md Nymur Rahman Shuvo
Author 2: Shamima Akter
Author 3: Md. Ashiqul Islam
Author 4: Shazid Hasan
Author 5: Muhammad Shamsojjaman
Author 6: Tania Khatun

Keywords: Human emotion recognition; convolutional neural network (CNN); transfer learning; fine-tuning; VGG-16; Inception-ResNet -V2; DenseNet-201

PDF

Paper 47: Novel Data Oriented Structure Learning Approach for the Diabetes Analysis

Abstract: Diabetes mellitus is considered a significant disease and an ever-rising epidemic; accordingly, it represents a worldwide public-health crisis. Several classification techniques have recently been employed for diabetes diagnosis; however, only a few studies have been dedicated to facilitating its analysis through knowledge representation using probabilistic modelling. The Bayesian Network, a probabilistic graphical model, is considered one of the most effective classification techniques. Bayesian Networks (BN) are widely employed in domains such as risk analysis, medicine, bioinformatics, and security. This probabilistic graphical model represents an effective formalism for reasoning under uncertainty. The construction of a BN model goes through two learning phases: structure learning and parameter learning. The first phase, learning the BN skeleton, is known to be NP-hard. Accordingly, several methods have been introduced, among which score-based algorithms are considered some of the most powerful structure learning methods. In this paper, we introduce a novel algorithm based on a combination of graph theory and information theory. The proposed algorithm, called GIT, detects parents and children for BN structure learning. We evaluate the obtained results against reference networks and demonstrate the efficiency of the proposed GIT algorithm in terms of accuracy. Furthermore, we apply our algorithm to a real domain, detecting the interesting dependencies that are useful for diabetes analysis.
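The information-theoretic ingredient typically used to detect dependencies between variables, mutual information estimated from paired samples, can be sketched as follows (the scoring rule and graph-theoretic machinery of GIT are the paper's own and are not reproduced):

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Estimate I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

# Perfectly dependent variables share 1 bit; independent ones share ~0 bits.
print(mutual_information([0, 1, 0, 1], [0, 1, 0, 1]))  # 1.0
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0
```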

Author 1: Adel THALJAOUI

Keywords: Classification; Bayesian Network; structure learning; score oriented approach; diabetes analysis

PDF

Paper 48: Optimal Routing based Load Balanced Congestion Control using MAODV in WANET Environment

Abstract: A Wireless Ad hoc Network (WANET) is a decentralized network in which nodes communicate without any central controller. Network congestion can occur on account of the nodes' restricted Bandwidth (BW) together with the dynamic topology. Congestion brings about data loss, as it causes Data Packets (DP) to be dropped in the network. Therefore, in order to lessen network congestion, it is necessary to design Congestion Control (CC) schemes for WANETs. Thus, this paper offers an optimal-routing-centered CC scheme utilizing the Modified Ad hoc On-demand Distance Vector (MAODV) Routing Protocol (RP) for WANETs. Here, the Source Node (SN) and Destination Node (DN) are first initialized, after which MAODV discovers multiple routing paths. Subsequently, a Stochastic Gradient Descent Deep Learning Neural Network (SGD-DLNN) identifies the Congestion Status (CS) of every node on the discovered paths. If congestion occurs, MAODV allocates the traffic over the optimal congestion-free routing path. The Levy Flight based Black Widow Optimization (LF-BWO) algorithm chooses the optimal routing path from the congestion-free paths. Based on path lifetime, residual energy, link cost, and path distance, this algorithm enhances Data Transmission (DT) performance by discovering a suitable path. Experimental outcomes are presented to exhibit the proposed RP's effectiveness.

Author 1: Kanthimathi S
Author 2: JhansiRani P

Keywords: Routing; congestion control; Wireless Ad Hoc Networks (WANET); Modified Ad hoc on-demand Distance Vector (MAODV); Levy Flight Based Black Widow Optimization (LF-BWO); Stochastic Gradient Descent Deep Learning Neural Network (SGD-DLNN)

PDF

Paper 49: Smart Digital Forensic Framework for Crime Analysis and Prediction using AutoML

Abstract: Over the past few years, the greater part of our information, such as books, recordings, pictures, and the clinical, forensic, criminal, and even hereditary data of people, has been moving into digital and cyber dataspaces. This shift requires sophisticated techniques to deal with vast amounts of data. We propose a novel solution to the problem of gaining actionable intelligence from the voluminous existing and potential digital forensic data. We have formulated an Automated Learning Framework ontology for digital forensic applications relating to collaborative crime analysis and prediction. The minimum viable ontology, formulated by studying the existing literature and applications of machine learning, has been used to devise an Automated Machine Learning (AutoML) implementation, studied quantitatively and qualitatively in its capability to aid the intelligence practices of digital forensic investigation agencies in representing, reasoning over, and forming actionable insights from vast and varied real-world data. A test implementation of the framework is made to assess the performance of our proposed generalized Smart Forensic Framework for digital forensics applications by comparison with existing solutions on quantitative and qualitative metrics. We will use the insights and performance metrics derived from our research to motivate forensic intelligence agencies to exploit the features and capabilities provided by AutoML Smart Forensic Framework applications.

Author 1: Sajith A Johnson
Author 2: S Ananthakumaran

Keywords: Forensic investigation; digital forensic; automated machine learning; smart forensic framework

PDF

Paper 50: Multi-level Protection (Mlp) Policy Implementation using Graph Database

Abstract: Retracted: After careful and considered review of the content of this paper by a duly constituted expert committee, this paper has been found to be in violation of IJACSA's Publication Principles. We hereby retract the content of this paper. Reasonable effort should be made to remove all past references to this paper.

Author 1: Lingala Thirupathi
Author 2: Venkata Nageswara Rao Padmanabhuni

Keywords: Database; graph; protection; multi-level

PDF

Paper 51: SGBBA: An Efficient Method for Prediction System in Machine Learning using Imbalance Dataset

Abstract: A real-world big dataset with a disproportionate class distribution is called an imbalanced dataset, and it badly impacts the predictive results of machine learning classification algorithms. Most datasets in machine learning face the class imbalance problem, while most machine learning algorithms work best with roughly equal sample counts for every class. A variety of solutions have been suggested by different researchers and applied to deal with imbalanced datasets, but the performance of these methods remains below a satisfactory level. It is very difficult to design an efficient method using machine learning algorithms without first balancing the imbalanced dataset. In this paper we have designed a method named SGBBA, an efficient method for prediction systems in machine learning using imbalanced datasets. The method addressed in this paper maximizes performance in terms of accuracy and the confusion matrix. The proposed method consists of two modules: designing the method and method-based prediction. Experiments with two benchmark datasets and one highly imbalanced credit card dataset are performed, and the performance is compared with that of the SMOTE resampling method. F-score, specificity, precision, and recall are used as the evaluation metrics to test the performance of the proposed method on any kind of imbalanced dataset. According to the comparison of the results, the proposed method attains more effective and robust performance than the existing methods.
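SMOTE, the resampling baseline the authors compare against, creates synthetic minority samples by interpolating between a minority point and a nearby minority neighbor; a simplified stdlib sketch with hypothetical 2-d points (real SMOTE samples among k nearest neighbors rather than only the single nearest):

```python
import math
import random

def smote_like(minority, n_new, seed=0):
    """Generate synthetic samples on the segments between minority points
    and their nearest minority neighbors (simplified SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a = rng.choice(minority)
        # nearest minority neighbor of a (excluding a itself)
        b = min((p for p in minority if p is not a),
                key=lambda p: math.dist(a, p))
        t = rng.random()  # position along the segment a -> b
        synthetic.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
    return synthetic

minority = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0)]
new = smote_like(minority, 4)
print(len(new))  # 4 synthetic minority samples
```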

Author 1: Saiful Islam
Author 2: Umme Sara
Author 3: Abu Kawsar
Author 4: Anichur Rahman
Author 5: Dipanjali Kundu
Author 6: Diganta Das Dipta
Author 7: A.N.M. Rezaul Karim
Author 8: Mahedi Hasan

Keywords: Imbalanced dataset; sub sample; accuracy; fraud; confusion matrix; bagging

PDF

Paper 52: An Improved Multi-label Classifier Chain Method for Automated Text Classification

Abstract: Automated text classification is the task of grouping documents (text) automatically into categories from a predefined set. The conventional approach to classification involves mapping a single class label to each data point (instance). In multi-label classification (MLC), the task is to develop models that can predict multiple class labels for a data instance. Several MLC methods exist, such as classifier chain (CC) and binary relevance (BR). However, these methods have drawbacks, such as the random label-sequence ordering issue. This study attempts to address this issue, which is peculiar to the classifier chain method. In this paper, a hybrid heuristic evolutionary-based technique is proposed. The proposed PSOGCC is a combination of particle swarm optimization (PSO) and a genetic algorithm (GA). Genetic operators from the GA are integrated into the basic PSO algorithm to find the global best solution, representing an optimized label-sequence order for the chain classifier. In the experiments, three MLC methods, BR, CC, and PSOGCC, are implemented using five benchmark multi-label datasets and five standard evaluation metrics. The proposed PSOGCC method improved the predictive performance of the chain classifier, obtaining the best results of 98.66% accuracy, 99.5% precision, 99.16% recall, 99.33% F1-score, and a Hamming loss of 0.0011.
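Hamming loss, one of the five evaluation metrics reported, counts the fraction of individual label slots predicted incorrectly across all instances; a minimal sketch with hypothetical label matrices:

```python
def hamming_loss(y_true, y_pred):
    """Fraction of label slots predicted incorrectly across all instances."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for t_row, p_row in zip(y_true, y_pred)
                for t, p in zip(t_row, p_row))
    return wrong / total

# Two instances, three labels each; one label slot is wrong out of six.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0]]
print(hamming_loss(y_true, y_pred))  # 1/6 of slots wrong
```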

Author 1: Adeleke Abdullahi
Author 2: Noor Azah Samsudin
Author 3: Shamsul Kamal Ahmad Khalid
Author 4: Zuhaila Ali Othman

Keywords: Text classification; multi-label classification; classifier chain; particle swarm optimization; genetic algorithm

PDF

Paper 53: Efficient Task Scheduling in Cloud Computing using Multi-objective Hybrid Ant Colony Optimization Algorithm for Energy Efficiency

Abstract: The efficiency of Internet services is determined by the cloud computing process. Cloud computing faces various challenges, such as security and inefficient resource allocation, which in turn result in wasted resources. Researchers have explored a number of approaches over the past decade to overcome these challenges. The main objective of this research is to explore task scheduling in cloud computing using multi-objective hybrid Ant Colony Optimization (ACO) with Bacterial Foraging (ACOBF) behavior. The ACOBF technique maximizes resource utilization (service provider profit) and reduces both makespan and user job-request wait times. ACOBF classifies user job requests into three classes based on the sensitivity of the protocol associated with each request, schedules the job requests within each class based on their deadlines, and creates a Virtual Machine (VM) cluster to minimize energy consumption. Based on comprehensive experimentation, the simulation results show that ACOBF outperforms the benchmarked techniques in terms of convergence, diversity of solutions, and stability.

Author 1: Fatima Umar Zambuk
Author 2: Abdulsalam Ya’u Gital
Author 3: Mohammed Jiya
Author 4: Nahuru Ado Sabon Gari
Author 5: Badamasi Ja’afaru
Author 6: Aliyu Muhammad

Keywords: Ant colony; scheduling; hybrid; foraging; cloud computing

PDF

Paper 54: Motor Insurance Claim Status Prediction using Machine Learning Techniques

Abstract: Claim handling is a basic problem in insurance companies. Insurers face a constant challenge from growing insurance claim losses, because claim fraud occurs and the volume of claim data in insurance companies keeps increasing. As a result, it is difficult to classify the insured's claim status during the claim review process. Therefore, the aim of this study was to build a machine learning model that classifies and predicts motor insurance claim status. To achieve this, missing value ratio, Z-score, encoding techniques, and entropy were used as dataset preparation techniques. The final preprocessed datasets were split into training and testing sets using K-fold cross-validation. Finally, the prediction model was built using Random Forest (RF) and multi-class Support Vector Machine (SVM) classifiers. The performance of the RF and multi-class SVM classifiers was evaluated using accuracy, precision, recall, and F-measure. The models predict motor insurance claim status with accuracies of 98.36% and 98.17% for the RF and SVM classifiers, respectively. As a result, the RF classifier is slightly better than the multi-class SVM. Developing and implementing a hybrid model that benefits from the advantages of different algorithms, with a graphical user interface for applying the solution to the insurance company's real-world problem, is a pressing direction for future work.
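K-fold cross-validation, used here to split the preprocessed data, partitions the instance indices into K folds that each serve once as the test set while the rest train the model; a stdlib sketch:

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for K-fold cross-validation."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

for train, test in k_fold_indices(10, 5):
    print(test)  # [0, 1], [2, 3], ...: each index lands in exactly one test fold
```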

Author 1: Endalew Alamir
Author 2: Teklu Urgessa
Author 3: Ashebir Hunegnaw
Author 4: Tiruveedula Gopikrishna

Keywords: Motor insurance claim; machine learning; classification; Random Forest (RF); Support Vector Machine (SVM); supervised learning

PDF

Paper 55: Detecting Malware based on Analyzing Abnormal behaviors of PE File

Abstract: Spreading malware is a dangerous form of attack that is very difficult to detect and prevent. Attack techniques that spread malware through users and then escalate privileges in the system are increasingly used by attackers. The three main methods for tracking and detecting malware currently being studied and applied are signature-based, behavior-based, and hybrid techniques. In particular, the behavior-based technique, with the support of machine learning algorithms, has proven highly effective. On the other hand, attackers often find various ways to hide the behaviors of malware within its Portable Executable file format (PE file), making it difficult for surveillance systems to detect the malware. For these reasons, in this paper, we propose a malware detection method based on PE file analysis using machine learning and deep learning algorithms. Our main contributions are a set of proposed features that represent abnormal malware behaviors based on the PE file, and an evaluation of the efficiency of several machine learning algorithms in the classification process.
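One family of static PE-file features comes directly from the PE headers (the paper's actual feature set is its own contribution); a minimal stdlib sketch that validates the DOS magic and PE signature and reads the machine type and section count from a hypothetical hand-built header (real feature extractors typically use the `pefile` library and many more fields):

```python
import struct

def pe_header_features(data: bytes):
    """Extract a few static features from raw PE-file bytes."""
    if data[:2] != b"MZ":
        raise ValueError("not a DOS/PE executable")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)  # offset of PE signature
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # COFF header follows the signature: Machine (2 bytes), NumberOfSections (2 bytes)
    machine, num_sections = struct.unpack_from("<HH", data, e_lfanew + 4)
    return {"machine": machine, "num_sections": num_sections}

# Hypothetical minimal header: MZ stub padded to 0x40, then a PE header.
stub = b"MZ" + b"\x00" * 0x3A + struct.pack("<I", 0x40)
pe = b"PE\x00\x00" + struct.pack("<HH", 0x14C, 3) + b"\x00" * 16
print(pe_header_features(stub + pe))  # {'machine': 332, 'num_sections': 3}
```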

Author 1: Lai Van Duong
Author 2: Cho Do Xuan

Keywords: Malware; portable executable file format; malware detection; abnormal behaviors; machine learning; deep learning

PDF

Paper 56: Efficient and Secure Group based Collusion Resistant Public Auditing Scheme for Cloud Storage

Abstract: Tremendous changes have been seen in the arena of cloud computing in recent years. Many organizations share their data or files on cloud servers to avoid infrastructure and maintenance costs. Employees from different departments create their own groups and share sensitive information among group members. Revoked users may try to access this information by colluding with an untrusted cloud server. Many researchers have specified revocation procedures using re-signature and proxy re-signature concepts to prevent collusion between the cloud server and a revoked user. But these techniques are costly in terms of communication overhead and verification cost when combined with auditing techniques that prove the integrity of data outsourced to the cloud server. To reduce this cost, a collusion-resistant public auditing scheme with group member revocation is proposed in this paper. In this scheme, the data owner regularly updates the list of currently valid members, which a third-party auditor uses to validate signatures so that collusion can be avoided. To verify the integrity of outsourced data, the proposed scheme uses a modern cryptographic technique, indistinguishability obfuscation, combined with a one-way function, which reduces the verification time significantly. Experimental results show that the proposed scheme decreases the communication overhead and verification cost compared to existing schemes.

Author 1: Smita Chaudhari
Author 2: Gandharba Swain

Keywords: Public auditing; collusion attack; ring signature; message authentication code; indistinguishability obfuscation; dynamic data

PDF

Paper 57: Fog Network Area Management Model for Managing Fog-cloud Resources in IoT Environment

Abstract: The Internet of Things (IoT) paradigm is at the forefront of present and future research activities. The enormous amount of sensing data needing to be processed increases dramatically in volume, variety, and velocity. In response, cloud computing has been used to handle the challenges of collecting, storing, and processing the data. Fog computing is a model that supports cloud computing by performing pre-processing tasks close to the end-user to achieve low latency, lower power consumption, and high scalability. However, some resources in a fog network are not suitable for certain tasks, or the number of requests grows beyond capacity; since other fog resources may be idle, it is more efficient to federate them than to forward the tasks to the cloud. This issue affects the fog environment's performance when dealing with large applications or applications sensitive to processing time. This research proposes a holistic fog-based resource management model that efficiently discovers all the available services placed on resources considering their capabilities, deploys jobs to appropriate resources in the network effectively, and improves the IoT environment's performance. Our proposed model consists of three main components: job scheduling, job placement, and mobile agent software, explained in detail in this paper.

Author 1: Anwar Alghamdi
Author 2: Ahmed Alzahrani
Author 3: Vijey Thayananthan

Keywords: Resource management; job scheduling; load balancing; mobile agent software; fog computing; Internet of Things (IoT)

PDF

Paper 58: Automata-based Algorithm for Multiple Word Matching

Abstract: In this paper, an automata-based algorithm that finds the valid shifts of a given set of words W in a text T is presented. Unlike known string matching algorithms, a preprocessing phase is applied to T and not to the words being searched for. In this phase, a deterministic finite state automaton (DFA) that recognizes the words in T is built and augmented with their shifts in T. The preprocessing phase is relatively expensive in terms of time and space; however, it needs to be done only once for any number of words to match in a given text document. The algorithm is analyzed for complexity, implemented, and compared with an adjusted version of the KMP algorithm. It showed better performance than the KMP algorithm for a large number of words to match in T.
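In the same spirit as the paper's preprocessing phase (the paper builds an augmented DFA; a plain dictionary index is used here as a stand-in that gives the same query behavior for whole-word lookup), T can be indexed once and then queried for any number of words:

```python
def build_shift_index(text):
    """Map each word of T to the list of character offsets where it starts."""
    index = {}
    pos = 0
    for word in text.split(" "):
        index.setdefault(word, []).append(pos)
        pos += len(word) + 1  # +1 for the separating space
    return index

def find_all(index, words):
    """Return the valid shifts of every queried word (empty list if absent)."""
    return {w: index.get(w, []) for w in words}

T = "the cat saw the dog"
idx = build_shift_index(T)  # one-time preprocessing of T
print(find_all(idx, ["the", "dog", "fox"]))
# {'the': [0, 12], 'dog': [16], 'fox': []}
```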

Author 1: Majed AbuSafiya

Keywords: Algorithms; finite state automata; word matching; KMP

PDF

Paper 59: Question Answering Systems: A Systematic Literature Review

Abstract: Question answering systems (QAS) are developed to answer questions presented in natural language by extracting the answer. The development of QAS is aimed at making the Web better suited to human use by eliminating the need to sift through many search results manually to determine the correct answer to a question. Accordingly, the aim of this study was to provide an overview of the current state of QAS research, to highlight the key limitations and gaps in the existing body of knowledge relating to QAS, and to identify the most effective methods utilized in the design of QAS. The systematic literature review was selected as the most appropriate methodology for studying the research topic; this method differs from the conventional literature review in being more comprehensive and objective. Based on the findings, QAS is a highly active area of research, with scholars taking diverse approaches in the development of their systems. The limitations observed in these studies include the narrowly focused nature of current QAS, weaknesses in the models used as building blocks for QAS, the need for standard datasets and question formats, which limits the applicability of QAS in practical settings, and the failure of researchers to evaluate their QAS solutions comprehensively. The most effective methods for designing QAS include focusing on syntax and context, utilizing word encoding and knowledge systems, leveraging deep learning, and using elements such as machine learning and artificial intelligence. Going forward, modular designs ought to be encouraged to foster collaboration in the creation of QAS.

Author 1: Sarah Saad Alanazi
Author 2: Nazar Elfadil
Author 3: Mutsam Jarajreh
Author 4: Saad Algarni

Keywords: Question answering systems; syntax; knowledge systems; deep learning; machine learning; systematic literature review; artificial intelligence

PDF

Paper 60: Comprehensive Analysis of Flow Incorporated Neural Network based Lightweight Video Compression Architecture

Abstract: The increasing volume of video content on the internet has motivated the exploration of novel approaches in the video compression domain. Although neural network based architectures have already emerged as the de facto standard in image compression and analytics, their application to video compression also yields promising results. Adaptive and efficient compression techniques are required for video transmission over varying bandwidth. Several deep learning based techniques and enhancements have been proposed and tested, but they did not exhibit fully optimal behavior and are not end-to-end trained and optimized. In the quest for a purely end-to-end trainable compression technique, a deep learning based video compression architecture is proposed comprising a frame autoencoder, a flow autoencoder, and a motion extension network for the reconstruction of predicted frames. The video compression network has been designed incrementally and trained with a random emission steps strategy. The proposed work yields significant improvement in visual perception quality, measured in SSIM and PSNR, compared with some state-of-the-art techniques, at the cost of increased frame reconstruction time.

Author 1: Sangeeta
Author 2: Preeti Gulia
Author 3: Nasib Singh Gill

Keywords: Deep learning; video compression; autoencoder; SSIM; PSNR

PDF

Paper 61: Empirical Study on Microsoft Malware Classification

Abstract: Malware is a computer program that causes harm to software. Cybercriminals use malware to gain access to sensitive information exchanged via the infected software. An important task in protecting a computer system from a malware attack is to identify whether given software is malware. Tech giants like Microsoft are engaged in developing anti-malware products. Microsoft's anti-malware products are installed on over 160M computers worldwide and examine over 700M computers monthly. This generates a huge number of data points that can be analyzed as potential malware. Microsoft has launched a challenge on the coding competition platform Kaggle.com to predict the probability that a computer running the Windows operating system gets affected by malware, given features of the machine. The dataset provided by Microsoft consists of 10,868 instances with 81 features, classified into nine classes. These features correspond to files of type asm (data with assembly language code) as well as binary format. In this work, we build a multi-class classification model to identify which class a malware belongs to. We use K-Nearest Neighbors, Logistic Regression, Random Forest, and XGBoost in a multi-class environment. As some of the features are categorical, we use one-hot encoding to make them suitable for the classifiers. The prediction performance is evaluated using log loss. We analyze the accuracy using only asm features, only binary features, and finally both. XGBoost provides a better log-loss value than the other classifiers: 0.078 when only asm features are considered, 0.048 when only binary features are used, and 0.03 when all features are used.
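Two of the steps the abstract mentions, one-hot encoding of categorical features and the multi-class log-loss metric, can be sketched in a few lines (toy category names and probabilities; not the competition's actual features):

```python
# Minimal sketch of one-hot encoding and multi-class log loss.
import math

def one_hot(values, categories):
    # each categorical value becomes a 0/1 indicator vector
    return [[1 if v == c else 0 for c in categories] for v in values]

def log_loss(y_true, probs, eps=1e-15):
    # y_true: class indices; probs: per-class predicted probabilities
    total = 0.0
    for y, p in zip(y_true, probs):
        total -= math.log(max(p[y], eps))
    return total / len(y_true)

print(one_hot(["asm", "bin"], ["asm", "bin"]))  # [[1, 0], [0, 1]]
print(round(log_loss([0, 1], [[0.9, 0.1], [0.2, 0.8]]), 4))  # 0.1643
```

Lower log loss rewards confident correct probabilities, which is why the abstract reports values like 0.03 as better than 0.078.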

Author 1: Rohit Chivukula
Author 2: Mohan Vamsi Sajja
Author 3: T. Jaya Lakshmi
Author 4: Muddana Harini

Keywords: Multi-class classification; malware detection; XGBoost

PDF

Paper 62: Smart Intersection Design for Traffic, Pedestrian and Emergency Transit Clearance using Fuzzy Inference System

Abstract: Traffic flow is regulated and controlled with the aid of traffic signals implemented at all major intersections in urban areas. With the increase in vehicles, traditional control strategies are incapable of clearing heavy traffic, which leads to long queues and prolonged waiting times at intersections. Smart cities are increasingly adopting solutions by developing smart traffic lights to improve the flow of vehicles. A major demand arises to increase the efficiency of traffic controllers with the objectives of minimizing traffic congestion, prioritizing emergency transit, and giving way to pedestrians to cross the lanes at an intersection. This requires leveraging existing techniques to identify the best solutions at the lowest possible cost. This paper proposes a Fuzzy Adaptive Control System (FACS) that uses fuzzy logic to decide the phase sequence and green time for each lane based on sensed input parameters. It is designed to improve traffic clearance at an isolated intersection, especially during peak traffic hours, to give precedence to emergency vehicles as soon as they are detected, and to assist pedestrian passage, thus reducing waiting time at the intersection. The performance of the proposed FACS is evaluated through simulations and compared with a Pre-Timed Control System (PTCS) and a Traffic Density-based Control System (TDCS) at a busy intersection with lanes leading to offices, schools, and hospitals. Simulation results show significant improvement over PTCS and TDCS in terms of traffic clearance, immediate handling of emergency vehicles, and preference for pedestrian passage at the intersection.
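The core fuzzy-inference step can be sketched as follows. This is a minimal Mamdani-style example with triangular memberships and three invented rule outputs, not the paper's actual rule base or input set:

```python
# Sketch: map sensed traffic density in [0, 1] to a green time (seconds)
# via triangular fuzzy memberships and weighted-average defuzzification.

def tri(x, a, b, c):
    # triangular membership function peaking at b
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def green_time(density):
    low = tri(density, -0.5, 0.0, 0.5)
    med = tri(density, 0.0, 0.5, 1.0)
    high = tri(density, 0.5, 1.0, 1.5)
    # assumed rule outputs: low -> 10 s, medium -> 30 s, high -> 60 s
    num = low * 10 + med * 30 + high * 60
    den = low + med + high
    return num / den

print(green_time(0.2), green_time(0.9))  # 18.0 54.0
```

A fuller controller would fuzzify several inputs (queue length, pedestrian count, emergency-vehicle presence) and combine rules the same way.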

Author 1: Aditi Agrawal
Author 2: Rajeev Paulus

Keywords: Adaptive traffic light control; smart intersection; fuzzy logic; emergency vehicle; pedestrian crossing

PDF

Paper 63: Computer Research Project Management

Abstract: Most research project managers, laboratory directors, young researchers at the beginning of a thesis, and leaders of professional research projects are effective at dealing with planned, scheduled events: they know how to conduct their research projects according to the traditional knowledge areas of classical processes tied to time, cost, human resources, risk, stakeholder, and quality management. Unfortunately, they may have little specific training in selecting the best research theme. Indeed, they have little experience in identifying adequate research problems. Despite their motivation for the selected project and research theme, they do not fully master the research problematic and how to deal with: the literature for the selected research theme (sources, documents, reports, and technical folders); the list of problems encountered while conducting the research theme and how to profit from the solution approaches obtained for these kinds of research problems; and how to decide whether this research theme and the list of connected problems have already been resolved by another research team. This paper aims to develop this idea and to propose an ontology named "Onto-Research-Project" that formalizes the domain knowledge of computer research projects. Our final goal is to propose an approach for reusing historical research projects. The output of this approach is a computer research project memory. To this end, we make use of and restructure the knowledge obtained from the computer research projects stored in the database “HAL-Archives-Nouvelles”.

Author 1: Lassad Mejri
Author 2: Henda Hajjami Ben Ghezala
Author 3: Raja Hanafi

Keywords: Research projects; computer research project ontology; knowledge management; project memory

PDF

Paper 64: Clustering of Association Rules for Big Datasets using Hadoop MapReduce

Abstract: Mining association rules is essential in the discovery of knowledge hidden in datasets. There are many efficient association rule mining algorithms. However, they may generate a large number of rules when applied to big datasets. A large number of rules makes knowledge discovery a daunting task because too many rules are difficult to understand, interpret, or visualize. To reduce the number of discovered rules, researchers have proposed approaches such as rule pruning, summarizing, and clustering. For the flourishing fields of big data and the Internet of Things (IoT), more effective solutions are crucial to cope with the rapid evolution of data. In this paper, we propose a novel parallel association rule clustering approach based on Hadoop MapReduce. We ran many experiments to study the performance of the proposed approach, and promising results have been demonstrated; e.g., the lowest scaleup was 77%.
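The map/reduce decomposition can be illustrated with a toy pipeline. The keying scheme here (grouping rules by their consequent item) is an assumption for illustration, not the paper's clustering criterion:

```python
# Toy MapReduce-style grouping of association rules: the mapper emits
# (key, rule) pairs and the reducer collects rules sharing a key.
from collections import defaultdict

def map_phase(rules):
    # emit (consequent, rule) pairs
    for antecedent, consequent in rules:
        yield consequent, (antecedent, consequent)

def reduce_phase(pairs):
    clusters = defaultdict(list)
    for key, rule in pairs:
        clusters[key].append(rule)
    return dict(clusters)

rules = [("bread", "milk"), ("butter", "milk"), ("beer", "chips")]
clusters = reduce_phase(map_phase(rules))
print(sorted((k, len(v)) for k, v in clusters.items()))
# [('chips', 1), ('milk', 2)]
```

On Hadoop, the same mapper/reducer contract runs in parallel across splits of the rule set, which is what makes the approach scale.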

Author 1: Salahadin A. Moahmmed
Author 2: Mohamed A. Alasow
Author 3: El-Sayed M. El-Alfy

Keywords: Internet of Things; big data mining; clustering; association rules; Hadoop

PDF

Paper 65: Recent Advancement in Speech Recognition for Bangla: A Survey

Abstract: This paper presents a brief study of remarkable works done for the development of Automatic Speech Recognition (ASR) systems for the Bangla language. It discusses the available speech corpora for this language and reports major contributions made in this research paradigm in the last decade. Some important design issues in developing a speech recognizer are the level of recognition, vocabulary size, speaker dependency, and classification approach; these are defined in this paper in order of the complexity of speech recognition. It also highlights some challenges that are very important to resolve in this exciting research field. Different studies on Bangla speech recognition carried out in the last decade are briefly reviewed in chronological order. It was found that the selection of the classification model and the training dataset play important roles in speech recognition.

Author 1: Sadia Sultana
Author 2: M. Shahidur Rahman
Author 3: M. Zafar Iqbal

Keywords: Bangla ASR; Bangla speech corpora; speaker dependency; vocabulary size; classification approaches; challenges

PDF

Paper 66: Intrusion Detection using Deep Learning Long Short-term Memory with Wrapper Feature Selection Method

Abstract: Recently, many companies have moved to cloud computing systems to enhance their performance and productivity. Using these cloud computing systems allows the execution of applications, data, and infrastructures on cloud platforms (i.e., online), which increases the number of attacks on such systems. As a result, building robust Intrusion Detection Systems (IDS) is needed. The main goal of an IDS is to detect normal and abnormal network traffic. In this paper, we propose a hybrid approach between an Enhanced Binary Genetic Algorithm (EBGA) as a wrapper feature selection (FS) algorithm and Long Short-Term Memory (LSTM). A novel injection method to prevent premature convergence of the GA is proposed in this paper. An intelligent k-means algorithm is employed to examine the solution distribution in the search space. Once 80% of the solutions belong to one cluster, an injection method (i.e., adding new solutions) is used to redistribute the solutions over the search space. EBGA reduces the search space as a preprocessing step, while LSTM works as a binary classification method. UNSW-NB15, a real-world public dataset, is used in this work to evaluate the proposed system. The obtained results show the ability of the feature selection method to enhance the overall performance of LSTM.
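The injection idea can be sketched on a population of binary feature masks. This is a simplified stand-in: the paper detects convergence with k-means clustering, whereas here concentration is approximated by Hamming distance to the population's majority bitstring:

```python
# Simplified sketch of diversity injection for a binary GA population.
import random

def majority(pop):
    # per-bit majority vote over the population
    n = len(pop[0])
    return [1 if sum(s[i] for s in pop) * 2 >= len(pop) else 0 for i in range(n)]

def inject_if_converged(pop, radius=1, threshold=0.8, rng=random):
    mode = majority(pop)
    near = sum(1 for s in pop
               if sum(a != b for a, b in zip(s, mode)) <= radius)
    if near / len(pop) >= threshold:   # premature convergence detected
        k = len(pop) // 2              # replace half with fresh random masks
        for i in range(k):
            pop[i] = [rng.randint(0, 1) for _ in mode]
    return pop

# a diverse population passes through unchanged
diverse = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1]] * 5
print(inject_if_converged([row[:] for row in diverse]) == diverse)  # True
```

In the full EBGA, each mask selects a feature subset whose fitness is the downstream classifier's performance; injection only fires when the 80% concentration test trips.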

Author 1: Sana Al Azwari
Author 2: Hamza Turabieh

Keywords: Intrusion detection; feature selection; long short-term memory; binary genetic algorithm

PDF

Paper 67: Evaluation of Collaborative Filtering for Recommender Systems

Abstract: Recently, due to the increasing amount of data on the Internet along with the increase in product purchases via e-commerce websites, Recommender Systems (RS) play an important role in guiding customers to products they may prefer. Furthermore, these systems help companies advertise their products to the most likely customers and therefore raise their revenues. Collaborative Filtering (CF) is the most popular RS approach. It is classified into memory-based and model-based filtering. Memory-based filtering is in turn classified into user-based and item-based. Several algorithms have been proposed for CF. In this paper, a comparison has been performed between different CF algorithms to assess their performance. Specifically, we evaluated the K-Nearest Neighbor (KNN), Slope One, co-clustering, and Non-negative Matrix Factorization (NMF) algorithms. The KNN algorithm is representative of the memory-based CF approach (both user-based and item-based). The other three algorithms fall under the model-based CF approach. In our experiments, we used the popular MovieLens dataset and six evaluation metrics. Our results reveal that the KNN algorithm for item-based CF outperformed all other algorithms examined in this paper.
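The winning item-based approach can be sketched in a few lines: cosine similarity between item rating vectors, then a similarity-weighted average to predict a missing rating. The three-item, three-user matrix is invented toy data, not MovieLens:

```python
# Item-based CF sketch: cosine similarity between item rating vectors,
# then predict an unrated cell as a similarity-weighted average.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# rows: items, columns: users (0 = unrated)
ratings = {
    "item_a": [0, 3, 4],
    "item_b": [4, 3, 5],
    "item_c": [1, 5, 2],
}

def predict(target_item, user, ratings):
    num = den = 0.0
    for item, row in ratings.items():
        if item == target_item or row[user] == 0:
            continue
        s = cosine(ratings[target_item], row)
        num += s * row[user]
        den += abs(s)
    return num / den if den else 0.0

print(round(predict("item_a", 0, ratings), 2))  # 2.48
```

A KNN variant would keep only the k most similar items in the weighted sum rather than all of them.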

Author 1: Maryam Al-Ghamdi
Author 2: Hanan Elazhary
Author 3: Aalaa Mojahed

Keywords: Co-clustering; collaborative filtering; KNN; NMF; recommender systems; slope one

PDF

Paper 68: Deployment and Migration of Virtualized Services with Joint Optimization of Backhaul Bandwidth and Load Balancing in Mobile Edge-Cloud Environments

Abstract: Mobile edge-cloud computing environments appear as a novel computing paradigm to offer effective processing and storage solutions for delay-sensitive applications. Besides, container-based virtualization technology has become sought after due to its natural lightweight and portability as well as its small migration overhead, which enables seamless service migration and load balancing. However, with mobility, the users’ demands in terms of backhaul bandwidth are a critical parameter that influences the delay constraints of the running applications. Accordingly, a Binary Integer Programming (BIP) optimization problem is formulated. It minimizes the users’ perceived backhaul delays and enhances the load-balancing degree in order to offer more chances to accept new requests across the network. Also, by introducing bandwidth constraints, the user backhaul bandwidth available after placement is enhanced. Then, the methodology adopted to design two heuristic algorithms based on Ant Colony System (ACS) and Simulated Annealing (SA) is presented. The proposed schemes are compared using different metrics, and the benefits of the ACS-based solution over the SA-based as well as a Genetic Algorithm (GA) based solution are demonstrated. Indeed, the ACS algorithm yields better normalized and total backhaul costs than the other solutions.
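The SA baseline's shape can be sketched on a placement-style toy problem. The cost function (load imbalance across three edge nodes) is an invented stand-in for the paper's BIP objective:

```python
# Generic simulated-annealing sketch for an assignment problem.
import math, random

def anneal(cost, state, neighbor, rng, t0=1.0, cooling=0.95, steps=200):
    best = cur = state
    t = t0
    for _ in range(steps):
        cand = neighbor(cur, rng)
        delta = cost(cand) - cost(cur)
        # always accept improvements; accept worse moves with prob e^(-d/t)
        if delta < 0 or rng.random() < math.exp(-delta / t):
            cur = cand
            if cost(cur) < cost(best):
                best = cur
        t *= cooling
    return best

# toy: place 6 services on 3 edge nodes, minimizing load imbalance
def cost(assign):
    loads = [assign.count(n) for n in range(3)]
    return max(loads) - min(loads)

def neighbor(assign, rng):
    a = list(assign)
    a[rng.randrange(len(a))] = rng.randrange(3)
    return tuple(a)

rng = random.Random(0)
start = (0, 0, 0, 0, 0, 0)           # all services on node 0: cost 6
best = anneal(cost, start, neighbor, rng)
print(cost(best) <= cost(start))     # True: never worse than the start
```

An ACS-based solver replaces the single-solution walk with a colony depositing pheromone on good service-to-node assignments, which is what the paper finds more effective.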

Author 1: Tarik Chanyour
Author 2: Mohammed Oucamah Cherkaoui Malki

Keywords: Mobile edge-cloud computing; delay-sensitive services; container migration; container deployment; backhaul bandwidth; load balancing

PDF

Paper 69: Zero-resource Multi-dialectal Arabic Natural Language Understanding

Abstract: A reasonable amount of annotated data is required for fine-tuning pre-trained language models (PLM) on downstream tasks. However, obtaining labeled examples for different language varieties can be costly. In this paper, we investigate the zero-shot performance on Dialectal Arabic (DA) when fine-tuning a PLM on modern standard Arabic (MSA) data only, identifying a significant performance drop when evaluating such models on DA. To remedy this drop, we propose self-training with unlabeled DA data and apply it in the context of named entity recognition (NER), part-of-speech (POS) tagging, and sarcasm detection (SRD) on several DA varieties. Our results demonstrate the effectiveness of self-training with unlabeled DA data: improving zero-shot MSA-to-DA transfer by as much as ~10% F₁ (NER), 2% accuracy (POS tagging), and 4.5% F₁ (SRD). We conduct an ablation experiment and show that the performance boost observed directly results from the unlabeled DA examples used for self-training. Our work opens up opportunities for leveraging the relatively abundant labeled MSA datasets to develop DA models for zero- and low-resource dialects. We also report new state-of-the-art performance on all three tasks and open-source our fine-tuned models for the research community.
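The self-training loop can be sketched with a trivial nearest-centroid classifier standing in for the fine-tuned PLM. The 1-D features, margin-based confidence rule, and toy labels are all assumptions for illustration:

```python
# Self-training sketch: train on labeled data, pseudo-label confident
# unlabeled points, retrain, and repeat for a few rounds.

def fit(labeled):
    # labeled: list of (x, y); return per-class centroid
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(centroids, x):
    return min(centroids, key=lambda y: abs(x - centroids[y]))

def self_train(labeled, unlabeled, rounds=3, margin=1.0):
    data = list(labeled)
    for _ in range(rounds):
        cents = fit(data)
        keep = []
        for x in unlabeled:
            dists = sorted(abs(x - c) for c in cents.values())
            if len(dists) > 1 and dists[1] - dists[0] >= margin:
                data.append((x, predict(cents, x)))  # confident pseudo-label
            else:
                keep.append(x)                       # too ambiguous, defer
        unlabeled = keep
    return fit(data)

cents = self_train([(0.0, "neg"), (10.0, "pos")], [1.0, 2.0, 8.5, 9.0, 5.1])
print(predict(cents, 1.5))  # neg
```

In the paper, the "centroids" are a PLM fine-tuned on MSA, the unlabeled pool is DA text, and the confidence filter is applied to the model's predicted probabilities.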

Author 1: Muhammad Khalifa
Author 2: Hesham Hassan
Author 3: Aly Fahmy

Keywords: Natural language processing; natural language understanding; low-resource learning; semi-supervised learning; named entity recognition; part-of-speech tagging; sarcasm detection; pre-trained language models

PDF

Paper 70: Distributed Mining of High Utility Sequential Patterns with Negative Item Values

Abstract: Sequential pattern mining has been widely used to solve various business problems, including frequent user click patterns, customer purchase analysis, gene microarray data analysis, etc. Many studies are ongoing on these pattern mining tasks to extract insightful data. Most studies have concentrated on high utility sequential pattern (HUSP) mining with positive values and without a distributed approach. All the existing solutions are centralized, which incurs greater computation and communication costs. In this paper, we introduce a novel algorithm for mining HUSPs including negative item values using a distributed approach. We use Hadoop MapReduce algorithms to process the data in parallel. Various pruning techniques have been proposed to minimize the search space in a distributed environment, thus reducing the expense of processing. To our knowledge, no algorithm has been proposed to mine high utility sequential patterns with negative item values in a distributed environment. So, we design a novel algorithm called DHUSP-N (Distributed High Utility Sequential Pattern mining with Negative values). DHUSP-N can mine high utility sequential patterns considering negative item utilities from big data.
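The basic utility computation with negative unit utilities can be shown on toy data (invented items and utilities; the paper's pruning and distribution are not sketched here):

```python
# Utility of a sequence instance when items may carry negative unit
# utilities (e.g. promotional items sold at a loss).
unit_utility = {"tv": 50, "cable": -5, "mount": 10}

def sequence_utility(sequence):
    # sequence: list of itemsets, each a list of (item, quantity) pairs
    return sum(q * unit_utility[item]
               for itemset in sequence
               for item, q in itemset)

s = [[("tv", 1)], [("cable", 2), ("mount", 1)]]
print(sequence_utility(s))  # 50 + 2*(-5) + 10 = 50
```

Negative utilities are what break the classic downward-closure pruning: extending a sequence can raise its utility, which is why dedicated pruning techniques are needed.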

Author 1: Manoj Varma
Author 2: Saleti Sumalatha
Author 3: Akhileshwar Reddy

Keywords: High utility sequential pattern mining; big data; utility mining; negative utility; distributed algorithms

PDF

Paper 71: Deep Neural Network-based Relationship Identification Framework to Discriminate Fake Profile Over Social Media

Abstract: The involvement of social media in personal, business, and political propaganda activities has attracted anti-social activities, which have also increased. Anti-social elements get a wider platform to spread negativity while hiding their identity behind fake and false profiles. In this paper, an analytical and methodological user identification framework is developed that binds implicit and explicit link relationships over the end-users' graphical perspective to identify malicious users, their communal information, and sockpuppet nodes. In addition, this work applies a deep neural network approach over the graphical and linguistic perspectives of end-users to classify them as malicious, fake, or genuine. This concept also helps identify the trade-off between the similarity of node attributes and the density of connections for classifying identical profiles as sockpuppets over social media.

Author 1: Suneet Joshi
Author 2: Deepak Singh Tomar

Keywords: Social media; anomaly detection; malicious activity; spam account; fake account; sockpuppet; deep neural network

PDF

Paper 72: A Parameter-free Clustering Algorithm based K-means

Abstract: Clustering is one of the relevant data mining tasks, which aims to process data sets in an effective way. This paper introduces a new clustering heuristic combining the E-transitive heuristic adapted to quantitative data with the k-means algorithm, with the goal of ensuring the optimal number of clusters and suitable initial cluster centres for k-means. The suggested heuristic, called PFK-means, is a parameter-free clustering algorithm since it does not require prior initialization of the number of clusters. Thus, it progressively generates the initial cluster centres until the appropriate number of clusters is automatically detected. Moreover, this paper presents a thorough comparison between the PFK-means heuristic, its diverse variants, the E-transitive heuristic for clustering quantitative data, and the traditional k-means in terms of the sum of squared errors and accuracy using different data sets. The experimental results reveal that, in general, the proposed heuristic and its variants provide the appropriate number of clusters for different real-world data sets and give good cluster quality relative to the traditional k-means. Furthermore, the experiments conducted on synthetic data sets report the performance of this heuristic in terms of processing time.
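The parameter-free idea can be sketched as follows. The distance-threshold rule for spawning new centres is an assumed simplification (the paper derives centres via its E-transitive heuristic), followed by one k-means refinement pass:

```python
# Sketch: grow initial centres until every point is near some centre,
# then refine with a single k-means assignment/update pass.

def dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def progressive_centres(points, threshold):
    centres = [points[0]]
    for p in points[1:]:
        if min(dist(p, c) for c in centres) > threshold:
            centres.append(p)           # p starts a new cluster
    return centres

def kmeans_pass(points, centres):
    groups = [[] for _ in centres]
    for p in points:
        i = min(range(len(centres)), key=lambda i: dist(p, centres[i]))
        groups[i].append(p)
    # recompute each centre as the mean of its assigned points
    return [tuple(sum(x) / len(g) for x in zip(*g)) for g in groups if g]

pts = [(0, 0), (0.5, 0), (10, 10), (10.5, 10), (20, 0)]
centres = kmeans_pass(pts, progressive_centres(pts, threshold=5.0))
print(len(centres))  # 3 clusters detected without specifying k
```

The number of clusters emerges from the data's geometry instead of being passed in, which is the sense in which the heuristic is parameter-free.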

Author 1: Said Slaoui
Author 2: Zineb Dafir

Keywords: Data mining; clustering; overlapping clustering; k-means; cluster centre initialization

PDF

Paper 73: Arabic Tweets Sentiment Analysis about Online Learning during COVID-19 in Saudi Arabia

Abstract: The COVID-19 pandemic can be considered the greatest challenge of our time and is defining and re-shaping many aspects of our life, such as learning and teaching, especially in the academic year 2020. While some people could adapt quickly to online learning, others consider it inefficient. The re-opening of schools and universities is currently under consideration. However, many experts in many countries suggested that at least one semester should be online during the pandemic. Understanding the public's emotional reaction to online learning has become significant. This paper studies the attitude of the people of Saudi Arabia towards online learning. We used a collection of Arabic tweets posted in 2020, collected mainly via hashtags that originated in Saudi Arabia. Our sentiment analysis has shown that people have maintained a neutral response to online learning. This study will allow scholars and decision makers to understand the emotional effects of online learning on communities.

Author 1: Asma Althagafi
Author 2: Ghofran Althobaiti
Author 3: Hosam Alhakami
Author 4: Tahani Alsubait

Keywords: Social media analytics; sentiment analysis; online learning; Arabic tweets

PDF

Paper 74: A Framework for Data Research in GIS Database using Meshing Techniques and the Map-Reduce Algorithm

Abstract: Everywhere, health centers, laboratories, hospitals, and pharmacies have faced many challenges in delivering quality health services due to constraints related to the limited availability of resources such as drugs, places, equipment, and specialists, often in health deficit with an increasing number of patients, for instance during the COVID-19 pandemic. Late information on these constraints from health service centers plays negatively on service quality because of the time delay between requesting a service on site and the response to deliver safe service. All these problems weaken prevention of and the fight against diseases in a region. This paper proposes a data research framework for a NoSQL database based on GIS data, containing an abstract table that could be inherited or specialized to any adopted GIS solution, leading to central data management instead of installing several database sites. The central database accepts data updated in the back office by the data owner and allows data research based on meshing techniques and the map-reduce algorithm in the front office. Various meshing techniques are presented for clustering GIS data, with associated definitions of the map-reduce content, in order to improve processing time. In an application to health services, the experimental results reveal that this system contributes to improving drug management in pharmacies and could also be used in other fields, such as finance, education, and shopping, through agencies spread over the territory, to strengthen national information systems and harmonise data.

Author 1: Abdoulaye SERE
Author 2: Jean Serge Dimitri OUATTARA
Author 3: Didier BASSOLE
Author 4: Jose Arthur OUEDRAOGO
Author 5: Moubaric KABORE

Keywords: Map-reduce; big data; digital health; classification; Geographic Information System (GIS); COVID-19; Spark; MongoDB; NewSQL; NoSQL

PDF

Paper 75: Pitch Contour Stylization by Marking Voice Intonation

Abstract: The stylization of the pitch contour is a primary task in speech prosody for the development of a linguistic model. The stylization of the pitch contour is performed either by statistical learning or by statistical analysis. Recent statistical learning models require a large amount of data for training and rely on complex machine learning algorithms, whereas statistical analysis methods perform stylization based on the shape of the contour and require further processing to capture the voice intonations of the speaker. The objective of this paper is to devise a low-complexity transcription algorithm for the stylization of the pitch contour based on the voice intonation of a speaker. For this, we propose the use of pitch marks as the subset of points for the stylization of the pitch contour. Pitch marks are the instants of glottal closure in a speech waveform that capture characteristics of the speech uttered by a speaker. The selected subset can interpolate the shape of the pitch contour and acts as a template to capture the intonation of a speaker's voice, which can be used for designing applications in speech synthesis and speech morphing. The algorithm balances the quality of the stylized curve against its cost in terms of the number of data points used. We evaluate the performance of the proposed algorithm using the mean square error and the number of lines used for fitting the pitch contour. Furthermore, we perform a comparison with other existing stylization algorithms using the LibriSpeech ASR corpus.
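The quality/cost trade-off the abstract describes can be sketched with straight-line stylization over a kept subset of contour points (toy F0 values; the kept indices stand in for pitch marks):

```python
# Stylization sketch: keep a subset of contour points, reconstruct the
# contour by piecewise-linear interpolation, and score the fit with MSE.

def interp(xs, ys, x):
    # piecewise-linear interpolation through the kept (xs, ys) points
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    return ys[-1]

def mse(contour, kept):
    xs = sorted(kept)
    ys = [contour[x] for x in xs]
    n = len(contour)
    return sum((contour[x] - interp(xs, ys, x)) ** 2 for x in range(n)) / n

contour = [100, 110, 120, 130, 120, 110, 100]   # toy F0 values (Hz)
print(round(mse(contour, kept=[0, 3, 6]), 10))  # 0.0: two lines fit exactly
print(mse(contour, kept=[0, 6]))                # worse with fewer points
```

Choosing which points to keep is exactly the quality-versus-cost balance the paper addresses: more kept points lower the MSE but raise the line count.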

Author 1: Sakshi Pandey
Author 2: Amit Banerjee
Author 3: Subramaniam Khedika

Keywords: Pitch contour; pitch marking; linear stylization; straight-line approximation

PDF

Paper 76: A Multi-purpose Data Pre-processing Framework using Machine Learning for Enterprise Data Models

Abstract: Growth in the data processing industry has automated decision making in various domains such as engineering, education, and many fields of research. This growth has also increased the dependency of data-driven business decisions on enterprise-scale data models. The accuracy of such decisions depends solely on the correctness of the data. In the recent past, a good number of data cleaning methods have been proposed by various research attempts. Nonetheless, most of these outcomes are criticized for being either too general or too specific. Thus, a multi-purpose, yet domain-specific, framework for enterprise-scale data pre-processing is in demand. Hence, this work proposes a novel data cleaning framework comprising missing value identification using the standard domain length with significantly reduced time complexity, domain-specific outlier identification using a customizable rule engine, detailed generic outlier reduction using double differential clustering, and finally dimensionality reduction using change percentage dependency mapping. The outcome of this framework is significantly impressive, as the outlier and missing-value treatment demonstrates nearly 99% accuracy on a benchmark dataset.

Author 1: Venkata Ramana B
Author 2: Narsimha G

Keywords: Standard domain length; domain specific rule engine; double differential clustering; change percentage; dependency map

PDF

Paper 77: Deep Attention on Measurable and Behavioral-driven Complete Service Composition Design Process

Abstract: Web service technology has proved its effectiveness in the digital revolution we are facing. Unfortunately, this success raises increasingly complex obstacles, particularly related to service composition. One of them is the integration of Non-Functional Requirements (NFRs) in each step of the service composition process, from the abstract service composition specification to the generation of verified and concrete composed services. Furthermore, this complexity grows when NFRs are addressed in both quantifiable (i.e., Quality of Service) and behavioral aspects. Despite the relevant contributions in the literature, this challenge remains an open issue with regard to modeling NFRs, publishing them, integrating them with each other, and handling conflicts and dependencies throughout the composition's lifecycle. Consequently, this contribution proposes an approach showing how to efficiently weave required NFRs with functional requirements in a complete composition lifecycle supporting the specification, formalization, model-checking verification, and integration steps of the desired concrete composite service. Patient Health Records in Regional and University Health Centers in Morocco are used as a case study to test our approach.

Author 1: Ilyass El Kassmi
Author 2: Radia Belkeziz
Author 3: Zahi Jarir

Keywords: Non-Functional requirements composition; behavioral non-functional requirements; quantifiable non-functional requirements; model checking; web service composition formalization

PDF

Paper 78: Concatenative Speech Recognition using Morphemes

Abstract: This paper adopts a novel sub-lexical approach to construct viable continuous speech recognition systems with scalable vocabulary that use the components of words to form the elements of pronunciation dictionaries and recognition lattices. The proposed Concatenative ASR family utilizes combination rules between morphemes (prefixes, stems, and suffixes), along with their theoretical grammatical categories. The constrained structure reduces invalid words by using grammar rules governing agglutination of affixes with stems, while having a large vocabulary space and hence fewer out-of-vocabulary words. In pursuing this approach, the project develops automatic speech recognition (ASR) parameterized models, designs parameter values, constructs and implements ASR systems, and analyzes the characteristics of these systems. The project designs parameter values in the context of Arabic to yield a subset hierarchy of vocabularies of the ASR systems facilitating meaningful analysis. It investigates the characteristics of the ASR systems with respect to vocabulary, recognition lattice, dictionary, and word error rate (WER). In the experiments, the standard Word ASR model has the best characteristics for vocabulary of up to five thousand words and the Concatenative ASR family is most appropriate for vocabulary of up to half a million words. The paper shows that the approach used encompasses fundamentally different processes of word formation and thus is applicable to languages that exhibit concatenative word-formation processes.

Author 1: Afshan Jafri

Keywords: Morphemes; sub-lexemes; speech recognition; Arabic; concatenative morphology

PDF

Paper 79: Multiclass Vehicle Classification Across Different Environments

Abstract: Vehicle detection and classification are necessary components in a variety of useful applications related to traffic, security, and autonomous driving systems. Many studies have focused on recognizing vehicles from a single perspective, such as the rear of other cars from the driving seat, but not from all possible perspectives, including the aerial view. In addition, they are usually given prior knowledge of a specific kind of vehicle, such as the fact that it is a car as opposed to a bus, before deducing other information about it. One popular classification technique is boosting, where weak classifiers are combined to form a strong classifier. However, most boosting applications treat complex classification problems as a combination of binary problems. This paper explores in detail the development of a multi-class classifier that recognizes vehicles of any type, from any view, without prior information, and without breaking the task into binary problems. Instead, a single multi-class application of the GentleBoost algorithm is used. This system is compared to a similar system built from a combination of separate classifiers that each classify a single vehicle type. The results show that the single multi-class classifier clearly outperforms the combination of separate classifiers, and prove that a simple boosting classifier is sufficient for vehicle recognition, given any type of vehicle from any viewing perspective, without the need to represent the problem as a complex 3D model.

Author 1: Aisha S. Azim
Author 2: Afshan Jafri
Author 3: Ashraf Alkhairy

Keywords: Vehicle detection; vehicle recognition; multiclass learning; boosting; GentleBoost

PDF
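The single multiclass boosted classifier contrasted with binary decomposition in the abstract above can be illustrated with a toy sketch. Two caveats: this uses SAMME-style multiclass AdaBoost over decision stumps as a stand-in (the paper's GentleBoost update differs), and the data are hypothetical 1-D points rather than vehicle images — the point is only that one boosted ensemble assigns one of several classes directly, with no one-vs-rest split:

```python
import math

def fit_stump(X, y, w, n_classes):
    """Exhaustively pick the feature/threshold stump with least weighted error.
    Each leaf predicts its weighted-majority class (ties -> lowest class)."""
    best = None
    for f in range(len(X[0])):
        values = sorted({x[f] for x in X})
        thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
        for t in thresholds:
            left, right = [0.0] * n_classes, [0.0] * n_classes
            for x, label, wi in zip(X, y, w):
                (left if x[f] <= t else right)[label] += wi
            lc = max(range(n_classes), key=lambda k: left[k])
            rc = max(range(n_classes), key=lambda k: right[k])
            err = sum(wi for x, label, wi in zip(X, y, w)
                      if (lc if x[f] <= t else rc) != label)
            if best is None or err < best[0]:
                best = (err, f, t, lc, rc)
    return best

def boost(X, y, n_classes, rounds=10):
    """Single multiclass boosting loop: reweight examples the current
    stump gets wrong, weight each stump by its SAMME alpha."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, f, t, lc, rc = fit_stump(X, y, w, n_classes)
        err = max(err, 1e-10)                       # guard against log(0)
        alpha = math.log((1 - err) / err) + math.log(n_classes - 1)
        ensemble.append((alpha, f, t, lc, rc))
        w = [wi * (math.exp(alpha) if (lc if x[f] <= t else rc) != label else 1.0)
             for x, label, wi in zip(X, y, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    """Accumulate alpha-weighted votes per class; argmax gives the label."""
    scores = {}
    for alpha, f, t, lc, rc in ensemble:
        k = lc if x[f] <= t else rc
        scores[k] = scores.get(k, 0.0) + alpha
    return max(scores, key=scores.get)
```

On a separable three-class toy set such as X = [[0.0], [1.0], [2.5], [3.0], [4.5], [5.0]] with y = [0, 0, 1, 1, 2, 2], a handful of boosting rounds recovers all three labels even though each individual stump can only split the line in two.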

Paper 80: Arabic Sign Language Recognition using Faster R-CNN

Abstract: Deafness does not restrict its negative effects to a person’s hearing; rather, it touches every aspect of daily life. Moreover, hearing people aggravate the issue through their reluctance to learn sign language. This results in a constant need for human translators to assist deaf persons, which represents a real obstacle to their social lives. Therefore, automatic sign language translation has emerged as an urgent need for the community. The availability and widespread use of mobile phones equipped with digital cameras have promoted the design of image-based Arabic Sign Language (ArSL) recognition systems. In this work, we introduce a new ArSL recognition system that is able to localize and recognize the alphabet of Arabic sign language using a Faster Region-based Convolutional Neural Network (Faster R-CNN). Specifically, the Faster R-CNN is designed to extract and map the image features and to learn the position of the hand in a given image. Additionally, the proposed approach alleviates two challenges: the choice of the relevant features used to encode the sign’s visual descriptors, and the segmentation task intended to determine the hand region. For the implementation and assessment of the proposed Faster R-CNN based sign recognition system, we exploited the VGG-16 and ResNet-18 models and collected a real ArSL image dataset. The proposed approach yielded 93% accuracy and confirmed the robustness of the proposed model against drastic background variations in the captured scenes.

Author 1: Rahaf Abdulaziz Alawwad
Author 2: Ouiem Bchir
Author 3: Mohamed Maher Ben Ismail

Keywords: Arabic sign language recognition; supervised learning; deep learning; faster region based convolutional neural network

PDF

Paper 81: Attack Resilient Trust and Signature-based Intrusion Detection Systems

Abstract: Wireless sensor networks have been widely applied in many areas due to their unique characteristics, which have also exposed them to different types of active and passive attacks. In the literature, several solutions have been proposed to mitigate these attacks; however, most are too complex to implement in wireless sensor networks given the resource constraints of sensor nodes. In this work, we propose a hierarchical trust mechanism based on a clustering approach to detect and prevent denial of service attacks in wireless sensor networks. The approach was validated through simulation using Network Simulator (NS2). The following metrics were used to evaluate the proposed scheme: packet delivery ratio, network lifetime, routing delay, overhead, and number of nodes. The proposed approach is capable of detecting compromised sensor nodes vulnerable to denial of service attacks. Moreover, it is able to detect all sensed data that have been compromised during transmission to the base station. The results show that our method can effectively detect and defend against denial of service attacks in wireless sensor networks.

Author 1: Boniface Kabaso
Author 2: Saber A. Aradeh
Author 3: Ademola P. Abidoye

Keywords: Wireless sensor network; routing attacks; public-key cryptography; packet dropping; denial of service attacks

PDF
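The hierarchical trust idea in the abstract above — cluster heads rating member nodes by observed forwarding behaviour and flagging low-trust nodes as possible denial-of-service participants — might be sketched as follows. The smoothing factor, threshold, and update rule here are illustrative assumptions, not the paper's actual formulas:

```python
class ClusterHead:
    """Toy cluster head that tracks a trust score in [0, 1] per member node."""

    def __init__(self, threshold=0.5):
        self.trust = {}            # node id -> current trust score
        self.threshold = threshold # below this, a node is flagged

    def observe(self, node, forwarded, dropped):
        """Blend the node's forwarding ratio for one round into its trust
        score via an exponential moving average (weights are assumptions)."""
        t = self.trust.get(node, 1.0)   # new nodes start fully trusted
        total = forwarded + dropped
        if total:
            ratio = forwarded / total
            t = 0.8 * t + 0.2 * ratio
        self.trust[node] = t

    def compromised(self):
        """Nodes whose trust fell below the threshold (possible DoS nodes)."""
        return {n for n, t in self.trust.items() if t < self.threshold}
```

A node that silently drops every packet decays toward zero trust (0.8 per round under these weights) and is flagged after a few rounds, while a well-behaved node keeps full trust; the cluster head can then exclude flagged nodes from routing.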

Paper 82: Factors Influencing the Use of Wireless Sensor Networks in the Irrigation Field

Abstract: Battlefield control, natural disaster discovery, water monitoring, smart homes, agricultural applications, health care, weather forecasting, smart buildings, intrusion detection, medical devices, and more are application areas of wireless sensor networks (WSNs). WSNs can help bring about revolutionary changes in important areas of our world. As a result, this technology has become particularly interesting, as its distinctive characteristics allow it to meet the specific requirements of a given application. In this context, WSNs are a promising approach in the agricultural sector, and in irrigation in particular, for overcoming major world problems such as the global water crisis. When implementing a WSN in the irrigation field, many factors, such as limited sensor node resources, limited sensor node power, cost, hardware constraints, and the type of deployment environment, must be taken into account in order to improve WSN performance and achieve the desired results. In this paper, we study and analyze the main factors that affect WSNs in the irrigation field. We also provide a set of measures and solutions that can be taken to overcome the challenges of deploying a WSN for irrigation. In this regard, we further highlight several factors for improvement to achieve an efficient and consistent irrigation system using a WSN.

Author 1: Loubna HAMAMI
Author 2: Bouchaib NASSEREDDINE

Keywords: Cost; energy consumption and management; irrigation; smart irrigation; wireless sensor network; WSN deployment

PDF

Paper 83: Deep Learning Algorithm for Classification of Cerebral Palsy from Functional Magnetic Resonance Imaging (fMRI)

Abstract: Cerebral palsy is a neurological disorder that may be caused by prenatal, perinatal, or postnatal factors and results in impaired motor function in children, in addition to affecting their mental well-being. Depending on the location of the brain injury and its effect on muscle tone, cerebral palsy is classified into subgroups such as spastic and non-spastic. Each type of palsy varies in its symptoms, and hence therapy planning and rehabilitation are decided according to the factors involved in each type. This calls for a suitable technique to classify the type of palsy at an early stage so that therapy can be planned effectively. Functional MRI of the neonatal brain helps in imaging and classifying cerebral palsy. Deep neural networks, a subset of machine learning, are widely used in image classification applications. In this work, the technique is applied to functional magnetic resonance brain images of infants to classify the type of cerebral palsy, using a deep convolutional network with a modified AlexNet architecture; the result further helps the physician plan rehabilitation to improve the lifestyle of the affected children.

Author 1: Pradeepa Palraj
Author 2: Gopinath Siddan

Keywords: Cerebral palsy; deep neural network; functional magnetic resonance image

PDF


© The Science and Information (SAI) Organization Limited. All rights reserved. Registered in England and Wales. Company Number 8933205. thesai.org