IJACSA Volume 14 Issue 3

Copyright Statement: This is an open access publication licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.

View Full Issue

Paper 1: An Ontology-driven DBpedia Quality Enhancement to Support Entity Annotation for Arabic Text

Abstract: Improving NLP outputs by extracting structured data from unstructured data is crucial, and several tools are available for English to achieve this objective. However, little attention has been paid to the Arabic language. This research aims to address this issue by enhancing the quality of DBpedia data. One limitation of DBpedia is that each resource can belong to multiple types and may not represent the intended concept; additionally, some resources may be assigned incorrect types. To overcome these limitations, this study proposes creating a new ontology to represent Arabic data using the DBpedia ontology, followed by an algorithm that verifies type assignments using the resource's title metadata and the similarity between resources' descriptions. Finally, the research builds an entity annotation tool for Arabic using the verified dataset.

Author 1: Adham Kahlawi

Keywords: Entity annotation; semantics annotation; DBpedia; Arabic language; ontology; semantic web; linked open data

PDF

Paper 2: Illicit Activity Detection in Bitcoin Transactions using Timeseries Analysis

Abstract: A key motivator for the use of cryptocurrencies such as Bitcoin in illicit activity is the degree of anonymity provided by the alphanumeric addresses used in transactions. This does not mean, however, that anonymity is built into the system, as the transactions being made are still subject to the human element. Additionally, there are around 400 gigabytes of raw data in the Bitcoin blockchain, making this a big data problem. This research uses HPCC Systems, a data-intensive, open-source big data platform. The paper uses timing data, obtained by taking the time intervals between consecutive transactions performed by an address, to identify the nature of the address (illegal or legal). Using three different goodness-of-fit tests, namely the Kolmogorov–Smirnov test, the Anderson–Darling test, and the Cramér–von Mises criterion, two addresses are compared to determine whether they are from the same source. The BABD-13 dataset was used as a source of illegal addresses, providing both reference and test data points. The research shows that time-series data can represent the transactional behaviour of a user and that the proposed algorithm is able to identify different addresses originating from the same user, or users engaging in similar activity.

Author 1: Rohan Maheshwari
Author 2: Sriram Praveen V A
Author 3: Shobha G
Author 4: Jyoti Shetty
Author 5: Arjuna Chala
Author 6: Hugo Watanuki

Keywords: Bitcoin; time-series analysis; HPCC systems; random time interval; illicit activity detection

PDF
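
As a rough illustration of the timing comparison described in the Paper 2 abstract above, the Python sketch below contrasts the inter-transaction intervals of two addresses using the three named two-sample goodness-of-fit tests from SciPy. All timestamps are invented placeholders, not BABD-13 data, and the paper's decision rule is not reproduced.

```python
# Hypothetical illustration: comparing the inter-transaction timing profiles
# of two Bitcoin addresses with the three goodness-of-fit tests named in the
# abstract. The timestamp arrays are made-up stand-ins for real data.
import numpy as np
from scipy import stats

def intervals(timestamps):
    """Time gaps between consecutive transactions of one address."""
    return np.diff(np.sort(np.asarray(timestamps, dtype=float)))

ts_a = [0, 55, 130, 190, 260, 330, 410]   # address A (placeholder data)
ts_b = [5, 70, 120, 200, 255, 340, 395]   # address B (placeholder data)
ia, ib = intervals(ts_a), intervals(ts_b)

ks = stats.ks_2samp(ia, ib)                # Kolmogorov-Smirnov
ad = stats.anderson_ksamp([ia, ib])        # Anderson-Darling (k-sample)
cvm = stats.cramervonmises_2samp(ia, ib)   # Cramér-von Mises

# If none of the tests rejects, the two addresses show similar timing
# behaviour and may originate from the same user.
print(ks.pvalue, ad.significance_level, cvm.pvalue)
```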

Paper 3: Strategic Monitoring for Efficient Detection of Simultaneous APT Attacks with Limited Resources

Abstract: Advanced Persistent Threats (APT) are a type of sophisticated multistage cyber attack, and defending against APT is challenging. Existing studies apply signature-based or behavior-based methods to analyze monitoring data to detect APT, but little research has been dedicated to the important problem of APT detection with limited resources. In order to maintain the primary functionality of a system, the resources allocated for security purposes, for example logging and examining the behavior of a system, are usually constrained. Therefore, when facing multiple simultaneous powerful cyber attacks like APT, the allocation of limited security resources becomes critical. The research in this paper focuses on the threat model where multiple simultaneous APT attacks exist in the defender’s system, but the defender does not have sufficient monitoring resources to check every running process. To capture the footprint of multistage activities, including APT attacks and benign activities, this work leverages the provenance graph, which is constructed from dependencies between processes. Furthermore, this work studies a monitoring strategy to efficiently detect APT attacks from incomplete information about paths on the provenance graph, by considering both the “exploitation” effect and the “exploration” effect. The contributions of this work are two-fold. First, it extends the classic UCB algorithm from the domain of the multi-armed bandit problem to cyber security, and proposes using the malevolence value of a path, generated by a novel LSTM neural network, as the exploitation term. Second, the consideration of “exploration” is innovative in the detection of APT attacks with limited monitoring resources. The experimental results show that the LSTM neural network is beneficial in enforcing the exploitation effect, as it satisfies the same property as the exploitation term in the classic UCB algorithm, and that with the proposed monitoring strategy, multiple simultaneous APT attacks are detected more efficiently than with the random and greedy strategies, in terms of the time needed to detect the same number of APT attacks.

Author 1: Fan Shen
Author 2: Zhiyuan Liu
Author 3: Levi Perigo

Keywords: Advanced persistent threats; intrusion detection; LSTM; multi-armed bandit

PDF
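
Paper 3's monitoring strategy builds on the classic UCB algorithm. The sketch below shows plain UCB1 under stated assumptions: a made-up reward function stands in for the paper's LSTM-generated path malevolence value, and arms stand in for candidate paths to monitor.

```python
# Minimal sketch of the classic UCB1 rule the abstract extends. In the paper
# the exploitation term is a path "malevolence" score from an LSTM; here it
# is a placeholder reward, purely to show the exploration/exploitation balance.
import math
import random

def ucb1_select(counts, values, total_pulls, c=math.sqrt(2)):
    """Pick the arm (monitored path) maximizing value + exploration bonus."""
    best, best_score = 0, float("-inf")
    for arm, (n, v) in enumerate(zip(counts, values)):
        if n == 0:
            return arm                     # try every arm once first
        score = v + c * math.sqrt(math.log(total_pulls) / n)
        if score > best_score:
            best, best_score = arm, score
    return best

n_arms = 5
counts = [0] * n_arms
values = [0.0] * n_arms                    # running mean reward per arm
for t in range(1, 1001):
    arm = ucb1_select(counts, values, t)
    reward = random.random() * (arm + 1) / n_arms   # placeholder reward
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]

print(counts)   # pulls concentrate on the highest-reward arm over time
```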

Paper 4: Deep Learning Algorithm based Wearable Device for Basketball Stance Recognition in Basketball

Abstract: With the continuous improvement of technology, modern sports training is gradually developing towards precision and efficiency, which requires more accurate identification of athletes' sports stances. The study first establishes a classification structure for basketball stances, then designs a hardware module that collects data on different stances using inertial sensors, thus extracting multidimensional motion stance features. The traditional convolutional neural network (CNN) is then improved with principal component analysis (PCA) to form the PCA+CNN algorithm. Finally, the algorithm is simulated and tested. The outcomes demonstrated that the average discrimination error rate of the improved PCA+CNN algorithm on the Human3.6M dataset was a low 3.15%. In recognizing basketball poses, the wearable device based on the improved algorithm achieved the highest accuracy of 99.4% and the shortest time of 18 s, outperforming the other three methods. This demonstrates that the method has high discrimination precision and recognition efficiency, and can provide a reliable technical means to improve the scientific design of basketball training plans and their training effect.

Author 1: Lan Jiang
Author 2: Dongxu Zhang

Keywords: Deep learning; wearable devices; basketball; sports pose; CNN

PDF

Paper 5: Dynamic Hardware Redundancy Approaches Towards Improving Service Availability in Fog Computing

Abstract: The distributed nature of fog computing is designed to alleviate bottleneck traffic congestion, which happens when a massive number of devices try to connect to more powerful computing resources simultaneously. Fog computing focuses on bringing data processing geographically closer to the data source by utilizing existing computing resources such as routers and switches. This heterogeneous nature of fog computing is an important feature and a challenge at the same time. To enhance fog computing availability under such conditions, several studies have been conducted using different methods, such as placement policies and scheduling algorithms. This paper proposes a fog computing model that includes an extra layer comprising a duplex management system. This layer is designated for operating fog managers and warm spares to ensure higher availability for such a geographically disseminated paradigm. A Markov chain is utilized to calculate the probability of each possible state in the proposed model, along with an availability analysis. By utilizing the standby system, we were able to increase availability to 93%.

Author 1: Sara Alraddady
Author 2: Ben Soh
Author 3: Mohammed AlZain
Author 4: Alice Li

Keywords: Fog computing; fault tolerance; Markov chain; hardware redundancy

PDF
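
Since Paper 5's availability analysis rests on a Markov chain over system states, here is a minimal worked example, assuming an invented three-state active/warm-spare model with placeholder failure and repair rates; the paper's actual state space is richer.

```python
# Toy continuous-time Markov availability model in the spirit of the abstract:
# an active fog manager backed by one warm spare. States and rates are
# invented for illustration only.
import numpy as np

lam, mu = 0.2, 1.5          # failure and repair rates (assumed values)
# States: 0 = both up, 1 = one up (spare active), 2 = both down (unavailable)
Q = np.array([[-lam,         lam,  0.0],
              [  mu, -(lam + mu),  lam],
              [ 0.0,          mu,  -mu]])

# Solve pi @ Q = 0 with sum(pi) = 1 for the stationary distribution.
A = np.vstack([Q.T, np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

availability = pi[0] + pi[1]    # available unless both manager and spare fail
print(f"steady-state availability ~= {availability:.4f}")
```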

Paper 6: Eye Contact as a New Modality for Man-machine Interface

Abstract: In daily life, people use many appliances, where different machines and tools must be operated through their specialized interfaces. These specialized interfaces are often not intuitive and thus require considerable time and effort to master. Human communications, on the other hand, are rich in modalities and mostly intuitive. One of them is eye contact. This study proposes eye contact as a new modality for the human-machine interface. The proposed interface modality, based on a neural network for object detection, allows humans to initiate machine operations by looking at them. In this paper, the hardware framework for building this interface is elaborated, and the results of a usability assessment through user experiments are reported.

Author 1: Syusuke Kobayashi
Author 2: Pitoyo Hartono

Keywords: Eye contact; man-machine interface; non-verbal communication; object detection; neural network

PDF

Paper 7: Pancreatic Cancer Segmentation and Classification in CT Imaging using Antlion Optimization and Deep Learning Mechanism

Abstract: Pancreatic cancer, a fatal type of cancer, has a very poor prognosis. To monitor, forecast, and categorise the presence of cancer, automated pancreatic cancer segmentation and classification utilising a Computer-Aided Diagnostic (CAD) model can be used. Furthermore, deep learning algorithms can provide in-depth diagnostic knowledge and precise image analysis for therapeutic usage. In this context, our study aims to develop an Antlion Optimization-Convolutional Neural Network-Gated Recurrent Unit (ALO-CNN-GRU) model for pancreatic tumour segmentation and classification based on deep learning and CT scans. The ALO-CNN-GRU technique’s objective is to segment and categorize the presence of cancer tissues. The technique consists of pre-processing, segmentation, feature extraction, and classification phases. The images first go through a pre-processing stage, using a hybrid Gaussian and median filter, to reduce noise in the acquired dataset. To identify the affected pancreatic area, segmentation is performed using the Antlion optimization method. Then, pancreatic cancer is categorized as benign or malignant by employing Convolutional Neural Network and Gated Recurrent Unit classifiers. The suggested model offers improved precision and a better rate of pancreatic cancer diagnosis, with an accuracy of 99.92%.

Author 1: Radhia Khdhir
Author 2: Aymen Belghith
Author 3: Salwa Othmen

Keywords: Pancreatic cancer; antlion optimization; deep learning; convolutional neural network

PDF

Paper 8: A New Task Scheduling Framework for Internet of Things based on Agile VNFs On-demand Service Model and Deep Reinforcement Learning Method

Abstract: Recent innovations in the Internet of Things (IoT) have given rise to IoT applications that require quick response times and low latency. Fog computing has proven to be an effective platform for handling IoT applications. Deploying fog computing resources effectively is a significant challenge because of the heterogeneity of IoT tasks and their delay sensitivity. To take advantage of idle resources in IoT devices, this paper presents an edge computing concept that offloads edge tasks to nearby IoT devices. IoT-assisted edge computing should meet two conditions: edge services should exploit the computing resources of IoT devices effectively, and edge tasks offloaded to IoT devices should not interfere with local IoT tasks. The proposed method includes two main phases: virtualization of edge nodes, and task scheduling based on deep reinforcement learning. The first phase offers a layered edge framework. In the second phase, we apply deep reinforcement learning (DRL) to schedule tasks, taking into account the diversity of tasks and the heterogeneity of available resources. According to simulation results, our proposed task scheduling method achieves higher levels of task satisfaction and success than existing methods.

Author 1: Li YANG

Keywords: Internet of things; task scheduling; edge computing; resource allocation

PDF

Paper 9: A Blockchain-based Three-factor Mutual Authentication System for IoT using PUFs and Group Signatures

Abstract: The widespread adoption of the Internet of Things has brought many benefits to society, such as increased efficiency and convenience in various aspects of daily life. However, this has also led to a rise in security threats. Moreover, the resource-constrained nature of IoT devices makes them vulnerable to various attacks that compromise user privacy and the confidentiality of sensitive information. It is therefore essential to address the security concerns of IoT devices to ensure their reliable and secure operation. This paper proposes a blockchain-based three-factor mutual authentication system for IoT using elliptic curve cryptography, physical unclonable functions, and group signatures. The main purpose is to achieve secure mutual authentication among the different entities involved, while providing anonymous group member authentication and reliable auditing. The AVISPA tool is utilized to formally prove that the proposed system satisfies the security and privacy requirements.

Author 1: Meriam Fariss
Author 2: Ahmed Toumanari

Keywords: Internet of Things; blockchain; mutual authentication; physical unclonable functions; biometrics; group signatures; elliptic curve cryptography

PDF

Paper 10: Self-Adapting Security Monitoring in Eucalyptus Cloud Environment

Abstract: This paper discusses the importance of virtual machine (VM) scheduling strategies in cloud computing environments for handling the increasing number of tasks brought by the adoption of virtualization and cloud computing technology. The paper evaluates legacy methods and specific VM scheduling algorithms for the Eucalyptus cloud environment and compares existing algorithms using QoS metrics. The paper also presents a self-adapting security monitoring system for cloud infrastructure that takes into account the specific monitoring requirements of each tenant. The system uses Master Adaptation Drivers to convert tenant requirements into configuration settings and an Adaptation Manager to coordinate the adaptation process. The framework ensures security, cost efficiency, and responsiveness to dynamic events in the cloud environment. The paper also identifies needed improvements in the current security monitoring platform, namely supporting more types of monitoring devices and addressing the consequences of multi-tenant setups. Future work includes incorporating log collectors and aggregators and addressing the needs of a super-tenant in the security monitoring architecture. The equitable sharing of monitoring resources between tenants and the provider should be established with an adjustable threshold specified in the SLA. Experimental results show that Enhanced Round-Robin uses less energy than other methods, and the Fusion Method outperforms other techniques by reducing the number of physical machines turned on and increasing power efficiency.

Author 1: Salman Mahmood
Author 2: Nor Adnan Yahaya
Author 3: Raza Hasan
Author 4: Saqib Hussain
Author 5: Mazhar Hussain Malik
Author 6: Kamal Uddin Sarker

Keywords: VM scheduling; cloud computing; Eucalyptus; virtualization; power efficiency; self-adapting security monitoring system; tenant-driven customization; dynamic events; adaptation manager; master adaptation drivers

PDF

Paper 11: Optimal Training Ensemble of Classifiers for Classification of Rice Leaf Disease

Abstract: Rice is one of the most extensively cultivated crops in India. Leaf diseases can have a significant impact on the productivity and quality of a rice crop. Since it has a direct impact on the economy and food security, the detection of rice leaf diseases is critically important. The most prevalent diseases affecting rice leaves are leaf blast, brown spot, and hispa. To address this issue, this research builds a new classification model for rice leaf diseases. The model begins with a preprocessing step that employs a Median Filter (MF). Improved BIRCH is then utilized for image segmentation. Features such as LBP, GLCM, color, shape, and modified Median Binary Pattern (MBP) are extracted from the segmented images. Then, an ensemble of three classification models, comprising Bi-GRU, a Convolutional Neural Network (CNN), and Deep Maxout (DMN), is utilized. By adjusting the model weights, the proposed Opposition Learning Integrated Hybrid Feedback Artificial and Butterfly algorithm (OLIHFA-BA) trains the model to improve its performance.

Author 1: Sridevi Sakhamuri
Author 2: K Kiran Kumar

Keywords: Rice leaf; modified MBP; Bi-GRU; improved BIRCH; OLIHFA-BA algorithm

PDF

Paper 12: Optimal Land-cover Classification Feature Selection in Arid Areas based on Sentinel-2 Imagery and Spectral Indices

Abstract: Adding spectral indices to Sentinel-2 spectral bands to improve land-cover (LC) classification with a limited sample size can affect accuracy due to the curse of dimensionality. In this study, we compared the performance metrics of a Random Forest (RF) classifier with three different combinations of features for land cover classification in an urban arid area. The first combination used the ten Sentinel-2 bands with 10 and 20 m spatial resolution. The second combination consisted of the first combination plus five common spectral indices (15 features). The third combination represented the best-performing subset of features after applying recursive feature elimination (RFE) to the second combination. The results showed that applying RFE reduced the number of features in the second combination from 15 to 8, and the average F1-score increased by nearly 8 and 6 percent compared with the other two combinations, respectively. The findings confirm the importance of feature selection in improving LC classification accuracy in arid areas, by removing redundant variables when the sample size is limited and by using spectral indices together with spectral bands.

Author 1: Mohammed Saeed
Author 2: Asmala Ahmad
Author 3: Othman Mohd

Keywords: Feature selection; land cover; sentinel-2; arid areas; random forest; accuracy

PDF
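
For readers unfamiliar with the elimination step in Paper 12, a minimal scikit-learn sketch of RFE wrapped around a random forest follows, keeping 8 of 15 features as in the abstract; the synthetic data is a placeholder for per-pixel band and index values with land-cover labels.

```python
# Hedged sketch of the feature-elimination step described in the abstract:
# recursive feature elimination (RFE) over a 15-feature stack (10 Sentinel-2
# bands + 5 spectral indices) with a random-forest classifier.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

X, y = make_classification(n_samples=500, n_features=15, n_informative=8,
                           random_state=0)   # stand-in for band/index features

rfe = RFE(estimator=RandomForestClassifier(n_estimators=100, random_state=0),
          n_features_to_select=8)            # the paper keeps 8 of 15 features
rfe.fit(X, y)

print("selected feature mask:", rfe.support_)
print("feature ranking:      ", rfe.ranking_)
```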

Paper 13: An Add-on CNN based Model for the Detection of Tuberculosis using Chest X-ray Images

Abstract: Machine learning has been contributing to smart diagnosis in the medical domain for more than a decade, with a target of achieving higher accuracy in detection and classification. However, from the perspective of medical image processing, the contribution of machine learning to segmentation has been comparatively limited in recent times. The proposed study considers the use case of tuberculosis detection and classification from chest X-rays, where a Convolutional Neural Network, a distinctive machine learning approach, is adopted for the segmentation of lung images from CXR. A computational framework is developed that performs segmentation, feature extraction, detection, and classification. The study outcome of the proposed system is analyzed with and without segmentation against existing machine learning models, exhibiting 99.85% accuracy, the highest score to date in contrast to existing approaches found in the literature. The comparative analysis demonstrates the effectiveness of the proposed system.

Author 1: Roopa N K
Author 2: Mamatha G S

Keywords: Chest X-Ray; machine learning; convolution neural network; segmentation; detection; classification

PDF

Paper 14: Brightness and Contrast Enhancement Method for Color Images Via Pairing Adaptive Gamma Correction and Histogram Equalization

Abstract: For better adaptability to poorly lit images whilst achieving high image contrast, a new method for color image correction, based on the advantages of non-linear functions in grey transformation and histogram equalization techniques, is proposed in this work. Firstly, the original red, green and blue (RGB) image is converted into the HSV color space, and the V channel is used for enhancement. An adaptive gamma generator is proposed to adaptively calculate gamma parameters according to dark, medium, or bright image conditions. The computed gamma parameters are used to propose a cumulative distribution function that produces an optimized curve for illumination values. Next, a second, modified equalization is performed to evenly correct the offset of the illumination curve values on the basis of the equal probability of the available values only. Finally, the processed V channel replaces the original V channel, and the new HSV model is converted back to the RGB color space. Experiments show that the proposed method can significantly improve the low contrast and poor illumination of a color image whilst preserving the color and details of the original image. Results on benchmark datasets and measurements indicate that the proposed method outperforms other state-of-the-art methods.

Author 1: Bilal Bataineh

Keywords: Color image; gamma correction; histogram equalization; image contrast; image enhancement

PDF
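
A simplified sketch of the outer steps of Paper 14's pipeline (RGB to HSV, gamma applied to the V channel, conversion back) is given below; the brightness-to-gamma mapping is an invented heuristic, and the paper's adaptive gamma generator and modified equalization stage are not reproduced.

```python
# Simplified sketch: choose a gamma from the mean brightness of the V channel
# and apply it, leaving hue and saturation untouched. The interpolation table
# is an assumption, not the paper's adaptive formula.
import cv2
import numpy as np

def enhance(img_bgr):
    hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
    h, s, v = cv2.split(hsv)
    mean_v = v.mean() / 255.0
    # dark images get gamma < 1 (brighten), bright images gamma > 1 (darken)
    gamma = np.interp(mean_v, [0.0, 0.5, 1.0], [0.5, 1.0, 1.8])
    v_out = (255.0 * (v / 255.0) ** gamma).astype(np.uint8)
    return cv2.cvtColor(cv2.merge([h, s, v_out]), cv2.COLOR_HSV2BGR)

img = cv2.imread("input.jpg")        # placeholder path to a low-light image
cv2.imwrite("enhanced.jpg", enhance(img))
```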

Paper 15: A Study on Institution Improvement Plans for the National Supercomputing Joint Utilization System in South Korea

Abstract: The purpose of this paper is to discover institutional gaps in the supercomputing joint utilization system that the government is actively promoting as a response to the shortage of domestic supercomputing resources. The institutional gaps were identified by examining the current status of laws, top-level plans, and operating guidelines related to the joint utilization system, and matching them with problems or issues that need to be resolved socially. Improvement plans for these gaps were derived at a level that can be reflected in the operating guidelines of the Specialized Centers and Unit Centers, so that the entities constituting the joint utilization system can participate directly in resolving them. In the future, for the effective operation of the joint utilization system, we plan to promote the domestic market through the diffusion of research results and to secure external technological competitiveness by reflecting the contents of these institutional improvements.

Author 1: Hyungwook Shim
Author 2: Myungju Ko
Author 3: Yungeun Choe
Author 4: Jaegyoon Hahm

Keywords: Supercomputer; joint utilization system; institutional gap; national supercomputing center; specialized center

PDF

Paper 16: Efficient Handwritten Signatures Identification using Machine Learning

Abstract: Any agreement or contract between two or more parties requires at least one party to employ a signature as evidence of the other parties' identities and as a means of establishing the parties' intent. As a result, more people are interested in signature recognition than in other biometric methods such as fingerprint scanning. Utilizing both Fourier descriptors and histogram of oriented gradients (HOG) features, this paper presents an efficient algorithm for signature recognition. The use of local binary pattern (LBP) features in a signature verification technique is also proposed. Using morphological techniques, the signature is encapsulated within a curve that is both symmetrical and a good match. Measured by the frequency with which incorrect patterns are accepted by a given system, the false acceptance rate (FAR) indicates the effectiveness and precision of the proposed system. Using a local dataset of 60 test signature patterns, this investigation found that 10% were incorrectly accepted, for a FAR of 0.169. Experiments are conducted on signature images from a local dataset. A KNN classifier, previously used for signature verification, produced higher FARs and recognition accuracies than prior techniques.

Author 1: Ibraheem M. Alharbi

Keywords: K-nearest neighbor; histogram of oriented gradients; local binary patterns; false acceptance rate; Fourier descriptors

PDF
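
A minimal sketch of the HOG-plus-KNN portion of a pipeline like Paper 16's follows, assuming placeholder image files and labels; the Fourier-descriptor and LBP features and the FAR evaluation are omitted.

```python
# Sketch: extract histogram-of-oriented-gradients features from signature
# images and classify them with a k-nearest-neighbor model. File names below
# are hypothetical placeholders.
import numpy as np
from skimage.feature import hog
from skimage.io import imread
from skimage.transform import resize
from sklearn.neighbors import KNeighborsClassifier

def hog_features(path):
    img = resize(imread(path, as_gray=True), (128, 256))
    return hog(img, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))

train_paths = ["sig_user1_a.png", "sig_user2_a.png"]   # placeholder files
train_labels = ["user1", "user2"]

X = np.array([hog_features(p) for p in train_paths])
knn = KNeighborsClassifier(n_neighbors=1).fit(X, train_labels)
print(knn.predict([hog_features("sig_query.png")]))
```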

Paper 17: An AHP based Task Scheduling and Optimal Resource Allocation in Cloud Computing

Abstract: Cloud systems inherently aim at maximal resource utilization while adapting to ever-evolving user requirements. To enhance the QoS of user applications, numerous factors can be tuned, among which task scheduling is a primary focus. The task scheduling mechanism achieves improvement by distributing subtasks to specific sets of resources according to prevailing quality models. This work emphasizes the need for effective task scheduling and optimized resource allocation by modelling a modified AHP (Analytical Hierarchy Process) driven approach. The proposed method operates in two phases, task ranking pipelined with optimized scheduling, resulting in maximized resource utilization. The task ranking phase is aided by an improved AHP with substantial use of fuzzy clustering, followed by an enhanced CUCMCA (Chimp Updated and Cauchy Mutated Coot Algorithm) for optimal resource allocation of cloud applications. The contributed model achieves improvements of 32% in memory usage, 33.5% in execution time, 29% in makespan, and 18% in communication cost over the pre-existing conventional models considered.

Author 1: Syed Karimunnisa
Author 2: Yellamma Pachipala

Keywords: Task scheduling; AHP; TS; QoS; optimization; CUCMCA

PDF

Paper 18: DMobile-ELA: Digital Image Forgery Detection via cascaded Atrous MobileNet and Error Level Analysis

Abstract: With current developments in technology, not only has digital media become widely available, but its editing and manipulation have become equally accessible to everyone, without any prior experience. The need for detecting manipulated images has grown immensely, as they can now spread false information in news media, forensics, and the daily life of ordinary users. In this work, a cascaded approach, DMobile-ELA, is presented to ensure an image’s credibility and that the data it contains has not been compromised. DMobile-ELA integrates Error Level Analysis and MobileNet-based classification for tampering detection. It achieves promising results compared to the state of the art on the CASIAv2.0 dataset. DMobile-ELA successfully reached a training accuracy of 99.79% and a validation accuracy of 98.48% in detecting image manipulation.

Author 1: Karma M. Fathalla
Author 2: Malak Sowelem
Author 3: Radwa Fathalla

Keywords: Tampering detection; MobileNet; error level analysis; CASIAv2.0

PDF

Paper 19: A Comprehensive Study on Medical Image Segmentation using Deep Neural Networks

Abstract: Over the past decade, Medical Image Segmentation (MIS) using Deep Neural Networks (DNNs) has achieved significant performance improvements and holds great promise for future developments. This paper presents a comprehensive study on MIS based on DNNs. Intelligent Vision Systems are often evaluated based on their output levels, such as Data, Information, Knowledge, Intelligence, and Wisdom (DIKIW), and the state-of-the-art solutions in MIS at these levels are the focus of research. Additionally, Explainable Artificial Intelligence (XAI) has become an important research direction, as it aims to uncover the "black box" nature of previous DNN architectures to meet the requirements of transparency and ethics. The study emphasizes the importance of MIS in disease diagnosis and early detection, particularly for increasing the survival rate of cancer patients through timely diagnosis. XAI and early prediction are considered two important steps in the journey from "intelligence" to "wisdom." Additionally, the paper addresses existing challenges and proposes potential solutions to enhance the efficiency of implementing DNN-based MIS.

Author 1: Loan Dao
Author 2: Ngoc Quoc Ly

Keywords: Medical image segmentation (MIS); SOTA solutions in MIS; XAI; early disease diagnosis

PDF

Paper 20: An Ensemble Multi-Layered Sentiment Analysis Model (EMLSA) for Classifying the Complex Datasets

Abstract: Sentiment analysis is a domain that analyzes the feelings and emotions of users based on their text messages. Sentiment analysis of short messages, reviews in online social media (OSM), and social networking site (SNS) messages provides an analysis of the given text data. Processing short texts and SNS messages is a very tedious task because of the limited detail they generally contain. Solving this issue requires advanced techniques combined to give accurate results. This paper develops an Ensemble Multi-Layered Sentiment Analysis Model (EMLSA) that performs trust-based sentiment analysis on various real-time datasets. EMLSA combines VADER (Valence Aware Dictionary and sEntiment Reasoner) with Recurrent Neural Networks (RNNs). VADER is a lexicon- and rule-based sentiment analysis model that predicts the sentiments extracted from input datasets and is used for training. The feature extraction technique is term frequency-inverse document frequency (TF-IDF). Word-Level Embeddings (WLE) and Character-Level Embeddings (CLE) are two models that enhance short-text and single-word analysis. The proposed model was applied to four real-time datasets: Amazon, eBay, Tripadvisor, and IMDB Movie Reviews. The performance is analyzed using parameters such as sensitivity, specificity, precision, accuracy, and F1-score.

Author 1: Penubaka Balaji
Author 2: D. Haritha

Keywords: Sentiment analysis; online social media; social networking sites; VADER; recurrent neural networks

PDF
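
To make the VADER component of Paper 20 concrete, the sketch below scores two placeholder reviews with NLTK's bundled implementation; the ensemble's RNN branch and TF-IDF features are not reproduced.

```python
# Quick illustration of the lexicon/rule-based VADER scorer named in the
# abstract. The reviews are invented examples, not dataset entries.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

reviews = [
    "Fast shipping and the product works great!",    # placeholder review
    "Terrible quality, broke after two days.",
]
for text in reviews:
    scores = sia.polarity_scores(text)   # neg/neu/pos plus compound in [-1, 1]
    print(f"{scores['compound']:+.3f}  {text}")
```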

Paper 21: Dynamic Programming Approach in Aggregate Production Planning Model under Uncertainty

Abstract: In order to achieve a competitive edge in the market, one of the most essential components of effective operations management is aggregate production planning (APP). The sources of uncertainty addressed in the APP model include uncertainty in demand, production costs, and storage costs. The APP problem usually involves many imprecise, conflicting, and incommensurable objective functions. The application of APP in real conditions is often inaccurate, because some information is incomplete or cannot be obtained. The aim of this study is to develop an APP model under uncertainty with a dynamic programming (DP) approach to meet consumer demand and minimize total costs during the planning period. The APP model includes several parameters: market demand, production costs, inventory costs, production levels, and production capacity. After describing the problem, the optimal APP model is formulated using artificial neural network (ANN) techniques for the demand forecasting process and fuzzy logic (FL) within the DP framework. The ANN technique is used to forecast the input demand for APP, and the FL technique within the DP framework minimizes the total cost during the planning period while accommodating uncertainties. The model input is historical data obtained through interviews. A case study was conducted on the need for aluminum plates in the automotive industry. The results show that the proposed ANN technique has a low error value in forecasting demand, and that FL within the DP framework is able to find minimal production costs in the APP model.

Author 1: Umi Marfuah
Author 2: Mutmainah
Author 3: Andreas Tri Panudju
Author 4: Umar Mansyuri

Keywords: Aggregate production planning; artificial neural network; dynamic programming; fuzzy logic

PDF

Paper 22: A Predictive Approach to Improving Agricultural Productivity in Morocco through Crop Recommendations

Abstract: Agricultural productivity is a critical component of sustainable economic growth, particularly in developing countries. Morocco, with its vast agricultural potential, needs advanced technologies to optimize crop productivity. Precision farming is one such technology, incorporating artificial intelligence and machine learning to analyze data from various sources and make informed decisions about crop management. In this study, we propose a web-based crop recommendation system that leverages ML algorithms to predict the most suitable crop to harvest based on environmental factors such as soil nutrient levels, temperature, and precipitation. We evaluated the performance of five ML algorithms (Decision Tree, Naïve Bayes, Random Forest, Logistic Regression, and Support Vector Machine) and identified Random Forest as the best-performing algorithm. Despite the promising results, we faced several challenges, including the limited availability of data and the need for field validation of the results. Nonetheless, our platform aims to provide free and open-source precision farming solutions to Moroccan farmers to improve agricultural productivity and contribute to sustainable economic growth in the country.

Author 1: Rachid Ed-daoudi
Author 2: Altaf Alaoui
Author 3: Badia Ettaki
Author 4: Jamal Zerouaoui

Keywords: Precision agriculture; artificial intelligence; machine learning; crop recommendation; Morocco

PDF
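
A minimal sketch of the recommendation step in Paper 22 follows, assuming a common nitrogen/phosphorus/potassium/temperature/humidity/pH/rainfall feature layout; all rows are invented placeholders, not the study's data.

```python
# Sketch: a random-forest classifier mapping environmental features to a crop
# label, as in the study's best-performing configuration. Values are fake.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

X_train = np.array([[90, 42, 43, 20.9, 82.0, 6.5, 202.9],   # placeholder rows
                    [20, 67, 20, 23.5, 60.3, 7.1,  45.1],
                    [60, 55, 44, 23.0, 82.3, 7.8, 263.9]])
y_train = ["rice", "lentil", "jute"]

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

query = [[85, 45, 40, 21.5, 80.0, 6.8, 210.0]]   # today's field conditions
print("recommended crop:", model.predict(query)[0])
```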

Paper 23: The Research on the Motion Control of the Sorting Manipulator based on Machine Vision

Abstract: With the development of production technology, manipulators are gradually introduced in advanced production manufacturing industries to complete some tasks such as picking and sorting. However, the traditional manipulator has a complicated sorting process and low production efficiency. In order to improve the accuracy of sorting and reduce the labor intensity of workers, this paper studied the motion control of the sorting manipulator with machine vision. After placing four kinds of objects of different shapes on the conveyor belt, experiments were conducted on the catching and sorting process of the manipulator under different experimental environments, different conveyor belt speeds, and with or without machine vision. It was found that the overall success rate of the sorting robotic arm using machine vision for catching objects of different shapes was as high as 96%, and the sorting accuracy was as high as 97.91%. Therefore, it is concluded that the manipulator can achieve high accuracy in catching and sorting objects with the guidance of machine vision, and the adoption of machine vision has a positive impact on the motion control of the sorting manipulator.

Author 1: Kuandong Peng
Author 2: Zufeng Wang

Keywords: Machine vision; manipulator; motion control; camera calibration; item sorting

PDF

Paper 24: Frequency Domain Improvements to Texture Discrimination Algorithms

Abstract: As the production speeds of factories increase, it becomes more and more challenging to inspect products in real time. The goal of this article is to arrive at a computationally efficient texture discrimination algorithm by first testing the ability of existing algorithms to localize defects and then increasing their efficiency by removing their less effective parts. To this end, the abilities of the most popular texture classification algorithms, such as the GLCM, the LBP, and the SDH, to localize defects are tested on different datasets. These tests reveal that, on small windows, the GLCM and SDH perform better. Frequency properties of the textures are used to fine-tune the parameters of these algorithms. Further experiments on three different datasets show that the accuracy of the algorithms nearly doubles while the processing time decreases considerably.

Author 1: Ibrahim Cem Baykal

Keywords: Machine vision; ANN; SVM; pattern recognition; co-occurrence; texture feature extraction

PDF
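
To illustrate the GLCM features Paper 24 benchmarks, the scikit-image sketch below computes a small per-window feature vector; the window size and property choice are assumptions, and the article's frequency-domain tuning is not shown.

```python
# Gray-level co-occurrence matrix (GLCM) features for one inspection window.
# The random patch stands in for a real product-surface crop.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

window = (np.random.rand(32, 32) * 255).astype(np.uint8)   # stand-in patch

glcm = graycomatrix(window, distances=[1], angles=[0, np.pi / 2],
                    levels=256, symmetric=True, normed=True)
features = {prop: graycoprops(glcm, prop).mean()
            for prop in ("contrast", "homogeneity", "energy", "correlation")}
print(features)   # a small feature vector a classifier can score per window
```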

Paper 25: An Autonomous Multi-Agent Framework using Quality of Service to prevent Service Level Agreement Violations in Cloud Environment

Abstract: The cloud is a specialized computing technology that accommodates several million users, providing seamless services via the internet. The reach of this technology is growing rapidly with the increase in the number of users. One of the major issues with the cloud is that it receives a huge volume of workloads requesting resources to complete their execution. While executing these workloads, the cloud suffers from service level agreement (SLA) violations, which impact the performance and reputation of the cloud. Therefore, an effective design is required that supports faster and optimal execution of workloads without any violation of the SLA. To fill this gap, this article proposes an automatic multi-agent framework that minimizes the SLA violation rate in workload execution. The proposed framework includes seven major agents: the user agent, system agent, negotiator agent, coordinator agent, monitoring agent, arbitrator agent, and history agent. All these agents work cooperatively to enable the effective execution of workloads irrespective of their dynamic nature. Alongside effective workload execution, the proposed model also has the advantage of minimizing energy consumption in data centres. The inclusion of a history agent within the framework enables the model to predict future requirements based on records of resource utilization. The proposed model follows a Poisson distribution to generate random numbers that are further used for evaluation purposes. Simulations show that the model is more reliable in reducing SLA violations than existing works. The proposed method resulted in an average SLA violation rate of 55.71% for 1200 workloads and an average energy consumption of 47.84 kWh for 1500 workloads.

Author 1: Jaspal Singh
Author 2: Major Singh Goraya

Keywords: Cloud computing; multi-agent framework; SLA violations; energy consumption; history agent; Poisson distribution

PDF

Paper 26: An Efficient Deep Learning based Hybrid Model for Image Caption Generation

Abstract: In recent years, with the increase in the use of different social media platforms, image captioning approaches have come to play a major role in automatically describing a whole image in a natural-language sentence. Image captioning plays a significant role in a computer-based society. It is the process of automatically generating a natural-language textual description of an image using artificial intelligence techniques. Computer vision and natural language processing are the key aspects of such a system: a Convolutional Neural Network (CNN) belongs to computer vision and is used for object detection and feature extraction, while Natural Language Processing (NLP) techniques help in generating the textual caption of the image. Generating a suitable image description by machine is a challenging task, as it is based upon object detection, object location, and their semantic relationships in a human-understandable language such as English. In this paper, we aim to develop an encoder-decoder based hybrid image captioning approach using VGG16, ResNet50, and YOLO. VGG16 and ResNet50 are pre-trained feature extraction models trained on millions of images, while YOLO is used for real-time object detection. The approach first extracts image features using VGG16, ResNet50, and YOLO and concatenates the results into a single file. Finally, LSTM and BiGRU are used for the textual description of the image. The proposed model is evaluated using BLEU, METEOR, and ROUGE scores.

Author 1: Mehzabeen Kaur
Author 2: Harpreet Kaur

Keywords: CNN; RNN; LSTM; YOLO

PDF

Paper 27: Collaborative based Vehicular Ad-hoc Network Intrusion Detection System using Optimized Support Vector Machine

Abstract: A Vehicular Ad hoc Network (VANET) can be used to provide secured information to user vehicles. However, safeguarding this information from vulnerabilities and threats is a great challenge these days. It is therefore necessary to provide a secured solution that improves security through the deployment of advanced technology. In this context, a blockchain-based VANET structure for secured communication, incorporating enhanced confidentiality, scalability, and privacy, is proposed. Clusters are formed using a k-means clustering model, and cluster head selection is carried out with a Tabu Search-based Particle Swarm Optimization (TS-PSO) algorithm. The proposed approach aims to mitigate delay while enhancing throughput and energy efficiency; meanwhile, the deployed blockchain enhances reliability and security. Moreover, the novel War Strategy Optimization (WSO) based Support Vector Machine (SVM) model (Optimized SVM) is used for trust-based collaborative intrusion detection in the VANET. Our work targets the detection of intrusion and non-intrusion classes. The proposed work can also prevent repetitive detection processes, thereby enhancing security by rewarding the vehicles. An experimental analysis is carried out to demonstrate its use in detecting malicious nodes among resource-constrained vehicles and in achieving better security, energy utilization, and end-to-end delay.

Author 1: Azath M
Author 2: Vaishali Singh

Keywords: Vehicular Ad hoc network; intrusion detection; tabu search based particle swarm optimization; war strategy optimization; support vector machine

PDF

Paper 28: Content-Based Image Retrieval using Encoder based RGB and Texture Feature Fusion

Abstract: The recent development of digital photography and the use of social media on smartphones have boosted the demand for querying images by their visual semantics. Content-Based Image Retrieval (CBIR) is a well-established research area in the domain of image and video data analysis. The major challenges of a CBIR system are (a) deriving the visual semantics of the query image and (b) finding all similar images in the repository. The objective of this paper is to precisely define the visual semantics using hybrid feature vectors. In this paper, a CBIR system using encoder-based feature fusion is proposed. The CNN encoding features of the RGB channels are fused with the encoded texture features of LBP, CSLBP, and LDP separately. The retrieval performance of the different fused features is tested using three public datasets, i.e., Corel-1K, Caltech, and 102flower. The results show that class properties are better retained using LDP with RGB encoded features, which enhances the classification and retrieval performance for all three datasets. The average precision is 94.5% for Corel-1K, 89.7% for Caltech, and 88.7% for 102flower. The average F1-score is 89.5% for Caltech and 88.5% for 102flower. The improvement in the F1-score implies that the proposed fused feature is more stable in dealing with the class-imbalance problem.

Author 1: Charulata Palai
Author 2: Pradeep Kumar Jena
Author 3: Satya Ranjan Pattanaik
Author 4: Trilochan Panigrahi
Author 5: Tapas Kumar Mishra

Keywords: CBIR; CNN Encoded Feature; LBP; CSLBP; LDP; feature fusion

PDF
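
A sketch of the texture branch of Paper 28's fusion is shown below: a uniform LBP histogram concatenated with a placeholder vector standing in for the CNN-encoded RGB features. The P, R, and vector-size choices are assumptions.

```python
# Texture descriptor sketch: uniform local binary pattern (LBP) histogram,
# concatenated with a dummy CNN-encoder output to mimic the fused descriptor.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray, P=8, R=1.0):
    lbp = local_binary_pattern(gray, P, R, method="uniform")
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    return hist

gray = (np.random.rand(64, 64) * 255).astype(np.uint8)   # stand-in image
texture_vec = lbp_histogram(gray)
rgb_encoded = np.zeros(128)          # placeholder for a CNN encoder output
fused = np.concatenate([rgb_encoded, texture_vec])        # fused descriptor
print(fused.shape)
```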

Paper 29: Knowledge Graph based Representation to Extract Value from Open Government Data

Abstract: Open government data refers to data that is made available by government entities to be freely reused by anyone and for any purpose. The potential benefits of open government data are numerous and include increasing transparency and accountability, enhancing citizens' quality of life, and boosting innovation. However, realizing these benefits is not always straightforward, as the usage of this raw data often faces challenges related to its format, structure, and heterogeneity, which hinder its processability and integration. In response to these challenges, we propose an approach to maximize the usage of open government data and achieve its potential benefits. This approach leverages knowledge graphs to extract value from open government data and drives the construction of a knowledge graph from structured, semi-structured, and unstructured formats. It involves the extraction, transformation, semantic enrichment, and integration of heterogeneous open government data sources into an integrated and semantically enhanced knowledge graph. Learning mechanisms and ontologies are used to efficiently construct the knowledge graph. We evaluate the effectiveness of the approach using real-world public procurement data and show that it can detect potential fraud such as favoritism.

Author 1: Kawtar YOUNSI DAHBI
Author 2: Dalila CHIADMI
Author 3: Hind LAMHARHAR

Keywords: Knowledge graph; open government data; knowledge graph construction; public procurement; fraud detection

PDF

Paper 30: Deep Learning CNN Model-Based Anomaly Detection in 3D Brain MRI Images using Feature Distribution Similarity

Abstract: Various approaches to detecting anomalies in brain images are discussed in the literature. Features like white-mass values and shape features have been used to identify the presence of brain tumors. Various deep learning models, such as neural networks, have been adapted to the tumor detection problem but struggle to reach maximum accuracy in detecting brain tumors. To handle this problem, an Adaptive Feature Centric Distribution Similarity Based Anomaly Detection Model with Convolutional Neural Network (AFCD-CNN) is proposed for the disease prediction problem. The model considers black-and-white mass features together with the distribution of features. First, the method applies the Multi-Hop Neighbor Analysis (MHNA) algorithm to normalize the brain image. The method then uses the Adaptive Mass Determined Segmentation (AMDS) algorithm, which groups the pixels of the MRI according to white and black mass values. The method extracts the ROI from the segmented image and convolves the features with the CNN in the training phase. The CNN is designed to convolve the features into one dimension. The output-layer neurons estimate different Feature Distribution Similarity (FDS) values against various features to compute the Anomaly Class Weight (ACW). According to the ACW value, anomaly detection is performed with accuracy of up to 97%, while the processing time is reduced to 32 seconds.

Author 1: Amarendra Reddy Panyala
Author 2: M. Baskar

Keywords: Deep learning; brain tumor; disease prediction; anomaly detection; CNN; FDS; ACW

PDF

Paper 31: An Algorithm Transform DNA Sequences to Improve Accuracy in Similarity Search

Abstract: Similarity search of DNA sequences is a fundamental problem in bioinformatics, serving as the basis for many other problems. The calculation of the similarity value between sequences is the most important step; the edit distance (ED) is commonly used because of its high accuracy, but it is slow. Transforming the original DNA sequences into numerical vectors that retain unique, property-based features makes calculations on the transformed data many times faster than direct comparison of the original sequences. Additionally, a long DNA sequence typically requires less storage after transformation, giving good data compression. The challenge of this work is to develop algorithms based on features that maintain biological significance while ensuring search accuracy, which is the problem to be solved. Previous methods often used pure mathematical statistics, such as frequency statistics and matrix transformations, to construct features. In this paper, an improved algorithm based on both biological significance and mathematical statistics is proposed for transforming gene data into numerical vectors, easing storage and improving accuracy in similarity search between DNA sequences. Experimental results show that the new algorithm improves the accuracy of similarity calculations while maintaining good performance.

Author 1: Hoang Do Thanh Tung
Author 2: Phuong Vuong Quang

Keywords: Similarity search; data transformation; DNA sequence; big data

PDF
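
Two pieces implied by Paper 31's abstract are sketched below in minimal form: the exact edit distance used as the accuracy baseline, and a generic k-mer frequency transform of a DNA sequence into a fixed-length numerical vector. The k-mer scheme is a stand-in, not the paper's specific feature construction.

```python
# (1) Classic dynamic-programming Levenshtein (edit) distance.
# (2) A simple k-mer frequency vector: a generic sequence-to-vector transform.
from itertools import product

def edit_distance(a, b):
    """Single-row DP implementation of the Levenshtein distance."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (ca != cb))
    return dp[-1]

def kmer_vector(seq, k=2):
    """Frequency vector over all 4**k possible k-mers of A/C/G/T."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {km: 0 for km in kmers}
    for i in range(len(seq) - k + 1):
        counts[seq[i:i + k]] += 1
    total = max(1, len(seq) - k + 1)
    return [counts[km] / total for km in kmers]

print(edit_distance("ACGTACGT", "ACGAACGT"))   # -> 1 (one substitution)
print(kmer_vector("ACGTACGT")[:4])             # first entries of the vector
```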

Paper 32: An Automated Text Document Classification Framework using BERT

Abstract: Due to the rapid advancement of technology, the volume of online text data from numerous disciplines is increasing significantly over time. Therefore, more work is needed to create systems that can effectively classify text data in accordance with its content, facilitating processing and the extraction of crucial information. Traditional, non-automated procedures rely on manual feature extraction and classification, which is error-prone and time-consuming, and choosing the most appropriate algorithms for feature extraction and classification makes them resource-intensive (computationally, in human effort, etc.), which is not a viable solution. To address the shortcomings of traditional approaches, we offer a text categorization strategy based on the well-known deep learning model BERT. The proposed framework is trained and tested on established text datasets, such as the UCI email dataset, which includes spam and non-spam emails, and the BBC News dataset, which includes categories such as tech, sports, politics, business, and entertainment. The system achieved a highest accuracy of 91.4% and can be used by different organizations to classify text-based data with high performance. The effectiveness of the proposed framework is evaluated using multiple evaluation metrics, such as accuracy, precision, and recall.

Author 1: Momna Ali Shah
Author 2: Muhammad Javed Iqbal
Author 3: Neelum Noreen
Author 4: Iftikhar Ahmed

Keywords: Deep learning; text classification; BERT

PDF

Paper 33: Experimental Evaluation of Genetic Algorithms to Solve the DNA Assembly Optimization Problem

Abstract: This paper highlights the motivations for investigating genetic algorithms (GAs) to solve the DNA Fragment Assembly (DNAFA) problem. DNAFA is an optimization problem that attempts to reconstruct an original DNA sequence by finding the shortest DNA sequence from a given set of fragments. This paper is a continuation of our previous research, in which the existence of a polynomial-time reduction of DNAFA to the Traveling Salesman Problem (TSP) and the Quadratic Assignment Problem (QAP) was discussed. Taking advantage of this reduction, this work conceptually designs a genetic algorithm (GA) platform to solve the DNAFA problem. This platform offers several ingredients enabling the creation of several variants of GA solvers for the DNAFA optimization problem. The main contribution of this paper is the design of an efficient GA variant obtained by carefully integrating different GA operators of the platform. To that end, this work individually studies the effects of different GA operators on the performance of solving the DNAFA problem. This study has the advantage of benefiting from prior knowledge of the performance of these operators in the contexts of the TSP and QAP problems. The best designed GA variant shows a significant improvement in accuracy (overlap score), reaching more than 172% of what is reported in the literature.

Author 1: Hachemi Bennaceur
Author 2: Meznah Almutairy
Author 3: Nora Alqhtani

Keywords: Genetic algorithms; traveling salesman problem; quadratic assignment problem; DNA fragments assembly problem

PDF

Paper 34: Fake News Classification Web Service for Spanish News by using Artificial Neural Networks

Abstract: The use of digital media, such as social networks, has promoted the spreading of fake news on a large scale. Therefore, several machine learning techniques, such as artificial neural networks, have been used for fake news detection and classification. These techniques are widely used due to their learning capabilities. Besides, models based on artificial neural networks can be easily integrated into social media and websites to spot fake news early and avoid its propagation. Nevertheless, most fake news classification models are available only for English news, limiting the possibility of detecting fake news in other languages, such as Spanish. For this reason, this study proposes implementing a web service that integrates a deep learning model for the classification of fake news in Spanish. To determine the best model, the performance of several neural network architectures, including MLP, CNN, and LSTM, was evaluated using the F1 score. The LSTM architecture was the best, with an F1 score of 0.746. Finally, the efficiency of the web service was evaluated using temporal behavior as a metric, resulting in an average response time of 1.08 seconds.

Author 1: Patricio Xavier Moreno-Vallejo
Author 2: Gisel Katerine Bastidas-Guacho
Author 3: Patricio Rene Moreno-Costales
Author 4: Jefferson Jose Chariguaman-Cuji

Keywords: Fake news; LSTM; classification; web service; machine learning

PDF

Paper 35: Support Vector Regression based Localization Approach using LoRaWAN

Abstract: The Internet of Things (IoT) domain has experienced significant growth in recent times, and extensive research has been conducted in various areas of IoT, including localization. Localization of Long Range (LoRa) nodes in outdoor environments is an important task for various applications, including asset tracking and precision agriculture. In this research article, a localization approach using Support Vector Regression (SVR) is implemented to predict the location of an end node using LoRaWAN. The experiments are conducted in an outdoor campus environment. The SVR uses Received Signal Strength Indicator (RSSI) fingerprints to locate the end nodes. The results show that the proposed method can locate the end node with a minimum error of 36.26 meters and a mean error of 171.59 meters.

Author 1: Saeed Ahmed Magsi
Author 2: Mohd Haris Bin Md Khir
Author 3: Illani Bt Mohd Nawi
Author 4: Abdul Saboor
Author 5: Muhammad Aadil Siddiqui

Keywords: LoRaWAN; localization; RSSI; fingerprinting; support vector regression

PDF
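
A minimal sketch of RSSI-fingerprint localization with SVR, as in Paper 35, follows: one regressor per coordinate, trained on RSSI vectors measured at known positions. All numbers are invented; real fingerprints would come from LoRa gateways.

```python
# Fingerprint localization sketch: RSSI vectors at known (x, y) reference
# points train an SVR per coordinate; a new RSSI vector yields a position.
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

rssi = np.array([[-70, -82, -91],      # RSSI at 3 gateways per reference point
                 [-75, -78, -88],
                 [-80, -74, -85],
                 [-85, -70, -80],
                 [-88, -68, -76]])
xy = np.array([[0, 0], [25, 10], [50, 20], [75, 30], [100, 40]])  # meters

model = MultiOutputRegressor(SVR(kernel="rbf", C=10.0, epsilon=1.0))
model.fit(rssi, xy)

print(model.predict([[-78, -75, -86]]))   # estimated (x, y) of the end node
```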

Paper 36: Text-based Sarcasm Detection on Social Networks: A Systematic Review

Abstract: Sarcasm is a sophisticated phenomenon used for conveying a meaning that differs from what is being said, and it is usually used to express displeasure or to ridicule others. Sentiment analysis is the process of uncovering subjective information from a text, and detecting figurative language, such as irony or sarcasm, is a challenging research focus within it. Detecting and understanding the use of sarcasm in social networks could provide businesses and politicians with significant insight, since it reflects people’s opinions about certain topics, news, and products. This has become especially relevant recently, because sarcastic texts have been trending on social networks and are being posted by millions of active users. As a result, there is now an increasing amount of research on the detection of sarcasm in social network posts. Many works have been published on sarcasm detection, covering a wide variety of techniques based on rules, lexicons, traditional machine learning, deep learning, and transformers. However, sarcasm detection is a challenging task due to the ambiguity and non-straightforward nature of sarcastic text. In addition, very few reviews have been conducted on the research in this area. Therefore, this systematic review explores sarcasm detection articles on social networks published between 2019 and 2022. Several databases were extensively searched, and 30 articles that met the criteria were included. The selected articles were reviewed based on their approaches, datasets, and evaluation metrics. The findings emphasize that deep learning is the most commonly used technique for sarcasm detection in recent literature, and that Twitter and the F-measure are the most used data source and performance metric, respectively. Finally, this article presents a brief discussion of the challenges in sarcasm detection and future research directions.

Author 1: Amal Alqahtani
Author 2: Lubna Alhenaki
Author 3: Abeer Alsheddi

Keywords: Sentiment analysis; figurative language; sarcasm detection; irony; machine learning; deep learning; transformer

PDF

Paper 37: Current Development, Challenges, and Future Trends in Cloud Computing: A Survey

Abstract: Cloud computing is a new paradigm in information and communication technologies (ICTs) that provides access to shared pools of computing resources used by many cloud users on a pay-per-use or on-demand basis. It has transformed the ICT delivery model from a product to a service, offering institutions, companies, and users savings and reduced capital expenditure through lower operating expenses. This paper provides a comprehensive survey of cloud computing. It first develops a general understanding of cloud computing and discusses its advantages, current development, challenges, and future trends. Subsequently, a detailed discussion of cloud computing architectures, service models, fault tolerance mechanisms, service selection methods, adoption by industry, and scheduling of cloud-based resources is presented. Nonetheless, cloud computing faces many obstacles that expose it to a number of limitations, including data security, fault tolerance, and load balancing. A number of techniques proposed in the literature to cope with these challenges are discussed and analyzed. Experimental data and usage trends validate the popularity of cloud computing and its adoption in recent years. Future trends in cloud computing support the use of intelligent machine learning (ML) techniques and new technologies to address some of these challenges and to make cloud computing more efficient, secure, and commercially viable for wide acceptance.

Author 1: Hazzaa N. Alshareef

Keywords: Cloud computing; security challenges; machine learning; resource scheduling; information and communication technologies

PDF

Paper 38: A High-Performance Approach for Irregular License Plate Recognition in Unconstrained Scenarios

Abstract: This paper proposes a novel framework for locating and recognizing irregular license plates in real-world complex scene images. In the proposed framework, an efficient deep convolutional neural network (CNN) specially designed for keypoint estimation is first employed to predict the corner points of license plates. Then, based on the predicted corner points, perspective transformation is performed to align the detected license plates. Finally, a lightweight deep CNN based on the YOLO detector is designed to predict license plate characters. The character recognition network can predict license plate characters regardless of license plate layout (i.e., single-line or double-line text). Experimental results on the CCPD and AOLP datasets demonstrate that the proposed method obtains better recognition accuracy than previous methods. The proposed model also achieves impressive inference speed and can be deployed in real-time applications.
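
The plate-alignment step can be pictured with OpenCV's perspective transform, as in this hedged sketch; the corner coordinates and output size are hypothetical, and the keypoint network that would predict them is not reproduced here.

```python
import cv2
import numpy as np

# Stand-in scene image and hypothetical plate corners predicted by the keypoint
# network, ordered top-left, top-right, bottom-right, bottom-left.
image = np.zeros((480, 640, 3), dtype=np.uint8)
corners = np.float32([[112, 240], [298, 252], [295, 310], [109, 296]])

w, h = 240, 80  # assumed size of the rectified plate crop
dst = np.float32([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]])

M = cv2.getPerspectiveTransform(corners, dst)    # 3x3 homography
aligned = cv2.warpPerspective(image, M, (w, h))  # input to the character recognizer
```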

Author 1: Hoanh Nguyen

Keywords: License plate recognition; deep learning; convolutional neural network; keypoint detector; YOLO detector

PDF

Paper 39: Proxy Re-encryption Scheme based on the Timed-release in Edge Computing

Abstract: With the growth of the Industrial Internet, various types of data show explosive growth. Data is moved from the cloud to the edge for computing more and more frequently, and edge computing has become an important factor affecting the deep application of Industrial Internet platforms. However, the security of data transmission and sharing remains unaddressed. Therefore, this paper proposes a security scheme based on timed-release, multi-dimensional virtual permutation, and proxy re-encryption (PRE) to protect the confidentiality of data during transmission; symmetric cryptography is employed to encrypt the transmitted data. At the same time, a time server is used, making it impossible for data receivers to obtain any information about the data before the specified release time. This addresses data transmission and sharing scenarios with timed-release requirements, improves efficiency for large volumes of data, and strengthens data security. Finally, the security of the scheme is proved theoretically. Compared to existing PRE schemes, this scheme adds timed-release access control, resists ciphertext attacks, provides end-to-end security, and uses fewer bilinear pairing operations. The performance was tested experimentally, and the results show that the scheme improves efficiency while ensuring security and has significant advantages in terms of data security and private data protection.

Author 1: Yifeng Yin
Author 2: Wanyi Zhou
Author 3: Zhaobo Wang
Author 4: Yong Gan
Author 5: Yanhua Zhang

Keywords: Timed-release; edge-computing; multi-dimensional virtual permutation; proxy re-encryption; symmetric encryption

PDF

Paper 40: Automatic Detection of Software Defects based on Machine Learning

Abstract: Defects in software are one of the critical problems in the software engineering community because they produce inaccurate results and negatively affect the quality and reliability of the software. These defects must be detected in the early stages of software development. Researchers have used Software Defect Detection (SDD) techniques to predict module fault-proneness. By applying hyperparameter optimization techniques and addressing data imbalance, this paper proposes and develops an SDD model with high performance and generalization capability. To classify defects in software modules, machine learning algorithms and ensemble learning techniques are applied to balanced datasets. The balanced datasets are obtained using a hybrid of the Synthetic Minority Oversampling Technique (SMOTE) and the Support Vector Machine (SVM). To obtain the optimal hyperparameters for the classifiers and the dataset balancing algorithms, the Non-dominated Sorting Genetic Algorithm II (NDSGA-II) is used. To reduce time and save other resources, the Hyperband technique, a multi-fidelity optimization method, is used within NDSGA-II. A 10-fold Cross Validation (CV) is applied to overcome overfitting and underfitting. The accuracy, recall, F-measure, and ROC AUC metrics are used to evaluate the SDD model. The results show that the proposed model predicts defects more accurately than the compared studies.
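
One off-the-shelf realization of an SMOTE/SVM balancing hybrid is imblearn's SVMSMOTE; the sketch below (with synthetic data, a stand-in classifier, and no NDSGA-II/Hyperband tuning) only illustrates the balance-then-classify pipeline under 10-fold cross-validation.

```python
from imblearn.over_sampling import SVMSMOTE
from imblearn.pipeline import Pipeline
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for an imbalanced defect dataset (~10% defective modules).
X, y = make_classification(n_samples=1000, weights=[0.9], random_state=0)

pipe = Pipeline([
    ("balance", SVMSMOTE(random_state=0)),           # SVM-guided SMOTE oversampling
    ("clf", RandomForestClassifier(random_state=0)),
])
scores = cross_val_score(pipe, X, y, cv=10, scoring="roc_auc")
print("10-fold ROC AUC:", scores.mean().round(3))
```

Using the imblearn Pipeline keeps the oversampling inside each fold, so the validation split is never contaminated with synthetic samples.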

Author 1: Nawal Elshamy
Author 2: Amal AbouElenen
Author 3: Samir Elmougy

Keywords: Software defect detection; NDSGA-II; hyperband; imbalance dataset

PDF

Paper 41: Balancing Technological Advances with User Needs: User-centered Principles for AI-Driven Smart City Healthcare Monitoring

Abstract: In recent years, the integration of artificial intelligence (AI) technologies has greatly benefited smart city healthcare, meeting the growing demand for affordable, efficient, and real-time healthcare services. Patient monitoring is one area where artificial intelligence has shown great promise. The advancement of AI-based monitoring systems, which enable more personalized and continuous patient monitoring, has made improved health outcomes possible. However, to maximize the benefits of these systems, a user-centered approach is essential: one that prioritizes patients' needs and experiences while ensuring their privacy and autonomy are respected. This study focuses on the application of user-centered design principles in the development and deployment of AI-driven monitoring systems in smart city healthcare. Addressing the challenges and opportunities of AI-driven monitoring systems, the article considers issues such as privacy and security concerns, data accuracy, and user acceptance. Finally, some possible future directions for addressing these challenges are suggested. A user-centered approach to AI monitoring systems is recommended for healthcare providers to enhance the patient experience in smart city healthcare.

Author 1: Ali H. Hassan
Author 2: Riza bin Sulaiman
Author 3: Mansoor A. Abdulgabber
Author 4: Hasan Kahtan

Keywords: Smart healthcare; patient monitoring; smart city; artificial intelligence; user-centered

PDF

Paper 42: Investigation of Combining Deep Learning Object Recognition with Drones for Forest Fire Detection and Monitoring

Abstract: Forest fires are a global environmental problem that can cause significant damage to natural resources and human lives. The increasing frequency and severity of forest fires have resulted in substantial losses of natural resources. To mitigate this, an effective fire detection and monitoring system is crucial. This work explores and reviews current advances in forest fire detection and monitoring using drones, or unmanned aerial vehicles (UAVs), together with deep learning techniques. Drones equipped with dedicated sensors and cameras provide a cost-effective and efficient solution for real-time monitoring and early fire detection. In this paper, we conduct a comprehensive analysis of the latest developments in deep learning object detection, such as YOLO (You Only Look Once), R-CNN (Region-based Convolutional Neural Network), and their variants, with a focus on their potential application to forest fire monitoring. The experiments performed show promising results across multiple metrics, making this a valuable tool for fire detection and monitoring.

Author 1: Mimoun YANDOUZI
Author 2: Mounir GRARI
Author 3: Mohammed BERRAHAL
Author 4: Idriss IDRISSI
Author 5: Omar MOUSSAOUI
Author 6: Mostafa AZIZI
Author 7: Kamal GHOUMID
Author 8: Aissa KERKOUR ELMIAD

Keywords: Forest fire; deep learning; drones; unmanned aerial vehicles; object detection; YOLO; Faster R-CNN

PDF

Paper 43: Method for Frequent High Resolution of Optical Sensor Image Acquisition using Satellite-Based SAR Image for Disaster Mitigation

Abstract: A method for the frequent acquisition of high-resolution optical sensor imagery from satellite-based SAR (Synthetic Aperture Radar) images for disaster mitigation is proposed. The proposed method is based on GAN (Generative Adversarial Network)-based super resolution and conversion from SAR imagery to the corresponding optical sensor imagery in order to increase observation frequency. Through experiments, it is found that SAR imagery can be converted to the corresponding optical sensor imagery, and that the spatial resolution of the SAR imagery is improved remarkably. Thus, the initial stage of a disaster (a small-scale disaster) can be detected with resolution-enhanced optical sensor imagery derived from the corresponding SAR imagery, which helps prevent the secondary occurrence of relatively large-scale disasters. It is also found that optical sensor imagery with 2.5 m spatial resolution can be acquired every 2.5 days when, for instance, only Sentinel-1/SAR and Sentinel-2/MSI (Multi Spectral Imager) are used.

Author 1: Kohei Arai
Author 2: Yushin Nakaoka
Author 3: Osamu Fukuda
Author 4: Nobuhiko Yamaguchi
Author 5: Wen Liang Yeoh
Author 6: Hiroshi Okumura

Keywords: Frequent observation; Synthetic Aperture Radar: SAR; super resolution; Generative Adversarial Network: GAN; GAN-based conversion of images

PDF

Paper 44: A Comparative Study of Twofish, Blowfish, and Advanced Encryption Standard for Secured Data Transmission

Abstract: Nowadays, network security is becoming an increasingly significant and demanding research area. Threats and attacks on information and Internet security are getting increasingly difficult to detect. As a result, encryption has emerged as a solution and now plays a critical role in information security systems, and many techniques are required to safeguard shared data. In this work, the encryption time, decryption time, and throughput (speed) of the three most commonly used block cipher algorithms, Twofish, Blowfish, and AES, were investigated using different file types. Experimental comparison of symmetric encryption techniques consumes substantial computing resources, including CPU time, memory, and battery power, and previous research has yielded diverse results in terms of time complexity, speed, space complexity, power consumption, and security. This research evaluated the effectiveness of each algorithm based on two parameters: processing time and speed. An application was developed in Python 3.10 to simulate the encryption of different file formats and to measure processing time and speed.
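
The timing methodology can be sketched as below. This is an assumption-laden illustration using PyCryptodome (which ships AES and Blowfish; Twofish needs a third-party package and is omitted), not the authors' application.

```python
import os
import time
from Crypto.Cipher import AES, Blowfish  # pycryptodome; Twofish is not included

data = os.urandom(16 * 1024 * 1024)  # 16 MiB test payload

def throughput(make_cipher):
    cipher = make_cipher()
    start = time.perf_counter()
    cipher.encrypt(data)
    return len(data) / (time.perf_counter() - start) / 2**20  # MiB/s

aes = lambda: AES.new(os.urandom(16), AES.MODE_CTR, nonce=os.urandom(8))
bf = lambda: Blowfish.new(os.urandom(16), Blowfish.MODE_CTR, nonce=os.urandom(4))
print(f"AES: {throughput(aes):.1f} MiB/s  Blowfish: {throughput(bf):.1f} MiB/s")
```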

Author 1: Kwame Assa-Agyei
Author 2: Funminiyi Olajide

Keywords: Cryptography; twofish; blowfish; advanced encryption standard; throughput; data encryption; decryption

PDF

Paper 45: Elitist Animal Migration Optimization for Protein Structure Prediction based on 3D Off-Lattice Model

Abstract: Predicting the structure of proteins has long been a center of attention for researchers. The aim is to make a reliable prediction of the protein structure by finding the minimum energy among amino acid interactions. From the resulting shape of the amino acid chain, the functionality of the protein can be determined. However, this is known as one of the most challenging tasks in bioinformatics given its high computational complexity. Metaheuristic algorithms are preferred by researchers from various fields, since their performance in solving such complex problems is quite satisfactory. The Animal Migration Optimization (AMO) algorithm is a metaheuristic approach that mimics the behavior of animals during migration. In this research, to reach higher solution quality, an elitist version of the Animal Migration Optimization (ELAMO) algorithm is introduced and applied to the Protein Structure Prediction (PSP) problem. The performance of ELAMO is tested on well-studied artificial and real protein sequences and compared with powerful optimization algorithms specially designed for the PSP problem. The results show that ELAMO is quite capable of solving this problem. Hence, it can be used as an efficient optimizer for complex bioinformatics problems that require high solution quality.

Author 1: Ezgi Deniz Ülker

Keywords: Animal migration optimization; bioinformatics; elitism; metaheuristics; protein structure prediction

PDF

Paper 46: Improved Multiclass Brain Tumor Detection using Convolutional Neural Networks and Magnetic Resonance Imaging

Abstract: Recently, deep learning algorithms, particularly Convolutional Neural Networks (CNNs), have been applied extensively to image recognition and classification tasks, with successful results in medicine, such as in medical image analysis. Radiologists have a hard time categorizing brain tumors because this lethal illness comprises a variety of tumor cells. Lately, computer-aided diagnostic methods have employed magnetic resonance imaging (MRI) to help with the diagnosis of brain cancers, and CNNs are often used in this kind of medical image analysis. This work was motivated by the difficulty physicians have in accurately detecting brain tumors, particularly in the early stages of brain bleeding. The proposed model categorizes brain images into four distinct classes: normal, glioma, meningioma, and pituitary. The proposed CNN network reaches 95% recall, 95.44% accuracy, and an F1-score of 95.36%.

Author 1: Mohamed Amine Mahjoubi
Author 2: Soufiane Hamida
Author 3: Oussama El Gannour
Author 4: Bouchaib Cherradi
Author 5: Ahmed El Abbassi
Author 6: Abdelhadi Raihani

Keywords: Deep learning; convolutional neural networks; brain tumor; classification; magnetic resonance imaging

PDF

Paper 47: Sentiment Analysis on Moroccan Dialect based on ML and Social Media Content Detection

Abstract: As technology continues to evolve, humans tend to follow suit, and social media has now become the de facto method of communication. As with verbal communication, people express their opinions in written form, and through an analysis of their words one can extract what an individual wants from a product, a topic, or an event. By looking at the emotions expressed in such content, governments, businesses, and people can learn a lot that can help them improve their strategies. Therefore, in this study, we use different algorithms to improve Moroccan sentiment classification. The first step is to gather and prepare Moroccan Dialectal Arabic Twitter comments. Then, many combinations of feature extraction (n-grams), weighting schemes (BOW/TF-IDF), and word embeddings are applied to obtain the best classification models. We used Naive Bayes, Random Forests, Support Vector Machines, Logistic Regression, and LSTM to classify the prepared data. Our machine learning approach, which incorporates sentiment analysis, was designed to analyze Twitter comments written in Modern Standard Arabic or Moroccan Dialectal Arabic. As a final benchmark, our SVM-based model fell just short of the 70% accuracy mark. Although not a game-changing result, this was enough to encourage us to continue developing our model further.
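
A minimal TF-IDF-plus-SVM pipeline of the kind the abstract describes might look like this sketch; the toy comments and labels are invented stand-ins, not the study's corpus.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented stand-in comments (1 = positive, 0 = negative); the real corpus is
# Moroccan Dialectal Arabic / Modern Standard Arabic tweets.
texts = ["produit zwin bzaf", "khdma khayba", "service mzyan", "ma3jbnich"] * 25
labels = [1, 0, 1, 0] * 25

pipe = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # unigram + bigram TF-IDF features
    LinearSVC(),
)
print("5-fold accuracy:", cross_val_score(pipe, texts, labels, cv=5).mean())
```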

Author 1: Mouaad Errami
Author 2: Mohamed Amine Ouassil
Author 3: Rabia Rachidi
Author 4: Bouchaib Cherradi
Author 5: Soufiane Hamida
Author 6: Abdelhadi Raihani

Keywords: Sentiment analysis; Arabic Moroccan dialect; tweets; machine learning

PDF

Paper 48: 1D Convolutional Neural Network for Detecting Heart Diseases using Phonocardiograms

Abstract: According to estimates by the World Health Organization, heart disease is the leading cause of mortality worldwide, so diagnosing heart diseases in their earliest stages is essential. Cardiovascular disease may be diagnosed by detecting irregularities in cardiac signals, one source of which is phonocardiography, and this can be accomplished in a number of ways. Using phonocardiogram (PCG) inputs and deep learning, this study aims to develop a classification system for different types of heart illness. Signal preprocessing began with slicing and normalization of the signal, followed by a wavelet-based transformation using the analytic Morlet mother wavelet. The decomposition results are first visualized as a scalogram and then used as input to a deep CNN. In this investigation, the analyzed PCG signals were separated into categories denoting normal and pathological heart sounds, and the data was split 80% to 20% into training and test sets. The developed model is assessed by its clinical diagnostic performance: sensitivity, specificity, and AUC-ROC value. The results show that the proposed method is superior to other mother wavelets as well as other classifier approaches. Consequently, the result is an electronic stethoscope with a diagnostic accuracy of more than 90% in identifying cardiac problems. Specifically, the proposed deep CNN model has an accuracy of 93.25% in identifying aberrant heart sounds and 93.50% in identifying regular heartbeats. In addition, since an examination can be completed in only 15 seconds, speed is the primary advantage offered by the suggested stethoscope.

Author 1: Meirzhan Baikuvekov
Author 2: Abdimukhan Tolep
Author 3: Daniyar Sultan
Author 4: Dinara Kassymova
Author 5: Leilya Kuntunova
Author 6: Kanat Aidarov

Keywords: Deep learning; CNN; heart disease; phonocardiogram; classification; detection; PCG

PDF

Paper 49: CNN-BiLSTM Hybrid Model for Network Anomaly Detection in Internet of Things

Abstract: Anomaly detection in Internet of Things network traffic is a critical aspect of intrusion and attack detection, in which a deviation from typical behavior signals the existence of malicious or inadvertent attacks, faults, flaws, and other issues. The need to examine a large number of security events to identify anomalous behavior of smart devices adds urgency to the challenge of choosing machine learning and deep learning models for identifying anomalies in network traffic. For the task of binary data classification, a software implementation of an intrusion detection system based on supervised learning algorithms has been completed. The UNSW-NB15 open dataset, which contains 2,540,044 records (vectors of TCP/IP network connection features and their associated class labels), is used to train and test the system. This research compares different machine learning models and proposes a CNN-BiLSTM hybrid model for IoT network intrusion detection. Testing of the built framework yields classification quality metrics and algorithm running times for different ratios of training and test samples.
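
A minimal Keras sketch of a CNN-BiLSTM of the kind named above is shown here; the feature count and layer sizes are assumptions, and real use would feed preprocessed UNSW-NB15 records.

```python
import tensorflow as tf

n_features = 42  # assumed feature count after preprocessing, not the paper's value

# 1D convolution extracts local feature patterns; the bidirectional LSTM then
# models dependencies across the feature sequence before binary classification.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_features, 1)),
    tf.keras.layers.Conv1D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # normal vs. anomalous
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```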

Author 1: Bauyrzhan Omarov
Author 2: Omirlan Auelbekov
Author 3: Azizah Suliman
Author 4: Ainur Zhaxanova

Keywords: IoT; internet of things; network anomalies; network security; anomaly attack; machine learning; supervised learning; UNSW-NB15

PDF

Paper 50: Scouting Firefly Algorithm and its Performance on Global Optimization Problems

Abstract: For effective optimization, metaheuristics should maintain a proper balance between exploration and exploitation. However, the standard firefly algorithm (FA) exhibits some limitations in its exploration process that can lead to premature convergence, affecting its performance and adding uncertainty to the optimization results. To address these constraints, this study introduces an additional novel search mechanism for the standard FA inspired by the behavior of the scout bee in the artificial bee colony (ABC) algorithm, termed the "Scouting FA". Specifically, fireflies stuck in local optima take directed extra random walks to escape toward the region of the optimum solution, thus improving convergence accuracy. Empirical findings on five standard benchmark functions validate the effects of this modification and reveal that Scouting FA is superior to its original version.

Author 1: Jolitte A. Villaruz
Author 2: Bobby D. Gerardo
Author 3: Ariel O. Gamao
Author 4: Ruji P. Medina

Keywords: Metaheuristics; firefly algorithm; modified firefly algorithm; global optimization; scout bee; exploitation and exploration

PDF

Paper 51: A Robust Steganographic Algorithm based on Linear Fractional Transformation and Chaotic Maps

Abstract: The fundamental objectives of a steganographic technique are to achieve both robustness and high-capacity for the hidden information. This paper proposes a steganographic algorithm that satisfies both of these objectives, based on enhanced chaotic maps. The algorithm consists of two phases. In the first phase, a cryptographic substitution box is constructed using a novel fusion technique based on logistic and sine maps. This technique overcomes existing vulnerabilities of chaotic maps, such as frail chaos, finite precision effects, dynamical degradation, and limited control parameters. In the second phase, a frequency-domain-based embedding scheme is used to transform the secret information into ciphertext by employing the substitution boxes. The statistical strength of the algorithm is assessed through several tests, including measures of homogeneity, correlation, mean squared error, information entropy, contrast, peak signal-to-noise ratio, energy, as well as evaluations of the algorithm's performance under JPEG compression and image degradation. The results of these tests demonstrate the algorithm's robustness against various attacks and provide evidence of its high-capacity for securely embedding secret information with good visual quality.

Author 1: Muhammad Ramzan
Author 2: Muhammad Fahad Khan

Keywords: Steganography; information security; chaotic map vulnerabilities; enhanced chaotic maps; S-box Design

PDF

Paper 52: Implementing Bisection Method on Forex Trading Database for Early Diagnosis of Inflection Point

Abstract: Many people traded in the forex market during the COVID-19 pandemic in the hope of earning money, but many suffered losses due to the lack of information- and technology-based tools for analyzing existing daily data. Traders often rely only on moving averages of trading data, even though this information needs further processing to obtain the right inflection point. The objective of this research is to find inflection points in a forex trading database. The bisection method is used to determine the inflection point between two points on a moving average, because it guarantees convergence. The results show that the points produced by the bisection calculation on the moving average provide fairly accurate decision support for locating the inflection point. Across 10,000 data points, the standard deviation is 0.71 points, which is very small compared with the average of 20 pips (the points used as the unit of price difference in forex). The bisection method identifies the inflection point with an accuracy of 87%.
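
A standard bisection routine, applied here to the (interpolated) second difference of a smoothed series, shows how a sign change pins down an inflection point; the sine stand-in curve and search interval are illustrative assumptions, not the paper's data.

```python
import numpy as np

def bisection(f, lo, hi, tol=1e-6, max_iter=100):
    """Classic bisection: requires f(lo) and f(hi) to have opposite signs."""
    assert f(lo) * f(hi) < 0, "no sign change on [lo, hi]"
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        if abs(f(mid)) < tol or (hi - lo) / 2.0 < tol:
            return mid
        if f(lo) * f(mid) < 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

# Stand-in for a smoothed price series (moving average); its inflection sits
# where the discrete second difference changes sign.
curve = np.sin(np.linspace(0, 2 * np.pi, 500))
idx = np.arange(500)
f = lambda t: (np.interp(t + 1, idx, curve)
               - 2 * np.interp(t, idx, curve)
               + np.interp(t - 1, idx, curve))
print("inflection near index:", bisection(f, 10, 490))  # ~249.5, the sine midpoint
```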

Author 1: Agustinus Noertjahyana
Author 2: Zuraida Abal Abas
Author 3: Zeratul Izzah Mohd. Yusoh
Author 4: M. Zainal Arifin

Keywords: Bisection method; moving average; inflection point; forex trading; decision support

PDF

Paper 53: A Novel Approach to Network Forensic Analysis: Combining Packet Capture Data and Social Network Analysis

Abstract: Log data from computers used for network forensic analysis is often ineffective at identifying specific security threats. Its limitations include the difficulty of reconstructing communication patterns between nodes and the inability to identify more advanced security threats. By combining traditional log data analysis with a more effective combination of approaches, a more comprehensive view of communication patterns can be achieved, which in turn helps identify potential security threats more effectively. It is difficult to determine the specific benefits of combining Packet Capture (PCAP) data and Social Network Analysis (SNA) when performing forensics. This article therefore proposes a new approach to forensic analysis that combines PCAP and social network analysis to overcome some of the limitations of traditional methods, with the aim of improving the accuracy of network forensic analysis by providing a more comprehensive view of network communication patterns. PCAP analysis is used to examine network traffic, conversation statistics, protocol distribution, packet content, and round-trip times; social network analysis maps communication patterns between nodes and identifies the most influential key players within the network. Together, PCAP analysis efficiently captures and analyzes network packets, while SNA provides insight into the relationships and communication patterns between devices on the network.
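
The PCAP-to-graph step might be sketched as follows with scapy and networkx; the capture filename is hypothetical, and betweenness centrality stands in for whichever key-player measure the analysis uses.

```python
import networkx as nx              # social network analysis on hosts
from scapy.all import IP, rdpcap   # packet parsing

packets = rdpcap("capture.pcap")   # hypothetical capture file

g = nx.DiGraph()
for pkt in packets:
    if IP in pkt:
        src, dst = pkt[IP].src, pkt[IP].dst
        weight = g.get_edge_data(src, dst, default={"weight": 0})["weight"]
        g.add_edge(src, dst, weight=weight + 1)  # packet count as edge weight

# Key players: hosts ranked by betweenness centrality.
central = nx.betweenness_centrality(g, weight="weight")
for host, score in sorted(central.items(), key=lambda kv: -kv[1])[:5]:
    print(host, round(score, 3))
```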

Author 1: Irwan Sembiring
Author 2: Suharyadi
Author 3: Ade Iriani
Author 4: Jenni Veronika Br Ginting
Author 5: Jusia Amanda Ginting

Keywords: PCAP analysis; social network analysis; network forensic; network communication pattern

PDF

Paper 54: Validating the Usability Evaluation Model for Hearing Impaired Mobile Application

Abstract: Usability is an important element for identifying the efficiency of an application or product. However, many applications are developed for general users’ needs and fail to provide adequate usability for disabled people. This study focuses on the development of a usability evaluation model and the validation of the proposed model through experts. The developed model was evaluated by a group of experts using the focus group method, which made it possible to verify that the 13 variables derived to develop the model are appropriately placed and useful in the evaluation process. The results show that the selected variables are appropriate for identifying the usability of mobile applications for the hearing impaired, based on three tested variables: gain satisfaction with the model, satisfaction with the model presentation, and support for tasks. In conclusion, the developed model can identify the usability of mobile applications for the hearing impaired and identifies useful criteria to include during real-world application development. As future work, the model can be tested among hearing-impaired people and practitioners to confirm the results obtained, which would benefit usability practitioners and application developers for the disabled.

Author 1: Shelena Soosay Nathan
Author 2: Nor Laily Hashim
Author 3: Azham Hussain
Author 4: Ashok Sivaji
Author 5: Mohd Affendi Ahmad Pozin

Keywords: Usability; hearing impaired; validation; evaluation; MAEHI

PDF

Paper 55: Online Signature Verification for Forgery Detection

Abstract: The increasing trend of using electronic versions of documents for transmission and storage requires electronic verification of the sender/author. This research presents an efficient and robust online handwritten signature verification system targeting verification rates better than available state-of-the-art systems in the presence of skilled forgeries. Fourier analysis is employed on the signatures to represent feature vectors in a higher-dimensional space, followed by Local Fisher Discriminant Analysis to obtain a compressed representation while enhancing the inter-class scatter between signature patterns. Signature modeling is performed using an m-mediods-based approach, where m mediods are selected to represent the data distribution in each class. Connected component labeling is applied to binarized images of Urdu text to extract ligatures, which are separated into primary ligatures and diacritics. Fast Euclidean Distance is used as the (dis)similarity measure. A total of 2414 signature samples, including skilled forgeries, are considered in this study. Evaluation of the proposed system on the Japanese signature dataset provided by SigWiComp2013 yielded more promising results than competing systems.

Author 1: Muhammad Rizwan
Author 2: Farhan Aadil
Author 3: Mehr Yahya Durrani
Author 4: Rajermani Thinakaran

Keywords: Fast Euclidean distance; m-mediod; local fisher discriminant analysis

PDF

Paper 56: Image Denoising using Wavelet Cycle Spinning and Non-local Means Filter

Abstract: Removing as much noise as possible from an image while preserving its fine details is a complex and challenging task. We propose a wavelet-based and non-local means (NLM) denoising method to overcome this problem. Two well-known wavelet transforms, the dual-tree complex wavelet transform (DT-CWT) and the discrete wavelet transform (DWT), are used sequentially to decompose the noisy image into wavelet coefficients. NLM filtering is applied to the approximation coefficients, and universal hard thresholding with cycle spinning is applied to the detail coefficients. The inverse two-dimensional DWT is then applied to the modified wavelet coefficients to obtain the denoised image. We conducted experiments on the twelve test images of the Set12 dataset, adding additive white Gaussian noise with variances of 10 to 90 in increments of 10. Three evaluation metrics, the peak signal-to-noise ratio (PSNR), structural similarity index metric (SSIM), and mean square error (MSE), are used to evaluate the effectiveness of the proposed denoising method. These measurements show that the proposed method outperforms DT-CWT, DWT, and NLM at almost all noise levels; the exception is the noise level of 10, where the proposed method is lower than NLM but still better than DT-CWT and DWT.
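
A hedged sketch of the hybrid idea, NLM on the approximation band and universal hard thresholding on the detail bands, using a single DWT level (the paper's DT-CWT stage and cycle spinning are omitted, and the wavelet and parameters here are assumptions):

```python
import numpy as np
import pywt
from skimage import data, img_as_float
from skimage.restoration import denoise_nl_means, estimate_sigma

img = img_as_float(data.camera())                        # stand-in test image
noisy = img + 0.1 * np.random.standard_normal(img.shape)

cA, (cH, cV, cD) = pywt.dwt2(noisy, "db8")               # one decomposition level
sigma = estimate_sigma(noisy)
cA = denoise_nl_means(cA, h=0.8 * sigma)                 # NLM on approximation band
thr = sigma * np.sqrt(2 * np.log(noisy.size))            # universal threshold
cH, cV, cD = (pywt.threshold(c, thr, mode="hard") for c in (cH, cV, cD))
denoised = pywt.idwt2((cA, (cH, cV, cD)), "db8")         # inverse 2-D DWT
```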

Author 1: Giat Karyono
Author 2: Asmala Ahmad
Author 3: Siti Azirah Asmai

Keywords: Image denoising; discrete wavelet transform (DWT); dual-tree complex wavelet transform (DT-CWT); non-local means (NLM); cycle spinning

PDF

Paper 57: Advanced Detections of Norway Lobster (Nephrops Norvegicus) Burrows using Deep Learning Techniques

Abstract: Marine experts face many challenges in habitat monitoring of marine species. One of the biggest is the underwater environment and species movement; another is data collection. In the past, camera sensors and satellite data were used for data collection, but scientists now use Autonomous Underwater Vehicles (AUVs), Remotely Operated Vehicles (ROVs), and sledges fitted with high-definition still and video cameras to record underwater footage. The ocean hosts thousands of species, which makes monitoring any specific species more challenging. This work focuses on the Norway lobster (Nephrops norvegicus), a commercially important species in Europe that generates millions of dollars yearly. This species lives under the seabed, spends most of its time there, and leaves behind burrow structures on the sea floor. Scientists currently monitor the habitat of Nephrops norvegicus through underwater television (UWTV) surveys conducted yearly on many European grounds, and the collected data is reviewed manually by experts who count the burrows on a sheet. This work focuses on the automatic detection of Nephrops burrows in underwater videos using deep learning techniques. We trained Faster R-CNN models with Inception-v2, MobileNet-v2, ResNet50, and ResNet101 backbones; instead of training the models from scratch, we used transfer learning to fine-tune these networks. The data was obtained from the Gulf of Cadiz (FU30) station. Twenty-eight different sets of experiments were performed. The models are evaluated quantitatively using mean Average Precision (mAP) and precision and recall curves, and qualitatively by visually inspecting the output. The results show that deep learning techniques are very helpful for marine scientists assessing Nephrops norvegicus abundance.
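
Fine-tuning a detector by swapping its classification head, as in the transfer-learning step described, is commonly done in torchvision like this sketch (two classes assumed: background plus burrow; the other backbones named above would be configured analogously):

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# COCO-pretrained Faster R-CNN with a ResNet-50 FPN backbone.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the head for 2 classes (background + burrow), then fine-tune on
# annotated UWTV frames (training loop omitted).
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)
```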

Author 1: Atif Naseer

Keywords: Nephrops norvegicus; deep learning; stock assessment; faster RCNN

PDF

Paper 58: Implementation of a Smarter Herbal Medication Delivery System Employing an AI-Powered Chatbot

Abstract: Medicinal plants are a practical and cost-effective alternative for treating common ailments, especially in areas with limited access to public healthcare systems. This paper introduces a prototype of an intelligent interactive system that merges chatbot technology with artificial intelligence (AI) to address inquiries about treatment alternatives and the application of different medicinal plants for prevalent health conditions, promoting and advancing alternative healing practices in the locality. The platform is a hybrid online chat service that prioritizes consumer health and encourages the responsible use of medicinal plants. This study used a survey questionnaire to gather information from traditional healers, users, and concerned government agencies about how well the system prototype performed. The system's performance was assessed in terms of effectiveness, efficiency, and customer satisfaction, with respondents providing an aggregate rating of "Strongly Agree". Significantly, this study lays the groundwork for education on the use of local medicinal plants to cure illnesses and highlights the importance of providing users with accurate and reliable information on the safe use of medicinal plants. This approach empowers users to make informed decisions about the plants they use, reducing the likelihood of harmful effects and optimizing the potential benefits of medicinal plants. By supporting this effort, this study contributes to the third Sustainable Development Goal of the UN, which aims to promote health and well-being, by offering the local populace a low-cost first line of defense for improving their health and wellness.

Author 1: Maria Concepcion S. Vera
Author 2: Thelma D. Palaoag

Keywords: Chatbot; artificial intelligence; intelligent interactive system; applied computing; consumer health; medicinal plants; traditional medicine

PDF

Paper 59: Suppressing Chest Radiograph Ribs for Improving Lung Nodule Visibility by using Circular Window Adaptive Median Outlier (CWAMO)

Abstract: Ribs in chest radiographs obstruct lung nodules; to see a nodule lying under the ribs, the ribs must be removed or suppressed. This paper describes a circular median filter approach for finding outliers in chest radiographs. The method uses 147 x-ray images from the Japanese Society of Radiological Technology (JSRT) dataset. Pixels with intensities two standard deviations above the median are treated as median outliers. Contrast-Limited Adaptive Histogram Equalization (CLAHE) enhances nodule visibility. The method is tested on chest radiographs of modest subtlety and compared with the methodology of the Budapest University Bone Shadow Eliminated X-Ray Dataset. The initial test uses 50 chest radiographs (Test 1): after active shape model (ASM) lung segmentation, the proposed approach is applied, and true positive nodules are seen on 89% of chest radiographs of various subtleties. Test 2 and Test 3 used 20 images per subtlety level. In Test 2, the peak signal-to-noise ratio (PSNR), mean-to-standard-deviation ratio (MSR), and universal image quality index (IQI) are evaluated for the full image and compared with the existing algorithm; the proposed technique outperforms it on all three parameters. Test 3 computes the nodule MSR and compares it with Budapest University's Bone Shadow Eliminated Dataset and the original chest radiographs, showing nodule-area contrast improvements of 3.83% and 23.94%, respectively. This approach improves nodule visualization in chest radiographs.
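
The CLAHE stage is standard OpenCV; this sketch shows only that step on a stand-in image (the circular-window median-outlier rib suppression itself is the paper's contribution and is not reproduced, and the CLAHE parameters are assumptions):

```python
import cv2
import numpy as np

xray = np.random.randint(0, 256, (512, 512), dtype=np.uint8)  # stand-in radiograph
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))   # assumed parameters
enhanced = clahe.apply(xray)  # locally equalized, contrast-limited output
```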

Author 1: Dnyaneshwar Kanade
Author 2: Jagdish Helonde

Keywords: Lung cancer; chest radiograph; contrast limited adaptive histogram equalization; median outlier

PDF

Paper 60: Disease Identification in Crop Plants based on Convolutional Neural Networks

Abstract: The identification, classification, and treatment of crop plant diseases are essential for agricultural production. Some of the most common diseases include root rot, powdery mildew, mosaic, leaf spot, and fruit rot. Machine learning (ML) technology and convolutional neural networks (CNNs) have proven very useful in this field. This work aims to identify and classify diseases in crop plants using CNNs with transfer learning, based on the PlantVillage dataset of images of diseased plant leaves and their corresponding labels. The dataset comprises more than 87,000 images, divided into 38 classes covering 26 disease types. Three CNN models (DenseNet-201, ResNet-50, and Inception-v3) were used to identify and classify the images. The results showed that the DenseNet-201 and Inception-v3 models achieved an accuracy of 98% in plant disease identification and classification, slightly higher than the ResNet-50 model at 97%, demonstrating an effective and promising approach able to learn relevant features from the images and classify them accurately. Overall, ML in conjunction with CNNs proved to be an effective tool for identifying and classifying diseases in crop plants: all three models are highly accurate in image classification, exceeding 96% accuracy on large datasets.
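
Transfer learning with one of the named backbones can be sketched in Keras as below; the input size, frozen backbone, and single dense head are assumptions, not the authors' exact configuration.

```python
import tensorflow as tf

# Frozen ImageNet DenseNet-201 backbone plus a new softmax head for the
# 38 PlantVillage classes (input size assumed to be 224x224 RGB).
base = tf.keras.applications.DenseNet201(
    include_top=False, weights="imagenet",
    input_shape=(224, 224, 3), pooling="avg",
)
base.trainable = False  # train only the new classification head

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(38, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```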

Author 1: Orlando Iparraguirre-Villanueva
Author 2: Victor Guevara-Ponce
Author 3: Carmen Torres-Ceclén
Author 4: John Ruiz-Alvarado
Author 5: Gloria Castro-Leon
Author 6: Ofelia Roque-Paredes
Author 7: Joselyn Zapata-Paulini
Author 8: Michael Cabanillas-Carbonell

Keywords: CNN; identification; models; pathogen; plant; classification; machine learning

PDF

Paper 61: Automatic Detection of Oil Palm Growth Rate Status with YOLOv5

Abstract: Oil palm plantations are essential for Indonesia as a source of foreign exchange and a provider of employment opportunities. However, large-scale land clearing is considered a cause of deforestation, which harms the environment and society, so plantations must be managed sustainably while preserving forests and biodiversity. One solution is to apply remote sensing technology. This research develops a multi-class detection method for the growth rate of oil palm trees, with five categories: healthy palm, dead palm, yellowish palm, mismanaged palm, and smallish palm. The deep learning-based object detection method YOLO Version 5 (YOLOv5) is used, and this study compares the YOLOv5 network models YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x. Parameters of the BCE (Binary Cross-Entropy) with Logits loss function are also tuned to handle the unbalanced data distribution across classes. The models with the highest mAP values are YOLOv5l and YOLOv5x, though YOLOv5x requires longer training time. Hyperparameter optimization was also carried out using hyperparameter evolution techniques; however, it has not yet improved results, as the experiments conducted in this study are still limited.

Author 1: Desta Sandya Prasvita
Author 2: Dina Chahyati
Author 3: Aniati Murni Arymurthy

Keywords: Automatic detection; deep learning; oil palm; YOLOv5

PDF

Paper 62: Hybrid Approach Used to Analyze the Sentiments of Romanized Text (Sindhi)

Abstract: Sentiment analysis is an important part of natural language processing (NLP). This study evaluated the sentiment of Romanized Sindhi Text (RST) using a hybrid approach and ground truth values. The sentiment analysis methodology involves several major steps: data input, tool-based processing, data analysis, and evaluation of results. One hundred RST sentences, each of which can be positive, neutral, or negative, were used in this study. The statements in the corpus are simple to understand and are used in everyday life. An online Python tool was used to process the text and obtain the results. The results showed that 86% of the sentences have neutral sentiment, 9% have negative sentiment, and only 5% of the Romanized Sindhi Text sentences have positive sentiment. The accuracy of the RST analysis, measured with an online calculator against the ground truth values, was 87.02%, giving an error rate of 12.98% based on the confusion matrix.

Author 1: Irum Naz Sodhar
Author 2: Suriani Sulaiman
Author 3: Abdul Hafeez Buller
Author 4: Anam Naz Sodhar

Keywords: Sentiment analysis; natural language processing; hybrid approach; python tool; Romanized Sindhi

PDF

Paper 63: Consolidated Definition of Digital Transformation by using Text Mining

Abstract: Digital transformation has become essential for the majority of organizations in both the public and private sectors. The term "digital transformation" (DT) has been used (and misused) so frequently that it is now somewhat ambiguous, and it has become imperative to give it some conceptual rigor. The objective of this study is to identify the major elements of digital transformation and to develop a proper definition of DT for the public and private sectors. For this purpose, 56 different definitions of DT collected from the available literature were analyzed; since prior work extracted elements from DT definitions manually, text mining techniques (TF-IDF and FP-tree) are used here to identify the major constituents and consolidate them into generic DT definitions. The approach consists of five phases: 1) collecting and classifying DT definitions; 2) detecting synonyms; 3) extracting major elements (terms); 4) discussing and comparing DT elements; 5) formulating DT definitions for different business categories. An evaluation tool was also developed to assess how well various definitions in the literature cover the DT elements and, as a validation, it was applied to the formulated definitions.
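
The TF-IDF side of the element-extraction phase might look like this sketch; the two sample definitions are invented stand-ins for the 56 collected ones, and the FP-tree mining step is omitted.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented stand-ins for the collected DT definitions.
definitions = [
    "digital transformation is the adoption of digital technology by an organization",
    "a process that aims to improve an entity by triggering significant changes "
    "through combinations of information, computing, and connectivity technologies",
]

vec = TfidfVectorizer(stop_words="english")
tfidf = vec.fit_transform(definitions)
scores = tfidf.sum(axis=0).A1                  # aggregate TF-IDF weight per term
terms = vec.get_feature_names_out()
for term, score in sorted(zip(terms, scores), key=lambda p: -p[1])[:8]:
    print(f"{term}: {score:.2f}")              # candidate DT elements
```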

Author 1: Mohammed Hitham M. H
Author 2: Hatem Elkadi
Author 3: Neamat El Tazi

Keywords: Digital transformation; text mining; association rules; FP tree

PDF

Paper 64: Fast Hybrid Deep Neural Network for Diagnosis of COVID-19 using Chest X-Ray Images

Abstract: In the last three years, the coronavirus (COVID-19) pandemic put healthcare systems worldwide under tremendous pressure. Imaging techniques, such as Chest X-Ray (CXR) images, play an essential role in diagnosing many diseases, including COVID-19. Recently, intelligent systems based on Machine Learning (ML) and Deep Learning (DL) have been widely utilized to distinguish COVID-19 from other upper respiratory diseases, such as viral pneumonia and lung opacity. Nevertheless, identifying COVID-19 from CXR images is challenging due to similar symptoms. To improve COVID-19 diagnosis from CXR images, this article proposes a new deep neural network model called the Fast Hybrid Deep Neural Network (FHDNN), consisting of various convolutional and dense layers. First, we preprocessed the dataset, extracted the best features, and expanded it; then we converted the data from two dimensions to one dimension to reduce training time and hardware requirements. The experimental results demonstrate that preprocessing and feature expansion before applying FHDNN lead to better detection accuracy and faster execution. Furthermore, FHDNN outperformed its counterparts, achieving an accuracy of 99.9%, recall of 99.9%, an F1-score of 99.9%, and precision of 99.9% in the detection and classification of COVID-19. Accordingly, FHDNN is more reliable and can be considered a robust and faster model for COVID-19 detection.

Author 1: Hussein Ahmed Ali
Author 2: Nadia Smaoui Zghal
Author 3: Walid Hariri
Author 4: Dalenda Ben Aissa

Keywords: COVID-19; Chest X-ray (CXR); Deep Learning (DL); Convolutional Neural Network (CNN)

PDF

Paper 65: Digital Image Encryption using Composition of RaMSH-1 Map Transposition and Logistic Map Keystream Substitution

Abstract: Digital communication of multimedia data (text, signal/audio, image, and video) over the internet plays an important role in the era of Industrial Revolution 4.0 and Society 5.0. However, the ease of exchanging personal and confidential digital information carries a high risk of interception by irresponsible parties. The development of reliable and robust data encryption methods is a solution to this risk. This paper proposes a novel data encryption method (called RaMSH-1) that combines modified Henon map transposition encryption with Logistic map keystream substitution encryption. The proposed algorithm simultaneously transposes data positions and substitutes data values randomly, and has an encryption key combination, or key space, of 1.05 × 10^670. Several images of various sizes with differing color features, object shapes, and textures were tested. Based on analyses of randomness, key sensitivity, and visual quality, the proposed encryption algorithm is shown to be resistant to differential attacks, entropy attacks, and brute force attacks.
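
The Logistic-map substitution stage can be illustrated with this sketch (the RaMSH-1 transposition stage and the paper's exact quantization are not reproduced; the map parameters and byte mapping here are arbitrary assumptions):

```python
import numpy as np

def logistic_keystream(x0, r, n):
    """Byte keystream from the logistic map x_{k+1} = r * x_k * (1 - x_k)."""
    x, out = x0, np.empty(n, dtype=np.uint8)
    for i in range(n):
        x = r * x * (1 - x)
        out[i] = int(x * 256) % 256
    return out

pixels = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)  # stand-in image
ks = logistic_keystream(x0=0.54321, r=3.99, n=pixels.size)

cipher = (pixels.ravel() ^ ks).reshape(pixels.shape)  # substitution by XOR
plain = (cipher.ravel() ^ ks).reshape(pixels.shape)   # XOR is its own inverse
assert np.array_equal(plain, pixels)
```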

Author 1: Rama Dian Syah
Author 2: Sarifuddin Madenda
Author 3: Ruddy J. Suhatril
Author 4: Suryadi Harmanto

Keywords: Encryption; decryption; digital image; RaMSH-1 Map transposition; Logistic Map substitution

PDF

Paper 66: State-of-the-Art of the Swarm Ship Technology for Alga Bloom Rapid Monitoring

Abstract: Swarm intelligence has become an interesting approach for employing multi-agent robotics for specific purposes. Its capabilities of multi-robot coordination, scalability, and goal-oriented control in spatial and temporal environments have already been demonstrated in several applications, such as military patrol and leader-follower drone coordination. In marine environments, swarm intelligence adopted by ASVs or ROVs has been used for water quality and environmental monitoring, with sufficiently optimized results to make it convenient for rapid assessments. In this paper, the arrangement for building a trusted cyber-physical system for algal bloom rapid assessment using swarm ship technology is explained from a state-of-the-art perspective. The minimum requirements for sensing, vehicle control, and communication are explored, as well as the algorithms best suited to monitoring algal bloom events before they spread rapidly to larger areas. Several models are described to show the robustness of autonomous unmanned ship control. From this point of view, we conclude that swarm ship technology has significant potential for near-real-time in situ monitoring compared with other decision-making methods such as laboratory examination or remote sensing. The results of this review open the way to realizing swarm ship technology in a cyber-physical system for efficiently monitoring algal blooms in specific areas in near real time.

Author 1: Denny Darlis
Author 2: Indra Jaya
Author 3: Karlisa Priandana
Author 4: Yopi Novita
Author 5: Ayi Rachmat

Keywords: Swarm ship; algal bloom; cyber physical system; rapid monitoring

PDF

Paper 67: Modeling of Organic Waste Classification as Raw Materials for Briquettes using Machine Learning Approach

Abstract: Organic waste should be utilized by the community so that it does not simply end up in landfills but is processed into something constructive, useful, and of high economic value. Organic waste can be converted into raw materials for the manufacture of biomass briquettes. Machine learning techniques have been developed for technological applications, object detection, and categorization; methods based on artificial reasoning networks and algorithms such as the Naive Bayes Classifier work together to determine and identify particular characteristics in a digital dataset. The manufacturing method goes through several processes, with a waste classification model as the source of learning data. The image data covers five types: coconut shells, sawdust, corn cobs, rice husks, and plant leaves. The research aims to identify and classify organic and non-organic waste types, making waste sorting easier. Testing the organic waste application on digital images yielded an accuracy rate of 97%. The model design applied to the training data is useful for producing a reusable data model.

Author 1: Norbertus Tri Suswanto Saptadi
Author 2: Ansar Suyuti
Author 3: Amil Ahmad Ilham
Author 4: Ingrid Nurtanio

Keywords: Classification; organic waste; raw material; machine learning

PDF

Paper 68: Predicting Hypertension using Machine Learning: A Case Study at Petra University

Abstract: Hypertension is a key risk factor for cardiovascular disease (CVD). Identifying high-risk individuals is crucial, since it saves time and money before any sophisticated, invasive, or costly diagnostic technique is used. This endeavour may be accomplished in part with modern machine learning techniques; specifically, a prediction model may be created from several easily obtained, non-invasive, and inexpensive indicator characteristics of high-risk individuals. This case study, conducted at Petra University between 2019 and 2020, is an effort to forecast hypertension risk in the university's population. A model was developed from the medical records of patients who visited the hospital. The research comprised a comprehensive dataset of 31,500 patients, comprising 12,658 hypertension cases and 18,842 non-hypertensive cases, and SMOTE was used to balance the dataset for hypertension classification. The SMOTE k-nearest neighbour prediction model performs exceptionally well (83.9% classification accuracy, 85.1% specificity, 83.3% sensitivity, and 89.6% AUC) compared with other classifiers evaluated using 10-fold cross-validation with full features and no oversampling on the hypertension dataset. The data extracted from the Petra University Health Center is considered very helpful for ML and was also used to produce a decision tree for identifying hypertension-related data.
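
An SMOTE-then-kNN pipeline under 10-fold cross-validation, as named above, might be sketched like this; the synthetic data stands in for the (non-public) patient records, and k = 5 is an assumed setting.

```python
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the patient dataset (roughly 40% positive class).
X, y = make_classification(n_samples=2000, weights=[0.6], random_state=1)

pipe = Pipeline([
    ("smote", SMOTE(random_state=1)),              # oversample inside each fold
    ("knn", KNeighborsClassifier(n_neighbors=5)),
])
cv = cross_validate(pipe, X, y, cv=10, scoring=["accuracy", "recall", "roc_auc"])
print({k: v.mean().round(3) for k, v in cv.items() if k.startswith("test_")})
```

Placing SMOTE inside the pipeline ensures oversampling happens only on each fold's training split, so synthetic minority samples never leak into the validation folds.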

Author 1: Yasmin Sakka
Author 2: Dina Qarashai
Author 3: Ahmad Altarawneh

Keywords: Hypertension; machine learning; medical records; sensitivity; specificity

PDF

Paper 69: Medical Name Entity Recognition Based on Lexical Enhancement and Global Pointer

Abstract: Named entity recognition (NER) in biomedical sources, also called medical named entity recognition (MNER), attempts to identify and categorize medical terminology in electronic records. Deep neural networks have recently demonstrated substantial effectiveness in MNER. However, Chinese MNER faces two issues: models often cannot use lexical information, and entities may be nested. To address these problems, we propose a model that can handle both nested and non-nested entities. The model uses a simple lexical enhancement method to merge lexical information into each character's vector representation, and then uses the Global Pointer approach for entity recognition. Furthermore, we retrain a pre-trained model on a Chinese medical corpus to incorporate medical knowledge, resulting in F1 scores of 68.13% on the nested dataset CMeEE, 95.56% on the non-nested dataset CCKS2017, 85.89% on CCKS2019, and 92.08% on CCKS2020. These results demonstrate the efficacy of our proposed model.

Author 1: Pu Zhang
Author 2: Wentao Liang

Keywords: MNER; nested NER; Global Pointer; lexical enhancement

PDF

Paper 70: Distributed Focused Web Crawling for Context Aware Recommender System using Machine Learning and Text Mining Algorithms

Abstract: In today’s world, the Recommender System (RS) is the most effective means of managing the huge amount of multimedia content available on the internet. An RS learns user preferences and the relationships among users and items, helping users discover new interesting items across different media types such as text, audio, video, and images. An RS can act as an information filtering model that overcomes issues related to over-fitting and information overload. In this work, a new distributed framework named DAE-SR (Deep AutoEncoder based Softmax Regression) is introduced for context-aware recommender systems; it focuses on user-item interactions and offers personalized recommendations. The proposed model is implemented in Python, and the Foursquare dataset is used for experimentation. The proposed context-aware RS benefits both users and service providers: it aids the decision-making process and offers relevant recommendations to users. Performance is evaluated in terms of various metrics such as accuracy, recall, and precision. From the implementation outcomes, the proposed strategy achieved good accuracy (98.33%), precision (98%), run time (1.43 ms), and recall (98.1%). Thus, the proposed DAE-SR classifier performs better than other models and offers dependable and relevant recommendations to users.

Author 1: Venugopal Boppana
Author 2: P. Sandhya

Keywords: Recommender Systems (RSs); context-aware; softmax regression; deep autoencoder; multimedia information

PDF

Paper 71: Convolutional Neural Network Model based Students’ Engagement Detection in Imbalanced DAiSEE Dataset

Abstract: The COVID-19 pandemic has significantly changed learning processes. Learning, which had generally been carried out face-to-face, has now turned online. This learning strategy has both advantages and challenges. On the bright side, online learning is unbound by space and time, allowing it to take place anywhere and anytime. On the other side, it faces a common challenge in the lack of direct interaction between educators and students, making it difficult to assess students’ engagement during an online learning process. Therefore, it is necessary to conduct research with the aim of automatically detecting students’ engagement during online learning. The data used in this research were derived from the DAiSEE dataset (Dataset for Affective States in E-Environments), which comprises ten-second video recordings of students. This dataset classifies engagement levels into four categories: low, very low, high, and very high. However, the issue of imbalanced data found in the DAiSEE dataset has yet to be addressed in previous research. This data imbalance can cause errors in the classification model, resulting in overfitting and underfitting of the model. In this study, Convolutional Neural Network, a deep learning model, was utilized for feature extraction on the DAiSEE dataset. The OpenFace library was used to perform facial landmark detection, head pose estimation, facial expression unit recognition, and eye gaze estimation. The pre-processing stages included data selection, dimensional reduction, and normalization. The PCA and SVD techniques were used for dimensional reduction. The data were later oversampled using the SMOTE algorithm. The training and testing data were distributed at an 80:20 ratio. The results obtained from this experiment exceeded the benchmark evaluation values on the DAiSEE dataset, achieving the best accuracy of 77.97% using the SVD dimensional reduction technique.

Author 1: Mayanda Mega Santoni
Author 2: T. Basaruddin
Author 3: Kasiyah Junus

Keywords: Convolutional neural networks; imbalanced data; deep learning; PCA; COVID-19; online learning; students’ engagement; SVD; SMOTE

PDF

Paper 72: A Hybrid Model for Covid-19 Detection using CT-Scans

Abstract: Although some believe it has been wiped out, the coronavirus is striking again. Controlling this epidemic necessitates early detection of coronavirus disease. Computed tomography (CT) scan images allow fast and accurate screening for COVID-19. This study seeks to develop a precise model for identifying and classifying COVID-19 through an automated approach that uses transfer-learning CNN models as a base. Transfer learning models such as VGG16, ResNet50, and Xception are employed in this study. VGG16 achieves 98.39% accuracy, ResNet50 97.27%, and Xception 96.6%; a hybrid model built from them using the stacking ensemble method reaches an accuracy of 98.71%. According to the findings, the hybrid architecture offers greater accuracy than any single architecture.
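
Stacking, the ensemble method named above, trains a meta-learner on the base models' predictions. The sketch below shows the mechanism with lightweight scikit-learn estimators standing in for the paper's VGG16/ResNet50/Xception base models; it illustrates the technique, not the authors' pipeline.

```python
# Stacking-ensemble principle in miniature: base learners' out-of-fold predictions
# feed a meta-learner. The stand-in base estimators only illustrate the mechanism.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_features=30, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("svm", SVC(probability=True, random_state=0)),
                ("knn", KNeighborsClassifier())],
    final_estimator=LogisticRegression(),   # meta-learner combining base outputs
    cv=5)                                    # out-of-fold predictions avoid leakage
stack.fit(X_tr, y_tr)
print("stacked accuracy:", stack.score(X_te, y_te))
```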

Author 1: Nagwa G. Ali
Author 2: Fahad K. El Sheref
Author 3: Mahmoud M. El khouly

Keywords: Covid-19; coronavirus; transfer-learning; CT-scan; ensemble method

PDF

Paper 73: A New Big Data Architecture for Analysis: The Challenges on Social Media

Abstract: Streams of social media big data are becoming an important issue, yet existing analytics methods and tools may not be able to extract the useful information from this massive amount of data. Two questions then arise: how do we create a high-performance platform and a method to efficiently analyse social networks’ big data, and how do we develop a suitable mining algorithm for finding useful information in social media big data? In this work, we propose a new hierarchical big data analysis for understanding human interaction, and we present a new method to measure the usefulness of tweets based on three factors of the tweet text. Finally, we use the resulting score to detect useful tweets and to classify tweets by degree of interest.
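
The abstract does not name its three text factors, so the sketch below assumes three plausible ones (length, hashtag count, URL presence) purely to illustrate a weighted usefulness score with a classification threshold; the factor choice and weights are hypothetical.

```python
# Illustrative weighted three-factor usefulness score for tweets (factors assumed).
def usefulness(tweet: str,
               w_len: float = 0.4, w_tags: float = 0.4, w_url: float = 0.2) -> float:
    length_factor = min(len(tweet) / 280, 1.0)    # fuller tweets score higher
    tag_factor = min(tweet.count("#") / 3, 1.0)   # a few hashtags add context
    url_factor = 1.0 if "http" in tweet else 0.0  # links often carry substance
    return w_len * length_factor + w_tags * tag_factor + w_url * url_factor

tweets = ["ok", "Flood warnings for the coast tonight #weather #alert http://ex.org"]
for t in tweets:
    score = usefulness(t)
    print(f"{score:.2f}", "useful" if score >= 0.5 else "not useful", "-", t[:40])
```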

Author 1: Abdessamad Essaidi
Author 2: Mostafa Bellafkih

Keywords: Social media; useful information; big data analysis; stream processing; classification tweets

PDF

Paper 74: Drug Repositioning for Coronavirus (COVID-19) using Different Deep Learning Architectures

Abstract: In December 2019, the COVID-19 epidemic emerged in Wuhan, China, and soon hundreds of millions were infected. Therefore, several efforts were made to identify commercially available drugs that could be repurposed against COVID-19. Inferring potential drug indications through computational drug repositioning is an efficient method. The drug repositioning problem is a top-K recommendation function that presents the most likely drugs for specific diseases based on drug- and disease-related data. Accurate prediction of drug-target interactions (DTI) is very important for drug repositioning, and deep learning (DL) models have recently been exploited for promising DTI prediction performance. To build deep learning models for DTI prediction, encoder-decoder architectures can be utilized. In this paper, a deep learning-based drug repositioning approach is proposed, composed of two experimental phases. In the first phase, different deep learning encoder-decoder architecture models are trained and evaluated on the benchmark DAVIS dataset using two evaluation metrics: mean square error and the concordance index. In the second phase, the trained models are used to predict antiviral drugs for COVID-19: each model predicts an antiviral drug list, which is then compared with a recently published antiviral drug list for COVID-19 using the concordance index metric. The overall experimental results of both phases showed that the three most accurate deep learning compound-encoder/protein-encoder architectures are Morgan/AAC, CNN/AAC, and CNN/CNN, with the best values for the mean square error, the first-phase concordance index, and the second-phase concordance index.
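
Of the two metrics named above, the concordance index (CI) is the less standard one: it is the fraction of comparable pairs of true affinities whose predicted ordering matches the true ordering. A minimal reference implementation (quadratic in the number of samples, fine for small evaluations) might look like this:

```python
# Minimal concordance index (CI): fraction of comparable pairs whose predicted
# ordering agrees with the true ordering; tied predictions count half.
import numpy as np

def concordance_index(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    num, den = 0.0, 0
    for i in range(len(y_true)):
        for j in range(i + 1, len(y_true)):
            if y_true[i] == y_true[j]:
                continue                      # ties in the label are not comparable
            den += 1
            diff = (y_pred[i] - y_pred[j]) * (y_true[i] - y_true[j])
            if diff > 0:
                num += 1.0                    # concordant pair
            elif diff == 0:
                num += 0.5                    # tied prediction counts half
    return num / den if den else 0.0

# 5 of 6 comparable pairs are concordant here, so CI is about 0.833.
print(concordance_index([5.0, 6.2, 7.1, 8.4], [5.1, 6.0, 7.3, 7.2]))
```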

Author 1: Afaf A. Morad
Author 2: Mohamed A. Aborizka
Author 3: Fahima A. Maghraby

Keywords: Antiviral drugs; computational drug repositioning; coronavirus; deep learning; drug-target interactions

PDF

Paper 75: Development of a Smart Sensor Array for Adulteration Detection in Black Pepper Seeds using Machine Learning

Abstract: Black pepper is an expensive commodity with a high risk of adulteration. Ground papaya seed is the main adulterant in pepper because it cannot be discriminated visually. A few destructive detection methods exist; however, since pepper is costly, a non-destructive adulteration-detection method is essential, though challenging. Existing non-destructive methods use costly, bulky equipment, involve laboratory-based testing, and are time consuming. To overcome these issues, this article presents the development of a non-destructive e-nose gas sensor for pepper adulteration detection, which measures volatile organic compounds (VOCs) in a controlled environment. The proposed system utilizes MQ2 and MQ3 gas sensor arrays to identify the VOCs present in pepper seeds and discriminate adulterated from non-adulterated samples. The sensor data are used in a qualitative analysis to determine adulteration using a Support Vector Machine learning algorithm. The proposed sensor system with the Support Vector Machine algorithm outperforms existing methods, with 100% classification accuracy. The developed gas sensor system is connected to the internet via an IoT application model, showing results on web pages and enabling access by authenticated users from anywhere; a client-server model with the MQTT protocol is used for the IoT application.
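
The qualitative-analysis step reduces to a standard supervised classification over sensor-array readings. The sketch below trains an SVM on synthetic two-feature [MQ2, MQ3] vectors; the reading distributions are invented stand-ins, since the paper's calibration data is not given.

```python
# SVM separating adulterated from pure pepper samples using MQ2/MQ3 readings
# (synthetic sensor values; the real ADC ranges would come from calibration).
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
pure = rng.normal([300, 220], 15, size=(100, 2))         # [MQ2, MQ3] readings
adulterated = rng.normal([410, 350], 15, size=(100, 2))  # papaya-seed-mixed samples
X = np.vstack([pure, adulterated])
y = np.array([0] * 100 + [1] * 100)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X_tr, y_tr)
print("accuracy:", model.score(X_te, y_te))
```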

Author 1: Sowmya Natarajan
Author 2: Vijayakumar Ponnusamy

Keywords: Gas sensor system; volatile organic compounds; pepper seeds; papaya seeds; machine learning

PDF

Paper 76: Digital Signature Algorithm: A Hybrid Approach

Abstract: Security is one of the most important issues in the design of a digital system. Communication these days is digital; consequently, utmost care must be taken to secure the information. This paper focuses on techniques used to defend data from theft and hacking through end-to-end encryption and decryption. Cryptography is the key technique for encrypting and decrypting messages. We use the Digital Signature Standard (DSS) and the Digital Signature Algorithm (DSA). The code for this algorithm is written in MATLAB. The DSA algorithm is commonly used in cryptographic applications to provide services such as entity authentication, key transport, and key agreement in an authenticated environment. The standard is built around a secure hash function and a cryptographic algorithm adopted by government agencies in the USA, as it is considered one of the safest approaches to securing a system. This standard could have a great impact on government agencies and banks for protecting their data.
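
The paper's own implementation is in MATLAB, but the sign/verify service that the DSS specifies can be illustrated in a few lines with Python's `cryptography` package:

```python
# DSA signing and verification with the `cryptography` package.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import dsa
from cryptography.exceptions import InvalidSignature

private_key = dsa.generate_private_key(key_size=2048)
public_key = private_key.public_key()

message = b"funds transfer: account 1234 -> 5678"
signature = private_key.sign(message, hashes.SHA256())   # sign with the private key

try:
    public_key.verify(signature, message, hashes.SHA256())
    print("signature valid")
except InvalidSignature:
    print("signature INVALID - message altered or wrong key")
```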

Author 1: Prajwal Hegde N
Author 2: Veena Devi Shastrimath V

Keywords: DSA; digital signature algorithm; hash function; public key; private key; RSA

PDF

Paper 77: Auto JSON: An Automatic Transformation Model for Converting Relational Database to Non-relational Documents

Abstract: In recent days, the demand for dealing with large sets of distributed data has made the relational database and its structured query language (SQL) solutions obsolete in practice and paved the way for novel solutions under the name of non-relational, not-only SQL (NoSQL) databases. NoSQL offers dynamic, flexible, scalable, highly available, high-performance and near real-time access to the distributed, voluminous data used by current industrial applications. Despite these features of NoSQL, SQL is still in operation because of its popularity and standardization. This paper presents an algorithm to convert relational MySQL data into any document-oriented NoSQL database automatically, without disrupting the existing relational database setup or installing NoSQL from scratch on the core machines. JavaScript Object Notation (JSON) is a human-readable data interchange format used in web development, and its characteristics have widened its use cases from web development to database storage. MongoDB, one of the most popular document-oriented NoSQL databases, adopts the JSON format for its storage. The proposed algorithm is built on the schema definition, and its performance is captured by evaluating it against a sample database from a hospital management system. The findings are discussed with an eye to addressing the challenges and revealing the scope for improvement.
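
The abstract does not spell out the mapping rules, but the general relational-to-document transformation it automates can be illustrated as folding parent and child rows into nested JSON documents. The hospital-style table and field names below are hypothetical:

```python
# Folding a 1-to-many relation (patients -> visits) into MongoDB-style documents.
import json

patients = [  # parent rows, e.g. from MySQL `SELECT * FROM patient`
    {"patient_id": 1, "name": "Asha", "ward": "ICU"},
    {"patient_id": 2, "name": "Ravi", "ward": "General"},
]
visits = [    # child rows keyed by the foreign key patient_id
    {"visit_id": 10, "patient_id": 1, "diagnosis": "asthma"},
    {"visit_id": 11, "patient_id": 1, "diagnosis": "allergy"},
    {"visit_id": 12, "patient_id": 2, "diagnosis": "fracture"},
]

documents = []
for p in patients:
    doc = dict(p)
    # Embed the 1-to-many relation as a nested array instead of a join table.
    doc["visits"] = [{k: v for k, v in r.items() if k != "patient_id"}
                     for r in visits if r["patient_id"] == p["patient_id"]]
    documents.append(doc)

print(json.dumps(documents, indent=2))  # insert-ready for a document store
```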

Author 1: K. Revathi
Author 2: T. Tamilselvi
Author 3: Batini Dhanwanth
Author 4: M. Dhivya

Keywords: Distributed data; document oriented NoSQL; hospital management system; MongoDB; MySQL

PDF

Paper 78: An Energy‐Aware Technique for Task Allocation in the Internet of Things using Krill Herd Algorithm

Abstract: The Internet of Things (IoT) is an innovative technology that connects the digital and physical worlds and allows physical devices with different capacities to share resources to accomplish tasks. Most IoT objects are heterogeneous and have limited battery life, so assigning tasks to these objects is extremely challenging. Energy consumption and reliability are the primary objectives of task allocation algorithms. We present an optimization solution to the IoT task allocation problem based on the krill herd algorithm. The algorithm increases the energy efficiency and stability of the network while providing a reliable task allocation solution. An extensive test of the proposed algorithm has been conducted using the MATLAB simulator. Compared to the most relevant method in the literature, our algorithm provides a higher level of energy efficiency.

Author 1: Dejun Miao
Author 2: Rongyan Xu
Author 3: Jiusong Chen
Author 4: Yizong Dai

Keywords: Internet of things; resource allocation; task scheduling; energy efficiency

PDF

Paper 79: Marigold Flower Blooming Stage Detection in Complex Scene Environment using Faster RCNN with Data Augmentation

Abstract: In recent years, flower growing has developed into a lucrative agricultural sector that provides employment and business opportunities for small and marginal growers in both urban and rural locations in India. One of the most often cultivated flowers for landscaping design is the Marigold. It is also widely used to create garlands from loose flowers for ceremonial and social occasions. Understanding the appropriate harvesting stage for each plant species is essential to ensuring the quality of the flowers after they have been picked. It has been demonstrated that human assessors consistently use a category scoring system to evaluate the various flowering stages. Deep learning and convolutional neural networks have the potential to revolutionize agriculture by enabling efficient analysis of large-scale data. In order to address the problem of Marigold flower stage detection and classification in complex real-time field scenarios, this study proposes a fine-tuned Faster RCNN with a ResNet50 network coupled with data augmentation. Faster RCNN is a popular deep learning framework for object detection that uses a region proposal network to efficiently identify object locations and features in an image. The Marigold flower dataset was collected from three different Marigold fields in the Anand District of Gujarat State, India. The collection includes photos that were taken outdoors in natural light at various heights, angles, and distances. We have developed and fine-tuned a Faster RCNN detection and classification model to be particularly sensitive to Marigold flowers, and we have compared its performance to that of other cutting-edge models to determine its accuracy and effectiveness.
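
Fine-tuning Faster RCNN with a ResNet50 backbone for a custom class set typically amounts to swapping the detection head, as in this torchvision sketch; the class count (blooming stages plus background) is an assumption for illustration, not the paper's configuration.

```python
# Common torchvision recipe: load a COCO-pretrained Faster R-CNN and resize its
# classification head to the custom class set.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

num_classes = 4  # e.g. 3 marigold blooming stages + 1 background class (assumed)

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
# Swap the COCO-trained head for one sized to the marigold classes.
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# Training then proceeds with the usual torchvision detection loop:
# model(images, targets) returns a dict of losses in train mode.
```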

Author 1: Sanskruti Patel

Keywords: Deep learning; convolutional neural networks; object detection; marigold flower blooming stage detection

PDF

Paper 80: Polarimetric SAR Characterization of Mangrove Forest Environment in the United Arab Emirates (UAE)

Abstract: Mangrove forests in the United Arab Emirates (UAE) provide valuable ecosystem services such as coastal erosion protection, water purification and refuge for a wide variety of plants and animals. Therefore, the first step toward understanding mangrove forests is monitoring this important ecological system. This paper proposes an original study to characterize the mangrove forest environment in the UAE using polarimetric synthetic aperture radar (PolSAR) remote sensing. Freely accessible C-band dual-polarization PolSAR Sentinel-1 data have been exploited. The elements of the covariance matrix as well as the entropy/alpha decomposition parameters have been studied. Results show that the VH intensity, the coherence between the VV and VH polarimetric channels, the entropy and the alpha angle provide the most pronounced signatures for discerning mangrove forests. Thus, these parameters could be exploited to improve the accuracy of remote sensing monitoring and mapping techniques for mangrove forests in the UAE.

Author 1: Soumaya Fatnassi
Author 2: Mohamed Yahia
Author 3: Tarig Ali
Author 4: Maruf Mortula

Keywords: Mangrove forests; dual-PolSAR; sentinel 1; United Arab Emirates; entropy/alpha decomposition

PDF

Paper 81: An In-depth Analysis of Uneven Clustering Techniques in Wireless Sensor Networks

Abstract: The low cost and convenience of Wireless Sensor Networks (WSNs) have made them popular in many sectors over the last decade. WSNs are now widely used as a result of recent advancements in low-power, energy-efficient communication. WSNs typically use batteries to power sensor nodes; the finite energy stored in batteries and the hassle of battery replacement have made energy efficiency a critical focus for WSNs. Clustering and data aggregation are the most efficient methods for addressing the energy concerns of WSNs. This paper comprehensively reviews several uneven clustering methods and compares the various uneven clustering algorithms. The methods are described in terms of their goals, attributes, categories, advantages and disadvantages. Probabilistic clustering is used when there is a need for simplicity and speed. This study compares all these types of protocols based on their clustering properties, CH properties, and the type of clustering process; current research gaps and effective techniques are also addressed.

Author 1: Hai-yu Zhang

Keywords: Wireless sensor networks; data aggregation; uneven clustering; energy-efficient; review

PDF

Paper 82: Univariate and Multivariate Gaussian Models for Anomaly Detection in Multi Tenant Distributed Systems

Abstract: Due to flaws in shared memory, settings, and network access, distributed systems on a network have always been susceptible to cyber intrusions. Co-users on the same server give attackers the chance to monitor the activity of many other users and launch an attack when those users' security is at risk. Building completely secure network topologies immune to risks and assaults has traditionally been the goal, yet it is hard to create an architecture that is 100 percent safe due to its open-ended nature. However, the precise parameters and infrastructure design through which an attack is instantiated are constants that can always be detected, regardless of the sort of attack. Thanks to the increased usage of machine learning algorithms and data-gathering tools, it is now possible to simulate abnormalities and subsequent attack possibilities using network parameter values. This work proposes a Gaussian model to forecast the likelihood of an attack occurring depending on certain system parameters: both a univariate and a multivariate Gaussian model are fitted on the training dataset, and various threshold values are used to predict whether a data point is an inlier or an outlier. The accuracies for various threshold values are examined. An important challenge in an anomaly detection situation is class imbalance; as long as only training data is utilized, class imbalance is not a problem. Our data-driven results show that combining machine learning with Gaussian-based models can be a useful tool for analyzing network intrusions. Although more steps are being taken to boost digital-space security, machine learning algorithms may be utilized to examine any abnormal behavior that would otherwise be left unchecked.
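
The two density models described above can be sketched directly: fit Gaussian parameters on training data assumed normal, then compare each test point's likelihood against a threshold epsilon. The feature distributions and epsilon below are illustrative.

```python
# Univariate vs. multivariate Gaussian anomaly detection on 2-D traffic features.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
X_train = rng.normal([50, 0.2], [5, 0.05], size=(500, 2))  # "normal" traffic

# Univariate view: independent mean/variance per feature, probabilities multiplied.
mu, var = X_train.mean(axis=0), X_train.var(axis=0)
def p_univariate(x):
    return np.prod(np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var),
                   axis=-1)

# Multivariate view: full covariance captures correlated features.
cov = np.cov(X_train, rowvar=False)
mvn = multivariate_normal(mean=mu, cov=cov)

X_test = np.array([[51, 0.21],    # looks normal
                   [90, 0.90]])   # anomalous point
epsilon = 1e-6                    # threshold tuned on a labeled validation set
for x in X_test:
    pm = mvn.pdf(x)
    print(x, f"univariate p={p_univariate(x):.2e}", f"multivariate p={pm:.2e}",
          "-> outlier" if pm < epsilon else "-> inlier")
```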

Author 1: Pravin Ramdas Patil
Author 2: Geetanjali Kale

Keywords: Multi-tenant distributed system; anomaly detection; outlier detection; machine learning; Gaussian model

PDF

Paper 83: Cloud Service Composition using Firefly Optimization Algorithm and Fuzzy Logic

Abstract: Cloud computing involves the dynamic provision of virtualized and scalable resources over the Internet as services. Different types of services with the same functionality but different non-functional features may be delivered in a cloud environment in response to customer requests, and these may need to be combined to satisfy a customer's complex requirements. Recent research has focused on combining unique and loosely-coupled services into a preferred system. An optimized composite service consists of previously existing single and simple services combined to provide an optimal composite service, thereby improving the quality of service (QoS). In recent years, cloud computing has driven the rapid proliferation of multi-provision cloud service compositions, in which cloud service providers can provide multiple services simultaneously. Service composition fulfils a variety of user needs in a variety of scenarios. A composite request (service request) in a multi-cloud environment requires atomic services (service candidates) located in multiple clouds, and service composition combines atomic services from multiple clouds into a single service. Since cloud services are rapidly growing and their QoS varies widely, finding the necessary services and composing them with quality assurances is an increasingly challenging technical task. This paper presents a method that uses the firefly optimization algorithm (FOA) and fuzzy logic to balance multiple QoS factors and satisfy service composition constraints. Experimental results prove that the proposed method outperforms previous ones in terms of response time, availability, and energy consumption.

Author 1: Wenzhi Wang
Author 2: Zhanqiao Liu

Keywords: Cloud computing; service composition; QoS; firefly algorithm; fuzzy logic

PDF

Paper 84: Systematic Review of Deep Learning Techniques for Lung Cancer Detection

Abstract: Cancer is a leading cause of death across the globe; according to the WHO, nearly 10 million people died of cancer in 2020, with lung cancer alone accounting for 2.21 million new cases and 1.80 million deaths. Lung cancer is caused by the uncontrolled multiplication and growth of lung cells. In this context, exploiting technological innovations for the automatic, early detection of lung cancer is of paramount importance. Toward this end, significant progress has been made, and deep learning models such as the Convolutional Neural Network (CNN) have been found superior in processing lung CT or MRI images for disease diagnosis. Detecting lung cancer in the early stages of the disease enables better treatment and cure. In this paper, we present a systematic review of deep learning methods for the detection of lung cancer, covering peer-reviewed journal and conference papers from 2012 to 2021. The literature review synthesizes different existing methods covering machine learning (ML), deep learning and artificial intelligence (AI). It provides insights into different deep learning methods in terms of their pros and cons and identifies possible research gaps. This paper gives the reader knowledge of different aspects of lung cancer detection, which can trigger further research toward models that can be used in the Clinical Decision Support Systems (CDSSs) required by healthcare units.

Author 1: Mattakoyya Aharonu
Author 2: R Lokesh Kumar

Keywords: Artificial intelligence; deep learning; lung cancer; lung cancer detection; machine learning

PDF

Paper 85: Question Classification in Albanian Through Deep Learning Approaches

Abstract: In recent years, there has been growing interest in intelligent conversation systems. In this context, Question Classification is an essential subtask in Question Answering systems, as it determines the question type and, therefore, the type of the answer. However, while there is abundant research for English, little research has been carried out for other languages. In this paper we deal with the classification of questions in the Albanian language, which is considered a complex Indo-European language. We employ both machine learning and deep learning approaches on a large corpus in Albanian based on the six-class TREC dataset, with approximately 5000 questions. Experiments with and without stop-words show that the impact of stop-words on the accuracy of the classifier is significant. An extensive comparison of algorithms for the task of question classification in Albanian shows that deep learning algorithms outperform conventional machine learning approaches. To the best of our knowledge this is the first approach in the literature to classifying questions in Albanian, and the results are highly comparable to those for English.

Author 1: Evis Trandafili
Author 2: Nelda Kote
Author 3: Gjergj Plepi

Keywords: Question classification; deep learning; BiLSTM; transformer; RoBERTa; Albanian corpus; natural language processing

PDF

Paper 86: An Efficient Source Printer Identification Model using Convolution Neural Network (SPI-CNN)

Abstract: Document forgery detection is becoming increasingly important in the current era, as forgery techniques are available to even inexperienced users. Source printer identification is a method for identifying the source printer and classifying a questioned document into one of the printer classes. To the best of our knowledge, most earlier studies segmented documents into characters, words, and patches or cropped them to obtain large datasets. In contrast, this paper works with the document as a whole and with small datasets. It uses three CNN-based techniques to find a document's source printer without segmenting the document into characters, words, or patches. Three separate datasets of 1185, 1200, and 2385 documents are used to estimate the performance of the suggested techniques. In the first technique, 13 pre-trained CNNs were tested and used only for feature extraction, while an SVM was used for classification. In the second technique, a pre-trained neural network is retrained using transfer learning for both feature extraction and classification. In the third technique, a CNN is trained from scratch and then used for feature extraction, with an SVM for classification. Many experiments were conducted across the three techniques, showing that the third technique gives the best results, achieving 99.16%, 99.58%, and 98.3% accuracy for datasets 1, 2, and 3. The three techniques are also compared with some previously published papers, confirming that the third technique gives better results.
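
The first technique, a frozen pre-trained CNN used as a feature extractor feeding an SVM, can be sketched as follows; ResNet18 and the random "document images" are stand-ins for the 13 networks and real scanned pages actually tested.

```python
# Pre-trained CNN as a frozen feature extractor, SVM as the classifier.
import torch
import torchvision
from sklearn.svm import SVC

cnn = torchvision.models.resnet18(weights="DEFAULT")
cnn.fc = torch.nn.Identity()   # drop the ImageNet head; keep 512-d features
cnn.eval()

docs = torch.rand(40, 3, 224, 224)          # 40 stand-in "document images"
labels = [i % 4 for i in range(40)]         # 4 hypothetical printer classes
with torch.no_grad():
    feats = cnn(docs).numpy()               # (40, 512) deep features

svm = SVC(kernel="linear").fit(feats, labels)
print("train accuracy:", svm.score(feats, labels))
```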

Author 1: Naglaa F. El Abady
Author 2: Hala H. Zayed
Author 3: Mohamed Taha

Keywords: Document forgery; source printer identification (SPI); convolution neural network (CNN); transfer learning (TL); support vector machine (SVM)

PDF

Paper 87: The Cross-Cultural Teaching Model of Foreign Literature Under the Application of Machine Learning Technology

Abstract: As globalization spreads, populations around the world increasingly mix, leading to a surge in cultural diversity that has become a major issue for the educational systems of all countries. Many western countries advocate for cross-cultural education (CCE) as a means of dealing with cultural variety and promoting trust, tolerance, and interaction between individuals of different backgrounds. The way to achieve this goal is to work toward solving the issue while also fostering greater national unity. One of the newest developments in international education is the concept of CCE, which has also given rise to a whole new area of research within the field of education. Traditional approaches to teaching foreign literature have long struggled to actively engage students in the learning process, and the new curriculum reform has seriously challenged them. As a result, the proposed research has shown that around half of all instruction is dedicated to the cross-cultural teaching paradigm for foreign literature (FL). Data on Chinese students were first gathered for this study and divided into two groups: a Control Class (CC) and an Experimental Class (EC). The performance of the students in both groups is then forecast using the extreme gradient boosting (XGBoost) technique, which is based on machine learning. An optimization method known as the Flower Pollination Algorithm (FPA) is then used to improve XGBoost's prediction performance. According to the descriptive findings, students who follow the suggested teaching strategy show more learning interest than those who follow existing strategies.

Author 1: Jing Lv

Keywords: Cross cultural education; foreign literature; extreme gradient boosting (XGBoost) algorithm; flower pollination algorithm (FPA); control class (CC)

PDF

Paper 88: Meta Heuristic Fusion Model for Classification with Modified U-Net-based Segmentation

Abstract: Diabetic Retinopathy (DR) is a common complication of diabetes mellitus that results in lesions on the retina which impair vision; if it is not detected in time, it can lead to severe blindness. Regrettably, there is no cure for DR, but early diagnosis and treatment can greatly lower the risk of visual loss. In contrast to computer-aided diagnosis methods, the manual diagnosis of DR using retina fundus images is more time-consuming, costly, and highly prone to error. Deep learning has emerged as one of the most popular methods for improving performance, particularly in the classification and analysis of medical images. Therefore, a deep-structure-based DR detection and severity classification is demonstrated for treating DR using fundus images. The major aim of this technique is to classify the severity level of the retinal region of the human eye from the fundus images. First, the required retinal fundus images are collected from standard benchmark data sources. Second, image enhancement techniques are applied to the collected fundus images to improve their quality. Third, abnormality segmentation is carried out using an optic disc removal process based on an active contouring model, and regional segmentation is done via the Modified U-Net method. Finally, the segmented image is passed to a hybrid classifier network named Hybrid Soft Attention-based DenseNet with Multi-Scale Gated ResNet (HSADMGR Net) for classifying the retinal fundus images and finding the severity level of the retinal images with higher accuracy. Furthermore, the parameters inside the hybrid classifier network are optimized with the help of the implemented Multi-Armed Bandits Groundwater Flow Algorithm (MABGFA). The test results of the developed deep-structure-based DR model are validated against existing DR detection and classification approaches using different performance measures.

Author 1: Sri Laxmi Kuna
Author 2: A. V. Krishna Prasad
Author 3: Suneetha Bulla

Keywords: Diabetic retinopathy segmentation and classification model; multi-armed bandits groundwater flow algorithm; hybrid soft attention-based DenseNet with multi-scale gated ResNet; modified U-Net-based segmentation

PDF

Paper 89: Legal Entity Extraction: An Experimental Study of NER Approach for Legal Documents

Abstract: In the legal domain, Named Entity Recognition (NER) serves as the basis for subsequent stages of legal artificial intelligence. In this paper, the authors have developed a dataset for training NER in the Indian legal domain. As a first step of the research methodology, a study is done to identify and establish more legal entities beyond commonly used named entities such as person, organization, and location. Annotators can make use of these entities to annotate different types of legal documents. A variety of text annotation tools exist and finding the best one is a difficult task, so the authors experimented with various tools before settling on the most suitable one for this research work. The resulting annotations of the unstructured text can be stored in JavaScript Object Notation (JSON) format, which improves readability and makes data manipulation simple. After annotation, the resulting dataset contains approximately 30 documents and approximately 5000 sentences. This data is further used to train a spaCy pre-trained pipeline to predict accurate legal named entities. The accuracy of legal entity recognition can be increased further if the pre-trained models are fine-tuned using legal texts.
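
For spaCy v3, annotations of this kind are typically packaged as character-offset entity spans and serialized to a DocBin for training. The sentences and entity labels below are invented examples, not drawn from the authors' dataset:

```python
# Packaging character-offset NER annotations into a spaCy v3 DocBin.
import spacy
from spacy.tokens import DocBin

TRAIN_DATA = [
    ("The appeal was heard by the Supreme Court of India on 12 March 2019.",
     {"entities": [(28, 50, "COURT"), (54, 67, "DATE")]}),
    ("Counsel cited Section 302 of the Indian Penal Code.",
     {"entities": [(14, 25, "PROVISION"), (33, 50, "STATUTE")]}),
]

nlp = spacy.blank("en")
db = DocBin()
for text, ann in TRAIN_DATA:
    doc = nlp.make_doc(text)
    spans = [doc.char_span(s, e, label=lbl, alignment_mode="contract")
             for s, e, lbl in ann["entities"]]
    doc.ents = [sp for sp in spans if sp is not None]  # skip misaligned spans
    db.add(doc)
db.to_disk("./train.spacy")  # consumed by `python -m spacy train config.cfg`
```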

Author 1: Varsha Naik
Author 2: Purvang Patel
Author 3: Rajeswari Kannan

Keywords: Named Entity Recognition; NER; legal domain; text annotation; annotation tools

PDF

Paper 90: Research on the Model of Preventing Corporate Financial Fraud under the Combination of Deep Learning and SHAP

Abstract: Preventing financial fraud in listed companies improves the healthy development of China’s accounting industry and securities market, promotes the improvement of the internal control systems of China’s enterprises, and promotes stability. Based on deep learning combined with SHAP (SHapley Additive exPlanations), a prediction and identification model is built to determine the possibility of financial fraud and the fraud risk of a company. The research model effectively improves the accuracy of identifying financial fraud in listed companies, and it effectively deals with the gray-sample problem common in forecasting models through the LOF and IF algorithms. In comparative experiments, the overall accuracy rate of the research model is over 85%, the recall rate is 78.5%, the precision rate is 42%, the AUC reaches 0.896, the discrimination degree KS reaches 0.652, and the model stability PSI is 0.088; compared with the traditional financial fraud forecasting FS and CS models, it has a higher predictive effect. In the empirical analysis, a company’s 2020 fraud case is selected to analyze the feature contributions in the fraud process and the key fraud risks. The established model can effectively monitor a company’s finances and prevent fraud.
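
The feature-contribution analysis the abstract mentions is what SHAP provides: per-feature attributions for an individual prediction. A minimal sketch with a tree model follows; the financial indicator names are illustrative, not the paper's feature set.

```python
# SHAP attributions for a single company's fraud prediction (illustrative features).
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] - 0.5 * X[:, 2] + rng.normal(scale=0.3, size=500) > 0).astype(int)
feature_names = ["receivables_ratio", "audit_opinion", "cash_flow_gap", "related_txn"]

model = GradientBoostingClassifier(random_state=0).fit(X, y)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:1])       # contributions for one company
for name, val in zip(feature_names, np.ravel(shap_values)):
    print(f"{name:18s} {val:+.3f}")              # +: pushes toward the fraud class
```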

Author 1: Yanzhao Wang

Keywords: Financial fraud; deep learning; ensemble algorithm; feature selection

PDF

Paper 91: Conjugate Symmetric Data Transmission Control Method based on Machine Learning

Abstract: In conjugate symmetric data transmission, insufficient judgment of congestion during transmission leads to large data volumes and a low transmission rate. In order to improve the data transmission rate, a conjugate symmetric data transmission control method based on machine learning is designed. First, the data to be transmitted is tracked and determined, and conjugate symmetric data fusion is completed according to the calculation of the best tracking signal. Based on the fusion results, the framework of the conjugate symmetric data coding system is established and the data coding is completed. The average congestion mark value is calculated by the machine learning method to complete the congestion judgment for data transmission. On the basis of congestion determination, efficient transmission control of conjugate symmetric data is realized by specifying the conjugate symmetric data transmission protocol. Experimental results show that, compared with traditional control methods, this control method achieves a high delivery rate, low message transmission overhead and low data transmission delay. Compared to the traditional two-way path model, the proposed scheduling method increases the transmission delivery rate by 5%, while reducing the transmission cost by 0.7 on the cost index and the delay by 1.1 minutes. In comparison with accurate error-tracking equalization, the transmission delivery rate of the proposed method increases by 21%, the transmission cost index decreases by 3.1, and the delay decreases by 2.9 minutes. Based on this performance comparison, it can be concluded that the machine learning method offers superior transmission control performance.

Author 1: Yao Wang

Keywords: Machine learning; conjugate symmetric data; transmission control; data coding; transmission delivery rate

PDF

Paper 92: Towards Finding the Impact of Deep Learning in Educational Time Series Datasets – A Systematic Literature Review

Abstract: Besides teaching, instructors in the education system perform many background tasks such as preparing study material, setting question papers, managing attendance, making log book entries, assessing students, and analyzing the results of the class. Moreover, a Learning Management System (LMS) is mandatory if the course is online; the Massive Open Online Course (MOOC) is an example of a worldwide online education system. Nowadays, educators use Google to efficiently formulate study material and question papers, and especially for self-preparation. Student assessment and result analysis tools are also available that give instant results when fed student data. Artificial Intelligence (AI) drives these applications to deliver the most precise outcomes. To accomplish that, AI requires historical data to train the model, and this sequential (year-wise, month-wise, etc.) information is called time series data. This Systematic Literature Review (SLR) was conducted to find the contribution of time series algorithms in education. Algorithm architectures have changed enormously compared to the traditional neural network in order to handle all kinds of data; though this significantly raises performance, it also expands the complexity, resource requirements, and execution time. Because of this, comprehending the algorithm architecture and the execution process is a challenging phase before creating a model, yet such knowledge is essential for selecting the suitable technique for the right solution. The first part of this review covers time series problems in educational datasets addressed using Deep Learning (DL). The second part describes the architecture of time series models such as the Recurrent Neural Network (RNN) and its variants, Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU), the differences between them, and the classification of performance metrics. Finally, the factors affecting time series model accuracy and the significance of this work are summarized to encourage those who wish to initiate research on educational time series problems.

Author 1: Vanitha S
Author 2: Jayashree. R

Keywords: Deep learning; education; gated recurrent unit; long-short term memory; recurrent neural network; time series

PDF

Paper 93: AI in Tourism: Leveraging Machine Learning in Predicting Tourist Arrivals in Philippines using Artificial Neural Network

Abstract: Tourism is one of the most prominent and rapidly expanding sectors that contribute significantly to the growth of a country’s economy. However, the tourism industry was among those most adversely affected during the coronavirus pandemic. Thus, reliable and accurate time series prediction of tourist arrivals is necessary for making decisions and strategies to develop the competitiveness and economic growth of the tourism industry. In this sense, this research aims to examine the predictive capability of an artificial neural network (ANN) model, a popular machine learning technique, using the actual tourism statistics of the Philippines from 2008 to 2022. The model was trained using three distinct data compositions and was evaluated using different time series evaluation metrics to identify the factors affecting model performance and determine its accuracy in predicting arrivals. The findings revealed that the ANN model is reliable in predicting tourist arrivals, with an R-squared value of 0.926 and a MAPE of 13.9%. Furthermore, it was determined that adding training sets that contain unexpected phenomena, like the COVID-19 pandemic, improved the prediction model's accuracy and learning process. As the technique proves its prediction accuracy, it would be a useful tool for the government, tourism stakeholders, and investors, among others, to enhance strategic and investment decisions.

Author 1: Noelyn M. De Jesus
Author 2: Benjie R. Samonte

Keywords: Tourist arrivals; machine learning; predictive analytics; artificial neural network; ANN; time series prediction

PDF

Paper 94: A Cloud and IoT-enabled Workload-aware Healthcare Framework using Ant Colony Optimization Algorithm

Abstract: In recent years, smart cities have gained in popularity due to their potential to improve the quality of life for urban residents. In many smart city services, particularly those in the field of smart healthcare, big healthcare data is analyzed, processed, and shared in real time. Healthcare products and services are essential to the industry's current state, increasing its viability for all parties involved. With the increasing popularity of cloud-based services, it is imperative to develop new approaches for discovering and selecting these services. This paper follows a two-stage process. The first stage involves designing and implementing an Internet-enabled healthcare system incorporating wearable devices. In the second stage, a new load-balancing algorithm based on Ant Colony Optimization (ACO) is presented; ACO distributes tasks across virtual machines to maximize resource utilization and minimize makespan time. In terms of both makespan time and processing time, the proposed method appears to be more efficient than previous approaches, based on statistical analysis.

Author 1: Lu Zhong
Author 2: Xiaoke Deng

Keywords: Internet of things; cloud computing; healthcare; load balancing

PDF

Paper 95: Multi-String Missing Characters Restoration for Automatic License Plate Recognition System

Abstract: Developing a license plate recognition system that can cope with unconstrained real-time scenarios is very challenging. Additional cues, such as the color and dimensions of the plate and the font of the text, can be useful in improving the system's accuracy. This paper presents a deep learning-based plate recognition system that can take advantage of the bilingual text on license plates, as used in many countries, including Saudi Arabia. We train and test the model using a custom dataset generated from real-time traffic videos in Saudi Arabia. Using the English alphanumeric characters alone, the accuracy of our system was on par with existing state-of-the-art algorithms; however, it increased significantly when the additional information from the detection of Arabic text was utilized. We propose a new algorithm to restore noise-affected missing or misidentified characters in the plate. We generated a new test dataset of license plates to test how the proposed system performs in challenging scenarios. The results show a clear advantage of the proposed system over several commercially available solutions, including OpenALPR, Plate Recognizer, and Sighthound.

Author 1: Ishtiaq Rasool KHAN
Author 2: Syed Talha Abid ALI
Author 3: Asif SIDDIQ
Author 4: Seong-O SHIM

Keywords: Automatic License Plate Recognition (ALPR); Intelligent Transportation System (ITS); Optical Character Recognition (OCR); Deep Convolutional Neural Network; You Only Look Once (YOLO)

PDF

Paper 96: A Real-time ECG CTG based Ensemble Feature Extraction and Unsupervised Learning based Classification Framework for Multi-class Abnormality Prediction

Abstract: Cardiovascular diseases (CVDs) are a leading cause of death worldwide. Early detection and diagnosis of these diseases can greatly reduce complications and improve outcomes for high-risk individuals. One method for detecting CVDs is the use of electrocardiogram (ECG) monitoring systems, which use technologies such as the Internet of Things (IoT), mobile applications, wireless sensor networks (WSNs), and wearable devices to acquire and analyze ECG data for early diagnosis. However, despite the prevalence of these systems in the literature, their classification accuracy needs further optimization and improvement. To address this challenge, a novel heterogeneous unsupervised learning model for real-time ECG classification is proposed, with the main goals of reducing the error rate and improving the classification accuracy of the system. This study presents a framework for the classification of multi-class abnormalities in electrocardiograms (ECGs) using an ensemble feature extraction technique and unsupervised learning. The framework utilizes a real-time electrocardiogram-cardiotocography (ECG-CTG) system to extract features from the ECG signal, and then employs an ensemble of feature extraction techniques to enhance the discrimination of the extracted features. The extracted features are then used in an unsupervised learning-based classification algorithm to classify the ECG signals into different classes of abnormalities. The proposed framework is evaluated on a dataset of ECG signals, and the results show that it can effectively classify ECG signals with high accuracy and low computational complexity.

Author 1: Y. Aditya
Author 2: S. Suganthi Devi
Author 3: B. D. C. N Prasad

Keywords: Ensemble; feature ranking; improved inter quartile range; outlier detection; heterogeneous optimized k-nearest neighbor; unsupervised learning

PDF

Paper 97: Bird Image Classification using Convolutional Neural Network Transfer Learning Architectures

Abstract: With the technological progress of human beings, more and more animal and bird species are becoming endangered, sometimes even driven to the verge of extinction. However, the existence of birds is highly beneficial for human civilization, as birds help in pollination, destroy insects harmful to crops, and more. To ensure the healthy co-existence of all species alongside human beings, almost all advanced countries have taken up conservation measures for endangered species. The first step toward conservation is to identify the species of birds found in different locations. Deep learning-based techniques are best suited for the automated identification of bird species from captured images. In this paper, a Convolutional Neural Network based bird image identification methodology is proposed. Four transfer learning-based architectures, namely ResNet152V2, InceptionV3, DenseNet201, and MobileNetV2, were used for bird image classification and identification. The models were trained using 58,388 images belonging to 400 species of birds and tested using 2,000 images of the same 400 species. Of these four models, ResNet152V2 and DenseNet201 performed comparatively well: ResNet152V2 had the highest accuracy at 95.45% but a large loss of 0.8835, while DenseNet201 reached 95.05% accuracy with a lower loss of 0.6854. The results show that the DenseNet201 model can be used for real-life bird image classification.

Author 1: Asmita Manna
Author 2: Nilam Upasani
Author 3: Shubham Jadhav
Author 4: Ruturaj Mane
Author 5: Rutuja Chaudhari
Author 6: Vishal Chatre

Keywords: Deep learning; CNN; Image classification; DenseNet201; ResNet152V2; InceptionV3; MobileNetV2

PDF

Paper 98: An Automated Framework to Detect Emotions from Contextual Corpus

Abstract: Emotion extraction, or opinion mining, is one of the key tasks for any text processing framework. In recent times, opinion mining has gained a lot of traction due to its application in customized consumer relations and other personalized applications. However, sentiment analysis is highly challenging because its accuracy depends on the input text corpus, which can fluctuate greatly due to the inclusion of emojis, local language influences, and the use of a wide variety of regional languages. Many parallel research efforts have recently aimed to solve these challenges. However, most of them leave three challenges unsolved: first, emojis in the text corpus are typically removed rather than translated into sentiment scores; second, translations from regional languages are mostly literal rather than contextual; and finally, the dictionaries used in translation take a long time to process, and this time must be reduced. Hence, to solve these challenges, this work proposes a framework that automates weighted emoji-based sentiment analysis, uses a Unicode-based translation process to reduce time complexity, and finally uses collaborative sentiment analysis scores to build the final sentiment models. This work achieves nearly 97% accuracy and nearly a 50% reduction in time complexity.

Author 1: Ravikumar Thallapalli
Author 2: G. Narsimha

Keywords: Emoji translation; weighted annotation; text translation; reduced unicode based dictionary; relative sentiment score building; mean scoring technique; collaborative sentiment score building

PDF

Paper 99: Rapidly Exploring Random Trees for Autonomous Navigation in Observable and Uncertain Environments

Abstract: This paper proposes the use of a small differential robot with two DC motors, controlled by an ESP32 microcontroller, that implements the Rapidly Exploring Random Trees algorithm to navigate from an origin point to a destination point in an unknown but observable environment. The motivation behind this research is to explore the use of a low-cost, versatile and efficient robotic platform for autonomous navigation in complex environments. This work presents a practical and cost-effective solution that can be easily replicated and implemented in various scenarios such as search and rescue, surveillance, and industrial automation. The proposed robotic platform is equipped with a set of sensors and actuators that allow it to observe the environment, estimate its position, and move through it. The Rapidly Exploring Random Trees algorithm is implemented to generate a path from an origin to a destination point, avoiding obstacles and adjusting the robot’s motion accordingly. The implementation of this algorithm enables the robot to navigate through complex environments with high efficiency and reliability, making it a suitable solution for a wide range of applications. The results obtained through simulations and experiments show that the proposed robotic platform and algorithm achieve high performance and accuracy in autonomous navigation, even in complex environments.
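
A compact version of the Rapidly Exploring Random Trees loop the robot runs is sketched below: sample a point, steer from the nearest node, reject collisions, and stop when the goal is within reach. The obstacle layout, step size and goal bias are illustrative parameters, not the paper's values.

```python
# Compact RRT in the plane with circular obstacles (illustrative parameters).
import math, random

random.seed(1)
START, GOAL = (0.0, 0.0), (9.0, 9.0)
OBSTACLES = [(5.0, 5.0, 1.5)]        # circles: (cx, cy, radius)
STEP, GOAL_TOL, MAX_ITERS = 0.5, 0.5, 5000

def collides(p):
    return any(math.dist(p, (cx, cy)) <= r for cx, cy, r in OBSTACLES)

nodes, parent = [START], {START: None}
for _ in range(MAX_ITERS):
    # Goal bias: occasionally steer straight toward the destination.
    sample = GOAL if random.random() < 0.1 else (random.uniform(0, 10),
                                                 random.uniform(0, 10))
    nearest = min(nodes, key=lambda n: math.dist(n, sample))
    d = math.dist(nearest, sample)
    if d == 0:
        continue
    new = (nearest[0] + STEP * (sample[0] - nearest[0]) / d,
           nearest[1] + STEP * (sample[1] - nearest[1]) / d)
    if collides(new):
        continue
    nodes.append(new); parent[new] = nearest
    if math.dist(new, GOAL) < GOAL_TOL:        # reached: walk back to the root
        path = [new]
        while parent[path[-1]] is not None:
            path.append(parent[path[-1]])
        print(f"path found with {len(path)} waypoints after {len(nodes)} nodes")
        break
```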

Author 1: Fredy Martinez
Author 2: Edwar Jacinto
Author 3: Holman Montiel

Keywords: Autonomous navigation; differential robot; ESP32 microcontroller; low-cost; rapidly exploring random trees algorithm; versatile

PDF

Paper 100: Incremental Diversity: An Efficient Anonymization Technique for PPDP of Multiple Sensitive Attributes

Abstract: Data collected at organizations such as schools, offices, healthcare centers and e-commerce websites contain multiple sensitive attributes. Sensitive information held by these organizations, such as marks obtained, salary, disease, treatment and traveling history, is personal information that individuals dislike disclosing to the public, as disclosure may lead to privacy threats. Therefore, it is necessary to preserve the privacy of the data before publishing. Privacy Preserving Data Publishing (PPDP) algorithms aim to publish data without compromising the privacy of individuals. In recent years several algorithms have been designed for PPDP with multiple sensitive attributes. Their major limitations are: first, among several sensitive attributes, these algorithms treat one as the primary sensitive attribute and anonymize the data accordingly, although other dominant sensitive attributes may also need to be preserved; second, there is no consistent way to categorize multiple sensitive attributes; and lastly, an increased proportion of residual records is generated due to the use of generalization and suppression techniques. To overcome these limitations, the current work proposes an efficient approach that categorizes the sensitive attributes based on their semantics and anonymizes the data using an anatomy technique. This reduces the residual records as well as categorizing the attributes. The results are compared with popular techniques like Simple Distribution of Sensitive Values (SDSV) and (l, e)-diversity. Experiments prove that our method outperforms the existing methods in terms of categorizing multiple sensitive attributes, reducing the percentage of residual records and preventing existing privacy threats.

Author 1: Veena Gadad
Author 2: Sowmyarani C N

Keywords: Data management; privacy preserving data publishing; data privacy; multiple sensitive attributes; data anonymization; privacy attacks

PDF

Paper 101: Demand Forecasting Models for Food Industry by Utilizing Machine Learning Approaches

Abstract: Continued global economic instability and uncertainty is causing difficulties in predicting sales. As a result, many sectors and decision-makers are facing new, pressing challenges. In supply chain management, the food industry is a key sector in which sales movement and demand forecasting for food products are particularly difficult to predict. Accurate sales forecasting helps to minimize stored and expired items across individual stores and thus reduces the potential loss from these expired products. To help food companies adapt to rapid changes and manage their supply chain more effectively, it is necessary to utilize machine learning (ML) approaches because of ML’s ability to process and evaluate large amounts of data efficiently. This research compares two forecasting models for confectionery products from one of the largest distribution companies in Saudi Arabia in order to improve the company’s ability to predict demand for its products using machine learning algorithms. To achieve this goal, Support Vector Machine (SVM) and Long Short-Term Memory (LSTM) algorithms were utilized, and the models were evaluated based on their performance in forecasting quarterly time series. Both algorithms provided strong results when measured against the demand forecasting model, but overall the LSTM outperformed the SVM.

Author 1: Nouran Nassibi
Author 2: Heba Fasihuddin
Author 3: Lobna Hsairi

Keywords: Machine learning; long short-term memory; support vector machine; food industry; supply chain management; demand forecasting; product sales

PDF

Paper 102: Method for Inferring the Optimal Number of Clusters with Subsequent Automatic Data Labeling based on Standard Deviation

Abstract: Machine learning is a suitable pattern recognition technique for detecting correlations between data. In the case of unsupervised learning, the groups formed from these correlations can receive a label, which consists of describing them in terms of their most relevant attributes and their respective ranges of values so that they are understood automatically. In this research work, this process is called labeling. However, a challenge for researchers is establishing the optimal number of clusters that best represent the underlying structure of the data subjected to clustering. This optimal number may vary depending on the data set and the grouping method used and influences the data clustering process and, consequently, the interpretability of the generated groups. Therefore, this research aims to provide an inference approach to the number of clusters to be used in the grouping based on the range of attribute values, followed by automatic data labeling based on the standard deviation to maximize the understanding of the groups obtained. This methodology was applied to four databases. The results show that it contributes to the interpretation of the groups since it generates more accurate labels without any overlap between ranges of values, considering the same attribute in different groups.

Author 1: Aline Montenegro Leal Silva
Author 2: Francisco Alysson da Silva Sousa
Author 3: Alysson Ramires de Freitas Santos
Author 4: Vinicius Ponte Machado
Author 5: Andre Macedo Santana

Keywords: Inference approach; range of attribute values; labeling; standard deviation; interpretation of the groups

PDF

Paper 103: Heart Disease Classification and Recommendation by Optimized Features and Adaptive Boost Learning

Abstract: In recent decades, cardiovascular diseases have eclipsed all others as the main cause of death in both low- and middle-income countries. Early identification and continuous clinical monitoring can reduce the death rate associated with heart disorders, but neither service is yet widely accessible, as it requires more intellect, time, and skill to effectively detect cardiac disorders in all circumstances and to advise a patient around the clock. In this study, researchers propose a Machine Learning-based approach to forecast the development of cardiac disease. For precise identification of cardiac disease, an efficient ML technique is required. The proposed method works on five classes: one normal class and four disease classes. Each class was assigned a primary task, and recommendations were made based on it. The proposed method optimises feature weighting and selects efficient features. Following feature optimization, adaptive boost learning with tree and KNN bases is used. In the experiments, sensitivity improved by 3-4%, specificity by 4-5%, and accuracy by 3-4% compared to the previous approach.
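
The adaptive-boost stage with a tree base can be sketched with scikit-learn as below; synthetic five-class data stands in for the cardiac feature set. Note that scikit-learn's AdaBoost requires base learners that accept sample weights, which decision trees do but plain KNN does not, so a KNN-based variant would need a custom boosting loop.

```python
# AdaBoost over shallow decision trees on synthetic 5-class "cardiac" data.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=13, n_informative=8,
                           n_classes=5, random_state=0)   # 1 normal + 4 diseases
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

boost = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=2),
                           n_estimators=200, random_state=0)
boost.fit(X_tr, y_tr)
print("test accuracy:", boost.score(X_te, y_te))
```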

Author 1: Pardeep Kumar
Author 2: Ankit Kumar

Keywords: Heart disease prediction; heart disease; machine learning; optimization; multi-objective features

PDF

Paper 104: SSEC: Semantic Segmentation and Ensemble Classification Framework for Static Hand Gesture Recognition using RGB-D Data

Abstract: Hand Gesture Recognition (HGR) refers to identifying the various hand postures used in Sign Language Recognition (SLR) and Human Computer Interaction (HCI) applications. Complex backgrounds in uncontrolled environmental conditions are the major challenge impacting the recognition accuracy of HGR systems. This can be effectively addressed by discarding the background using a suitable semantic segmentation method, which predicts the hand region pixels as foreground and the rest of the pixels as background. In this paper, we analyze and evaluate well-known semantic segmentation architectures for hand region segmentation using both RGB and depth data. Further, an ensemble of the segmented RGB and depth streams is used for hand gesture classification through probability score fusion. Experimental results show that the proposed novel framework of Semantic Segmentation and Ensemble Classification (SSEC) is suitable for static hand gesture recognition, achieving an F1-score of 88.91% on the OUHANDS test dataset.

Author 1: Dayananda Kumar NC
Author 2: K. V Suresh
Author 3: Chandrasekhar V
Author 4: Dinesh R

Keywords: Hand gesture recognition; semantic segmentation; ensemble classification; score fusion

PDF

Paper 105: COVID-19 Dataset Clustering based on K-Means and EM Algorithms

Abstract: In this paper, a COVID-19 dataset is analyzed using a combination of K-Means and Expectation-Maximization (EM) algorithms to cluster the data. The purpose of this method is to gain insight into and interpret the various components of the data. The study focuses on tracking the evolution of confirmed, death, and recovered cases from March to October 2020, using a two-dimensional dataset approach. K-Means is used to group the data into three categories: “Confirmed-Recovered”, “Confirmed-Death”, and “Recovered-Death”, and each category is modeled using a bivariate Gaussian density. The optimal value for k, which represents the number of groups, is determined using the Elbow method. The results indicate that the clusters generated by K-Means provide limited information, whereas the EM algorithm reveals the correlation between “Confirmed-Recovered”, “Confirmed-Death”, and “Recovered-Death”. The advantages of using the EM algorithm include stability in computation and improved clustering through the Gaussian Mixture Model (GMM).
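
The two stages can be reproduced in miniature with scikit-learn: K-Means plus the Elbow method for choosing k, then a Gaussian Mixture Model fitted by EM, whose covariance matrices expose the cross-variable correlation that K-Means ignores. The synthetic points below merely stand in for the (confirmed, recovered) pairs.

```python
# K-Means with the Elbow method, then an EM-fitted Gaussian Mixture Model.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal([200, 150], [30, 25], (100, 2)),
               rng.normal([800, 600], [60, 50], (100, 2)),
               rng.normal([1500, 900], [90, 70], (100, 2))])

# Elbow method: inertia drops sharply until the "true" k, then flattens.
for k in range(1, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(k, round(km.inertia_))

gmm = GaussianMixture(n_components=3, random_state=0).fit(X)  # EM under the hood
print("component means:\n", gmm.means_)
print("one component's covariance (captures cross-variable correlation):\n",
      gmm.covariances_[0])
```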

Author 1: Youssef Boutazart
Author 2: Hassan Satori
Author 3: Anselme R. Affane M
Author 4: Mohamed Hamidi
Author 5: Khaled Satori

Keywords: COVID-19; clustering; k-means; EM algorithm; GMM

PDF
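
A minimal sketch of the two clustering stages described above, run on synthetic two-dimensional data standing in for one of the paper's category pairs (e.g. "Confirmed-Recovered"): K-Means with inertia values for the Elbow method, then EM via a Gaussian Mixture Model whose per-component covariances expose the correlation that plain K-Means cannot. All values are illustrative.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.mixture import GaussianMixture

    # Synthetic 2-D data standing in for a (confirmed, recovered) pair.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(loc=m, scale=1.0, size=(100, 2))
                   for m in ([0, 0], [5, 5], [0, 6])])

    # Elbow method: watch where the inertia curve bends as k grows.
    for k in range(1, 7):
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
        print(k, round(km.inertia_, 1))

    # EM: each component is a bivariate Gaussian whose covariance matrix
    # carries the correlation information that plain K-Means discards.
    gmm = GaussianMixture(n_components=3, covariance_type="full",
                          random_state=0).fit(X)
    print("GMM covariances:\n", gmm.covariances_)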

Paper 106: Chicken Behavior Analysis for Surveillance in Poultry Farms

Abstract: Poultry farming is an important industry that provides food for a growing population. However, the welfare of the birds is a major concern, as poor living conditions lead to abnormal behavior that affects the health and productivity of the flock. To monitor and improve the welfare of the birds, it is important to have a surveillance system in place that tracks the behavior of the chickens and alerts farmers to potential issues. This paper reviews the current state of the art in behavior analysis for surveillance in poultry farms and discusses potential future directions for research in this area. It presents a computer-vision-based system that detects and monitors the behaviors of chickens in poultry farms. The system classifies three behaviors: eating, walking, and sleeping. It takes videos as input and then classifies the behavior of the chicken. The proposed system achieves an accuracy of 94.7% using the Light Gradient Boosting Machine on a collected dataset of chickens, and 98.4% accuracy on a benchmark Human Activity Recognition dataset.

Author 1: Abdallah Mohamed Mohialdin
Author 2: Abdullah Magdy Elbarrany
Author 3: Ayman Atia

Keywords: Chicken; poultry; abnormal; behavior; birds

PDF
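
A minimal sketch of the classification stage, assuming per-clip feature vectors have already been extracted from video. The synthetic features and the three behavior labels are stand-ins; the accuracies reported in the abstract come from the paper's real data, not from this toy example.

    import numpy as np
    from lightgbm import LGBMClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    # Synthetic per-clip features; 0 = eating, 1 = walking, 2 = sleeping.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(600, 16))
    y = rng.integers(0, 3, size=600)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    clf = LGBMClassifier(n_estimators=200, random_state=0)
    clf.fit(X_tr, y_tr)
    print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))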

Paper 107: Mammography Image Abnormalities Detection and Classification by Deep Learning with Extreme Learner

Abstract: Breast cancer has emerged as a leading killer of women worldwide in recent decades. Mammography is a useful tool for screening and detecting abnormalities. The primary factors in the early identification of breast cancer are the quality of the mammogram image and the radiologist's appraisal of it. The extensive use of deep learning (DL) and other image-processing technologies in recent times has greatly aided the categorization of breast cancer images. Image processing and classification methods may help detect breast cancer earlier, increasing the likelihood of a positive outcome from therapy and of survival. Image segmentation methods are applied to the datasets to highlight the region of interest, and the findings are then classified as malignant or benign. In an effort to minimize the mortality rate from breast cancer among females, this research seeks novel approaches to disease classification and detection, as well as new strategies for preventing the disease. To categorize the results correctly, the best possible feature optimization is carried out using deep learning technology. The proposed deep CNN (Convolutional Neural Network) is improved with two classification models, SVM (Support Vector Machine) and ELM (Extreme Learning Machine). In the proposed deep learning model, feature extraction with AlexNet is accomplished using the deep CNN. Subsequently, different parameters are fine-tuned to enhance accuracy with various optimizers and learning rates.

Author 1: Saruchi
Author 2: Jaspreet Singh

Keywords: Breast cancer; mammography; deep learning; CNN; extreme learning

PDF
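
A hedged sketch of the AlexNet feature extraction feeding a conventional classifier (the SVM branch; the ELM branch would be analogous). The preprocessing values are the standard ImageNet statistics, and `train_images`/`labels` are hypothetical placeholders, since the paper's mammography data is not reproduced here.

    import torch
    import torchvision.models as models
    import torchvision.transforms as T
    from sklearn.svm import SVC

    # AlexNet pretrained on ImageNet, with its final FC layer removed so it
    # emits 4096-dimensional feature vectors instead of class logits.
    alexnet = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
    alexnet.classifier = alexnet.classifier[:-1]
    alexnet.eval()

    preprocess = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor(),
                            T.Normalize(mean=[0.485, 0.456, 0.406],
                                        std=[0.229, 0.224, 0.225])])

    def extract_features(pil_images):
        """Return an (N, 4096) feature matrix for a list of PIL images."""
        batch = torch.stack([preprocess(img) for img in pil_images])
        with torch.no_grad():
            return alexnet(batch).numpy()

    # Hypothetical usage: `train_images` and `labels` would come from a
    # mammography dataset (benign vs. malignant).
    # svm = SVC(kernel="rbf").fit(extract_features(train_images), labels)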

Paper 108: Information Retrieval Method of Natural Resources Data based on Hash Algorithm

Abstract: To improve the ability to search and identify information in natural resource data, this paper puts forward an information retrieval method for natural resource data based on a hash algorithm. Data center technology is used to solve the problems of information source positioning, data directory organization, data semantic definition and expression, and data entity relationship construction in the natural resource data center. Combined with the distribution of resource data streams, the hash algorithm realizes information structure reorganization and data encryption in the data center, and the parameters of its information quality control model are established. Through data governance and semantic reconstruction, feature detection and redundancy arrangement of the information are realized by standardizing data collection rules, and the results are stored in an intermediate database. Data governance rules then structure and manage the information, which is stored in a publishing library. Various data processing tools process, clean, and reconstruct the data, and information detection in natural resource data is realized through the hash algorithm and data aggregation. Simulation results show that the method achieves a high precision rate for natural resource data retrieval.

Author 1: Qian Li

Keywords: Hash algorithm; natural resource data; information structure reorganization; search; data encryption

PDF
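
The redundancy-detection idea can be illustrated with a simple digest-keyed store: each record is hashed, duplicates are skipped, and unique records land in a stand-in for the intermediate database. This is a generic sketch of hash-based deduplication, not the paper's specific pipeline, and the record fields are invented for illustration.

    import hashlib

    def record_digest(record: dict) -> str:
        """Deterministic SHA-256 digest of a record's canonical form."""
        canonical = "|".join(f"{k}={record[k]}" for k in sorted(record))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    intermediate_db = {}  # digest -> record (stand-in intermediate database)

    def ingest(record: dict) -> bool:
        """Store a record unless an identical one already exists."""
        key = record_digest(record)
        if key in intermediate_db:
            return False  # redundancy detected: skip the duplicate
        intermediate_db[key] = record
        return True

    print(ingest({"region": "north", "resource": "water", "volume": 12.5}))  # True
    print(ingest({"volume": 12.5, "region": "north", "resource": "water"}))  # False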

Paper 109: Research on Identifying Stock Manipulation using GARCH Model

Abstract: The continuous rise of the economy and investors' demand for funds open a window for market manipulation, which includes abusing one's power to raise or lower securities prices and colluding to affect the price or volume of securities transactions at a pre-agreed time, price, and method. This study aims to create a sound investment environment, detect abnormal behaviors in stocks, and avoid the risks of intentional manipulation. It identifies market manipulation and summarizes the accuracy of GARCH model analysis with the help of fluctuation-forecast trend charts and a GARCH model that calculates the sum of the GARCH α parameter and the GARCH β parameter for the turnover rate, the logarithmic return rate, and trading-volume fluctuation. The study finds that stock market manipulation has the following characteristics: the participants are complex and diverse, the manipulation is opaque and has serious consequences, and it involves a wide range of aspects.

Author 1: Wen-Tsao Pan
Author 2: Wen-Bin Qian
Author 3: Ying He
Author 4: Zhi-Xiu Wang
Author 5: Wei Liu

Keywords: Stock prices; market; manipulation; GARCH model; stock exchange

PDF
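
A minimal sketch of the core computation, assuming the `arch` Python package: a GARCH(1,1) fit to a return series and the sum of the α and β parameters, whose closeness to 1 indicates volatility persistence. The simulated returns are a stand-in for the paper's turnover, log-return, and volume series.

    import numpy as np
    from arch import arch_model

    # Simulated percentage log returns standing in for real stock series.
    rng = np.random.default_rng(0)
    returns = 100 * rng.normal(scale=0.01, size=1000)

    res = arch_model(returns, vol="GARCH", p=1, q=1).fit(disp="off")
    persistence = res.params["alpha[1]"] + res.params["beta[1]"]
    print(f"alpha + beta = {persistence:.4f}")  # near 1 => persistent volatility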

Paper 110: Cloud Task Scheduling using the Squirrel Search Algorithm and Improved Genetic Algorithm

Abstract: With cloud computing, resources can be networked globally and shared easily between users. A range of heterogeneous needs is met on demand by software, hardware, storage, and networking. Dynamic resource allocation and load distribution pose challenges for cloud servers, so task scheduling plays a significant role in enhancing the performance of cloud computing. As the number of users and the capability of cloud computing grow, cloud data centers face increasing concerns about energy consumption. To use cloud resources energy-efficiently and provide real-time services to users, a viable cloud task scheduling solution is required. To address these problems, this paper proposes a new hybrid task scheduling algorithm for cloud environments based on squirrel search and an improved genetic algorithm. The proposed scheduling algorithm surpasses existing scheduling algorithms across multiple parameters, including makespan, energy consumption, and execution time.

Author 1: Qiuju DENG
Author 2: Ning WANG
Author 3: Yang LU

Keywords: Cloud computing; energy efficiency; task scheduling; genetic algorithm

PDF
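
A bare-bones sketch of the genetic-algorithm half of the hybrid, assigning tasks to virtual machines to minimize makespan. The squirrel-search component, the energy model, and the paper's specific GA improvements are omitted, and all sizes and rates are illustrative.

    import random

    random.seed(0)
    TASKS = [random.randint(5, 50) for _ in range(20)]  # hypothetical task lengths
    N_VMS = 4

    def makespan(assign):
        """Finish time of the busiest VM under a task-to-VM assignment."""
        load = [0] * N_VMS
        for length, vm in zip(TASKS, assign):
            load[vm] += length
        return max(load)

    def evolve(pop_size=30, generations=100, mut_rate=0.1):
        pop = [[random.randrange(N_VMS) for _ in TASKS] for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=makespan)
            elite = pop[:pop_size // 2]  # selection: keep the better half
            children = []
            while len(elite) + len(children) < pop_size:
                a, b = random.sample(elite, 2)
                cut = random.randrange(1, len(TASKS))  # one-point crossover
                child = a[:cut] + b[cut:]
                for i in range(len(child)):  # mutation: move a task to another VM
                    if random.random() < mut_rate:
                        child[i] = random.randrange(N_VMS)
                children.append(child)
            pop = elite + children
        return min(pop, key=makespan)

    best = evolve()
    print("best makespan:", makespan(best))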

Paper 111: Bisayan Dialect Short-time Fourier Transform Audio Recognition System using Convolutional and Recurrent Neural Network

Abstract: Speech is a form of oral communication that reinforces thoughts and ideas with general purpose and meaning. In the Philippines, Filipinos can speak at least three languages: English, Filipino, and a native language. According to the Philippine government, the country has more than 150 regional native languages, one of which is Cebuano. This research aims to implement automatic speech recognition (ASR) specifically for the Bisayan dialect, and the researchers use machine learning techniques to create and operate the system. In recent years, ASR has served its purpose not only in the official language of the Philippines but also in various foreign languages. The required datasets were collected throughout the study to train and build the models selected for the speech recognition engine. Audio files were recorded in waveform file format and contain Visayan phrases and sentences. Hours of recorded audio were captured and processed using the TensorFlow short-time Fourier transform (STFT) algorithm to ensure an accurate representation. To analyze the audio data, the recordings were converted to digital format, specifically .wav, ensuring that all records were uncorrupted, contained only one channel, and had a sample rate of 22,050 Hz. A data mining process was carried out by integrating CNN layers, dense layers, and RNNs to predict the transcription of the speech input, using multiple layers that determine the output of the speech data. The researchers used the JiWER Python library to evaluate the word error rate (WER); the trained scripted dataset contains at least 500 recordings totaling 61.78 minutes. Overall, the best WER output is 99.53%, and the percentage of records used is acceptable.

Author 1: Patrick D. Cerna
Author 2: Rhodessa J. Cascaro
Author 3: Khian Orland S. Juan
Author 4: Bon Jovi C. Montes
Author 5: Aldrei O. Caballero

Keywords: Bisayan dialect; speech recognition; dense layer; CNN; RNN

PDF
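
Two measurable pieces of the pipeline above can be sketched briefly, assuming a mono 22,050 Hz waveform: an STFT via TensorFlow's tf.signal and a WER score via the JiWER library. The Cebuano strings and frame sizes here are illustrative, not the paper's data.

    import numpy as np
    import tensorflow as tf
    import jiwer

    # Stand-in for one second of single-channel audio at 22,050 Hz.
    waveform = tf.constant(np.random.default_rng(0).normal(size=22050),
                           dtype=tf.float32)

    # Short-time Fourier transform; the frame sizes are illustrative choices.
    stft = tf.signal.stft(waveform, frame_length=1024, frame_step=256)
    spectrogram = tf.abs(stft)
    print("spectrogram shape:", spectrogram.shape)  # (frames, frequency_bins)

    # Word error rate between a reference transcript and a model hypothesis.
    reference = "maayong buntag kanimo"
    hypothesis = "maayong buntag nimo"
    print("WER:", jiwer.wer(reference, hypothesis))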
