Drug Repositioning for Coronavirus (COVID-19) using Different Deep Learning Architectures

In December 2019, the COVID-19 epidemic was identified in Wuhan, China, and hundreds of millions of people were soon infected. Several efforts have therefore been made to identify commercially available drugs that can be repurposed against COVID-19. Inferring potential drug indications through computational drug repositioning is an efficient method. The drug repositioning problem can be framed as a top-K recommendation task that presents the most likely drugs for specific diseases based on drug and disease-related data. Accurate prediction of drug-target interactions (DTI) is very important for drug repositioning. Deep learning (DL) models have recently shown promising DTI prediction performance, and encoder-decoder architectures can be used to build them. In this paper, a deep learning-based drug repositioning approach is proposed, which is composed of two experimental phases. The first phase trains and evaluates different deep learning encoder-decoder architectures using the benchmark DAVIS dataset; the trained models are evaluated with two metrics, the mean square error and the concordance index. The second phase uses the models trained in the first phase to predict antiviral drugs for COVID-19. In this phase, each model produces a predicted antiviral drug list, which is compared with a recently published antiviral drug list for COVID-19 using the concordance index. The overall experimental results of both phases showed that the three most accurate compound-encoder/protein-encoder architectures are Morgan/AAC, CNN/AAC, and CNN/CNN, with the best values for the mean square error, the first-phase concordance index, and the second-phase concordance index.


I. INTRODUCTION
Since December 2019, coronavirus disease (COVID-19) has become a critical public health issue across the world. There is a real need to develop antiviral drugs for COVID-19 to stop viral infections. Recent efforts have sought to design novel inhibitors or to use a drug repurposing strategy to identify anti-COVID-19 drugs that can serve as promising inhibitors of the coronavirus protease [1,2].
Drug discovery and development is a time-consuming, complicated, and costly task that includes the identification of candidates, synthesis, characterization, screening, assays for therapeutic efficacy, and clinical trials [3,4,5]. Moreover, drug development success rates are extremely low. In clinical phases, numerous investigational drugs have failed because of insufficient efficacy, safety concerns, or commercial reasons [6]. Drug repositioning, an alternative drug development strategy, seeks to identify novel uses for existing drugs and can decrease the risk and cost associated with developing new drugs [7,8]. Inferring potential drug indications via computational drug repositioning is an efficient method. The drug repositioning problem can be framed as a top-K recommendation task that presents the most probable drugs for certain diseases based on drug and disease-related data. For drug repositioning, it is crucial to accurately predict drug-target interactions (DTI), which describe the binding of compounds to protein targets [9]. The precise identification of molecular drug targets is essential for drug discovery and development [10,11] and is particularly important for discovering effective and safe treatments for new pathogens, such as SARS-CoV-2 [12].
Diverse problems in bioinformatics and cheminformatics [13], and more specifically in drug development and discovery [14], have been successfully solved using deep learning (DL) techniques [15,16]. Compared with traditional machine learning (ML) algorithms, DL algorithms designed to predict drug-target binding affinities (DTBA) sometimes perform better [17]. These DL-based DTBA prediction algorithms differ in two key aspects. The first aspect is the representation of the input data. The input drug features can include, for instance, the extended connectivity fingerprint (ECFP), the ligand maximum common substructure (LMCS), the simplified molecular input line entry system (SMILES), or a combination of these features. The second aspect is the DL system design, which can be built from various neural network (NN) types [18]. NN types differ in their construction, for example in the number of layers, the number of hidden units, the filter sizes, or the activation function used. Each type of NN has particular advantages that make it better suited to particular applications. DL models for predicting DTI often use encoder-decoder architectures [20]. The encoder converts ligand or protein representations into numerical vectors for training or evaluating the DL model. Many different encoder-decoder architectures are available for DTI prediction, but only a few have been explored in previous research. In this study, we propose a DL-based method for drug repositioning, which involves two experimental phases. The first phase trains and evaluates twenty-one different deep network architectures of compound encoders and protein encoders for drug repositioning using the benchmark DAVIS dataset.
The second phase predicts antiviral drug lists for COVID-19 using the twenty-one trained models and then compares them with a recently published antiviral drug list for COVID-19.
The rest of this paper is organized as follows. Section II provides an overview of the key scientific concepts, and Section III reviews the related work. Section IV describes the proposed approach, including the conceptual model, the system architecture, the benchmark DAVIS dataset, and the reference coronavirus antiviral drug list. Section V describes the tools and libraries used in the implementation and then presents the experimental evaluation through two main experiments: evaluating the trained models using the DAVIS dataset, and predicting antiviral drugs for COVID-19 using the trained models. Finally, Section VI concludes the paper.

II. BACKGROUND
DTIs play a very important role in the drug discovery process. DTI analysis identifies the interaction sites between protein targets and drug compounds and describes the attributes of those sites. DTI prediction aims to identify new ligands for specified protein targets. Many studies, including work on drug repositioning, have benefited from identifying DTIs [23,24]. Costly and time-consuming laboratory tests are required to determine the affinity values for a sizable number of drug-target combinations. Therefore, computational approaches have received more attention in recent years [25].
A crucial step in predicting DTIs is feature extraction (also called feature encoding) from the input data. Feature extraction obtains useful, discriminating, and non-trivial information from the input data to facilitate the subsequent learning phases. Fig. 1 illustrates the two types of feature encoding techniques: data-driven and non-data-driven. The main distinction between the two groups is that data-driven approaches learn features for each input automatically, whereas non-data-driven approaches calculate features in a fixed way for each input. Data-driven approaches are mainly based on deep learning, a set of machine learning algorithms, loosely modelled on the human visual system, that create new hierarchical feature representations [26]. A neural network with two or more hidden layers is deemed a deep network. The input layer receives the input features directly, while the output layer generates predictions after a series of non-linear transformations in the hidden layers. Each output node corresponds to a prediction task. If there is only one node in the output layer, the network is considered a single-task deep network; otherwise, it is known as a multi-task deep network [27].
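To make the non-data-driven case concrete, the following minimal sketch computes a fixed k-mer frequency vector for a protein sequence, in the spirit of the AAC encoder described later in the implementation setup. It is not the DeepPurpose implementation: the function names, the example sequence, and the per-k normalisation are assumptions made for illustration. The only point it demonstrates is that counting all k-mers for k up to three over the 20 standard amino acids yields 20 + 400 + 8000 = 8420 features, which matches the vector length reported for AAC.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def kmer_vocabulary(max_k=3):
    """All k-mers over the 20-letter alphabet for k = 1..max_k."""
    return ["".join(p)
            for k in range(1, max_k + 1)
            for p in product(AMINO_ACIDS, repeat=k)]

def kmer_frequencies(sequence, max_k=3):
    """Fixed-length vector of k-mer frequencies (hypothetical helper).

    Each k-mer count is divided by the number of windows of that length,
    which is an assumed normalisation, not necessarily the one used by
    any particular library.
    """
    vocab = kmer_vocabulary(max_k)
    index = {kmer: i for i, kmer in enumerate(vocab)}
    vector = [0.0] * len(vocab)
    for k in range(1, max_k + 1):
        windows = len(sequence) - k + 1
        if windows <= 0:
            continue
        for start in range(windows):
            kmer = sequence[start:start + k]
            if kmer in index:  # skip windows with non-standard residues
                vector[index[kmer]] += 1.0 / windows
    return vector

features = kmer_frequencies("MKTAYIAKQR")  # made-up toy sequence
print(len(features))
```

Because the vocabulary is fixed in advance, every protein maps to a vector of the same length regardless of sequence length, and no parameters are learned.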
The input feature space for a deep network can consist of a raw sequence, a derived atomic fingerprint, or a mixture of both. Numerous studies have fed a raw molecular sequence directly to the network. Other works convert the raw sequence to a more suitable form, such as one-hot coding, before feeding it to deep networks such as CNNs. In one-hot coding, each character in the sequence is represented by a binary vector in which the bit matching that character is set to one and all other bits are set to zero.
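One-hot coding as described above can be sketched in a few lines. The example below is illustrative only: it treats every character of a SMILES string as one symbol, which is a simplification (practical tokenizers usually treat multi-character atoms such as "Cl" or "Br" as single tokens), and the alphabet is built from the example string itself rather than from a dataset-wide vocabulary.

```python
def one_hot_encode(sequence, alphabet):
    """Return one binary vector per character of `sequence`."""
    position = {symbol: i for i, symbol in enumerate(alphabet)}
    encoded = []
    for symbol in sequence:
        vector = [0] * len(alphabet)
        vector[position[symbol]] = 1  # only the matching bit is set
        encoded.append(vector)
    return encoded

smiles = "CC(=O)O"                 # acetic acid, used as a toy input
alphabet = sorted(set(smiles))     # ['(', ')', '=', 'C', 'O']
matrix = one_hot_encode(smiles, alphabet)
for symbol, row in zip(smiles, matrix):
    print(symbol, row)
```

Each row contains exactly one non-zero entry, so the encoded sequence is a sparse matrix with one row per character and one column per alphabet symbol.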
There are many feature encoding architectures for DTI prediction, such as the Convolutional Neural Network (CNN) and the Message-Passing Neural Network (MPNN). Another example is the Transformer encoder [28], a multi-layered bidirectional encoder that follows the original Transformer model. The Transformer encoder is capable of modelling a sequence without the aid of a CNN or RNN. In contrast to these earlier sequence processing layers, the Transformer can efficiently encode the relationship between distant tokens (atoms) in a sequence. Owing to this effective context modelling, various Transformer-based NLP models surpass earlier techniques on many benchmarks.
To apply deep learning, molecular descriptors must be created from symbolic representations of molecules, such as the SMILES (Simplified Molecular Input Line Entry System) format. "Morgan fingerprints," a vector representation of molecular attributes, are a commonly adopted way to describe a molecule. The Morgan fingerprint, also known as the extended-connectivity fingerprint (ECFP) [29], is a frequently used algorithm, or encoder specification. ECFPs are a class of topological fingerprints for molecular characterization that were created for structure-activity modelling. ECFPs are circular fingerprints with a variety of useful properties: they can be calculated quickly; they are not predefined and can represent an essentially infinite number of different molecular features; and their features indicate the presence of specific substructures, which makes analysis results simpler to interpret.
The most popular deep neural network encoder is the CNN, which operates on grid-structured data such as digital images. As depicted in Fig. 2 [26], a CNN consists of multiple convolutional and pooling layers arranged in an arbitrary order. Convolutional layers learn a set of filters that extract local patterns from a specific receptive field of the layer input. The receptive field expands in subsequent convolutional layers. The pooling layer also expands the receptive field by down-sampling the layer's input, and it does not define any additional parameters. The CNN input can consist of one-dimensional or two-dimensional matrices that are scanned in one or two directions, respectively. To date, 1D CNNs have been applied to DNA sequences for classification, DNA-protein binding, and motif extraction, among other tasks [30,31,32].
Fig. 2 depicts how a one-dimensional convolutional layer is applied to a small-molecule sequence or a protein sequence. In Fig. 2A, a protein sequence is depicted as a sequence of amino acids stored in a matrix in which every amino acid is one-hot encoded (all bits are zero except the bit corresponding to the symbol, which is one). The filter is moved along the sequence, and its weights are learned by reducing the loss function on both positive and negative task samples. Fig. 2B depicts the SMILES sequence as a string of characters. Every character represents an atom or a structural indicator of the molecule. Every character is then one-hot encoded and placed in a matrix column. In both instances, the learned filter is displayed.
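The sliding-filter operation described above can be illustrated with a small pure-Python sketch. Everything here is an assumption chosen for readability: the three-letter alphabet, the toy sequence, and in particular the filter weights, which are set by hand so that the filter responds to the motif "KA". In a trained CNN these weights would be learned by minimising the loss, not specified manually.

```python
def one_hot(sequence, alphabet):
    """Encode each symbol as a binary vector over `alphabet`."""
    return [[1 if symbol == a else 0 for a in alphabet] for symbol in sequence]

def conv1d(encoded, kernel):
    """Slide `kernel` (width x channels) along `encoded` (length x channels)."""
    width = len(kernel)
    outputs = []
    for start in range(len(encoded) - width + 1):
        window = encoded[start:start + width]
        outputs.append(sum(w * x
                           for kernel_row, input_row in zip(kernel, window)
                           for w, x in zip(kernel_row, input_row)))
    return outputs

def max_pool1d(values, size):
    """Non-overlapping max pooling; it adds no trainable parameters."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

alphabet = "AKM"                   # toy alphabet, not the full 20 amino acids
encoded = one_hot("MKAKM", alphabet)
kernel = [[0, 1, 0],               # hand-set filter: responds to "K" ...
          [1, 0, 0]]               # ... immediately followed by "A"
activations = conv1d(encoded, kernel)
pooled = max_pool1d(activations, 2)
print(activations, pooled)
```

The activation is largest at the position where the window exactly matches the motif, and pooling keeps that peak while halving the sequence length, which is how pooling widens the receptive field of later layers.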

III. RELATED WORK
Deep learning methods are now widely used in many research fields, such as speech recognition [33,34,35] and image processing [36,37,38], as well as in bioinformatics, including genomics [39,40] and quantitative structure-activity relationship (QSAR) research in drug discovery [41]. The primary benefit of deep learning architectures is that they offer improved representations of raw data through non-linear transformations in every layer [42], which facilitates learning the hidden patterns in the data.
Several studies have used deep neural networks (DNNs) for binary-class DTI prediction with various input models for drugs and proteins [43,44,45], and a few have used stacked autoencoders [46] and deep belief networks [47]. Likewise, stacked autoencoder-based models using convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have been used to represent genomic and chemical structures as real-valued vectors [48,49]. Deep learning is also used for protein-ligand interaction scoring, which commonly relies on CNNs that learn from the three-dimensional structures of protein-ligand complexes [50,51,52,53].
Pahikkala et al. [54] used a method called KronRLS (Kronecker Regularized Least Squares), which requires only 2D chemical-structure similarity representations of the drugs and Smith-Waterman similarity representations of the targets. Later, the SimBoost approach was proposed to predict binding affinity scores with a gradient boosting machine, using feature engineering to represent DTIs [55]. It exploits similarity-based information on drug-target pairs and features derived from the pairs' network-based interactions. Both studies obtained similarity-based information from 2D representations of the compounds and used conventional machine learning algorithms.
An ensemble of deep learning models (EnsembleDLM) for DTI prediction was presented in [56]. EnsembleDLM uses a set of chemical compounds and proteins and aggregates the predictions from numerous deep neural networks. It performs well in cross-domain applications spanning various bioactivity types and protein classes. By using transfer learning, EnsembleDLM achieved good performance in terms of the Pearson correlation coefficient and the concordance index.
A new similarity-based strategy for constructing negative DTIs was presented in [57]. Multiple least absolute shrinkage and selection operator (LASSO) models were built to incorporate various collections of feature sets in order to investigate their predictive strength and predict DTIs. In addition, a LASSO Deep Neural Network (LASSO-DNN) model was developed to predict DTIs based on the features retrieved from the best-performing LASSO models. LASSO-DNN was compared with LASSO, standard logistic (SLG) regression, support vector machine (SVM), and conventional DNN models, and the experimental results showed that it outperformed all of them.
Wang et al. [58] proposed a three-step strategy for identifying unknown DTIs using deep learning. The first step is the representation of drug-target pairs, in which the drug compounds are encoded as fingerprint features, whereas the protein sequence features are produced by applying Legendre Moments (LM) to a position-specific scoring matrix (PSSM) that includes evolutionary protein information. The second step involves compression and fusion of the features, using sparse principal component analysis (SPCA) to reduce their dimension and redundancy. Finally, the prediction is performed with a deep long short-term memory (DeepLSTM) model. The experimental results indicated that the proposed technique outperforms other DTI prediction methods.
A DTI prediction model that takes 2D pairwise distance maps of proteins and molecular graphs of drugs as inputs was presented in [59]. To capture the interactive effects between targets and drugs, the authors proposed the mutual interaction neural network (MINN), which integrates two interacting transformers (Interformer, for short) with an enhanced communicative message passing neural network (CMPNN), named Inter-CMPNN.
In conclusion, the majority of computational approaches proposed for predicting drug-target interactions have concentrated on binary classification, where the primary objective is to determine whether or not a drug-target pair interacts. However, protein-ligand interactions span a continuum of binding affinity values, often referred to as binding strength. The increased availability of affinity information in drug-target knowledge bases (KBs) enables the application of sophisticated learning approaches, such as deep learning architectures, to predict binding affinities.

The main contribution of this research is to train and evaluate twenty-one different deep network architectures of compound encoders and protein encoders for drug repositioning. Fig. 3 shows the conceptual model of the proposed approach, which takes a benchmark dataset called DAVIS [21] and a reference coronavirus antiviral drug list [22] and outputs, for each trained deep learning model, a ranked list of candidate drugs for coronavirus. Twenty-one DL models were trained, one for each compound-encoder/protein-encoder combination.

Fig. 3. The proposed approach conceptual model.

This research work has two main objectives. The first is to train and evaluate models built from different deep learning compound encoders and protein encoders using the benchmark DAVIS dataset. The second is to predict antiviral drugs for COVID-19 using the trained models. To achieve these objectives, Fig. 4 shows the system architecture, which includes two phases: the training phase and the prediction phase. In the training phase, both the drug compounds and the disease target proteins of the DAVIS dataset are encoded using different neural network architectures, and their embeddings are then decoded to generate the trained model. For example, as shown in Fig. 4, the drug compound is encoded using the Transformer architecture, and the disease target protein is encoded using the CNN architecture.
Encoding is the procedure of converting ligand or protein representations into numerical vectors that are used to train or evaluate a machine learning model. Table I and Table II show the compound encoders and protein encoders used in this research, respectively. Cross-validation was applied to all DAVIS dataset instances. In the prediction phase, the proposed solution uses the trained model to predict a ranked list of candidate drugs for coronavirus. The predicted ranked list is evaluated against a recently published reference list of antiviral drugs for coronavirus.

A. Proposed Approach
Fig. 5 shows how the different trained models are generated and how the different deep learning architectures are used. For example, trained model 1 was generated by encoding drug compounds and disease target proteins using DNN and CNN, respectively. In the same way, trained model 2 was generated by encoding drug compounds and disease target proteins using DNN and Transformer, respectively. In this research, twenty-one combinations of neural network architectures were tested.

B. The Benchmark Dataset
The DAVIS dataset [21] contains selectivity assays of a set of kinase proteins and their relevant inhibitors, together with the corresponding dissociation constant (Kd) values. The compound SMILES strings of the DAVIS dataset were retrieved from the PubChem compound database using their PubChem CIDs [60]. The protein sequences were retrieved from the UniProt protein repository using gene names and RefSeq accession codes [61]. Table III shows the number of drug-target interaction instances.

C. The Reference Coronavirus Antiviral Drug List
Recently, the authors of [22] used virtual screening and molecular docking methods to identify prospective inhibitors among existing drugs that could act against COVID-19. Based on binding energy (kcal/mol), they ranked 121 drugs as potent against SARS-CoV-2 because these drugs bind tightly to its main protease. Table IV presents the list of drugs ranked by docking score, expressed as binding energy (BE).

A. Implementation Setup
In this research, the DeepPurpose DTI prediction tool was used. This tool provides a set of eight ligand encoders that can be combined with other protein representations and architectures to generate new models [62]. DeepPurpose also contains seven distinct protein encoders, which fall into two categories: expert-designed algorithms and neural network-based text processing. Among the algorithmic encoders, the AAC encoder produces a vector of 8420 elements describing the frequency of all amino acid k-mers for k values up to three [63], whereas the conjoint triad encoder offers a 3-mer frequency count using a reduced amino acid alphabet [64]. Neural network encoders, in contrast, operate directly on the sequence. The DeepPurpose CNN encoder transforms every amino acid into a numerical value, yielding a fixed-length one-dimensional array, and then employs a convolutional neural network [65] to learn spatial information from the sequence (local amino acid neighborhoods) that may be pertinent to the DTI binding model. Fig. 6 shows the different compound encoders and protein encoders provided by the DeepPurpose DTI prediction tool [60]. DeepPurpose enables developers to set some parameters for the model training phase. Table V shows the parameter settings for the NN architectures that were used during the model training phase.

1) Mean square error (MSE):
The mean square error measures the average squared difference between the predicted and the actual values. Equation 1 shows how the MSE is calculated.

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (P_i - Y_i)^2 \qquad (1)$$

where P is the vector of predictions, Y is the vector of actual outputs, and n represents the total number of samples.

2) Concordance index (CI):
The Concordance Index (CI) was used to evaluate the effectiveness of a model that outputs continuous values [17]. CI measures whether the predicted binding affinity values of two random drug-target pairs are in the same order as their true values. Equation 2 shows how the CI is calculated.

$$\mathrm{CI} = \frac{1}{Z} \sum_{\delta_i > \delta_j} h(b_i - b_j) \qquad (2)$$

where $b_i$ represents the prediction value for the greater affinity $\delta_i$, $b_j$ represents the prediction value for the lesser affinity $\delta_j$, Z is a normalization constant, and h(x) is the step function, which equals 1 for x > 0, 0.5 for x = 0, and 0 for x < 0.
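The two evaluation metrics described in this section can be sketched as follows. This is a minimal illustration, not the evaluation code used in the experiments: the function names and the toy affinity values are made up, the step function assigns 0.5 to tied predictions (a common convention, assumed here), and the normalization constant Z is taken to be the number of pairs whose true affinities differ.

```python
def mean_square_error(predictions, targets):
    """Average squared difference between predicted and actual values."""
    n = len(targets)
    return sum((p - y) ** 2 for p, y in zip(predictions, targets)) / n

def step(x):
    """Step function h(x); 0.5 for ties is an assumed convention."""
    if x > 0:
        return 1.0
    if x == 0:
        return 0.5
    return 0.0

def concordance_index(predictions, targets):
    """Fraction of pairs whose predicted order matches the true order."""
    total, pairs = 0.0, 0
    for i in range(len(targets)):
        for j in range(len(targets)):
            if targets[i] > targets[j]:      # pair with a larger true affinity
                total += step(predictions[i] - predictions[j])
                pairs += 1                   # Z: number of comparable pairs
    return total / pairs if pairs else float("nan")

# Toy values for illustration only; they are not results from the paper.
y_true = [5.0, 6.0, 7.0, 8.0]
y_pred = [5.5, 5.8, 7.4, 7.0]
mse = mean_square_error(y_pred, y_true)
ci = concordance_index(y_pred, y_true)
print(round(mse, 4), round(ci, 4))
```

A lower MSE indicates predictions closer to the measured affinities, while a CI of 1.0 means every comparable pair is ranked in the correct order and 0.5 corresponds to random ordering. The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch but would need a faster implementation on large test sets.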

C. Phase 1: Evaluating the Trained Models using the DAVIS Dataset
Twenty-one trained models, each based on a different deep learning compound-encoder/protein-encoder combination, were evaluated using the benchmark DAVIS dataset with cross-validation. Fig. 7 and Fig. 8 show the evaluation results for the twenty-one trained models in terms of the MSE (Mean Square Error) and CI (Concordance Index) evaluation metrics. The five best-performing DL compound-encoder/protein-encoder architectures are Morgan/Transformer, Morgan/AAC, CNN/AAC, Morgan/CNN, and CNN/CNN.

D. Phase 2: Predicting Antiviral Drugs for Covid-19 using the Trained Models
To evaluate the prediction accuracy of the trained models against the 121 ranked reference antiviral drugs, the Concordance Index was used to compare each predicted ranked list with the reference list. Fig. 9 shows the algorithm for calculating the Concordance Index of a predicted drug list against the COVID-19 drug reference list. Fig. 10 shows the Concordance Index of the twenty-one predicted drug lists against the COVID-19 drug reference list. As shown in Fig. 10, the five best-performing DL compound-encoder/protein-encoder architectures are Morgan/Conjoint_triad, Morgan/AAC, CNN/AAC, CNN/CNN, and CNN/CNN_RNN.
The overall experimental evaluation for the two experimental phases is summarized in Fig. 11, which shows the evaluation results using the CI metric for the twenty-one trained models in the model testing phase using the DAVIS dataset and in the prediction phase against the coronavirus drug reference list. As shown in Fig. 12, it is worth noting that the three most accurate deep learning compound-encoder/protein-encoder architectures are Morgan/AAC, CNN/AAC, and CNN/CNN, with the best values for the mean square error, the first-phase concordance index, and the second-phase concordance index.
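One way to compare a predicted ranked list with a reference ranked list is sketched below. The algorithm of Fig. 9 is not reproduced in the text, so this is only an assumed pairwise formulation: both lists are ordered from the most to the least promising drug, only drugs present in both lists are compared, and the score is the fraction of drug pairs that the two lists order in the same way. The function name and the placeholder drug names are hypothetical.

```python
from itertools import combinations

def ranking_concordance(predicted, reference):
    """Fraction of drug pairs that both rankings order the same way."""
    predicted_rank = {drug: rank for rank, drug in enumerate(predicted)}
    shared = [drug for drug in reference if drug in predicted_rank]
    concordant, pairs = 0, 0
    # combinations() keeps reference order: `higher` precedes `lower` there.
    for higher, lower in combinations(shared, 2):
        pairs += 1
        if predicted_rank[higher] < predicted_rank[lower]:
            concordant += 1
    return concordant / pairs if pairs else float("nan")

# Placeholder names, not drugs from the reference list.
reference = ["drug_A", "drug_B", "drug_C", "drug_D"]
predicted = ["drug_B", "drug_A", "drug_C", "drug_D"]
score = ranking_concordance(predicted, reference)
print(round(score, 4))
```

Because the comparison uses only the relative order of drugs, the predicted affinities and the reference binding energies do not need to be on the same scale.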

VI. CONCLUSION
This research proposed a deep learning-based drug repositioning method that trains and evaluates twenty-one models based on deep learning compound encoders and protein encoders. The trained models were evaluated in two experiments. The first tested the trained models by applying cross-validation on the benchmark DAVIS dataset. The second compared the antiviral drug lists predicted by the trained models with a recently published antiviral drug list for COVID-19. The experimental evaluation showed that the three most accurate deep learning compound-encoder/protein-encoder architectures are Morgan/AAC, CNN/AAC, and CNN/CNN, with the best values for the mean square error, the first-phase concordance index, and the second-phase concordance index. As future work, the same 21 deep network architectures of protein and compound encoders could be trained and evaluated using other datasets and other viral diseases.