Modified linear regression predicts drug-target interactions accurately

  • Krisztian Buza ,

    Contributed equally to this work with: Krisztian Buza, Ladislav Peška

    Roles Methodology, Software, Writing – original draft

    buza@biointelligence.hu

    Affiliations Faculty of Informatics, ELTE – Eötvös Loránd University, Budapest, Hungary, Center for the Study of Complexity, Babes-Bolyai University, Cluj Napoca, Romania

  • Ladislav Peška ,

    Contributed equally to this work with: Krisztian Buza, Ladislav Peška

    Roles Software, Writing – original draft

    Affiliation Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic

  • Júlia Koller

    Roles Writing – original draft

    Affiliation Institute of Genomic Medicine and Rare Disorders, Semmelweis University, Budapest, Hungary

Abstract

State-of-the-art approaches for the prediction of drug–target interactions (DTI) are based on various techniques, such as matrix factorisation, restricted Boltzmann machines, network-based inference and bipartite local models (BLM). In this paper, we propose the framework of Asymmetric Loss Models (ALM) which is more consistent with the underlying chemical reality compared with conventional regression techniques. Furthermore, we propose to use an asymmetric loss model with BLM to predict drug–target interactions accurately. We evaluate our approach on publicly available real-world drug–target interaction datasets. The results show that our approach outperforms state-of-the-art DTI techniques, including recent versions of BLM.

Introduction

When developing new drugs and identifying their side effects [1], pharmaceutical science relies on findings from related branches of science, including statistics and computer science. An essential step in this process is the identification of interactions between drugs and pharmacological targets. Although the existence of interactions can be reliably confirmed by in vitro binding assays, see e.g., [2–5], such methods are expensive and time-consuming [6]. In order to address this bottleneck, computational approaches have been designed and implemented to estimate the probability of interactions. The most promising candidates for in vitro experiments can then be selected based on in silico approaches.

The importance of drug–target interaction prediction is further emphasised by the costs of drug development. While estimates vary, they agree that it costs hundreds of millions of dollars to bring a new drug to the market, see e.g. [7] for an overview. Furthermore, the process may take more than 10 years in total.

Drug–target interaction prediction (DTI) techniques promise to reduce the aforementioned costs and time, and to support drug repositioning [8], i.e., the use of an existing medicine to treat a disease that has not been treated with that drug yet.

Drug repositioning is especially relevant for the treatment of rare diseases, including neurological disorders. While each rare disease affects only a few people, the number of rare diseases is so large that, in total, 6-8% of the entire population is affected by one of them. This results in a paradoxical situation: although a significant fraction of the population suffers from a rare disease, it is economically irrational to develop new drugs for many of these diseases. However, drug repositioning may potentially lead to breakthroughs in such cases.

In silico approaches for DTI include techniques based on docking simulations [9], ligand chemistry [10], text mining [11, 12] and machine learning. Text mining is inherently limited to the identification of entities and interactions that have already been documented, although the output of approaches based on text mining, i.e., the identified interactions, may serve as input data for other approaches, such as the ones based on machine learning. A serious limitation of docking simulations is that information about the three-dimensional structure of candidate drugs and targets is required. In many cases, e.g. for G-protein coupled receptors (GPCR) and ion channels, such information may not be available. Moreover, the performance of ligand-based approaches is known to decrease if only few ligands are known.

For the aforementioned reasons, state-of-the-art DTI techniques are based on machine learning [13–17]. Moreover, the increasing interest is also catalysed by the analogies between DTI and the well-studied recommendation tasks [18–20], which resulted in DTI approaches based on matrix factorisation [21–23]. Further recent DTI techniques are based on support vector regression [6], restricted Boltzmann machines [24], network-based inference [25, 26], decision lists [27], positive-unlabelled learning [16] and bipartite local models (BLM) [28]. Extensions of BLM include semi-supervised prediction [29], improved kernels [30], the incorporation of neighbour-based interaction-profiles [31] and hubness-aware regression [19].

Despite all the aforementioned efforts, accurate prediction of drug–target interactions remains a challenge. In this paper, we propose a new regression technique for accurate DTI predictions. We use a novel loss function that reflects the needs of drug–target interaction prediction better than widespread loss functions, such as mean squared error or logistic loss. Our generic framework of asymmetric loss models (ALM) works with various regressors. For simplicity, we instantiate ALM with linear regression, which leads to asymmetric loss linear regression (ALLR). We propose to use this new regressor in BLM for drug–target interaction prediction. Note that ALM is substantially different from the hubness-aware regressors that we used with BLM in our previous work [19]. As ALLR is a modified version of linear regression, we call our approach Drug–Target Interaction Prediction with Modified Linear Regression, or MOLIERE for short. We evaluate MOLIERE on publicly available real-world datasets and show that our approach outperforms state-of-the-art DTI techniques, including recent versions of BLM and the cases when conventional loss functions are used. Furthermore, we show that MOLIERE is able to predict medically relevant drug–target interactions that are not contained in the original datasets.

Materials and methods

Data

We used four publicly available real-world drug–target interaction datasets (Table 1), namely Enzyme, Ion Channel (IC), G-protein coupled receptors (GPCR), Nuclear Receptors (NR). The datasets are available at http://web.kuicr.kyoto-u.ac.jp/supp/yoshi/drugtarget/. These datasets have been used in various studies, such as [17, 19, 20, 22, 28, 29].

Table 1. Number of drugs, targets and interactions in the datasets used in our study.

https://doi.org/10.1371/journal.pone.0230726.t001

Each dataset contains an interaction matrix $M$ between drugs and targets, a drug–drug similarity matrix $S^d$ and a target–target similarity matrix $S^t$. Similarities between targets were determined by the Smith-Waterman algorithm, see [17, 32] for details. Chemical structure similarities between drugs were computed using the SIMCOMP algorithm [33].

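Each entry $m_{i,j}$ of the interaction matrix indicates whether the interaction between the $i$-th drug (denoted as $d_i$) and the $j$-th target (denoted as $t_j$) is known:

$$m_{i,j} = \begin{cases} +1 & \text{if the interaction between } d_i \text{ and } t_j \text{ is known,} \\ -1 & \text{otherwise.} \end{cases} \tag{1}$$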

Note that in case of these datasets, only the information about the presence of interactions is explicit; there is no explicit information about the absence of interactions. In particular, the semantics of $m_{i,j} = -1$ is that the corresponding drug $d_i$ and target $t_j$ may or may not interact. In fact, some of the drug–target pairs denoted as −1 actually interact; however, the interaction was unknown when these datasets were created, roughly 10 years ago. In order to allow for a fair comparison with other works in the literature, in our experiments reported in Tables 2–4, we used the original datasets without these “new” interactions.
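The ±1 encoding described above can be illustrated with a minimal sketch. The function name `build_interaction_matrix` and the toy index pairs are hypothetical and serve only as an illustration; they are not part of the published datasets or software.

```python
def build_interaction_matrix(n_drugs, n_targets, known_pairs):
    """Return an n_drugs x n_targets matrix as a list of lists.

    Entries are +1 for known interactions and -1 for pairs whose
    status is unknown (they may or may not interact).
    """
    m = [[-1] * n_targets for _ in range(n_drugs)]
    for i, j in known_pairs:
        m[i][j] = 1
    return m


# Toy example: 3 drugs, 2 targets, two known interactions.
m = build_interaction_matrix(3, 2, [(0, 1), (2, 0)])
```

Note that −1 does not encode a confirmed non-interaction here; it only marks the absence of a known one.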

Table 2. The performance of our approach, MOLIERE, compared with BLM and weighted profile (WP) with kd = kt = 5.

https://doi.org/10.1371/journal.pone.0230726.t002

Table 3. The performance of our approach, MOLIERE, compared with the cases when using standard linear regression or logistic regression instead of the proposed regression technique.

https://doi.org/10.1371/journal.pone.0230726.t003

Table 4. The performance of our approach, MOLIERE, compared with state-of-the-art DTI techniques.

The best results are highlighted using bold font, the symbol •/∘ denotes whether the difference compared with the best approach is statistically significant (•) or not (∘).

https://doi.org/10.1371/journal.pone.0230726.t004

In order to illustrate that our approach is indeed able to predict unknown interactions, we show that using the original data, we could predict many of those interactions that have been discovered in the meantime (Tables 5–7).

Table 5. Top 20 new interactions predicted by our approach, MOLIERE, on the Enzyme dataset.

As additional information, we provide whether the interaction is validated (★) or not (−), and if it was predicted by other DTI techniques.

https://doi.org/10.1371/journal.pone.0230726.t005

Table 6. Top 20 new interactions predicted by our approach, MOLIERE, on the GPCR dataset.

As additional information, we provide whether the interaction is validated (★) or not (−), and if it was predicted by other DTI techniques.

https://doi.org/10.1371/journal.pone.0230726.t006

Table 7. Top 20 new interactions predicted by our approach, MOLIERE, on the IC dataset.

As additional information, we provide whether the interaction is validated (★) or not (−), and if it was predicted by other DTI techniques.

https://doi.org/10.1371/journal.pone.0230726.t007

Problem formulation

We define the Drug–Target Interaction Prediction problem as follows. We are given a set of $n$ drugs, a set of $m$ pharmaceutical targets, an $n \times n$ drug similarity matrix $S^d$, an $m \times m$ target similarity matrix $S^t$ and an $n \times m$ interaction matrix $M$. For some of the drug–target pairs, the presence or absence of interaction is unknown (or simulated to be unknown in order to evaluate our approach). The task is to predict the likelihood of interaction for these unknown pairs.

At first glance, the above DTI problem seems to be similar to the problems considered in the recommender systems community. Note, however, that most recommender techniques consider only the interactions (“ratings”) because even a few ratings are thought to be more informative than metadata, such as users’ similarity based on their demographic information [34]. In contrast, drug–drug and target–target similarities play an essential role in DTI.

Bipartite local models

BLM considers DTI as a link prediction problem in bipartite graphs [28]. The vertices in one of the vertex classes correspond to drugs, whereas the vertices in the other vertex class correspond to targets. There is an edge $e_{i,j}$ between drug $d_i$ and target $t_j$ if and only if $m_{i,j} = 1$.

The likelihood of unknown interactions is predicted as follows: we consider an unknown pair $u_{i,j} = (d_i, t_j)$ and calculate the likelihood of interaction as the aggregate of two independent predictions.

The first prediction, called drug-centric prediction (Fig 1, left panel), is based on the relations between $d_i$ and the targets. Each target $t_k$ (except $t_j$) is labelled as “+1” or “−1” depending on $m_{i,k}$. Then a model is trained to distinguish “+1”-labelled and “−1”-labelled targets. Subsequently, this model is applied to predict the likelihood of interaction for the unknown pair $u_{i,j}$. This first prediction is denoted by $\hat{y}^{(d)}_{i,j}$. (When describing BLM, in accordance with our data, we assumed that only the information about the presence of an interaction is explicit, and therefore we train the model to distinguish known interacting pairs from pairs with unknown status. In contrast, if both known interacting and known non-interacting drug–target pairs are given, one may train the model using only the known interacting and known non-interacting pairs.)

Fig 1. Predictions with BLM.

Two predictions are calculated for the likelihood of each unknown interaction, i.e., for the presence of an edge $e_{i,j}$. When calculating the first (second, respectively) prediction, targets (drugs, respectively) are labelled, and a local model is trained using these labels. Subsequently, the local model is used to predict the likelihood of the interaction between $d_i$ and $t_j$.

https://doi.org/10.1371/journal.pone.0230726.g001

The second prediction, called target-centric prediction and denoted by $\hat{y}^{(t)}_{i,j}$, is obtained similarly, but instead of considering the interactions of drug $d_i$ and labelling the targets, the interactions of target $t_j$ are considered and drugs are labelled (Fig 1, right panel). The models that make the first and second predictions are called drug-centric and target-centric local models.

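In order to obtain the final prediction of BLM, we average the predictions of the aforementioned local models:

$$\hat{y}_{i,j} = \frac{\hat{y}^{(d)}_{i,j} + \hat{y}^{(t)}_{i,j}}{2} \tag{2}$$

where $\hat{y}^{(d)}_{i,j}$ and $\hat{y}^{(t)}_{i,j}$ are the drug-centric and target-centric predictions for the pair $(d_i, t_j)$.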

Note that instead of averaging, other aggregation functions, such as minimum or maximum are possible as well. According to our observations, our approach achieves most accurate results when the two predictions are averaged. However, the effect of the aggregation function can be considered as minor: when we repeated our experiments reported in Table 4 with min and max aggregation functions, we observed that our approach consistently outperformed its competitors for all the three aggregation functions. For example, on the GPCR dataset, our approach achieved an AUPR of 0.737 and 0.730 using min and max respectively, whereas we obtained an AUPR of 0.753 in case of averaging the two predictions.
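The three aggregation functions mentioned above (average, minimum, maximum) can be sketched as follows. The function name `aggregate` and its `how` parameter are hypothetical; the example scores are made up for illustration and are not results from the paper.

```python
def aggregate(drug_centric, target_centric, how="mean"):
    """Combine the drug-centric and target-centric scores of one pair."""
    if how == "mean":
        return (drug_centric + target_centric) / 2.0
    if how == "min":
        return min(drug_centric, target_centric)
    if how == "max":
        return max(drug_centric, target_centric)
    raise ValueError("unknown aggregation function: %r" % (how,))


# Made-up local-model scores for a single drug-target pair.
score_mean = aggregate(0.4, -0.2)
score_min = aggregate(0.4, -0.2, how="min")
score_max = aggregate(0.4, -0.2, how="max")
```

Averaging lets both local models contribute to the ranking, whereas min and max let a single local model decide the score of each pair.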

BLM is a generic framework in which various regressors or classifiers can be used as local models. For example, Bleakley and Yamanishi [28] used support vector machines with a domain-specific kernel, whereas Buza and Peška used a hubness-aware regressor [19]. In our current work, we use BLM with asymmetric loss linear regression which will be described in the next section.

Asymmetric loss models

Local models are the heart of BLM. Next, we propose a new regression technique that we use as a local model.

Given a regression model $f_\theta$, where $\theta$ is the vector of parameters, $f_\theta$ estimates the value of the target $y$ for an instance $x$ as $\hat{y} = f_\theta(x)$. In order to determine the appropriate parameter values $\theta^*$, a loss function $L_D(\theta)$ is usually minimised:

$$\theta^* = \arg\min_{\theta} L_D(\theta) \tag{3}$$

Note that the actual value of $L_D(\theta)$ depends both on the dataset $D$ and the parameters $\theta$. However, once the dataset is fixed, in particular, while the model is being trained using a given training dataset $D$, the loss can be seen as a function of the parameter vector $\theta$. Therefore, we aim at finding parameters $\theta^*$ that minimise the loss. A widespread loss function is the mean squared error:

$$\mathrm{MSE}_D(\theta) = \frac{1}{|D|} \sum_{(x,y) \in D} \left( f_\theta(x) - y \right)^2 \tag{4}$$

where $|D|$ is the number of instances in $D$.

While the mean squared error is popular, we argue that in case of DTI, it is not fully consistent with the underlying chemical reality. In particular, binding energy may be different for various interactions. Consequently, in case of the presence of an interaction ($y = +1$), we should not penalise a model that predicts a score that is higher than $+1$. Similarly, in case of an unknown interaction ($y = -1$), we do not want to penalise a model that predicts a score that is lower than $-1$. Therefore, we propose an asymmetric loss function. First, we define the error of the model $f_\theta$ for a single prediction $f_\theta(x)$, for instance $x$ with label $y$, as

$$\mathrm{err}(f_\theta, x, y) = \begin{cases} 0 & \text{if } y = +1 \text{ and } f_\theta(x) \geq +1, \\ 0 & \text{if } y = -1 \text{ and } f_\theta(x) \leq -1, \\ \left( f_\theta(x) - y \right)^2 & \text{otherwise.} \end{cases} \tag{5}$$

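We define the mean asymmetric loss (MAL) as the mean of the above errors over all instances of the dataset $D$:

$$\mathrm{MAL}_D(\theta) = \frac{1}{|D|} \sum_{(x,y) \in D} \mathrm{err}(f_\theta, x, y) \tag{6}$$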
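As a minimal sketch of the error and loss described above: predictions beyond the label on the "correct" side incur no penalty. The squared penalty in the remaining case is an assumption chosen to stay consistent with the mean squared error that this loss modifies; the function names are hypothetical.

```python
def asymmetric_error(prediction, label):
    """Error of a single prediction for a label in {+1, -1}."""
    if label == 1 and prediction >= 1.0:
        return 0.0  # interaction present and score at least +1: no penalty
    if label == -1 and prediction <= -1.0:
        return 0.0  # unknown status and score at most -1: no penalty
    # Assumed squared penalty, mirroring mean squared error.
    return (prediction - label) ** 2


def mean_asymmetric_loss(predictions, labels):
    """Mean of the asymmetric errors over a dataset."""
    errors = [asymmetric_error(p, y) for p, y in zip(predictions, labels)]
    return sum(errors) / len(errors)


loss = mean_asymmetric_loss([1.7, 0.5, -2.0, 0.0], [1, 1, -1, -1])
```

In this toy example, only the second and fourth predictions contribute to the loss; under mean squared error, the overshooting predictions 1.7 and −2.0 would be penalised as well.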

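The above loss can be minimised with various optimisation techniques ranging from gradient-based methods to more advanced approaches, see e.g. [35]. For simplicity, we decided to use gradient descent. The partial derivative of $\mathrm{MAL}_D(\theta)$ with respect to a parameter $\theta_i$ is:

$$\frac{\partial\, \mathrm{MAL}_D(\theta)}{\partial \theta_i} = \frac{1}{|D|} \sum_{(x,y) \in D} \frac{\partial\, \mathrm{err}(f_\theta, x, y)}{\partial \theta_i} \tag{7}$$

where

$$\frac{\partial\, \mathrm{err}(f_\theta, x, y)}{\partial \theta_i} = \begin{cases} 0 & \text{if } y = +1 \text{ and } f_\theta(x) \geq +1, \\ 0 & \text{if } y = -1 \text{ and } f_\theta(x) \leq -1, \\ 2 \left( f_\theta(x) - y \right) \dfrac{\partial f_\theta(x)}{\partial \theta_i} & \text{otherwise.} \end{cases} \tag{8}$$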

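In case of linear regression where $x = (x_1, \ldots, x_k)$, $\theta = \{w_0, w_1, \ldots, w_k\}$, and the model is $f_\theta(x) = w_0 + \sum_{i=1}^{k} w_i x_i$, the partial derivatives of $\mathrm{err}(f_\theta, x, y)$ with respect to $w_i$, $1 \leq i \leq k$, are

$$\frac{\partial\, \mathrm{err}(f_\theta, x, y)}{\partial w_i} = \begin{cases} 0 & \text{if } y = +1 \text{ and } f_\theta(x) \geq +1, \\ 0 & \text{if } y = -1 \text{ and } f_\theta(x) \leq -1, \\ 2 \left( f_\theta(x) - y \right) x_i & \text{otherwise,} \end{cases} \tag{9}$$

while the partial derivative with respect to $w_0$ is

$$\frac{\partial\, \mathrm{err}(f_\theta, x, y)}{\partial w_0} = \begin{cases} 0 & \text{if } y = +1 \text{ and } f_\theta(x) \geq +1, \\ 0 & \text{if } y = -1 \text{ and } f_\theta(x) \leq -1, \\ 2 \left( f_\theta(x) - y \right) & \text{otherwise.} \end{cases} \tag{10}$$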

We propose to use stochastic gradient descent to optimise $\mathrm{MAL}_D$. The pseudocode of the resulting asymmetric loss linear regression (ALLR) is shown in Fig 2.

Fig 2. Pseudocode of asymmetric loss linear regression (ALLR).

https://doi.org/10.1371/journal.pone.0230726.g002
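Since the pseudocode of Fig 2 is not reproduced in the text, the following is only a sketch of how stochastic gradient descent on the asymmetric loss of a linear model might look, not the authors' exact procedure. It assumes a squared penalty outside the no-penalty zone and interprets σ as the standard deviation of a Gaussian weight initialisation; both are assumptions, and the names `predict` and `train_allr` are hypothetical.

```python
import random


def predict(weights, x):
    """Linear model: weights[0] is the bias w0, weights[1:] are w1..wk."""
    return weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))


def train_allr(X, y, eta=1e-3, epochs=100, sigma=1e-8, seed=0):
    """Sketch of asymmetric loss linear regression trained with SGD."""
    rng = random.Random(seed)
    k = len(X[0])
    # Assumption: small random Gaussian initialisation with std sigma.
    weights = [rng.gauss(0.0, sigma) for _ in range(k + 1)]
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for idx in order:
            x, label = X[idx], y[idx]
            pred = predict(weights, x)
            # No-penalty zone: the gradient is zero, so skip the update.
            if (label == 1 and pred >= 1.0) or (label == -1 and pred <= -1.0):
                continue
            grad = 2.0 * (pred - label)  # derivative of the squared penalty
            weights[0] -= eta * grad
            for i in range(k):
                weights[i + 1] -= eta * grad * x[i]
    return weights


# Toy, linearly separable data; a larger learning rate than in the
# paper is used so that the tiny example converges quickly.
X_toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
y_toy = [1, 1, -1, -1]
w = train_allr(X_toy, y_toy, eta=0.05, epochs=200)
```

Skipping the update inside the no-penalty zone is what distinguishes this loop from plain least-squares SGD: instances that are already scored beyond their label no longer pull the weights back.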

Weighted profile

One of the shortcomings of the BLM approach is that it does not handle the case of new drugs/targets. By a new drug (or new target, respectively), we mean a drug $d$ (target $t$) that does not have any known interaction in the (training) data. In such cases, BLM labels all targets (drugs) as “−1”; consequently, no reasonable local model can be learned. In order to alleviate this problem, we use the weighted profiles [17] of the most similar drugs/targets to obtain predictions for new drugs/targets.

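Given a new drug $d_i$ and a target $t_j$, we predict the likelihood of the interaction between $d_i$ and $t_j$ as follows:

$$\hat{y}_{i,j} = \frac{\sum_{l \in \mathcal{N}_d(i)} s^d_{i,l} \, m_{l,j}}{\sum_{l \in \mathcal{N}_d(i)} s^d_{i,l}} \tag{11}$$

where $\mathcal{N}_d(i)$ denotes the set of indices of the $k_d$ most similar drugs to $d_i$ (not including $d_i$ itself) based on the drug–drug similarities, and $s^d_{i,l}$ is the similarity between drugs $d_i$ and $d_l$.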

The intuition behind Eq (11) is that similar drugs are likely to behave similarly in terms of their interaction with a given target. Therefore, drugs are weighted according to their similarity to the new drug $d_i$ and we calculate the weighted average of the known interactions of other drugs with the same target.

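The case of new targets is analogous. Given a new target $t_j$ and a drug $d_i$, the weighted profile approach can be used to calculate the prediction for the likelihood of the interaction between $d_i$ and $t_j$ as follows:

$$\hat{y}_{i,j} = \frac{\sum_{l \in \mathcal{N}_t(j)} s^t_{j,l} \, m_{i,l}}{\sum_{l \in \mathcal{N}_t(j)} s^t_{j,l}} \tag{12}$$

where $\mathcal{N}_t(j)$ denotes the set of indices of the $k_t$ most similar targets to $t_j$ (not including $t_j$ itself) based on the target–target similarities, and $s^t_{j,l}$ is the similarity between targets $t_j$ and $t_l$.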

Although the weighted profile approach is more general than BLM, in the sense that it can be used for new drugs/targets as well, the predictions of the weighted profile approach are usually less accurate than the predictions of BLM. Therefore, we use the weighted profile approach instead of BLM only in case of new drugs and targets.
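The weighted profile prediction for a new drug can be sketched as a similarity-weighted average over the most similar other drugs. The function name, the toy matrices and the fallback for the case when all similarities are zero are assumptions made for illustration only.

```python
def weighted_profile_drug(i, j, drug_sim, m, k=5):
    """Score for (drug i, target j) from the k drugs most similar to drug i."""
    others = [d for d in range(len(drug_sim)) if d != i]
    neighbours = sorted(others, key=lambda d: drug_sim[i][d], reverse=True)[:k]
    total = sum(drug_sim[i][d] for d in neighbours)
    if total == 0:
        # Hypothetical fallback: no similar drug, treat the pair as unknown.
        return -1.0
    return sum(drug_sim[i][d] * m[d][j] for d in neighbours) / total


# Toy data: 3 drugs, 2 targets; drug 0 plays the role of the new drug.
drug_sim = [[1.0, 0.8, 0.2],
            [0.8, 1.0, 0.5],
            [0.2, 0.5, 1.0]]
m = [[-1, -1],
     [1, -1],
     [-1, 1]]
score_t0 = weighted_profile_drug(0, 0, drug_sim, m, k=2)
score_t1 = weighted_profile_drug(0, 1, drug_sim, m, k=2)
```

In the toy data, drug 0 is most similar to drug 1, which interacts with target 0, so the score for target 0 is higher than that for target 1. The case of new targets follows by swapping the roles of drugs and targets.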

Our approach

We summarise our approach as follows. We use BLM for drug–target interaction prediction with the proposed asymmetric loss linear regression as local model in cases when the corresponding drug (target) has at least one known interaction and therefore the local model has at least one positive training instance. When initialising the parameters of ALLR, we use $\sigma = 10^{-8}$. We train each ALLR model with a learning rate $\eta = 10^{-3}$ for $e = 100$ epochs. According to our observations, ALLR is robust in the sense that the aforementioned settings allowed ALLR to converge to a model that outperformed other DTI techniques on all the examined datasets (see Table 4).

While predicting the interaction score between drug d and target t with ALLR, we represent each drug (target) as a vector of its similarities to all the drugs (targets) and its interactions, except the interactions with t (or d respectively), because the interactions with d (t) serve as labels for the local models, see Fig 3 for an illustration.

Fig 3. Representation of drugs and targets and labels of local models.

In this example, the prediction is made for the interaction denoted by the question mark. Similarities with all drugs (targets, respectively) and interactions with all the targets (drugs), except the interactions with the target (drug) corresponding to the question mark, are used as features. The interactions with the target (drug) corresponding to the question mark are used as labels of the local models. Tables on the right represent the data used by the local model, i.e., ALLR in our case.

https://doi.org/10.1371/journal.pone.0230726.g003
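The representation described above can be sketched for the drug-centric local model: each target is described by its similarities to all targets and by its interactions with all drugs except the query drug, whose interactions serve as labels. The function name and the toy data are hypothetical.

```python
def drug_centric_training_data(i, j, m, target_sim):
    """Build features and labels for the local model of drug i.

    Returns (X, y, query): training features and labels for all targets
    except target j, and the feature vector of target j itself.
    """
    n_drugs, n_targets = len(m), len(m[0])
    X, y, query = [], [], None
    for k in range(n_targets):
        # Similarities to all targets plus interactions with the other drugs.
        features = list(target_sim[k]) + [m[r][k] for r in range(n_drugs) if r != i]
        if k == j:
            query = features      # the pair to be predicted
        else:
            X.append(features)
            y.append(m[i][k])     # interaction with drug i is the label
    return X, y, query


# Toy data: 2 drugs, 3 targets; predict the pair (drug 0, target 2).
m = [[1, -1, -1],
     [-1, 1, -1]]
target_sim = [[1.0, 0.3, 0.1],
              [0.3, 1.0, 0.6],
              [0.1, 0.6, 1.0]]
X, y, query = drug_centric_training_data(0, 2, m, target_sim)
```

The target-centric local model can be built analogously by swapping the roles of drugs and targets.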

In case of new drugs (targets), we predict the likelihood of interactions using the weighted profile approach with $k_d = k_t = 5$.

Results

Comparison with baselines

As our approach, MOLIERE, is based on BLM and uses the weighted profile (WP) approach for new drugs and targets, we first compared the performance of MOLIERE to that of the original BLM and WP according to the widely used leave-one-interaction-out cross-validation protocol, see e.g. [28, 30, 31].

The predictions were evaluated both in terms of Area Under ROC Curve (AUC) and Area Under Precision-Recall Curve (AUPR). Table 2 shows that MOLIERE clearly outperforms both BLM and WP, both in terms of AUC and AUPR.
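For readers unfamiliar with the two measures, the sketch below computes AUC as the fraction of correctly ordered positive/negative pairs, and average precision as one common estimator of the area under the precision-recall curve. Whether the reported AUPR values were computed with this estimator or with another interpolation is not stated in the text, so this choice is an assumption; tied scores are not treated specially in `average_precision`.

```python
def auc(scores, labels):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l != 1]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of the precision values at the rank of each positive instance."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, -1, 1, -1]
toy_auc = auc(toy_scores, toy_labels)
toy_ap = average_precision(toy_scores, toy_labels)
```

Because known interactions are rare compared with unknown pairs, precision-recall based measures are typically more sensitive to the quality of the top-ranked predictions than AUC.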

As the proposed asymmetric loss linear regression is a key component of MOLIERE, we examined the performance in case of alternative regression techniques, in particular we examined the cases when we use (a) standard linear regression and (b) logistic regression instead of ALLR. As expected, the proposed asymmetric loss linear regression indeed improves the quality of predictions, see Table 3.

Comparison with recent DTI techniques

We compared MOLIERE with state-of-the-art drug–target interaction prediction techniques: two recent versions of BLM and three further prominent DTI approaches. The former include BLM with neighbour-based interaction-profile inferring (BLM-NII) [31] and hubness-aware regressors as local models (HLM) [19], while the latter refer to net Laplacian regularized least squares (NetLapRLS) [29], a combination of weighted nearest neighbour and Gaussian interaction profile kernels (WNN-GIP) [36], and Bayesian Ranking for Drug–Target Interaction Prediction (BRDTI) [20].

Pahikkala et al. [37] pointed out that leave-one-out cross-validation may lead to overoptimistic results. Therefore, in this section, we used the interaction-based 5 × 5-fold cross-validation protocol, i.e., 5-fold cross-validation is repeated 5 times with different initial data splits. In each of the 5 × 5 rounds of the cross-validation, one fifth of the drug–target pairs were in the test data, and AUC and AUPR values were calculated. The reported results are averaged values. In order to judge whether the observed differences are statistically significant, we used a paired t-test with a significance threshold of p = 0.01.
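The interaction-based 5 × 5-fold protocol can be sketched as follows: all drug–target pairs are shuffled, split into five folds, and the whole procedure is repeated five times with different shuffles. Setting the test pairs to −1 in the training matrix is our reading of "simulated to be unknown"; the function names and the seeding scheme are hypothetical.

```python
import random


def repeated_kfold_pairs(n_drugs, n_targets, folds=5, repeats=5, seed=0):
    """Yield (repeat, fold, test_pairs) for interaction-based cross-validation."""
    pairs = [(i, j) for i in range(n_drugs) for j in range(n_targets)]
    rng = random.Random(seed)
    for r in range(repeats):
        shuffled = pairs[:]
        rng.shuffle(shuffled)          # a different initial split per repeat
        for f in range(folds):
            yield r, f, shuffled[f::folds]


def mask_test_pairs(m, test_pairs):
    """Return a copy of m in which the test pairs are marked as unknown (-1)."""
    masked = [row[:] for row in m]
    for i, j in test_pairs:
        masked[i][j] = -1
    return masked


rounds = list(repeated_kfold_pairs(4, 5))
toy_m = [[1] * 5 for _ in range(4)]
train_m = mask_test_pairs(toy_m, rounds[0][2])
```

AUC and AUPR would then be computed on the test pairs of each of the 25 rounds and averaged.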

Essential hyperparameters of BLM-NII, HLM, WNN-GIP, NetLapRLS and BRDTI were learned via grid-search in internal 5-fold cross-validation on the training data. For other hyperparameters that are not expected to have major impact on the results, we used default values according to the publication in which the approach was published.

In particular, for BLM-NII, as proposed by Mei et al. [31], the max function was used to generate final predictions and the weight α for the combination of structural and collaborative similarities was learned from {0.0, 0.1, …, 1.0}.

In case of HLM, according to [19], we performed experiments with N = 25 base prediction models, while the number of nearest neighbours for the local model ECkNN and the number of selected features, were learned from {3, 5, 7} and {10, 20, 50} respectively.

The decay hyperparameter of WNN-GIP was learned from {0.1, 0.2, …, 1.0} and the weight α for combination of structural and collaborative similarities was learned from {0.0, 0.1, …, 1.0}.

The hyperparameters of NetLapRLS ($\beta = \beta_{drug} = \beta_{target}$ and $\gamma = \gamma_{drug} = \gamma_{target}$) were learned from $\{10^{-6}, 10^{-5}, \ldots, 10^{2}\}$.

The content regularisation $\lambda_c$ of BRDTI was learned from {0.1, 0.5, 0.9, 1.5}. The number of latent factors $f$, the number of iterations, the global regularisation $\lambda_g$ and the initial learning rate $\eta$ were set to 100, 50, 0.01 and 0.1 respectively.

The results are shown in Table 4 and Fig 4; the latter shows the precision-recall curves for MOLIERE and its competitors. Our approach, MOLIERE, outperforms all the examined competitors on the Enzyme, Ion Channel, GPCR and NR datasets, both in terms of AUC and AUPR. In the vast majority of cases, the difference is statistically significant.

Fig 4. Precision-recall (PR) curves (averaged over the 5 × 5 folds of the cross-validation).

As one can see, our approach, MOLIERE consistently outperforms its competitors: the PR-curve of MOLIERE is consistently above the PR-curves of its competitors.

https://doi.org/10.1371/journal.pone.0230726.g004

The results indicate that our approach, MOLIERE, is the overall best-performing approach out of the examined DTI techniques.

Prediction of new interactions

In order to demonstrate that our approach is able to predict new interactions, we followed the same protocol as in [19], i.e., we trained our approach, MOLIERE, as well as its competitors, BLM-NII, HLM, NetLapRLS and WNN-GIP, using all the interactions of the original datasets. As mentioned before, these datasets were extracted roughly a decade ago and several additional interactions have been validated since then. Our experiment aims to check whether these recently validated interactions can be predicted based on the original interactions.

In particular, we considered those drug–target pairs that have unknown interaction status in the original datasets. We ranked these pairs according to their predicted interaction scores. For simplicity, we use the term predicted interaction for the top-ranked 20 drug–target pairs. We say that a predicted interaction is validated if it is included in the current version of KEGG [38], DrugBank [39] or Matador [40].
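The ranking step described above can be sketched as follows: only pairs with unknown status in the original data are candidates, and the 20 highest-scoring ones are reported as predicted interactions. The function name and the toy scores are hypothetical; validation against KEGG, DrugBank or Matador is a manual or database lookup step that is not modelled here.

```python
def top_predicted_interactions(scores, m, top=20):
    """Return the `top` highest-scoring (drug, target) pairs with m[i][j] == -1."""
    candidates = [(scores[i][j], i, j)
                  for i in range(len(m))
                  for j in range(len(m[0]))
                  if m[i][j] == -1]
    candidates.sort(key=lambda c: c[0], reverse=True)
    return [(i, j) for _, i, j in candidates[:top]]


# Toy example with made-up scores: pair (0, 0) is already known and is
# therefore excluded even though it has the highest score.
toy_scores = [[0.9, 0.2],
              [0.4, 0.7]]
toy_m = [[1, -1],
         [-1, -1]]
predicted = top_predicted_interactions(toy_scores, toy_m, top=2)
```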

The results are shown in Tables 5–7 for the Enzyme, GPCR and IC datasets (for drug and target identifiers, see: http://www.kegg.jp/). As one can see, many of the predicted interactions are validated. We point out that some of the validated interactions were only predicted by our approach, especially in case of the GPCR dataset.

MOLIERE for drug repurposing

In order to illustrate how our predictions may contribute to drug repurposing, we discuss some of the predicted interactions in more detail.

First, we consider Diazoxide (KEGG ID: D00294) and “adenosine triphosphate binding cassette, sub-family C member 9” (ABCC9), also known as “sulfonylurea receptor 2” (SUR2) (KEGG ID: hsa:10060), i.e., the second predicted interaction listed in Table 7. According to KEGG, Diazoxide is an adenosine triphosphate (ATP) sensitive potassium channel opener. It opens the potassium channel in beta cells of the pancreas and inhibits insulin secretion, thus elevating the blood sugar level. It is used in insulinoma [41] and congenital hyperinsulinism [42]. Diazoxide treatment can cause pulmonary hypertension and relax smooth muscle [43]. The ABCC9 gene encodes a protein that is a subunit of an ATP sensitive potassium channel (ATP-binding cassette transporter) [44]. It is expressed in skeletal and heart muscle and in smooth muscles of the vasculature [44]. Mutation of the gene can cause dilated cardiomyopathy type 10 [45] and reduced cardiac stress adaptation [44]. ABCC9 knock-out mice showed elevated blood pressure and coronary artery vasospasm [46]. Other mutations of the ABCC9 gene (https://omim.org/entry/601439) can cause familial atrial fibrillation type 12 and hypertrichotic osteochondrodysplasia (Cantú syndrome) [47]. Diazoxide is not used in the treatment of these diseases. However, as Diazoxide can open the ATP sensitive potassium channel, it would be worth examining the possible use of Diazoxide in some ABCC9 gene defects where the transporter can still be activated to some extent.

Next, we consider the predicted interaction between Isradipine (KEGG ID: D00349) and “Calcium Voltage-Gated Channel Subunit Alpha1 A” (CACNA1A), also known as “spinocerebellar ataxia type 6” (SCA6) (KEGG ID: hsa:773), listed in the 11th line of Table 7. According to KEGG, Isradipine is an L type dihydropyridine calcium channel blocker that is used in hypertension. The CACNA1A gene encodes the alpha-1A subunit of the P/Q type voltage-dependent calcium channel. Mutations of this gene can cause spinocerebellar ataxia, early infantile epileptic encephalopathy, episodic ataxias, hemiplegic migraine and hemiconvulsion-hemiplegia-epilepsy syndrome. Some mutations of the gene increase the density of functional channels and their open probabilities in familial hemiplegic migraine [48]. Since our method takes into consideration the similarity of the two different calcium channels, it may be worth trying Isradipine inhibition of the P/Q type voltage-dependent calcium channel in experimental settings.

Finally, we consider the predicted interaction between Caffeine (KEGG ID: D00528) and Cystic fibrosis transmembrane conductance regulator (CFTR, KEGG ID: hsa:1080), listed in the 16th line of Table 7. Caffeine is a central nervous system stimulant, adenosine receptor antagonist and phosphodiesterase inhibitor (1). CFTR is a chloride channel that conducts chloride ions in lung, pancreas, liver, digestive tract and reproductive tract epithelial cell membranes. According to KEGG, gene mutations can cause cystic fibrosis (CF), hereditary pancreatitis and congenital bilateral absence of vas deferens. In rats, caffeine intake increased CFTR chloride secretion in the intestine [49]. Although caffeine consumption is generally not recommended for CF patients, if some of the patients actually drink coffee, it would be interesting to compare their disease status with that of other CF patients who do not drink coffee. Such a survey should be carefully designed in order to avoid biases. For example, the number of patients involved in the study should be large enough, and one should take into account that people who have more severe disease may pay more attention to health and lifestyle suggestions, while the type of mutation is also important.

Conclusion

In this paper, we focused on drug–target interaction prediction and proposed a new method, MOLIERE, for this task. Despite the fact that MOLIERE is a relatively simple approach, experiments on real-world datasets show that MOLIERE outperforms state-of-the-art DTI methods. By discussing some of the predictions in detail, we showed how our approach may lead to medically relevant hypotheses and support drug repositioning.

As mentioned, the DTI problem shares inherent characteristics with recommender systems tasks, therefore, we expect that MOLIERE will be adapted for recommendation tasks in the future. Furthermore, we point out that the proposed framework of asymmetric loss models is not limited to drug–target interaction prediction, but it may be useful in other cases where the class label is originally continuous (due to the underlying physical, chemical, biological phenomena), but it has been transformed to a binary label.

As for the limitations of our study, we note that our approach is not designed to predict interactions in case of new drugs/targets, i.e., for drugs/targets for which not even one interaction is known. In our current work, we assumed that only a few new drugs/targets are considered, and we used the simple weighted profile approach in this case. Therefore, further methodological enhancements are required if predictions are desired for new drugs/targets.

References

  1. Ding Y, Tang J, Guo F. Identification of drug-side effect association via multiple information integration with centered kernel alignment. Neurocomputing. 2019;325:211–224.
  2. Liu LJ, Lu L, Zhong HJ, He B, Kwong DW, Ma DL, et al. An iridium (III) complex inhibits JMJD2 activities and acts as a potential epigenetic modulator. Journal of medicinal chemistry. 2015;58(16):6697–6703. pmid:26225543
  3. Kang TS, Mao Z, Ng CT, Wang M, Wang W, Wang C, et al. Identification of an iridium (III)-based inhibitor of tumor necrosis factor-α. Journal of medicinal chemistry. 2016;59(8):4026–4031. pmid:27054262
  4. Liu LJ, He B, Miles JA, Wang W, Mao Z, Che WI, et al. Inhibition of the p53/hDM2 protein-protein interaction by cyclometallated iridium (III) compounds. Oncotarget. 2016;7(12):13965. pmid:26883110
  5. Yang C, Wang W, Li GD, Zhong HJ, Dong ZZ, Wong CY, et al. Anticancer osmium complex inhibitors of the HIF-1α and p300 protein-protein interaction. Scientific reports. 2017;7:42860. pmid:28225008
  6. Ullrich K, Mack J, Welke P. Ligand Affinity Prediction with Multi-pattern Kernels. In: International Conference on Discovery Science. Springer; 2016. p. 474–489.
  7. Morgan S, Grootendorst P, Lexchin J, Cunningham C, Greyson D. The cost of drug development: a systematic review. Health policy. 2011;100(1):4–17. pmid:21256615
  8. Zhang J, Li C, Lin Y, Shao Y, Li S. Computational drug repositioning using collaborative filtering via multi-source fusion. Expert Systems with Applications. 2017;84:281–289.
  9. Cheng AC, Coleman RG, Smyth KT, Cao Q, Soulard P, Caffrey DR, et al. Structure-based maximal affinity model predicts small-molecule druggability. Nature biotechnology. 2007;25(1):71–75. pmid:17211405
  10. Pérot S, Regad L, Reynès C, Spérandio O, Miteva MA, Villoutreix BO, et al. Insights into an original pocket-ligand pair classification: a promising tool for ligand profile prediction. PloS one. 2013;8(6):e63730. pmid:23840299
  11. Cellier P, Charnois T, Plantevit M. Sequential patterns to discover and characterise biological relations. In: International Conference on Intelligent Text Processing and Computational Linguistics. Springer; 2010. p. 537–548.
  12. Fayruzov T, De Cock M, Cornelis C, Hoste V. Linguistic feature analysis for protein interaction extraction. BMC Bioinformatics. 2009;10(1):374. pmid:19909518
  13. Davis J, Santos Costa V, Ray S, Page D. An integrated approach to feature invention and model construction for drug activity prediction. In: Proceedings of the 24th International Conference on Machine Learning; 2007. p. 217–224.
  14. Fan X, Hong Y, Liu X, Zhang Y, Xie M. Neighborhood Constraint Matrix Completion for Drug-Target Interaction Prediction. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer; 2018. p. 348–360.
  15. 15. Jamali AA, Ferdousi R, Razzaghi S, Li J, Safdari R, Ebrahimie E. DrugMiner: comparative analysis of machine learning algorithms for prediction of potential druggable proteins. Drug discovery today. 2016;21(5):718–724. pmid:26821132
  16. 16. Lan W, Wang J, Li M, Liu J, Li Y, Wu FX, et al. Predicting drug–target interaction using positive-unlabeled learning. Neurocomputing. 2016;206:50–57.
  17. 17. Yamanishi Y, Araki M, Gutteridge A, Honda W, Kanehisa M. Prediction of drug–target interaction networks from the integration of chemical and genomic spaces. Bioinformatics. 2008;24(13):i232–i240. pmid:18586719
  18. 18. Buza K, Peška L. ALADIN: A New Approach for Drug–Target Interaction Prediction. Lecture Notes in Computer Science. 2017;10535:322–337.
  19. 19. Buza K, Peška L. Drug–target interaction prediction with Bipartite Local Models and hubness-aware regression. Neurocomputing. 2017;260:284–293.
  20. 20. Peska L, Buza K, Koller J. Drug-target interaction prediction: A Bayesian ranking approach. Computer methods and programs in biomedicine. 2017;152:15–21. pmid:29054256
  21. 21. Bolgar B, Antal P. Bayesian Matrix Factorization with Non-Random Missing Data using Informative Gaussian Process Priors and Soft Evidences. Journal of Machine Learning Research. 2016;52:25–36.
  22. 22. Gönen M. Predicting drug–target interactions from chemical and genomic kernels using Bayesian matrix factorization. Bioinformatics. 2012;28(18):2304–2310. pmid:22730431
  23. 23. Zheng X, Ding H, Mamitsuka H, Zhu S. Collaborative matrix factorization with multiple similarities for predicting drug-target interactions. In: 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2013. p. 1025–1033.
  24. 24. Wang Y, Zeng J. Predicting drug-target interactions using restricted Boltzmann machines. Bioinformatics. 2013;29(13):i126–i134. pmid:23812976
  25. 25. Chen X, Liu MX, Yan GY. Drug–target interaction prediction by random walk on the heterogeneous network. Molecular BioSystems. 2012;8(7):1970–1978. pmid:22538619
  26. 26. Cheng F, Liu C, Jiang J, Lu W, Li W, Liu G, et al. Prediction of drug-target interactions and drug repositioning via network-based inference. PLoS Comput Biol. 2012;8(5):e1002503. pmid:22589709
  27. 27. Sönströd C, Johansson U, Norinder U, Boström H. Comprehensible Models for Predicting Molecular Interaction with Heart-Regulating Genes. In: 7th IEEE International Conference on Machine Learning and Applications; 2008. p. 559–564.
  28. 28. Bleakley K, Yamanishi Y. Supervised prediction of drug–target interactions using bipartite local models. Bioinformatics. 2009;25(18):2397–2403. pmid:19605421
  29. 29. Xia Z, Wu LY, Zhou X, Wong ST. Semi-supervised drug-protein interaction prediction from heterogeneous biological spaces. BMC Systems Biology. 2010;4(Suppl 2):S6. pmid:20840733
  30. 30. van Laarhoven T, Nabuurs SB, Marchiori E. Gaussian interaction profile kernels for predicting drug–target interaction. Bioinformatics. 2011;27(21):3036–3043. pmid:21893517
  31. 31. Mei JP, Kwoh CK, Yang P, Li XL, Zheng J. Drug–target interaction prediction by learning from local information and neighbors. Bioinformatics. 2013;29(2):238–245. pmid:23162055
  32. 32. Davis MI, Hunt JP, Herrgard S, Ciceri P, Wodicka LM, Pallares G, et al. Comprehensive analysis of kinase inhibitor selectivity. Nature Biotechnology. 2011;29(11):1046–1051. pmid:22037378
  33. 33. Hattori M, Okuno Y, Goto S, Kanehisa M. Development of a chemical structure comparison method for integrated analysis of chemical and genomic information in the metabolic pathways. Journal of the American Chemical Society. 2003;125(39):11853–11865. pmid:14505407
  34. 34. Pilászy I, Tikk D. Recommending new movies: even a few ratings are more valuable than metadata. In: 3rd ACM Conf. on Recommender Systems; 2009. p. 93–100.
  35. 35. Suciu M, Lung RI, Gaskó N. Noisy extremal optimization. Soft Computing. 2017;21(5):1253–1270.
  36. 36. van Laarhoven Twan and Marchiori Elena. Predicting drug-target interactions for new drug compounds using a weighted nearest neighbor profile. PloS one. 2013;8(6):e66952. pmid:23840562
  37. 37. Pahikkala T, Airola A, Pietilä S, Shakyawar S, Szwajda A, Tang J, et al. Toward more realistic drug-target interaction predictions. Briefings in Bioinformatics. 2015;16(2):325–337. pmid:24723570
  38. 38. Kanehisa M, Goto S, Hattori M, Aoki-Kinoshita KF, Itoh M, Kawashima S, et al. From genomics to chemical genomics: new developments in KEGG. Nucleic acids research. 2006;34(Suppl 1):D354–D357. pmid:16381885
  39. 39. Wishart David S and Knox Craig and Guo An Chi and Shrivastava Savita and Hassanali Murtaza and Stothard et al. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic acids research. 2006;34(Suppl 1):D668–D672. pmid:16381955
  40. 40. Günther Stefan and Kuhn Michael and Dunkel Mathias and Campillos Monica and Senger Christian and Petsalaki et al. SuperTarget and Matador: resources for exploring drug-target relationships. Nucleic acids research. 2008;36(Suppl 1):D919–D922. pmid:17942422
  41. 41. Ferrer-Garcia J, Gonzalez-Cruz VI, Navas-DeSolis S, Civera-Andres M, Morillas-Arino C, Merchante-Alfaro A, et al. Management of malignant insulinoma. Clinical and Translational Oncology. 2013;15(9):725–731. pmid:23460559
  42. 42. Banerjee I, Salomon-Estebanez M, Shah P, Nicholson J, Cosgrove K, Dunne M. Therapies and outcomes of congenital hyperinsulinism-induced hypoglycaemia. Diabetic Medicine. 2018;.
  43. 43. Timlin MR, Black AB, Delaney HM, Matos RI, Percival CS. Development of Pulmonary Hypertension During Treatment with Diazoxide: A Case Series and Literature Review. Pediatric cardiology. 2017;38(6):1247–1250. pmid:28642988
  44. 44. Schumacher T, Benndorf RA. ABC transport proteins in cardiovascular disease—A brief summary. Molecules. 2017;22(4):589.
  45. 45. Bienengraeber M, Olson TM, Selivanov VA, Kathmann EC, O’Cochlain F, Gao F, et al. ABCC9 mutations identified in human dilated cardiomyopathy disrupt catalytic K ATP channel gating. Nature genetics. 2004;36(4):382. pmid:15034580
  46. 46. Chutkow WA, Pu J, Wheeler MT, Wada T, Makielski JC, Burant CF, et al. Episodic coronary artery vasospasm and hypertension develop in the absence of Sur2 K ATP channels. The Journal of clinical investigation. 2002;110(2):203–208. pmid:12122112
  47. 47. Harakalova M, van Harssel JJ, Terhal PA, van Lieshout S, Duran K, Renkens I, et al. Dominant missense mutations in ABCC9 cause Cantu syndrome. Nature genetics. 2012;44(7):793. pmid:22610116
  48. 48. Hans M, Luvisetto S, Williams ME, Spagnolo M, Urrutia A, Tottene A, et al. Functional consequences of mutations in the human α1A calcium channel subunit linked to familial hemiplegic migraine. Journal of Neuroscience. 1999;19(5):1610–1619. pmid:10024348
  49. 49. Wei X, Lu Z, Yang T, Gao P, Chen S, Liu D, et al. Stimulation of Intestinal Cl-Secretion Through CFTR by Caffeine Intake in Salt-Sensitive Hypertensive Rats. Kidney and Blood Pressure Research. 2018;43(2):439–448. pmid:29558753