Drug Repositioning Network System Using the Power of Network Analysis and Machine Learning to Predict New Indications for The Approved Drugs

Sherief Ahmed Hassan El-Rweney1*

1Computer science systems and information technology, Royal Holloway, University of London, UK

*Corresponding Author: Sherief Ahmed Hassan El-Rweney, Computer science systems and information technology, Royal Holloway, University of London, UK, Tel: +447717317626; Fax: +201023993902; E-mail: shriefelrweney22@hotmail.com

Citation: Sherief Ahmed Hassan El-Rweney (2017) Drug Repositioning Network System Using the Power of Network Analysis and Machine Learning to Predict New Indications for The Approved Drugs "Drug Repositioning and Rate the Level of The Drug Similarity". Arch Mol Med & Gen 1:103.

Copyright: © 2017 Sherief Ahmed Hassan El-Rweney, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Received date: December 04, 2017; Accepted date: December 30, 2017; Published date: January 06, 2018

Abstract

Statement of the Problem: Drug discovery is a lengthy process, taking on average 12 years for a drug to reach the market, but as Sir James Black OM once said, “the best way to discover a new drug is to start with the old one”. This observation motivates the concept of drug repositioning.

Drug repurposing, or repositioning, is finding a new clinical use for an approved drug. Many kinds of evidence can be used to predict a new target disease, e.g. protein-protein interactions, chemical structure, gene expression and functional genomics, phenotype and side effects, genetic variation, and machine learning.

A protein-protein interaction (PPI) is a physical contact, with molecular docking, between proteins that occurs in a cell or in a living organism in vivo. There are two alternative approaches to determining PPIs: binary, e.g. yeast two‐hybrid (Y2H), and co‐complex, e.g. tandem affinity purification coupled to mass spectrometry (TAP‐MS).

The Drug Repositioning System is built on binary protein-protein interactions to predict new targets for approved drugs. The system curates data sets for human PPIs, drugs and diseases from well-known online sources (PPIs from HPRD, drugs from DrugBank, diseases from DisGeNET) and relates the three data sets by gene name.

The Drug Repositioning Network System consists of two parts. The backend stores the curated data sets in a relational database and uses Big Data tools. The frontend is a web interface in which end users can search the system for diseases, genes and drugs in order to predict and find new targets for approved drugs based on protein interactions. From the web interface, users can analyse their search results, build a network between genes, diseases and drugs, and generate statistics to answer their questions.

The Drug Repositioning System can answer many questions and generate supporting statistics. The main question is: can we find new indications for existing approved drugs?

Drug similarity: with the Drug Repositioning System we are able to measure the percentage of drug similarity between any pair of interacting genes, based on the number of drugs they share, in order to rate the strength of a drug repositioning hypothesis, and then evaluate the result using ROC analysis.

Introduction

Definitions of drug repositioning

  • Drug repurposing, or repositioning, is finding a new clinical use for an approved drug. From the repositioning perspective, we start from drugs that have already been approved, which is the first step in the drug discovery process.
  • Drug repositioning (also referred to as drug repurposing, re‐profiling, therapeutic switching and drug re‐tasking) is the identification of new therapeutic indications for known drugs. These drugs can either be approved and marketed compounds used daily in a clinical setting, or they can be drugs that have been “shelved”, namely molecules that did not succeed in clinical trials. In this research, however, we use only approved drugs [1].
  • Drug repositioning is the application of known drugs and compounds to treat new indications (i.e., new diseases); in other words, the goal of a repositioning initiative is to establish a link between a drug and a disease.

Drug discovery

  • Drug discovery is a lengthy process, taking on average 12 years for a drug to reach the market, but as Sir James Black OM once said, “the best way to discover a new drug is to start with the old one” [3].
  • EMBL-EBI has defined the process of searching its databases during the phases of drug design [4].

Basic Concept

As shown in Figure 1, based on the interaction between ProteinA and ProteinB, we can build a network path from ProteinA to DiseaseB and DrugB, and a network path from ProteinB to DiseaseA and DrugA.
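The path construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries `gene_drugs` and `gene_diseases`, the function `candidate_paths`, and the toy records are hypothetical and do not come from the system's actual schema. The sketch reads the concept as "a drug that targets one protein becomes a candidate for the diseases of its interaction partner".

```python
# Toy, hypothetical records. Real data would come from the curated
# PPI, drug and disease data sets, related by gene name.
ppi = [("ProteinA", "ProteinB")]
gene_drugs = {"ProteinA": {"DrugA"}, "ProteinB": {"DrugB"}}
gene_diseases = {"ProteinA": {"DiseaseA"}, "ProteinB": {"DiseaseB"}}


def candidate_paths(ppi, gene_drugs, gene_diseases):
    """For each interacting pair, link the drugs of one protein
    to the diseases of its interaction partner (in both directions)."""
    candidates = set()
    for a, b in ppi:
        for src, dst in ((a, b), (b, a)):
            for drug in gene_drugs.get(src, ()):
                for disease in gene_diseases.get(dst, ()):
                    candidates.add((drug, src, dst, disease))
    return candidates


paths = candidate_paths(ppi, gene_drugs, gene_diseases)
for drug, src, dst, disease in sorted(paths):
    print(f"{drug} -> {src} <-> {dst} -> {disease}")
```

With the toy input, the sketch yields two candidate paths, one in each direction across the ProteinA and ProteinB interaction. Each path is a hypothesis to be examined further, not a confirmed indication.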

Big concept

Using the power of network analysis and machine learning to predict new indications for the approved drugs.

  • Drug Repositioning
  • Protein–Protein Interactions PPI
  • Network analysis
  • Biological Networks

Aims of the research

Output example from our experiment:

On the left side of the picture, two groups of drugs target numerous diseases related to the HDAC6 gene; on the right side, one group of drugs targets two groups of diseases related to the TUBB gene. The link between TUBB and HDAC6 indicates the interaction between them. As a result, drugs can be proposed for repositioning between the two genes.

Success story of drug repositioning: thalidomide

Many drugs have been successfully repositioned in the past; classical examples include sildenafil (Viagra) and thalidomide [6].

A significant advantage of drug repositioning over traditional drug development is that, since the repositioned drug has already passed a significant number of toxicity and other tests, its safety is known and the risk of failure for reasons of adverse toxicology is reduced. More than 90% of drugs fail during development, and this is the most significant reason for the high costs of pharmaceutical R&D [6].

Drug repositioning has been growing in importance in the last few years for many reasons, for example:

Pharmaceutical companies see their drug pipelines drying up and realize that many previously promising technologies have failed to deliver ‘as advertised’.

Computational approaches based on virtual screening of comprehensive libraries of approved and other human use compounds against large numbers of protein targets simultaneously have been developed to enhance the efficiency and success rates of drug repositioning [6].

Thalidomide

Thalidomide was marketed to treat morning sickness in pregnant women. The drug was assumed to be safe, based on an in vivo study in rodents, but it caused severe skeletal birth defects in children born to women who took it. Over 15,000 newborns were affected, suffering from anatomical malformations. Because of this disastrous side effect, the molecule was quickly withdrawn, and the episode triggered important reforms in the drug regulatory system. The story could have ended here, were it not for an incidental discovery by Jacob Sheskin. The practitioner was trying to treat patients affected by erythema nodosum leprosum, a particularly painful inflammatory condition characterised by red nodules under the skin. One evening in 1964, an affected patient could not sleep because the pain was so intense. As a last resort, Sheskin decided to use thalidomide, as the compound was known for its potent sleep-inducing properties and was available in the hospital. The drug worked and the patient was well rested in the morning. To general surprise, all pain and soreness had disappeared overnight too. Sheskin further studied the action of thalidomide in clinical trials and successfully showed that the drug can indeed treat erythema nodosum leprosum within two weeks in most subjects. Thalidomide found a new life and became the first and only drug approved for this indication [5].

Basic development points covered in this research

This research proceeds through the following basic steps to achieve its aims:

a. Curate data for drugs, proteins and diseases from well-known online resources.

b. Clean the collected data and perform data mining and statistics.

c. Build a large dataset containing drugs, proteins and diseases with the known interactions between them, together with a programming interface for querying the dataset to answer questions.

d. Build a network medicine model to analyse new targets for the approved drugs.

e. Strengthen the drug repositioning hypothesis between gene pairs by applying machine learning to find and calculate the percentage of drug similarity.
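Step e can be illustrated with a small Python sketch. The text states that drug similarity between a pair of interacting genes is based on the number of drugs they share, but it does not give the exact normalisation. The sketch below therefore assumes a Jaccard-style percentage (shared drugs divided by all distinct drugs of the two genes). The function name `drug_similarity_percent` and the drug sets are hypothetical.

```python
def drug_similarity_percent(drugs_a, drugs_b):
    """Shared drugs as a percentage of all distinct drugs targeting
    either gene (Jaccard index x 100). The normalisation is an
    assumption; the source only says the score uses shared drugs."""
    drugs_a, drugs_b = set(drugs_a), set(drugs_b)
    union = drugs_a | drugs_b
    if not union:
        return 0.0  # neither gene has any known drug
    return 100.0 * len(drugs_a & drugs_b) / len(union)


# Hypothetical drug sets for two interacting genes.
gene_x_drugs = {"drug1", "drug2", "drug3"}
gene_y_drugs = {"drug2", "drug3", "drug4", "drug5"}

score = drug_similarity_percent(gene_x_drugs, gene_y_drugs)
print(f"{score:.1f}%")  # 2 shared of 5 distinct drugs -> 40.0%
```

A score of this kind could then be used as a ranking value for gene pairs and assessed with ROC analysis, as the abstract describes. Other normalisations, such as dividing by the smaller drug set, would be equally plausible readings.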

Research Background Discussed in This Study

a. Protein–Protein Interactions Essentials

b. Basic terminologies of networks and networks analysis

c. Biological Networks

d. Elements and principles of network theory

e. The principles of Network Medicine

Protein–Protein Interactions Essentials: Key Concepts to Building and Analyzing Interactome

PPI Definition

A PPI is a physical contact, with molecular docking, between proteins that occurs in a cell or in a living organism in vivo.

Definition considerations:

1. Any protein in the ribosome or in the basal transcriptional apparatus shares a functional contact with the other proteins in the complex, but certainly not all the proteins in the particular complex interact.

2. The interaction interface should be intentional and not accidental.

3. The interaction interface should be non-generic.

4. That PPIs imply physical contact between proteins does not mean that such contacts are static or permanent.

5. Not all possible interactions will occur in any cell at any time.

Two Alternative Approaches to Determining PPIs [7]

Binary and Co‐Complex:

Interactions between proteins are measured at either a large or small scale by using two techniques:

-Binary: yeast two-hybrid (Y2H)

Measures direct interactions between proteins.

-Co-complex: tandem affinity purification coupled to mass spectrometry (TAP-MS)

Measures both direct and indirect interactions between proteins. Both techniques are widely applied in large‐scale investigations.

Figure: Binary methods and co‐complex methods, two approaches to determining PPIs.

The interactions shown in the left panel (green links) correspond to the true interactions existing between two groups of proteins (set A with four proteins and set B with three proteins). The interactions shown in the right panels correspond to the networks derived from the experimentally measured interactions existing between the six proteins analyzed: the network in the top right panel (blue links) presents the interactions obtained using a binary method; the network in the bottom right panel (red links) presents the interactions obtained using a co‐complex method. The red links are calculated by applying the spoke model to the TAP‐MS experimental data, but three of the interactions deduced (links with an X) do not occur.
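To make the spoke model concrete, the following Python sketch derives links from a co-complex purification by connecting the bait protein to every prey it pulls down, and then compares the result with a set of true direct contacts. The protein names, the purification result and the `true_direct` set are invented for illustration and are not the data shown in the figure.

```python
def spoke_model(purifications):
    """Spoke model: link each bait to every prey it co-purified with.
    Prey-prey links are not inferred."""
    edges = set()
    for bait, preys in purifications.items():
        for prey in preys:
            if prey != bait:
                edges.add(frozenset((bait, prey)))
    return edges


# Hypothetical TAP-MS result: bait "A1" co-purifies with three preys.
purifications = {"A1": ["B1", "B2", "B3"]}

# Hypothetical true direct contacts: a chain A1-B1-B2-B3.
true_direct = {
    frozenset(("A1", "B1")),
    frozenset(("B1", "B2")),
    frozenset(("B2", "B3")),
}

spoke_links = spoke_model(purifications)
false_links = spoke_links - true_direct   # deduced but not real
missed_links = true_direct - spoke_links  # real but not deduced

print(sorted(tuple(sorted(e)) for e in false_links))
# [('A1', 'B2'), ('A1', 'B3')]
```

In this toy case the spoke model reports two bait-prey links that are only indirect associations, which mirrors the point made above: co-complex data contain both direct and indirect interactions, so some deduced links do not correspond to physical contacts.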

The Main Databases and Repositories That Include PPIs [7]

As mentioned before, we build our drug repositioning network on protein interactions, so practical users have to know which types of interaction databases are available, what the differences between them are, and which are the most comprehensive and stable repositories.

Over the past few years, the number of known protein–protein interactions has increased substantially. To make this information more readily available, a number of publicly available databases have set out to collect and store protein–protein interaction data. Protein–protein interactions have been retrieved from six major databases and integrated, and the results compared. The six databases are the Biological General Repository for Interaction Datasets (BioGRID), the Molecular INTeraction database (MINT), the Biomolecular Interaction Network Database (BIND), the Database of Interacting Proteins (DIP), the IntAct molecular interaction database (IntAct) and the Human Protein Reference Database (HPRD).

With respect to human protein–protein interaction data, HPRD seems to be the most comprehensive. To obtain a complete dataset, however, interactions from all six databases have to be combined. To overcome this limitation, meta-databases such as the Agile Protein Interaction Database (APID) offer access to integrated protein–protein interaction datasets, although these also currently have certain restrictions.
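Combining interactions from several databases amounts to taking the union of their interaction sets after normalising each pair, so that the same interaction reported as (A, B) in one source and (B, A) in another is counted once. The Python sketch below shows the idea on invented extracts. The identifiers, the per-database contents and the helper `normalize` are hypothetical, and real integration would also need to map between different identifier schemes.

```python
def normalize(pair):
    """Treat an interaction as unordered and case-insensitive."""
    a, b = pair
    return frozenset((a.upper(), b.upper()))


# Hypothetical extracts; real exports would be far larger.
sources = {
    "HPRD": [("P1", "P2"), ("P2", "P3"), ("P3", "P4")],
    "BioGRID": [("p2", "p1"), ("P4", "P5")],
    "IntAct": [("P3", "P2"), ("P5", "P6")],
}

normalized = {
    name: {normalize(p) for p in pairs} for name, pairs in sources.items()
}
combined = set().union(*normalized.values())

for name, pairs in normalized.items():
    coverage = 100 * len(pairs) / len(combined)
    print(f"{name}: {len(pairs)} interactions, {coverage:.0f}% of combined set")
print(f"combined: {len(combined)} distinct interactions")
```

Even in this toy example no single source contains all five distinct interactions, which is the situation the text describes for the six real databases.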

A comparison of the main databases and repositories that include protein interactions is shown in Table 1.