Read this article about the healthcare sector. Compare and contrast the challenges associated with each industry.
Medicine, Drugs, and Vaccines
Big Data for Systems Pharmacology and Drug Repurposing
If more data can serve the purpose of a better and a more effective prevention giving a whole view of the health status, more data and the ability to handle, prioritize and analyze them faster can also be a significant advantage for drugs producers, whose attention is particularly focused on drug repurposing.
How complex is to make a drug? The complexity and length of the process is directly reflected in the numbers: average time to develop a drug ranges from 10 to 15 years, and <12% of drugs entering clinical trials are approved as medicines. Costs of development, that include cost of failures, surged from 413 million dollars in the 1980s to 2.6 billion dollars in the 2000s, while industry investments in research passed from 2 billion dollars to 50 billion dollars in the same period (source FDA). These figures show how research became less effective, by a 4-fold factor, in channeling its results in drugs development pipelines, thus the imperative to look for alternative processes. Repurposing, or repositioning, drugs is the process of finding new therapeutic indications for existing drugs. A repurposed drug does not need brand new research processes and already obtained approval for preclinical phases and/or phase I clinical trials. Securing FDA approval of a repurposed drug costs only $40–$80 million in total, compared to the average of $1–$2 billion it takes to develop a new drug.
Drug repurposing has been the subject of many computationally-driven efforts. Science and pharmaceutical research, are not new to computer aided research, and they are not new to hyped confidence in computational methods and failures. Computer assisted drug discovery, or CADD, was used and welcome by press with quite enthusiastic tones in the 1980s: the so called rational drug design seemed to anticipate an industrial revolution that actually did not happen. With these cultural and historical antibodies, the scientific community should be today more prepared to better handle and embed computer power in research and development processes. Today's computers are significantly improved in processing power, memory size and in the software and algorithms they run, and better analyses and methods, such as free energy perturbation analysis and quantum mechanics modeling, are available in the field of drug discovery.
Nevertheless, the primary failure of new drug candidates is due to a very simple fact: that human biology is hugely complicated. Off-targets or mechanism-based toxicities are the very common result of unpredictable interactions or processing and delivery.
Network Based Methods
Network based approaches are used to allow more comprehensive views of complex systems. Networks are a convenient way to describe molecular and biochemical interactions, either experimentally validated or predicted; drugs and diseases relationships have been also investigated with network metrics or to predict novel targets for existing drugs.
Networks, which are the description of apparently unstructured entities that interact each other, can be considered big data, depending on the number of entities connected and on their annotated properties. The visually compelling blobs of a network offered by many graphic displays deliver high visual impact and serve the purpose to show the complexity and number of interactions, but precise metrics are needed to unwind the intricacies of a highly connected network, such as hierarchies and upstream/downstream effects (activation, inhibition). Much of these metrics have been developed in the context of graph theory, and they are generally used as a way of summarizing the complexity in a convenient way. Some metrics provide information about individual nodes (the entities in the network), others about the edges (the connections between entities). Centrality metrics for instance identify the most important nodes in a network, those that are most influential, while the node influence metrics serve to measure the influence of every node in the network.
In the systems pharmacology domain global drug networks obtained integrating protein-protein interactions or gene regulation data with information of many kinds of drugs have been used to define druggable targets. This is probably the most general method, able to describe the borders within which to look for candidate targets. A typical way is to look for hubs, i.e., highly connected nodes: such nodes are likely to have key roles in the regulation of multiple biological processes. Ligands' chemical similarity is another property that can be used to define edges between drug targets (nodes): biochemically important properties enter into play in the network-mediated discovery process.
For a narrower focus, disease-gene networks are obtained adding disease information beside biological datasets and drug information, to find possible targets for specific therapies. Proteins which interact with each other are frequently involved in a common biological process or are involved in a disease process: to this end, networks focusing on specific disease processes have been built and mined for new candidate drug target.
Instead of focusing on specific pathological domains, another method that proved useful for drug repositioning was the introduction of new or refined network metrics, able to capture the essence of a potential drug target in a more unbiased way. Traditional metrics relies on the actual topology of described connectomes, like the shortest path between targets or common targets between drugs. Such methods are potentially biased by the known interactomes already described in details in the literature, thus biased in favor of the most studied genes or proteins. For this reason, new unsupervised network metrics have been recently proposed by the Barabasi group, based on the observation that disease genes preferentially cluster in the same network neighborhood. Thus, they reasoned, the immediate vicinity of target proteins to disease modules should have been a proxy for effectiveness of the drug through the action of those targets. They introduced a proximity index that quantifies the topological relationship between drugs and disease proteins and used it to investigate relationship between drug targets and disease proteins. Thousands of drug-disease associations either reported in the literature or unknown were grouped, for a total of more than 36 thousand associations. Both known and unknown drug-disease associations were tested with the proximity index method and it was found that drugs do not target the whole disease modules, but only a smaller subset of molecules in it. Proximity measure was also used to find similarity between drugs covering a larger number of associations than other methods, and to mine for potential repurposing candidates for rare diseases.
Physical Chemistry in Pharmacology
The making of a drug is complex by design. Computational methods for drug design (Computer aided drug design, or CADD) belong to two major categories based on molecular mechanics (MM) or quantum mechanics (QM). The first methods are essentially used to determine molecular structures and the potential energies of their conformation and atomic arrangement. The elementary particle under investigation in these methods is the atom, taken as a whole, without taking into account electrons, contribution. On the contrary, methods of the second class consider the electrons and the systems behavior is investigated as an ensemble of nuclei and electrons.
As a result, MM describes molecules as atoms which are bonded each other: not considering electrons motion MM requires precise and explicit information about bonds and structure. QM is able to compute systems energy as a function of electrons and atomic nuclei (not just as a function of atomic position) and incorporates physical principles such as quantum entanglement which refers to the correlated interaction of particles or group of particles providing a quantum view of the concept of chemical bond. This quality is particularly useful in the study of interactions of drugs and active sites of enzymes.
With QM methods it's possible to calculate crucial system properties which cannot be achieved by experimental procedures: vibrational frequencies, equilibrium molecular structure, dipole moments. These properties are useful in computer models predicting how a particular chemical compound might interact with a target of interest, for instance a drug and a pocket of an enzyme involved in a disease. This is traditionally done by modeling and molecular docking, but these methods better fit the aim of data reduction when screening millions of compounds and high reliability of the model is not strictly requested. Starting from a much lower number of candidates, i.e., a few hundreds, and needing to shortlist them to 10, extreme accuracy of the model is a must, and current techniques are not sufficient. To achieve higher accuracy both machine learning and QM methods come into help.
In a recent proof of principle study Ash and Fourches studied ERK2 kinase, which is a key player in various cancers, and 87 ERK2 ligands in search of new kinase inhibitors. They incorporated the molecular dynamics results into prediction models generated by machine learning and obtained an hyper predictive computer model using molecular dynamic descriptors able to discriminate the most potent ERK2 binders. The availability of modern computer with high processing power, especially GPU accelerated computers, allow even longer simulations for a larger number of proteins.
The concept of "magic bullet", as drugs binding a single molecular disease target, is probably a minority case. Since small molecule drugs still account for the majority of the therapeutics in today's pharmaceutical market, finding new targets for them is a highly sought after opportunity, even if it's nearly impossible to find small molecules with no off-target effects. This is mainly due to the fact that the research attention is traditionally focused on the details and that investigations are usually not at the genomic scale.
Due to the complexity of protein-ligands interactions a view at the genomic scale can make the difference: an off target can turn to be a repurposable drug thanks to the complete change of genomic context. Thus, being able to maintain attention to molecular details and to the single biochemical properties while doing it at a genomic scale is a major challenge of today pharmacology.
Such a wide view is typical of systems biology approaches, and it is more commonly exploited with network and interactions databases. analyzed all known drug-target interactions creating a human pharmacology interaction network connecting proteins that share one or more chemical binders. Mestres et al. integrated seven drug-target interaction databases and found that drugs interact on average with six different targets. Quantum mechanics applied to drug discovery operated in a machine learning context allows to significantly scale up the process. Even if application of this approach requires a massive computational effort and ability to handle and analyse big data, thousands of generics are now scanned for interactions with computational methods and off targets are new opportunities as potential new repurposing drugs.