Machine Learning in Health Care

Recent improvements in health technology, and computer technology more generally, have dramatically increased the quantity of data available in the health care industry. Electronic medical records, genomic and genetic data, and patient-reported data give insights into therapies and drugs that can provide high value at low cost. Emerging innovations and methods will allow many procedures to be performed faster and more precisely, which will produce even more, and better, data in the very near future. If technology continues to improve without corresponding advances in our ability to digest, interpret, and use the data it generates, those innovations are effectively wasted. Recent developments in machine learning offer a promising way to ensure that they are not.

Machine learning involves computer algorithms that learn highly complex and intricate relationships in high-dimensional data without pre-imposing constraints on the form of the relationship between inputs and outcomes. Many of these algorithms were motivated by the high performance of biological learning systems. In this way, the relationship between machine learning and the health care industry is one of mutualism. A better knowledge of how our brains function can improve the performance of machine learning algorithms. Better machine learning algorithms, in turn, are heavily used in the health care industry to discover relationships important for diagnosis and for our understanding of the human body.
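
As a rough illustration of what it means not to pre-impose constraints on the relationship between inputs and outcomes, the sketch below fits two models to the same synthetic data: a straight line, which assumes a linear form in advance, and a k-nearest-neighbor regressor, which assumes no form at all. The data, the function names, and the choice of k-nearest neighbors are all assumptions made for this example; they are not drawn from any health care data set or from any analysis described here.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a pre-imposed linear form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_knn(xs, ys, k=5):
    """k-nearest-neighbor regression: no functional form is assumed."""
    pairs = list(zip(xs, ys))

    def predict(x):
        # Average the outcomes of the k training points closest to x.
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_data(n, rng):
    """Synthetic data only: a noisy sine curve standing in for a nonlinear relationship."""
    xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


rng = random.Random(0)  # fixed seed so the comparison is repeatable
train_x, train_y = make_data(200, rng)
test_x, test_y = make_data(200, rng)

linear_mse = mean_squared_error(fit_line(train_x, train_y), test_x, test_y)
knn_mse = mean_squared_error(fit_knn(train_x, train_y, k=5), test_x, test_y)

print(f"linear model test MSE: {linear_mse:.3f}")
print(f"k-NN model test MSE:   {knn_mse:.3f}")
```

On held-out data the flexible model should show a much lower error than the straight line, because the underlying relationship is not linear. The trade-off is that flexible models generally need more data and more care to avoid fitting noise.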

Recent applications of machine learning in the health care industry, implemented by Analysis Group and others, have been extremely varied in nature:

  • Estimating the actual prevalence of under-diagnosed diseases
  • Segmenting various medical images, e.g., of wounds, nuclei, bacterial colonies, infant brains, and hearts, to improve and automate research
  • Classifying images of protein structure, the eye, lungs, and brain to improve the detection of diseases in their early phases, when they are more treatable
  • Identifying abnormal electrical brain wave patterns from electroencephalogram (EEG) readings for the diagnosis of epilepsy
  • Developing brain-computer interfaces to make communication possible for those unable to do so in conventional ways, e.g., due to severe motor disabilities
  • Natural language processing of handwritten doctor prescriptions and narratives
  • Sentiment analysis of social media: classifying large quantities of online communications to characterize discussions of particular drugs
  • Using DNA sequencing to identify the genetic determinants of rare and hereditary diseases
  • Automatically identifying and extracting drug-drug interactions documented in unstructured text, e.g., scientific articles or technical reports, and classifying them into predefined categories
  • Identifying proteins that can interact with drugs for target-based drug discovery
  • Classifying compounds that interact with proteins, e.g., as inhibitors vs. non-inhibitors, to improve the design of drug-resistant protein inhibitors
Recent analyses