Machine Learning Algorithms in Health Care Litigation
The health care industry has experienced rapid growth in the variety and richness of its data, driven in part by the advent of electronic medical records and the introduction of industry reporting requirements such as the Sunshine Act.
This growth has been further fueled by technological innovations that provide both greater data storage and ever-increasing computing power. However, the growing volume and complexity of available data are testing the limits of familiar analytical tools such as spreadsheets and statistical software.
Enter machine learning. Machine learning uses algorithms to detect complex and unforeseen relationships in high-dimensional data (i.e., where there is an abundance of different types of variables, including numbers, text, and/or visual images). In a litigation context, in particular, the proliferation of health care data can be daunting. Here are a few examples of how attorneys can leverage machine learning to strengthen their cases while optimizing their efforts.
Crafting a legal strategy
Machine learning can be applied during the discovery phase of litigation to quickly find relevant information in large quantities of data. Consider a dispute over alleged off-label promotion of prescription drugs. Conventional analyses might serve as a blunt instrument, grouping together all patients with a particular condition (e.g., lung cancer). Machine learning methods, on the other hand, can identify similarities among patients based on a wider and deeper range of variables or characteristics, leading to finer groupings. Such clustering could reveal clinical differences (e.g., advanced age, failure on other cancer therapies, genetic markers) among groups of patients that might explain use of the drug independent of any promotion. Uncovering these types of patterns at an early stage in the litigation can be beneficial to attorneys as they contemplate the theory of the case.
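As a minimal sketch of the clustering idea described above, the following Python example groups a handful of patients with k-means. The patient records, the three features (age, prior therapies failed, genetic marker), and the choice of k-means itself are assumptions made for illustration; the article does not name a specific algorithm or dataset.

```python
from statistics import mean

# Hypothetical patients, invented for illustration:
# (age, prior_therapies_failed, has_genetic_marker)
patients = [
    (48, 0, 0), (52, 1, 0), (55, 0, 0), (50, 1, 0),
    (74, 3, 1), (79, 2, 1), (71, 3, 1), (77, 4, 1),
]

def rescale(rows):
    """Min-max scale each column to [0, 1] so no single feature dominates."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1 for c in cols]
    return [tuple((v - lo) / s for v, lo, s in zip(r, lows, spans)) for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k=2, iterations=10):
    """Plain k-means with a deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda i: sq_dist(p, centroids[i])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = tuple(mean(col) for col in zip(*members))
    return labels

labels = kmeans(rescale(patients))
print(labels)
```

On this toy data the younger, treatment-naive patients fall into one group and the older, previously treated, marker-positive patients into the other. In a real matter the number of clusters and the feature set would themselves need to be justified.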
Assessing the value of a patent when its validity is challenged in court
In patent infringement cases, machine learning can be used to sort through reams of filings using natural language processing capabilities in order to reveal features common to desired outcomes. Unlike conventional statistical methods, machine learning algorithms can be “taught” to recognize the importance of particular word and phrase combinations or other characteristics within patent claims that are associated with a specified outcome, and then use these associations to improve predictions. This information can be combined with other data to approximate the process that leads to final judgments at the patent office. In a patent dispute, such predictions can help the parties decide whether to negotiate a settlement or engage in costly litigation.
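One way to picture how an algorithm can be "taught" word and phrase associations is a small naive Bayes text classifier. The sketch below is only an illustration under stated assumptions: the claim snippets, the outcome labels ("upheld" and "invalidated"), and the use of naive Bayes are all hypothetical stand-ins, not the method or data used in any actual patent analysis.

```python
import math
from collections import Counter

# Hypothetical training data, invented for illustration: (claim text, outcome).
training = [
    ("a composition comprising compound x and a carrier", "upheld"),
    ("a method of treating cancer comprising administering compound x", "upheld"),
    ("a system for storing data on a computer", "invalidated"),
    ("a method for displaying data on a computer screen", "invalidated"),
]

def train(examples):
    """Count how often each word appears under each outcome label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def predict(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        n_words = sum(counts.values())
        score = math.log(label_counts[label] / total_docs)
        for word in text.split():
            if word in vocab:  # words never seen in training are ignored
                # Laplace (add-one) smoothing avoids zero probabilities.
                score += math.log((counts[word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(training)
prediction = predict("a method of administering compound x", model)
print(prediction)
```

The classifier learns that words such as "compound" and "administering" co-occur with one outcome in the training examples and uses those associations to score new text. A real application would need far more filings, careful feature design, and validation on held-out cases.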
Machine learning algorithms offer a flexible approach to modeling complex, nonlinear relationships in data.
Mining data efficiently to strengthen a case
Machine learning can make use of the vast amounts of data in a company’s possession to conduct much more sophisticated analyses that support testimony or provide counterfactual scenarios. For example, attorneys defending a pharmaceutical manufacturer against allegations of kickbacks paid to physicians might use machine learning to identify doctors who did not receive any payments but had similar prescribing patterns to those who did. Deposing such physicians could shed light on factors that drive prescribing patterns in the absence of any possible inducements.
Conventional methods can be cumbersome, taking up valuable time and resources, and they require analysts to specify selected parameters of interest in advance. If the wrong parameters are selected, the most useful candidates may be overlooked. With machine learning, the analyst does not have to fix in advance the number of parameters, or the interrelationships among them, that the computer accounts for, which increases the efficacy of the search while controlling time and effort. Information that might once have been discarded as impractical or irrelevant for expert modeling purposes, such as unstructured data like patient/physician perceptions, can be mined for use in discovery or economic analysis.
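The physician search described above can be sketched as a nearest-neighbor match: for each physician who received payments, find the unpaid physician whose prescribing profile is closest. The physician identifiers, the three-drug prescribing shares, and the use of Euclidean distance are assumptions chosen for illustration only.

```python
import math

# Hypothetical prescribing profiles, invented for illustration: share of
# prescriptions written for (drug_a, drug_b, drug_c), plus payment status.
physicians = {
    "dr_01": {"profile": (0.70, 0.20, 0.10), "paid": True},
    "dr_02": {"profile": (0.65, 0.25, 0.10), "paid": False},
    "dr_03": {"profile": (0.10, 0.30, 0.60), "paid": False},
    "dr_04": {"profile": (0.20, 0.20, 0.60), "paid": True},
    "dr_05": {"profile": (0.33, 0.33, 0.34), "paid": False},
}

def nearest_unpaid_match(records):
    """Map each paid physician to the unpaid physician with the closest profile."""
    paid = {k: v["profile"] for k, v in records.items() if v["paid"]}
    unpaid = {k: v["profile"] for k, v in records.items() if not v["paid"]}
    return {
        name: min(unpaid, key=lambda u: math.dist(profile, unpaid[u]))
        for name, profile in paid.items()
    }

matches = nearest_unpaid_match(physicians)
print(matches)
```

With real data the profile would likely include many more dimensions (specialty, patient mix, geography), which is where an automated search has an advantage over manually chosen parameters.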
In the increasingly complex and technical world of litigation, the adoption of machine learning is likely to prove a significant advantage. These new techniques can be harnessed to help attorneys develop better legal strategies, conduct informed fact discovery, provide testifying experts with the most complete set of relevant information, and prepare analyses at a previously unseen level of granularity.
Lisa B. Pinheiro, Managing Principal
Jimmy Royer, Principal
Mihran Yenikomshian, Managing Principal
Nick Dadson, Vice President
Paul E. Greenberg, Managing Principal
Adapted from “Machine-Learning Algorithms Can Help Health Care Litigation,” by Lisa B. Pinheiro, Jimmy Royer, Nick Dadson, and Paul E. Greenberg, published on Law360.com, June 8, 2016; and “Practical Uses For Machine Learning In Health Care Cases,” by Mihran Yenikomshian, Lisa B. Pinheiro, Jimmy Royer, and Paul E. Greenberg, published on Law360.com, September 22, 2016.
When to Consider a Machine Learning Approach
There are many potential uses of machine learning algorithms in a litigation context. While they are not the solution to every analytical problem, they can add significant value to an analysis, especially when three conditions are met:
The goal of the analysis is to predict an outcome;
Out-of-sample performance is the desired measure of success; and
A rich dataset is available to take advantage of interactions among many potential predictors with complex interrelationships (e.g., a nonlinear function of many factors for which it is difficult to specify its form in advance).
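To illustrate why out-of-sample performance is the relevant measure of success, the sketch below fits a one-nearest-neighbor classifier and scores it both on the data it was trained on and on held-out data. The data points and the choice of classifier are hypothetical, picked only because this classifier memorizes its training set and so makes the gap easy to see.

```python
# Hypothetical (feature, outcome) pairs, invented for illustration.
train_set = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 0), (5.0, 1), (6.0, 1)]
test_set = [(1.5, 0), (2.6, 0), (3.4, 1), (4.6, 1), (5.5, 1)]

def predict_1nn(x, training):
    """Predict the outcome of the single closest training observation."""
    _, label = min(training, key=lambda pair: abs(pair[0] - x))
    return label

def accuracy(data, training):
    """Share of observations in `data` whose outcome is predicted correctly."""
    correct = sum(predict_1nn(x, training) == y for x, y in data)
    return correct / len(data)

in_sample = accuracy(train_set, train_set)     # scored on data the model has seen
out_of_sample = accuracy(test_set, train_set)  # scored on held-out data
print(in_sample, out_of_sample)
```

The in-sample score is perfect because each training point is its own nearest neighbor, while the held-out score is lower. Only the held-out figure says anything about how the model would perform on new cases.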
Of course, as was the case with other new technologies that have been introduced to the courtroom (e.g., fingerprints, DNA evidence), testifying experts' reliance on machine learning might invite initial skepticism. When using such a methodology, the expert will need to rigorously validate the chosen model and evaluate whether results are meaningful and sufficiently accurate (e.g., a model that accurately predicts an outcome 90 percent of the time but has a high false positive rate might not be appropriate). Testifying experts using machine learning methods will also need to educate the court about these less familiar models and convince it of their validity.
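The 90 percent example above can be made concrete with a small calculation. The counts below are hypothetical, chosen so that overall accuracy is 90 percent while most of the true negatives are still misclassified.

```python
def confusion_rates(tp, fp, tn, fn):
    """Return (accuracy, false positive rate) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    false_positive_rate = fp / (fp + tn)  # share of actual negatives flagged positive
    return accuracy, false_positive_rate

# Hypothetical results on 1,000 cases: 900 actual positives, 100 actual negatives.
accuracy, fpr = confusion_rates(tp=880, fp=80, tn=20, fn=20)
print(f"accuracy = {accuracy:.0%}, false positive rate = {fpr:.0%}")
```

Here the model is right 90 percent of the time overall, yet it wrongly flags 80 percent of the actual negatives. When positives heavily outnumber negatives, headline accuracy alone can mask this kind of error.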