A Look Inside Health Care Big Data
What makes big data analysis in health care unique?
Kevin Brennan: The Affordable Care Act has increased the focus on delivering, measuring, and reporting on health outcomes. This, in turn, has created new requirements for customized information. Because storage costs have fallen, access to patient-specific data that might once have been considered impractical is now a reality. Other federal regulations, such as those governing adverse event reporting, also require data collection and access. At the same time, the third-party payer system effectively means that each data owner manages information according to its own priorities and interests, which can make it extremely difficult to develop a unified view of data sources and requirements throughout the system.
How do you approach such large data sets in your work?
Mark Gustafson: The biggest issue is that the data we analyze are a summary of extremely complex interactions between numerous health care stakeholders. In my work with Managing Principals Bruce Deal and Bruce Strombom, we often encounter data that are not only stored in multiple formats, but are also housed in multiple databases covering different time periods. In a case involving hospital admissions and transfer patterns, for example, we might analyze electronic health records (EHRs), claim adjudication notes, administrative claims data, and transfer records for hundreds of thousands of individuals. Each type of data set requires different professional expertise (e.g., clinical expertise to assess EHRs, medical and insurance coding expertise to evaluate claims data sets). Many data sets are distinct and were not designed to be analyzed in combination. And don’t underestimate the size factor: it’s not uncommon for inexperienced analysts to become overwhelmed by the sheer volume of available data, which can lead to poor analysis and unreasonably large time investments. Today, key differentiators have less to do with access to raw computing power than with institutional knowledge about the workings of the industry.
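As a rough illustration of what combining data sets that were not designed to be analyzed together can involve, the sketch below links two small extracts that record patient identifiers and dates in different formats. Everything in it is an assumption made for illustration: the field names, the sample records, the link_records helper, and the seven-day matching window are hypothetical and do not describe the interviewees' actual methods.

```python
from datetime import datetime

# Hypothetical extracts: each source uses its own ID and date conventions.
claims = [
    {"patient_id": "000123", "service_date": "2014-03-02", "billed": 1200.0},
    {"patient_id": "000456", "service_date": "2014-05-17", "billed": 450.0},
]
transfers = [
    {"PatientID": 123, "TransferDate": "03/04/2014", "to_facility": "B"},
    {"PatientID": 789, "TransferDate": "01/09/2013", "to_facility": "C"},
]


def normalize_claim(row):
    """Convert a claims row to a common schema (integer ID, date object)."""
    return {
        "patient_id": int(row["patient_id"]),
        "date": datetime.strptime(row["service_date"], "%Y-%m-%d").date(),
        "billed": row["billed"],
    }


def normalize_transfer(row):
    """Convert a transfer row to the same common schema."""
    return {
        "patient_id": int(row["PatientID"]),
        "date": datetime.strptime(row["TransferDate"], "%m/%d/%Y").date(),
        "to_facility": row["to_facility"],
    }


def link_records(claims, transfers, window_days=7):
    """Pair each claim with same-patient transfers within window_days."""
    transfers_by_patient = {}
    for transfer in map(normalize_transfer, transfers):
        transfers_by_patient.setdefault(transfer["patient_id"], []).append(transfer)

    linked = []
    for claim in map(normalize_claim, claims):
        for transfer in transfers_by_patient.get(claim["patient_id"], []):
            if abs((transfer["date"] - claim["date"]).days) <= window_days:
                linked.append((claim, transfer))
    return linked


linked = link_records(claims, transfers)
for claim, transfer in linked:
    print(claim["patient_id"], claim["date"], transfer["date"], transfer["to_facility"])
```

Even in this toy case, most of the work is deciding how identifiers and dates should be reconciled, which is where the coding and clinical expertise described above comes in.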
What capabilities are required for this type of analysis?
Kevin Brennan: To integrate heterogeneous data from multiple underlying sources, four interrelated elements are usually required: (1) technical expertise to create appropriate algorithms; (2) technical capacity to analyze and process large amounts of information; (3) human expertise to obtain, understand, and report the appropriate results; and (4) human experience to appreciate the most efficient and effective approaches, based on similar or related projects. Less technical, but no less important, is the ability to work smoothly with the providers’ or payers’ technical staff to obtain and understand the data. While this may seem secondary, the client’s experience is often shaped by how well consultants interact with the client’s IT department.
What opportunities for new health care analyses are enabled by big data?
Mark Gustafson: Big data enables the application of new analytical methods. Machine learning procedures can uncover unhypothesized patterns in large data sets and suggest new avenues of research. Predictive analysis techniques enable the development of statistical models that can forecast the likelihood of health outcomes based on treatment patterns. Sophisticated text analytics may be employed to derive analyzable data sets from unstructured data, such as physician clinical notes in an EHR. Careful and thoughtful application of these types of new methods, coupled with appropriate technical and human capabilities, can generate powerful opportunities to address the most challenging health economics questions. ■