Understanding Algorithms, Data Structures, and Technical Systems in Litigation
In the digital era, technology, networks, and software programming often lie at the heart of disputes involving allegations of anticompetitive behavior and intellectual property (IP) infringement. As a result, an expert’s ability to explain and present highly technical evidence to non-technical lawyers, judges, and juries can be crucial.
Michael Mitzenmacher, who is the Thomas J. Watson, Sr. Professor of Computer Science at the Harvard John A. Paulson School of Engineering and Applied Sciences, has extensive experience straddling the worlds of technology, academia, and litigation. He has testified in over a dozen trials on a wide range of technology-related issues, and has worked on cases involving network security, video/audio transmission, databases, error-correcting codes, video compression, network communication protocols, and digital similarity for text, audio, and video. His case work spans a variety of practice areas, such as patent infringement, copyright, trade secrets, and legal compliance.
Analysis Group Principal Almudena Arcelus and Vice President Christopher Llop talked to Professor Mitzenmacher about how he translates the specialized technical world of analyzing computer code, systems, and algorithms into the language of the layperson.
Mr. Llop: You’ve testified in a number of litigation matters that required a detailed understanding of how technology works, and in particular the use of algorithms, data structures, and source code to drive business processes. Are there particular areas or businesses where your expertise is most in demand?
Professor Mitzenmacher: My academic research covers the use of algorithms and data structures in a wide range of contexts. This breadth of focus has been useful in understanding the technology in all sorts of industries and markets – algorithms and data structures are everywhere, and not only in so-called technology or digital companies! In these days of machine learning and highly connected devices, algorithm-based technology can be embedded in almost any product or at the core of almost any service.
For that reason, my kind of specialized knowledge can be fundamental in litigation – and particularly IP litigation – involving any business that provides digital services or offers a product with embedded software. This could include things like entertainment services, communications, ride sharing, delivery, or online dating services. For example, I’ve run experiments to show the way that major streaming platforms transmit data through their content distribution networks, and have analyzed the source code for major cloud computing platforms. This work required developing an in-depth knowledge of the technology, its underlying source code, and the algorithms central to its functionality.
But algorithms and data also are at the core of any company that relies on technology to conduct its business or gather information about its customers. I am often also called upon to examine business functions, processes, or tasks that may originally have been manual but now are carried out by computers instead. For example, I have studied text messaging platforms to determine whether or not they are automatic telephone dialing systems (ATDS) – that is, do they automatically generate numbers to send messages to? I’ve also been involved in cases involving digital rights management for video delivery to devices like networked smart TVs.
I’ve also applied my expertise in algorithms to non-technology cases where you otherwise might not expect it. For example, I have testified in a case where I worked with Analysis Group to detect plagiarism in reports generated by different companies, where we used advanced copy-detection algorithms to show where the plagiarism appeared to have occurred.
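The interview does not say which copy-detection algorithms were used in that matter. As an illustration only, one widely used building block is word shingling combined with Jaccard similarity: each document is broken into overlapping runs of k words, and the runs that two documents share point to passages that may have been copied. The sketch below assumes plain-text input; the function names, the two sample sentences, and the choice of k = 5 are hypothetical and chosen for the example.

```python
import re


def shingles(text, k=5):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    # Lowercase and keep only alphanumeric tokens so punctuation and
    # capitalization differences do not hide a match.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def shared_passages(doc_a, doc_b, k=5):
    """Return the k-word shingles that appear in both documents, sorted."""
    return sorted(shingles(doc_a, k) & shingles(doc_b, k))


# Hypothetical sample text, not taken from any real report.
report_a = "The quarterly results show a steady increase in revenue across all regions."
report_b = "Our analysis found that the quarterly results show a steady increase in revenue overall."

if __name__ == "__main__":
    score = jaccard(shingles(report_a), shingles(report_b))
    print(f"Jaccard similarity: {score:.3f}")
    for passage in shared_passages(report_a, report_b):
        print(" ".join(passage))
```

A high similarity score does not by itself establish plagiarism; in a sketch like this, the shared shingles are simply a way to direct a human reviewer to the passages worth reading side by side.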
Ms. Arcelus: Many of your cases require you to develop a detailed picture of complex technologies. Can you walk us through the process of how you approach understanding a new technical system?
Professor Mitzenmacher: I usually start by determining the core issues and technologies involved in my part of the legal case. From there, I try to gather and internalize all the relevant sources of information available to me in understanding a technology at a high level.
When possible, I then complement this high-level approach with a close review of the source code, code documentation, version control metadata, deployment logs, and other technical files underlying a product. Particularly when analyzing very large code bases, I find it helpful to work with a team, which is often where consultants from Analysis Group play a critical role in my research process. Throughout this review, I can also help counsel conduct interviews with software engineers familiar with the product at hand, who may not be accustomed to explaining the concepts underlying source code to lawyers or non-technical people.
The goal of my review is to identify and study the functionality at issue, to apprise counsel of the relevant technical details and their implications for the case, and ultimately to explain these technical components to the judge or jury in a clear and compelling fashion. Developing an in-depth understanding of a technology, and being able to communicate that to the layperson, is especially critical in cases where triers of fact must understand specifically how a technology works to make the correct decision.
Mr. Llop: One common thread in many of your cases is the use of source code forensics. What is source code forensics?
Professor Mitzenmacher: Source code forensics involves studying the source code that underlies a product to understand what the code does and how it works. Source code can be a particularly compelling piece of evidence in determining how a product works, because in many legal matters, the exact details of how an algorithm or data structure is implemented become a crux of the case.
In some cases, this process might involve analyzing just small sets of code, but in other cases, I may need to review hundreds of thousands of files. In copyright cases, I may need to find specific lines of code among literally millions, and then analyze coders’ comments, compare two sets of code, or opine on creativity or uniqueness. In other situations, I’ve needed to write and deploy programming scripts to study the structure of the source code itself.
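The interview does not describe the scripts themselves. As a rough sketch of the kind of script that can help when a code base runs to hundreds of thousands of files, the example below walks a directory tree and tallies files and lines by file extension, which gives a first picture of what languages are present and where the bulk of the code sits. The function name `summarize_tree` and the `"(none)"` label for extensionless files are assumptions made for this example, and it relies only on the Python standard library.

```python
import os
from collections import Counter


def summarize_tree(root):
    """Count files and lines per file extension under root.

    Returns two Counters keyed by lowercase extension: (files, lines).
    Files without an extension are grouped under "(none)".
    """
    files = Counter()
    lines = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower() or "(none)"
            files[ext] += 1
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going on files that are not
            # valid UTF-8 (for example, binaries checked into the tree).
            with open(path, "r", encoding="utf-8", errors="replace") as fh:
                lines[ext] += sum(1 for _ in fh)
    return files, lines


if __name__ == "__main__":
    file_counts, line_counts = summarize_tree(".")
    for ext, count in file_counts.most_common():
        print(f"{ext:>10}  {count:>8} files  {line_counts[ext]:>10} lines")
```

A summary like this only narrows the search; understanding what the code does still requires reading the files it points to.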
For example, I worked with a great Analysis Group team on a case involving the Telephone Consumer Protection Act (TCPA). In that matter, I analyzed the code that was being used to process phone numbers to determine whether the numbers were randomly generated, and which parts of the process required humans interacting with the user interface. In other matters with Analysis Group, I have reviewed patents and compared them to processes laid out in source code to assist lawyers in understanding whether patent infringement may have occurred.
Ms. Arcelus: How does your experience help you distill a very complex product into details a judge or jury can connect with?
Professor Mitzenmacher: I think being a teacher is great preparation for being an expert witness. As an educator, I am used to thinking carefully about how best to present subject material and to proactively resolving any confusion that may arise. I use this skill set frequently at trial, where I am often trying to teach a layperson about a technology they may not be familiar with – my goal is to provide them with accessible and accurate information. I also find that simple figures and illustrations can go a long way toward building intuition for those less familiar with technology and algorithms.
Mr. Llop: Are there any particularly unusual or interesting moments you run into in your line of work?
Professor Mitzenmacher: Since source code is often considered a highly confidential business asset, access to it for review can be tightly restricted. One unusual challenge I face is traveling across the country (and in non-pandemic times, across the world) to review source code in locked-down “code rooms.” This type of review requires me to do all my work in situ and without the benefit of an internet connection, so I have to make sure that I bring all the tools I need myself.
In terms of interesting moments, I particularly enjoy opportunities to bring my specific academic expertise to bear in a case. In fact, in one case, some of the research I did twenty years ago was brought up as prior art!