Evaluating the But-For World: Surveys, Experiments, and Market Data

Class actions alleging harm from defective automobiles have unique characteristics, due to the vast array of features that can factor into consumer choice. Managing Principal Mark Gustafson talked with academic affiliate Olivier Toubia, Glaubinger Professor of Business and Faculty Director of The Eugene Lang Entrepreneurship Center at Columbia Business School, about different methodologies for evaluating the “but-for” world in complex cases like these.

Olivier Toubia: Glaubinger Professor of Business and Faculty Director of The Eugene Lang Entrepreneurship Center, Columbia Business School

How do auto defect class action cases differ from other types of false advertising or consumer products class actions?

Prof. Toubia: Auto defect class actions are unique in a number of ways due to the nature of the product involved and how vehicles are sold. First, unlike many consumer products, vehicles are highly complex products with scores of features, some “big” or important to consumers, and some that matter less or of which consumers may not even be aware. Even if one feature is allegedly defective, the vehicle generally still provides mobility to its owner and the benefits associated with the other features remain unchanged.

Moreover, features that are important to one buyer may be of little or no value to another or may even have negative value. To other buyers, the feature may not even have been a factor in the purchase decision. For example, some buyers are committed to certain vehicle makes and others want what they consider to be the best car regardless of the manufacturer.

Second, the transaction itself is complex. The transaction usually involves a test drive followed by individual negotiations between buyers and sellers, and depends on numerous transaction-specific characteristics, such as the information available to both parties, whether there was a vehicle traded in, and the incentives facing each party at the time of the transaction. For example, a dealer may be more willing to accept a lower price at the end of the month in order to qualify for an incentive bonus.

What are some of the methods used in auto defect class actions to evaluate consumer harm?

Prof. Toubia: Plaintiffs seek to quantify damages in a number of ways. For example, some seek to determine the amount of money needed to correct the alleged defect, while others seek to quantify any increased costs allegedly incurred by vehicle owners. Plaintiffs also commonly use conjoint analysis to evaluate how much less putative class members would have paid for the vehicle in dispute in the “but-for” world. Unfortunately, while conjoint analysis is a great marketing research tool, in my experience it is often misapplied in this type of litigation.

Please provide a high-level summary of what conjoint analysis is and how it works.

Prof. Toubia: Conjoint analysis is a widely employed survey method and associated analysis that has long been used to quantify how consumers would trade off among various features of a product. In a standard, “choice-based” conjoint survey, each respondent is asked to make a series of choices. Each time, the respondent is asked to choose his or her preferred product (or “none of the above”) from among a set of hypothetical products. Such hypothetical products are constructed as a set of features or attributes, each with specific “levels.” For example, if a hypothetical car is characterized by the attributes of price, manufacturer, MPG, cylinder type, and color, the color feature could have “levels” of blue, red, white, and yellow.

The fact that the respondent is asked to make a series of choices, with the level for the feature of interest varying across choice sets and products within them, makes the design “within subject.” This means that the respondent becomes aware of which features of the product change from choice set to choice set and how they change.

The choices of the survey respondent reveal the way in which that consumer trades off features for each other. For example, if a respondent chooses a red car that is $300 more expensive than an identical blue car, it means that the respondent is willing to pay at least $300 more for the color of a car to be switched from blue to red. If, in another choice set, the same respondent chooses a blue car over an identical red car that costs $400 more, we can conclude that the respondent’s maximum willingness to pay for red over blue is somewhere between $300 and $400.
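
The bounding logic in the red-versus-blue example can be sketched in a few lines of Python. This is only an illustration of the reasoning, not how a conjoint study is actually estimated; the function name `wtp_bounds` and the `(premium, chose_red)` input format are assumptions made for this sketch.

```python
from math import inf

def wtp_bounds(choices):
    """Bound a respondent's WTP for red over blue from paired choices.

    Each choice is (premium, chose_red): the red car cost `premium` dollars
    more than an otherwise identical blue car, and `chose_red` records
    whether the respondent picked the red one.
    """
    lower, upper = -inf, inf
    for premium, chose_red in choices:
        if chose_red:
            # Accepting the premium means WTP is at least that large.
            lower = max(lower, premium)
        else:
            # Rejecting the premium means WTP is at most that large.
            upper = min(upper, premium)
    return lower, upper

# The two choices described above: red chosen at +$300, blue chosen at +$400.
bounds = wtp_bounds([(300, True), (400, False)])
print(bounds)  # (300, 400)
```

Each additional choice can only tighten the interval, which is why a series of choices from the same respondent is informative.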

Assuming that the choice sets are appropriately designed given marketplace conditions, analyzing responses to multiple choice situations from the same respondent allows us to estimate the respondent’s maximum willingness to pay, or WTP, for each level of each feature in the study.
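
One common way to turn estimated preferences into dollar figures is to divide a feature level's utility (its "part-worth") by the utility cost of one dollar. The sketch below assumes a utility that is linear in price; the part-worth values, the price coefficient, and the function name are all hypothetical and chosen only so the result is consistent with the $300 to $400 range in the color example.

```python
def wtp_from_partworths(partworths, price_coef, baseline):
    """Convert part-worth utilities into dollar WTP relative to a baseline level.

    Assumes utility is linear in price, so each extra dollar changes utility
    by `price_coef` (negative when higher prices are less attractive).
    """
    if price_coef >= 0:
        raise ValueError("price coefficient must be negative for WTP to be meaningful")
    base = partworths[baseline]
    return {level: (u - base) / -price_coef for level, u in partworths.items()}

# Hypothetical estimates for one respondent, in utility units.
color_partworths = {"blue": 0.0, "red": 0.7, "white": 0.2, "yellow": -0.4}
price_coef = -0.002  # hypothetical utility change per extra dollar of price

wtp = wtp_from_partworths(color_partworths, price_coef, baseline="blue")
for level, dollars in wtp.items():
    print(f"{level}: {dollars:.0f} dollars relative to blue")
```

A negative value, as for yellow here, corresponds to a level the respondent would need to be compensated to accept.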

Respondents have limited time and attention, however, so you cannot test all combinations of all features. Researchers therefore usually select six to eight features, each with two to six levels. That means that the respondent may be presented with a handful of features, some “big,” like price, make, and model, and some much “smaller,” like the quality of the sunroof.
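
A quick count shows why testing every combination is impractical even at these modest sizes. The sketch below simply multiplies the number of levels across features; the two configurations are the extremes of the ranges mentioned above, and the function name is an assumption.

```python
from math import prod

def full_factorial_size(levels_per_feature):
    """Number of distinct hypothetical products if every combination were used."""
    return prod(levels_per_feature)

smallest = full_factorial_size([2] * 6)  # six features, two levels each
largest = full_factorial_size([6] * 8)   # eight features, six levels each

print(smallest)  # 64
print(largest)   # 1679616
```

Since a respondent typically sees only a limited series of choice sets, each one is shown a small subset of these profiles.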

When a respondent is focused on a small feature that is presented alongside larger features, the small feature may suddenly appear important to the respondent, which may result in a higher willingness to pay. This is an inherent issue with using conjoint analysis in litigation for products with scores of features, including minor ones: placing a small feature alongside larger ones may inflate its relative importance and bias WTP estimates upward.

Many of the alleged defects in auto defect class actions are relatively minor or are just one feature out of scores of features on a vehicle. Can conjoint analysis generate unbiased results in this context and, if so, are there ways that the analysis needs to be tailored to avoid bias?

Prof. Toubia: Indeed, minor product features are usually not the purchase drivers for the entire product. One possible solution is to ask survey respondents to make choices among parts of the product rather than entire products. For example, instead of asking survey respondents to choose an entire car, you could ask them to choose just from upgrade packages that contain the product feature at issue in the case. In doing so, you avoid, or at least reduce, any potential bias associated with minor features not being the purchase driver of the entire product.

However, if the outcome of such a survey is ultimately used in a damages calculation, one has to be careful to use it consistently with the survey design. For example, a survey offering respondents a choice among upgrade packages makes the assumption that the respondent will buy the underlying vehicle. So the damages calculation in this case can only account for those who would buy the car even with the alleged defect. It cannot assume that some consumers would not buy the car at all.

More generally, a survey has to approximate the underlying market. If the feature of interest, such as the car seat heater, is brought to the respondent’s attention, it is automatically assumed that consumers pay at least some attention to the feature in the actual marketplace, which is not necessarily the case for every consumer.

Another issue in auto defect class actions is that the manifestation rates of the alleged defects are often very small and may represent only a marginal increase of an already existing baseline risk. For example, an allegedly defective sunroof may have a higher probability of shattering than a sunroof not at issue, but both are very unlikely to shatter. Can you overcome the fact that people have difficulty weighing small changes in risk for low probability events?

Prof. Toubia: This is a tricky issue. As we know from behavioral economics research, people are generally bad at making judgments about small-probability events. That is, the “weighting” of probabilities is not linear. Generally, consumers tend to overweight small probabilities in their decisions. This means that the difference between a zero probability of malfunction and a small positive probability of malfunction may actually have a significant impact on consumers’ decisions.
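
To illustrate what nonlinear weighting can look like, the sketch below uses one functional form from the behavioral economics literature, the Tversky and Kahneman (1992) weighting function. The choice of this particular form and the parameter value of 0.61 are assumptions made for illustration; they are not taken from the interview and are not estimates for any specific case.

```python
def tk_weight(p, gamma=0.61):
    """Tversky-Kahneman (1992) probability weighting function.

    With gamma < 1 the curve has an inverse-S shape: small probabilities
    are overweighted and large ones underweighted. gamma = 0.61 is an
    illustrative value only.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)

# Compare stated malfunction probabilities with their decision weights.
for p in (0.005, 0.01, 0.02, 0.03, 0.05):
    print(f"stated probability {p:.3f} -> decision weight {tk_weight(p):.3f}")
```

Under this assumed form, a probability of zero receives a weight of exactly zero while a 1% probability receives a weight several times larger than 1%, which is the kind of jump between "no risk" and "some risk" described above.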

This also means that if a conjoint analysis only varies a probability of a potential malfunction within a given range, extrapolating to other ranges may be tricky. We may mitigate this problem by using attributes with more than two levels. For example, instead of just “no probability” / “some probability,” one could use very specific probabilities such as 0.5%, 1%, 2%, 3%, and 5%.

A common damage claim in auto defect class actions is that putative class members would have paid a different price but for the alleged defect. Can conjoint analysis determine how much a buyer would have paid?

Prof. Toubia: In economic terms, we want to establish the market price of the feature allegedly not delivered. Market price is determined by the “invisible hand” from the demand side of the market (e.g., how much consumers are willing to pay) and the supply side of the market (e.g., costs of labor and parts). Conjoint analysis, like any survey of consumers, only collects information on the demand side.

As I mentioned, conjoint analysis can tell you the most that a given respondent will be willing to pay for a given feature. Averaged across respondents, that amount is called average WTP or average maximum WTP. If the survey sample is representative of the market of interest, this amount can be treated as the estimate of the market WTP.

But a typical mistake in using conjoint analysis in litigation is equating WTP with the market price. How much consumers are willing to pay is not the same as how much they would have actually paid but for the alleged defect (i.e., the but-for market price). For some consumers, the market price is below their WTP, so they just pay the market price. For other consumers, the market price is above their WTP and they do not buy the product at all.

For example, assume the iPhone XS is offered at $999 with a 5.8-inch screen size and at $1,099 with a 6.5-inch screen size. That is, the market price of the extra 0.7 inches of screen size is $100. Even if someone is willing to pay up to $200 for the extra 0.7 inches of screen size, that person will still only have to pay $100 more (not $200). And someone who is only willing to pay up to $50 for the extra 0.7 inches of screen size will not purchase the iPhone XS with the larger screen at all. Thus, in this example the average willingness to pay for the feature of these two consumers is $125 (based on the average of one WTP of $200 and one of $50), but the market price of the feature is just $100.
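
The arithmetic of this example can be laid out explicitly. The sketch below uses only the numbers given above; the consumer labels are placeholders added for readability.

```python
from statistics import mean

# Price difference between the two screen sizes: $1,099 - $999.
market_premium = 1099 - 999

# Maximum WTP for the extra 0.7 inches, per the two consumers in the example.
wtp_by_consumer = {"consumer_a": 200, "consumer_b": 50}

average_wtp = mean(list(wtp_by_consumer.values()))

# A consumer buys the larger screen only if WTP covers the market premium,
# and then pays the market premium, not his or her own WTP.
amount_paid = {
    name: market_premium
    for name, wtp in wtp_by_consumer.items()
    if wtp >= market_premium
}

print(average_wtp)  # 125
print(amount_paid)  # {'consumer_a': 100}
```

The average WTP of $125 is not what either consumer pays: one pays $100 and the other pays nothing for the feature because he or she does not buy it.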

You mentioned that the estimate of the market WTP is an average across individual respondents’ WTPs. Ignoring any of the issues discussed above, can conjoint analysis tell us the value any particular putative class member places on the alleged defect?

Prof. Toubia: Conjoint analysis provides individual WTPs for the survey respondents. The average WTP across respondents and the distribution of individual WTPs across respondents serve as estimates for the average WTP in the market and the distribution of what consumers will be willing to pay for a given feature. Neither of these provides the researcher with any information on how a particular consumer who did not participate in the conjoint analysis would value the given feature. The distribution of individual valuations can be helpful in demonstrating the amount of variability in the individual WTP values. If there is variation across respondents, this would call into question the wisdom of relying on an average WTP value.
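
A small numerical sketch shows how an average can mask variability in individual values. The WTP figures below are entirely hypothetical and were chosen only to produce a skewed distribution.

```python
from statistics import mean, median

# Hypothetical individual WTP estimates (dollars) for nine respondents.
individual_wtp = [0, 0, 10, 25, 40, 60, 150, 400, 900]

avg = mean(individual_wtp)
mid = median(individual_wtp)
share_below_avg = sum(w < avg for w in individual_wtp) / len(individual_wtp)

print(f"average WTP: {avg:.2f}")
print(f"median WTP: {mid}")
print(f"share of respondents below the average: {share_below_avg:.0%}")
```

In this made-up sample, most respondents value the feature at well below the average, so applying the average to every individual would overstate the value for the majority of them.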

Also, as I mentioned, due to its “within subject” design, conjoint analysis highlights the feature of interest for the respondents because it is one of just a few features used in the analysis. In contrast, in the real world there are scores of features buyers consider when purchasing a vehicle. As a result, WTP predictions from a conjoint analysis may be overstated.

Is there a way to correct for this bias?

Prof. Toubia: One alternative to a full-blown conjoint analysis could be a simple “between subjects” choice experiment, where respondents choose from several options just once, rather than making a series of choices in which they are aware of how features change from choice set to choice set. A choice experiment is similar to an A/B test where respondents choose from, say, three vehicles (or upgrade packages). For the test group, one option would be the at-issue vehicle with the alleged defect and the other two options would be competing vehicles without the alleged defect. For the control group, everything is the same, except that the at-issue vehicle is also presented without the alleged defect.

Assuming the choice set sufficiently approximates the choices of relevant consumers in the marketplace, if the percentage of respondents choosing the at-issue vehicle is not statistically significantly different between the test and control groups, the researcher can conclude that the defect is not material to consumer choice.
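
One standard way to compare the two shares is a two-proportion z-test. The sketch below is a minimal version using only the Python standard library; the respondent counts are hypothetical, and the choice of this particular test is an assumption, since the interview does not specify which test would be used.

```python
from math import erfc, sqrt

def two_proportion_z_test(chose_test, n_test, chose_control, n_control):
    """Two-sided z-test for a difference between two choice shares."""
    p_test = chose_test / n_test
    p_control = chose_control / n_control
    # Pooled share under the null hypothesis of no difference.
    pooled = (chose_test + chose_control) / (n_test + n_control)
    se = sqrt(pooled * (1 - pooled) * (1 / n_test + 1 / n_control))
    if se == 0:
        raise ValueError("shares are all 0 or all 1; the test is undefined")
    z = (p_test - p_control) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value

# Hypothetical counts: respondents choosing the at-issue vehicle in each group.
z, p_value = two_proportion_z_test(chose_test=132, n_test=400,
                                   chose_control=140, n_control=400)
print(f"z = {z:.2f}, p-value = {p_value:.2f}")
```

With these made-up counts, the shares are 33% and 35% and the difference is not statistically significant at conventional levels. Whether a real study can detect a difference of a given size depends on its sample size.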

Are choice experiments subject to the same criticisms as the conjoint analysis?

Prof. Toubia: The main advantage of a choice experiment over a conjoint analysis is that the only feature that will vary for respondents is the feature of interest. In addition, a choice experiment is “between subjects,” which means that no single respondent gets exposed to both the version of the test with the allegedly defective feature and the version without the allegedly defective feature. Respondents also choose only once, and the descriptions of the choice options can be as long and as detailed as needed. Consequently, it is much less likely that a “small” feature that may be at issue in a litigation, but is not as central to consumer choice, will get an artificial boost simply from being included in the study.

Notably, a conjoint study can also be conducted “between subjects” with the feature of interest varying only between two study groups and not for any given respondent. The results of the conjoint analyses for the two groups are then compared to learn whether the WTP varies between the two groups.

Are there any methods available to estimate more directly whether an alleged defect affected the price of a vehicle?

Prof. Toubia: The appropriate procedure depends on the specifics of the case and the choices available in the actual world. In some instances, you can use marketplace data rather than conduct a survey or an experiment. For example, you could employ used-vehicle transaction data, such as from Kelley Blue Book. In that case, you could compare how the valuation of the at-issue cars differs from that of similar cars that were not included in the litigation.
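
As a very rough sketch of such a comparison, the code below contrasts retained value (used price divided by original price) for two groups of vehicles. All figures are hypothetical, the variable names are assumptions, and a real analysis would need to account for differences such as age, mileage, and trim before attributing any gap to the alleged defect.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch t-statistic for a difference in means with unequal variances."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

# Hypothetical retained-value ratios (used price / original price).
at_issue = [0.61, 0.58, 0.63, 0.60, 0.59]
comparison = [0.62, 0.60, 0.64, 0.59, 0.61]

gap = mean(at_issue) - mean(comparison)
t_stat = welch_t(at_issue, comparison)

print(f"difference in mean retained value: {gap:.3f}")
print(f"Welch t-statistic: {t_stat:.2f}")
```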

From Analysis Group Forum: Spring 2019
