New Product Research: A Dynamic Approach to Feature Prioritization
White Paper about Bracket™
Have you had this problem? You need to measure preference for the features in your product or service, but there are so many of them that it seems like an impossible task. Using conventional approaches, you would have asked about the importance of each feature on a scale. But we all know how that story goes. With no other constraints, respondents have no incentive to say that anything is unimportant. You could use constraint-based methods like constant sum scales, but they cannot realistically handle more than a handful of features at a time. Over the last few years, the most popular technique for this kind of feature prioritization problem has been Max-Diff (see white paper on Max-Diff). But using Max-Diff when there are more than a dozen attributes becomes a real chore. So what can you do when you have dozens of features that need to be efficiently culled? Let's start with a look at a standard Max-Diff approach.
A New Approach
Sounds like a winner, but how do we know that we are actually getting good quality data on the back end? We ran tests to confirm this and here is what we found.
A Bracket™ Example
The subject in this case was how movie-goers make decisions about which movie to see and where to see it. One can imagine many such factors: the stars, the director, the theater location, the show timing, etc. We imagined 18 of them and constructed a study where one cell of respondents was provided a standard Max-Diff task, while another cell was provided a Bracket™ task.
We know which profile options respondents picked in the holdout validation tasks in the study. Based on the individual level utilities (preferences for features) that we got from the analysis we can also make a prediction of which options they would pick. Comparing the two tells how well we did in estimating respondent preferences. So what did we find?
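The hit-rate comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: it assumes each holdout profile is a bundle of features, that a profile's utility is the simple sum of the respondent's utilities for those features, and that the predicted choice is the highest-utility profile. The data layout and the function names (`predict_choice`, `hit_rate`) are hypothetical.

```python
from typing import Dict, List, Sequence


def predict_choice(utilities: Dict[str, float],
                   profiles: Sequence[Sequence[str]]) -> int:
    """Return the index of the profile with the highest total utility.

    Assumption: a profile's utility is the sum of its feature utilities.
    """
    totals = [sum(utilities[f] for f in profile) for profile in profiles]
    return max(range(len(profiles)), key=totals.__getitem__)


def hit_rate(respondents: List[dict]) -> float:
    """Share of holdout tasks where the predicted pick matches the actual pick."""
    hits = 0
    tasks = 0
    for respondent in respondents:
        for task in respondent["holdouts"]:
            predicted = predict_choice(respondent["utilities"], task["profiles"])
            hits += int(predicted == task["chosen"])
            tasks += 1
    return hits / tasks if tasks else 0.0


# Made-up example: one respondent, two holdout tasks.
example = [{
    "utilities": {"stars": 1.2, "director": 0.4, "location": -0.3, "timing": 0.1},
    "holdouts": [
        {"profiles": [["stars", "location"], ["director", "timing"]], "chosen": 0},
        {"profiles": [["director"], ["location", "timing"]], "chosen": 1},
    ],
}]
example_rate = hit_rate(example)
```

In this made-up example the first task is predicted correctly and the second is not, so the hit rate is one out of two. Comparing this figure across the Max-Diff cell and the Bracket™ cell is the kind of comparison reported next.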
What we found surprised us. The validation hit rate for Max-Diff was 47% and for Bracket™ it was 45% (these numbers are in the expected range given the type of validation task and the domain). In other words, there was almost no difference between the more rigorous Max-Diff approach and the less rigorous, but more engaging, Bracket™ approach. How could this be? We investigated further by carefully examining the information from each round of Bracket™ and found that the information respondents provided in the very first round was highly useful. Think about that for a second. In the first round, respondents see each feature just once, in random groupings of three. But the information they provide there, combined with the Hierarchical Bayesian analysis, is capable of producing a very robust foundation on which the next two rounds can sit. This tells us that Bracket™ is capable of producing information content very close to that of Max-Diff.
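To make the first round concrete, here is a minimal Python sketch of how 18 features could be split into random groups of three so that each feature appears exactly once. It covers only the first-round grouping described above. How features advance to later rounds is not specified here, so the sketch does not attempt it. The function name `first_round_groups` and the placeholder feature labels are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def first_round_groups(features: Sequence[str],
                       group_size: int = 3,
                       seed: Optional[int] = None) -> List[List[str]]:
    """Randomly partition features into groups so each feature appears once.

    Assumption: the number of features divides evenly by group_size.
    """
    if len(features) % group_size != 0:
        raise ValueError("number of features must be a multiple of group_size")
    rng = random.Random(seed)   # a per-respondent seed gives reproducible groupings
    shuffled = list(features)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return [shuffled[i:i + group_size]
            for i in range(0, len(shuffled), group_size)]


# Placeholder labels standing in for the 18 movie-choice features.
features = [f"feature_{i}" for i in range(1, 19)]
groups = first_round_groups(features, seed=1)
```

With 18 features this yields six groups of three, so a respondent sees every feature once before any later round begins.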
As mentioned before, we deliberately set the test up to be fair to Max-Diff and therefore chose to use only 18 features. Our contention is that when there are far more features, Bracket™ will be the only usable method, as Max-Diff would be too tedious. This validation study, well, validates that notion by establishing the robustness of Bracket™. However, it also raises the question of whether Bracket™ could be used in cases where there are fewer features to test, and we are hard pressed to argue against that.
We have applied this approach to several studies with feature sets varying from the teens to the fifties and have found it possible to elicit individual level preferences while keeping respondents engaged. We don't see why even larger feature sets cannot be practically prioritized.
Another advantage we have learned by applying Bracket™ is that the tournament-based approach is particularly well suited to situations where the features are not well distinguished from each other. In such cases, standard Max-Diff tasks become even more difficult for respondents, whereas Bracket™ tasks help respondents focus on the most important features. Consider message testing, where many of the messages being tested may be only subtly different from each other. As we found out in a test, regular Max-Diff can lead to a confusing mess, whereas Bracket™ can provide quite clear results.
Feature prioritization is a very common new product research problem. However, as the number of features increases into the teens and beyond, it becomes difficult to use state-of-the-art methods like Max-Diff without substantially increasing the tedium quotient of the study. Bracket™ is a tournament-based approach that produces Max-Diff-like results and can easily prioritize fifty or more features.