We love Max-Diff! It is the industry gold standard for feature prioritization, and with good reason. Its superiority over typical Likert rating scales has been documented countless times in journals, articles, and white papers. The nature of the task forces respondents to make a trade-off among subsets of items, choosing the “best” and “worst” item within each group. After some modeling, the items are typically scored on a relative scale from 0-100, where we observe both the rank order and the distance from one item to another. And unlike rating scales, where scores tend to cluster on the high end, Max-Diff yields a nice spread of scores, clearly indicating which items are relatively superior.
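To make the 0-100 relative scale concrete, here is a minimal sketch of one common rescaling step: min-max rescaling raw item utilities (e.g. from a multinomial logit or hierarchical Bayes model) so the best item scores 100 and the worst 0. The flavor names and utility values are hypothetical, purely for illustration.

```python
def rescale_scores(utilities):
    """Min-max rescale raw Max-Diff utilities onto a relative 0-100 scale.

    `utilities` maps item name -> raw utility from the choice model.
    The top item lands at 100, the bottom at 0, and the spacing between
    items preserves their relative distances.
    """
    lo, hi = min(utilities.values()), max(utilities.values())
    return {item: 100 * (u - lo) / (hi - lo) for item, u in utilities.items()}

# Hypothetical fitted utilities for five chip flavors
raw = {"A": 1.8, "B": 1.2, "C": 1.1, "D": -0.4, "E": -1.8}
scores = rescale_scores(raw)
```

Note that the rescaling is purely relative: flavor E scoring 0 means it lost to the others, not that respondents found it worthless in any absolute sense, which is exactly the limitation discussed next.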
But how do we know that the winning items are actually appealing to respondents, and not just the best of a set of bad options? Max-Diff scores are relative: they only compare the items to one another, and tell us nothing about an item’s absolute appeal.
Luckily, we have a couple options.
Suppose a potato chip manufacturer wants to test out 10 new flavors and we run a Max-Diff exercise to get the order of preference. From the figure below, we see flavor A is leading the pack, with flavors B & C not far behind, and the rest further down.
And while this is good information, our chip manufacturer needs to be sure the winning flavors aren’t simply the best of bad options. One approach, which I like to think of as the brute-force method, is to include a couple of current chip flavors in the exercise for comparison. Since we know how well they perform in the real market, their position relative to the new flavors will be telling.
In the figure above, flavors X & Y are our current flavors. Chip X performs a bit above average, and chip Y a bit below. Now when we look at the ordering of our new flavors, we can draw some conclusions about their absolute appeal that we weren’t able to before.
But what if our chip manufacturer is new to the market and doesn’t have any current flavors for comparison? An alternative method is to calculate a threshold, above which items are perceived as appealing and below which they are not. This is the finesse method. It requires asking a handful of additional questions in the survey to gauge appeal, and incorporating those responses into our modeling. The result is similar to our original Max-Diff of all new flavors, except now we can draw a threshold line that tells us which flavors are truly appealing.
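A minimal sketch of how that threshold plays out, assuming the extra survey questions (e.g. direct "would you buy this?" follow-ups) have already been modeled into a single anchor utility. The anchor is rescaled alongside the items so the threshold line sits on the same 0-100 scale as the reported scores. All names and utility values here are hypothetical.

```python
def classify_with_anchor(utilities, anchor_utility):
    """Split items into appealing / not appealing relative to an anchor.

    `utilities` maps item -> raw utility from the Max-Diff model;
    `anchor_utility` is a hypothetical fitted threshold utility derived
    from the additional appeal questions. Items and anchor are rescaled
    together so the threshold is comparable to the 0-100 item scores.
    """
    all_vals = list(utilities.values()) + [anchor_utility]
    lo, hi = min(all_vals), max(all_vals)

    def scale(u):
        return 100 * (u - lo) / (hi - lo)

    threshold = scale(anchor_utility)
    scores = {item: scale(u) for item, u in utilities.items()}
    appealing = {item for item, s in scores.items() if s > threshold}
    return scores, threshold, appealing

# Hypothetical utilities and a fitted anchor at 0.0
raw = {"A": 1.8, "B": 1.2, "C": 1.1, "D": -0.4, "E": -1.8}
scores, threshold, appealing = classify_with_anchor(raw, anchor_utility=0.0)
```

With these made-up numbers, flavors above the threshold line would be reported as genuinely appealing, and everything below it as unappealing regardless of rank, which is precisely the absolute read the plain exercise couldn’t give us.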
So Max-Diff is great, but we need to be aware of what the results are really telling us. And with a couple tweaks to our study, we can walk away with a greater understanding of item preference.
Wes, ever the observer, is struck by perceived curiosities in current world happenings, anywhere from culture to politics to sports. His mathematical background and thinking serve to highlight the disparity between emotional and rational happenings, and he attempts to find answers via logical and calculated means.