Using a recent example to illustrate his points, the author discusses the importance of using the correct statistical test.

Editor's note: Stephen J. Hellebusch is president of Hellebusch Research and Consulting, Inc., Cincinnati.
One phenomenon that has always baffled me is why anyone would hire an expert to work on a project and then order the expert to ignore his or her knowledge and do it wrong. A recent experience drove the point home.
As many marketing researchers familiar with statistics are aware, you use a different statistical test when you have three or more groups than when you have only two. The automatic statistical testing that is so helpful in survey data tables does NOT "know" this, and cheerfully uses the two-group test in every situation, regardless. Each statistical test addresses a slightly different question, and that question is critical to selecting the correct test.
In a recent "pick-a-winner" study, we had three independent groups, each one based on a different version of a concept – Concepts A, B and C. We used a standard purchase intent scale (definitely will buy, probably will buy, might or might not buy, probably will not buy, definitely will not buy), and the question was: How do we test to see if there is any difference in consumer reaction to the three?
The first test was analysis of variance (ANOVA), which addresses the question: Do the mean values of these three groups differ? We used weights of 5 (for "definitely will buy"), 4, 3, 2 and 1 (for "definitely will not buy") and generated mean values for each of the three concepts.
The ANOVA showed that the means did not differ significantly at the 90 percent confidence level, which leads to the conclusion that consumer reaction to these three concepts on purchase intent does not differ, on average.
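For readers who want to see the mechanics, here is a minimal sketch of this step in Python, using SciPy's one-way ANOVA. The 5-through-1 weights follow the article; the response counts per scale row are hypothetical, not the study's actual data.

```python
import numpy as np
from scipy import stats

weights = [5, 4, 3, 2, 1]  # definitely will buy ... definitely will not buy

# Hypothetical respondent counts in each scale row, per concept.
counts = {
    "A": [30, 40, 20, 7, 3],
    "B": [22, 38, 25, 10, 5],
    "C": [27, 41, 21, 8, 3],
}

# Expand counts into one rating per respondent and compute means.
groups = {name: np.repeat(weights, c) for name, c in counts.items()}
for name, ratings in groups.items():
    print(f"Concept {name}: mean = {ratings.mean():.2f}, n = {ratings.size}")

# One-way ANOVA asks: do the mean values of these three groups differ?
f_stat, p_value = stats.f_oneway(*groups.values())
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")  # significant at 90% only if p < 0.10
```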
At this point in the project, the client was displeased and told us to test using the "definitely/probably will buy" percentages (the top two box). This is another testing option that makes sense. The chi-square test addresses the question: Do these three percentages differ? It is the proper test to use when there are three or more percentages to test across three different groups of people. We conducted it and learned that the percentages did not differ significantly at the 90 percent confidence level. It told us that, with respect to positive purchase interest, consumer reaction in terms of the top two box was the same across all three products.
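As a hedged illustration on made-up counts, the chi-square version is a single test across all three concepts at once:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: Concepts A, B, C. Columns: [top-two-box, everyone else].
# Counts are hypothetical, not the study's actual data.
table = np.array([
    [70, 30],
    [58, 42],
    [68, 32],
])

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.3f}")
# The question answered: do these three percentages differ? With these
# made-up numbers, p > 0.10, so no difference at the 90% confidence level.
```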
The wrong test
The client was displeased. Having conducted the testing himself, he had learned that Concept B was significantly lower than Concept A, both in the top two box and in the top box, at the 90 percent confidence level. He told us not to use the chi-square, but to use the test the data tables use.
The Z test addresses the question: Do these two percentages differ? When it is misused in a situation where there are three or more groups, it disregards key information and makes its determination after having thrown out data. To please the client, we conducted multiple Z tests and determined that there were no statistically significant differences between any of the three pairs (A vs. B; A vs. C; B vs. C) at the 90 percent confidence level.
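Here is a sketch of one round of pairwise Z tests using statsmodels, on the same hypothetical counts as above. Each call compares two groups and simply ignores the third. Note that, on borderline numbers like these, different Z test formulas can disagree about significance, which is exactly the discrepancy that surfaced next.

```python
from itertools import combinations
from statsmodels.stats.proportion import proportions_ztest

top2 = {"A": 70, "B": 58, "C": 68}      # hypothetical top-two-box counts
n = {"A": 100, "B": 100, "C": 100}      # hypothetical sample sizes

for g1, g2 in combinations(top2, 2):
    z, p = proportions_ztest([top2[g1], top2[g2]], [n[g1], n[g2]])
    print(f"{g1} vs. {g2}: z = {z:.2f}, p = {p:.3f}")
# This implementation applies no continuity correction; with these numbers,
# A vs. B crosses the 90 percent line (p < 0.10).
```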
The client had another person in his department conduct the test, and that testing showed, as the client's had, that the top two box for A was significantly higher than B's at the 90 percent confidence level.
Fairly confused at this point, we ran the data tables, which showed, exactly as the client said, that A was significantly higher than B at the 90 percent confidence level, both on the top two box and on the top box percentages.
The less-preferred formula
We then conducted the three tests by hand and compared our Z values with the client's. We learned that the client, his department mate, and the statistical testing in the survey table program had all used the less-preferred Z test formula. There are two versions of this test, and one of them does not use the recommended correction for continuity. This correction is, essentially, a very small adjustment that should be made because the basic formula assumes a continuous variable (peanut butter), while we are actually working with a discrete variable – people (peanuts; the count of respondents making up the top two box). Normally it makes no difference in the results, because it is so small. In this case, however, it made the difference between crossing the line into significance and not crossing it.
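The effect is easy to demonstrate. The sketch below implements the pooled two-proportion Z test both ways, on hypothetical counts chosen to sit near the 90 percent cutoff (z = 1.645):

```python
from math import sqrt

def two_prop_z(x1, n1, x2, n2, continuity):
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                  # pooled proportion
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    diff = abs(p1 - p2)
    if continuity:
        # Shrink the observed difference by half a respondent per sample,
        # because counts of people are discrete, not continuous.
        diff = max(0.0, diff - 0.5 * (1 / n1 + 1 / n2))
    return diff / se

print(f"uncorrected: z = {two_prop_z(70, 100, 58, 100, False):.3f}")  # 1.768
print(f"corrected:   z = {two_prop_z(70, 100, 58, 100, True):.3f}")   # 1.620
# Only the uncorrected version clears 1.645; significance appears or
# disappears depending on which formula the software uses.
```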
With that resolved, we discussed my client's desire to test every row of the scale, with the wrong statistical test, using the less-preferred formula. We were told that the client always does this and that we should do so. So, we did.
The wrong way
This procedure violates the fundamental logic behind testing. By testing the top two box, we have already tested the difference between these three concepts on this scale. When we then test the five rows of the scale (and various other combinations of rows) using multiple Z tests, the probabilities become so distorted that it is doubtful anyone knows the confidence level at which we are really operating.
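To put a rough number on that distortion, here is a back-of-the-envelope calculation. It optimistically treats the tests as independent (row tests on the same scale are correlated, so this is only a ballpark figure):

```python
# If k tests are each run at alpha = 0.10, the chance of at least one
# false positive among them is 1 - (1 - alpha)**k.
alpha = 0.10
for k in (1, 3, 5, 15):   # 15 = five scale rows x three concept pairs
    fwer = 1 - (1 - alpha) ** k
    print(f"{k:2d} tests at 90% confidence -> {fwer:.0%} chance of a false alarm")
```

Fifteen such tests push the chance of at least one spurious "significant" finding toward 80 percent, which is why the nominal 90 percent confidence level no longer means anything.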
So, we successfully used the less appropriate formula with the wrong test and followed the wrong procedure for testing. We remain baffled.