Understanding Chi Square
Chi Square lets you know whether two groups have significantly different opinions, which makes it a very useful statistic for survey research. It's applied to cross-tabulations (AKA pivot tables) which are simply breakdowns like this:
This article starts with the theory, and then has guidelines for using the statistic.
When we eyeball our table above, it looks like women are much more likely to answer Yes, but is it random variation or something we can count on? What Chi Square does is compare the actual or Observed data we have from respondents with an Expected value. In our two questions, the total answers are:
- Yes: 60, No: 40
- Female: 50, Male: 50
- Whole table: 100
If there were no relationship between the questions, then you would Expect a table that allocates those totals to look like this:
The formula for the upper-left cell is:
(TotalYes * TotalFemale) / TotalTable
( 60 * 50 ) / 100 = 30
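As a rough sketch of how the same formula fills in the rest of the Expected table, the Python below applies it to every cell. The 60 Yes, 50 Female and 100 overall totals come from the example; the 40 No and 50 Male totals are inferred by subtraction, and the function name `expected_counts` is made up for illustration.

```python
def expected_counts(row_totals, col_totals):
    """Expected count per cell: (row total * column total) / table total."""
    table_total = sum(row_totals)
    if table_total != sum(col_totals):
        raise ValueError("row and column totals must add up to the same table total")
    return [[r * c / table_total for c in col_totals] for r in row_totals]

# Rows: Female, Male. Columns: Yes, No.
row_totals = [50, 50]   # Male total is inferred (100 - 50)
col_totals = [60, 40]   # No total is inferred (100 - 60)
expected = expected_counts(row_totals, col_totals)
print(expected)  # [[30.0, 20.0], [30.0, 20.0]]
```

Because both row totals are equal here, each row gets the same Expected values; with uneven groups the rows would differ.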
In less tidy examples, the Expected values often have a decimal or two. Once we have all the Expected values, we need to find, for each individual cell, the difference between its Observed and Expected values, squared (so they're all positive) and then divided by the Expected value:
D = (O - E)^2 / E
Totals (Expected and Observed are the same): 60, 40 and 100 for the whole table
Adding all the differences, we get a Total Chi Square of 37.5—which is yet another interim value in this calculation. So on to the next stage.
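The adding-up step can be sketched in a few lines of Python. Note that the Observed counts below are not taken from the article's table: they are hypothetical values chosen to be consistent with the totals (60 Yes, 40 No, 50 respondents per row) and with the 37.5 result. The function name `chi_square_total` is also just illustrative.

```python
def chi_square_total(observed, expected):
    """Sum (O - E)^2 / E over every cell of the table."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            total += (o - e) ** 2 / e
    return total

# Rows: Female, Male. Columns: Yes, No.
observed = [[45, 5], [15, 35]]            # hypothetical, consistent with the totals
expected = [[30.0, 20.0], [30.0, 20.0]]   # from (row total * column total) / table total
print(chi_square_total(observed, expected))  # 37.5
```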
Many statistics rely on a concept called Degrees of freedom. The details vary stat to stat, but it's based on how many of the values in a calculation are free to vary once the totals are fixed. For Chi Square, the degrees of freedom are:
df = (# rows - 1) * (# columns - 1)
= ( 2 - 1) * ( 2 - 1) = 1
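For tables larger than 2 x 2 the same formula applies. A minimal sketch, assuming the table is passed as a list of rows without the totals row or column (the name `degrees_of_freedom` and the sample counts are illustrative only):

```python
def degrees_of_freedom(table):
    """df = (rows - 1) * (columns - 1) for a rectangular table of counts."""
    n_rows = len(table)
    n_cols = len(table[0])
    return (n_rows - 1) * (n_cols - 1)

print(degrees_of_freedom([[45, 5], [15, 35]]))            # 1
print(degrees_of_freedom([[10, 20, 30], [15, 25, 35]]))   # 2
```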
In our cast we now have:
- Assorted Observed and Expected values
- Total Chi Square = 37.5
- Degrees of freedom = 1
We have two more players, and those are the Probability and Critical Value.
Any time you have a statistic designed to "predict" for a larger population or tell you a value's validity or reliability, part of the calculation is a level of confidence. Sometimes you'll see this indicated as the level of risk such as 5%, and at other times it will be noted as the level of certainty, 95%. For Chi Square, the tables are based on the level of risk, with common thresholds of 10%, 5%, 2.5%, 1% and 0.1%. Each one of those risk levels has a Critical Value associated with it:
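Critical Values when df = 1:
- 10% risk: 2.71
- 5% risk: 3.84
- 2.5% risk: 5.02
- 1% risk: 6.63
- 0.1% risk: 10.83
(More values: see the "Upper" table)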
Our final step to calculate Chi Square is to compare our Total to the Critical Values. In our case, 37.5 > 10.83, which means the result is significant even at the 0.1% risk level (more than 99.9% certainty). If instead we only came up with a Total of 4.5, that's > 3.84 (the 5% value) but not > 5.02 (the 2.5% value), so we'd say it was 95% significant.
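The comparison can be sketched in Python for the df = 1 case only. The critical values are the standard ones for one degree of freedom, rounded to two decimals. The `risk_df1` helper relies on the fact that, with one degree of freedom, the probability of a Total at least this large by chance equals erfc(sqrt(Total / 2)); other degrees of freedom need a statistics library. Both function names are hypothetical.

```python
import math

# Critical values for df = 1, keyed by risk level (rounded to two decimals).
CRITICAL_VALUES_DF1 = {0.10: 2.71, 0.05: 3.84, 0.025: 5.02, 0.01: 6.63, 0.001: 10.83}

def risk_df1(chi_square):
    """Probability of a Total at least this large by chance alone (df = 1 only)."""
    return math.erfc(math.sqrt(chi_square / 2))

def strictest_level_passed(chi_square):
    """Smallest tabulated risk level whose critical value the Total exceeds."""
    passed = [risk for risk, crit in CRITICAL_VALUES_DF1.items() if chi_square > crit]
    return min(passed) if passed else None

print(strictest_level_passed(37.5))  # 0.001
print(strictest_level_passed(4.5))   # 0.05
print(strictest_level_passed(1.0))   # None
print(risk_df1(4.5) < 0.05)          # True
```

A result of `None` means the Total did not clear even the 10% threshold, so the difference could easily be random variation.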
If you're lucky, you have survey software or a statistics program which will take your Observed values and crunch everything for you. Some won't even make you specify a probability first.
If you don't have an application which makes this easy, try the excellent on-line calculator Professor Jeff Connor-Linton has posted on his Georgetown University site. (Note: Link removed after resource disappeared—suggested replacements appreciated!) Not only does this come up with the final significance in relatively plain English, it has a "verbose" mode which will give you all the calculations along the way.
While Microsoft Excel has a CHITEST function, it takes a bit of hand work. You have to manually generate all the Expected values, and what it gives you back is the probability rather than the Total Chi Square (our 37.5). If you have worked out the Total Chi Square yourself, you can turn it into a probability with the CHIDIST function, manually giving it the degrees of freedom.
Chi square can be used with any pair of single answer discrete questions. This includes:
- Likert scales
- Cities, product names, instructor names, etc.
- Dates once they've been grouped into periods
- Numbers once they've been grouped into ranges
The answers do not need to be ordered, equal or symmetrical—just discrete. This is part of what makes Chi Square a handy statistic for surveys.
"Mark all that apply" questions cannot be used as an individual respondent cannot exist in more than one cell of our table. For example, a woman answering the survey can't appear in both the Yes and No columns.
Presenting the information:
While the statistic has to be calculated on the counts, that's not necessarily the best approach for our brains to spot patterns. For example in this table we have over 3 times the number of In Store respondents as On-line:
In a report, it's easier for our brains to compare percentages:
You still want to keep the count totals in the report so that readers know the relative sizes of the groups.
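One way to produce the percentage view while keeping the count totals is sketched below. The counts and the Yes/No answer labels are hypothetical (chosen so that In Store has a bit over 3 times the respondents of On-line), and `row_percentages` is an illustrative name.

```python
def row_percentages(counts):
    """Convert each row of counts to percentages, keeping the row's count total."""
    report = {}
    for group, answers in counts.items():
        total = sum(answers.values())
        report[group] = {
            "percent": {a: round(100 * n / total, 1) for a, n in answers.items()},
            "n": total,
        }
    return report

# Hypothetical counts: a bit over 3 times as many In Store as On-line respondents.
counts = {
    "In Store": {"Yes": 186, "No": 124},
    "On-line": {"Yes": 38, "No": 57},
}
report = row_percentages(counts)
for group, row in report.items():
    print(group, row["percent"], "n =", row["n"])
```

The Chi Square itself should still be run on `counts`, not on the percentages.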
Cross-tabs can also be well suited to graphical views, including stacked bar charts, bar graphs and line/profile graphs.
Low count cells:
The guidelines on this vary, but if you have more than one cell with 5 or fewer respondents, the final calculation may overstate how significant your result is. If you do have this situation, either wait on this statistic until you have more data, or combine categories.
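A quick check for this situation might look like the sketch below. The threshold of 5 and the "more than one cell" rule follow the guideline above; the function names and sample tables are hypothetical. Some references apply the rule to the Expected values rather than the Observed counts, in which case you would pass the Expected table instead.

```python
def low_count_cells(table, threshold=5):
    """Return (row, column) positions of cells at or below the threshold."""
    return [
        (r, c)
        for r, row in enumerate(table)
        for c, count in enumerate(row)
        if count <= threshold
    ]

def needs_caution(table, threshold=5):
    """True when more than one cell is at or below the threshold."""
    return len(low_count_cells(table, threshold)) > 1

print(low_count_cells([[45, 5], [15, 35]]))  # [(0, 1)]
print(needs_caution([[45, 5], [15, 35]]))    # False
print(needs_caution([[12, 3], [4, 20]]))     # True
```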
Dropping answer options:
In our original example, our column scale might have been "Yes/Uncertain/No." If the Uncertain column totaled 0, we would have to drop it as the Expected values for it would have all been 0. This means the difference calculation would be attempting to divide by 0, which is challenging.
Completely empty rows or columns are the only answers you should ever drop. Even if there was just 1 response in the Uncertain column, you need to include that individual in the table for the statistic to be reliable. We can, however, combine Uncertain with Yes or No if needed.
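As a sketch of that rule, the helper below removes only columns that total 0 and leaves a column with even a single response in place. The name `drop_empty_columns` and the counts are made up for illustration.

```python
def drop_empty_columns(labels, observed):
    """Remove columns whose total is 0; every other column is kept as-is."""
    keep = [c for c in range(len(labels)) if sum(row[c] for row in observed) > 0]
    kept_labels = [labels[c] for c in keep]
    kept_table = [[row[c] for c in keep] for row in observed]
    return kept_labels, kept_table

labels = ["Yes", "Uncertain", "No"]
print(drop_empty_columns(labels, [[45, 0, 5], [15, 0, 35]]))
# (['Yes', 'No'], [[45, 5], [15, 35]])
print(drop_empty_columns(labels, [[44, 1, 5], [15, 0, 35]]))
# (['Yes', 'Uncertain', 'No'], [[44, 1, 5], [15, 0, 35]])
```

The same idea applies to empty rows: swap the roles of rows and columns.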
Combining categories:
Combining is used to increase the counts of cells when you have too many with infrequent responses, or simply to clarify the relationships for your analysis.
With an ordered scale such as a 5 level Likert, this could take the form of combining the upper and lower categories into a 3 level "Agree/Neither/Disagree" breakdown.
With unordered data such as product names, you might combine into categories. With city names you might group the information into geographic regions or urban/rural classifications.
The main issue is to make sure the categories are sufficiently related that you're not masking a relationship. When in doubt, first run the cross-tabulation and Chi Square on an expanded table, then start combining.
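Combining can be sketched as a simple relabel-and-add step. The 5 level scale labels, the mapping and the counts below are all hypothetical; the point is only that the counts of the original categories are added together under the new labels.

```python
from collections import Counter

# Hypothetical mapping from a 5 level Likert scale to 3 levels.
COMBINE = {
    "Strongly Agree": "Agree",
    "Agree": "Agree",
    "Neither": "Neither",
    "Disagree": "Disagree",
    "Strongly Disagree": "Disagree",
}

def combine_counts(counts, mapping):
    """Add up the counts of the original categories under their combined labels."""
    combined = Counter()
    for category, n in counts.items():
        combined[mapping[category]] += n
    return dict(combined)

counts = {"Strongly Agree": 3, "Agree": 14, "Neither": 9, "Disagree": 11, "Strongly Disagree": 2}
print(combine_counts(counts, COMBINE))
# {'Agree': 17, 'Neither': 9, 'Disagree': 13}
```

The same function works for unordered data, for example a mapping from city names to regions.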
Questions left blank:
In surveys, respondents will often skip one or both of the questions in your comparison. If this represents more than a couple of people, you may want to add a "No Answer" or "Empty" row and column. Just as with non-response sampling errors, sometimes there's a relationship in the people who don't give an answer.
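When building the table from raw responses, blanks can be given their own label instead of being dropped. A minimal sketch, assuming each response is a (row answer, column answer) pair where `None` or an empty string means the question was skipped; the function name and the data are hypothetical.

```python
from collections import Counter

def cross_tab(responses, blank_label="No Answer"):
    """Count (row answer, column answer) pairs, labelling blanks instead of dropping them."""
    table = Counter()
    for row_answer, col_answer in responses:
        table[(row_answer or blank_label, col_answer or blank_label)] += 1
    return dict(table)

# Hypothetical responses; None or "" stands for a skipped question.
responses = [("Female", "Yes"), ("Male", None), ("", "No"), ("Male", None), ("Female", "Yes")]
print(cross_tab(responses))
# {('Female', 'Yes'): 2, ('Male', 'No Answer'): 2, ('No Answer', 'No'): 1}
```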
And that's Chi Square in a nutshell! (Or as close to nutshells as inferential statistics get.)