A Quick Reminder Regarding Error Margins
As we draw ever closer to election day, we see more and more people referring to polls being "within the error margin" or "outside the error margin" or even worse, "a statistical tie." Even Tom Sowell made the statement today.
So let's remind ourselves quickly: When you are dealing with polls, you are always within the error margin. If a poll shows one candidate at 60% and another at 40%, the poll is still within the error margin.
Polling and error margins can be thought of this way: There is a giant bag of 1,000,000 red, blue, and green marbles. You want to know what the bag looks like. If you draw three marbles, there's a chance that you will draw one blue, one green, and one red marble, and know the "true" value. But there's also a good chance that you'll draw all blue, all green, or all red marbles.
The more marbles you draw out, the less likely it is that your sample fails to resemble the bag. But it almost always remains possible that, if you draw, say, 10,000 marbles, you could draw 10,000 marbles of the same color, or 10,000 marbles that are only red and blue. Even doing everything possible correctly, you can whiff. When you are sampling, there are always error margins, and they are determined in part by the sample size.
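To make the marble picture concrete, here is a minimal simulation sketch. The color shares are hypothetical (the bag's real makeup is never stated), and drawing with replacement is assumed as a stand-in for drawing from a very large bag. It reports how far, on average, the red share of a sample lands from the red share of the bag.

```python
import random

COLORS = ["red", "blue", "green"]
TRUE_SHARES = [0.40, 0.35, 0.25]  # hypothetical makeup of the bag


def average_miss(sample_size, trials=200, seed=0):
    """Average gap between the sampled red share and the true red share."""
    rng = random.Random(seed)  # fixed seed so reruns give the same numbers
    total = 0.0
    for _ in range(trials):
        # Drawing with replacement approximates a bag far larger than the sample.
        draw = rng.choices(COLORS, weights=TRUE_SHARES, k=sample_size)
        observed_red = draw.count("red") / sample_size
        total += abs(observed_red - TRUE_SHARES[0])
    return total / trials


for size in (3, 30, 300, 3000):
    print(size, round(average_miss(size), 3))
```

The average miss shrinks as the sample grows, but it never reaches zero, which is the point about always having an error margin.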
I say "in part" because -- and this is important -- with a given sample size, there are an infinite number of error margins. The question is, how confident do we want to be? If a poll of 750 people is conducted and the result is 50%, we can be 95% confident that the "true" result is located somewhere between 46.4% and 53.6%. However, we can also be 90% confident that the "true" result is located somewhere between 47% and 53%. We can be 75% confident that the "true" result is located somewhere between 47.9% and 52.1%. And we can be 51% confident (i.e., it is more likely than not the case) that the "true" result is somewhere between 48.75% and 51.25%. Almost all polls use 95% as their level of confidence, but it is important to remember that this is what they are saying, and if something is barely within the error margin at 95% confidence, we're still pretty darned sure there is a difference.
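The figures for the 750-person poll can be roughly reproduced with the standard normal approximation for a proportion. This is a sketch under that assumption (a simple random sample, a result of 50%). The post does not say which formula it used, and the function name is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, confidence, p=0.5):
    """Half-width of the normal-approximation interval for one proportion."""
    # Two-sided interval: put (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p * (1 - p) / n)


for confidence in (0.95, 0.90, 0.75, 0.51):
    moe = margin_of_error(750, confidence)
    print(f"{confidence:.0%} confident: 50% plus or minus {moe:.2%}")
```

Same poll, same 750 people: only the confidence level changes, and the margin changes with it.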
You should also remember that most polls have two error margins, not one. Even very smart commentators mix this up. Since you are sampling two candidates, each has its own number. This is somewhat complicated by the fact that the two numbers are semi-binary -- to oversimplify, Obama's vote share can only go up so much while McCain's also goes up, and the "undecideds" category, plus the shape of the bell curve, complicates things. The rule of thumb is that the error margin for the spread is about 1.7 times the actual error margin, but you aren't going to kill yourself if you simplify and use 2x the error margin as your guide (to 95% certainty).
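For the curious, the multiplier on the spread can be worked out rather than memorized. The sketch below assumes a simple random sample in which each respondent picks candidate A, candidate B, or neither, and uses the textbook variance of the difference between two shares from the same sample. The function names and the 48/44 example are hypothetical, not taken from any actual poll.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, p=0.5, confidence=0.95):
    """Half-width of the normal-approximation interval for one share."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p * (1 - p) / n)


def spread_margin(n, p_a, p_b, confidence=0.95):
    """Half-width of the interval for the lead (p_a - p_b) in one sample."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # The two shares come from the same respondents, so they move against
    # each other; that is why this is not a simple sum of two margins.
    variance = (p_a + p_b - (p_a - p_b) ** 2) / n
    return z * sqrt(variance)


# Hypothetical poll of 750: 48% to 44%, with 8% undecided or other.
single = margin_of_error(750)
spread = spread_margin(750, 0.48, 0.44)
print(f"single-number margin: {single:.2%}")
print(f"margin on the lead:   {spread:.2%}")
print(f"multiplier:           {spread / single:.2f}")
```

Under these assumptions the multiplier depends on how many respondents fall outside the two candidates: it approaches 2 when almost nobody is undecided and drops as that group grows, so using 2x as the guide errs on the safe side.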
Because there are an infinite number of error margins, it's impossible to isolate methodological error from sampling error. So just remember that both are almost always present, and you'll be fine.
Play around with this for a while, if you want to see better how this works.
And remember, this is just sampling error. There is also what can be thought of as "methodological error." In reality, society is not distributed like a bag of marbles (more or less evenly). It is distributed like a Snickers bar. If you grab into a shaken bag of marbles, you'll probably get a good distribution, and it shouldn't vary much regardless of where you grab into the bag of marbles.
A Snickers bar is different. If you slice even slightly off-vertical, you'll get too much caramel/peanut mix, or too much nougat. Heck, you might even end up with only chocolate.
In the real world, you end up dealing with methodological problems that prevent pollsters from getting a good "slice" of society, such as people lying about whether they will vote (solved somewhat by likely voter screens), trying to reach cellphone-only households, or the biggest problem for modern pollsters: increased rates of declined response. You also have pollsters who inadvertently "prime the pump" by asking, say, a question about Bush job approval right before a question about who people will vote for. Even slight changes in questionnaire wording can affect the response elicited by the poll question. So even when you see two pollsters outside of the 95% error margin, there might not be a difference between what they are reporting; they might be effectively conducting two completely different examinations of society!


Comments
All at once class: "Thank you Mrs. Donovan"
If you're going to treat your readers as though they were third graders, perhaps you should use jelly beans instead of marbles. That might help you hold their attention.
There, using candy is better. But didn't you know that Snickers is for liberal pussies that don't live in the "pro-America part" of America?
Try using Baby Ruth. Now there's an all American candy bar real Americans can sink their teeth into.
But I'm sorry to go off topic,... you were talking about McCain losing his marbles or something...
Cheers.
Can't ... resist... mocking...
Sean must have at least gotten a C+ in statistics class durin' all dem yeers when he off gettin' his fancy book lernin'.
How'd you do that Sean? I can see it now.
"Look, perfessor... I got this here giant bag full of marbles... and when I reach in and pull one out you'd think it'd be green... BUT LOOK!!! NO!!! Itl's BLUE!!!.... Go figure."
Devastating: Obama +10
America shows the love!*
http://www.msnbc.msn.com/id/27297013/
*YMMV
Thanks Sean
Excellent article. And I am a liberal that wants Obama to win!
A little editing could have got that down to a sentence.