You have all seen news reports or ads saying "A scientific study shows that ...". This may or may not be telling you much. How about "3 out of 4 dentists surveyed recommend Barfo toothpaste"? That may mean that only four dentists were asked and the fourth one didn't agree with the other three. What is the size of the sample? HOW MANY dentists were surveyed?
How do you find out such things? Suppose you wanted to find out something about a preference of all the students on campus. The best way would be to get a list of all students and ask every one of them for their preference. This would be a "census", in which every member of the group is polled. There's no error here - you ask everyone.
Unfortunately, taking a census is normally not feasible. You have to resort to studying a smaller number. This is called sampling, and it is the bane of statisticians and pollsters. If it is done right, you can sample a few percent of a population and come up with a reasonable estimate of the opinion or preference of the whole population. It's relatively easy to understand what is required for a good sample and doggone near impossible to do it.
Back to our campus example. Since a census is not feasible, we must take a sample. How to do it? If you are lazy, you might go out and stand by the campus flagpole and nab students who walk by. Random sample? Nope. It will ignore all students whose schedule does not have them walking by the flagpole that day. How about printing a response form in the campus newspaper and asking students to send in their answers? Better? Nope. You'll get responses only from those who have a strong feeling about the subject. You won't hear from those who don't give a hoot.
The Random Sample
So how are we going to get a TRUE random sample of the student population? There's only one way - get a list of ALL students registered. Select a sample with a random number generator or table. Talk to those students - ALL of them. You may have to chase some down, but you have to reach ALL of them.
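Here is a minimal sketch of the "list plus random number generator" step, assuming a made-up roster of 12,000 student ID numbers and a sample of 300. The roster, the ID format, the sample size, and the function name `simple_random_sample` are all hypothetical; only the idea of drawing from the complete list comes from the text above.

```python
import random

def simple_random_sample(roster, sample_size, seed=None):
    """Draw sample_size distinct students; every student on the roster is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(roster, sample_size)

# Hypothetical roster standing in for the registrar's list of ALL registered students.
roster = [f"S{n:05d}" for n in range(1, 12001)]

sample = simple_random_sample(roster, 300, seed=2024)
print(len(sample), "students to contact, for example:", sample[:3])
```

The code only does the easy part. The hard part is still reaching every one of the 300 students it picks.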
Here's the principle. There are two things you have to do. First - you need to correctly define the population you are interested in sampling. This is not as easy as it may sound. The other requirement is making sure that EVERY member of that population has an EQUAL probability of being chosen for the sample. Once you have chosen your sample, you have to really work to contact EVERY ONE of those chosen.
The real bugaboo of sampling is something called a "selection effect." This is something which influences the choice of a sample and wipes out the required randomness. Here are some possible situations that involve selection effects.
Volunteer sample. Suppose our student sampler decides to advertise in the campus newspaper for people to be interviewed for the preference study. The "sample" will consist of people who volunteer. People who don't care about the question won't bother.
Volunteer response. A lot of magazine "polls" fall into this category. The magazine prints a hot-button question and gives you a PO Box, an e-mail address, or a phone number. You respond to the poll by mail, e-mail, etc. This is a volunteer response. People who don't care won't bother.
Convenience sample. Nabbing students who walk by the flagpole is taking a convenience sample. You do this because it is easy. The problem is that a large number of students are excluded.
Failure to define population. This would occur if the student doing the study sampled students in the business school only. Here the sample excludes most of the university.
A selection effect causes some members of the population to be chosen preferentially over others. This is NOT random sampling. An undetected selection effect can invalidate a study. On the other hand, a selection effect can be creatively used to make a study come out in a specific and desired way. Read "Tainted Truth" by Cynthia Crossen. It's on the auxiliary reading list page.
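To see how badly a selection effect can skew a result, here is a small simulation sketch. Every number in it is invented for illustration: a campus of 20,000 students where exactly 60% favor some proposal, and assumed reply rates of 5% for those in favor and 30% for those against (the opponents are the ones with strong feelings).

```python
import random

rng = random.Random(1936)  # fixed seed so the run is repeatable

# Hypothetical campus: True = favors the proposal, False = opposes it.
population = [True] * 12000 + [False] * 8000
true_share = sum(population) / len(population)

# Random sample: every student has the same chance of being chosen.
random_sample = rng.sample(population, 1000)
random_share = sum(random_sample) / len(random_sample)

# Volunteer response: assumed reply rates, much higher among opponents.
REPLY_RATE = {True: 0.05, False: 0.30}
volunteers = [favors for favors in population if rng.random() < REPLY_RATE[favors]]
volunteer_share = sum(volunteers) / len(volunteers)

print(f"true share in favor:       {true_share:.2f}")
print(f"random sample of 1,000:    {random_share:.2f}")
print(f"{len(volunteers)} volunteer replies: {volunteer_share:.2f}")
```

Under these assumptions the volunteer group is larger than the random sample, yet it lands far from the true 60% while the random sample lands close to it. More replies do not fix a selection effect.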
The magazine "Literary Digest" was popular in the 1920s and 1930s. It had a practice of polling its readers to predict the outcome of presidential elections in the U.S. The Digest had run up a good record of predicting elections - before 1936. That year, Franklin Roosevelt was running for a second term against Republican Alf Landon. The magazine mailed its questionnaires to the chosen population: magazine subscribers, automobile owners, telephone subscribers and voters. It got back 23% of 10,000,000 questionnaires, tabulated the results, and confidently announced that Landon would win by a 3 to 2 margin. When the actual election came around, Roosevelt won by a large margin. How could the Digest have blown the poll so badly? To add insult to injury, a young pollster named Gallup surveyed only 50,000 people and called the election correctly. What went wrong? What did Gallup do right?
The Digest made not one, but TWO major errors of sampling. First - look at the population they mailed the questionnaires to. In 1936 those who owned cars, had telephones or subscribed to various magazines tended to be more educated, have higher incomes (this during the Depression) and be Republican! This group was decidedly NOT a sample of the entire U.S. voting population. Second - and probably more serious - was the way the poll was conducted. The Digest mailed out the questionnaires and asked the people to respond by mail. Recognize this? It's a volunteer response! Those who took the time to fill out the questionnaire and return it had strong feelings - they didn't like Roosevelt and WANTED Landon to win! Those who liked Roosevelt didn't bother!
Those two sampling errors bombed the poll. Oh - and what about Gallup? His sample was far smaller - only 50,000. The difference was that Gallup had figured out the importance of good samples and obtained a reasonably good RANDOM sample of the U.S. POPULATION. His poll prediction was correct. The final insult is wonderful - Gallup took a sample of 3,000 people at random from the same lists that the Digest used and sent them a postcard asking them how they planned to vote. Using the same lists and the same volunteer response method yielded a result almost identical to the Digest's poll. Gallup not only predicted (before the election) that the Digest would get it wrong, but also correctly predicted by how much!
How Gallup's poll worked
To predict elections, the Literary Digest generated its samples from vehicle registrations and telephone directories. These were unrepresentative samples because, in the 1930s, cars and telephones were more often owned by middle and upper classes. In previous elections, the Digest had correctly predicted the result because rich and poor had voted alike. But in 1936, during the Depression, wealthier Americans more often favored the Republican, Landon, while poorer Americans more often favored the Democrat, Roosevelt.
Gallup's polling method in 1936 was a quota sample. "The idea was to canvass groups of people who were representative of the electorate. Gallup sent out hundreds of interviewers across the country, each of whom was given quotas for different types of respondents; so many middle-class urban women, so many lower-class rural men, and so on." Thus, even though Gallup had 3,000[?] respondents and the Digest had 10 million, Gallup's results better represented the entire population. [The class notes say Gallup had 50,000 respondents, but the source below says 3,000.]
Source: George Gallup and the Scientific Opinion Poll, from PBS's The First Measured Century, Program Segment 7.
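A rough sketch of how one interviewer might fill quotas, assuming invented groups and quota sizes. The function name `quota_sample`, the `social_class`/`area`/`sex` fields, and the 600 hypothetical passersby are illustrations only, not Gallup's actual procedure.

```python
import random
from collections import Counter
from itertools import product

def quota_sample(passersby, quotas):
    """Interview people in the order met until every group's quota is filled."""
    remaining = dict(quotas)
    interviewed = []
    for person in passersby:
        group = (person["social_class"], person["area"], person["sex"])
        if remaining.get(group, 0) > 0:
            interviewed.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):  # all quotas met, stop canvassing
            break
    return interviewed

# Hypothetical street traffic: 50 people of every class/area/sex combination, shuffled.
groups = list(product(["lower", "middle", "upper"], ["urban", "rural"], ["female", "male"]))
passersby = [{"social_class": c, "area": a, "sex": s} for c, a, s in groups for _ in range(50)]
random.Random(7).shuffle(passersby)

# Hypothetical quotas handed to this interviewer.
quotas = {("middle", "urban", "female"): 5, ("lower", "rural", "male"): 3}
interviewed = quota_sample(passersby, quotas)
print(Counter((p["social_class"], p["area"], p["sex"]) for p in interviewed))
```

Notice that inside each group the interviewer simply takes whoever turns up first. The quotas control the mix of groups, but nothing random happens within a group.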
However, quota sampling is not the best method of sampling. In the 1948 election, Gallup's prediction was wrong. Investigations found that the error came from quota sampling, from the interviewing method, and from misjudging how undecided voters would end up voting. Starting with the 1956 election, Gallup switched from quota sampling to probability sampling, where everyone in the population under consideration has an equal chance of being chosen.
So how does The Gallup Organization ensure a random sample today? Modern Gallup polls are conducted by telephone. (Today, unlike 1936, 95% of all households have a telephone.) However, they do not use telephone directories to generate their samples, since unlisted numbers are not represented. Instead, they use a computer algorithm to generate "a list of all household telephone numbers in the United States," then select a subset from that list to call.
Source: "How polls are conducted" by Newport et al., excerpt from Where America Stands, posted on The Gallup Organization's FAQ page.
Refer to the article for more information and other important factors on how they conduct their polls.
American Idol Voting
The EXTREMELY popular TV show "American Idol" used a phone-in method of voting to "choose" the winner from the final two contestants. Viewers were given two phone numbers; you call one number to vote for contestant A and the other number for contestant B. Prof. Cotton's wife recalled one such choice between two very good performers. The decision of the voters was VERY close, but there was, of course, a winner. Both the winner and the runner-up put out CDs of their music. Funny thing - the runner-up sold more records.
Was the phone-in vote really valid? Unfortunately, in that case, no. What you have are automated systems that take the calls and count them. They answer the phone, respond in some way, and count a vote. Some class members had actually tried to phone in a vote but got a busy signal. They never got in to vote. This defines the problem - the vote-counting systems were saturated, counting votes at their maximum rate and still not being able to handle all incoming calls. A large, and unknown, number of calls never got through because of the overload. In this situation of saturation, the final vote count reflects ONLY the number of calls each system can handle during the voting period.
If the systems are fairly well matched, their maximum call handling rates will be nearly the same, which will ALWAYS result in the final vote being very close. The winner is the one whose vote phone number got the faster system! The ACTUAL number of viewers favoring each contestant is NOT KNOWN! That's because there is no way to know how many calls bounced on busy signals. All you know is that both contestants generated enough interest to saturate both systems.
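The saturation argument can be put in a few lines of code. This is a sketch with invented numbers: the call attempts and line capacities below are hypothetical, and `counted_votes` is just a name for "the smaller of calls attempted and calls the system can handle."

```python
def counted_votes(calls_attempted, capacity):
    """A phone system can count at most `capacity` calls during the voting period."""
    return min(calls_attempted, capacity)

# Hypothetical viewer interest: A has more than twice as many callers as B.
attempts = {"A": 9_000_000, "B": 4_000_000}
# Hypothetical system limits: nearly matched, B's line is slightly faster.
capacity = {"A": 2_000_000, "B": 2_050_000}

counted = {name: counted_votes(attempts[name], capacity[name]) for name in attempts}
winner = max(counted, key=counted.get)

print(counted)            # {'A': 2000000, 'B': 2050000}
print("winner:", winner)  # winner: B
```

Both systems are saturated, so the counts show only the two capacities. The 9 million and 4 million attempts never appear in the result, which is exactly why the real preference of the viewers is not known.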