Quiz: Designing Studies — Sampling and Experiments

Test your understanding before moving on. Target: 70% or higher to proceed confidently.

Section 1: Multiple Choice (1 point each)

1. What is the key difference between an observational study and an experiment?

A) Observational studies use larger samples than experiments
B) In an experiment, the researcher deliberately imposes a treatment; in an observational study, the researcher observes without intervening
C) Experiments always use random sampling; observational studies never do
D) Observational studies are always biased; experiments are never biased

Answer

**B)** In an experiment, the researcher deliberately imposes a treatment; in an observational study, the researcher observes without intervening. *Why B:* This is the defining distinction. In an experiment, the researcher actively changes something (applies a treatment) and measures the effect. In an observational study, the researcher records what already exists without influencing it. *Why not A:* Sample size has nothing to do with whether a study is observational or experimental. *Why not C:* Both types of studies can use random sampling. Random *assignment* (different from random *sampling*) is what distinguishes experiments. *Why not D:* Both can be biased. Experiments can suffer from biased samples, poor blinding, or other design flaws. *Reference:* Section 4.1

2. In a stratified sample, the population is first:

A) Randomly assigned to treatment and control groups
B) Divided into naturally occurring clusters
C) Divided into subgroups based on a characteristic, then randomly sampled within each subgroup
D) Listed in order, and every kth member is selected

Answer

**C)** Divided into subgroups based on a characteristic, then randomly sampled within each subgroup. *Why C:* Stratified sampling first divides the population into strata (subgroups like income levels, age groups, or geographic regions), then takes a random sample from each stratum. This guarantees representation of all subgroups. *Why not A:* That describes random assignment in an experiment, not a sampling method. *Why not B:* Dividing into naturally occurring groups and selecting entire groups describes cluster sampling. *Why not D:* Selecting every *k*th member describes systematic sampling. *Reference:* Section 4.2
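The two steps of stratified sampling (split into strata, then sample randomly inside each one) can be sketched in a few lines of Python. This is a minimal illustration, not a production sampler: the population, the `age_group` field, and the helper name `stratified_sample` are all made up for the example.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, per_stratum, seed=0):
    """Group units by stratum, then draw a simple random sample within each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for label in sorted(strata):
        members = strata[label]
        # Sample without replacement inside this stratum only.
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

# Hypothetical population: 1,000 people tagged with an age group.
groups = ["18-34", "35-54", "55+"]
population = [{"id": i, "age_group": groups[i % 3]} for i in range(1000)]

sample = stratified_sample(population, lambda p: p["age_group"], per_stratum=20)
print(len(sample), "people sampled,", "20 from each age group")
```

Because the draw happens separately inside every stratum, no age group can be left out of the sample by chance.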

3. The 1936 Literary Digest poll predicted the wrong winner of the presidential election primarily because of:

A) A sample that was too small
B) Selection bias and nonresponse bias
C) Confounding variables
D) The placebo effect

Answer

**B)** Selection bias and nonresponse bias. *Why B:* The poll used telephone directories and car registrations — which overrepresented wealthier Americans (selection bias). And only 24% of those contacted responded, with respondents differing systematically from non-respondents (nonresponse bias). *Why not A:* The sample was actually enormous (2.4 million responses). Size wasn't the problem — the bias was. *Why not C:* Confounding involves relationships between variables; this was a sampling problem, not a variable-relationship problem. *Why not D:* The placebo effect applies to experiments with treatments, not to polls. *Reference:* Section 4.4

4. A confounding variable is best described as:

A) A variable that is difficult to measure accurately
B) A variable that is associated with both the explanatory variable and the response variable
C) A variable that has no effect on the outcome
D) A variable that is only present in experiments

Answer

**B)** A variable that is associated with both the explanatory variable and the response variable. *Why B:* A confounder creates a misleading association between two variables because it is related to both. Temperature confounds the ice cream–drowning relationship because it drives both ice cream sales and swimming (hence drowning). *Why not A:* Measurement difficulty is a separate issue from confounding. *Why not C:* By definition, a confounder does affect (or is associated with) the outcome — that's what makes it a confounder. *Why not D:* Confounders exist in both observational studies and experiments. Randomization in experiments helps balance confounders across groups, but doesn't eliminate their existence. *Reference:* Section 4.6
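A small simulation can make the ice cream and drowning example concrete. The sketch below assumes a toy model in which temperature drives both quantities and neither affects the other; every coefficient and unit is invented for illustration. It compares the correlation across all days with the correlation on days where temperature is held nearly constant.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)

# Toy model (assumed): temperature is the only common cause.
temps = [rng.uniform(0, 35) for _ in range(2000)]          # degrees Celsius
ice_cream = [10 * t + rng.gauss(0, 40) for t in temps]     # sales, arbitrary units
drownings = [0.2 * t + rng.gauss(0, 1.5) for t in temps]   # incident index, arbitrary units

overall = pearson(ice_cream, drownings)

# Hold the confounder roughly fixed: keep only days between 20 and 22 degrees.
band = [i for i, t in enumerate(temps) if 20 <= t < 22]
within = pearson([ice_cream[i] for i in band], [drownings[i] for i in band])

print(f"all days: r = {overall:.2f}; days at 20-22 degrees: r = {within:.2f}")
```

Under these assumptions the association is strong across all days and close to zero once temperature barely varies, which is the signature of a confounded relationship.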

5. In an A/B test, what is the role of randomization?

A) To make sure the sample is large enough
B) To ensure the treatment and control groups are comparable in all characteristics except the treatment
C) To make the results statistically significant
D) To prevent participants from knowing which group they are in

Answer

**B)** To ensure the treatment and control groups are comparable in all characteristics except the treatment. *Why B:* Randomization balances all characteristics (age, preferences, behavior patterns, device type, and any other variable, measured or not) across the groups on average. As a result, a difference in outcomes larger than chance variation would explain can be attributed to the treatment (the layout change, the feature, etc.) rather than to pre-existing differences between the groups. *Why not A:* Sample size is determined by the study design, not by randomization. *Why not C:* Randomization helps ensure valid results, but doesn't guarantee statistical significance. Significance depends on the size of the effect, the variability of the outcome, and the sample size. *Why not D:* That describes blinding, which is a separate (though related) concept. *Reference:* Sections 4.5, 4.7
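The balancing effect of randomization can be checked directly. The following sketch assumes a hypothetical pool of 10,000 users with one pre-existing characteristic (age), assigns them to two arms by shuffling, and compares the average age in each arm. The helper name `randomize` and the user records are illustrative only.

```python
import random
from statistics import mean

def randomize(users, seed=0):
    """Shuffle a copy of the user list and split it into two equal arms."""
    rng = random.Random(seed)
    shuffled = users[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical users; age stands in for any pre-existing characteristic.
gen = random.Random(42)
users = [{"id": i, "age": gen.randint(18, 70)} for i in range(10_000)]

group_a, group_b = randomize(users)

mean_a = mean(u["age"] for u in group_a)
mean_b = mean(u["age"] for u in group_b)
age_gap = abs(mean_a - mean_b)
print(f"mean age A = {mean_a:.2f}, mean age B = {mean_b:.2f}")
```

The two averages come out close but not identical: randomization balances groups on average, and the leftover gap is the chance variation that a significance test accounts for.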

6. Which of the following is an example of survivorship bias?

A) A survey on social media gets more responses from younger people
B) A study of successful entrepreneurs reveals they all took big risks, ignoring the many risk-takers who failed
C) A researcher asks leading questions that push respondents toward a particular answer
D) A clinical trial doesn't use a placebo control group

Answer

**B)** A study of successful entrepreneurs reveals they all took big risks, ignoring the many risk-takers who failed. *Why B:* Survivorship bias occurs when you only observe the "survivors" (successes) and miss the failures. Looking only at successful entrepreneurs who took risks ignores all the entrepreneurs who took similar risks and failed — leading to the false conclusion that risk-taking leads to success. *Why not A:* This is selection bias (the sample overrepresents younger demographics), not survivorship bias specifically. *Why not C:* This describes response bias from leading questions. *Why not D:* This is a design flaw (lack of control group), not survivorship bias. *Reference:* Section 4.3
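To see why looking only at survivors is misleading, consider a toy simulation. It assumes, purely for illustration, that 90% of all founders take big risks and that success is unrelated to risk-taking (a flat 5% chance for everyone).

```python
import random

rng = random.Random(7)

founders = []
for _ in range(20_000):
    took_risk = rng.random() < 0.9     # assumed: most founders take big risks
    succeeded = rng.random() < 0.05    # assumed: success is unrelated to risk
    founders.append((took_risk, succeeded))

def share_risk_takers(rows):
    """Fraction of founders in `rows` who took big risks."""
    return sum(risk for risk, _ in rows) / len(rows)

survivors = [f for f in founders if f[1]]
failures = [f for f in founders if not f[1]]

print(f"risk-takers among successes: {share_risk_takers(survivors):.0%}")
print(f"risk-takers among failures:  {share_risk_takers(failures):.0%}")
```

In this setup nearly all successful founders took big risks, but so did nearly all failed ones. A study that only interviews the successes never sees the second number, so it cannot tell whether risk-taking made any difference.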

7. A double-blind experiment is one in which:

A) Two different treatments are compared
B) The sample size is doubled for accuracy
C) Neither the participants nor the researchers who interact with them know which group each participant is in
D) The experiment is repeated twice to confirm results

Answer

**C)** Neither the participants nor the researchers who interact with them know which group each participant is in. *Why C:* A double-blind study guards against two sources of bias at once: participant expectations and researcher behavior. Participants can't change their behavior based on which group they're in, and researchers can't unconsciously treat groups differently. *Why not A:* Many experiments compare two treatments, but that doesn't make them double-blind. *Why not B:* Sample size has nothing to do with blinding. *Why not D:* Repeating a study is replication, not double-blinding. *Reference:* Section 4.7

8. A researcher randomly selects 50 city blocks from a city, then surveys every household on those blocks. This is an example of:

A) Simple random sampling
B) Stratified sampling
C) Cluster sampling
D) Systematic sampling

Answer

**C)** Cluster sampling. *Why C:* The city blocks are the clusters (naturally occurring groups). The researcher randomly selects entire clusters, then includes all members within the selected clusters. This is the hallmark of cluster sampling. *Why not A:* In simple random sampling, each individual household would have an equal chance of being selected independently. Here, households are selected in groups. *Why not B:* Stratified sampling would divide the city into subgroups (strata) and randomly sample from each. Here, entire blocks are selected — not a random subset within each block. *Why not D:* Systematic sampling would select every *k*th household from a list, not entire blocks. *Reference:* Section 4.2
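The block survey in this question can be sketched as follows. The city layout (400 blocks of 8 households each) and the helper name `cluster_sample` are assumptions made for the example; only the "pick 50 whole blocks, keep every household" logic comes from the question.

```python
import random

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly pick whole clusters, then keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical city: 400 blocks, each with 8 households.
blocks = {
    f"block-{b:03d}": [f"block-{b:03d}/house-{h}" for h in range(8)]
    for b in range(400)
}

chosen_blocks, households = cluster_sample(blocks, n_clusters=50)
print(len(chosen_blocks), "blocks selected,", len(households), "households surveyed")
```

The random draw happens once, at the block level. No sampling takes place inside a chosen block, and the 350 unchosen blocks contribute no households at all.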

9. Which statement best describes why we can't always run experiments?

A) Experiments are always more expensive than observational studies
B) Some research questions involve variables that can't be ethically or practically assigned to participants
C) Experiments always require more participants than observational studies
D) Observational studies are always more accurate than experiments

Answer

**B)** Some research questions involve variables that can't be ethically or practically assigned to participants. *Why B:* You can't randomly assign people to smoke, to live in poverty, to experience racism, or to have a specific gender or age. Ethical constraints prevent experiments that would harm participants, and practical constraints prevent experiments on inherent characteristics. These questions must be studied observationally. *Why not A:* While experiments can be expensive, cost isn't the primary reason they can't always be conducted. *Why not C:* Sample size requirements vary by study, not by study type. *Why not D:* Observational studies are not inherently more accurate — they're actually more vulnerable to confounding. *Reference:* Section 4.7

10. A website asks visitors, "What is the most important issue facing the country today?" and publishes the results as "what Americans think." What is the biggest problem with this claim?

A) The sample size is too small
B) The question is biased
C) This is a convenience/voluntary response sample that is not representative of all Americans
D) The survey should have been double-blind

Answer

**C)** This is a convenience/voluntary response sample that is not representative of all Americans. *Why C:* Only people who visit the website and choose to respond are included. This group is likely very different from the general American population — they may be younger, more internet-savvy, more politically engaged, or more aligned with the website's editorial perspective. The results cannot be generalized to "all Americans." *Why not A:* The sample could be quite large, but size doesn't fix bias. *Why not B:* The question itself might be fine — the problem is who answers it. *Why not D:* Blinding is relevant to experiments, not opinion surveys. *Reference:* Sections 4.2, 4.3

Section 2: True/False (1 point each)

11. True or False: If a study has a very large sample size, it cannot be biased.

Answer

**False.** Bias is a *systematic* problem with how data is collected, not a *random* problem that shrinks with more data. The 1936 Literary Digest poll had 2.4 million responses and was catastrophically biased. A large biased sample just gives you a more precise wrong answer. *Reference:* Section 4.4
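The "more precise wrong answer" idea can be simulated. The sketch below uses illustrative numbers that are not taken from this quiz: true support of 61% in the full population, but only 43% among the people a biased sampling frame actually reaches. It compares a small random sample with a very large biased one.

```python
import random

def poll(n, p_support, rng):
    """Share of n respondents who support the candidate, when each
    respondent supports with probability p_support."""
    return sum(rng.random() < p_support for _ in range(n)) / n

rng = random.Random(1936)

TRUE_SUPPORT = 0.61    # assumed support in the full population
FRAME_SUPPORT = 0.43   # assumed support among people the biased frame reaches

small_random = poll(1_000, TRUE_SUPPORT, rng)     # small, but drawn from everyone
large_biased = poll(200_000, FRAME_SUPPORT, rng)  # huge, but drawn from the wrong group

print(f"small random sample: {small_random:.3f}")
print(f"large biased sample: {large_biased:.3f}")
```

The large sample lands very close to 0.43 every time it is rerun, so it is precise. It is precise about the wrong group, though, and adding more respondents from the same frame only tightens the estimate around the wrong value.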

12. True or False: Random assignment in an experiment helps protect against confounding variables.

Answer

**True.** Random assignment distributes all variables — both known and unknown confounders — roughly equally across the treatment and control groups. This means any observed difference in outcomes is likely due to the treatment, not to pre-existing differences between groups. *Reference:* Sections 4.5, 4.6

13. True or False: An observational study can never provide useful evidence about causal relationships.

Answer

**False.** While observational studies cannot definitively *prove* causation the way a randomized experiment can, they can provide strong *evidence* — especially when the association is large, consistent across multiple studies, biologically plausible, and dose-responsive (e.g., more smoking = more lung cancer). The evidence that smoking causes cancer came primarily from observational studies, since randomly assigning people to smoke would be unethical. *Reference:* Sections 4.1, 4.8

14. True or False: In cluster sampling, the researcher randomly selects individuals from within each cluster.

Answer

**False.** In cluster sampling, the researcher randomly selects *entire clusters* and then includes *all* individuals within the selected clusters. This is what distinguishes cluster sampling from stratified sampling, where you randomly sample *within* each subgroup. *Reference:* Section 4.2

15. True or False: A/B testing is a form of experimentation that uses random assignment.

Answer

**True.** A/B testing is an experiment in which users are randomly assigned to see version A or version B of a product, feature, or interface. The random assignment ensures that the groups are comparable, allowing causal conclusions about which version performs better. *Reference:* Section 4.7

Section 3: Short Answer (2 points each)

16. A study finds that people who meditate regularly have lower stress levels. Someone reads this and says, "I should start meditating to reduce my stress." What important question should they ask about the study before drawing that conclusion?

Answer

They should ask: **Was this an experiment or an observational study?** If it was observational, the lower stress levels might not be *caused* by meditation. There are plausible confounders: people who meditate might also exercise more, have more flexible schedules (less work stress), be more health-conscious overall, or have higher incomes (allowing more leisure time). The association between meditation and lower stress could be driven by these confounders rather than by meditation itself. The most direct support for a causal claim would come from a randomized experiment: randomly assign some people to meditate and others not to, then compare stress levels. Even then, blinding would be difficult (you can't give someone a "placebo meditation"), which is a limitation of this type of research. *Reference:* Sections 4.1, 4.6, 4.8

17. Explain the difference between stratified sampling and cluster sampling using a university campus as an example.

Answer

**Stratified sampling:** You divide the student body into meaningful subgroups (strata) — say, freshmen, sophomores, juniors, and seniors. Then you randomly select students *from within each group.* For example, you might randomly select 50 freshmen, 50 sophomores, 50 juniors, and 50 seniors. Every class year is guaranteed to be represented. **Cluster sampling:** You divide the campus into naturally occurring groups (clusters) — say, dormitory buildings. Then you randomly select *entire buildings* and survey *every student* in those buildings. You might randomly select 5 buildings out of 20 and survey all residents in those 5. **Key difference:** In stratified sampling, you sample from *every* subgroup. In cluster sampling, you select some complete groups and skip others entirely. Stratified sampling guarantees representation of all subgroups; cluster sampling does not. *Reference:* Section 4.2

18. Why is blinding important in experiments? Describe a scenario where the absence of blinding could affect results.

Answer

Blinding keeps participants (and/or researchers) from knowing who is in which group, which reduces bias from expectations and behavior changes. Without blinding, the **placebo effect** can inflate treatment results, and researchers may unconsciously treat groups differently. **Scenario:** A researcher tests whether a new tutoring method improves math scores. Students who know they're receiving the "special" new tutoring method (treatment group) might feel more motivated and study harder, not because the method is better, but because they believe it is. Students in the control group who know they're getting the "old" method might feel discouraged. The difference in scores could then reflect motivation differences rather than the tutoring method itself. Additionally, if the researcher knows which students are in each group, they might grade the treatment group's tests more leniently (unconsciously) or provide extra encouragement to that group. A double-blind design would have both groups receive tutoring that appears identical from the outside, with different methods applied in ways neither students nor evaluators can distinguish. *Reference:* Section 4.7

19. How does the concept of confounding apply to AI and machine learning? Give a specific example.

Answer

AI models learn patterns from training data, and if that data contains confounded relationships, the AI learns those confounded associations as if they were real. **Specific example:** A hiring algorithm is trained on data from a company's past successful employees. The company's workforce is 80% male, not because men are better workers, but because of historical bias in hiring (a confounder). The algorithm learns that being male is associated with being a successful employee — confounding gender with historical hiring bias. When used to screen new applicants, the algorithm penalizes female candidates, perpetuating the original bias. The algorithm can't distinguish between a genuine predictor (relevant skills and experience) and a confounded association (gender correlated with success due to biased hiring history). This is the same confounding problem as ice cream and drowning — the algorithm mistakes correlation for causation because it can't see the confounding variable (historical bias). *Reference:* Section 4.9
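A toy version of the hiring example shows how a model can absorb historical bias. Everything here is hypothetical: skill is assumed to be distributed identically for both groups, past managers are assumed to have accepted skilled men far more often than skilled women, and the "model" is just the historical hire rate per gender rather than a real learning algorithm.

```python
import random
from collections import defaultdict

rng = random.Random(3)

# Hypothetical history: equal skill in both groups, biased hiring decisions.
history = []
for _ in range(10_000):
    gender = rng.choice(["M", "F"])
    skilled = rng.random() < 0.5
    p_hired = (0.8 if gender == "M" else 0.2) if skilled else 0.0
    history.append((gender, skilled, rng.random() < p_hired))

# A naive "model": score each applicant by the past hire rate for their gender.
counts = defaultdict(lambda: [0, 0])   # gender -> [hired, total]
for gender, _, hired in history:
    counts[gender][0] += hired
    counts[gender][1] += 1
score = {g: hired / total for g, (hired, total) in counts.items()}

print({g: round(s, 2) for g, s in score.items()})
```

Skill is the same in both groups by construction, yet the learned scores differ sharply by gender. The score reflects who was hired in the past, not who was qualified, because the biased decision process is not recorded as a variable in the data.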

20. Design a simple experiment to test whether background music affects students' ability to concentrate during a reading comprehension test. Include: treatment group, control group, randomization, the response variable, and how you would address blinding.

Answer

**Treatment group:** Students who take the reading comprehension test with background music playing.

**Control group:** Students who take the same test in a quiet room (no music).

**Randomization:** Recruit 100 volunteers (acknowledging this is a convenience sample). Randomly assign 50 to the music group and 50 to the quiet group using a random number generator.

**Response variable:** Score on the reading comprehension test (a numerical variable).

**Blinding considerations:** True double-blinding is difficult here because students will know whether music is playing. However, you can:

- Not tell students the hypothesis (so they don't know whether music is expected to help or hurt)
- Have the tests graded by someone who doesn't know which student was in which group (blinding the evaluator)
- Use a standardized multiple-choice test so grading is objective and not influenced by knowledge of group assignment

**Additional considerations:** You should ensure both groups take the test at the same time of day, in similar rooms, with the same time limit. The test should be the same for both groups. You should also consider whether the type of music matters; a complete design might test multiple types (classical, pop, white noise) against silence.

*Reference:* Section 4.7
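The randomization and evaluator-blinding steps of this design could be carried out along the following lines. This is a sketch under stated assumptions: the participant IDs, the helper names `assign_groups` and `blinded_labels`, and the code format are all invented for illustration.

```python
import random

def assign_groups(participant_ids, seed=0):
    """Randomly split participants into equal 'music' and 'quiet' groups."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    assignment = {pid: "music" for pid in ids[:half]}
    assignment.update({pid: "quiet" for pid in ids[half:]})
    return assignment

def blinded_labels(participant_ids, seed=1):
    """Give each test paper an opaque code that reveals nothing about the group."""
    rng = random.Random(seed)
    codes = rng.sample(range(1000, 10000), len(participant_ids))
    return {pid: f"T{code}" for pid, code in zip(participant_ids, codes)}

# Hypothetical roster of 100 volunteers.
volunteers = [f"student-{i:03d}" for i in range(1, 101)]

assignment = assign_groups(volunteers)      # kept by the study coordinator
grader_codes = blinded_labels(volunteers)   # the only label the grader sees

n_music = sum(1 for g in assignment.values() if g == "music")
print(n_music, "in the music group,", len(assignment) - n_music, "in the quiet group")
```

The grader works only from the codes, and the coordinator joins codes back to groups after all tests are scored. Generating the codes independently of the assignment means a code cannot leak which room a student sat in.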