Statistical Inference and Study Design: Practice Questions & Study Guide
Questions testing whether you can identify what a study's design allows you to validly conclude, including generalizability and causality.
Understanding Statistical Inference and Study Design
Statistical inference questions are about logical reasoning, not computation. The test covers three core ideas: (1) whether a conclusion can be generalized beyond the sample studied, (2) whether the data support causation or only association, and (3) whether the study design itself is appropriate for the claimed conclusion. Getting these distinctions right is entirely a matter of understanding a handful of principles.
Generalizability depends on how the sample was selected. If participants were chosen randomly from a well-defined population, conclusions can be extended to that population. If the sample is a convenience sample or a self-selected group (e.g., volunteers), you cannot generalize: the findings apply only to the individuals studied. The test often checks this by presenting a study of students at a single school and asking whether the conclusion applies to 'all students in the country.' It doesn't, without a random national sample.
Causation vs. association is tested through study type. A randomized controlled experiment (participants randomly assigned to treatment and control groups) can support causal conclusions because random assignment balances confounding variables across the groups on average. An observational study, where researchers observe but do not intervene, can only establish that two variables are associated (correlated). No observational design, no matter how large, can establish causation on its own. Wrong answer choices in inference questions almost always overclaim by using words like 'causes,' 'proves,' or 'demonstrates causality' for observational studies.
Margin of error and confidence are occasionally tested. For a random sample, a larger sample size reduces the margin of error and produces a narrower confidence interval; representativeness comes from random selection, not from size. The test will not ask you to calculate these but may ask which of two studies produces a more reliable estimate and why.
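To see why sample size matters, the sketch below uses the standard 95% approximation for the margin of error of a sample proportion, z * sqrt(p(1 - p) / n) with z of about 1.96. The function name `margin_of_error` and the choice p = 0.5 are assumptions made for illustration; the test itself does not require this calculation.

```python
import math

def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a sample proportion.

    Assumes a simple random sample. The formula measures sampling
    variability only; it says nothing about bias from a non-random sample.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

small = margin_of_error(0.5, 200)    # poll of 200 people
large = margin_of_error(0.5, 2000)   # poll of 2,000 people
print(f"n=200:  +/- {small:.3f}")
print(f"n=2000: +/- {large:.3f}")
```

Under this approximation, multiplying the sample size by 10 shrinks the margin of error by a factor of about 3.2 (the square root of 10), not by 10.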
Key Rules & Formulas
Memorize these rules — they come up directly in practice questions.
Random sample → can generalize to the population; convenience/volunteer sample → cannot generalize
A randomly selected national sample of 1,000 teens supports conclusions about all teens; a survey of 1,000 volunteers at a mall does not
Randomized experiment → can infer causation; observational study → association only
If researchers randomly assign participants to exercise or no-exercise groups and measure health outcomes, they can conclude exercise causes the measured improvement
Larger, representative sample → smaller margin of error → more precise estimate
A poll of 2,000 randomly selected voters is more precise than a poll of 200
Confounding variable: a third variable that is related to both the independent and dependent variables and could explain the association
People who carry lighters are more likely to get lung cancer—but lighters don't cause cancer; smoking is the confounding variable
The scope of a conclusion must match the scope of the sample (group, time period, geography)
A study of 9th-graders at one urban school cannot support conclusions about 'all U.S. high school students'
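The first two rules above can be read as two independent yes/no checks: how the sample was selected sets the scope of the conclusion, and how treatments were assigned sets its strength. A minimal sketch, using a hypothetical helper `supported_conclusion` that is not part of any test material:

```python
def supported_conclusion(random_sample: bool, random_assignment: bool) -> str:
    """Return the strongest claim a study design supports (illustrative only)."""
    # Random assignment decides whether causal language is allowed.
    claim = "causation" if random_assignment else "association only"
    # Random selection decides how far the claim can be extended.
    scope = "the sampled population" if random_sample else "the participants studied"
    return f"{claim}, applies to {scope}"

# Observational survey of a random national sample of teens:
print(supported_conclusion(random_sample=True, random_assignment=False))
# Randomized experiment run on volunteers:
print(supported_conclusion(random_sample=False, random_assignment=True))
```

Keeping the two checks separate reflects the fact that a study can pass one and fail the other.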
Statistical Inference and Study Design Practice Questions
Questions go from easiest to hardest.
A researcher surveys 50 randomly selected students at a single high school about their screen time habits. Which of the following conclusions is best supported by the data?
Correct answer: C. The results can reasonably be generalized to students at this school
Explanation
A random sample from a specific school can support conclusions about students at that school. It cannot be generalized nationally without a national sample.
A study found that cities with more hospitals have higher rates of disease. A journalist writes, 'Hospitals cause disease.' Which type of error does this conclusion represent?
Correct answer: B. The study confuses correlation with causation
Explanation
An observational study showing an association between two variables does not establish that one causes the other. A more likely explanation is reverse causation (cities with sicker populations build more hospitals) or a confounding variable such as an older population.
A researcher randomly assigns 100 participants to two groups: one receives a new dietary supplement, and one receives a placebo. She measures the change in cholesterol levels after 12 weeks. Which type of study is this?
Correct answer: C. A randomized controlled experiment
Explanation
Random assignment to treatment and control groups defines a randomized controlled experiment. This design can support causal conclusions.
A poll of 1,000 randomly selected adults in a city found that 62% support a new transit policy, with a margin of error of ±3%. Which of the following is the best interpretation?
Correct answer: B. Between 59% and 65% of all adults in the city are estimated to support the policy
Explanation
Margin of error creates an interval estimate: 62% ± 3% = [59%, 65%]. This interval applies to the city's adult population (the population that was sampled), not the country.
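The interval arithmetic can be checked in a few lines. The cross-check at the end assumes a 95% confidence level and a simple random sample, neither of which the question states; it is included only to show that a margin of about ±3% is plausible for 1,000 respondents.

```python
import math

p_hat = 0.62   # sample proportion from the poll
moe = 0.03     # reported margin of error
n = 1000       # sample size

# The interval estimate is the sample proportion plus or minus the margin.
low, high = p_hat - moe, p_hat + moe
print(f"Interval: {low:.0%} to {high:.0%}")

# Cross-check with the usual 95% approximation (an assumption: the
# question does not give the confidence level).
approx_moe = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
print(f"Approximate margin of error: {approx_moe:.3f}")
```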
A nutrition scientist observes that people who eat breakfast daily tend to have lower body mass index (BMI) than those who skip breakfast. Based on this observational study, which conclusion is valid?
Correct answer: C. Eating breakfast is associated with lower BMI in this sample
Explanation
An observational study can only establish association, not causation. The conclusion is limited to the sample observed, using language like 'associated with.'
Two studies examine the effect of a new teaching method on test scores. Study 1 uses a random sample of 50 students; Study 2 uses 200 volunteers who sign up to participate. Which study produces results that can more reliably be generalized to all students?
Correct answer: B. Study 1, because random selection produces a representative sample
Explanation
Random sampling—not sample size—determines generalizability. Volunteers are self-selected and may differ systematically from all students, producing biased results regardless of sample size.
A study tests whether a new fertilizer increases crop yield. Farmers voluntarily apply the fertilizer to the fields they believe are most fertile. The results show higher yields in fertilized fields. What is the main flaw in this study design?
Correct answer: C. Confounding: more fertile fields were chosen for treatment, not randomly assigned
Explanation
Because farmers chose which fields to fertilize—selecting the best ones—soil fertility is a confounding variable. We cannot determine whether the fertilizer or the pre-existing soil quality caused the higher yields.
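A small simulation can make the confounding concrete. Everything numeric here is made up for illustration: the true fertilizer effect (5 units), the fertility distribution, and the noise level are assumptions, not data from any study. The sketch compares farmers choosing their most fertile fields with a coin-flip assignment.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable
TRUE_EFFECT = 5.0       # assumed yield gain from fertilizer (made-up units)

# Each field has a baseline fertility that also drives its yield.
fertility = [rng.gauss(50, 10) for _ in range(10_000)]

def estimated_effect(treated_flags):
    """Difference in mean yield between treated and untreated fields."""
    treated, control = [], []
    for f, is_treated in zip(fertility, treated_flags):
        yield_ = f + (TRUE_EFFECT if is_treated else 0) + rng.gauss(0, 2)
        (treated if is_treated else control).append(yield_)
    return mean(treated) - mean(control)

# Farmers fertilize the fields they believe are most fertile (above the median).
cutoff = sorted(fertility)[len(fertility) // 2]
self_selected = [f > cutoff for f in fertility]

# Random assignment: a coin flip decides, independent of fertility.
randomized = [rng.random() < 0.5 for _ in fertility]

print(f"True effect:            {TRUE_EFFECT:.1f}")
print(f"Self-selected estimate: {estimated_effect(self_selected):.1f}")
print(f"Randomized estimate:    {estimated_effect(randomized):.1f}")
```

Under these assumptions the self-selected comparison should overstate the effect considerably, because it mixes the fertilizer's effect with the pre-existing fertility gap, while the randomized comparison should land close to the true value.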
A school district surveys every student in three randomly selected schools about homework time and grade point average. The study finds a positive correlation. Which of the following conclusions is best supported?
Correct answer: B. Students who do more homework tend to have higher GPA in these three schools
Explanation
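Every student in the three schools was surveyed, so the finding describes those schools directly. Because the schools were randomly selected from the district, the result could reasonably be extended to the district, but not to the country (no national random sample). The design is observational, so only association (not causation) can be inferred. Answer B is the most conservative choice that respects both limitations: it uses 'tend to' rather than causal language and does not reach beyond the students studied.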
A researcher studying sleep patterns randomly assigns 200 college students to two groups. Group A sleeps 8 hours per night for 4 weeks; Group B sleeps 6 hours per night. At the end, Group A shows significantly higher scores on a cognitive test. Which conclusion is most justified?
Correct answer: A. Sleeping 8 hours causes better cognitive performance in college students
Explanation
Random assignment to conditions means this is a randomized controlled experiment. Causal conclusions are justified for the population studied (college students). The sample was drawn from college students, so the conclusion is limited to that population—but within it, causation can be inferred.
A researcher uses a convenience sample of 500 gym members to study exercise habits in the general adult population. The study finds that 85% of adults exercise at least 3 times per week. Why is this estimate likely unreliable for the general adult population?
Correct answer: B. Gym members are not a random or representative sample of all adults
Explanation
Gym members are systematically more likely to exercise than the general adult population—this is selection bias. No matter how large the sample is, a non-representative convenience sample cannot reliably estimate a population parameter.
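The point that size does not fix selection bias can be sketched with a simulated population. All of the rates below (20% gym membership, 85% of members and 25% of non-members exercising regularly) are hypothetical numbers chosen for the example.

```python
import random

rng = random.Random(1)  # fixed seed for repeatability

# Hypothetical population of 100,000 adults: about 20% are gym members.
# Assumed exercise rates: 85% of members, 25% of non-members.
population = []
for _ in range(100_000):
    gym_member = rng.random() < 0.20
    exercises = rng.random() < (0.85 if gym_member else 0.25)
    population.append((gym_member, exercises))

def share_exercising(people):
    """Fraction of the given people who exercise regularly."""
    return sum(exercises for _, exercises in people) / len(people)

truth = share_exercising(population)

# Convenience sample: 500 people, all drawn from gym members.
members = [p for p in population if p[0]]
convenience = rng.sample(members, 500)

# Random sample: only 100 people, but drawn from the whole population.
random_sample = rng.sample(population, 100)

print(f"Population value:          {truth:.2f}")
print(f"500 gym members:           {share_exercising(convenience):.2f}")
print(f"100 randomly chosen adults: {share_exercising(random_sample):.2f}")
```

In this setup the larger convenience sample should miss the population value by a wide margin, while the much smaller random sample should land near it, give or take sampling error.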
Common Mistakes to Avoid
These are the most frequent errors students make on Statistical Inference and Study Design questions. Knowing them in advance prevents costly point losses.
- Choosing an answer that claims causation from an observational study
- Generalizing beyond the population from which the sample was drawn (e.g., extending results from one city to the nation)
- Confusing a larger sample with a more representative sample; size alone doesn't fix selection bias
- Treating correlation as implying directionality of cause (A causes B vs. B causes A)
- Selecting an answer that underclaims (for example, one that refuses to generalize from a valid random sample); overclaiming is the more common trap, but underclaiming is also wrong
Strategy Tips: Statistical Inference and Study Design
Before reading the answer choices, decide two things about the study: (1) was sampling random? (2) was assignment experimental or observational? Those two decisions rule out most wrong answers
Eliminate any answer choice that contains 'causes,' 'proves,' or 'demonstrates' unless the study is a randomized experiment
The correct inference answer is almost always the most conservative, carefully-scoped statement—matching exactly the population sampled and using 'associated' or 'suggests' language
If the question asks what cannot be concluded, use your inference rules to identify the strongest overclaim in the choices—that is your answer