The 10% condition checks that the sample is no more than 10% of the population, so that observations drawn without replacement can be treated as approximately independent.
The 68-95-99.7 rule describes the percent of values within 1, 2, and 3 standard deviations in a normal distribution.
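As a quick check of the rule, the exact areas can be computed from the normal distribution. This is a minimal sketch using only the standard library; the helper name `normal_within` is made up for illustration.

```python
import math

def normal_within(k: float) -> float:
    """Area under a normal curve within k standard deviations of the mean."""
    # For a normal variable, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {normal_within(k):.4f}")
```

The printed areas are 0.6827, 0.9545, and 0.9973, which the rule rounds to 68%, 95%, and 99.7%.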
If two events are mutually exclusive, P(A or B) = P(A) + P(B).
Association means that knowing the value of one variable gives information about another.
Bias is a systematic error that leads to incorrect estimates of population parameters.
A biased estimator consistently over- or underestimates a population parameter.
Bimodal describes a distribution with two distinct peaks.
A binomial distribution models the number of successes in a fixed number of independent trials with constant probability.
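A binomial probability can be computed directly from the standard formula P(X = k) = C(n, k) p^k (1 – p)^(n – k). The sketch below assumes that formula; the function name `binomial_pmf` and the coin-flip numbers are illustrative only.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: probability of exactly 5 heads in 10 fair coin flips.
print(binomial_pmf(5, 10, 0.5))  # 252/1024 = 0.24609375
```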
A block is a group of experimental units that are similar, used to control variation in experiments.
A boxplot shows the five-number summary and helps visualize the spread and symmetry of a distribution.
A categorical variable places individuals into groups or categories.
A census collects data from every individual in the population.
The Central Limit Theorem states that the sampling distribution of the sample mean is approximately normal when the sample size is large, regardless of the shape of the population distribution.
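A small simulation can illustrate the Central Limit Theorem. The sketch below assumes a right-skewed exponential population with mean 1 and standard deviation 1; the sample size of 40, the 2000 repetitions, and the seed are arbitrary choices made for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

def sample_mean(n: int) -> float:
    """Mean of n draws from a right-skewed exponential population with mean 1."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

# Simulate the sampling distribution of the mean for samples of size 40.
means = [sample_mean(40) for _ in range(2000)]

print(statistics.fmean(means))  # close to the population mean, 1
print(statistics.stdev(means))  # close to 1 / sqrt(40), about 0.158
```

A histogram of `means` would look roughly bell-shaped even though the population itself is strongly skewed.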
The chi-square distribution is used in tests for categorical data and is right-skewed.
A cluster sample randomly selects entire groups and includes all individuals in chosen groups.
R² measures the proportion of variability in the response explained by the explanatory variable.
The complement of an event A is the set of all outcomes not in A; its probability is 1 – P(A).
A completely randomized design assigns treatments to all experimental units randomly.
Conditional probability is the chance of an event given that another has occurred: P(A|B) = P(A ∩ B)/P(B).
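The conditional probability formula can be worked through with counts. The numbers below are invented for illustration: out of 100 students, 40 are in band (event B) and 12 are both in band and on a sports team (event A ∩ B).

```python
from fractions import Fraction

# Hypothetical counts for 100 students (invented for illustration).
total = 100
in_band = 40           # B: student is in band
band_and_sports = 12   # A ∩ B: student is in band and plays a sport

p_b = Fraction(in_band, total)
p_a_and_b = Fraction(band_and_sports, total)

# P(A|B) = P(A ∩ B) / P(B)
p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)  # 3/10
```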
A confidence interval gives a range of plausible values for a population parameter.
Confounding occurs when two variables’ effects on a response cannot be separated.
A continuous random variable can take any value in an interval and is represented by a density curve.
A control group receives no treatment or a standard treatment for comparison.
A convenience sample selects individuals that are easiest to reach, often causing bias.
Correlation (r) measures the strength and direction of a linear relationship between two variables.
A critical value is a cutoff used to determine statistical significance in hypothesis tests.
An ogive (cumulative relative frequency graph) shows the accumulation of data over intervals.
A density curve is a smooth curve that represents the distribution of a continuous variable.
A discrete random variable takes countable values, like 0, 1, 2, etc.
A distribution shows the possible values of a variable and how often they occur.
A dotplot displays data values as dots along a number line.
A double-blind study hides treatment groups from both subjects and experimenters.
An event is a set of outcomes from a random process.
An experiment imposes treatments to study cause-and-effect relationships.
An experimental unit is the smallest entity receiving a treatment.
Extrapolation is using a model to predict values outside the observed range of the explanatory variable, where the predictions may be unreliable.
A factor is a variable whose levels are manipulated in an experiment.
The five-number summary includes the minimum, Q1, median, Q3, and maximum.
The general addition rule is P(A or B) = P(A) + P(B) – P(A and B).
The general multiplication rule is P(A and B) = P(A) × P(B|A).
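Both general rules can be checked by counting outcomes in a standard 52-card deck. This is a sketch for illustration; the events (drawing a heart, drawing a king) and the helper `prob` are choices made here, not part of the definitions.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 equally likely (rank, suit) cards

def is_heart(card) -> bool:
    return card[1] == "hearts"

def is_king(card) -> bool:
    return card[0] == "K"

def prob(event, cards=deck) -> Fraction:
    """Fraction of the given cards for which the event is true."""
    return Fraction(sum(1 for card in cards if event(card)), len(cards))

p_heart = prob(is_heart)
p_king = prob(is_king)
p_both = prob(lambda c: is_heart(c) and is_king(c))
p_either = prob(lambda c: is_heart(c) or is_king(c))

# P(king | heart), found by counting kings among the 13 hearts only.
hearts = [card for card in deck if is_heart(card)]
p_king_given_heart = prob(is_king, hearts)

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
print(p_either == p_heart + p_king - p_both)   # True
# General multiplication rule: P(A and B) = P(A) * P(B|A)
print(p_both == p_heart * p_king_given_heart)  # True
```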
The geometric distribution models the number of independent trials, each with the same probability of success, needed to get the first success.
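A geometric probability follows from requiring k – 1 failures and then one success: P(X = k) = (1 – p)^(k – 1) p. The sketch below assumes that formula; `geometric_pmf` and the example numbers are illustrative only.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(first success occurs on trial k), for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p

# Hypothetical example: success probability 0.2, first success on trial 3.
print(round(geometric_pmf(3, 0.2), 4))  # 0.128
```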
A histogram displays frequencies of data in intervals using adjacent bars.
Independent events occur without affecting each other's probabilities.
An influential observation is a point whose removal would substantially change the regression results, such as the slope, intercept, or correlation.
The IQR is Q3 – Q1 and measures the spread of the middle 50% of data.
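The five-number summary and the IQR can be computed together. Quartile conventions differ between textbooks and software; this sketch assumes the common classroom convention of taking Q1 and Q3 as the medians of the lower and upper halves, leaving out the middle value when the count is odd. The function name and data are made up, and the data must contain at least two values.

```python
import statistics

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max), taking quartiles as medians of each half."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]          # values below the median position
    upper = values[half + n % 2:]  # skips the middle value when n is odd
    return (values[0], statistics.median(lower), statistics.median(values),
            statistics.median(upper), values[-1])

data = [7, 1, 10, 4, 8, 3, 6]  # made-up data
minimum, q1, median, q3, maximum = five_number_summary(data)
print((minimum, q1, median, q3, maximum))  # (1, 3, 6, 8, 10)
print(q3 - q1)                             # IQR = 5
```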
The intersection of two events contains outcomes in both A and B: P(A ∩ B).
The large counts condition requires np ≥ 10 and n(1–p) ≥ 10 before using a normal approximation for a proportion or a binomial count.
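The condition is a simple pair of comparisons. A minimal sketch, with a hypothetical helper name and example values:

```python
def large_counts_ok(n: int, p: float, threshold: float = 10) -> bool:
    """Check that both np and n(1 - p) are at least the threshold (10 by default)."""
    return n * p >= threshold and n * (1 - p) >= threshold

print(large_counts_ok(50, 0.3))  # True: 15 expected successes, 35 expected failures
print(large_counts_ok(50, 0.1))  # False: only 5 expected successes
```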
The law of large numbers says that the sample mean (or the observed proportion of an outcome) gets closer to the population mean (or the true probability) as the number of observations increases.
The least-squares regression line minimizes the sum of squared residuals.
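For a single explanatory variable, the line that minimizes the sum of squared residuals has slope Sxy / Sxx and passes through the point of means. The sketch below assumes those standard formulas; the function name and the data are made up for illustration.

```python
import statistics

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line for paired data."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # line passes through (x_bar, y_bar)
    return slope, intercept

xs = [1, 2, 3, 4]  # made-up data
ys = [3, 5, 6, 9]
slope, intercept = least_squares_line(xs, ys)
print(round(slope, 3), round(intercept, 3))  # 1.9 1.0
```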
A level is a specific value of a factor in an experiment.
The margin of error is the amount added to and subtracted from the point estimate to form a confidence interval.
A matched pairs design compares two treatments using related or paired subjects.
The mean is the sum of values divided by the number of observations.
The median is the middle value of an ordered data set.
The mode is the value that appears most often in a dataset.
A mosaic plot is a graphical display for two categorical variables using proportional rectangles.
The multiplication rule for independent events is P(A and B) = P(A) × P(B).
Mutually exclusive events cannot happen at the same time.
Nonresponse occurs when selected individuals do not participate in a survey.
A normal distribution is a symmetric, bell-shaped curve defined by its mean and standard deviation.
A normal probability plot checks if data follow a normal distribution using a straight-line pattern.
An observational study observes outcomes without assigning treatments.
An outlier is a data point that falls far from the rest of the distribution.
The p-value is the probability of observing a result at least as extreme as the one in the sample, assuming H₀ is true.
A parameter is a number that describes a population, like μ or p.
A placebo is a treatment with no active ingredient used in control groups.
A point estimate is a single value used to estimate a population parameter.
The pooled sample proportion is the combined proportion across two samples, used in 2-prop z-tests.
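The pooled proportion is the total number of successes divided by the total sample size. A minimal sketch with hypothetical counts:

```python
def pooled_proportion(x1: int, n1: int, x2: int, n2: int) -> float:
    """Combined proportion of successes across two samples."""
    return (x1 + x2) / (n1 + n2)

# Hypothetical counts: 30 successes out of 100 and 45 out of 150.
print(pooled_proportion(30, 100, 45, 150))  # 0.3
```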
A population is the entire group about which you want information.
Power is the probability a test correctly rejects a false null hypothesis.
A quantitative variable takes numerical values for which arithmetic, such as averaging, makes sense.
Random assignment places subjects into groups using chance to reduce bias.
The random condition requires that data come from a random sample or randomized experiment.
Random sampling selects individuals using a chance process to represent a population.
A random variable assigns numerical values to outcomes of a random process.
A side-by-side bar graph compares the distribution of one categorical variable across the groups defined by another categorical variable.
The significance level α is the threshold for rejecting H₀ in a hypothesis test.
A simple random sample of size n gives every group of n individuals in the population an equal chance to be selected.
A skewed distribution has a longer tail on one side: left or right.
The slope of a regression line shows the predicted change in y for each one-unit increase in x.
Standard deviation measures the typical distance of values from the mean.
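Python's `statistics` module computes both the population and the sample standard deviation. The data below are made up so that the arithmetic is easy to follow: the mean is 5 and the squared deviations sum to 32.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with mean 5

print(statistics.pstdev(data))           # population SD (divide by n): 2.0
print(round(statistics.stdev(data), 3))  # sample SD (divide by n - 1): 2.138
```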
The standard deviation of residuals (sₑ) measures the typical prediction error in regression.
Standard error is the estimated standard deviation of a sampling distribution.
The standard normal distribution has mean 0 and standard deviation 1.