AP Statistics Glossary

10% condition

The 10% condition requires that, when sampling without replacement, the sample be no more than 10% of the population, so that observations can be treated as approximately independent.

68-95-99.7 rule (Empirical Rule)

The 68-95-99.7 rule states that in a normal distribution, about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean, respectively.
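
As a quick check of the rule, the areas can be computed from the standard normal curve. This is a minimal sketch using Python's standard library; the variable names are illustrative only.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist(mu=0, sigma=1)

# Area within k standard deviations of the mean, for k = 1, 2, 3.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} SD: {area:.4f}")
```

The printed areas round to roughly 68%, 95%, and 99.7%, which is where the rule gets its name.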

Addition rule for mutually exclusive events

If two events are mutually exclusive, P(A or B) = P(A) + P(B).

Association

Association means that knowing the value of one variable gives information about another.

Bias

Bias is a systematic error that leads to incorrect estimates of population parameters.

Biased estimator

A biased estimator consistently over- or underestimates a population parameter.

Bimodal

Bimodal describes a distribution with two distinct peaks.

Binomial distribution

A binomial distribution models the number of successes in a fixed number of independent trials, each with two possible outcomes and the same probability of success.
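
A binomial probability can be computed directly from the standard formula P(X = k) = C(n, k) p^k (1 – p)^(n – k). The sketch below assumes a hypothetical example of 10 fair coin flips; the function name is illustrative.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: 10 flips of a fair coin, exactly 3 heads.
prob_3_heads = binomial_pmf(3, 10, 0.5)
print(round(prob_3_heads, 4))  # 0.1172
```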

Block

A block is a group of experimental units that are similar, used to control variation in experiments.

Boxplot (Box-and-Whisker Plot)

A boxplot shows the five-number summary and helps visualize the spread and symmetry of a distribution.

Categorical variable

A categorical variable places individuals into groups or categories.

Census

A census collects data from every individual in the population.

Central Limit Theorem (CLT)

The Central Limit Theorem states that, for large sample sizes, the sampling distribution of the sample mean is approximately normal, regardless of the shape of the population distribution.
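
A small simulation can illustrate the theorem. The sketch below assumes a strongly right-skewed population (exponential with mean 1 and standard deviation 1) and a hypothetical sample size of 40; these choices are for illustration only.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 40    # hypothetical "large" sample size
NUM_SAMPLES = 2000  # number of repeated samples

# Draw many samples from the skewed population and record each sample mean.
sample_means = [
    mean(random.expovariate(1.0) for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

center = mean(sample_means)
spread = stdev(sample_means)

# Share of sample means within one SD of their center (about 0.68 if normal).
within_one_sd = sum(abs(m - center) <= spread for m in sample_means) / NUM_SAMPLES

print(f"mean of sample means: {center:.3f}")
print(f"SD of sample means:   {spread:.3f}")
print(f"1/sqrt(n):            {1 / SAMPLE_SIZE ** 0.5:.3f}")
print(f"share within 1 SD:    {within_one_sd:.3f}")
```

Even though the population is skewed, the sample means should cluster near the population mean with spread close to σ/√n, and the share within one standard deviation should be close to what a normal curve predicts.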

Chi-square distribution

The chi-square distribution is used in tests for categorical data and is right-skewed.

Cluster sample

A cluster sample randomly selects entire groups and includes all individuals in chosen groups.

Coefficient of determination (R²)

R² measures the proportion of variability in the response variable that is explained by its linear relationship with the explanatory variable.

Complement (of an event)

The complement of an event A consists of all outcomes not in A; its probability is 1 – P(A).

Completely randomized design

A completely randomized design assigns treatments to all experimental units randomly.

Conditional probability

Conditional probability is the chance of an event given that another has occurred: P(A|B) = P(A ∩ B)/P(B).
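
The formula can be applied to counts from a two-way table. The numbers below are hypothetical: out of 100 students, 40 play a sport (event B) and 15 both play a sport and are in band (events A and B).

```python
from fractions import Fraction

total = 100                       # hypothetical number of students
p_b = Fraction(40, total)         # P(B): plays a sport
p_a_and_b = Fraction(15, total)   # P(A ∩ B): in band and plays a sport

# P(A|B) = P(A ∩ B) / P(B)
p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)  # 3/8
```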

Confidence interval

A confidence interval gives a range of plausible values for a population parameter.

Confounding

Confounding occurs when two variables’ effects on a response cannot be separated.

Continuous random variable

A continuous random variable can take any value in an interval and is represented by a density curve.

Control group

A control group receives no treatment or a standard treatment for comparison.

Convenience sample

A convenience sample selects individuals that are easiest to reach, often causing bias.

Correlation (r)

Correlation (r) measures the strength and direction of a linear relationship between two variables.
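
One way to compute r by hand is to divide the sum of cross-deviations by the square root of the product of the sums of squared deviations. The sketch below uses a small made-up data set; the function name is illustrative.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation r between paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)
print(round(r, 4))  # 0.7746
```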

Critical value

A critical value is a cutoff used to determine statistical significance in hypothesis tests.

Cumulative relative frequency graph (Ogive)

An ogive (cumulative relative frequency graph) plots, for each interval, the proportion of data values at or below that interval's upper boundary.

Density curve

A density curve is a smooth curve that represents the distribution of a continuous variable.

Discrete random variable

A discrete random variable takes countable values, like 0, 1, 2, etc.

Distribution

A distribution shows the possible values of a variable and how often they occur.

Dotplot

A dotplot displays data values as dots along a number line.

Double-blind

In a double-blind study, neither the subjects nor the people who interact with them or measure the response know which treatment each subject received.

Event

An event is a set of outcomes from a random process.

Experiment

An experiment imposes treatments to study cause-and-effect relationships.

Experimental unit

An experimental unit is the smallest entity to which a treatment is randomly assigned.

Extrapolation

Extrapolation is using a model to predict values outside the observed range.

Factor (experimental factor)

A factor is a variable whose levels are manipulated in an experiment.

Five-number summary

The five-number summary includes the minimum, Q1, median, Q3, and maximum.
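
The summary can be computed as follows. This sketch assumes the "median of each half" convention for quartiles (excluding the overall median when the number of values is odd); calculators and software may use slightly different quartile rules. The function name is illustrative, and the data set needs at least two values.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    # With an odd count, leave the overall median out of both halves.
    upper = values[half + 1:] if n % 2 else values[half:]
    return (values[0], median(lower), median(values), median(upper), values[-1])

# Hypothetical data.
print(five_number_summary([7, 1, 3, 9, 5, 11, 13]))  # (1, 3, 7, 11, 13)
```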

General addition rule

The general addition rule is P(A or B) = P(A) + P(B) – P(A and B).
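
As an illustration, consider drawing one card from a standard 52-card deck and asking for the probability that it is a heart or a face card. Three cards (the jack, queen, and king of hearts) are both, so they must not be counted twice.

```python
from fractions import Fraction

p_heart = Fraction(13, 52)           # 13 hearts
p_face = Fraction(12, 52)            # 12 face cards (J, Q, K in 4 suits)
p_heart_and_face = Fraction(3, 52)   # J, Q, K of hearts

# P(A or B) = P(A) + P(B) - P(A and B)
p_heart_or_face = p_heart + p_face - p_heart_and_face
print(p_heart_or_face)  # 11/26
```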

General multiplication rule

The general multiplication rule is P(A and B) = P(A) × P(B|A).
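
For example, the rule gives the probability of drawing two aces in a row from a standard 52-card deck without replacement. The second factor is conditional because one ace is already gone.

```python
from fractions import Fraction

p_first_ace = Fraction(4, 52)               # P(A): first card is an ace
p_second_ace_given_first = Fraction(3, 51)  # P(B|A): 3 aces left in 51 cards

# P(A and B) = P(A) * P(B|A)
p_two_aces = p_first_ace * p_second_ace_given_first
print(p_two_aces)  # 1/221
```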

Geometric distribution

The geometric distribution models the number of independent trials needed to get the first success, when each trial has the same probability of success.
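
A geometric probability follows the standard formula P(X = k) = (1 – p)^(k – 1) p, where k counts the trial on which the first success occurs. The sketch below assumes a hypothetical example of rolling a fair die until the first six; the function name is illustrative.

```python
from fractions import Fraction

def geometric_pmf(k: int, p: Fraction) -> Fraction:
    """P(X = k): the first success occurs on trial k (k = 1, 2, 3, ...)."""
    return (1 - p) ** (k - 1) * p

p_six = Fraction(1, 6)
# Two non-sixes followed by a six.
print(geometric_pmf(3, p_six))  # 25/216
```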

Histogram

A histogram displays frequencies of data in intervals using adjacent bars.

Independent events

Two events are independent if knowing that one has occurred does not change the probability of the other: P(A|B) = P(A).

Influential observation

An influential observation is a point whose removal would substantially change the regression results, such as the slope, intercept, or correlation.

Interquartile range (IQR)

The IQR is Q3 – Q1 and measures the spread of the middle 50% of data.

Intersection (of events)

The intersection of two events contains outcomes in both A and B: P(A ∩ B).

Large counts condition

The large counts condition requires np ≥ 10 and n(1–p) ≥ 10 to use normal approximation.
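
The check is simple enough to express as a small helper. This is a sketch with an illustrative function name and hypothetical inputs.

```python
def large_counts_ok(n: int, p: float, threshold: float = 10) -> bool:
    """True when both expected counts, np and n(1 - p), are at least 10."""
    return n * p >= threshold and n * (1 - p) >= threshold

print(large_counts_ok(50, 0.1))   # False: np = 5
print(large_counts_ok(200, 0.1))  # True: np = 20, n(1 - p) = 180
```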

Law of large numbers

The law of large numbers says that as the number of trials or observations n increases, a sample proportion or sample mean tends to get closer to the true probability or population mean.

Least-squares regression line

The least-squares regression line minimizes the sum of squared residuals.
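
For a single explanatory variable, the line has the closed form: slope = Σ(x – x̄)(y – ȳ) / Σ(x – x̄)², and intercept = ȳ – slope · x̄. The sketch below applies these formulas to a small made-up data set; the function name is illustrative.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
intercept, slope = least_squares_line(xs, ys)
print(f"y-hat = {intercept:.1f} + {slope:.1f}x")  # y-hat = 2.2 + 0.6x
```

Any other line through these points would give a larger sum of squared residuals.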

Level (of a factor)

A level is a specific value of a factor in an experiment.

Margin of error

The margin of error is the amount added to and subtracted from a point estimate to form a confidence interval; it equals the critical value times the standard error.
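
As an illustration, the sketch below computes the margin of error for a one-sample z-interval for a proportion, using margin of error = critical value × standard error. The survey counts (520 successes out of 1000) and the 95% confidence level are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

successes, n = 520, 1000   # hypothetical survey result
confidence = 0.95          # assumed confidence level

p_hat = successes / n                                     # point estimate
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # critical value
standard_error = sqrt(p_hat * (1 - p_hat) / n)
margin_of_error = z_star * standard_error

interval = (p_hat - margin_of_error, p_hat + margin_of_error)
print(f"{p_hat:.3f} ± {margin_of_error:.3f}")  # 0.520 ± 0.031
```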

Matched pairs design

A matched pairs design compares two treatments using related or paired subjects.

Mean (Arithmetic Mean)

The mean is the sum of values divided by the number of observations.

Median

The median is the middle value of an ordered data set, or the average of the two middle values when the number of observations is even.

Mode

The mode is the value that appears most often in a dataset.

Mosaic plot

A mosaic plot is a graphical display for two categorical variables using proportional rectangles.

Multiplication rule for independent events

The multiplication rule for independent events is P(A and B) = P(A) × P(B).

Mutually exclusive events (Disjoint events)

Mutually exclusive events cannot happen at the same time.

Nonresponse

Nonresponse occurs when selected individuals do not participate in a survey.

Normal distribution

A normal distribution is a symmetric, bell-shaped curve defined by its mean and standard deviation.

Normal probability plot (Q-Q plot)

A normal probability plot checks if data follow a normal distribution using a straight-line pattern.

Observational study

An observational study observes outcomes without assigning treatments.

Outlier

An outlier is a data point that falls far from the rest of the distribution.

p-value

The p-value is the probability of observing a result at least as extreme as the one in the sample, assuming H₀ is true.
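
For a z-test, the p-value is a tail area under the standard normal curve. The sketch below assumes a hypothetical test statistic of z = 2.0 and shows both the one-sided (upper tail) and two-sided versions.

```python
from statistics import NormalDist

z = 2.0  # hypothetical standardized test statistic
std_normal = NormalDist()

# One-sided (Ha: parameter > null value): area to the right of z.
p_one_sided = 1 - std_normal.cdf(z)

# Two-sided (Ha: parameter != null value): area in both tails.
p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))

print(f"one-sided: {p_one_sided:.4f}")
print(f"two-sided: {p_two_sided:.4f}")
```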

Parameter

A parameter is a number that describes a population, like μ or p.

Placebo

A placebo is a treatment with no active ingredient used in control groups.

Point estimate

A point estimate is a single value used to estimate a population parameter.

Pooled (combined) sample proportion

The pooled sample proportion combines the successes from two samples, computed as (x₁ + x₂)/(n₁ + n₂), and is used in two-proportion z-tests.
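
The pooled proportion is total successes divided by total sample size. The counts below are hypothetical.

```python
from fractions import Fraction

x1, n1 = 30, 100   # hypothetical successes and size, sample 1
x2, n2 = 50, 100   # hypothetical successes and size, sample 2

# Pooled proportion: (x1 + x2) / (n1 + n2)
p_pooled = Fraction(x1 + x2, n1 + n2)
print(p_pooled, float(p_pooled))  # 2/5 0.4
```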

Population

A population is the entire group about which you want information.

Power (of a test)

Power is the probability a test correctly rejects a false null hypothesis.

Quantitative variable

A quantitative variable takes numerical values for which arithmetic, such as averaging, makes sense.

Random assignment

Random assignment uses a chance process to place experimental units into treatment groups, creating roughly equivalent groups and reducing confounding.

Random condition

The random condition requires that data come from a random sample or randomized experiment.

Random sampling

Random sampling selects individuals using a chance process to represent a population.

Random variable

A random variable assigns numerical values to outcomes of a random process.

Side-by-side bar graph

A side-by-side bar graph compares the distribution of one categorical variable across the categories of another.

Significance level (α)

The significance level α is the threshold below which a p-value leads to rejecting H₀ in a hypothesis test.

Simple random sample (SRS)

A simple random sample of size n gives every possible group of n individuals an equal chance of being selected.

Skewness (Skewed distribution)

A skewed distribution has a longer tail on one side: left or right.

Slope (of a regression line)

The slope of a regression line is the predicted change in y for each one-unit increase in x.

Standard deviation

Standard deviation measures the typical distance of values from the mean.
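
The sample standard deviation is the square root of the sum of squared deviations from the mean divided by n – 1. The sketch below uses a small made-up data set and Python's standard library, whose `stdev` function uses the n – 1 divisor.

```python
from statistics import mean, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

x_bar = mean(data)   # 5
s = stdev(data)      # sample standard deviation (divides by n - 1)

print(f"mean = {x_bar}, s = {s:.3f}")  # mean = 5, s = 2.138
```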

Standard deviation of the residuals (sₑ)

The standard deviation of residuals (sₑ) measures the typical prediction error in regression.

Standard error

Standard error is the estimated standard deviation of a statistic's sampling distribution, computed from sample data.

Standard normal distribution

The standard normal distribution has mean 0 and standard deviation 1.