target parameter : the unknown population parameter that we’re interested in estimating

point estimator : a single number calculated from a sample that estimates a target population parameter

confidence interval : a range of numbers, calculated from a sample, that contains the target parameter with a stated (usually high) degree of confidence

interval estimator : see confidence interval

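confidence coefficient : the probability that a randomly selected confidence interval contains the population parameter, i.e. the proportion of such intervals that would contain it if the sampling were repeated many times.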

confidence level : the confidence coefficient, but as a percentage.

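z-statistic : A standardized value computed from the sample’s mean and a known population standard deviation: $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$. It follows the standard normal distribution.

t-statistic : A standardized value like the z-statistic, but computed with the sample’s standard deviation in place of the population’s: $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$. It depends on both the sample’s mean and the sample’s standard deviation.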

degrees of freedom : the number of values that are free to vary in a calculation. It’s the number of observations minus the number of parameters estimated along the way. So for calculating the sample variance ($s^2$), it’s $n - 1$: the number of things in $X$ minus 1 b/c we’re using the sample mean as an estimated parameter.
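To make the $n - 1$ concrete, here is a minimal sketch using only Python's standard library. The data values are made up for illustration; `statistics.variance` divides by $n - 1$ (sample variance) while `statistics.pvariance` divides by $n$ (population variance).

```python
from statistics import mean, pvariance, variance

# Hypothetical data, chosen only so the arithmetic is easy to follow.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(data)
xbar = mean(data)                               # the one estimated parameter
sum_sq_dev = sum((x - xbar) ** 2 for x in data)

# Sample variance: divide by the degrees of freedom, n - 1.
sample_var = sum_sq_dev / (n - 1)

print(sample_var, variance(data))   # both divide by n - 1
print(pvariance(data))              # divides by n instead
```

Dividing by $n - 1$ gives a slightly larger value than dividing by $n$, which compensates for the deviations being measured from the sample mean rather than the true mean.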

Confidence intervals based on a normal statistic

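The formula for calculating a 95% confidence interval for the population mean is $\bar{x} \pm 1.96\,\sigma_{\bar{x}} = \bar{x} \pm 1.96\,\frac{\sigma}{\sqrt{n}}$.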

The 1.96 is chosen so that intervals built this way contain the population mean 95% of the time.
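One way to check the 95% claim is by simulation. The sketch below is illustrative only: the population values (`mu`, `sigma`), the sample size, the number of trials, and the seed are all assumptions, not values from these notes. It draws many samples from a normal population, builds $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$ for each, and counts how often the interval contains the true mean.

```python
import math
import random

def coverage(mu=50.0, sigma=10.0, n=40, trials=2000, seed=0):
    """Fraction of 95% intervals (known sigma) that contain the true mean."""
    rng = random.Random(seed)               # fixed seed so runs are repeatable
    margin = 1.96 * sigma / math.sqrt(n)    # half-width of each interval
    hits = 0
    for _ in range(trials):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

print(coverage())
```

The printed fraction should land close to 0.95, with some run-to-run noise that shrinks as `trials` grows.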

| confidence level | $\alpha$ | $z_{\alpha/2}$ |
|---|---|---|
| 90% | .10 | 1.645 |
| 95% | .05 | 1.960 |
| 98% | .02 | 2.326 |
| 99% | .01 | 2.575 |
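The critical values in this table can be reproduced with the standard library's `statistics.NormalDist`. This is a small sketch; the helper name `z_critical` is made up for illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2}: the point with area alpha/2 to its right."""
    alpha = 1 - confidence
    # inv_cdf takes the area to the LEFT, so ask for 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}  alpha={1 - level:.2f}  z={z_critical(level):.3f}")
```

The computed 99% value is about 2.5758, so it rounds to 2.576. The 2.575 shown in the table is a rounding that textbooks commonly print.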

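Notation: $z_{\alpha}$ is the value on a standard normal distribution ($z$) such that an area of $\alpha$ will be on the right. In practice, that means $z_{\alpha/2}$ is the symbol for the cutoff of the right-most tail when you calculate a confidence interval: $\alpha/2$ lies in each tail and $1 - \alpha$ lies in between.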

The generalized formula for a large-sample (i.e. $n \ge 30$) confidence interval is this:

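If $\sigma$ is known, $\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = \bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$. If $\sigma$ is unknown, $\bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}$ (where $s$ is the sample’s standard deviation).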
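Both cases of the large-sample formula fit in one small function. This is a sketch under assumptions: the function name `large_sample_ci`, its parameters, and the sample data are hypothetical. It uses the population $\sigma$ when supplied and falls back to the sample standard deviation $s$ otherwise.

```python
import math
from statistics import NormalDist, mean, stdev

def large_sample_ci(sample, confidence=0.95, sigma=None):
    """Large-sample interval: xbar +/- z_{alpha/2} * (sigma or s) / sqrt(n)."""
    n = len(sample)
    xbar = mean(sample)
    # Use the known population sigma if given, else the sample's s.
    spread = sigma if sigma is not None else stdev(sample)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * spread / math.sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical sample of n = 100 values, purely for illustration.
sample = list(range(1, 101))
print(large_sample_ci(sample, sigma=10))   # sigma treated as known
print(large_sample_ci(sample))             # sigma unknown, s used instead
```

The function does not check the sample size or how the sample was drawn, so the validity conditions still have to be confirmed separately.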

but to be a valid large-sample confidence interval, we need:

  1. A random sample
  1. The sample size must be large enough so the Central Limit Theorem applies.

Confidence intervals for a population mean: Student’s t-statistic

If we’re operating on a small sample and don’t know $\sigma$, then we can use t-statistics. T-statistics are more variable than z-statistics b/c they contain an additional random quantity (the sample standard deviation $s$, rather than the z-statistic’s fixed population standard deviation $\sigma$).

They use the same formula, but substitute $s$ (sample stddev) for $\sigma$ (population stddev) and use the “inverse t” with $n - 1$ degrees of freedom to find the t-statistic (rather than the “inverse normal” to find the z-statistic): $\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$.
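The small-sample interval can be sketched with the standard library alone. Python's `statistics` module has no inverse t, so this example approximates it by integrating the t density numerically and bisecting. That approach is an illustrative stand-in, not a recommended method; in practice a library routine such as `scipy.stats.t.ppf` would normally supply the inverse t. All function names and the sample data here are hypothetical.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x] (steps is even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Approximate t_{alpha/2}: the point with area alpha/2 to its right."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:      # widen the bracket until it holds the answer
        hi *= 2
    for _ in range(60):                # bisection on the CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(sample, confidence=0.95):
    """Small-sample interval: xbar +/- t_{alpha/2} * s / sqrt(n), df = n - 1."""
    n = len(sample)
    xbar, s = mean(sample), stdev(sample)
    margin = t_critical(confidence, n - 1) * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical small sample (n = 8), for illustration only.
print(t_interval([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
```

For the same confidence level, `t_critical` comes out larger than the matching z value, so the t interval is wider. This reflects the extra variability from estimating $\sigma$ with $s$. Because it calls `math.gamma`, the sketch is only meant for small degrees of freedom.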