statistical hypothesis : a statement about the numerical value of a population parameter
null hypothesis : the status quo; assumed to be true unless there’s convincing data otherwise. Denoted $H_0$
alternative hypothesis : hypothesis that’s only accepted if there’s compelling evidence, denoted $H_a$
test statistic : sample statistic, computed from information in the sample, used to decide between null and alternative hypothesis.
Type I error : (relationship doesn’t actually exist); researcher rejects the null hypothesis in favor of the alternative when $H_0$ is true. The probability of committing a Type I error is denoted $\alpha$. aka false positive
Type II error : (relationship really does exist); when $H_0$ is accepted, but it’s actually false. The probability of this error is denoted by $\beta$. aka false negative
rejection region : the set of possible values of the test statistic for which the researcher will reject $H_0$ in favor of $H_a$.
criterion : the probability that defines whether a sample result is too unlikely to represent the underlying population.
critical value : the score that marks the inner edge of the region of rejection
one-tailed test : the alternative hypothesis is directional: the parameter is strictly greater than (or strictly less than) the specified value (e.g. “strength of pipe must be > 2400 psi”)
two-tailed test : the alternative hypothesis is that the parameter differs from the specified value in either direction (e.g. tolerance of a machined part)
p-value : aka observed significance level; the probability, assuming the null hypothesis is true, of observing a value of the test statistic that is at least as contradictory to the null hypothesis, and supportive of the alternative hypothesis, as the actual one computed from the sample data. Reworded: how likely a result at least this extreme would be if $H_0$ were true; reject $H_0$ when the p-value is smaller than $\alpha$.
power of the test : probability that the test will correctly lead to the rejection of the null hypothesis for a particular value of the parameter (e.g. $\mu$ or $p$) in the alternative hypothesis. The power is equal to $1 - \beta$.
significant : the result is unlikely to have occurred due to sampling error alone; indicates a rejection of the null hypothesis
significance level : This is $\alpha$. $1 - \alpha$ is the confidence level.
inferential statistics : procedures for deciding whether sample data represent a particular relationship in the population
parametric statistics : inferential stats for computing the mean; Require certain assumptions about the raw score population represented by the sample
nonparametric statistics : used for median and mode; don’t require assumptions about the raw score population of the sample
When the test statistic falls outside of the “rejection region”, we can’t reject the null hypothesis. We don’t accept it either, but rather say “the sample evidence is insufficient to reject $H_0$ at level $\alpha$”.
Elements of a hypothesis test
- Null hypothesis ($H_0$).
- Alternative (research) hypothesis ($H_a$).
- Test statistic.
- Rejection region, which uses $\alpha$ (also referred to as the “level of significance”).
- Assumptions: clear statements about the population being sampled.
- Experiment & calculation of test statistic: run the sampling experiment and compute the test statistic.
- Conclusion: rejection (with possible type I error) if the value is in the rejection region. Insufficient evidence to reject if it isn’t (given we don’t know the probability $\beta$, which is the likelihood of type II error).
Formulating a hypothesis
- Pick an alternative hypothesis
  - upper-tailed ($H_a: \mu > \mu_0$)
  - lower-tailed ($H_a: \mu < \mu_0$)
  - two-tailed ($H_a: \mu \neq \mu_0$)
- select null hypothesis ($H_0: \mu = \mu_0$).
Calculating a p-value
- Determine the value of the test stat corresponding with the result of the sampling experiment.
- If the test is one-tailed, the p-value is the area above or below (depending on which tail) the observed z-value. If it’s two-tailed, p-value is 2x the tail area beyond the z-value in the direction of z.
In practice, this means that you take the CDF of the z value (or 1 - CDF, depending on direction) and check if it’s smaller than the $\alpha$ you’ve chosen. If so, you should reject the null hypothesis, b/c the test statistic lives within the rejection region.
Calculating a p-value from a z score
You just need to take the cdf of the z-score. Ensure you’re capturing the tail(s) that you care about.
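A minimal sketch of the tail bookkeeping, using only the Python standard library. The function name `p_value_from_z` and the `tail` labels are illustrative choices, not something from the book.

```python
from statistics import NormalDist


def p_value_from_z(z, tail="upper"):
    """Return the p-value for an observed z statistic.

    tail: "upper" (H_a: >), "lower" (H_a: <), or "two" (H_a: !=).
    """
    cdf = NormalDist().cdf(z)  # area to the left of z under the standard normal
    if tail == "upper":
        return 1.0 - cdf  # area above z
    if tail == "lower":
        return cdf  # area below z
    if tail == "two":
        return 2.0 * min(cdf, 1.0 - cdf)  # twice the tail beyond z
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


# Reject H_0 when the p-value is smaller than the chosen alpha.
alpha = 0.05
print(p_value_from_z(1.96, "two") < alpha)
```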
Converting a two-tailed p-value to a one-tailed p-value
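One common rule, sketched here under the assumption that the null distribution is symmetric (z or t): halve the two-tailed p-value when the test statistic points in the direction of the alternative, otherwise take one minus that half. The function name and arguments are hypothetical.

```python
def one_tailed_from_two_tailed(p_two, stat, tail="upper"):
    """Convert a two-tailed p-value to a one-tailed p-value.

    Assumes a symmetric null distribution centered at 0 (z or t).
    stat is the observed test statistic; tail is "upper" or "lower".
    """
    if tail not in ("upper", "lower"):
        raise ValueError("tail must be 'upper' or 'lower'")
    half = p_two / 2.0
    # Does the statistic fall on the side the alternative hypothesis predicts?
    in_direction = stat > 0 if tail == "upper" else stat < 0
    return half if in_direction else 1.0 - half
```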
p-value for proportions / chi
- draw curve
- draw region
- label areas / test stat
- compute the p-value
To compute the p-value, evaluate the chi-square CDF at the test statistic (computed assuming the null hypothesis is true) and take the tail area you care about.
Statistical tests
Statistical tests often take the form of:
(observed difference - what we expect if null is true) / average variation
Z statistic testing
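Used when you have a large ($n \geq 30$) sample. Test statistic: $z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$. The rejection region depends on the alternative hypothesis.
This requires:
- a random sample
- the sample size is large enough ($n \geq 30$)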
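A rough sketch of a large-sample z test, assuming the usual statistic $z = (\bar{x} - \mu_0)/(s/\sqrt{n})$ with $s$ standing in for $\sigma$. The function name `z_test` and the sample mean of 2460 in the usage lines are made up for illustration; only $\mu_0 = 2400$, $s = 200$, and $n = 50$ echo numbers from these notes.

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar, mu_0, s, n, alpha=0.05, tail="upper"):
    """Large-sample z test for a mean; returns (z, p_value, reject_h0)."""
    std_error = s / sqrt(n)  # s approximates sigma when n is large
    z = (x_bar - mu_0) / std_error
    cdf = NormalDist().cdf(z)
    if tail == "upper":
        p_value = 1.0 - cdf
    elif tail == "lower":
        p_value = cdf
    elif tail == "two":
        p_value = 2.0 * min(cdf, 1.0 - cdf)
    else:
        raise ValueError("tail must be 'upper', 'lower', or 'two'")
    return z, p_value, p_value < alpha


# Hypothetical sample mean of 2460 psi against H_0: mu = 2400.
z, p_value, reject = z_test(x_bar=2460, mu_0=2400, s=200, n=50)
print(round(z, 2), round(p_value, 4), reject)
```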
T statistic testing
Used for small samples. Test statistic: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$, with $n - 1$ degrees of freedom.
The rejection region depends on the alternative hypothesis.
This requires:
- A random sample
- The population is approximately normal
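A small sketch of the t statistic, assuming $t = (\bar{x} - \mu_0)/(s/\sqrt{n})$ with $n - 1$ degrees of freedom. The standard library has no t distribution, so the critical value is looked up in a t table and passed in by hand. The sample data and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu_0):
    """Return (t, degrees_of_freedom) for a one-sample t test."""
    n = len(sample)
    std_error = stdev(sample) / sqrt(n)  # stdev() is the sample (n - 1) version
    return (mean(sample) - mu_0) / std_error, n - 1


def reject_upper_tailed(t, t_critical):
    """Upper-tailed rejection rule: reject H_0 when t exceeds the table value."""
    return t > t_critical


# Made-up small sample, testing H_0: mu = 4 against H_a: mu > 4.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t, df = t_statistic(sample, mu_0=4)
# 1.895 is the t-table critical value for alpha = .05 with 7 degrees of freedom.
print(round(t, 3), df, reject_upper_tailed(t, t_critical=1.895))
```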
Hypothesis about a population proportion
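Test statistic: $z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}}$, where $q_0 = 1 - p_0$. The rejection region depends on the alternative hypothesis.
This requires:
- A random sample of a binomial population
- Sample size $n$ is large (a common rule is that both $n p_0$ and $n q_0$ are at least 15)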
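A sketch of the proportion test, assuming the standard large-sample statistic $z = (\hat{p} - p_0)/\sqrt{p_0 q_0 / n}$. The function name and the 60-out-of-100 example are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p_0, tail="two"):
    """Large-sample z test for a population proportion; returns (z, p_value)."""
    p_hat = successes / n
    q_0 = 1.0 - p_0
    z = (p_hat - p_0) / sqrt(p_0 * q_0 / n)  # standard error uses the null value p_0
    cdf = NormalDist().cdf(z)
    if tail == "upper":
        return z, 1.0 - cdf
    if tail == "lower":
        return z, cdf
    if tail == "two":
        return z, 2.0 * min(cdf, 1.0 - cdf)
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


# Hypothetical data: 60 successes in 100 trials, H_0: p = 0.5, H_a: p != 0.5.
z, p_value = proportion_z_test(60, 100, 0.5)
print(round(z, 2), round(p_value, 4))
```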
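Calculating $\beta$ for a mean
- Calculate the value of $\bar{x}$ that corresponds to the border between acceptance/rejection regions, denoted $\bar{x}_0$. $\bar{x}_0 = \mu_0 + z_\alpha \sigma_{\bar{x}} \approx \mu_0 + z_\alpha \frac{s}{\sqrt{n}}$ for upper-tailed tests. (minus for lower, and both plus and minus w/ $z_{\alpha/2}$ instead for two-tailed)
- Specify the value of $\mu_a$ in the alternative hypothesis to calculate $\beta$ for. Convert $\bar{x}_0$ to z-values, using the alternative. $z = \frac{\bar{x}_0 - \mu_a}{\sigma_{\bar{x}}}$.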
Example
test statistic: $z = \frac{\bar{x} - 2400}{s/\sqrt{n}}$. Rejection region: $z > 1.645$ for $\alpha = .05$. (note: this is only one tailed) s: 200 n: 50
—
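Then find the z-value for $\bar{x}_0$ (the border between rejection/acceptance), measured against the alternative mean $\mu_a = 2425$: $\bar{x}_0 = 2400 + 1.645 \cdot \frac{200}{\sqrt{50}} \approx 2446.5$, and $z = \frac{2446.5 - 2425}{200/\sqrt{50}}$.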
Unsure: Where does 2425 come from? Do you just pick it out of a hat b/c it’s in the alternative hypothesis?
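so $z = .76$. The area under the curve is $\beta = .5 + .2764 = .7764$.
This $\beta$ value comes from Table II of our book, which gives the area under the curve between the mean and $z$. Add .5 when the border is bigger than the alternative mean (positive $z$ in an upper-tailed test).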
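The example can be checked numerically. This is a sketch assuming an upper-tailed test with $\mu_0 = 2400$, $\mu_a = 2425$, $s = 200$, $n = 50$, and $\alpha = .05$ (the value consistent with $z = .76$); the function name is hypothetical. The normal CDF replaces the table lookup, so the result differs from the table value only by rounding.

```python
from math import sqrt
from statistics import NormalDist


def beta_upper_tailed(mu_0, mu_a, s, n, alpha=0.05):
    """Return (x_bar_0, z, beta) for an upper-tailed test of a mean."""
    std_error = s / sqrt(n)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)  # about 1.645 for alpha = .05
    x_bar_0 = mu_0 + z_alpha * std_error  # border of the rejection region
    z = (x_bar_0 - mu_a) / std_error  # border measured from the alternative mean
    beta = NormalDist().cdf(z)  # P(fail to reject H_0 | mu = mu_a)
    return x_bar_0, z, beta


x_bar_0, z, beta = beta_upper_tailed(mu_0=2400, mu_a=2425, s=200, n=50)
print(round(x_bar_0, 1), round(z, 2), round(beta, 3))
print("power:", round(1.0 - beta, 3))
```

The CDF already includes the lower half of the curve, which is why no explicit +.5 appears here.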
Calculating $\beta$ for a p (proportion)
Same as above, but we use $\sigma_{\hat{p}}$ instead of $\sigma_{\bar{x}}$, aka $\sqrt{pq/n}$
Hypothesis about population variance
NOTE: Be careful on if you’re talking about variance or stddev and adjust accordingly.
test statistic: $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$, with $n - 1$ degrees of freedom
Assumes:
- random sample
- population is approximately normal
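For CVA (critical value approach), we get the critical values from the inverse chi-square CDF with $n - 1$ degrees of freedom, passing $\alpha/2$ and $1 - \alpha/2$ for a two-tailed test (or $1 - \alpha$ / $\alpha$ for an upper- / lower-tailed test).
For PVA (p-value approach): it’s the chi-square CDF of the test stat, accounting for tailedness: 1 - CDF for upper-tailed, CDF for lower-tailed, and twice the smaller of the two for two-tailed.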
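A sketch of the p-value approach for a variance test. The standard library has no chi-square distribution, so `chi2_cdf` below is a hand-rolled series for the regularized incomplete gamma function; in practice a library routine such as `scipy.stats.chi2.cdf` would be the usual choice. The function names, the example numbers, and the two-tailed convention (twice the smaller tail) are assumptions for illustration.

```python
from math import exp, lgamma, log


def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, y = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    # Sum y^k / (a (a+1) ... (a+k)) until the terms stop mattering.
    while term > 1e-16 * total and k < 100000:
        term *= y / (a + k)
        total += term
        k += 1
    return min(1.0, total * exp(a * log(y) - y - lgamma(a)))


def variance_test(n, s_sq, sigma_0_sq, tail="upper"):
    """Chi-square test for a population variance; returns (chi2, p_value)."""
    df = n - 1
    chi2 = df * s_sq / sigma_0_sq  # pass variances here, not standard deviations
    cdf = chi2_cdf(chi2, df)
    if tail == "upper":
        return chi2, 1.0 - cdf
    if tail == "lower":
        return chi2, cdf
    if tail == "two":
        return chi2, 2.0 * min(cdf, 1.0 - cdf)
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


# Hypothetical numbers: n = 10, s^2 = 6.0, H_0: sigma^2 = 4.0, H_a: sigma^2 > 4.0.
chi2, p_value = variance_test(n=10, s_sq=6.0, sigma_0_sq=4.0)
print(round(chi2, 2), round(p_value, 3))
```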