Elements of a Hypothesis Test
A statistical hypothesis test is a method of making decisions using data from a scientific study.
Learning Objective

Outline the steps of a standard hypothesis test.
Key Points
 Statistical hypothesis tests define a procedure that controls (fixes) the probability of incorrectly deciding that a default position (null hypothesis) is incorrect based on how likely it would be for a set of observations to occur if the null hypothesis were true.
 The first step in a hypothesis test is to state the relevant null and alternative hypotheses; the second is to consider the statistical assumptions being made about the sample in doing the test.
 Next, the relevant test statistic is stated, and its distribution is derived under the null hypothesis from the assumptions.
 After that, the relevant significance level and critical region are determined.
 Finally, values of the test statistic are observed and the decision is made whether to either reject the null hypothesis in favor of the alternative or not reject it.
Terms

significance level
The probability, fixed in advance, of rejecting the null hypothesis when it is actually true; that is, of declaring a result significant when it reflects only random variation.

null hypothesis
A hypothesis set up to be refuted in order to support an alternative hypothesis; presumed true until statistical evidence in the form of a hypothesis test indicates otherwise.
Example
 In a famous example of hypothesis testing, known as the Lady tasting tea example, a female colleague of Sir Ronald Fisher claimed to be able to tell whether the tea or the milk was added first to a cup. Fisher proposed to give her eight cups, four of each variety, in random order. One could then ask how probable it would be for her to get as many cups correct as she did purely by chance. The null hypothesis was that the lady had no such ability. The test statistic was a simple count of the number of successes in selecting the 4 cups. The critical region was the single case of 4 successes out of 4 possible, based on a conventional probability criterion (< 5%; 1 of 70 ≈ 1.4%). Fisher asserted that no alternative hypothesis was (ever) required. The lady correctly identified every cup, which would be considered a statistically significant result.
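The 1-in-70 figure in the example can be checked with a short calculation. The sketch below assumes that, under the null hypothesis, the lady picks 4 of the 8 cups completely at random, so the number of correct picks follows a hypergeometric distribution. The function name prob_correct and its parameters are illustrative only.

```python
from math import comb

def prob_correct(k, chosen=4, milk_first=4, total=8):
    """Probability that exactly k of the chosen cups are truly milk-first
    when the cups are picked purely at random (the null hypothesis)."""
    ways_right = comb(milk_first, k)                    # correct picks
    ways_wrong = comb(total - milk_first, chosen - k)   # incorrect picks
    return ways_right * ways_wrong / comb(total, chosen)

# Distribution of the test statistic (number of successes) under the null.
null_distribution = {k: prob_correct(k) for k in range(5)}
p_all_correct = null_distribution[4]
p_three_or_more = null_distribution[3] + null_distribution[4]

print(f"P(4 of 4 correct by chance)  = {p_all_correct:.4f}")
print(f"P(3 or more correct by chance) = {p_three_or_more:.4f}")
```

Under this assumption, extending the critical region to include 3 successes would push its probability well above 5%, which is why only the case of 4 successes qualifies.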
Full Text
A statistical hypothesis test is a method of making decisions using data from a scientific study. In statistics, a result is called statistically significant if it has been predicted as unlikely to have occurred by chance alone, according to a predetermined threshold probability—the significance level. Statistical hypothesis testing is sometimes called confirmatory data analysis, in contrast to exploratory data analysis, which may not have prespecified hypotheses. Statistical hypothesis testing is a key technique of frequentist inference.
Statistical hypothesis tests define a procedure that controls (fixes) the probability of incorrectly deciding that a default position (the null hypothesis) is incorrect, based on how likely a set of observations would be if the null hypothesis were true. Note that this probability of making an incorrect decision is neither the probability that the null hypothesis is true nor the probability that any specific alternative hypothesis is true. This contrasts with other techniques of decision theory, in which the null and alternative hypotheses are treated on a more equal basis.
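One way to see what "controls the probability of incorrectly rejecting" means is a small simulation. The sketch below is an illustration under stated assumptions, not part of the original text: it draws many samples for which the null hypothesis is true by construction, applies a two-sided z-test with known standard deviation, and counts how often the null is wrongly rejected. The function names, sample size, and seed are hypothetical choices.

```python
import random
from statistics import NormalDist

def rejects_null(sample, mu0, sigma, z_crit):
    """Two-sided z-test of H0: mean == mu0, with sigma assumed known."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / n ** 0.5)
    return abs(z) > z_crit

def false_rejection_rate(trials=20000, n=25, mu0=0.0, sigma=1.0,
                         alpha=0.05, seed=0):
    """Fraction of simulated samples, all drawn with H0 true, that are rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]  # H0 holds here
        rejections += rejects_null(sample, mu0, sigma, z_crit)
    return rejections / trials

rate = false_rejection_rate()
print(f"observed false rejection rate: {rate:.3f} (target alpha = 0.05)")
```

The observed rate should land close to the chosen significance level. It says nothing about how probable the null hypothesis itself is, which is the distinction drawn above.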
The Testing Process
The typical line of reasoning in a hypothesis test is as follows:
1. There is an initial research hypothesis of which the truth is unknown.
2. The first step is to state the relevant null and alternative hypotheses. This is important as misstating the hypotheses will muddy the rest of the process.
3. The second step is to consider the statistical assumptions being made about the sample in doing the test, for example, assumptions about statistical independence or about the form of the distributions of the observations. This is important because invalid assumptions will mean that the results of the test are invalid.
4. Decide which test is appropriate, and state the relevant test statistic T.
5. Derive the distribution of the test statistic under the null hypothesis from the assumptions.
6. Select a significance level (α), a probability threshold below which the null hypothesis will be rejected. Common values are 5% and 1%.
7. The distribution of the test statistic under the null hypothesis partitions the possible values of T into those for which the null hypothesis is rejected, the so-called critical region, and those for which it is not. The probability of the critical region is α.
8. Compute from the observations the observed value t_{obs} of the test statistic T.
9. Decide to either reject the null hypothesis in favor of the alternative or not reject it. The decision rule is to reject the null hypothesis H_{0} if the observed value t_{obs} is in the critical region, and to accept or "fail to reject" the hypothesis otherwise.
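The steps above can be sketched for one concrete case. The example below assumes a two-sided z-test of a population mean with a known standard deviation, so that the test statistic T is standard normal under the null hypothesis. The data values, mu0, and sigma are made up purely for illustration, and the function name is hypothetical.

```python
from statistics import NormalDist

def z_test_critical_region(sample, mu0, sigma, alpha=0.05):
    """Run a two-sided z-test using the critical-region decision rule.

    Assumes independent, normally distributed observations with known sigma,
    so T = (mean - mu0) / (sigma / sqrt(n)) is standard normal under H0.
    """
    n = len(sample)
    sample_mean = sum(sample) / n
    t_obs = (sample_mean - mu0) / (sigma / n ** 0.5)   # observed statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)       # edge of critical region
    reject = abs(t_obs) > z_crit                       # is t_obs in the region?
    return t_obs, z_crit, reject

# Illustrative, made-up measurements; H0: population mean is 5.0.
sample = [5.3, 4.9, 5.6, 5.1, 5.4, 5.8, 5.2, 5.5, 5.0]
t_obs, z_crit, reject = z_test_critical_region(sample, mu0=5.0, sigma=0.3)

print(f"t_obs = {t_obs:.3f}, critical value = {z_crit:.3f}")
print("reject H0" if reject else "fail to reject H0")
```

Here the critical region is the set of values with |T| greater than the critical value, and its probability under the null hypothesis is α by construction.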
An alternative process is commonly used, replacing the last three steps above:
7. Compute from the observations the observed value t_{obs} of the test statistic T.
8. From the statistic, calculate the p-value: the probability, under the null hypothesis, of observing a value of T at least as extreme as t_{obs}.
9. Reject the null hypothesis in favor of the alternative or do not reject it. The decision rule is to reject the null hypothesis if and only if the p-value is less than the significance level (the selected probability threshold).
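The p-value route can be sketched in the same spirit. The snippet below assumes a test statistic that is standard normal under the null hypothesis and a two-sided alternative. The observed value 2.2 is hypothetical, chosen only to illustrate the calculation.

```python
from statistics import NormalDist

def two_sided_p_value(t_obs):
    """P(|T| >= |t_obs|) when T is standard normal under H0."""
    return 2 * (1 - NormalDist().cdf(abs(t_obs)))

alpha = 0.05
t_obs = 2.2                      # hypothetical observed statistic
p_value = two_sided_p_value(t_obs)
reject = p_value < alpha         # decision rule: reject iff p-value < alpha

print(f"p-value = {p_value:.4f}")
print("reject H0" if reject else "fail to reject H0")
```

Reporting the p-value itself, rather than only the decision, lets a reader apply a different significance level without redoing the test.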
The two processes are equivalent. The former process was advantageous in the past when only tables of test statistics at common probability thresholds were available. It allowed a decision to be made without the calculation of a probability. It was adequate for classwork and for operational use, but it was deficient for reporting results. The latter process relied on extensive tables or on computational support not always available. The calculations are now trivially performed with appropriate software.
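The equivalence of the two decision rules can be checked numerically for a simple case. The sketch below assumes a two-sided test whose statistic is standard normal under the null hypothesis, and compares both rules over a grid of possible observed values. The function names and the grid are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def decide_by_critical_region(t_obs, alpha):
    """Reject when t_obs falls in the two-sided critical region."""
    return abs(t_obs) > STD_NORMAL.inv_cdf(1 - alpha / 2)

def decide_by_p_value(t_obs, alpha):
    """Reject when the two-sided p-value is below alpha."""
    return 2 * (1 - STD_NORMAL.cdf(abs(t_obs))) < alpha

# Observed values from -4.0 to 4.0 in steps of 0.1, at two common levels.
grid = [i / 10 for i in range(-40, 41)]
disagreements = [
    (t, alpha)
    for alpha in (0.05, 0.01)
    for t in grid
    if decide_by_critical_region(t, alpha) != decide_by_p_value(t, alpha)
]
print("disagreements:", disagreements)
```

An empty list means both rules gave the same decision at every grid point, which is what the equivalence claim predicts.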
Source: Boundless. “Elements of a Hypothesis Test.” Boundless Statistics. Boundless, 01 Jul. 2015. Retrieved 03 Jul. 2015 from https://www.boundless.com/statistics/textbooks/boundlessstatisticstextbook/estimationandhypothesistesting12/hypothesistestingonesample54/elementsofahypothesistest2622713/