What is Hypothesis Testing?
Hypothesis Testing in statistics is exactly that – testing a hypothesis, where a hypothesis is a theory about a given situation. For example, you might have a coin that you suspect is biased as coin tossing seems to be favouring heads over tails. You might like to test the hypothesis that the coin is biased in favour of heads – this would be hypothesis testing.
See this example and some other Examples of Hypothesis Testing.
So, what is a hypothesis test? First, consider a statement that is normally accepted or expected, such as ‘my coin is fair’. This is known as the null hypothesis. When a situation arises that appears to contradict it, we present an alternative hypothesis. Part of the hypothesis test is to set up an experiment to find evidence for or against the null hypothesis. In the most common approach, we calculate the probability of obtaining the observed result, or one more extreme, assuming the null hypothesis is true. If this probability is low enough (below the significance level), one can conclude that the result is unlikely to be due to chance alone. In this case, the evidence favours the alternative hypothesis and the null hypothesis should be rejected. In hypothesis testing we only ever ACCEPT or REJECT the null hypothesis.
In order to test your hypothesis mathematically, you must first be very clear about what you are testing. The hypothesis test should be set up in a formal fashion.
Null and Alternative Hypotheses
The first step is to write down the statement you wish to challenge together with its alternative. This means stating a null hypothesis and an alternative hypothesis, conventionally called $H_0$ and $H_1$ because writing out the full names is tedious. The following shows how we might write the null and alternative hypotheses in words, although we typically write them as equations:
$H_0$: This is the commonly accepted theory, the one that is being challenged. It is the opposite of the alternative hypothesis. For example, the coin mentioned above is FAIR.
$H_1$: The alternative hypothesis is the one being presented. This is the theory that is being tested using probabilities. For example, the coin mentioned above is BIASED.
However, the null and alternative hypotheses should be written in terms of the quantity being assessed. The null hypothesis will then be accepted or rejected depending on the probability of the experiment's result.
In the example above, the quantity being assessed is the probability $p$ of tossing heads. Strictly speaking, $p$ is a parameter of the distribution, while the test statistic is the value calculated from the experiment, such as the number of heads observed. The null and alternative hypotheses should both be written in terms of $p$:
$H_0: p = 0.5$
$H_1: p > 0.5$
In order to test a hypothesis, a significance level must be specified. It is the threshold probability below which the result of the experiment is considered too unlikely to have occurred by chance. For example, suppose an experiment is conducted with the coin mentioned above, and the outcome (or a more extreme one) occurs with a probability of 0.04 assuming that the coin is fair. This is below the 5% significance level and so the null hypothesis should be rejected. Note that for discrete probability distributions, it is unlikely that the probability of an outcome (or worse) is exactly equal to the significance level. We usually go for the outcome that gets us the closest.
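As a rough illustration of the comparison described above, the following Python sketch computes the probability of an outcome "or worse" for a fair coin and compares it with a 5% significance level. The experiment (30 tosses, 20 heads) and the function name `upper_tail_probability` are assumptions made for this example only.

```python
from math import comb

def upper_tail_probability(n, k, p=0.5):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical experiment: these numbers are assumed for illustration.
n_tosses = 30
heads_seen = 20
alpha = 0.05  # 5% significance level

# Probability of seeing 20 or more heads if the coin is fair (H0: p = 0.5).
p_value = upper_tail_probability(n_tosses, heads_seen)
reject_null = p_value < alpha

print(f"P(X >= {heads_seen}) = {p_value:.4f}")
print("Reject H0" if reject_null else "Do not reject H0")
```

With these assumed numbers the probability is roughly 0.049, just below 0.05, so the null hypothesis would be rejected at the 5% level.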
One-Tailed and Two-Tailed Tests
If the alternative hypothesis suggests bias in a given direction, e.g. the probability of tossing heads is greater than 0.5, then the test is one-tailed. Conversely, if the direction is not specified, the test is two-tailed. For example, the following hypotheses could be stated if the coin could be biased either way:
$H_0$: $p = 0.5$
$H_1$: $p\ne 0.5$
For a two-tailed test, since we are challenging the null hypothesis in either direction, we must split the significance level between the two tails of the distribution. Probabilities at both extremes must be calculated and assessed. See some Examples of Hypothesis Testing.
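The splitting of the significance level can be sketched in Python as follows. This is a minimal example under assumed data (30 tosses, 20 heads); the function names `binomial_pmf` and `two_tailed_decision` are hypothetical, and each tail is simply compared with half of the significance level.

```python
from math import comb

def binomial_pmf(n, i, p=0.5):
    """Return P(X = i) for X ~ Binomial(n, p)."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

def two_tailed_decision(n, k, alpha=0.05, p=0.5):
    """Compare each tail probability at outcome k with alpha / 2."""
    lower = sum(binomial_pmf(n, i, p) for i in range(0, k + 1))  # P(X <= k)
    upper = sum(binomial_pmf(n, i, p) for i in range(k, n + 1))  # P(X >= k)
    half_alpha = alpha / 2
    reject = lower < half_alpha or upper < half_alpha
    return lower, upper, reject

# Hypothetical experiment: 20 heads in 30 tosses, tested at the 5% level.
lower_tail, upper_tail, reject_null = two_tailed_decision(30, 20)
print(f"P(X <= 20) = {lower_tail:.4f}, P(X >= 20) = {upper_tail:.4f}")
print("Reject H0" if reject_null else "Do not reject H0")
```

With this assumed data the upper tail probability is roughly 0.049, which is below 0.05 but not below 0.025. A one-tailed test at the 5% level would therefore reject the null hypothesis, while the two-tailed test does not.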
Critical Value and Critical Region/Acceptance Region
Given the significance level, the critical region is the set of experiment outcomes that lead to rejection of the null hypothesis. Likewise, the acceptance region is the set of experiment outcomes that lead to acceptance of the null hypothesis. The critical values are the boundary values between the critical and acceptance regions.
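To make these definitions concrete, here is a short Python sketch that finds the critical value and both regions for a one-tailed test of $H_1: p > 0.5$. It assumes 30 tosses and a 5% significance level, and it uses the convention that the probability of the critical region must be below the significance level. The function names are hypothetical.

```python
from math import comb

def upper_tail(n, k, p=0.5):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def upper_critical_value(n, alpha=0.05, p=0.5):
    """Smallest k with P(X >= k) < alpha, or None if no outcome qualifies."""
    for k in range(n + 1):
        # The tail probability shrinks as k grows, so the first hit is the smallest.
        if upper_tail(n, k, p) < alpha:
            return k
    return None

n_tosses = 30  # assumed number of tosses
critical_value = upper_critical_value(n_tosses)

if critical_value is None:
    critical_region = []
    acceptance_region = list(range(n_tosses + 1))
else:
    critical_region = list(range(critical_value, n_tosses + 1))
    acceptance_region = list(range(critical_value))

print("Critical value:", critical_value)
print("Critical region:", critical_region)
print("Acceptance region:", acceptance_region)
```

Under these assumptions the critical value is 20 heads: outcomes of 20 to 30 heads fall in the critical region and outcomes of 0 to 19 heads fall in the acceptance region.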
The p-value is the probability, assuming the null hypothesis is true, of an experiment outcome at least as extreme as the one observed. In the approach mentioned above, the null hypothesis is rejected if this probability is below the significance level. Alternatively, one can choose the significance level first, define the critical region for it, and see whether the outcome of the experiment falls inside or outside this critical region. The main advantage of the original approach is seeing at what levels the outcome is significant. This, however, can introduce bias in the choice of the significance level so as to ensure rejection of the null hypothesis.
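The idea of seeing at what levels an outcome is significant can be sketched in Python by comparing a single p-value with several common significance levels. The data (30 tosses, 20 heads) and the list of levels are assumptions chosen for illustration.

```python
from math import comb

def upper_tail(n, k, p=0.5):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical experiment: 20 heads in 30 tosses, testing H1: p > 0.5.
p_value = upper_tail(30, 20)

# Commonly quoted significance levels (assumed here: 10%, 5% and 1%).
levels = [0.10, 0.05, 0.01]
significant_at = [level for level in levels if p_value < level]

print(f"p-value = {p_value:.4f}")
print("Significant at levels:", significant_at)
```

Here the outcome is significant at the 10% and 5% levels but not at the 1% level, which is why the significance level should be fixed before the experiment is carried out.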
See some Examples of Hypothesis Testing.