Introduction
I'd like to understand why I have to use a fear as the Alternative-Hypothesis and not as the Null-Hypothesis. I'm using my homework as an example, which says the following (I translated it manually, so I hope the exercise is understandable):
A big band finds out that one CD from their first 20 CDs has a wrong cover on
it. Now they're afraid that at least 5% of all covers are wrong. So they
want to run a significance test with a significance level of 10% and with
the Null-Hypothesis: "The share of the wrong covers is smaller than 5%". If the
result shows that their fear is true, they will ask the producer to sell the
CDs for a lower price.
a) Explain why they chose this Null-Hypothesis.
The solution says that they chose this Null-Hypothesis in order to be able to limit the risk of the error, but I'm wondering why it isn't possible to do that with the hypothesis "The share of the wrong covers is at least 5%", because this would lead to these two hypotheses:
$$ H_{0}: p \geq 0{,}05 \\ H_{1}: p < 0{,}05 $$
But the correct way would be:
$$ H_{0}: p < 0{,}05 \\ H_{1}: p \geq 0{,}05 $$
Now let's stick to the "correct way" for a moment and assume that I do a hypothesis test. I'd look at the boundary case of the Null-Hypothesis, so instead of $H_{0}: p < 0{,}05$ I'd calculate with $p = 0{,}05$. After that I can calculate the "area" which helps me to decide when I can reject the Null-Hypothesis and when not, by finding the smallest number $k$ of CDs with a wrong cover for which:
$$ P\left(X \geq k | p = 0{,}05\right) \leq 0{,}1 $$
So the region of rejection would be:
$$ R = \left\{k, \ldots, n\right\} $$
where:
- $n$ is the number of CDs which are used in the hypothesis test,
- $k$ is the smallest critical value at which we reject our Null-Hypothesis, or in other words the minimal number of CDs with a wrong cover for which we reject the Null-Hypothesis, and
- $X$ represents the number of CDs which have a wrong cover.
So far so good, that makes sense to me: We're calculating the rejection-area to be able to conclude whether the result of our test would be the same as if we tested all the CDs which they sold.
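To make the search for $k$ concrete, here is a minimal sketch of how the critical value could be computed. The exercise doesn't state $n$, so `n = 100` is purely an assumption for illustration, and the helper name `upper_critical_value` is made up for this example.

```python
from math import comb

def binom_pmf(i, n, p):
    """P(X = i) for X ~ Binomial(n, p)."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

def upper_critical_value(n, p0, alpha):
    """Smallest k with P(X >= k | p = p0) <= alpha, or None if no k works."""
    tail = 0.0
    best = None
    # Build the upper tail from k = n downwards; stop once it exceeds alpha.
    for k in range(n, -1, -1):
        tail += binom_pmf(k, n, p0)
        if tail > alpha:
            break
        best = k
    return best

n = 100  # hypothetical sample size, not given in the exercise
k = upper_critical_value(n, p0=0.05, alpha=0.10)
print(k)  # the rejection-area would then be {k, ..., n}
```

With these assumed numbers the boundary value $p = 0{,}05$ is the only value of $p$ that enters the calculation, which is why the "corner-case" is used.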
EDIT (1):
A small addition to the conclusion here: We've now calculated our rejection-area to be able to decide whether our fear (the Alternative-Hypothesis) is "true" (or "valid") by testing it on our dataset of $n$ CDs. If it turns out that our dataset landed in our rejection-area, we can think that our fear (the Alternative-Hypothesis) is probably the "better" assumption than the Null-Hypothesis. But this could change if we did more tests and the "hit-rate" of the Null-Hypothesis increased and became bigger than that of the Alternative-Hypothesis, which is why we can't say that this hypothesis is "the absolutely correct one". That's correct, right?
Question
Now I'm wondering: Why can't I just use their fear as the Null-Hypothesis (my first two hypotheses)? I can do the same steps as in the "correct way" and get a rejection-area for my Null-Hypothesis as well. This would lead to the following hypothesis test:
$$ P\left(X \leq k | p = 0{,}05 \right) \leq 0{,}1 $$
After finding a suitable value for $k$, I'd have this rejection-area:
$$ R = \left\{0, \ldots, k\right\} $$
Which can be interpreted like this(?): "If there are at most $k$ CDs with a wrong cover among the $n$ CDs, then I can be sure that fewer than 5% of all CDs have a wrong cover." Where is my thinking wrong?
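For comparison, the lower-tail search for this reversed test could be sketched like this. Again, the sample sizes are assumptions (the exercise gives no $n$), and `lower_critical_value` is a hypothetical helper name.

```python
from math import comb

def lower_critical_value(n, p0, alpha):
    """Largest k with P(X <= k | p = p0) <= alpha, or None if even k = 0 fails."""
    cdf = 0.0
    best = None
    # Build the lower tail from k = 0 upwards; stop once it exceeds alpha.
    for k in range(n + 1):
        cdf += comb(n, k) * p0**k * (1 - p0)**(n - k)
        if cdf > alpha:
            break
        best = k
    return best

# Hypothetical sample sizes, chosen only for illustration.
print(lower_critical_value(100, p0=0.05, alpha=0.10))
print(lower_critical_value(20, p0=0.05, alpha=0.10))
```

Under these assumptions a small sample may have no rejection-area at all: for $n = 20$, already $P(X = 0 \mid p = 0{,}05) = 0{,}95^{20} \approx 0{,}36$ is larger than $0{,}1$, so the function returns `None`.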
Which external sources have you tried out?
I watched StatQuest's video about the alternative hypothesis, and in this part he explained that you can't assume that our alternative hypothesis is right, since it includes "all" alternative hypotheses (if we're using the correct way). I understood his explanation of his example, but I can't find a "connection" from his example to my problem/homework. Didn't I show that $H_{0}: p \geq 0{,}05$ is true if I found more than $k$ wrong CDs according to my (wrong) hypothesis test? Didn't I show in this hypothesis test that, no matter which other alternative hypotheses are included in the Alternative-Hypothesis, they all have $p \geq 0{,}05$?
Summary
The correct one
$$ H_{0}: p < 0{,}05\\ H_{1}: p \geq 0{,}05 $$
Our rejection-area:
$$ R = \left\{k, \ldots, n\right\} $$
Interpretation of our result:
We get into our rejection-area $\to$ The possibility of our null hypothesis is smaller than the alternative hypothesis.
In other words:
The possibility that there are at most 5% of the CDs with a wrong cover in a dataset is smaller than the possibility that at least 5% of the CDs have a wrong cover.
We don't get into our rejection-area $\to$ The possibility of our null hypothesis is greater than the alternative hypothesis.
In other words:
The possibility that there are at most 5% of the CDs with a wrong cover in a dataset is greater than the possibility that at least 5% of the CDs have a wrong cover.
The "false" one
$$ H_{0}: p \geq 0{,}05 \\ H_{1}: p < 0{,}05 $$
Our rejection area:
$$ R = \left\{0, \ldots, k\right\} $$
Interpretation of our result:
We get into our rejection-area $\to$ The possibility of our null hypothesis is smaller than the alternative hypothesis.
In other words:
The possibility that there are at most 5% of the CDs with a wrong cover in a dataset is greater than the possibility that at least 5% of the CDs have a wrong cover.
We don't get into our rejection-area $\to$ The possibility of our null hypothesis is greater than the alternative hypothesis.
In other words:
The possibility that there are at most 5% of the CDs with a wrong cover in a dataset is smaller than the possibility that at least 5% of the CDs have a wrong cover.
I achieved the same result in both cases, correct? So why is it not possible to choose the "false" one?
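One way to check whether the two set-ups really give the same result is to list, for every possible count of wrong covers, which test would reject. This is only a sketch under an assumed sample size of `n = 100` (not part of the exercise); the helper names are made up.

```python
from math import comb

def pmf(i, n, p):
    """P(X = i) for X ~ Binomial(n, p)."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

def tail_ge(k, n, p):
    """P(X >= k)."""
    return sum(pmf(i, n, p) for i in range(k, n + 1))

def tail_le(k, n, p):
    """P(X <= k)."""
    return sum(pmf(i, n, p) for i in range(0, k + 1))

n, p0, alpha = 100, 0.05, 0.10  # n is a hypothetical choice

# "Correct" set-up: H0: p < 0.05, reject for large counts.
reject_correct = [x for x in range(n + 1) if tail_ge(x, n, p0) <= alpha]
# "False" set-up: H0: p >= 0.05, reject for small counts.
reject_reversed = [x for x in range(n + 1) if tail_le(x, n, p0) <= alpha]
# Counts where neither test rejects its Null-Hypothesis.
undecided = [x for x in range(n + 1)
             if x not in reject_correct and x not in reject_reversed]

print(min(reject_correct), reject_reversed, undecided)
```

With these assumed numbers there is a range of counts where neither test rejects, so "not in the rejection-area" of one test does not put the data into the rejection-area of the other.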
"You might prove the alternative hypothesis" and "in which case you also take no action because you haven't proved anything." are mindblowing for me at the moment. How can I imagine that? "failing to reject the null hypothesis does not mean that we believe the null hypothesis is true." But isn't that the reason why we do a hypothesis test? We create a hypothesis and look whether we can reject this hypothesis or not, according to our "conditions" given by the rejection-areas. – TornaxO7 May 02 '21 at 07:50
"Anyway, I'm going to wait until someone who is more expert in statistics chimes in." Ok, but thank you for the "discussion"! Saying that the hypothesis test works only for our current dataset improved my understanding of its meaning :) – TornaxO7 May 02 '21 at 08:46