
I remember that, when choosing between a z-test and a t-test, the z-test is said to require a sample size of n>30, while the t-test is used for n<30. (In general, does that make 30 the maximum sample size for a t-test?)

In ANOVA, I know there must be at least two groups, but I don't know what sample size is required. (In general, is it like the t-test, with a maximum sample size of 30?)

X's
  • 25
  • The rule of 30 is almost completely fanciful, and should not be considered in 'power and sample size' computations for z tests, t tests, or ANOVAs. // For z tests, computing the power for a given sample size, population variance, and effect to detect is elementary. For t tests you have to use the noncentral t distribution (formulas in math stat and experimental design texts) or simulation. – BruceET Jan 30 '22 at 08:51

1 Answer


Suppose you have a sample of size $n=40$ from a normal population with estimated $\sigma = 10,$ that you want to use a t test of $H_0: \mu = 50$ against $H_a: \mu > 50$ at level $\alpha = 0.05 = 5\%,$ and that you want 80% power to reject if $\mu_a = 52.$ Is $n = 40$ large enough?

A test for one hypothetical dataset (using R) is shown below. The P-value $0.024 < 0.05 = 5\%$ indicates rejection in this one instance.

set.seed(130)
x = rnorm(40, 52, 10)
t.test(x, mu = 50, alt="greater")
    One Sample t-test

data:  x
t = 2.0415, df = 39, p-value = 0.024
alternative hypothesis: true mean is greater than 50
95 percent confidence interval:
 50.61081      Inf
sample estimates:
mean of x
  53.4965

A simulation of 100,000 such datasets and tests will give us an idea of what the rejection probability is.

set.seed(2022)
pv = replicate(10^5, t.test(rnorm(40,52,10), mu=50, alt="g")$p.val)
mean(pv <= 0.05)
[1] 0.34678

So, the actual power against the alternative value $\mu_a = 52$ is only about 35%, not the desired 80%. With a little experimentation, we can see that a sample size of about $n = 160$ would be required for power 80% in these circumstances. [More specifically, at $n = 160$ the simulated power is $0.808 \pm 0.002$.]

pv = replicate(10^5, t.test(rnorm(160,52,10), mu=50, alt="g")$p.val)
mean(pv <= 0.05)
[1] 0.80837                   # approx. power
2*sd(pv <= 0.05)/sqrt(10^5)
[1] 0.002484384               # approx. 95% margin of simulation error
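As a cross-check on the simulations, base R's `power.t.test` (in the `stats` package) can compute the same quantities without simulation. The sketch below assumes the setup used above ($\sigma = 10,$ a shift of $52 - 50 = 2,$ one-sided test at the 5% level); the variable names are only illustrative. The results should land close to the simulated values, apart from simulation error.

```r
# Power of the one-sided, one-sample t test for a shift of 2 when sd = 10
p40  <- power.t.test(n = 40,  delta = 2, sd = 10, sig.level = 0.05,
                     type = "one.sample", alternative = "one.sided")$power
p160 <- power.t.test(n = 160, delta = 2, sd = 10, sig.level = 0.05,
                     type = "one.sample", alternative = "one.sided")$power

# Leave n unspecified and give the target power to solve for the sample size
n80  <- power.t.test(power = 0.80, delta = 2, sd = 10, sig.level = 0.05,
                     type = "one.sample", alternative = "one.sided")$n

p40; p160
ceiling(n80)   # round up to a whole number of observations
```

Solving for $n$ directly replaces the "little experimentation" with sample sizes, and it may give a value slightly below 160, since 160 already overshoots 80% a little.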

Note: Many statistical software programs have pre-programmed 'power and sample size' procedures. Some Internet sites also have useful procedures (unfortunately, others have garbage; be careful). I don't know the level of your statistics background, but if you are ready for a bit of theory and an introduction to non-central t distributions, you can look for the relevant equations.
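For readers who want to see the non-central t computation itself, here is a minimal sketch. It assumes the standard result that, under the alternative, the one-sample t statistic has a noncentral t distribution with $n-1$ degrees of freedom and noncentrality parameter $\sqrt{n}\,(\mu_a - \mu_0)/\sigma.$ The function name `power_t_one_sided` and its defaults are made up for this example.

```r
# Power of the one-sided, one-sample t test via the noncentral t distribution.
# Defaults match the example in this answer (mu0 = 50, mu_a = 52, sigma = 10).
power_t_one_sided <- function(n, mu0 = 50, mu_a = 52, sigma = 10, alpha = 0.05) {
  df   <- n - 1
  ncp  <- sqrt(n) * (mu_a - mu0) / sigma   # noncentrality parameter
  crit <- qt(1 - alpha, df)                # critical value under H0
  pt(crit, df, ncp = ncp, lower.tail = FALSE)  # P(T > crit) under the alternative
}

power_t_one_sided(40)
power_t_one_sided(160)

# Smallest n on a grid that reaches at least 80% power
n_grid <- 2:300
n_min  <- min(n_grid[sapply(n_grid, power_t_one_sided) >= 0.80])
n_min
```

The grid search is a deliberately simple choice; a root finder such as `uniroot` would also work, but the grid makes the logic easy to follow.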

BruceET
  • 51,500