
Problem: Let $X_1,X_2,\ldots,X_n$ be independent and identically distributed (i.i.d.) random variables with common CDF $F$. Fix $x_0\in\mathbb{R}$ and find an unbiased estimator for $F(x_0)$. Show that your estimator is a UMVUE for $F(x_0)$ (i.e., a uniformly minimum variance unbiased estimator).

So the problem is to find an unbiased estimator $\hat{F}(x_0)$ for $F(x_0)$ such that if $T$ is any unbiased estimator for $F(x_0)$, then $\operatorname{Var}(T)\geq \operatorname{Var}(\hat{F}(x_0))$.


Attempt: I think I have found an unbiased estimator for $F(x_0)$, but after many attempts I have no idea how to show that it is a UMVUE. My estimator is $$\hat{F}(x_0)=\frac{1}{n}\sum_{i=1}^nI\{X_i\leq x_0\}$$ where $I\{X_i\leq x_0\}$ is the indicator function that equals $1$ when $X_i\leq x_0$ and $0$ otherwise. We have $$E[I\{X_i\leq x_0\}]=P(X_i\leq x_0)=F(x_0),$$ so by linearity of expectation $\hat{F}(x_0)$ is unbiased.

But to show that it is a UMVUE, the only tool I have is the Cramér-Rao inequality: $$\operatorname{Var}(T)\geq\frac{[g'(\theta)]^2}{n\,E\left[\left(\frac{\partial}{\partial \theta}\log f(X;\theta)\right)^2\right]}\tag{1}$$ for any unbiased estimator $T$ of $g(\theta)$. But here, if $\theta=F(x_0)$, how do we differentiate $f$ with respect to $\theta$?
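One hedged way to make sense of the derivative: as a simplifying assumption (not part of the problem statement), suppose we only observe the indicators $Y_i=I\{X_i\leq x_0\}$, which are i.i.d. Bernoulli($\theta$) with $\theta=F(x_0)$. Writing $p(y;\theta)$ for their pmf (notation introduced here for illustration), the bound can be worked out as a sketch:

$$\begin{aligned}
p(y;\theta)&=\theta^{y}(1-\theta)^{1-y},\qquad y\in\{0,1\},\\
\frac{\partial}{\partial\theta}\log p(y;\theta)&=\frac{y}{\theta}-\frac{1-y}{1-\theta}=\frac{y-\theta}{\theta(1-\theta)},\\
E\left[\left(\frac{\partial}{\partial\theta}\log p(Y;\theta)\right)^2\right]&=\frac{\operatorname{Var}(Y)}{\theta^2(1-\theta)^2}=\frac{1}{\theta(1-\theta)},\\
\operatorname{Var}(T)&\geq\frac{\theta(1-\theta)}{n}\quad\text{for } g(\theta)=\theta.
\end{aligned}$$

This only covers estimators built from the indicators, so it does not by itself rule out a better estimator that uses the full sample.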

I got $$\operatorname{Var}(\hat{F}(x_0))=\frac{F(x_0)(1-F(x_0))}{n},\tag{2}$$ so I am trying to show that the right-hand side of $(1)$ equals $(2)$ when $\theta=F(x_0)$ and $g(\theta)=\theta$.
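As a sanity check on unbiasedness and on formula $(2)$, here is a minimal simulation sketch. It assumes Uniform(0,1) data purely for illustration, so that $F(x_0)=x_0$; the helper names `ecdf_at` and `simulate` are made up for this example.

```python
import random

def ecdf_at(sample, x0):
    """Empirical distribution function of `sample` evaluated at x0."""
    return sum(1 for x in sample if x <= x0) / len(sample)

def simulate(n, x0, reps, seed=0):
    """Return the Monte Carlo mean and sample variance of ecdf_at over `reps` samples."""
    rng = random.Random(seed)
    # Assumption for illustration: X_i ~ Uniform(0, 1), so F(x0) = x0 on [0, 1].
    values = [ecdf_at([rng.random() for _ in range(n)], x0) for _ in range(reps)]
    mean = sum(values) / reps
    var = sum((v - mean) ** 2 for v in values) / (reps - 1)
    return mean, var

n, x0 = 20, 0.3
mean, var = simulate(n, x0, reps=20000)
print("mean of estimator:", mean, "target F(x0):", x0)
print("variance of estimator:", var, "formula (2):", x0 * (1 - x0) / n)
```

The two printed pairs should agree up to Monte Carlo error; this illustrates $(2)$ but says nothing about optimality.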


Edit: I looked it up, and this estimator for $F(x_0)$ is known as the empirical distribution function. However, I found no indication that it is a UMVUE.

James
  • Here I give a counter-example to show this empirical distribution function can have higher variance than another unbiased estimator: https://math.stackexchange.com/questions/4838854/proving-empirical-distribution-function-is-the-effective-estimator-for-cumulativ/4838876#4838876 – Michael Feb 04 '24 at 15:23

1 Answer


Based on the problem statement, we can't assume that a pdf $f(x)$ necessarily exists, so we cannot apply the Cramér-Rao bound. But given that $F$ ranges over all possible CDFs (i.e., we're not restricted to any particular parametric family), the vector of order statistics $(X_{(1)},\dots,X_{(n)})$ is a complete sufficient statistic (for a proof, see Jun Shao, Mathematical Statistics, 2nd ed., Example 2.17, pp. 111-112). By the Lehmann-Scheffé theorem, any unbiased estimator of $F(x_0)$ that is a function of this statistic is the unique UMVUE. The empirical distribution function $\hat F(x_0)=\frac1n\sum_{i=1}^n I\{X_{(i)}\le x_0\}$ is unbiased and depends on the sample only through the order statistics, so it is the UMVUE.
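To illustrate why the empirical distribution function is the natural candidate, here is a small sketch of the Rao-Blackwell step. It relies on the fact that, for i.i.d. data, every ordering of the observed values is equally likely given the sorted sample, so conditioning the crude unbiased estimator $I\{X_1\le x_0\}$ on the sorted sample amounts to averaging it over all permutations. The sample values and helper names below are made up for the example.

```python
from fractions import Fraction
from itertools import permutations

def ecdf_at(sample, x0):
    """Empirical distribution function at x0, as an exact fraction."""
    return Fraction(sum(1 for x in sample if x <= x0), len(sample))

def symmetrized_first_indicator(sample, x0):
    """Average of I{first coordinate <= x0} over all orderings of the sample."""
    perms = list(permutations(sample))
    hits = sum(1 for p in perms if p[0] <= x0)
    return Fraction(hits, len(perms))

sample = [2.5, -1.0, 0.7, 3.1, 0.2]  # hypothetical data
x0 = 1.0

# The ECDF is unchanged by sorting, i.e. it is a function of the order statistics.
assert ecdf_at(sample, x0) == ecdf_at(sorted(sample), x0)

# Averaging the crude estimator over orderings recovers the ECDF.
print(symmetrized_first_indicator(sample, x0), ecdf_at(sample, x0))  # both 3/5
```

The check is only a finite illustration of the conditioning argument; completeness of the order statistics still has to come from the cited theorem.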

Brent Kerby
  • Is $(X_1,\ldots,X_n)$ also complete? The theorem you are referring to assumes the existence of a density and says that the vector of order statistics $(X_{(1)},\ldots,X_{(n)})$ is complete sufficient. Since $\hat F(x_0)=\frac1n\sum_{i=1}^n I(X_{(i)}\le x_0)$, this is the UMVUE of $F(x_0)$ by Lehmann-Scheffé. – StubbornAtom Feb 07 '21 at 08:20
  • See counter-example here: https://math.stackexchange.com/questions/4838854/proving-empirical-distribution-function-is-the-effective-estimator-for-cumulativ/4838876#4838876 – Michael Feb 04 '24 at 15:31