Questions tagged [statistics]

Mathematical statistics is the study of statistics from a mathematical standpoint, using probability theory and other branches of mathematics such as linear algebra and analysis.

Statistics is the science of the collection, organization, and interpretation of data. It deals with many aspects of data, including the planning of data collection in terms of the design of surveys and experiments. (From Wikipedia)

More specifically, mathematical statistics is the study of statistics from a mathematical standpoint, using probability theory as well as other branches of mathematics such as linear algebra and mathematical analysis. (From Wikipedia)

For questions that are more generally about collecting and processing data, consider posting on Cross Validated or DSSE instead.

37109 questions
4
votes
1 answer

Approximation of sample mean distribution

Suppose that we have an iid sample $X_1,\dots,X_n$ with a distribution function $F$. Denote $\bar X_n:=\frac{1}{n}\sum_{i=1}^n X_i$ and $\bar X_n^*:=\frac{1}{n}\sum_{i=1}^n X_i^*,$ where $X_1^*,\dots,X_n^*$ are iid from the empirical distribution…
user25574
  • 49
  • 1
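A minimal sketch of the resampling scheme described in the excerpt above, assuming the $X_i^*$ are drawn with replacement from the observed sample (the usual nonparametric bootstrap). The exponential data, the helper name `bootstrap_means`, and the number of resamples are illustrative assumptions, not part of the question.

```python
import random
import statistics

def bootstrap_means(sample, n_resamples=2000, seed=0):
    """Return the means of resamples drawn with replacement from `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)]

# Hypothetical data: 50 draws from an exponential distribution with mean 1.
data_rng = random.Random(1)
sample = [data_rng.expovariate(1.0) for _ in range(50)]

x_bar = statistics.fmean(sample)   # observed sample mean
boot = bootstrap_means(sample)     # realizations of the bootstrap sample mean

# The centered values (bootstrap mean - observed mean) are commonly used
# as a stand-in for the distribution of (sample mean - true mean).
centered = [m - x_bar for m in boot]
print(x_bar, statistics.fmean(boot), statistics.stdev(centered))
```

The spread of `centered` can then be compared with a normal approximation or with the known sampling distribution when one is available.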
4
votes
3 answers

What is a sufficient statistic?

I am trying to understand the definition of a sufficient statistic and trying to make conceptual sense of it. Wikipedia says $$Pr(X=x|T(X)=t,\theta) = Pr(X=x|T(X)=t)$$ Exactly how am I supposed to make sense of probability with $\theta$? Probability…
user782220
  • 3,195
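One way to read the displayed condition is through a small worked case. As an illustrative assumption (not taken from the question), let $X_1,\dots,X_n$ be iid Bernoulli($\theta$) and $T(X)=\sum_i X_i$. Here $\theta$ is a fixed parameter indexing the model rather than something being conditioned on. For any $x\in\{0,1\}^n$ with $\sum_i x_i = t$:

$$\Pr(X=x \mid T(X)=t;\theta)=\frac{\theta^{t}(1-\theta)^{n-t}}{\binom{n}{t}\theta^{t}(1-\theta)^{n-t}}=\frac{1}{\binom{n}{t}}$$

The right-hand side does not involve $\theta$, which is exactly what the definition asks for in this example.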
4
votes
0 answers

How big should the sample size be to disprove this article?

There is a new poker computer that is claimed to be unbeatable. http://www.theguardian.com/science/2015/jan/08/poker-program-cepheus-unbeatable I beat this computer on my first try today but my friend says that my 100 rounds against the computer was…
John
  • 41
3
votes
3 answers

Paternity probability calculator based on blood group and eye color

I am currently writing a paternity probability calculator. I am struggling with finding the correct statistical approach to determining probability based on blood type and on eye colour. For example, assume the following family: …
Anon21
  • 2,581
3
votes
1 answer

Estimation of $\mathbb{P}(X > t)$

Let $X_1, \ldots, X_n \sim X$ be iid random variables. The goal is to find an estimator of $\theta = \mathbb{P}(X > t)$ for a given $t > 0$. 1) Show that $\hat{\theta}_1 = 1 - F_n(t)$ is an unbiased estimator of $\theta$ and find its MSE ($F_n$ is…
user14559
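A sketch of part 1), assuming $F_n$ denotes the empirical distribution function (the excerpt is cut off before it says so). Then $\hat{\theta}_1$ is an average of iid Bernoulli($\theta$) indicators:

$$\hat{\theta}_1 = 1 - F_n(t) = \frac{1}{n}\sum_{i=1}^n \mathbf{1}\{X_i > t\}, \qquad \mathbb{E}[\hat{\theta}_1] = \theta, \qquad \operatorname{MSE}(\hat{\theta}_1) = \operatorname{Var}(\hat{\theta}_1) = \frac{\theta(1-\theta)}{n}$$

The MSE equals the variance here because the bias is zero.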
3
votes
1 answer

Pooled sample variance, how to prove

I did read the related question, and if it did contain the answer to my question, it must have been above my level. This is my very first post, so I'll stick to letters for now. The pooled sample variance for two stochastic variables with the same…
Magnus
  • 703
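For reference, a sketch under the usual assumptions, since the excerpt is truncated: two independent samples of sizes $n_1$ and $n_2$ with a common variance $\sigma^2$ and sample variances $S_1^2$, $S_2^2$. These symbols are assumptions introduced here. The standard pooled estimator and its expectation are:

$$S_p^2 = \frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1+n_2-2}, \qquad \mathbb{E}[S_p^2] = \frac{(n_1-1)\sigma^2 + (n_2-1)\sigma^2}{n_1+n_2-2} = \sigma^2$$

The second equality uses only $\mathbb{E}[S_j^2]=\sigma^2$ for each sample.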
3
votes
2 answers

Properties of sigmoid functions

I'm considering a parametrized sigmoid function such as the following logistic: $$f(x)=\frac{e^{a+bx}}{1+e^{a+bx}}$$ And I'm interested only in the interval $0 \le x < x_{max}$ (with a given $x_{max}$). Two tiny…
Mino
  • 155
3
votes
1 answer

Injective functions and sufficient statistics

I'm trying to prove that for a random sample $X_1,\ldots,X_n$ that depends on $\theta$, if $T$ is a sufficient statistic for $\theta$, then so is $T'=f\circ T$, for any injective function $f$. My attempt: Since $f$ is injective, if we restrict the…
user153582
  • 2,723
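One possible route, sketched under the assumption that the Fisher-Neyman factorization theorem applies to the model (this may differ from the approach in the truncated attempt). Writing the joint density as $p(x;\theta)=g(T(x);\theta)\,h(x)$ and letting $f^{-1}$ denote the inverse of $f$ on its range, which exists because $f$ is injective:

$$p(x;\theta) = g\big(T(x);\theta\big)\,h(x) = g\big(f^{-1}(T'(x));\theta\big)\,h(x) = \tilde g\big(T'(x);\theta\big)\,h(x), \qquad \tilde g(s;\theta) := g\big(f^{-1}(s);\theta\big)$$

The last expression is a factorization through $T'$, so $T'$ would be sufficient as well.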
3
votes
1 answer

How to count importance of bought rate?

We are presenting lists of products to our users. Users can buy products. We have about 100 000 products. Users view only two or three pages of products. It's important to show the best products on the first two pages. We are saving to database…
4n4831
  • 33
3
votes
0 answers

How can I compute this? (method)

$$\mu = \sum\limits_{k=0}^{\infty}\dfrac{2^{k}}{\displaystyle{2k+1 \choose k}}$$ The answer is $\dfrac{\pi}{2}$. But I don't know how to do it. Please show me the way! Thank you.
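A quick numerical sanity check of the series as written, offered as a sketch rather than a derivation. The helper `partial_sum` and the choice of 80 terms are assumptions made for illustration. The partial sums starting at $k=0$ appear to approach $1+\frac{\pi}{2}$, while starting at $k=1$ they appear to approach $\frac{\pi}{2}$, so the stated value may correspond to a different starting index.

```python
import math

def partial_sum(n_terms, start=0):
    """Sum 2^k / C(2k+1, k) for k = start, ..., start + n_terms - 1."""
    return sum(2**k / math.comb(2 * k + 1, k) for k in range(start, start + n_terms))

from_zero = partial_sum(80, start=0)   # series exactly as written, from k = 0
from_one = partial_sum(80, start=1)    # same series with the k = 0 term dropped

print(from_zero, 1 + math.pi / 2)
print(from_one, math.pi / 2)
```

Successive terms shrink by a factor close to $\frac{1}{2}$, so 80 terms are far more than floating-point precision needs.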
3
votes
2 answers

MLE for independent Poisson distributions with different mean variable

Assume that $X_i \sim \text{Poisson}(\lambda^i)$, then we want to find the maximum likelihood estimate (MLE) of $\lambda$ and its asymptotics. I proceeded as follows, but got stuck. Since $\mathbb{P}(X_i=x_i)=e^{-\lambda^i}\frac{\lambda^{i…
Julie
  • 1,107
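A sketch of the likelihood equation, assuming the $X_i$ are independent and indexed by $i=1,\dots,n$ (the index range is an assumption, since the excerpt is truncated):

$$\ell(\lambda) = \sum_{i=1}^n \Big(-\lambda^i + i\,x_i\log\lambda - \log(x_i!)\Big), \qquad \ell'(\lambda) = \sum_{i=1}^n \Big(-i\lambda^{i-1} + \frac{i\,x_i}{\lambda}\Big) = 0 \iff \sum_{i=1}^n i\,\lambda^i = \sum_{i=1}^n i\,x_i$$

The left-hand side of the last equation is strictly increasing in $\lambda>0$, so when $\sum_i i\,x_i>0$ there is a unique root, which in general has to be found numerically.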
3
votes
3 answers

Why is the Kendall tau distance a metric?

So I am trying to see how the Kendall $\tau$ distance is considered a metric; i.e. that it satisfies the triangle inequality. The Kendall $\tau$ distance is defined as follows: $$K(\tau_1,\tau_2) = |(i,j): i < j, ( \tau_1(i) < \tau_1(j) \land…
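A small brute-force illustration, assuming the standard definition of the Kendall $\tau$ distance as the number of pairs $i<j$ that the two rankings order differently (the displayed formula is truncated, so this is an assumption). The function name and the choice of 4 items are illustrative. The check is not a proof, but it mirrors the usual argument: a pair ordered differently by $\tau_1$ and $\tau_3$ must be ordered differently by $\tau_1,\tau_2$ or by $\tau_2,\tau_3$.

```python
from itertools import combinations, permutations

def kendall_tau_distance(t1, t2):
    """Count pairs (i, j), i < j, that t1 and t2 order differently."""
    return sum(
        1
        for i, j in combinations(range(len(t1)), 2)
        if (t1[i] < t1[j]) != (t2[i] < t2[j])
    )

# Brute-force check of the triangle inequality over all rankings of 4 items.
rankings = list(permutations(range(4)))
triangle_ok = all(
    kendall_tau_distance(a, c)
    <= kendall_tau_distance(a, b) + kendall_tau_distance(b, c)
    for a in rankings
    for b in rankings
    for c in rankings
)
print(triangle_ok)
```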
3
votes
3 answers

How to do hypothesis testing on Gaussian mixture model?

I am a CS major, please be patient if my question is not well-stated. The dataset is quantitative mass spectrometry (MS) data. By labeling proteins of two different samples A and B, we get the relative abundance of 100 to thousands of proteins in…
3
votes
1 answer

Can regression to the mean explain the unexpected underperformance (per player-rating) of girls against boys in chess?

I'm somewhat mathematically illiterate, so I'd appreciate some help to satisfy a curiosity. Concise summary attempt: chess ratings are derived from past performance and therefore thought to be good predictors of future performance, odds of the…
patzer
  • 39
3
votes
2 answers

Second moment of chi squared distribution

I'm having difficulty computing the second moment of the chi-squared distribution. A chi-squared random variable with $n$ degrees of freedom is the sum of $n$ independent random variables $X_i^2$, where $X_i \sim N(0,1)$. We know that the fourth moment of each $X_i$ is…
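A sketch of the computation, using the standard facts $\mathbb{E}[X_i^2]=1$ and $\mathbb{E}[X_i^4]=3$ for a standard normal variable and independence of the $X_i$. The symbol $Y$ is introduced here for the chi-squared variable.

$$Y=\sum_{i=1}^n X_i^2, \qquad \mathbb{E}[Y^2] = \sum_{i=1}^n \mathbb{E}[X_i^4] + \sum_{i\neq j}\mathbb{E}[X_i^2]\,\mathbb{E}[X_j^2] = 3n + n(n-1) = n^2+2n$$

As a consistency check, $\operatorname{Var}(Y)=\mathbb{E}[Y^2]-(\mathbb{E}[Y])^2 = n^2+2n-n^2 = 2n$, the familiar variance of a chi-squared distribution with $n$ degrees of freedom.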