Maximum (random variable)
Latest revision as of 18:57, 26 November 2017
==Definition notes==
Let <math>X_1,\ldots,X_n</math> be i.i.d. random variables sampled from a distribution <math>X</math>, and additionally let <math>M:=\text{Max}(X_1,\ldots,X_n)</math> for short.
* <math>P[\text{Max}(X_1,\ldots,X_n)\le x]=P[X_1\le x]\,P[X_2\le x\ \vert\ X_1\le x]\cdots P[X_n\le x\ \vert\ X_1\le x\cap\cdots\cap X_{n-1}\le x]</math>
** <math>=\prod_{i=1}^n P[X_i\le x]</math> - provided the <math>X_i</math> are independent random variables
** <math>=\prod_{i=1}^n P[X\le x]</math> - as the <math>X_i</math> are identically distributed copies of <math>X</math>
** <math>=\left(P[X\le x]\right)^n</math>
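As a quick sanity check of the derivation above, a Monte Carlo sketch (illustrative, not from the article; it assumes <math>X\sim\text{Uniform}(0,1)</math>, for which <math>P[X\le x]=x</math>, and the values of <code>n</code> and <code>x</code> are arbitrary):

```python
import random

# Check that the CDF of the maximum of n i.i.d. samples is (P[X <= x])^n.
# Assumed setup: X ~ Uniform(0, 1), so P[X <= x] = x; n = 5 and x = 0.8 are arbitrary.
random.seed(0)
n, trials, x = 5, 100_000, 0.8

hits = sum(
    max(random.random() for _ in range(n)) <= x
    for _ in range(trials)
)
empirical = hits / trials
theoretical = x ** n  # (P[X <= x])^n

print(f"P[M <= {x}] ~ {empirical:.4f} vs x^n = {theoretical:.4f}")
```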
We shall call this <math>F'(x):=\left(P[X\le x]\right)^n</math> (and use <math>F(x):=P[X\le x]</math>, as is usual for cumulative distribution functions). '''Caveat:''' do not confuse the <math>'</math>s for derivatives. Then:
* the probability density function, <math>f'(x):=\frac{\mathrm{d}}{\mathrm{d}x}\left[F'(x)\right]\Big\vert_x</math>[Note 1], is:
** <math>f'(x)=\frac{\mathrm{d}}{\mathrm{d}x}\left[\left(P[X\le x]\right)^n\right]\Big\vert_x</math>
** <math>=\frac{\mathrm{d}}{\mathrm{d}x}\left[\left(F(x)\right)^n\right]\Big\vert_x</math>
** <math>=n\left(F(x)\right)^{n-1}f(x)</math> - by the chain rule, herein written <math>nF(x)^{n-1}f(x)</math> for simplicity
* so <math>f'(x)=nF(x)^{n-1}f(x)</math>
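The density formula can likewise be sketched numerically (illustrative, not from the article; again assuming <math>X\sim\text{Uniform}(0,1)</math>, where <math>f(x)=1</math> and <math>F(x)=x</math>, so <math>f'(x)=nx^{n-1}</math>; the bin location and width are arbitrary):

```python
import random

# Compare the empirical probability that the maximum lands in a small bin with
# the midpoint approximation f'(midpoint) * width, where f'(x) = n * x^(n-1)
# for X ~ Uniform(0, 1). n = 3, x = 0.5, width = 0.1 are arbitrary choices.
random.seed(0)
n, trials = 3, 200_000
x, width = 0.5, 0.1

hits = sum(
    x <= max(random.random() for _ in range(n)) <= x + width
    for _ in range(trials)
)
empirical = hits / trials
approx = n * (x + width / 2) ** (n - 1) * width  # f'(midpoint) * width

print(f"empirical bin mass ~ {empirical:.4f}, n*x^(n-1)*width ~ {approx:.4f}")
```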
==Expectation of the maximum==
* <math>E[M]:=\int_{-\infty}^\infty x\,f'(x)\,\mathrm{d}x</math>
** <math>=n\int_{-\infty}^\infty x\,f(x)\,F(x)^{n-1}\,\mathrm{d}x</math> - I wonder if we could use integration by parts or a good integration by substitution to clear this up
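A numeric sketch of this integral (illustrative, not from the article; assuming <math>X\sim\text{Uniform}(0,1)</math>, where <math>f(x)=1</math> and <math>F(x)=x</math> on <math>[0,1]</math>, so the true value is <math>\frac{n}{n+1}</math>; <code>n = 4</code> is arbitrary):

```python
import random

# Midpoint-rule quadrature of n * integral of x * f(x) * F(x)^(n-1) dx over [0, 1],
# compared against a direct Monte Carlo estimate of E[M].
random.seed(0)
n, steps = 4, 10_000

quad = 0.0
for k in range(steps):
    xm = (k + 0.5) / steps
    quad += n * xm * 1.0 * xm ** (n - 1) * (1 / steps)  # n * x * f(x) * F(x)^(n-1) * dx

trials = 100_000
mc = sum(max(random.random() for _ in range(n)) for _ in range(trials)) / trials

print(f"quadrature ~ {quad:.4f}, simulated E[M] ~ {mc:.4f}, n/(n+1) = {n/(n+1):.4f}")
```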
==Special cases==
* For <math>X\sim\text{Rect}([a,b])</math>:
** <math>E[\text{Max}(X_1,\ldots,X_n)]=\frac{nb+a}{n+1}</math>
** This is actually simplified from the perhaps more useful <math>a+\frac{n}{n+1}(b-a)</math>; recognising <math>(b-a)</math> as the width of the uniform distribution, we see that it slightly "under-estimates" <math>a+(b-a)=b</math>. From this we can actually obtain a very useful unbiased estimator.
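A simulation sketch of the closed form (illustrative, not from the article; the values <code>a = 2</code>, <code>b = 5</code>, <code>n = 3</code> are arbitrary choices):

```python
import random

# Check E[Max(X_1,...,X_n)] = (n*b + a)/(n + 1) for X ~ Rect([a, b]) by simulation.
random.seed(0)
a, b, n, trials = 2.0, 5.0, 3, 200_000

mean_max = sum(
    max(random.uniform(a, b) for _ in range(n)) for _ in range(trials)
) / trials
closed_form = (n * b + a) / (n + 1)  # equivalently a + (n/(n+1)) * (b - a)

print(f"simulated ~ {mean_max:.3f}, formula = {closed_form:.3f}")
```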
==<math>a=0</math> case==
Suppose that <math>a=0</math>; then to find <math>b</math> we could observe that <math>E[X]=\frac{b}{2}</math>, so twice the sample average is an unbiased estimator of <math>b</math>.

However, note that in this case the maximum has expectation <math>E[M]=\frac{n}{n+1}b</math>

* Thus: <math>\frac{n+1}{n}E[M]=b</math> and so <math>E\left[\frac{n+1}{n}M\right]=b</math>

So, defining <math>M':=\frac{n+1}{n}M</math>, we do obtain an unbiased estimator for <math>b</math> from our biased one.
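The bias correction can be sketched in simulation (illustrative, not from the article; <code>b = 10</code> and <code>n = 5</code> are arbitrary):

```python
import random

# With a = 0: M under-estimates b, while M' = ((n+1)/n) * M is unbiased for b.
random.seed(0)
b, n, trials = 10.0, 5, 100_000

total_m = sum(
    max(random.uniform(0, b) for _ in range(n)) for _ in range(trials)
)
mean_m = total_m / trials            # ~ (n/(n+1)) * b, i.e. biased low
mean_m_prime = (n + 1) / n * mean_m  # ~ b, the bias-corrected estimate

print(f"E[M] ~ {mean_m:.2f}, E[M'] ~ {mean_m_prime:.2f}, b = {b}")
```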
It can be shown that for <math>n\ge 8</math>, <math>M'</math> has lower variance (and is thus better) than the "2× the average" estimator; they agree for <math>n=1</math>. For <math>2\le n\le 7</math>, twice the average has the lower variance and is thus objectively better (or the same, from the <math>n=1</math> case) as an estimator for <math>b</math>. '''Warning:''' only the following is known - Alec (talk) 18:56, 26 November 2017 (UTC)
* This is only true when comparing the variance of <math>M</math> (not <math>M'</math>) to that of <math>\frac{1}{n}\sum_{i=1}^n X_i</math>; as we double the average, its variance goes up by a factor of 4, making the difference even worse.
* To obtain <math>M'</math> we multiply <math>M</math> by a constant, <math>\frac{n+1}{n}</math>, which is only slightly bigger than 1 once <math>n</math> stops being tiny. This increases the variance by a factor of this value squared, which is still "slightly bigger than 1", so <math>M'</math> is a better estimator; it may, however, move the specific bound of "better for <math>n\ge 8</math>" further down - this needs to be calculated.
This is independent of <math>b</math>. Specifically:
* <math>\operatorname{Var}(M)=\left(\frac{n}{n+2}-\frac{n^2}{(n+1)^2}\right)b^2</math> - note this is for <math>M</math>, not <math>M'</math>, and
* <math>\operatorname{Var}\left(\frac{1}{n}\sum_{i=1}^n X_i\right)=\frac{b^2}{12n}</math>
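The comparison flagged above as still needing calculation can be sketched directly from these two formulas, together with the fact that scaling an estimator by a constant <math>c</math> scales its variance by <math>c^2</math>. This is a sketch of that calculation, not the article's result (<math>b</math> factors out of both variances, so <math>b=1</math> is assumed):

```python
# Compare Var(M') against Var(2 * sample mean) using only the formulas above.
# Var(cY) = c^2 * Var(Y), so Var(M') = ((n+1)/n)^2 * Var(M) and
# Var(2 * mean) = 4 * b^2/(12n) = b^2/(3n). Take b = 1 throughout.
def var_M(n: int) -> float:
    return n / (n + 2) - n ** 2 / (n + 1) ** 2

def var_M_prime(n: int) -> float:
    return ((n + 1) / n) ** 2 * var_M(n)

def var_twice_mean(n: int) -> float:
    return 4 * (1 / (12 * n))

for n in range(1, 10):
    print(n, round(var_M_prime(n), 5), round(var_twice_mean(n), 5))
```

On this sketch <math>\operatorname{Var}(M')</math> simplifies to <math>\frac{b^2}{n(n+2)}</math> versus <math>\frac{b^2}{3n}</math> for twice the average, which would make <math>M'</math> at least as good for every <math>n\ge 1</math> (strictly better for <math>n\ge 2</math>) - but this should be checked against the <math>n\ge 8</math> claim above.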