Notes:Distribution of the sample median

From Maths
Revision as of 06:10, 12 December 2017 by Alec (Talk | contribs) (Problem statement)

Problem overview

Let $X_1,\ldots,X_{2m+1}$ be a sample from a population $X$, meaning that the $X_i$ are i.i.d. random variables, for some $m\in\mathbb{N}_0$. We wish to find:

  • $P[\text{Median}(X_1,\ldots,X_{2m+1})\le r]$
    - the cdf of the median.

Initial work

Since the variables are i.i.d. (and assuming ties occur with probability zero, as for a continuous population), any ordering is as likely as any other, each having probability $\frac{1}{(2m+1)!}$ (which I proved the long way, rather than just jumping to it - silly me). However, the result found in "Probability of i.i.d random variables being in an order and not greater than something" will be useful.


I believe that $P[\text{Median}(X_1,\ldots,X_{2m+1})\le r]=P[X_1\le\cdots\le X_{m+1}\le r\ |\ X_1\le\cdots\le X_{2m+1}]$. Let us make some definitions to make this shorter.

  • $O:=X_1\le\cdots\le X_{2m+1}$ - representing the order part
  • $M:=X_1\le\cdots\le X_{m+1}\le r$ - representing the median part
  • $Q:=P[\text{Median}(X_1,\ldots,X_{2m+1})\le r]=P[M\ |\ O]$ - representing the question
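The belief above can be sanity-checked numerically. The following is a minimal Monte Carlo sketch, not part of the original notes: it assumes a Uniform(0,1) population (so ties have probability zero), and the function names `median_cdf_exact` and `simulate` are hypothetical. It compares the empirical value of $P[\text{Median}\le r]$, the empirical conditional probability $P[M\ |\ O]$, and the standard binomial expression for the cdf of the median (the median is at most $r$ exactly when at least $m+1$ of the samples are at most $r$).

```python
import random
from math import comb


def median_cdf_exact(m, r):
    """P[Median <= r] for 2m+1 i.i.d. Uniform(0,1) samples (assumed population).

    The median is <= r exactly when at least m+1 of the samples are <= r,
    and each sample is <= r with probability r.
    """
    n = 2 * m + 1
    return sum(comb(n, k) * r**k * (1 - r) ** (n - k) for k in range(m + 1, n + 1))


def simulate(m, r, trials=200_000, seed=0):
    """Return (estimate of P[Median <= r], estimate of P[M | O], estimate of P[O])."""
    rng = random.Random(seed)
    n = 2 * m + 1
    median_hits = 0  # trials with Median(X_1..X_n) <= r
    order_hits = 0   # trials where O holds: X_1 <= ... <= X_n
    both_hits = 0    # trials where both M and O hold
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        if sorted(xs)[m] <= r:
            median_hits += 1
        if all(xs[i] <= xs[i + 1] for i in range(n - 1)):
            order_hits += 1
            # Given O, M reduces to X_{m+1} <= r (index m, 0-based).
            if xs[m] <= r:
                both_hits += 1
    return median_hits / trials, both_hits / order_hits, order_hits / trials


if __name__ == "__main__":
    m, r = 1, 0.3
    uncond, cond, order_freq = simulate(m, r)
    print("exact cdf      :", median_cdf_exact(m, r))
    print("P[Median <= r] :", uncond)
    print("P[M | O]       :", cond)
    print("P[O]           :", order_freq, "vs 1/(2m+1)! =", 1 / 6)
```

Conditioning on $O$ discards all but roughly one trial in $(2m+1)!$, so the conditional estimate is only usable for small $m$.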


We should also have some sort of converse, related to $r\le X_{m+2}\le\cdots\le X_{2m+1}$ or something.


We also have:

Analysis

Let us look at $X\le r$ and $X\le Y$ to see what we can say if both are true (the "and").

  • Claim: $(X\le r\wedge X\le Y)\iff(X\le\text{Min}(r,Y))$
  • Proof:
      • $\Rightarrow$: suppose $X\le r$ and $X\le Y$.
      1. Suppose $r\le Y$, so $\text{Min}(r,Y)=r$; obviously $X\le r\implies X\le r=\text{Min}(r,Y)$, so the implication holds in this case.
      2. Suppose $Y\le r$, so $\text{Min}(r,Y)=Y$; obviously $X\le Y\implies X\le Y=\text{Min}(r,Y)$, so the implication holds in this case too.
      • $\Leftarrow$: suppose $X\le\text{Min}(r,Y)$. We notice either $\text{Min}(r,Y)=r$ if $r\le Y$, or $\text{Min}(r,Y)=Y$ if $Y\le r$ (the two cases overlap when $r=Y$, which doesn't matter)
        • Thus if $r\le Y$ then $X\le r$ and, as $r\le Y$ by assumption, we use the transitivity of $\le$ to see $X\le r\le Y$ and thus $X\le Y$ too - as required
        • Thus if $Y\le r$ then $X\le Y$ and, as $Y\le r$ by assumption, we use the transitivity of $\le$ to see $X\le Y\le r$ and thus $X\le r$ too - as required.
      • So in either case, we have $X\le Y$ and $X\le r$ - as required
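As a quick cross-check of the claim (not a replacement for the proof), the equivalence can be brute-forced over a small grid of values. This is only a sketch; the helper names `claim_holds` and `check_claim` are hypothetical, and the grid is an arbitrary choice that includes ties.

```python
from itertools import product


def claim_holds(x, r, y):
    """True when (x <= r and x <= y) and (x <= min(r, y)) have the same truth value."""
    return (x <= r and x <= y) == (x <= min(r, y))


def check_claim(grid):
    """Check the equivalence for every triple (x, r, y) drawn from grid."""
    return all(claim_holds(x, r, y) for x, r, y in product(grid, repeat=3))


if __name__ == "__main__":
    # Half-integer grid from -3 to 3, so equal values (ties) are covered too.
    grid = [i / 2 for i in range(-6, 7)]
    print(check_claim(grid))
```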

Problem statement

Thus we really want to find:

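  • $P[\text{Median}(X_1,\ldots,X_{2m+1})\le r]=P[X_1\le\cdots\le X_{m+1}\le r\ |\ X_1\le\cdots\le X_{2m+1}]$
    $=\frac{P[M\text{ and }O]}{P[O]}$
    $=((2m+1)!)\,P[X_1\le\cdots\le X_{m+1}\le\text{Min}(r,X_{m+2})\le X_{m+2}\le X_{m+3}\le\cdots\le X_{2m+1}]$ (using $P[O]=\frac{1}{(2m+1)!}$)
    • Caveat: We now need: $(X\le r\wedge X\le Y\le Z)\iff(X\le\text{Min}(r,Y)\le Y\le Z)$
      to justify this format. Although that's arguably not that helpful for the integral.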