Neyman–Pearson lemma

Statistics Dictionary: Neyman–Pearson lemma

A lemma, introduced in 1933 by Neyman and Egon Pearson, that gives a sufficient condition, in a hypothesis test with null hypothesis θ=θ0 and alternative hypothesis θ=θ1, for choosing a critical region, with given significance level, that maximizes the power of the test.



Wikipedia: Neyman–Pearson lemma

In statistics, the Neyman–Pearson lemma states that, when performing a hypothesis test between two point hypotheses H0: θ = θ0 and H1: θ = θ1, the likelihood-ratio test which rejects H0 in favour of H1 when

\Lambda(x)=\frac{ L( \theta _{0} \mid x)}{ L (\theta _{1} \mid x)} \leq \eta \text{ where } P(\Lambda(X)\leq \eta|H_0)=\alpha

is the most powerful test of size α for a threshold η. If the test is most powerful for all \theta_1 \in \Theta_1, it is said to be uniformly most powerful (UMP) for alternatives in the set \Theta_1 \, .
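As an illustration of the statement above, here is a minimal Python sketch of the test for one concrete case. It assumes i.i.d. normal data with known standard deviation and θ1 > θ0; that model, and the function names `log_likelihood_ratio`, `log_eta` and `reject_h0`, are choices made for this example only and are not part of the lemma. Under these assumptions Λ(x) ≤ η is equivalent to the sample mean exceeding a cut-off, so η can be found exactly from the normal quantile function.

```python
import math
from statistics import NormalDist


def log_likelihood_ratio(xs, theta0, theta1, sigma=1.0):
    """log Lambda(x) = log L(theta0 | x) - log L(theta1 | x) for i.i.d. N(theta, sigma^2) data."""
    return sum(((x - theta1) ** 2 - (x - theta0) ** 2) / (2.0 * sigma ** 2) for x in xs)


def log_eta(n, theta0, theta1, alpha, sigma=1.0):
    """log of the threshold eta with P(Lambda(X) <= eta | H0) = alpha.

    Assumes theta1 > theta0, so that Lambda is decreasing in the sample mean
    and Lambda <= eta is the same event as (sample mean >= c).
    """
    if theta1 <= theta0:
        raise ValueError("this sketch assumes theta1 > theta0")
    # Under H0 the sample mean is N(theta0, sigma^2 / n).
    c = NormalDist(theta0, sigma / math.sqrt(n)).inv_cdf(1.0 - alpha)
    # log Lambda written as a function of the sample mean, evaluated at c.
    return n * ((theta0 - theta1) * c + (theta1 ** 2 - theta0 ** 2) / 2.0) / sigma ** 2


def reject_h0(xs, theta0, theta1, alpha, sigma=1.0):
    """Reject H0 when Lambda(x) <= eta (compared on the log scale)."""
    return log_likelihood_ratio(xs, theta0, theta1, sigma) <= log_eta(
        len(xs), theta0, theta1, alpha, sigma
    )
```

For example, `reject_h0([1.0, 1.0, 1.0, 1.0], 0.0, 1.0, 0.05)` compares log Λ = −2 with log η ≈ −1.29 and so rejects H0, while a sample of four zeros does not.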

It is named for Jerzy Neyman and Egon Pearson.

In practice, the likelihood ratio is often used directly to construct tests (see Likelihood-ratio test). However, it can also be used to suggest particular test statistics that might be of interest, or to suggest simplified tests. For this, one considers algebraic manipulation of the ratio to see whether it contains key statistics that are related to the size of the ratio (i.e. whether a large statistic corresponds to a small ratio or to a large one).

Proof

Define the rejection region of the null hypothesis for the NP test as

R_{NP}=\left\{ X: \frac{L(\theta_{0},X)}{L(\theta_{1},X)} \leq \eta\right\} .

Any other test will have a different rejection region, which we denote by R_A. Furthermore, define the following function of a region R and a parameter θ:

P(R,\theta)=\int_R L(\theta|x)\, dx,

where this is the probability of the data falling in region R, given parameter θ.

For both tests to have significance level α, it must be true that

\alpha= P(R_{NP}, \theta_0)=P(R_A, \theta_0) \,.

However, it is useful to break these down into integrals over disjoint regions, given by

P(R_{NP} \cap R_A, \theta) + P(R_{NP} \cap R_A^c, \theta) = 
P(R_{NP},\theta) ,

and

 P(R_{NP} \cap R_A, \theta) + P(R_{NP}^c \cap R_A, \theta) =  P(R_A,\theta).

Setting θ = θ0 and equating the above two expressions yields

P(R_{NP} \cap R_A^c, \theta_0) =  P(R_{NP}^c \cap R_A, \theta_0).

Comparing the powers of the two tests, which are P(R_{NP},\theta_1) and P(R_A,\theta_1), one can see that

P(R_{NP},\theta_1) \geq P(R_A,\theta_1) \text{ if, and only if, }
P(R_{NP} \cap R_A^c, \theta_1) \geq P(R_{NP}^c \cap R_A, \theta_1).

Now, by the definition of R_{NP},

 P(R_{NP} \cap R_A^c, \theta_1)= \int_{R_{NP}\cap R_A^c} L(\theta_{1}|x)\,dx \geq \frac{1}{\eta} \int_{R_{NP}\cap R_A^c} L(\theta_0|x)\,dx = \frac{1}{\eta}P(R_{NP} \cap R_A^c, \theta_0)
 = \frac{1}{\eta}P(R_{NP}^c \cap R_A, \theta_0) = \frac{1}{\eta}\int_{R_{NP}^c \cap R_A} L(\theta_{0}|x)\,dx \geq \int_{R_{NP}^c\cap R_A} L(\theta_{1}|x)dx  = P(R_{NP}^c \cap R_A, \theta_1).

Hence the inequality holds.
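The argument can be checked by brute force on a small discrete example, with sums in place of integrals. The sketch below is illustrative only: the six-point sample space and the probability tables `p0` and `p1` are invented for this purpose, and the helper names are hypothetical. It builds R_{NP} for η = 1/2, enumerates every other region R_A of the same size under θ0, and verifies both the equality of the two non-overlapping pieces under θ0 and the final power inequality.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical likelihoods of six outcomes under theta0 and theta1 (each sums to 1).
p0 = [F(1, 10), F(1, 10), F(2, 10), F(2, 10), F(2, 10), F(2, 10)]
p1 = [F(4, 10), F(2, 10), F(2, 10), F(1, 10), F(1, 20), F(1, 20)]


def prob(region, p):
    """P(R, theta): probability of the data falling in `region`."""
    return sum((p[i] for i in region), F(0))


def np_region(eta):
    """R_NP = {x : L(theta0 | x) / L(theta1 | x) <= eta}."""
    return frozenset(i for i in range(len(p0)) if p0[i] <= eta * p1[i])


def regions_of_size(alpha):
    """Every region whose probability under theta0 is exactly alpha."""
    n = len(p0)
    for k in range(n + 1):
        for r in combinations(range(n), k):
            if prob(r, p0) == alpha:
                yield frozenset(r)


eta = F(1, 2)
r_np = np_region(eta)
alpha = prob(r_np, p0)      # size of the NP test
np_power = prob(r_np, p1)   # power of the NP test

for r_a in regions_of_size(alpha):
    # Equal size under theta0 forces the two non-overlapping pieces to match.
    assert prob(r_np - r_a, p0) == prob(r_a - r_np, p0)
    # The conclusion of the proof: no competitor has higher power.
    assert np_power >= prob(r_a, p1)

best_power = max(prob(r, p1) for r in regions_of_size(alpha))
```

Exact fractions are used so that the size constraint can be tested with `==` rather than a floating-point tolerance. In a general discrete problem a region of exactly the desired size need not exist; the tables here were chosen so that one does.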

Example

Let X_1,\dots,X_n be a random sample from the \mathcal{N}(\mu,\sigma^2) distribution where the mean μ is known, and suppose that we wish to test for H_0:\sigma^2=\sigma_0^2 against H_1:\sigma^2=\sigma_1^2. The likelihood for this set of normally distributed data is

L\left(\sigma^2;\mathbf{x}\right)\propto \left(\sigma^2\right)^{-n/2} \exp\left\{-\frac{\sum_{i=1}^n \left(x_i-\mu\right)^2}{2\sigma^2}\right\}.

We can compute the likelihood ratio to find the key statistic in this test and its effect on the test's outcome:

\Lambda(\mathbf{x}) = \frac{L\left(\sigma_0^2;\mathbf{x}\right)}{L\left(\sigma_1^2;\mathbf{x}\right)} = 
\left(\frac{\sigma_0^2}{\sigma_1^2}\right)^{-n/2}\exp\left\{-\frac{1}{2}(\sigma_0^{-2}-\sigma_1^{-2})\sum_{i=1}^n \left(x_i-\mu\right)^2\right\}.

This ratio depends on the data only through \sum_{i=1}^n \left(x_i-\mu\right)^2. Therefore, by the Neyman–Pearson lemma, the most powerful test of this type of hypothesis for these data will depend only on \sum_{i=1}^n \left(x_i-\mu\right)^2. Also, by inspection, we can see that if \sigma_1^2>\sigma_0^2, then \Lambda(\mathbf{x}) is a decreasing function of \sum_{i=1}^n \left(x_i-\mu\right)^2. So we should reject H0 if \sum_{i=1}^n \left(x_i-\mu\right)^2 is sufficiently large, since that is when \Lambda(\mathbf{x}) is small. The rejection threshold depends on the size of the test.
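To make the dependence of the threshold on the size concrete, the following Python sketch computes it for the case \sigma_1^2>\sigma_0^2. It relies on a standard fact not stated above: under H0, \sum_{i=1}^n (x_i-\mu)^2/\sigma_0^2 follows a chi-square distribution with n degrees of freedom. To stay within the standard library, the sketch further assumes that n is even, where the chi-square tail probability has a closed form; the function names are hypothetical.

```python
import math


def chi2_sf_even(x, dof):
    """P(chi-square with `dof` degrees of freedom > x), for even dof only."""
    if dof <= 0 or dof % 2:
        raise ValueError("this sketch only handles a positive even dof")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0  # j = 0 term of sum_{j < dof/2} half**j / j!
    for j in range(1, dof // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def chi2_upper_quantile(alpha, dof):
    """c with P(chi-square > c) = alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, dof) > alpha:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even(mid, dof) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def variance_test(xs, mu, sigma0_sq, sigma1_sq, alpha):
    """Most powerful size-alpha test of sigma0_sq against sigma1_sq > sigma0_sq.

    Returns (statistic, threshold, reject, power).
    """
    if sigma1_sq <= sigma0_sq:
        raise ValueError("this sketch assumes sigma1_sq > sigma0_sq")
    n = len(xs)
    stat = sum((x - mu) ** 2 for x in xs)
    threshold = sigma0_sq * chi2_upper_quantile(alpha, n)
    # Under H1 the statistic divided by sigma1_sq is chi-square with n dof.
    power = chi2_sf_even(threshold / sigma1_sq, n)
    return stat, threshold, stat > threshold, power
```

For instance, `variance_test([3.0, -3.0, 3.0, -3.0], 0.0, 1.0, 4.0, 0.05)` gives a statistic of 36 against a threshold of about 9.49, so H0 is rejected. For odd n, or in production use, a library quantile function would be the more natural choice.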

Copyrights:

Statistics Dictionary. A Dictionary of Statistics. Second edition revised. Copyright © Oxford University Press, 2008. All rights reserved.
Wikipedia. This article is licensed under the Creative Commons Attribution/Share-Alike License. It uses material from the Wikipedia article "Neyman–Pearson lemma".