Saturday, June 17, 2017

Generalized Log-Likelihood Test for Poisson Distribution

In another blog, I described how a generalized log-likelihood ratio can be used to find interesting differences in counts. In the simplest cases, we have counts for some kind of observation (A and not A) under two conditions (B and not B). The question of interest is whether the rate of A varies with whether or not condition B applies. In classical statistics, we would talk about testing against a null hypothesis, but in practice we want to look at lots of A's under lots of different conditions B. Testing so many situations makes it very hard to use the classical machinery well, so we have to look at the problem a bit differently.

As I mentioned in other blogs, we can still use a classically derived test known as the generalized log-likelihood ratio as a way of simply ranking different A-B combinations against each other according to how interesting they are. Even though we can't interpret the score as a classical statistical test, we get useful results in practice.

The generalized log-likelihood ratio most commonly used in these situations is derived assuming we have two binomial observations. This test can be extended to compare two multinomial conditions for independence, but this is rarely done, if only because comparing two binomials is so darned useful.

With the binomial test, we look at the number of positive observations out of some total number of observations for each condition. In some situations, it is much more natural to talk about the number of positive observations not as a fraction of all observations, but as a fraction of the duration of the condition. For instance, we might talk about the number of times we noticed a particular kind of network error under different conditions. In such a case, we probably can say how long we looked for the errors under each condition, but it can be very hard to say how many observations there were without an error.

For a Poisson distribution under two conditions $A$ and $\neg A$, we can observe a count for each of the conditions as well as the total time over which the count is taken. We can arrange our results this way:

\[
\begin{array}{c|cc}
 & \text{Count} & \Delta t \\
\hline
A & k_1 & t_1 \\
\neg A & k_2 & t_2
\end{array}
\]


This is suggestive of the way that counts from two binomial observations can be arranged to look for a difference under different conditions \cite{dunning93}.

We can investigate whether the Poisson distribution is the same under both conditions using the generalized log-likelihood ratio test. Such a test uses $\lambda$, the generalized log-likelihood ratio,
\[
\lambda = \frac{
\max_{\theta_1 = \theta_2 = \theta_0} p(k_1 | \theta_0, t_1) \, p(k_2 | \theta_0, t_2)
}{
\max_{\theta_1, \theta_2} p(k_1 | \theta_1, t_1) \, p(k_2 | \theta_2, t_2)
}
\]
According to Wilks\cite{wilks1938} and also later Chernoff\cite{Chernoff1954}, the quantity $-2 \log \lambda$ is asymptotically $\chi^2$ distributed with one degree of freedom.
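For a rough sense of scale, here is a quick sketch (using scipy, my own choice of tool, not something from this post) of the classical 5% critical value for $\chi^2$ with one degree of freedom:

```python
# Sketch: the classical 5% critical value for chi^2 with 1 degree of
# freedom. Scores of -2 log lambda above this would count as
# "significant" in a single classical test; here we only rank by score.
from scipy.stats import chi2

print(chi2.ppf(0.95, df=1))  # ~3.841
```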

For the Poisson distribution,
\[
p(k | \theta, t) = \frac{(\theta t)^k e^{-\theta t}}{k!} \\
\log p(k | \theta, t) = k \log(\theta t) - \theta t - \log k!
\]
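As a quick numerical sanity check, this log probability matches scipy's Poisson log-pmf when the mean is taken as $\theta t$. This is a sketch of my own; the helper name `log_p` and the sample values are not from the post:

```python
import math
from scipy.stats import poisson

def log_p(k, theta, t):
    # log p(k | theta, t) = k log(theta t) - theta t - log k!
    return k * math.log(theta * t) - theta * t - math.lgamma(k + 1)

# agrees with scipy's Poisson pmf with mean theta * t
assert abs(log_p(7, 0.5, 10.0) - poisson.logpmf(7, 0.5 * 10.0)) < 1e-9
```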
The maximum likelihood estimator $\hat\theta$ can be computed by maximizing this log probability. Setting the derivative with respect to $\theta$ to zero,
\[
\frac{\partial \log p(k | \theta, t)}{\partial \theta} = \frac{k}{\theta} - t = 0 \\
\hat \theta = \frac{k}{t}
\]
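To confirm that this really is the maximizer, here is a small sketch (the values of $k$ and $t$ are made up for illustration) comparing the log-likelihood at $\hat\theta = k/t$ against nearby rates:

```python
import math

def log_lik(theta, k, t):
    # k log(theta t) - theta t - log k!
    return k * math.log(theta * t) - theta * t - math.lgamma(k + 1)

k, t = 7, 10.0
theta_hat = k / t  # = 0.7
# the log-likelihood at theta_hat beats nearby values on either side
assert log_lik(theta_hat, k, t) > log_lik(0.9 * theta_hat, k, t)
assert log_lik(theta_hat, k, t) > log_lik(1.1 * theta_hat, k, t)
```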
Returning to the log-likelihood ratio test, we substitute the pooled estimate $\hat\theta_0 = \frac{k_1+k_2}{t_1+t_2}$ in the numerator and the separate estimates $\hat\theta_i = \frac{k_i}{t_i}$ in the denominator. After some cancellation we get
\[
-\log \lambda =
k_1 \log k_1 +
k_2 \log k_2
- k_1 \log \left( \frac{k_1+k_2}{t_1+t_2} \, t_1 \right)
- k_2 \log \left( \frac{k_1+k_2}{t_1+t_2} \, t_2 \right)
\]
Some small rearrangement gives the following preferred form, which is very reminiscent of the form most commonly used to compute the log-likelihood ratio test for binomials and multinomials
\[
-2 \log \lambda = 2 \left( k_1 \log \frac{k_1}{t_1} +
k_2 \log \frac{k_2}{t_2} - (k_1+k_2) \log \frac{k_1+k_2}{t_1+t_2}
\right)
\]
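Putting it all together, here is a minimal sketch of the whole test in Python (the function names and example numbers are mine, not from the post). It computes the score both directly from the likelihoods and from the simplified form above, which confirms the cancellation:

```python
import math

def poisson_log_p(k, theta, t):
    # log p(k | theta, t) = k log(theta t) - theta t - log k!
    return k * math.log(theta * t) - theta * t - math.lgamma(k + 1)

def llr_direct(k1, t1, k2, t2):
    # -2 log lambda computed straight from the definition of lambda
    r0 = (k1 + k2) / (t1 + t2)          # pooled rate under the null
    null = poisson_log_p(k1, r0, t1) + poisson_log_p(k2, r0, t2)
    alt = poisson_log_p(k1, k1 / t1, t1) + poisson_log_p(k2, k2 / t2, t2)
    return -2 * (null - alt)

def llr(k1, t1, k2, t2):
    # the simplified form above (k1 and k2 assumed positive so the logs exist)
    r0 = (k1 + k2) / (t1 + t2)
    return 2 * (k1 * math.log(k1 / t1) + k2 * math.log(k2 / t2)
                - (k1 + k2) * math.log(r0))

# Example: 20 errors in 10 hours under condition A vs 5 errors in 10 hours under not-A
print(llr(20, 10.0, 5, 10.0))  # ~9.64
assert abs(llr(20, 10.0, 5, 10.0) - llr_direct(20, 10.0, 5, 10.0)) < 1e-9
```

A score near 9.6 is well above the 3.84 critical value for $\chi^2_1$, so this pair of conditions would rank as quite interesting. Note that the simplified form assumes both counts are positive; when a count is zero, its $k \log \frac{k}{t}$ term should be taken as zero.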