The trick is that the measurements are only available at irregular intervals. If they were sampled regularly, then the standard mixing trick would work:
\[
m_{n+1} = \mu m_n + (1-\mu) x_{n+1}
\]
where $m_n$ is our current estimate of the mean, $x_n$ is the $n$-th sample, and $\mu$ (between 0 and 1) determines how much history to use; values closer to 1 retain more history.
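For reference, the evenly sampled update can be sketched in a few lines of Python. The function name `mix` and the choice to seed the estimate with the first sample are my own assumptions here, not part of the formula itself.

```python
def mix(m, x, mu):
    """One step of the evenly sampled update: m_{n+1} = mu*m_n + (1-mu)*x_{n+1}."""
    return mu * m + (1.0 - mu) * x


samples = [1.0, 2.0, 3.0, 4.0]
mu = 0.5

# Assumption: start the estimate at the first sample.
m = samples[0]
for x in samples[1:]:
    m = mix(m, x, mu)

print(m)  # 3.125
```

Note that nothing in this update knows about time, which is exactly why it breaks down once the samples stop arriving at a steady rate.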
With unequal sample times, things become a bit more complicated. If we get lots of measurements all at once, we want to give them nearly equal weight, but if there has been a long gap, we want to weight the very old samples much less.
In fact, we want to weight old samples according to how old they are with exponentially decreasing weight. If we sample values $\left \lbrace x_1 \ldots x_n \right \rbrace$ at times $t_1 \ldots t_n$ then we want the weighted mean defined by
\[
m_n = {\sum_{i=1}^n x_i e^{-(t_n - t_i)/\alpha} \over \sum_{i=1}^n e^{-(t_n - t_i)/\alpha} }
\]
Here $\alpha$ plays the same role as $\mu$ did before, but on a different scale: it is a time constant measured in the same units as $t$. If the evenly sampled data arrive at time intervals $\Delta t$, then $\mu = e^{-\Delta t / \alpha}$.
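The weighted mean can be evaluated directly from its definition, which is handy as a reference when checking anything faster. This is only a sketch; the name `weighted_mean` is hypothetical, and it assumes the times are sorted so that the last entry is $t_n$. It touches every sample on each call, so it is not something you would want to run on-line.

```python
import math


def weighted_mean(ts, xs, alpha):
    """Direct evaluation of m_n with weights exp(-(t_n - t_i) / alpha).

    Assumes ts is sorted in increasing order, so ts[-1] is t_n.
    """
    t_n = ts[-1]
    weights = [math.exp(-(t_n - t) / alpha) for t in ts]
    return sum(w * x for w, x in zip(weights, xs)) / sum(weights)


# A burst of simultaneous samples gets equal weight.
print(weighted_mean([0.0, 0.0, 0.0], [1.0, 2.0, 3.0], alpha=2.0))  # 2.0

# After a very long gap, the old sample is effectively forgotten.
print(weighted_mean([0.0, 1000.0], [2.0, 4.0], alpha=1.0))  # 4.0
```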
Happily, there is a very simple recurrence relation that allows us to keep only two running sums (plus the time of the most recent sample) while computing $m_1 \ldots m_n$ in an entirely on-line fashion as the $x_i$ values arrive.
To see this, define
\begin{eqnarray*}
\pi_n &=& e^{-(t_{n+1}-t_n)/\alpha} \\
w_{n+1} &=&
\sum_{i=1}^{n+1} e^{-(t_{n+1} - t_i)/\alpha} =
1+e^{-(t_{n+1}-t_n)/\alpha} \sum_{i=1}^{n} e^{-(t_{n} - t_i)/\alpha} \\
& =& 1 + \pi_n w_n\\
s_{n+1} &=&
\sum_{i=1}^{n+1} x_i e^{-(t_{n+1} - t_i)/\alpha} =
x_{n+1}+e^{-(t_{n+1}-t_n)/\alpha} \sum_{i=1}^{n} x_i e^{-(t_{n} - t_i)/\alpha} \\
&=& x_{n+1} + \pi_n s_n
\end{eqnarray*}
Then note that
\[
m_{n+1} = {s_{n+1} \over w_{n+1}}
\]
This leads naturally to a procedure whose state consists of $t, w, s$ (starting from $w = s = 0$), which are updated using each new pair $t_n, x_n$ to produce the current mean $m$ according to
\begin{eqnarray*}
\pi &=& e^{-(t_{n}-t)/\alpha} \\
w &=& 1 + \pi w \\
s &=& x_n + \pi s \\
m &=& {s \over w} \\
t &=& t_{n}
\end{eqnarray*}
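Here is a minimal Python sketch of that procedure. It takes the decay factor to be $e^{-(t_n - t)/\alpha}$, the form that agrees with the weighted-mean definition. The class name `IrregularEwma` is made up for illustration, and two things are my own assumptions: the state starts at $w = s = 0$ with no previous time, and sample times never go backwards.

```python
import math


class IrregularEwma:
    """Exponentially weighted mean for irregularly timed samples."""

    def __init__(self, alpha):
        self.alpha = alpha  # time constant, in the same units as t
        self.t = None       # time of the most recent sample
        self.w = 0.0        # running sum of weights
        self.s = 0.0        # running weighted sum of samples

    def update(self, t_n, x_n):
        """Fold in one sample and return the current mean m = s / w."""
        if self.t is None:
            # No history yet; w and s are zero, so pi has no effect.
            pi = 1.0
        else:
            # Assumes t_n >= self.t, so 0 < pi <= 1.
            pi = math.exp(-(t_n - self.t) / self.alpha)
        self.w = 1.0 + pi * self.w
        self.s = x_n + pi * self.s
        self.t = t_n
        return self.s / self.w


est = IrregularEwma(alpha=5.0)
for t, x in [(0.0, 1.0), (0.1, 3.0), (20.0, 10.0)]:
    print(t, est.update(t, x))
```

Keeping $s$ and $w$ separately, rather than storing $m$ alone, is what makes each update a constant amount of work regardless of how many samples came before.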
Isn't that a kick!
To do this right, however, we need a test. Here are some data vectors computed for $\alpha=5$:
\[
\begin{array}{rrrrrrr}
n & t & x & \pi & w & s & m \\
\hline
1 & 11.35718 & 1.5992071 & 1.0000000 & 1.000000 & 1.5992071 & 1.5992071 \\
2 & 21.54637 & -1.3577032 & 0.1303100 & 1.130310 & -1.1493105 & -1.0168100 \\
3 & 28.91061 & -0.3405638 & 0.2292718 & 1.259148 & -0.6040683 & -0.4797436 \\
4 & 33.03586 & 0.7048632 & 0.4382129 & 1.551775 & 0.4401527 & 0.2836447 \\
5 & 39.57767 & 0.3020558 & 0.2702621 & 1.419386 & 0.4210124 & 0.2966159
\end{array}
\]
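These vectors can be turned into a small self-checking test. The sketch below replays the $t$ and $x$ columns through the update and compares $\pi$, $w$, $s$ and $m$ against the listed values. The names `step` and `check_table` are hypothetical, the tolerance is my own choice to allow for the rounding in the printed digits, and the first row is treated as starting from $w = s = 0$ with $\pi = 1$, which is what the listed values imply.

```python
import math

ALPHA = 5.0

# Columns: t, x, pi, w, s, m (copied from the data vectors above).
ROWS = [
    (11.35718,  1.5992071, 1.0000000, 1.000000,  1.5992071,  1.5992071),
    (21.54637, -1.3577032, 0.1303100, 1.130310, -1.1493105, -1.0168100),
    (28.91061, -0.3405638, 0.2292718, 1.259148, -0.6040683, -0.4797436),
    (33.03586,  0.7048632, 0.4382129, 1.551775,  0.4401527,  0.2836447),
    (39.57767,  0.3020558, 0.2702621, 1.419386,  0.4210124,  0.2966159),
]


def step(state, t_n, x_n, alpha):
    """Apply one update to state (t, w, s); return (new_state, pi, m)."""
    t, w, s = state
    # Assumption: the first sample has no predecessor, so pi is taken as 1.
    pi = 1.0 if t is None else math.exp(-(t_n - t) / alpha)
    w = 1.0 + pi * w
    s = x_n + pi * s
    return (t_n, w, s), pi, s / w


def check_table(rows, alpha, tol=1e-5):
    """Replay t and x, asserting that pi, w, s and m match each row."""
    state = (None, 0.0, 0.0)
    for t_n, x_n, pi_want, w_want, s_want, m_want in rows:
        state, pi, m = step(state, t_n, x_n, alpha)
        _, w, s = state
        for got, want in ((pi, pi_want), (w, w_want), (s, s_want), (m, m_want)):
            assert abs(got - want) < tol, (t_n, got, want)
    return True


print(check_table(ROWS, ALPHA))
```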
