Parameter Estimation in White Gaussian Noise#

Assume that \( k \) samples of the measured signal \( y_i \), taken over a period \( T \), are real with

\[ y_i = s_i(\alpha) + n_i, \quad i = 1, \ldots, k \]

where

  • \( \alpha \) is the parameter to be estimated

  • \( s_i(\alpha) \), \( i = 1, \ldots, k \), are samples of the signal

  • \( n_i \), \( i = 1, \ldots, k \), are samples of zero-mean, white Gaussian noise with variance \( \sigma^2 \).

Note that \( \sigma^2 = N_0 B \) for passband noise of bandwidth \( B \).

Let \( \vec{y} \) be the set of samples \( y_i \), \( i = 1, \ldots, k \).

The pdf \( p(\vec{y}|\alpha) \) can be expressed as

\[ p(\vec{y}|\alpha) = \frac{1}{(2\pi\sigma^2)^{k/2}} \exp \left[ -\frac{1}{2\sigma^2} \sum_{i=1}^k (y_i - s_i(\alpha))^2 \right] \]
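As a concrete illustration, the sketch below evaluates this likelihood numerically for a hypothetical linear-amplitude model \( s_i(\alpha) = \alpha h_i \) with a known cosine pulse \( h_i \); the pulse shape, sample count, noise variance, and true \( \alpha \) are illustrative assumptions, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical signal model: s_i(alpha) = alpha * h_i with a known pulse h_i
k = 100                                # number of samples
t = np.arange(k) * 1e-3                # sample times over T = 0.1 s
h = np.cos(2 * np.pi * 50 * t)         # assumed pulse shape
alpha_true = 2.0                       # assumed true parameter value
sigma2 = 0.5                           # assumed noise variance

# Measurements y_i = s_i(alpha) + n_i
y = alpha_true * h + rng.normal(scale=np.sqrt(sigma2), size=k)

def log_likelihood(alpha):
    """ln p(y | alpha) for the white Gaussian noise model."""
    resid = y - alpha * h
    return (-0.5 * k * np.log(2 * np.pi * sigma2)
            - np.sum(resid**2) / (2 * sigma2))

# Evaluate on a grid of candidate alpha values
grid = np.linspace(0.0, 4.0, 401)
ll = np.array([log_likelihood(a) for a in grid])
print("alpha maximizing ln p(y|alpha) on the grid:", grid[np.argmax(ll)])
```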

MAP Estimation#

A MAP estimate of \( \alpha \) can then be obtained by finding the maximum of \( \ln p(\alpha|\vec{y}) \) or equivalently, by finding the maximum of \( \ln[p(\vec{y}|\alpha) p(\alpha)] \), i.e.,

\[ \frac{\partial}{\partial \alpha} \ln p(\vec{y}|\alpha) + \frac{\partial}{\partial \alpha} \ln p(\alpha) = 0 \]

Plugging in \( p(\vec{y}|\alpha) \), the MAP estimate is obtained as the solution to

\[ \frac{1}{\sigma^2} \sum_{i=1}^k (y_i - s_i(\alpha)) \frac{\partial s_i(\alpha)}{\partial \alpha} + \frac{\partial}{\partial \alpha} \ln p(\alpha) = 0 \]

To proceed further, the form of the signal and the a priori pdf \( p(\alpha) \) (if \( \alpha \) is random) must be known.
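As a sketch of how the MAP condition can be used in practice, the code below assumes the hypothetical linear model \( s_i(\alpha) = \alpha h_i \) introduced above together with a Gaussian prior \( \alpha \sim \mathcal{N}(\mu_0, \sigma_0^2) \). For this linear-Gaussian case the stationarity condition has a closed-form solution, which is compared against a direct numerical maximization of the log-posterior.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)

# Hypothetical linear model s_i(alpha) = alpha * h_i, Gaussian prior on alpha
k, sigma2 = 100, 0.5
h = np.cos(2 * np.pi * 50 * np.arange(k) * 1e-3)
alpha_true, mu0, sigma0_2 = 2.0, 0.0, 1.0      # assumed prior: alpha ~ N(mu0, sigma0_2)
y = alpha_true * h + rng.normal(scale=np.sqrt(sigma2), size=k)

def neg_log_posterior(alpha):
    """-[ln p(y|alpha) + ln p(alpha)] up to alpha-independent constants."""
    return (np.sum((y - alpha * h)**2) / (2 * sigma2)
            + (alpha - mu0)**2 / (2 * sigma0_2))

# Numerical MAP estimate
alpha_map = minimize_scalar(neg_log_posterior, bounds=(-10, 10), method="bounded").x

# Closed-form solution of the stationarity condition for this linear-Gaussian case
alpha_map_closed = (h @ y / sigma2 + mu0 / sigma0_2) / (h @ h / sigma2 + 1 / sigma0_2)
print("MAP (numerical):", alpha_map, " MAP (closed form):", alpha_map_closed)
```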

ML Estimation#

An ML estimate can be obtained by finding the maximum of \( p(\vec{y}|\alpha) \) when the a priori pdf \( p(\alpha) \) is unknown.

Thus, the ML estimate results by finding the solution to

\[ \sum_{i=1}^k (y_i - s_i(\alpha)) \frac{\partial s_i(\alpha)}{\partial \alpha} = 0 \]
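For the hypothetical linear model \( s_i(\alpha) = \alpha h_i \) used above, \( \partial s_i / \partial \alpha = h_i \) and this equation can be solved in closed form; a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical linear model s_i(alpha) = alpha * h_i, so ds_i/dalpha = h_i
k, sigma2, alpha_true = 100, 0.5, 2.0
h = np.cos(2 * np.pi * 50 * np.arange(k) * 1e-3)
y = alpha_true * h + rng.normal(scale=np.sqrt(sigma2), size=k)

# Setting sum_i (y_i - alpha * h_i) * h_i = 0 and solving for alpha gives
# alpha_hat = sum_i y_i h_i / sum_i h_i^2.
alpha_ml = (y @ h) / (h @ h)
print("ML estimate:", alpha_ml, " (true value:", alpha_true, ")")
```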

MSE Estimation#

An MSE estimate can be obtained by computing

\[ \hat{\alpha}_{\rm MSE} = \int_{-\infty}^{\infty} \alpha p(\alpha | \vec{y}) d\alpha \]

where the pdf \( p(\alpha | \vec{y}) \) is determined by Bayes’ rule, i.e.,

\[ p(\alpha | \vec{y}) = \frac{p(\vec{y}|\alpha) p(\alpha)}{p(\vec{y})} \]

Substituting the likelihood into Bayes' rule, the a posteriori pdf can be written as

\[ p(\alpha | \vec{y}) = \kappa p(\alpha) \exp \left( -\frac{1}{2\sigma^2} \sum_{i=1}^k (y_i - s_i(\alpha))^2 \right) \]

where the constant

\[ \kappa \triangleq \frac{1}{(2\pi\sigma^2)^{k/2} p(\vec{y})} \]

is independent of the parameter \( \alpha \).
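A minimal numerical sketch of the MSE (posterior-mean) estimate, again assuming the hypothetical linear model and a standard Gaussian prior on \( \alpha \): the unnormalized posterior is evaluated on a grid, normalized numerically (so the constant \( \kappa \) never needs to be computed explicitly), and then averaged.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical linear model s_i(alpha) = alpha * h_i with assumed prior alpha ~ N(0, 1)
k, sigma2, alpha_true = 100, 0.5, 2.0
h = np.cos(2 * np.pi * 50 * np.arange(k) * 1e-3)
y = alpha_true * h + rng.normal(scale=np.sqrt(sigma2), size=k)

grid = np.linspace(-5.0, 5.0, 2001)           # grid over alpha
dal = grid[1] - grid[0]

# ln[ p(alpha) * exp(-sum_i (y_i - s_i(alpha))^2 / (2 sigma^2)) ];
# all alpha-independent constants (including kappa) drop out after normalization
log_post = (-0.5 * grid**2
            - np.array([np.sum((y - a * h)**2) for a in grid]) / (2 * sigma2))
post = np.exp(log_post - log_post.max())      # shift by the max for numerical stability
post /= post.sum() * dal                      # normalize so the pdf integrates to 1

alpha_mse = np.sum(grid * post) * dal         # posterior mean = MSE estimate
print("MSE (posterior-mean) estimate:", alpha_mse)
```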

Continuous versions of these results can be obtained using the procedure outlined in Chapter 5 (e.g., multiplying by \( \Delta t \)).

For example, the MAP estimate above can be expressed in continuous form by

\[ \frac{1}{\sigma^2} \int_0^T [y(t) - s(t, \alpha)] \frac{\partial s(t, \alpha)}{\partial \alpha} \, dt + \frac{\partial}{\partial \alpha} \ln p(\alpha) = 0 \]

where \( T \) is the measurement interval.

Minimum Variance#

We use the Cramér-Rao bound (CRB) to compute the minimum variance of an unbiased estimator \( \hat{\alpha} \).

The log-likelihood is:

\[ \ln p(\vec{y}|\alpha) = -\frac{k}{2} \ln 2\pi \sigma^2 - \frac{1}{2\sigma^2} \sum_{i=1}^k (y_i - s_i(\alpha))^2 \]

Score Function

The score function is the derivative of the log-likelihood with respect to the parameter \( \alpha \):

\[ \frac{\partial}{\partial \alpha} \ln p(\vec{y}|\alpha) = \frac{1}{\sigma^2} \sum_{i=1}^k (y_i - s_i(\alpha)) \frac{\partial s_i(\alpha)}{\partial \alpha} \]

where

  • \( \frac{\partial s_i(\alpha)}{\partial \alpha} \): Sensitivity of the model output \( s_i \) to changes in the parameter \( \alpha \).

  • The term \( (y_i - s_i(\alpha)) \) captures the residual between the observed data and the model prediction.
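A quick Monte Carlo check of a standard property of the score (a sketch under the same hypothetical linear model): at the true parameter value the residuals \( y_i - s_i(\alpha) \) are pure noise, so the score averages to zero over noise realizations.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical linear model s_i(alpha) = alpha * h_i, ds_i/dalpha = h_i
k, sigma2, alpha_true = 100, 0.5, 2.0
h = np.cos(2 * np.pi * 50 * np.arange(k) * 1e-3)

def score(y, alpha):
    """(1/sigma^2) * sum_i (y_i - s_i(alpha)) * ds_i/dalpha."""
    return np.sum((y - alpha * h) * h) / sigma2

# At the true alpha the residuals are pure noise, so the score averages to zero
scores = []
for _ in range(5000):
    y = alpha_true * h + rng.normal(scale=np.sqrt(sigma2), size=k)
    scores.append(score(y, alpha_true))
print("Monte Carlo mean of the score:", np.mean(scores))   # close to 0
```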

Fisher Information

The Fisher information is computed as the expectation of the square of the score function:

\[ E \left\{ \left[ \frac{\partial}{\partial \alpha} \ln p(\vec{y}|\alpha) \right]^2 \right\} \]

Substituting the score function and noting that \( y_i - s_i(\alpha) = n_i \) at the true parameter value:

\[ E \left\{ \left[ \frac{\partial}{\partial \alpha} \ln p(\vec{y}|\alpha) \right]^2 \right\} = \frac{1}{\sigma^4} \sum_{i=1}^k \sum_{j=1}^k E[n_i n_j] \frac{\partial s_i(\alpha)}{\partial \alpha} \frac{\partial s_j(\alpha)}{\partial \alpha} \]

For white Gaussian noise, \( E[n_i n_j] = \sigma^2 \delta_{ij} \) (where \( \delta_{ij} \) is 1 if \( i = j \), and 0 otherwise). This simplifies the double sum to:

\[ \mathcal{I}(\alpha) = \frac{1}{\sigma^2} \sum_{i=1}^k \left( \frac{\partial s_i(\alpha)}{\partial \alpha} \right)^2 \]
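The sketch below compares this closed-form expression with a direct Monte Carlo estimate of \( E\{[\partial \ln p(\vec{y}|\alpha)/\partial \alpha]^2\} \), again assuming the hypothetical linear model \( s_i(\alpha) = \alpha h_i \).

```python
import numpy as np

rng = np.random.default_rng(5)

# Same hypothetical model: s_i(alpha) = alpha * h_i, so ds_i/dalpha = h_i
k, sigma2, alpha_true = 100, 0.5, 2.0
h = np.cos(2 * np.pi * 50 * np.arange(k) * 1e-3)

# Closed-form Fisher information: (1/sigma^2) * sum_i (ds_i/dalpha)^2
I_analytic = np.sum(h**2) / sigma2

# Monte Carlo estimate of E{ [d/dalpha ln p(y|alpha)]^2 } at the true alpha
sq_scores = []
for _ in range(5000):
    n = rng.normal(scale=np.sqrt(sigma2), size=k)     # y_i - s_i(alpha) = n_i
    sq_scores.append((np.sum(n * h) / sigma2)**2)
print("analytic:", I_analytic, " Monte Carlo:", np.mean(sq_scores))  # should agree
```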

Cramér-Rao Bound

The second derivative of the log-likelihood \( \ln p(\vec{y}; \alpha) \) with respect to \( \alpha \) measures the curvature of the log-likelihood function.

This is given by:

\[ \frac{\partial^2}{\partial \alpha^2} \ln p(\vec{y}; \alpha) \]

On average, the log-likelihood function attains its maximum near the true parameter value \( \alpha \), since that is the value which makes the observed data \( \vec{y} \) most probable.

Near the maximum, the log-likelihood curve bends downward, which means the second derivative is negative.

Thus, the Fisher information is defined as:

\[ \mathcal{I}(\alpha) = - E\left\{\frac{\partial^2}{\partial \alpha^2} \ln p(\vec{y}; \alpha)\right\} \]

Here, the negative sign ensures that the Fisher information is positive, as the expected value of the second derivative is typically negative.

The variance of an unbiased estimator satisfies the inequality:

\[ V(\hat{\alpha}) \geq \frac{1}{\mathcal{I}(\alpha)} \]

where the Fisher information \( \mathcal{I}(\alpha) \) is computed as:

\[ \mathcal{I}(\alpha) = -E\left\{\frac{\partial^2}{\partial \alpha^2} \ln p(\vec{y}; \alpha)\right\} \]

Substituting this into the CRB inequality leads to:

\[ V(\hat{\alpha}) \geq \frac{1}{-E\left\{\frac{\partial^2}{\partial \alpha^2} \ln p(\vec{y}; \alpha)\right\}} \]

which is defined in Chapter 3.
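As a numerical sanity check of this curvature-based definition (a sketch under the same hypothetical linear model), the code below estimates \( -E\{\partial^2 \ln p(\vec{y}|\alpha) / \partial \alpha^2\} \) by a finite-difference second derivative averaged over noise realizations and compares it with \( \frac{1}{\sigma^2}\sum_i (\partial s_i/\partial \alpha)^2 \).

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical linear model s_i(alpha) = alpha * h_i
k, sigma2, alpha_true = 100, 0.5, 2.0
h = np.cos(2 * np.pi * 50 * np.arange(k) * 1e-3)
eps = 1e-3                                    # finite-difference step

def log_lik(y, alpha):
    # ln p(y|alpha) up to alpha-independent constants
    return -np.sum((y - alpha * h)**2) / (2 * sigma2)

# Average the second derivative of the log-likelihood over noise realizations
curvatures = []
for _ in range(2000):
    y = alpha_true * h + rng.normal(scale=np.sqrt(sigma2), size=k)
    d2 = (log_lik(y, alpha_true + eps) - 2 * log_lik(y, alpha_true)
          + log_lik(y, alpha_true - eps)) / eps**2
    curvatures.append(d2)

print(-np.mean(curvatures), np.sum(h**2) / sigma2)     # the two should agree
```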

Back to Our Problem

Substituting \( \mathcal{I}(\alpha) \):

\[ V(\hat{\alpha}) \geq \frac{\sigma^2}{\sum_{i=1}^k \left( \frac{\partial s_i(\alpha)}{\partial \alpha} \right)^2} \]

The minimum variance for any unbiased estimator \( \hat{\alpha} \), denoted \( \sigma^2_{\text{min}}(\hat{\alpha}) \), is therefore:

\[ \sigma^2_{\text{min}}(\hat{\alpha}) = \frac{\sigma^2}{\sum_{i=1}^k \left( \frac{\partial s_i(\alpha)}{\partial \alpha} \right)^2} \]
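The Monte Carlo sketch below illustrates this bound for the hypothetical linear model, where the ML estimate \( \hat{\alpha} = \sum_i y_i h_i / \sum_i h_i^2 \) is unbiased and attains the bound exactly.

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical linear model: ML estimate alpha_hat = (y.h)/(h.h) is unbiased
k, sigma2, alpha_true = 100, 0.5, 2.0
h = np.cos(2 * np.pi * 50 * np.arange(k) * 1e-3)

crb = sigma2 / np.sum(h**2)                   # sigma^2 / sum_i (ds_i/dalpha)^2

estimates = []
for _ in range(20000):
    y = alpha_true * h + rng.normal(scale=np.sqrt(sigma2), size=k)
    estimates.append((y @ h) / (h @ h))
print("CRB:", crb, " empirical variance of the ML estimate:", np.var(estimates))
```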

Continuous Case

If the data is continuous over time rather than discrete, the summation is replaced by an integral:

\[ \sigma^2_{\text{min}}(\hat{\alpha}) = \frac{\sigma^2}{\int_0^T \left( \frac{\partial s(t, \alpha)}{\partial \alpha} \right)^2 dt} \]

This generalizes the result to cases where \( s(t, \alpha) \) is a continuous signal.
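A small sketch of the continuous-time bound, assuming a hypothetical signal \( s(t, \alpha) = \alpha \cos(2\pi f_0 t) \): the sensitivity is \( \partial s / \partial \alpha = \cos(2\pi f_0 t) \), and the integral in the denominator is approximated by a Riemann sum on a fine time grid (over whole periods it equals \( T/2 \)).

```python
import numpy as np

# Hypothetical continuous signal s(t, alpha) = alpha * cos(2*pi*f0*t)
T, f0, sigma2 = 0.1, 50.0, 0.5                # assumed interval, frequency, noise variance
t = np.linspace(0.0, T, 10001)
dt = t[1] - t[0]

ds_dalpha = np.cos(2 * np.pi * f0 * t)        # sensitivity of s(t, alpha) to alpha
integral = np.sum(ds_dalpha**2) * dt          # ~ integral_0^T (ds/dalpha)^2 dt  (= T/2 here)

sigma2_min = sigma2 / integral
print("minimum variance (continuous form):", sigma2_min)
```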