Optimal Detection in a General Vector Channel Model#
The received signal is modeled as an \(N\)-dimensional vector \(\vec{r}\) that depends statistically on the transmitted signal vector \(\vec{s}_m\).
In this section, we consider a general vector channel model—not limited to the AWGN scenario—and develop the concepts underlying optimal detection.
Signal Transmission and the General Vector Channel#
Suppose the transmitter has a set of possible signal vectors

\[ \vec{s}_1, \vec{s}_2, \ldots, \vec{s}_M, \]

each corresponding to a message.
These vectors are transmitted according to certain a priori probabilities \(P_m\) (other notation \(p(\vec{s}_m)\) or \(p(m)\)), which capture the likelihood of each message being sent.
When a signal vector \(\vec{s}_m\) is transmitted, the received vector \(\vec{r}\) is a random quantity whose statistics are described by the conditional probability density function (pdf)

\[ p(\vec{r} \mid \vec{s}_m). \]
Thus, the overall channel model is characterized by the statistical relationship between \(\vec{s}_m\) and \(\vec{r}\). The receiver, upon observing \(\vec{r}\), must decide which message was most likely transmitted.
Decision Function and the Optimal Detector#
To formalize the detection process, we define a decision function (or decision rule) \(g(\vec{r})\), which is a mapping from the observation space \(\mathbb{R}^N\) to the set of messages. The decision rule is expressed as:

\[ g: \mathbb{R}^N \to \{1, 2, \ldots, M\}. \]
It indicates that the decision function \(g\) takes an \(N\)-dimensional vector (obtained from the projection of the received waveform \(r(t)\) onto an orthonormal basis) as its input and maps it to a message index \(\hat{m}\) in the set \(\{1, 2, \ldots, M\}\).
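The projection mentioned above can be sketched numerically: each component \(r_k\) of \(\vec{r}\) is the inner product of \(r(t)\) with a basis function \(\phi_k(t)\), approximated here by a Riemann sum (the function name and sampling setup are our illustrative assumptions, not notation from the text):

```python
def project(r_samples, basis_samples, dt):
    """Approximate r_k = integral of r(t) * phi_k(t) dt for each basis function
    by a Riemann sum over uniformly spaced samples with spacing dt."""
    return [sum(r * phi for r, phi in zip(r_samples, phi_k)) * dt
            for phi_k in basis_samples]
```

For instance, projecting a constant waveform \(r(t) = 2\) on \([0, 1]\) onto the normalized constant basis function \(\phi_1(t) = 1\) recovers the coefficient \(r_1 = 2\).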
When the receiver observes a particular \(\vec{r}\), it declares the transmitted message as

\[ \hat{m} = g(\vec{r}). \]
The goal of optimal detection is to choose the decision rule \(g(\vec{r})\) so as to minimize the error probability or, equivalently, to maximize the probability of a correct decision.
Message Indices#
Note that in this context, \( m \) and \(\hat{m}\) are typically used as indices that label the possible messages.
Message Index:
The notation \( m \) (with \( 1 \le m \le M \)) identifies one of the \( M \) possible messages. Each message corresponds to a unique transmitted signal vector \( \vec{s}_m \). Although each message is originally derived from a sequence of bits (for example, a \( k \)-bit sequence when \( M = 2^k \)), in the detection and analysis framework we use the index \( m \) to refer to the message.
Detection Process:
The receiver observes the vector \(\vec{r}\) and makes a decision by choosing an index \(\hat{m}\) that maximizes the posterior probability (or likelihood) given the observation:

\[ \hat{m} = \arg\max_{1 \le m \le M} \Pr[m \mid \vec{r}]. \]
Here, \(\hat{m}\) is the estimated message index that is declared as the transmitted message.
Mapping to Bit Sequences:
Although \( m \) (and thus \(\hat{m}\)) is an index, there is typically an established mapping between these indices and the actual sequences of bits. For example, if a \( k \)-bit message is mapped to an index \( m \), the receiver can convert the detected index \(\hat{m}\) back to its corresponding \( k \)-bit binary sequence using the inverse of the mapping used at the transmitter.
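The natural binary mapping between a \(k\)-bit sequence and a 1-based index \(m\) can be sketched as follows (a minimal illustration; the function names are ours, not from the text):

```python
def index_to_bits(m, k):
    """Map a 1-based message index m (1 <= m <= 2**k) to its k-bit sequence."""
    return tuple((m - 1) >> (k - 1 - i) & 1 for i in range(k))

def bits_to_index(bits):
    """Inverse mapping: recover the 1-based message index from a bit sequence."""
    return 1 + int("".join(str(b) for b in bits), 2)
```

For \(k = 3\), index \(m = 5\) corresponds to the bits \((1, 0, 0)\); the receiver applies the inverse mapping to convert the detected index \(\hat{m}\) back to bits.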
Probability of a Correct Decision#
If the receiver decides \(\hat{m}\) upon receiving \(\vec{r}\), the probability that this decision is correct is given by

\[ \Pr[\text{correct decision} \mid \vec{r}] = \Pr[\hat{m} \text{ was sent} \mid \vec{r}]. \]
That is, given the observation \(\vec{r}\), the probability of being correct is the conditional probability that the transmitted message is \(\hat{m}\).
To obtain the overall probability of a correct decision, we average this conditional probability over all possible received vectors \(\vec{r}\) weighted by their marginal probability \(p(\vec{r})\):

\[ \Pr[\text{correct decision}] = \int \Pr[\hat{m} \text{ was sent} \mid \vec{r}] \, p(\vec{r}) \, d\vec{r}. \]
Since \(p(\vec{r})\) is nonnegative for all \(\vec{r}\), the overall probability of a correct decision is maximized if, for every received \(\vec{r}\), the decision rule maximizes \(\Pr[\hat{m} \text{ was sent} \mid \vec{r}]\).
Design Criterion for the Optimal Detector#
This observation leads directly to the formulation of the optimal detection rule. The optimal detector selects the message \(m\) that maximizes the posterior probability \(\Pr[m|\vec{r}]\). Formally, the decision function \(g_{\text{opt}}(\vec{r})\) is defined as:

\[ g_{\text{opt}}(\vec{r}) = \hat{m} = \arg\max_{1 \le m \le M} \Pr[m \mid \vec{r}]. \]
In practice, this means that upon receiving \(\vec{r}\), the detector computes \(\Pr[m|\vec{r}]\) for each \(m = 1, 2, \ldots, M\) and then declares the message corresponding to the largest value of this conditional probability.
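This computation translates directly into code. A sketch of the MAP rule, with the channel supplied as a likelihood function (the names `map_detect` and `likelihood` are our own illustration):

```python
def map_detect(r, priors, likelihood):
    """Return the 1-based index maximizing Pr[m | r].

    By Bayes' rule, Pr[m | r] is proportional to likelihood(r, m) * priors[m-1];
    the marginal p(r) is common to all m, so it can be dropped from the argmax.
    """
    M = len(priors)
    return max(range(1, M + 1), key=lambda m: likelihood(r, m) * priors[m - 1])
```

Note how a skewed prior can override the likelihood: with a toy binary channel, an observation that mildly favors message 2 is still decided as message 1 when \(P_1\) is large enough.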
Equivalence in Terms of Signal Vectors#
Since transmitting message (index) \(m\) is equivalent to transmitting the signal vector \(\vec{s}_m\), the optimal decision rule can equivalently be written in terms of these vectors:

\[ \hat{m} = \arg\max_{1 \le m \le M} \Pr[\vec{s}_m \mid \vec{r}]. \]
This formulation emphasizes that the receiver’s decision is based on the likelihood of having received \(\vec{r}\) given that the signal \(\vec{s}_m\) was transmitted.
Summary of Types of Probabilities#
Prior Probability \(P_m\)#
Definition:
The prior probability, denoted by \(P_m\), is the probability that the transmitter sends the \(m\)-th message before any observation is made at the receiver. In other words,\[ P_m \equiv p(\vec{s}_m) \equiv p(m) = \Pr[m \text{ is selected to send}] = \Pr[\vec{s}_m \text{ is selected to send}] \]which reflects the statistical likelihood of the \(m\)-th message being transmitted.
Role in Detection:
The prior probability encapsulates any inherent bias in the message selection process. In many systems, messages are chosen uniformly at random, in which case\[ P_m = \frac{1}{M} \quad \text{for all } m \]where \(M\) is the total number of messages. However, if some messages are more likely to occur than others, \(P_m\) will vary accordingly.
Likelihood Probability \(\Pr(\vec{r}|\vec{s}_m)\) or \(\Pr(\vec{r}|m)\)#
Definition:
The likelihood probability represents the probability density (or probability, in the discrete case) of receiving the vector \(\vec{r}\) given that the signal corresponding to the \(m\)-th message (i.e., \(\vec{s}_m\)) was transmitted. It is denoted by:\[ \Pr(\vec{r}|\vec{s}_m) \quad \text{or} \quad \Pr(\vec{r}|m) \]
Role in Detection:
This probability is determined by the channel’s statistical behavior. For example, in an AWGN channel, the likelihood is typically modeled as a multivariate Gaussian density centered at \(\vec{s}_m\) with a covariance matrix that depends on the noise variance. It tells us how likely it is to observe \(\vec{r}\) if \(\vec{s}_m\) were transmitted.
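Under an i.i.d. AWGN assumption (noise variance \(\sigma^2\) per dimension), that Gaussian likelihood can be evaluated directly. A minimal sketch, not a prescribed implementation:

```python
import math

def awgn_likelihood(r, s, sigma2):
    """Multivariate Gaussian pdf p(r | s): mean s, covariance sigma2 * I."""
    N = len(r)
    d2 = sum((ri - si) ** 2 for ri, si in zip(r, s))  # squared Euclidean distance
    return (2 * math.pi * sigma2) ** (-N / 2) * math.exp(-d2 / (2 * sigma2))
```

The density is largest when \(\vec{r}\) coincides with \(\vec{s}_m\) and decays with the squared Euclidean distance \(\|\vec{r} - \vec{s}_m\|^2\).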
Posterior Probability \(\Pr(\vec{s}_m|\vec{r})\) or \(\Pr(m|\vec{r})\)#
Definition:
The posterior probability is the probability that the \(m\)-th message (or equivalently, the signal \(\vec{s}_m\)) was transmitted given that the received vector is \(\vec{r}\). It is denoted by:\[ \Pr(m|\vec{r}) \quad \text{or} \quad \Pr(\vec{s}_m|\vec{r}) \]
Derivation Using Bayes’ Rule:
The posterior probability is computed by applying Bayes’ theorem:\[ \Pr(m|\vec{r}) = \frac{\Pr(\vec{r}|m) \, P_m}{p(\vec{r})} \]where:
\(\Pr(\vec{r}|m)\) is the likelihood.
\(P_m\) is the prior probability.
\(p(\vec{r})\) is the marginal probability (or evidence), given by
\[ p(\vec{r}) = \sum_{m=1}^{M} \Pr(\vec{r}|m) \, P_m \]
Role in Detection:
The posterior probability represents the updated belief about which message was transmitted after taking into account the observation \(\vec{r}\). It is the key quantity in the Maximum a Posteriori (MAP) detection rule.
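Bayes’ rule translates line by line into code. A sketch (function name is ours), with the marginal computed exactly as the sum over \(m\):

```python
def posteriors(r, priors, likelihood):
    """Return [Pr(1|r), ..., Pr(M|r)] via Bayes' rule.

    Numerator: likelihood(r, m) * P_m.  Denominator: the marginal
    p(r) = sum over m of likelihood(r, m) * P_m, which normalizes the result.
    """
    joint = [likelihood(r, m) * P for m, P in enumerate(priors, start=1)]
    p_r = sum(joint)  # marginal probability p(r)
    return [j / p_r for j in joint]
```

Because every term is divided by the same marginal \(p(\vec{r})\), the returned posteriors always sum to one, and the MAP decision is simply the index of the largest entry.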
Marginal Probability \(p(\vec{r})\)#
Definition: The marginal probability \(p(\vec{r})\) represents the unconditional probability density of receiving the vector \(\vec{r}\) regardless of which message was transmitted. It is calculated by averaging (or “marginalizing”) the likelihood \(\Pr(\vec{r}|m)\) over all possible messages, each weighted by its corresponding prior probability \(P_m\). Mathematically, it is given by:

\[ p(\vec{r}) = \sum_{m=1}^{M} \Pr(\vec{r}|m) \, P_m \]
Role in Detection: This term plays a crucial role in Bayes’ theorem, serving as a normalizing constant to ensure that the posterior probabilities sum to one. It provides a measure of how likely the observation \(\vec{r}\) is under the entire statistical model of the transmission process.
Conditional Probability (Correct Detection Probability) \(\Pr(\hat{m} \text{ was sent} \,|\, \vec{r})\)#
Definition:
Once the receiver processes \(\vec{r}\), it makes a decision and declares \(\hat{m}\) as the transmitted message. The probability\[ \Pr(\hat{m} \text{ was sent} \,|\, \vec{r}) \]is the probability that the message corresponding to the decision \(\hat{m}\) is indeed the one that was transmitted, given the received vector \(\vec{r}\).
This probability is often referred to as the conditional probability of correct detection (or simply the conditional correctness probability) given \(\vec{r}\).
Role in Performance Analysis:
This quantity directly reflects the confidence of the detector in its decision for a given observation \(\vec{r}\). The overall performance (i.e., the overall probability of a correct decision) is obtained by averaging \(\Pr(\hat{m} \text{ was sent} \,|\, \vec{r})\) over all possible received vectors:\[ \boxed{ \Pr[\text{correct decision}] = \int \Pr(\hat{m} \text{ was sent} \,|\, \vec{r}) \, p(\vec{r}) \, d\vec{r} } \]
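The boxed average can be estimated by simulation. Below is a hedged sketch for binary antipodal signaling \(s \in \{-1, +1\}\) over AWGN with equal priors, where the MAP rule reduces to the sign of \(r\); the scenario and all parameters are illustrative assumptions, not from the text:

```python
import random

def simulate_correct_rate(trials=20000, sigma=0.4, seed=0):
    """Monte Carlo estimate of Pr[correct decision] for antipodal signaling.

    With equal priors and AWGN, the MAP decision is the minimum-distance
    decision, i.e. the sign of the received sample (here N = 1).
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        s = rng.choice([-1.0, 1.0])       # transmitted signal vector
        r = s + rng.gauss(0.0, sigma)     # received vector
        m_hat = 1.0 if r >= 0 else -1.0   # MAP = minimum-distance decision
        correct += (m_hat == s)
    return correct / trials
```

For \(\sigma = 0.4\) the theoretical value is \(1 - Q(1/\sigma) \approx 0.994\), and the Monte Carlo estimate should land close to it.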