Channel Modeling#

End-to-end System Overview#

Channel Encoding#

In digital communication systems, the discrete channel encoder plays a critical role in preparing a binary information sequence for transmission over a noisy channel.

The primary function of the encoder is to introduce redundancy into the binary sequence in a controlled and systematic manner.

This redundancy refers to additional bits that do not carry new information but are strategically added to enable the receiver to detect and correct errors caused by noise and interference during transmission.

Without such redundancy, the receiver would have no means to distinguish between the intended signal and the distortions introduced by the channel, rendering reliable communication impossible in the presence of noise.

The encoding process involves segmenting the input binary information sequence into blocks of \(k\) bits, where \(k\) is the number of information bits per block.

Each unique \(k\)-bit sequence is then mapped to a corresponding \(n\)-bit sequence, known as a codeword, where

\[\boxed{ n > k } \]

This mapping ensures that each possible \(k\)-bit input has a distinct \(n\)-bit output, preserving the uniqueness of the information while embedding redundancy.

For example, a simple repetition code might map a single bit (e.g., \(k = 1\), input \((0)\)) to a three-bit codeword (e.g., \(n = 3\), output \((000)\)), repeating the bit to add redundancy.
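As a minimal sketch, this repetition code can be implemented in a few lines of Python (NumPy assumed; the function names are illustrative, not from any standard library):

```python
import numpy as np

def repetition_encode(bits, n=3):
    """Map each information bit (k = 1) to an n-bit codeword by repetition."""
    return np.repeat(np.asarray(bits), n)

def repetition_decode(received, n=3):
    """Majority vote over each n-bit block recovers the information bit."""
    blocks = np.asarray(received).reshape(-1, n)
    return (blocks.sum(axis=1) > n // 2).astype(int)

bits = np.array([0, 1, 1])
coded = repetition_encode(bits)   # [0 0 0 1 1 1 1 1 1]
coded[1] ^= 1                     # channel flips one bit
print(repetition_decode(coded))   # [0 1 1] -- the single error is corrected
```

The redundancy is what makes the majority vote possible: with \(n = 3\), any single bit error per codeword is corrected.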

Code Rate#

The amount of redundancy introduced by this encoding process is quantified by the ratio \(n/k\).

This ratio indicates how many total bits (\(n\)) are transmitted for every \(k\) information bits.

A higher \(n/k\) implies more redundancy; for instance, if \(k = 4\) and \(n = 7\), then

\[ \frac{n}{k} = \frac{7}{4} = 1.75 \]

meaning 1.75 bits are sent per information bit, with 0.75 bits being redundant.

The reciprocal of this ratio, \(k/n\), is defined as the code rate, denoted \(R_c\):

\[ \boxed{ R_c = \frac{k}{n} } \]

The code rate measures the efficiency of the encoding scheme, representing the fraction of the transmitted bits that carry actual information.

For the example above (\(k = 4\), \(n = 7\)),

\[ R_c = \frac{4}{7} \approx 0.571 \]

meaning approximately 57.1% of the transmitted bits are information, and the remaining 42.9% are redundancy.

A code rate closer to 1 indicates less redundancy (higher efficiency), while a lower \(R_c\) indicates more redundancy (better error protection but lower efficiency).

The choice of \(R_c\) balances the trade-off between data throughput and error resilience, depending on the channel’s noise characteristics.

Modulation and Interface to the Channel#

The binary sequence output from the channel encoder, consisting of \(n\)-bit codewords, is passed to the modulator, which serves as the interface between the digital system and the physical communication channel (e.g., a wireless medium or optical fiber).

The modulator’s role is to convert the discrete binary sequence into a continuous-time waveform suitable for transmission over the channel.

In its simplest form, the modulator employs binary modulation, where each bit in the sequence is mapped to one of two distinct waveforms:

  • A binary \((0)\) is mapped to waveform \( s_1(t)\).

  • A binary \((1)\) is mapped to waveform \( s_2(t)\).

For example, in binary phase-shift keying (BPSK), \( s_1(t)\) and \( s_2(t)\) might be sinusoidal signals differing in phase (e.g., \( s_1(t) = A \cos(2\pi f_c t)\) and \( s_2(t) = -A \cos(2\pi f_c t)\)), transmitted over a symbol duration \( T\).
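A short NumPy sketch of this BPSK mapping may help; the amplitude, carrier frequency, symbol duration, and sampling rate below are illustrative values, not taken from the text:

```python
import numpy as np

A, f_c, T, fs = 1.0, 4.0, 1.0, 1000.0      # illustrative: amplitude, carrier (Hz),
t = np.arange(0, T, 1 / fs)                # symbol duration (s), sample rate (Hz)

s1 = A * np.cos(2 * np.pi * f_c * t)       # waveform for binary 0
s2 = -A * np.cos(2 * np.pi * f_c * t)      # waveform for binary 1 (180-degree shift)

def bpsk_modulate(bits):
    """Map each bit to its waveform and concatenate, one symbol per duration T."""
    return np.concatenate([s2 if b else s1 for b in bits])

waveform = bpsk_modulate([0, 1, 1])        # sampled continuous-time signal
```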

This one-to-one mapping occurs at a rate determined by the bit rate of the encoded sequence.

Alternatively, the modulator can operate on blocks of \(q\) bits at a time, using \(M\)-ary modulation, where \( M = 2^q\) represents the number of possible waveforms.

Each unique \(q\)-bit block is mapped to one of \(M\) distinct waveforms.

For instance, if \(q = 2\), then \( M = 2^2 = 4\), and the modulator might use four waveforms (e.g., in quadrature phase-shift keying, QPSK), such as \( s_1(t), s_2(t), s_3(t), s_4(t)\), each corresponding to a 2-bit sequence \((00, 01, 10, 11)\).

This increases the data rate per symbol—since each waveform carries \(q\) bits—but requires a more complex receiver to distinguish between the \(M\) signals.
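The following sketch shows \(q = 2\) grouping with an assumed Gray-coded QPSK constellation (the specific bit-to-phase assignment is one common choice, not mandated by the text):

```python
import numpy as np

# Assumed Gray-coded QPSK mapping: each 2-bit block -> one of M = 4 phases
QPSK_MAP = {
    (0, 0): np.exp(1j * np.pi / 4),
    (0, 1): np.exp(1j * 3 * np.pi / 4),
    (1, 1): np.exp(1j * 5 * np.pi / 4),
    (1, 0): np.exp(1j * 7 * np.pi / 4),
}

def qpsk_map(bits):
    """Group bits into q = 2 blocks and map each block to a complex symbol."""
    pairs = np.asarray(bits).reshape(-1, 2)
    return np.array([QPSK_MAP[tuple(p)] for p in pairs])

symbols = qpsk_map([0, 0, 0, 1, 1, 0, 1, 1])   # 8 bits -> 4 symbols
```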

At the receiving end, the transmitted waveform is corrupted by channel effects (e.g., noise, fading, or interference), resulting in a channel-corrupted waveform.

The demodulator processes this received signal and converts each waveform back into a form that estimates the transmitted data symbol.

In binary modulation, the demodulator might output a scalar value (e.g., a voltage level) indicating whether \( s_1(t)\) or \( s_2(t)\) was more likely sent.

In \(M\)-ary modulation, it might produce a vector in a signal space (e.g., coordinates in a constellation diagram) representing one of the \(M\) possible symbols.

This output serves as an estimate of the original binary or \(M\)-ary data symbol, though it may still contain errors due to channel noise.

Mathematical Pipeline of the End-to-End Process#

Source Input (Information Bits):

\[ \vec{u} = \{ u_1, u_2, \dots, u_{L} \},\quad u_i \in \{0, 1\} \]

Segment into \( \frac{L}{k} \) blocks of \( k \) bits (assuming \( k \) divides \( L \)).

Channel Encoding:

\[ \vec{c}_i = \vec{u}_i \mathbf{G},\quad\vec{c}_i \in \{0, 1\}^n,\quad \forall i \]

where \( \mathbf{G} \in \mathbb{F}_2^{k \times n} \) is the generator matrix.

Output bitstream:

\[ \vec{c} = \{ c_1, c_2, \dots, c_{N} \},\quad N = \frac{L}{k} \times n \]
Bitstream Grouping for Modulation:

Group into \( \frac{N}{q} \) symbols:

\[ \vec{b}_j = \{ c_{(j-1)q+1}, \dots, c_{jq} \},\quad \vec{b}_j \in \{0, 1\}^q \]

If \( N \not\equiv 0 \ (\text{mod}\ q) \), pad with zeros to make length divisible by \( q \).

Symbol Mapping (8-QAM):

\[ \vec{s}_j = \mu(\vec{b}_j),\quad \mu: \{0,1\}^q \rightarrow \mathbb{C} \]

where \( \mu \) maps \( q \)-bit groups to complex constellation points in 8-QAM.

Transmission:

\[ x(t) = \Re\left\{ \sum_j s_j \, g(t - jT_s) \, e^{j2\pi f_c t} \right\} \]

where \( f_c \) is the carrier frequency, \( g(t) \) is the transmit pulse shape confining each symbol to its own interval, and \( T_s \) is the symbol duration.
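The encoding step \(\vec{c}_i = \vec{u}_i \mathbf{G}\) is an ordinary matrix product followed by mod-2 reduction. A minimal sketch, using the standard systematic generator matrix of the (7,4) Hamming code as an assumed example of \(\mathbf{G}\):

```python
import numpy as np

# Assumed G = [I_k | P]: the parity part P below is the standard (7,4) Hamming
# choice, used here only as a concrete example of a generator matrix.
P = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), P])    # shape (k, n) = (4, 7)

def encode(u_blocks):
    """c_i = u_i G over F_2: integer matrix multiply, then reduce mod 2."""
    return (np.asarray(u_blocks) @ G) % 2

codewords = encode([[1, 0, 1, 1], [0, 0, 1, 0]])   # one 7-bit codeword per row
```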

Example: (7,4) Linear Block Code with 8-QAM Modulation#

System Parameters

  • Channel code: \( (n, k) = (7, 4) \)

  • Code rate: \( R_c = \frac{k}{n} = \frac{4}{7} \)

  • Modulation: 8-QAM

  • Modulation order: \( M = 8 \), bits per symbol: \( q = \log_2 M = 3 \)

The (7,4) Linear Block Code

A (7,4) linear block code adds redundancy to enable error correction:

  • \(k = 4\): Number of message (information) bits.

  • \(n = 7\): Length of each encoded codeword.

  • \(n − k = 3\): Number of parity (redundant) bits.

For each 4-bit message, the encoder generates a 7-bit codeword.

For example (systematic encoding):
Input = \(1011\) → Output = \(1011xyz\)
Where \(xyz\) are parity bits computed from the message bits based on the code’s generator matrix.

Bitstream Formation

The encoder takes a sequence of message bits and breaks it into 4-bit blocks. Each block is encoded into a 7-bit codeword.

Input messages: \(1011\), \(0010\), \(1100\)
Encoded codewords: \(1011100\), \(0010110\), \(1100101\)

These 7-bit codewords are concatenated into a continuous binary stream:

Bitstream = \(101110000101101100101\)

This bitstream is then sent to the modulator.

Modulator Input: Preparing for 8-QAM

We know that

  • 8-QAM transmits 3 bits per symbol (\(q = 3\), \(M = 2^3 = 8\)).

  • Each symbol corresponds to a unique point in an 8-point constellation, typically placed asymmetrically or as a modified rectangular/circular constellation in the I-Q (in-phase and quadrature) plane.

Mapping Encoded Bits to 8-QAM Symbols

Since the codeword length (7 bits) is not a multiple of 3, we treat the bitstream as continuous and group it into 3-bit chunks.

Bitstream: \(101110000101101100101\)

Grouped into 3-bit segments: \(101\), \(110\), \(000\), \(101\), \(101\), \(100\), \(101\)

If the total bit count isn’t divisible by 3, padding bits (e.g., \(0\)s) may be appended at the end to complete the final 3-bit symbol.

Original = 21 bits → already divisible by 3
No padding required in this example.
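A small sketch of the concatenation, padding check, and grouping just described (pure Python, using the example's bit values):

```python
bitstream = "1011100" + "0010110" + "1100101"    # concatenated codewords

q = 3
pad = (-len(bitstream)) % q        # zeros needed to reach a multiple of q
bitstream += "0" * pad             # no-op here: 21 bits is already divisible by 3

groups = [bitstream[i:i + q] for i in range(0, len(bitstream), q)]
print(groups)   # ['101', '110', '000', '101', '101', '100', '101']
```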

Each 3-bit group is then mapped to a unique 8-QAM constellation point:

Example mapping:

  • \(000\) → Symbol 1

  • \(001\) → Symbol 2

  • \(111\) → Symbol 8

Each symbol has a defined amplitude and phase, used to modulate a carrier signal.
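As a sketch of this mapping step, here is one assumed rectangular 8-QAM constellation with an arbitrary bit labeling; actual systems differ in both point layout and labeling:

```python
# Assumed rectangular 8-QAM constellation (I in {-3,-1,1,3}, Q in {-1,1});
# the bit labeling below is illustrative, not a standard assignment.
CONSTELLATION = {
    "000": -3 + 1j, "001": -1 + 1j, "011": 1 + 1j, "010": 3 + 1j,
    "100": -3 - 1j, "101": -1 - 1j, "111": 1 - 1j, "110": 3 - 1j,
}

groups = ["101", "110", "000", "101", "101", "100", "101"]
symbols = [CONSTELLATION[g] for g in groups]
# Each complex point's magnitude and angle set the carrier's amplitude and phase.
```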

Modulation and Transmission

Each 3-bit group modulates a carrier wave:

  • The carrier’s amplitude and phase are set according to the corresponding constellation point (e.g., \(101\) → (I = -1, Q = 2)).

  • The signal is transmitted over the physical medium (e.g., wireless channel, cable, optical link).

  • During transmission, the signal may experience noise, fading, or distortion.

Receiver Side (Overview)

At the receiver:

  • Each received analog symbol is demodulated back into a 3-bit group.

  • The demodulated bitstream is re-segmented into 7-bit codewords.

  • The decoder checks each 7-bit codeword for errors and corrects them based on the (7,4) code structure.

  • Finally, the original 4-bit message blocks are recovered.

Optional Additional Steps

Depending on the system, optional preprocessing or enhancements may be applied between encoding and modulation:

  • Interleaving: Rearranges bits to reduce the impact of burst errors (a minimal sketch follows this list).

  • Scrambling: Prevents long sequences of 0s or 1s to improve synchronization.

  • Framing: Adds synchronization headers or delimiters to help the receiver align data.

  • Rate Matching: Adjusts the number of bits/symbol to match the channel bandwidth (less common in fixed systems like (7,4) + 8-QAM).
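As a sketch of the interleaving step above, a simple block interleaver writes bits into a matrix by rows and reads them out by columns, so a burst of adjacent channel errors is spread across different codewords after deinterleaving (the dimensions here are illustrative):

```python
import numpy as np

def block_interleave(bits, rows=3, cols=7):
    """Write row-by-row into a rows x cols array, read out column-by-column."""
    return np.asarray(bits).reshape(rows, cols).T.flatten()

def block_deinterleave(bits, rows=3, cols=7):
    """Inverse operation performed at the receiver."""
    return np.asarray(bits).reshape(cols, rows).T.flatten()

x = np.arange(21)   # stand-in for 21 coded bits (three 7-bit codewords)
assert np.array_equal(block_deinterleave(block_interleave(x)), x)
```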

Transmission: Each symbol is sent over the channel using its associated I-Q modulation.

Summary of the Pipeline

| Stage | Operation | Notes |
|---|---|---|
| Encoder | (7,4) linear block coding | Adds 3 parity bits per 4-bit message |
| Bitstream | Concatenate 7-bit codewords | Continuous binary stream |
| Modulator | Group bits into 3-bit segments | Each group → 1 symbol (8-QAM) |
| Symbol Mapping | Map 3-bit groups to 8-QAM constellation | Modulate carrier with amplitude/phase |
| Channel | Transmit analog modulated signal | Subject to channel impairments |
| Receiver | Demodulate + decode | Recover original 4-bit messages |

Detection and Decision-Making#

The demodulator’s output is fed to the detector, which interprets the estimate and makes a decision about the transmitted symbol.

Hard Decision#

In the simplest case, for binary modulation, the detector determines whether the transmitted bit was a \((0)\) or a \((1)\) based on the demodulator’s scalar output.

For example, if the demodulator outputs a value \( y\) and a threshold \( \tau\) is set (e.g., \(\tau = 0\) in BPSK), the detector decides:

\[\begin{split} \begin{cases} y > \tau: & \text{Bit is } 1 \\ y \leq \tau: & \text{Bit is } 0 \end{cases} \end{split}\]

This binary decision is termed a hard decision, as it commits to one of two possible outcomes without retaining ambiguity.
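A hard-decision detector is a one-line threshold comparison; the sketch below assumes BPSK-style demodulator outputs centered at \(\pm 1\) with \(\tau = 0\):

```python
import numpy as np

def hard_decision(y, tau=0.0):
    """Quantize each demodulator output to a bit: y > tau -> 1, else 0."""
    return (np.asarray(y) > tau).astype(int)

y = np.array([0.8, -1.3, 0.1, -0.2])   # noisy demodulator outputs
print(hard_decision(y))                # [1 0 1 0]
```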

The detection process can be viewed as a form of quantization.

In the hard-decision case, the demodulator’s continuous output (e.g., a real-valued voltage) is quantized into one of two levels, analogous to binary quantization.

The decision boundary (e.g., \( \tau\)) divides the output space into two regions, each corresponding to a bit value.

More generally, the detector can quantize the demodulator output into \( Q \geq 2\) levels, forming a \( Q\)-ary detector.

If \(M\)-ary modulation is used (with \( M = 2^q\) waveforms), the number of quantization levels must satisfy \( Q \geq M\) to distinguish all possible symbols.

For example, in QPSK (\( M = 4\)), a hard-decision detector might use \( Q = 4\) to map the demodulator’s vector output to one of four symbols.

Soft Decision#

When \( Q > M\), the detector provides more granularity than the number of transmitted symbols, resulting in a soft decision.

For instance, in QPSK (\( M = 4\)), a detector with \( Q = 8\) might assign the demodulator output to one of eight levels, offering finer resolution within each symbol’s decision region.

In the extreme case, no quantization is performed (\( Q = \infty\)): the detector passes the unquantized, continuous output directly to the next stage, preserving all information from the demodulator.

Soft decisions retain more information about the received signal’s likelihood rather than forcing a definitive choice, which can improve error correction in subsequent decoding.
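For the binary case, the natural soft output is the log-likelihood ratio (LLR). A minimal sketch, assuming BPSK over AWGN with bit 1 mapped to \(+1\) and bit 0 to \(-1\):

```python
import numpy as np

def bpsk_llr(y, sigma2):
    """LLR = log p(y|bit=1) / p(y|bit=0) = 2y / sigma^2 for this mapping.
    The sign carries the hard decision; the magnitude carries its reliability."""
    return 2.0 * np.asarray(y) / sigma2

y = np.array([0.8, -1.3, 0.1])
print(bpsk_llr(y, sigma2=0.5))   # large |LLR| = confident, near 0 = ambiguous
```

A soft-decision decoder weighs each bit by this reliability instead of trusting every hard decision equally.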

The detector’s quantized output—hard or soft—is then passed to the channel decoder.

The decoder exploits the redundancy introduced by the encoder (e.g., the extra bits in each \(n\)-bit codeword) to correct errors caused by channel disturbances.

For hard decisions, the decoder works with binary or \( Q\)-ary symbols, using techniques like Hamming distance minimization. For soft decisions, it can use probabilistic methods (e.g., maximum likelihood decoding or log-likelihood ratios) to leverage the additional information, typically achieving better performance in noisy conditions.

Channel Models#

A communication channel serves as the medium through which information is transmitted from a sender to a receiver.

The channel’s behavior is mathematically modeled to predict and optimize system performance.

A general communication channel is described by three key components:

  1. Set of Possible Inputs \(\mathcal{X}\):
    This is the input alphabet, denoted \(\mathcal{X}\), which consists of all possible symbols that can be transmitted into the channel.

    For example, in a binary system, \(\mathcal{X} = \{0, 1\}\), meaning the input is restricted to two symbols. The input alphabet defines the domain of signals the transmitter can send.

  2. Set of Possible Channel Outputs \(\mathcal{Y}\):
    This is the output alphabet, denoted \(\mathcal{Y}\), which includes all possible symbols that can be received from the channel.

    In some cases, \(\mathcal{Y}\) may differ from \(\mathcal{X}\) (e.g., due to noise altering the signal), but in simple models like binary channels, \(\mathcal{Y}\) might also be \(\{0, 1\}\).

    The output alphabet represents the range of observable outcomes at the receiver.

  3. Conditional Probability:
    The relationship between inputs and outputs is captured by the conditional probability

    \[ P\bigl[y_1, y_2, \ldots, y_n \,\bigm|\; x_1, x_2, \ldots, x_n\bigr], \]

    where \(\vec{x} = (x_1, x_2, \ldots, x_n)\) is an input sequence of length \((n)\) and \(\vec{y} = (y_1, y_2, \ldots, y_n)\) is the corresponding output sequence of length \((n)\).

    This probability distribution specifies the likelihood of receiving a particular output sequence \(\vec{y}\) given a specific input sequence \(\vec{x}\).

    It encapsulates the channel’s behavior, including effects like noise, interference, or distortion, and applies to sequences of any length \((n)\).

Together, these components (\(\mathcal{X}\), \(\mathcal{Y}\), and the conditional probability) provide a complete probabilistic model of the channel, enabling analysis of how reliably information can be transmitted.

The channel’s characteristics determine the strategies needed for encoding, modulation, and decoding to achieve effective communication.

Memoryless Channels#

A channel is classified as memoryless if its output at any given time depends only on the input at that same time, with no influence from previous inputs or outputs.

Mathematically, a channel is memoryless if the conditional probability of the output sequence \(\vec{y}\) given the input sequence \(\vec{x}\) factors into a product of individual conditional probabilities:

\[ P[\vec{y} \mid \vec{x}] \;=\; \prod_{i=1}^{n} P\bigl[y_i \mid x_i\bigr] \quad \text{for all } n \]

Here, \(P[y_i \mid x_i]\) is the probability of receiving output \(y_i\) given input \(x_i\) at time index \((i)\), and the product form indicates that each output \(y_i\) is statistically independent of all other inputs \(x_j\) \((j \neq i)\) and outputs \(y_j\) \((j \neq i)\), conditioned on \(x_i\).

This property implies that the channel has no “memory” of past transmissions; the effect of an input \(x_i\) on the output \(y_i\) is isolated to that specific time instance.

In other words, for a memoryless channel, the output at time \((i)\) depends solely on the input at time \((i)\), and the channel’s behavior at each time step is governed by the same conditional probability distribution \(P[y_i \mid x_i]\).

This simplifies analysis and design, as the channel can be characterized by a single-symbol transition probability rather than a complex sequence-dependent model.

The simplest and most widely studied memoryless channel model is the binary symmetric channel (BSC).

In the BSC, both the input and output alphabets are binary, i.e., \(\mathcal{X} = \mathcal{Y} = \{0, 1\}\).

As will be detailed later, we can define the BSC with the crossover probability \(p\) as

\[\begin{split} P[y_i \mid x_i] = \begin{cases} 1 - p, & \text{if } y_i = x_i \\ p, & \text{if } y_i \ne x_i \end{cases} \end{split}\]

This model is particularly suitable for systems employing binary modulation (where bits are mapped to two waveforms) and hard decisions at the detector (where the receiver makes a definitive choice between \((0)\) and \((1)\)).

The BSC captures the essence of a basic digital communication channel with symmetric error characteristics, making it a foundational concept in information theory and coding.

The Binary Symmetric Channel (BSC) Model#

The binary symmetric channel (BSC) model emerges when we consider a communication system as a composite channel, incorporating the modulator, the physical waveform channel, and the demodulator/detector as an integrated unit.

This abstraction is particularly relevant for systems with the following components:

  • Modulator with Binary Waveforms:
    The modulator maps each binary input \((0)\) or \((1)\) to one of two distinct waveforms (e.g., \( s_1(t)\) for \((0)\) and \( s_2(t)\) for \((1)\)), such as in binary phase-shift keying (BPSK).

    This converts the discrete binary sequence into a continuous-time signal for transmission.

  • Detector with Hard Decisions:
    The receiver’s demodulator processes the channel-corrupted waveform and produces an estimate, which the detector then quantizes into a binary decision \((0)\) or \((1)\), committing to one of the two possible symbols without ambiguity.

In this setup, the physical channel is modeled as an additive noise channel, where the transmitted waveform is perturbed by random noise (e.g., additive white Gaussian noise, AWGN).

The demodulator and detector together transform the noisy waveform back into a binary sequence.

The resulting composite channel operates in discrete time, with a binary input sequence (from the encoder) and a binary output sequence (from the detector).

This end-to-end system, depicted in the following figure, abstracts the continuous-time waveform transmission into a discrete-time model, simplifying analysis.

The BSC model assumes that the combined effects of modulation, channel noise, demodulation, and detection can be represented as a single discrete-time channel with binary inputs and binary outputs.

This abstraction is valid when the noise affects each transmitted bit independently, and the detector’s hard decisions align with the binary nature of the input, making the BSC an appropriate and widely used model for such systems.

Characteristics of the Binary Symmetric Channel#

The composite channel, modeled as a BSC, is fully characterized by the following:

  • Input Alphabet:
    \( \mathcal{X} = \{0, 1\}\), the set of possible binary inputs fed into the channel (e.g., the encoded bits from the transmitter).

  • Output Alphabet:
    \( \mathcal{Y} = \{0, 1\}\), the set of possible binary outputs produced by the detector after processing the received signal.

  • Conditional Probabilities:
    A set of probabilities that define the likelihood of each output given each input, capturing the channel’s error behavior.

For the BSC, the channel noise and disturbances are assumed to cause statistically independent errors in the transmitted binary sequence, with an average probability of error \((p)\), known as the crossover probability.

The conditional probabilities are symmetric and defined as:

\[\begin{split} \begin{aligned} P[Y = 0 \mid X = 1] &= P[Y = 1 \mid X = 0] = p,\\ P[Y = 1 \mid X = 1] &= P[Y = 0 \mid X = 0] = 1 - p. \end{aligned} \end{split}\]
  • \( P[Y = 0 \mid X = 1] = p\):
    The probability that an input \((1)\) is received as a \((0)\) (an error).

  • \( P[Y = 1 \mid X = 0] = p\):
    The probability that an input \((0)\) is received as a \((1)\) (an error).

  • \( P[Y = 1 \mid X = 1] = 1 - p\):
    The probability that an input \((1)\) is correctly received as a \((1)\).

  • \( P[Y = 0 \mid X = 0] = 1 - p\):
    The probability that an input \((0)\) is correctly received as a \((0)\).

The symmetry arises because the error probability \((p)\) is the same in both directions (\(0 \to 1\) and \(1 \to 0\)), and the correct reception probability is \(1 - p\).

Since the channel is memoryless, these probabilities apply independently to each transmitted bit, consistent with:

\[ P[\vec{y} \mid \vec{x}] = \prod_{i=1}^{n} P[y_i \mid x_i]. \]

The BSC is often depicted diagrammatically as a transition model with two inputs and two outputs, connected by arrows labeled with probabilities \((p)\) and \((1 - p)\).

The cascade of the binary modulator, waveform channel, and binary demodulator/detector is thus reduced to this equivalent discrete-time channel, the BSC.

This model simplifies the analysis of error rates and informs the design of error-correcting codes, as \((p)\) (typically \(0 < p < 0.5\)) quantifies the channel’s reliability.
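The BSC is straightforward to simulate; the sketch below flips each bit independently with probability \(p\) and confirms the empirical error rate:

```python
import numpy as np

rng = np.random.default_rng(0)

def bsc(x, p):
    """Binary symmetric channel: flip each bit independently with probability p."""
    flips = (rng.random(x.shape) < p).astype(int)
    return x ^ flips

x = rng.integers(0, 2, size=100_000)
y = bsc(x, p=0.1)
print(np.mean(x != y))   # empirical crossover rate, close to 0.1
```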

Discrete Memoryless Channels (DMC)#

The binary symmetric channel (BSC), discussed previously, is a specific instance of a broader class of channel models known as the discrete memoryless channel (DMC).

A DMC is characterized by two key properties:

  1. Discrete Input and Output Alphabets
    The input alphabet \(\mathcal{X}\) and output alphabet \(\mathcal{Y}\) are finite, discrete sets.

    For example, \(\mathcal{X}\) might consist of \(M\) symbols (e.g., \(\{0, 1, \ldots, M-1\}\)), and \(\mathcal{Y}\) might consist of \(Q\) symbols (e.g., \(\{0, 1, \ldots, Q-1\}\)), where \(M\) and \(Q\) are integers.

  2. Memoryless Property
    The channel’s output at any given time depends only on the input at that same time, with no dependence on prior inputs or outputs.

    Mathematically, this is expressed as

    \[ P[\vec{y} \mid \vec{x}] \;=\; \prod_{i=1}^{n} P[y_i \mid x_i] \]

    for an input sequence \(\vec{x} = (x_1, x_2, \ldots, x_n)\) and output sequence \(\vec{y} = (y_1, y_2, \ldots, y_n)\).

A practical example of a DMC arises in a communication system using an \(M\)-ary memoryless modulation scheme.

Here, the modulator maps each input symbol from \(\mathcal{X}\) (with \(\lvert\mathcal{X}\rvert = M\)) to one of \(M\) distinct waveforms (e.g., in \(M\)-ary phase-shift keying, M-PSK).

The detector processes the received waveform and produces an output symbol from \(\mathcal{Y}\), consisting of \(Q\)-ary symbols (e.g., after hard or soft quantization, where \(Q \geq M\)).

The composite channel—comprising the modulator, physical channel, and detector—is thus a DMC, as the modulation and detection processes preserve the discrete and memoryless nature of the system.

The input-output behavior of the DMC is fully described by a set of conditional probabilities \( P[y \mid x]\), where \(x \in \mathcal{X}\) and \(y \in \mathcal{Y}\).

There are \(M \times Q\) such probabilities, one for each possible input-output pair.

For instance, if \(M = 2\) (binary input) and \(Q = 2\) (binary output), as in the BSC, there are \(2 \times 2 = 4\) probabilities (e.g., \(P[0\mid0], P[1\mid0], P[0\mid1], P[1\mid1]\)).

These conditional probabilities can be organized into a probability transition matrix \(\mathbf{P} = [p_{ij}]\), where:

  • The rows correspond to inputs \(x_i\), with \(i = 1, 2, \ldots, |\mathcal{X}|\).

  • The columns correspond to outputs \(y_j\), with \(j = 1, 2, \ldots, |\mathcal{Y}|\).

  • Each entry \(p_{ij} = P[y_j \mid x_i]\) is the probability of receiving output \(y_j\) given input \(x_i\).

The matrix \(\mathbf{P}\) has dimensions \(\lvert\mathcal{X}\rvert \times \lvert\mathcal{Y}\rvert\) (e.g., \(2 \times 2\) for the BSC), and each row sums to 1 (i.e., \(\sum_{j} p_{ij} = 1\) for each \(i\)), since these rows represent probability distributions over \(\mathcal{Y}\) for a given \(x_i\).

This matrix, often illustrated as in the following figure, provides a compact representation of the DMC’s statistical behavior and facilitates analysis of error rates and channel capacity.
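For instance, the BSC's transition matrix can be written down and checked directly:

```python
import numpy as np

p = 0.1
# Rows = inputs {0, 1}; columns = outputs {0, 1}; entries p_ij = P[y_j | x_i]
P = np.array([[1 - p, p],
              [p, 1 - p]])

assert np.allclose(P.sum(axis=1), 1.0)   # each row is a distribution over Y
```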

Discrete-Input, Continuous-Output Channels#

In contrast to the DMC, the discrete-input, continuous-output channel model relaxes the constraint on the output alphabet while retaining a discrete input.

This model is defined by:

  1. Discrete Input Alphabet:
    The input to the modulator is selected from a finite, discrete set \(\mathcal{X}\), with \(\lvert\mathcal{X}\rvert = M\).

    For example, in QPSK \((M = 4)\), \(\mathcal{X} = \{0, 1, 2, 3\}\), where each symbol corresponds to a unique waveform.

  2. Continuous Output Alphabet:
    The detector’s output is unquantized, meaning \(\mathcal{Y} = \mathbb{R}\), the set of all real numbers.

    This occurs when the demodulator produces a continuous-valued estimate (e.g., a voltage or likelihood measure) without subsequent quantization into discrete levels.

This configuration defines a composite discrete-time memoryless channel, consisting of the modulator, physical channel, and detector.

The channel takes a discrete input \(X \in \mathcal{X}\) and produces a continuous output \(Y \in \mathbb{R}\).

Its behavior is characterized by a set of conditional probability density functions (PDFs):

\[ p(y \mid x), \quad x \in \mathcal{X}, \; y \in \mathbb{R}. \]

For each input symbol \(x\), \(p(y \mid x)\) is a PDF over the real line, describing the likelihood of observing a particular output value \(y\) given \(x\).

Unlike the DMC’s discrete probabilities \(\bigl(P[y \mid x]\bigr)\), here \(p(y \mid x)\) is a continuous function, and the probability of \(Y\) falling in an interval \([a, b]\) is

\[ \int_{a}^{b} p(y \mid x)\,\mathrm{d}y, \]

with

\[ \int_{-\infty}^{\infty} p(y \mid x)\,\mathrm{d}y \;=\; 1 \quad \text{for each } x. \]

This model is relevant when the receiver retains the full resolution of the received signal (e.g., soft-decision outputs) rather than forcing a discrete decision, providing more information for subsequent decoding processes.

Additive White Gaussian Noise (AWGN) Channel#

The additive white Gaussian noise (AWGN) channel is one of the most fundamental examples of a discrete-input, continuous-output memoryless channel in communication theory.

It is modeled by:

\[ Y = X + N, \]

where:

  • \(X\) is the discrete input drawn from an alphabet \(\mathcal{X}\) (for example, a modulated symbol).

  • \(N\) is a zero-mean Gaussian random variable with variance \(\sigma^2\). Its probability density function (PDF) is:

    \[ p_N(n) \;=\; \frac{1}{\sqrt{2\pi\,\sigma^2}}\, \exp\!\Bigl(-\,\tfrac{n^{2}}{2\,\sigma^2}\Bigr), \]

    representing the additive noise in the channel.

  • \(Y\) is the continuous output in \(\mathbb{R}\).

The term “white” indicates that the noise has a flat power spectral density (i.e., it is uncorrelated across time), while “Gaussian” refers to its normal distribution.

For a given input \(x\), the output \(Y\) is a Gaussian random variable with mean \(x\) and variance \(\sigma^2\), thus:

\[ p(y \mid x) = \frac{1}{\sqrt{2\pi\,\sigma^2}} \exp\!\Bigl(-\,\tfrac{(y - x)^{2}}{2\,\sigma^2}\Bigr). \]
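A brief sketch of this conditional PDF, evaluated on simulated channel outputs (the BPSK-style inputs and \(\sigma^2 = 0.25\) are illustrative choices):

```python
import numpy as np

def awgn_pdf(y, x, sigma2):
    """p(y | x) for the AWGN channel: a Gaussian with mean x and variance sigma2."""
    return np.exp(-(y - x) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

rng = np.random.default_rng(0)
x = np.array([-1.0, 1.0, 1.0, -1.0])              # discrete inputs (e.g., BPSK)
y = x + rng.normal(0.0, np.sqrt(0.25), x.shape)   # Y = X + N, continuous output
print(awgn_pdf(y, x, 0.25))                       # likelihood of each observation
```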

Multiple Inputs and Outputs#

Consider a sequence of \(n\) inputs \(X_i\), \(i = 1, 2, \ldots, n\).

The corresponding outputs are:

\[ Y_i = X_i + N_i, \quad i = 1, 2, \ldots, n, \]

where each \(N_i\) is an independent, identically distributed (i.i.d.) Gaussian noise term,

\[ N_i \,\sim\, \mathcal{N}(0,\sigma^2). \]

Because the channel is memoryless, the noise in each output \(Y_i\) depends only on \(X_i\).

Formally,

\[ p(y_1, y_2, \ldots, y_n \,\big\vert\, x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} p(y_i \,\big\vert\, x_i). \]

Substituting the Gaussian PDF yields:

\[ p(y_1, y_2, \ldots, y_n \,\big\vert\, x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\,\sigma^2}} \exp\!\Bigl(-\,\tfrac{(y_i - x_i)^{2}}{2\,\sigma^2}\Bigr). \]

This factorization confirms the channel’s memoryless nature, as the joint PDF of the output sequence is the product of individual PDFs, each depending only on the corresponding input.

Role of AWGN Channels#

The AWGN channel is a cornerstone of communication theory, providing an accurate model for systems where thermal noise dominates, such as satellite links and wireless channels.

Its importance extends to analyzing modulation schemes (e.g., BPSK, QPSK) with continuous outputs prior to any quantization, forming the basis for many fundamental results in digital communications.

The Discrete-Time AWGN Channel#

A discrete-time (continuous-input, continuous-output) additive white Gaussian noise (AWGN) channel is one in which both the input and output take values in the set of all real numbers:

\[ \mathcal{X} \;=\; \mathcal{Y} \;=\; \mathbb{R}. \]

Unlike channels with discrete alphabets, this model permits continuous-valued inputs and outputs, corresponding to a situation with no quantization at either the transmitter or the receiver.

Input–Output Relationship#

At each discrete time instant \(i\), an input \(x_i \in \mathbb{R}\) is transmitted over the channel, producing the received symbol:

\[ y_i \;=\; x_i \;+\; n_i, \]

where \(n_i\) represents additive noise.

The noise samples \(\{n_i\}\) are independent, identically distributed (i.i.d.) zero-mean Gaussian random variables with variance \(\sigma^2\).

Hence, the PDF of each \(n_i\) is

\[ p_{N_i}(n_i) \;=\; \frac{1}{\sqrt{2\pi\,\sigma^2}} \exp\!\Bigl(-\,\tfrac{n_i^{2}}{2\,\sigma^2}\Bigr). \]

Given an input \(x_i\), the output \(y_i\) is a Gaussian random variable with mean \(x_i\) and variance \(\sigma^2\). Thus, its conditional PDF is

\[ p(y_i \mid x_i) \;=\; \frac{1}{\sqrt{2\pi\,\sigma^2}} \exp\!\Bigl(-\,\tfrac{\bigl(y_i - x_i\bigr)^{2}}{2\,\sigma^2}\Bigr). \]

Power Constraint#

A key practical limitation in this channel model is the power constraint on the input, expressed as an expected power limit:

\[ \mathbb{E}\bigl[X^{2}\bigr] \;\le\; P, \]

which ensures that the transmitter does not exceed a certain average energy \(P\).

For a sequence of \(n\) input symbols

\[ \vec{x} \;=\; (x_{1}, x_{2}, \ldots, x_{n}), \]

the time-average power is

\[ \frac{1}{n}\,\sum_{i=1}^{n} x_{i}^{2} \;=\; \frac{1}{n}\,\|\vec{x}\|^{2}, \]

where

\[ \|\vec{x}\|^{2} \;=\; \sum_{i=1}^{n} x_{i}^{2} \]

is the squared Euclidean norm of \(\vec{x}\).

As \(n\) grows large, the law of large numbers implies that, with high probability, the time-average power \(\tfrac{1}{n}\|\vec{x}\|^{2}\) converges to \(\mathbb{E}[X^2]\).

Thus, the constraint

\[ \frac{1}{n}\,\|\vec{x}\|^{2} \;\le\; P \]

arises naturally. In simpler terms,

\[ \sum_{i=1}^{n} x_{i}^{2} \;\le\; n\,P. \]

Geometric Interpretation#

Geometrically, the set of all allowable input sequences \(\vec{x}\) lies within an \(n\)-dimensional sphere of radius \(\sqrt{n\,P}\) centered at the origin, since

\[ \sum_{i=1}^{n} x_{i}^{2} \;\le\; n\,P \quad\Longleftrightarrow\quad \|\vec{x}\| \;\le\; \sqrt{n\,P}. \]

This spherical boundary in \(n\)-dimensional space is crucial for understanding both the channel capacity and the design of signal constellations under energy constraints.

The AWGN Waveform Channel#

The AWGN waveform channel describes a physical communication medium in which both the input and output are continuous-time waveforms, rather than discrete symbols.

One can interpret this as a continuous-time, continuous-input, continuous-output AWGN channel.

In order to highlight the core behavior of the physical channel, the modulator and demodulator are treated as separate from the channel model, so that attention is directed solely to the process of waveform transmission.

Suppose the channel has a bandwidth \( W \), characterized by an ideal frequency response

\[ C(f) = 1 \quad \text{for} \quad |f| \leq W \]

and

\[ C(f) = 0 \quad \text{otherwise}. \]

This means that the channel perfectly transmits signals whose frequency components lie in the interval \(\,[-W, +W]\) and suppresses those outside this range.

The input waveform \( x(t) \) is assumed to be band-limited so that its Fourier transform satisfies

\[ X(f) = 0 \quad \text{for} \quad |f| > W, \]

ensuring that it conforms to the channel’s bandwidth.

At the channel output, the waveform \( y(t) \) is given by

\[ y(t) = x(t) + n(t), \]

where \( n(t) \) is a sample function of an additive white Gaussian noise (AWGN) process. The noise has a power spectral density

\[ \frac{N_0}{2} \quad \text{(W/Hz)}, \]

indicating that its power is distributed uniformly across all frequencies. For a channel of bandwidth \( W \), the noise power confined within the interval \(\,[-W, +W]\) is

\[ \sigma^2 = \int_{-W}^{W} \frac{N_0}{2} \, df = \frac{N_0}{2} \times 2W = N_0 W. \]

As will become clearer later, the discrete-time equivalent of this channel provides a simpler perspective on this through sampling.

Power Constraint and Signal Representation#

The input waveform \( x(t) \) must obey a power constraint:

\[ \mathbb{E}[x^2(t)] \leq P, \]

which restricts the expected instantaneous power of \( x(t) \) to \( P \).

For ergodic processes, whose time averages equal their ensemble averages, we write:

\[ \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} x^2(t) \, dt \leq P. \]

Interpreted over an interval of length \( T \), this stipulates that the average energy per unit time cannot exceed \( P \).

Consequently, this time-average condition is consistent with the ensemble-average constraint \(\mathbb{E}[x^2(t)] \leq P\).

To analyze the channel in probabilistic terms, \( x(t) \), \( y(t) \), and \( n(t) \) are expanded in terms of a complete set of orthonormal functions \(\{\phi_j(t)\}.\)

When a signal has bandwidth \( W \) and duration \( T \), its dimension in signal space can be approximated by \( 2WT \).

This approximation follows from the sampling theorem:

  • A band-limited signal can be reconstructed from samples taken at the Nyquist rate, i.e., \( 2W \) samples per second.

  • Over a time interval \( T \), this yields \( 2W \times T \) samples, each of which corresponds to one dimension of the signal space.

Thus, the signal space effectively has \( 2W \) dimensions per second.

Orthonormal Expansion#

Using this orthonormal set, the waveforms can be written as:

\[ x(t) = \sum_{j} x_j \,\phi_j(t), \]
\[ n(t) = \sum_{j} n_j \,\phi_j(t), \]
\[ y(t) = \sum_{j} y_j \,\phi_j(t), \]

where

\[ \{\phi_j(t), \, j = 1, 2, \ldots, 2WT\} \]

are orthonormal basis functions (e.g., sinc functions or prolate spheroidal wave functions) satisfying

\[\begin{split} \int \phi_i(t) \,\phi_j(t)\, dt = \delta_{ij} = \begin{cases} 1, & \text{if } i = j,\\ 0, & \text{if } i \neq j. \end{cases} \end{split}\]

The expansion coefficients

\[ x_j = \int x(t)\,\phi_j(t)\,dt, \quad n_j = \int n(t)\,\phi_j(t)\,dt, \quad y_j = \int y(t)\,\phi_j(t)\,dt \]

are the projections of the signals onto these basis functions. Since

\[ y(t) = x(t) + n(t), \]

substituting the expansions into this relationship results in:

\[ \sum_{j} y_j \phi_j(t) = \sum_{j} x_j \phi_j(t) \;+\; \sum_{j} n_j \phi_j(t). \]

By orthonormality, matching coefficients across the sums yields

\[ y_j = x_j + n_j. \]

Because \( n(t) \) is white Gaussian noise with power spectral density \( \tfrac{N_0}{2} \), the noise coefficients \( n_j \) are i.i.d. Gaussian random variables with zero mean and variance \(\sigma^2 = \frac{N_0}{2}.\)

Hence, each dimension of the expansion carries a noise variance of \( \frac{N_0}{2} \), consistent with the total noise power spread over the channel’s bandwidth.

Equivalent Discrete-Time Channel#

One can reduce the AWGN waveform channel to a discrete-time model in which each output coefficient \(y_j\) is related to the corresponding input coefficient \(x_j\) through

\[ y_j = x_j + n_j. \]

The conditional probability density function (PDF) for each output symbol given the input symbol is

\[ p(y_j \mid x_j) \;=\; \frac{1}{\sqrt{2\pi \sigma^2}} \exp\!\Bigl(-\tfrac{(y_j - x_j)^2}{2\sigma^2}\Bigr) \;=\; \frac{1}{\sqrt{\pi N_0}} \exp\!\Bigl(-\tfrac{(y_j - x_j)^2}{N_0}\Bigr), \]

because

\[ \sigma^2 \;=\; \frac{N_0}{2} \quad\text{and}\quad \sqrt{2\pi \sigma^2} \;=\; \sqrt{2\pi \times \tfrac{N_0}{2}} \;=\; \sqrt{\pi N_0}. \]

Since the noise coefficients \(n_j\) are independent for different values of \(j\), the overall channel is memoryless, which gives

\[ p(y_1, y_2, \ldots, y_N \,\mid\, x_1, x_2, \ldots, x_N) \;=\; \prod_{j=1}^{N} p(y_j \mid x_j). \]

Vector AWGN Model#

From the relationship:

\[ y_j = x_j + n_j, \quad \text{with } n_j \sim \mathcal{N}(0, \sigma^2 = N_0/2) \]

for \(j = 1, 2, \dots, N\)

We can rewrite as a compact vector form:

\[\boxed{ \vec{y} = \vec{x} + \vec{n}, \quad \vec{n} \sim \mathcal{N}(\vec{0}, \tfrac{N_0}{2} \mathbf{I}) } \]

where:

  • \(\vec{x} = (x_1, x_2, \ldots, x_N)\) — input vector (coefficients from orthonormal expansion)

  • \(\vec{y} = (y_1, y_2, \ldots, y_N)\) — output vector

  • \(\vec{n} = (n_1, n_2, \ldots, n_N)\) — noise vector

  • \(\mathbf{I}\) — identity matrix

  • Noise is i.i.d. Gaussian, so the covariance matrix is diagonal with entries \(N_0/2\).
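A minimal sketch of this vector model (the values of \(N\) and \(N_0\) are illustrative):

```python
import numpy as np

N, N0 = 8, 0.5
rng = np.random.default_rng(0)

x = rng.choice([-1.0, 1.0], size=N)            # input vector of expansion coefficients
n = rng.normal(0.0, np.sqrt(N0 / 2), size=N)   # i.i.d. N(0, N0/2) noise components
y = x + n                                      # vector AWGN: y = x + n

cov = (N0 / 2) * np.eye(N)                     # diagonal covariance (N0/2) I
```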

Power Constraint and Parseval’s Theorem#

The continuous-time power constraint translates directly to the discrete coefficients.

By Parseval’s theorem, for a signal of duration \(T\),

\[ \frac{1}{T} \int_{-T/2}^{T/2} x^2(t)\,dt \;=\; \frac{1}{T}\sum_{j=1}^{2WT} x_j^2. \]

In this interval of length \(T\), there are \(2WT\) coefficients, so the average power per coefficient is

\[ \mathbb{E}[X^2] \;=\; \frac{1}{2WT} \sum_{j=1}^{2WT} \mathbb{E}[X_j^2]. \]

Hence,

\[ \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} x^2(t)\,dt \;=\; \lim_{T\to\infty} \frac{1}{T}\sum_{j=1}^{2WT} x_j^2 \;=\; 2W\,\mathbb{E}[X^2] \;\leq\; P. \]

Solving for \(\mathbb{E}[X^2]\), one obtains

\[ \mathbb{E}[X^2] \;\leq\; \frac{P}{2W}. \]

Accordingly, a waveform channel of bandwidth \(W\) and input power \(P\) behaves like \(2W\) uses per second of a discrete-time AWGN channel whose noise variance is \(\sigma^2 \;=\; \frac{N_0}{2}.\)

This equivalence establishes the connection between the continuous-time channel and its discrete-time counterpart.