Block Codes#

Codewords#

A \(q\)-ary block code \(\mathcal{C}\) comprises a set of \(M\) vectors, each of length \(n\), denoted as \(\vec{c}_m = (c_{m1}, c_{m2}, \ldots, c_{mn})\), where \(1 \leq m \leq M\).

These vectors, termed codewords, have components drawn from an alphabet of \(q\) symbols.

When the alphabet is binary (symbols \(0\) and \(1\)), the code is a binary code, widely used due to its simplicity in digital systems.

If \(q = 2^b\), where \(b\) is a positive integer, each \(q\)-ary symbol corresponds to a \(b\)-bit binary sequence, enabling efficient representation.

A nonbinary code with block length \(N\) can be transformed into a binary code of length \(n = bN\), facilitating compatibility with binary transmission systems by expanding each symbol into its binary equivalent.
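
The expansion is straightforward to mechanize. Below is a minimal Python sketch of the symbol-to-bits mapping; the function name `qary_to_binary` and the MSB-first bit ordering are illustrative choices, not prescribed by the text.

```python
# Minimal sketch (illustrative): map a q-ary codeword to its binary image when
# q = 2**b. Each symbol becomes a b-bit group, so a length-N q-ary word becomes
# a length n = b*N binary word.

def qary_to_binary(codeword, b):
    """Expand each q-ary symbol (0 <= symbol < 2**b) into b bits, MSB first."""
    bits = []
    for symbol in codeword:
        bits.extend((symbol >> shift) & 1 for shift in range(b - 1, -1, -1))
    return bits

# Example: a 4-ary (q = 2**2) codeword of length N = 3 maps to n = 6 bits.
print(qary_to_binary([3, 0, 2], b=2))   # -> [1, 1, 0, 0, 1, 0]
```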

\((n, k)\) Code#

In a binary block code of length \(n\), there are \(2^n\) possible codewords, representing all combinations of \(n\) bits.

From this set, \(M = 2^k\) codewords (\(k < n\)) are selected to form a code, where each codeword corresponds to a \(k\)-bit information block, adding redundancy for error control.

This mapping defines an \((n, k)\) code with a rate \(R_c = \frac{k}{n}\), indicating the fraction of information bits per codeword.

For a general \(q\)-ary code, there are \(q^n\) possible codewords, and selecting \(M = q^k\) codewords allows transmission of \(k\)-symbol information blocks, generalizing the binary case to larger alphabets and increasing flexibility in design.
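
As a concrete instance of these counts, the \((7, 4)\) binary code examined later in this section selects \(M = 2^4 = 16\) codewords out of the \(2^7 = 128\) possible length-\(7\) binary vectors, giving

\[ R_c = \frac{k}{n} = \frac{4}{7} \approx 0.571 \]

so slightly more than half of each transmitted codeword carries information.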

Weight (Introduction)#

Beyond the code rate \(R_c\), a key codeword parameter is its weight, defined as the number of nonzero elements it contains, reflecting its density of active symbols.

Each codeword has a weight, and the collection of weights across all \(M\) codewords forms the weight distribution of the code, a critical metric for assessing error-correcting capability.

If all \(M\) codewords share the same weight, the code is classified as a fixed-weight or constant-weight code, offering uniform properties that simplify analysis and implementation in specific applications, such as balanced signaling.
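
The following short Python sketch computes codeword weights and tallies them into a weight distribution; the toy four-codeword code is purely illustrative and not taken from the discussion above.

```python
# Sketch (illustrative): compute codeword weights and the weight distribution
# of a small binary code given as an explicit list of codewords.
from collections import Counter

def weight(codeword):
    """Hamming weight: number of nonzero components."""
    return sum(1 for symbol in codeword if symbol != 0)

# A toy binary code of M = 4 codewords with length n = 4.
code = [(0, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 1, 1)]

distribution = Counter(weight(c) for c in code)
print(distribution)   # Counter({2: 2, 0: 1, 4: 1}) -> weights 0, 2, 2, 4
```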

Linear Block Codes#

Linear block codes (LBCs), a prominent subset of block codes, have been extensively studied over recent decades due to their advantageous properties.

Their linearity ensures easier implementation and analysis, as operations follow algebraic rules, reducing computational complexity.

LBCs perform comparably to the broader class of block codes, making them practical without significant loss in efficacy.

An LBC \(\mathcal{C}\) is a \(k\)-dimensional subspace of the \(n\)-dimensional vector space of all length-\(n\) sequences, and is termed an \((n, k)\) code.

For binary LBCs, \(\mathcal{C}\) contains \(2^k\) sequences of length \(n\), closed under addition: if \(\vec{c}_1, \vec{c}_2 \in \mathcal{C}\), then \(\vec{c}_1 + \vec{c}_2 \in \mathcal{C}\).

This closure ensures that the all-zero vector \(\vec{0}\) is always a codeword: for any \(\vec{c} \in \mathcal{C}\), the sum \(\vec{c} + \vec{c} = \vec{0}\) must also lie in \(\mathcal{C}\).
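
A brief sketch of this closure check, reusing the toy code from the weight example above (an assumed example, not a code from the text):

```python
# Sketch (illustrative): verify closure under modulo-2 addition and confirm
# that the all-zero vector belongs to the code.
import itertools

code = {(0, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 1, 1)}

def add_mod2(a, b):
    """Componentwise addition in GF(2)."""
    return tuple((x + y) % 2 for x, y in zip(a, b))

closed = all(add_mod2(a, b) in code
             for a, b in itertools.product(code, repeat=2))
print(closed)                      # True: the sum of any two codewords is a codeword
print((0, 0, 0, 0) in code)        # True: the all-zero vector is a codeword
```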

Generator Matrix for LBC#

In an LBC, the mapping from \(M = 2^k\) information sequences of length \(k\) to \(2^k\) codewords of length \(n\) is represented by a \(k \times n\) generator matrix \(\mathbf{G}\), where:

\[ \vec{c}_m = \vec{u}_m \mathbf{G}, \quad 1 \leq m \leq 2^k \]

Here, \(\vec{u}_m\) is a \(k\)-bit information sequence, and \(\vec{c}_m\) is its corresponding codeword.

The matrix \(\mathbf{G}\) encapsulates the linear transformation, enabling systematic encoding by defining how information bits generate codewords, streamlining both design and hardware realization.
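
A minimal encoding sketch follows; it assumes NumPy is available and uses a small \((5, 2)\) generator matrix chosen only for illustration.

```python
# Sketch (illustrative): encode every k-bit information sequence as c = u G
# with arithmetic reduced modulo 2.
import itertools
import numpy as np

# A small generator matrix of a (5, 2) binary code (chosen for illustration).
G = np.array([[1, 0, 1, 1, 0],
              [0, 1, 0, 1, 1]])

k = G.shape[0]
for u in itertools.product([0, 1], repeat=k):
    c = np.mod(np.array(u) @ G, 2)   # matrix product over GF(2)
    print(u, '->', c)
```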

Generator Matrix Details#

The rows of \(\mathbf{G}\), denoted \(\vec{g}_i\) for \(1 \leq i \leq k\), are codewords corresponding to the \(k\) standard basis vectors of the information space (e.g., \((1,0,\ldots,0)\), etc.), structured as:

\[\begin{split} \mathbf{G} = \begin{bmatrix} \vec{g}_1 \\ \vec{g}_2 \\ \vdots \\ \vec{g}_k \end{bmatrix} \end{split}\]

Thus, a codeword is:

\[ \vec{c}_m = \sum_{i=1}^{k} u_{mi} \vec{g}_i \]

where summation occurs in the Galois Field \(\mathrm{GF}(2)\) (modulo-\(2\) arithmetic), ensuring binary operations.

The code \(\mathcal{C}\) is the row space of \(\mathbf{G}\), encompassing all linear combinations of its rows.

Two LBCs, \(\mathcal{C}_1\) and \(\mathcal{C}_2\), are equivalent if their generator matrices share the same row space, possibly after column permutation, indicating that code properties depend on the subspace, not the specific matrix representation.

Parity Check Bits#

When the generator matrix \(\mathbf{G}\) of a linear block code is structured as:

\[ \mathbf{G} = [\mathbf{I}_k | \mathbf{P}] \]

where \(\mathbf{I}_k\) is a \(k \times k\) identity matrix and \(\mathbf{P}\) is a \(k \times (n - k)\) matrix, the code is classified as systematic.

In systematic codes, the first \(k\) components of a codeword directly correspond to the information sequence, explicitly retaining the original data.

The subsequent \(n - k\) components, termed parity check bits, are generated from the information bits using \(\mathbf{P}\) and provide redundancy to mitigate transmission errors.

It is noteworthy that any linear block code can be converted into an equivalent systematic form of the type above through elementary row operations (row swaps and row additions) together with column permutations; these operations preserve the code's essential characteristics while producing the structured format.
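
One way to carry out this conversion programmatically is Gaussian elimination over GF(2). The sketch below is an illustrative routine (the function name `to_systematic` and its return convention are assumptions, not from the text); it presumes the generator matrix has full row rank, as any valid generator matrix does.

```python
# Sketch (assumes NumPy and a full-row-rank G): reduce a generator matrix to the
# systematic form [I_k | P] by elementary row operations over GF(2), recording
# column swaps whenever a pivot is missing.
import numpy as np

def to_systematic(G):
    """Return (G_sys, column_order) with G_sys = [I_k | P] over GF(2)."""
    G = np.array(G, dtype=int) % 2
    k, n = G.shape
    cols = list(range(n))
    for i in range(k):
        # If no row at or below i has a 1 in column i, swap in a later column
        # that does (a column permutation, giving an equivalent code).
        if not G[i:, i].any():
            j = next(col for col in range(i + 1, n) if G[i:, col].any())
            G[:, [i, j]] = G[:, [j, i]]
            cols[i], cols[j] = cols[j], cols[i]
        # Bring a row with a 1 in column i up to row i.
        pivot = i + int(np.argmax(G[i:, i]))
        G[[i, pivot]] = G[[pivot, i]]
        # Clear column i in every other row by adding row i modulo 2.
        for r in range(k):
            if r != i and G[r, i]:
                G[r] = (G[r] + G[i]) % 2
    return G, cols
```

Applied to a full-rank generator matrix, this returns an equivalent systematic generator; the recorded column order captures the permutation relating the two equivalent codes.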

Dual Code and Parity Check Matrix#

Since \(\mathcal{C}\) constitutes a \(k\)-dimensional subspace within the \(n\)-dimensional binary vector space, its orthogonal complement—consisting of all \(n\)-dimensional binary vectors orthogonal to every codeword in \(\mathcal{C}\)—forms an \((n - k)\)-dimensional subspace.

This subspace defines an \((n, n - k)\) code, denoted \(\mathcal{C}^\perp\), referred to as the dual code of \(\mathcal{C}\).

The generator matrix of \(\mathcal{C}^\perp\), an \((n - k) \times n\) matrix, has rows orthogonal to those of \(\mathbf{G}\), the generator matrix of \(\mathcal{C}\).

This matrix, known as the parity check matrix of \(\mathcal{C}\) and denoted \(\mathbf{H}\), satisfies the condition that for every codeword \(\vec{c} \in \mathcal{C}\):

\[ \vec{c} \mathbf{H}^{\sf T} = \vec{0} \]

This orthogonality property enables \(\mathbf{H}\) to serve as a mechanism for verifying membership in \(\mathcal{C}\), leveraging the algebraic structure of linear codes for error detection.

Parity Check Matrix Properties#

A binary \(n\)-dimensional vector \(\vec{c}\) satisfies \(\vec{c} \mathbf{H}^{\sf T} = \vec{0}\) if and only if it resides in the orthogonal complement of the row space of \(\mathbf{H}\), which corresponds exactly to \(\mathcal{C}\).

Thus, this equation establishes a necessary and sufficient condition for \(\vec{c} \in \{0, 1\}^n\) to be a codeword, rendering \(\mathbf{H}\) an effective tool for codeword identification.
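
This membership test is simple to express in code. The sketch below assumes NumPy and uses the parity check matrix of the \((7, 4)\) code presented in the example later in this section.

```python
# Sketch (illustrative): test codeword membership via the syndrome c H^T over GF(2).
import numpy as np

# Parity check matrix of the (7, 4) code from the example below.
H = np.array([[1, 1, 1, 0, 1, 0, 0],
              [0, 1, 1, 1, 0, 1, 0],
              [1, 1, 0, 1, 0, 0, 1]])

def is_codeword(c, H):
    """True when every parity check is satisfied (all syndrome bits are zero)."""
    return not np.mod(np.array(c) @ H.T, 2).any()

print(is_codeword([1, 0, 0, 0, 1, 0, 1], H))   # True: a valid codeword (first row of G)
print(is_codeword([1, 0, 0, 0, 1, 0, 0], H))   # False: last bit flipped
```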

Given that the rows of \(\mathbf{G}\) are codewords, it follows that:

\[ \mathbf{G} \mathbf{H}^{\sf T} = \vec{0} \]

For systematic codes where \(\mathbf{G} = [\mathbf{I}_k | \mathbf{P}]\), the parity check matrix is expressed as:

\[ \mathbf{H} = [ -\mathbf{P}^{\sf T} | \mathbf{I}_{n-k} ] \]

In the binary field (GF(2)), where modulo-\(2\) arithmetic applies, \(-\mathbf{P}^{\sf T} = \mathbf{P}^{\sf T}\), simplifying \(\mathbf{H}\) to \([ \mathbf{P}^{\sf T} | \mathbf{I}_{n-k} ]\).

This configuration ensures that the product \(\mathbf{G} \mathbf{H}^{\sf T} = \vec{0}\) holds: the identity blocks in \(\mathbf{G}\) and \(\mathbf{H}\) pair \(\mathbf{P}\) with \(\mathbf{P}^{\sf T}\) so that the two resulting terms cancel modulo \(2\), satisfying the orthogonality condition.
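
Writing the block product out makes this cancellation explicit: since \(\mathbf{H}^{\sf T}\) stacks \(\mathbf{P}\) on top of \(\mathbf{I}_{n-k}\),

\[\begin{split} \mathbf{G} \mathbf{H}^{\sf T} = [\mathbf{I}_k | \mathbf{P}] \begin{bmatrix} \mathbf{P} \\ \mathbf{I}_{n-k} \end{bmatrix} = \mathbf{I}_k \mathbf{P} + \mathbf{P} \mathbf{I}_{n-k} = \mathbf{P} + \mathbf{P} = \vec{0} \pmod{2} \end{split}\]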

EXAMPLE: (7, 4) Linear Block Code#

Consider a (7, 4) linear block code with the generator matrix:

\[\begin{split} \mathbf{G} = [\mathbf{I}_4 | \mathbf{P}] = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 \end{bmatrix} \end{split}\]

This structure, featuring a 4×4 identity matrix \(\mathbf{I}_4\) followed by a 4×3 matrix \(\mathbf{P}\), identifies it as a systematic code, where the first four bits of each codeword mirror the information sequence.

The corresponding parity check matrix is:

\[\begin{split} \mathbf{H} = [\mathbf{P}^{\sf T} | \mathbf{I}_{n-k}] = \begin{bmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 & 0 & 0 & 1 \end{bmatrix} \end{split}\]

For an information sequence \(\vec{u} = (u_1, u_2, u_3, u_4)\), the codeword \(\vec{c} = (c_1, c_2, \ldots, c_7)\) is computed as:

\[\begin{split} \begin{align} &c_1 = u_1, \quad c_2 = u_2, \quad c_3 = u_3, \quad c_4 = u_4, \\ &c_5 = u_1 + u_2 + u_3, \quad c_6 = u_2 + u_3 + u_4, \\ &c_7 = u_1 + u_2 + u_4 \end{align} \end{split}\]

Here, the first four components copy the input, while the last three (the parity bits) are linear combinations over GF(2) that provide the error-detection capability; these relations are readily verified with computational tools such as MATLAB.
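
A short Python sketch of this verification (assuming NumPy) encodes every 4-bit information sequence with \(\mathbf{G}\) and confirms that each resulting codeword satisfies \(\vec{c} \mathbf{H}^{\sf T} = \vec{0}\); the matrices are exactly those of the example above.

```python
# Sketch: reproduce the (7, 4) example by encoding all 16 information sequences
# and checking that every codeword has an all-zero syndrome.
import itertools
import numpy as np

G = np.array([[1, 0, 0, 0, 1, 0, 1],
              [0, 1, 0, 0, 1, 1, 1],
              [0, 0, 1, 0, 1, 1, 0],
              [0, 0, 0, 1, 0, 1, 1]])

H = np.array([[1, 1, 1, 0, 1, 0, 0],
              [0, 1, 1, 1, 0, 1, 0],
              [1, 1, 0, 1, 0, 0, 1]])

for u in itertools.product([0, 1], repeat=4):
    c = np.mod(np.array(u) @ G, 2)        # systematic encoding: c = u G over GF(2)
    syndrome = np.mod(c @ H.T, 2)         # all-zero for every valid codeword
    assert not syndrome.any()
    print(u, '->', c)
```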