Coherent DSP Internals

A coherent optical receiver is, structurally, a software-defined radio for light: the analogue front-end captures both quadratures of both polarisations into four parallel ADCs, and a high-speed digital signal processor undoes — in order — every linear impairment the fibre and the optoelectronics imposed on the signal. This chapter walks the receiver block by block, from the 90° optical hybrid through bulk chromatic-dispersion compensation, adaptive equalisation, frequency-offset estimation, carrier-phase recovery, and SD-FEC, and closes with probabilistic constellation shaping (PCS) — the modulation-shaping technique that recovers the last ~1 dB of the gap to Shannon capacity.

| Concept | What it says |
| --- | --- |
| 90° Optical Hybrid + Balanced Detection | Mixes the incoming signal with a local-oscillator laser in two orthogonal phases (0° and 90°) per polarisation, producing four photocurrents whose differences are the in-phase (I) and quadrature (Q) components of each polarisation. Balanced photodiodes cancel direct-detection terms and double the heterodyne-mixing efficiency. |
| Adaptive Equalisation | A 2×2 MIMO FIR filter that simultaneously demultiplexes the two polarisation tributaries and tracks slowly varying PMD and PDL. Most production systems use the constant-modulus algorithm (CMA) for blind start-up, then switch to a decision-directed LMS once the equaliser converges. |
| Carrier-Phase Recovery (CPR) | The block that removes the residual phase mismatch between the transmitter laser and the local oscillator. Modern systems use blind phase search (BPS), which tests many candidate phase rotations and picks the one minimising decision error, or feedforward Viterbi-Viterbi for QPSK-class formats. |
| Probabilistic Constellation Shaping (PCS) | Replaces the uniform symbol distribution of standard QAM with a Maxwell-Boltzmann-shaped distribution that prefers low-amplitude symbols, recovering ~1 dB of OSNR (the “shaping gain”) and enabling continuous rate adaptation between two adjacent QAM orders. |

Receiver Front-End

The optical front-end converts the modulated optical field into four electrical waveforms suitable for digitisation. The chain is:

flowchart LR
    SIGIN["Signal in<br/>(DP-modulated)"] --> PBS["Polarisation<br/>beam splitter"]
    PBS -->|"X-pol"| HX["90° hybrid<br/>(X)"]
    PBS -->|"Y-pol"| HY["90° hybrid<br/>(Y)"]
    LO["Local oscillator<br/>tunable laser<br/>~100 kHz linewidth"] --> SPLIT["LO splitter"]
    SPLIT --> HX
    SPLIT --> HY
    HX -->|"I/Q"| BPDX["Balanced PDs<br/>(X)"]
    HY -->|"I/Q"| BPDY["Balanced PDs<br/>(Y)"]
    BPDX --> ADCX["4-channel ADC<br/>nominally 2x symbol rate"]
    BPDY --> ADCX
    ADCX --> DSP["DSP ASIC"]
    style SIGIN fill:#378ADD,stroke:#185FA5,color:#fff
    style LO fill:#D85A30,stroke:#993C1D,color:#fff
    style HX fill:#1D9E75,stroke:#0F6E56,color:#fff
    style HY fill:#1D9E75,stroke:#0F6E56,color:#fff
    style BPDX fill:#7F77DD,stroke:#534AB7,color:#fff
    style BPDY fill:#7F77DD,stroke:#534AB7,color:#fff
    style ADCX fill:#BA7517,stroke:#854F0B,color:#fff
    style DSP fill:#E24B4A,stroke:#A32D2D,color:#fff

A polarisation beam splitter separates X and Y; each polarisation is mixed with the LO in a 90° hybrid; balanced photodetectors produce four electrical streams (XI, XQ, YI, YQ), which are sampled at nominally 2 samples per symbol (see the 01-optical-physics-and-link-engineering signal-transmission primer for the Nyquist criterion).

The local oscillator is a narrow-linewidth tunable laser, typically an integrated tunable-laser assembly (ITLA) with ~100 kHz linewidth, whose frequency lands within ±2-3 GHz of the signal carrier. The 90° hybrid is a two-input, four-output optical interferometer (commonly a 4×4 MMI coupler on InP or SiPh with two of its inputs used) whose outputs carry the signal and LO at relative phases of 0°, 90°, 180°, and 270°. Pairing the 0°/180° ports into one balanced photodetector and the 90°/270° ports into another extracts I and Q while rejecting the common-mode direct-detection terms.
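The port pairing described above can be checked numerically. The following is a minimal sketch of an ideal, lossless hybrid followed by two balanced photodiode pairs; the function name `hybrid_balanced`, the 1/2 field-splitting factor, and unit responsivity are illustrative assumptions rather than a model of any particular device.

```python
import cmath
import math

def hybrid_balanced(signal: complex, lo: complex) -> complex:
    """Return I + jQ from an ideal 90-degree hybrid with balanced detection."""
    # Output fields: signal plus LO at relative phases 0, 90, 180 and 270 degrees.
    # The 1/2 factor is an assumed lossless four-way field split.
    phases = (0.0, math.pi / 2, math.pi, 3 * math.pi / 2)
    fields = [0.5 * (signal + lo * cmath.exp(1j * p)) for p in phases]
    # Square-law photodetection (unit responsivity assumed).
    power = [abs(e) ** 2 for e in fields]
    i_out = power[0] - power[2]  # 0/180 pair: |S|^2 and |LO|^2 terms cancel
    q_out = power[1] - power[3]  # 90/270 pair: same cancellation
    return complex(i_out, q_out)

sig = 0.3 - 0.4j   # weak signal field (arbitrary units)
lo = 2.0 + 0.0j    # strong local oscillator
iq = hybrid_balanced(sig, lo)
print(iq, sig * lo.conjugate())
```

In this idealised model the balanced outputs equal the signal field multiplied by the conjugate of the LO field, so both amplitude and phase of the signal survive detection while the intensity-only terms drop out.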

Key Insight

Coherent detection extracts both amplitude and phase of the optical field, not just intensity. This is what makes high-order QAM (where information is encoded in phase) viable in the first place — a direct-detect receiver cannot read phase at all.

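The ADC sampling rate is nominally 2 samples per symbol. The Nyquist criterion itself only requires each I/Q lane to be sampled at more than (1 + roll-off) × the symbol rate, so with tight pulse shaping and anti-alias filtering practical designs run at roughly 1.25-2.0 samples per symbol (e.g. 80 GSa/s for 64 Gbaud). ADC effective number of bits (ENOB) sets a ceiling on the SNR the receiver can achieve: every loss of 1 ENOB costs ~6 dB of available SNR headroom.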

| Modulation format | Symbol rate (typical) | ADC sample rate | Required ENOB |
| --- | --- | --- | --- |
| DP-QPSK 100G | 32 Gbaud | ~64 GSa/s | 5-6 |
| DP-16QAM 200G | 32 Gbaud | ~64 GSa/s | 6-7 |
| DP-16QAM 400G | 64 Gbaud | ~96 GSa/s | 6.5-7 |
| DP-64QAM 600G+ | ~70 Gbaud | ~128 GSa/s | 7.5-8 |
| DP-PCS-64QAM 800G | ~96 Gbaud | ~128-160 GSa/s | 8 |
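The ~6 dB-per-bit rule and the oversampling ratios implied by the table can be reproduced with a few lines of arithmetic. This sketch assumes the textbook ideal-quantiser formula for a full-scale sine wave (SNR ≈ 6.02 · ENOB + 1.76 dB); the function names are hypothetical and real ADCs add clipping, jitter, and bandwidth penalties on top.

```python
def ideal_adc_snr_db(enob: float) -> float:
    """Quantisation-limited SNR of an ideal ADC for a full-scale sine wave."""
    return 6.02 * enob + 1.76

def samples_per_symbol(sample_rate_gsa: float, symbol_rate_gbaud: float) -> float:
    """Oversampling ratio of the ADC relative to the symbol rate."""
    return sample_rate_gsa / symbol_rate_gbaud

for enob in (5, 6, 7, 8):
    print(f"ENOB {enob}: ~{ideal_adc_snr_db(enob):.1f} dB")

# (format, Gbaud, GSa/s) triples taken from the table above
rows = (("DP-QPSK 100G", 32, 64), ("DP-16QAM 400G", 64, 96), ("DP-64QAM 600G+", 70, 128))
for name, baud, rate in rows:
    print(f"{name}: {samples_per_symbol(rate, baud):.2f} samples/symbol")
```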

DSP Signal Chain

The DSP processes the four parallel digital streams in a fixed pipeline. Each block is responsible for undoing one specific impairment and feeds its corrected output to the next block.

flowchart LR
    ADC["ADC samples<br/>(XI XQ YI YQ)"] --> SKEW["Skew & IQ<br/>imbalance correction"]
    SKEW --> CD["Bulk CD compensation<br/>frequency-domain FIR<br/>(overlap-save)"]
    CD --> EQ["Adaptive equaliser<br/>2x2 MIMO FIR<br/>CMA --> DD-LMS"]
    EQ --> FOE["Frequency-offset<br/>estimation"]
    FOE --> CPR["Carrier-phase<br/>recovery (BPS / V-V)"]
    CPR --> SYMB["Symbol decoder<br/>QAM demap"]
    SYMB --> FEC["SD-FEC decoder<br/>LDPC"]
    FEC --> CLIENT["Client payload<br/>(OTN / Ethernet)"]
    style ADC fill:#378ADD,stroke:#185FA5,color:#fff
    style SKEW fill:#7F77DD,stroke:#534AB7,color:#fff
    style CD fill:#1D9E75,stroke:#0F6E56,color:#fff
    style EQ fill:#1D9E75,stroke:#0F6E56,color:#fff
    style FOE fill:#BA7517,stroke:#854F0B,color:#fff
    style CPR fill:#BA7517,stroke:#854F0B,color:#fff
    style SYMB fill:#D85A30,stroke:#993C1D,color:#fff
    style FEC fill:#D85A30,stroke:#993C1D,color:#fff
    style CLIENT fill:#378ADD,stroke:#185FA5,color:#fff

Front-End Skew and IQ Imbalance Correction

The first DSP block compensates for fixed analogue impairments in the receiver hardware: timing skew between the four ADC lanes (typically a few picoseconds) and gain or phase imbalance between I and Q. These calibrations are static — measured at factory test or once at boot — and applied as small linear corrections to the sample streams.

Bulk Chromatic-Dispersion Compensation

Chromatic dispersion accumulated over the link can reach roughly 25 000-50 000 ps/nm on a 1500-3000 km G.652.D path (at about 17 ps/(nm·km)), enough to spread a 32 Gbaud pulse across hundreds of symbol periods. CD compensation is a static linear filter with the inverse of the fibre’s CD transfer function:

H_CD(f) = exp(+j · π · D · L · λ² · f² / c)

where D is the fibre dispersion coefficient (ps/(nm·km)), L is the fibre length, λ is the carrier wavelength, c is the speed of light, and f is the baseband frequency; all quantities must be converted to consistent (SI) units before the exponent is evaluated. Because the impulse response is hundreds of taps long, CD compensation is implemented in the frequency domain using overlap-save FFT/IFFT, which becomes the practical approach for tap counts above ~64.
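To make the unit handling in H_CD(f) concrete, the sketch below evaluates the compensation response in SI units and gives a rough estimate of the pulse spread in symbol periods. The 1550 nm default wavelength, the function names, and the simple |D| · L · Δλ spread estimate are assumptions made for illustration.

```python
import cmath
import math

C_M_PER_S = 299_792_458.0  # speed of light in vacuum

def cd_compensation_response(f_hz, d_ps_nm_km, length_km, wavelength_nm=1550.0):
    """H_CD(f) = exp(+j*pi*D*L*lambda^2*f^2/c), evaluated in SI units."""
    d_si = d_ps_nm_km * 1e-6       # ps/(nm*km) -> s/m^2
    length_m = length_km * 1e3
    lam_m = wavelength_nm * 1e-9
    phase = math.pi * d_si * length_m * lam_m ** 2 * f_hz ** 2 / C_M_PER_S
    return cmath.exp(1j * phase)

def pulse_spread_symbols(d_ps_nm_km, length_km, symbol_rate_gbaud, wavelength_nm=1550.0):
    """Rough spread |D|*L*delta_lambda, expressed in symbol periods."""
    lam_m = wavelength_nm * 1e-9
    # Optical bandwidth of one symbol rate, converted from Hz to nm.
    delta_lambda_nm = lam_m ** 2 * symbol_rate_gbaud * 1e9 / C_M_PER_S * 1e9
    spread_ps = abs(d_ps_nm_km) * length_km * delta_lambda_nm
    return spread_ps * symbol_rate_gbaud * 1e-3   # ps * Gbaud = 1e-3 symbols

h = cd_compensation_response(10e9, 17.0, 2000.0)
print(abs(h), pulse_spread_symbols(17.0, 2000.0, 32.0))
```

The response has unit magnitude at every frequency (it is an all-pass phase filter), and flipping the sign of D gives the fibre's own response, so the product of the two is 1.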

| CD-compensation algorithm | Domain | Complexity | Typical use |
| --- | --- | --- | --- |
| Time-domain FIR | Time | O(N · L_tap) per block of N samples | Short reach (< 200 km), low CD |
| Frequency-domain FIR (overlap-save) | Frequency | O(N · log N) per block | Standard for long-haul (> 200 km) |
| Frequency-domain FIR (overlap-add) | Frequency | O(N · log N) per block | Equivalent; vendor preference |
| Half-symbol cyclic-prefix variants | Hybrid | O(N · log N) | OFDM-like coherent (research) |
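The complexity column can be turned into a rough per-sample multiply count. The sketch below assumes radix-2 FFTs costing (N/2) · log2(N) complex multiplies each and counts only multiplies; the function names are hypothetical, and real ASIC implementations use different FFT structures and fixed-point tricks.

```python
import math

def time_domain_mults_per_sample(n_taps: int) -> float:
    """A direct-form FIR needs one complex multiply per tap per output sample."""
    return float(n_taps)

def overlap_save_mults_per_sample(n_taps: int, fft_size: int) -> float:
    """Overlap-save cost per useful output sample, assuming radix-2 FFTs."""
    if fft_size < n_taps:
        raise ValueError("FFT size must be at least the filter length")
    fft_cost = (fft_size / 2) * math.log2(fft_size)
    per_block = 2 * fft_cost + fft_size      # FFT + IFFT + bin-wise multiply
    useful_outputs = fft_size - n_taps + 1   # samples kept from each block
    return per_block / useful_outputs

for taps in (16, 128, 512):
    td = time_domain_mults_per_sample(taps)
    fd = overlap_save_mults_per_sample(taps, 2048)
    print(f"{taps} taps: time-domain {td:.0f}, overlap-save (N=2048) {fd:.1f}")
```

With these assumptions the frequency-domain cost stays nearly flat as the filter grows, while the time-domain cost grows linearly with tap count.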

The CD parameter is either provisioned (link length × known fibre D) or auto-estimated by sweeping a range of candidate CD values and keeping the one that minimises a blind cost function, such as the constant-modulus error measured after the CD filter (the “CMA-after-CD” minimum).

Adaptive Equalisation — Polarisation, PMD, and PDL

After CD compensation, the signal still suffers from time-varying impairments: polarisation rotation in the fibre, polarisation-mode dispersion (PMD, with mean DGD scaling as the PMD coefficient × √L), and polarisation-dependent loss (PDL) at every connector or amplifier. The 2×2 MIMO FIR equaliser tracks all three simultaneously:

[ y_X ]   [ h_XX  h_XY ]   [ x_X ]
[ y_Y ] = [ h_YX  h_YY ] * [ x_Y ]

Each filter h_ij is a complex-valued FIR with 11-31 taps at T/2 spacing, enough to track first-order PMD on modern fibre. Tap updates typically use the Constant Modulus Algorithm (CMA) for blind start-up: CMA needs neither symbol decisions nor a recovered carrier phase, is exact for constant-amplitude QPSK, and still converges well enough on higher-order QAM to open the eye. The equaliser then switches to decision-directed LMS or radius-directed equalisation (RDE) once symbols are reliable enough to use as references.

| Equaliser parameter | Typical value | Purpose / effect |
| --- | --- | --- |
| Tap count (per filter) | 11-31 | First-order PMD |
| Tap spacing | T/2 (half symbol) | Fractional sample timing |
| Update algorithm | CMA → DD-LMS / RDE | Blind init then steady-state |
| Convergence step size | 1e-4 to 1e-3 | Tracking bandwidth ~10-100 kHz |
| PMD tolerance | ~50-100 ps DGD | Higher tap count → higher tolerance |
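The CMA tap update can be illustrated with a deliberately reduced model. The sketch below uses a single tap per filter (so it demultiplexes a polarisation rotation but has no memory for PMD), noise-free QPSK at one sample per symbol, and an illustrative step size of 0.01; the function name `cma_2x2` and all parameter values are assumptions, not a production configuration.

```python
import cmath
import math
import random

def cma_2x2(x_pol, y_pol, mu=0.01):
    """Single-tap 2x2 CMA; returns final weights and the equalised output pairs."""
    w = [[1 + 0j, 0j], [0j, 1 + 0j]]   # identity start
    out = []
    for xx, xy in zip(x_pol, y_pol):
        yx = w[0][0] * xx + w[0][1] * xy
        yy = w[1][0] * xx + w[1][1] * xy
        ex = 1.0 - abs(yx) ** 2        # constant-modulus error, X output
        ey = 1.0 - abs(yy) ** 2        # constant-modulus error, Y output
        # Stochastic-gradient update of each tap.
        w[0][0] += mu * ex * yx * xx.conjugate()
        w[0][1] += mu * ex * yx * xy.conjugate()
        w[1][0] += mu * ey * yy * xx.conjugate()
        w[1][1] += mu * ey * yy * xy.conjugate()
        out.append((yx, yy))
    return w, out

rng = random.Random(0)
qpsk = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]
s1 = [rng.choice(qpsk) for _ in range(3000)]
s2 = [rng.choice(qpsk) for _ in range(3000)]
theta = 0.4                            # fibre polarisation rotation (rad)
rx_x = [math.cos(theta) * a + math.sin(theta) * b for a, b in zip(s1, s2)]
rx_y = [-math.sin(theta) * a + math.cos(theta) * b for a, b in zip(s1, s2)]

w, out = cma_2x2(rx_x, rx_y)
residual = max(abs(1.0 - abs(yx) ** 2) for yx, _ in out[-100:])
# Remaining leakage of tributary 2 into the X output after convergence.
crosstalk = abs(w[0][0] * math.sin(theta) + w[0][1] * math.cos(theta))
print(residual, crosstalk)
```

Because the cost function only looks at the output modulus, the recovered tributaries keep an arbitrary phase rotation, which is why carrier recovery follows the equaliser in the chain.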

Warning

The equaliser is the single most fragile DSP block in operational systems. CMA convergence can land on a local minimum where both equaliser outputs lock onto the same input polarisation (the “singularity” failure) — modern transponders deliberately initialise with a small spectral pre-rotation to break the symmetry.

Frequency-Offset Estimation (FOE)

Transmitter and local-oscillator lasers are independent free-running ITLAs, so a 100-200 MHz residual frequency offset between them is normal. FOE estimates this offset, either from the 4th-power spectrum of QPSK-class signals (which collapses modulation phase to a single tone at 4× the offset) or from pilot tones in PCS systems, and counter-rotates the samples digitally. Without FOE, the constellation would rotate continuously about its centre, smearing every point into a ring.
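The 4th-power idea can be sketched in a few lines. Note that this example uses a time-domain phase-increment variant of the 4th-power method rather than a spectral peak search, and assumes noise-free QPSK at one sample per symbol; the function name and the 150 MHz test offset are illustrative. With symbol-spaced samples the estimate is unambiguous only for offsets below one eighth of the symbol rate.

```python
import cmath
import math
import random

def estimate_offset_hz(symbols, symbol_rate_hz):
    """Estimate carrier frequency offset from 4th-power phase increments."""
    acc = 0j
    for prev, cur in zip(symbols, symbols[1:]):
        # Raising to the 4th power strips QPSK modulation; the product
        # leaves a phasor at 4x the per-symbol phase increment.
        acc += cur ** 4 * (prev ** 4).conjugate()
    return cmath.phase(acc) / 4.0 * symbol_rate_hz / (2 * math.pi)

rng = random.Random(0)
symbol_rate = 32e9
true_offset = 150e6
qpsk = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]
rx = [rng.choice(qpsk) * cmath.exp(2j * math.pi * true_offset * n / symbol_rate)
      for n in range(1000)]

est = estimate_offset_hz(rx, symbol_rate)
# Counter-rotate the samples with the estimated offset.
derotated = [r * cmath.exp(-2j * math.pi * est * n / symbol_rate)
             for n, r in enumerate(rx)]
print(est)
```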

Carrier-Phase Recovery (CPR)

CPR removes the slowly-varying residual phase between the lasers and the laser-linewidth-induced phase noise. Two algorithms dominate:

| Algorithm | Principle | Best for |
| --- | --- | --- |
| Viterbi-Viterbi (V-V) | 4th-power feedforward (the same trick as FOE, but with a sliding window) | QPSK / DP-QPSK class formats |
| Blind Phase Search (BPS) | Try N candidate phase rotations, decide each, pick rotation minimising decision error | High-order QAM (16/32/64-QAM, PCS) |

BPS dominates 200G+ designs because it scales gracefully to high QAM orders, where V-V’s 4th-power operation no longer strips the modulation cleanly. Typical BPS uses 16-64 trial phases per symbol, parallelised across many symbols per cycle in the ASIC.
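A minimal block-wise version of blind phase search looks like the sketch below. It assumes noise-free 16-QAM, a single block rather than a sliding window, 32 trial phases spanning one quadrant, and no phase unwrapping between blocks; the names `decide` and `blind_phase_search` are hypothetical.

```python
import cmath
import math
import random

LEVELS = (-3, -1, 1, 3)
QAM16 = [complex(i, q) for i in LEVELS for q in LEVELS]

def decide(z: complex) -> complex:
    """Hard decision: nearest 16-QAM constellation point."""
    return min(QAM16, key=lambda p: abs(z - p))

def blind_phase_search(block, n_test=32):
    """Return the trial phase in [0, pi/2) that minimises decision error."""
    best_phase, best_cost = 0.0, float("inf")
    for b in range(n_test):
        phi = b / n_test * (math.pi / 2)
        rot = cmath.exp(-1j * phi)
        # Sum of squared distances to the nearest point after de-rotation.
        cost = sum(abs(r * rot - decide(r * rot)) ** 2 for r in block)
        if cost < best_cost:
            best_phase, best_cost = phi, cost
    return best_phase

rng = random.Random(1)
tx = [rng.choice(QAM16) for _ in range(64)]
true_phase = 0.2
rx = [s * cmath.exp(1j * true_phase) for s in tx]
est = blind_phase_search(rx)
print(est)
```

The estimate is quantised to the trial-phase grid and carries the usual four-fold (π/2) ambiguity of square QAM, which a real receiver resolves with differential coding or pilots.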

Symbol Decoder and SD-FEC

After phase recovery, complex symbols are mapped to bits via Gray-coded QAM demapping, producing log-likelihood ratios (LLRs) for each bit. The LLRs feed a Soft-Decision Forward Error Correction decoder: modern systems use LDPC codes with iterative belief-propagation decoding, achieving net coding gains (NCG) of 11-12 dB at pre-FEC BER thresholds of ~1.2e-2. Compare this with the older G.709 hard-decision RS(255,239) code, at NCG ≈ 6 dB and an 8.5e-5 threshold.
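Soft demapping can be illustrated on one dimension of 16-QAM, which is a Gray-mapped 4-PAM signal. The sketch below uses the common max-log approximation; the specific Gray labelling, the LLR sign convention (positive means bit 0 is more likely), and the function name are assumptions for illustration.

```python
# Gray-mapped 4-PAM levels (one dimension of 16-QAM): level -> (b0, b1).
# This particular labelling is an assumed example.
GRAY_4PAM = {-3: (0, 0), -1: (0, 1), 1: (1, 1), 3: (1, 0)}

def max_log_llrs(r: float, noise_var: float):
    """Max-log LLRs, ln P(b=0)/P(b=1), for the two bits of one 4-PAM sample.

    noise_var is the Gaussian noise variance per real dimension.
    """
    llrs = []
    for bit in range(2):
        # Squared distance to the closest level carrying a 0 / a 1 in this bit.
        d0 = min((r - lvl) ** 2 for lvl, bits in GRAY_4PAM.items() if bits[bit] == 0)
        d1 = min((r - lvl) ** 2 for lvl, bits in GRAY_4PAM.items() if bits[bit] == 1)
        llrs.append((d1 - d0) / (2.0 * noise_var))
    return llrs

for sample in (-3.0, -0.2, 0.0, 2.5):
    print(sample, max_log_llrs(sample, noise_var=0.5))
```

A sample sitting exactly on a decision boundary yields an LLR of zero for the affected bit, which is the information a hard-decision decoder throws away.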

Key Insight

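SD-FEC is the single largest reach-extending innovation in coherent optics. The 5-6 dB of additional coding gain over hard-decision FEC lowers the required OSNR by the same amount. Because OSNR on an amplified, ASE-noise-limited link falls by about 3 dB for every doubling of the span count, that corresponds to roughly three to four times the reach at a fixed modulation format, or alternatively to about one full modulation step (e.g. DP-16QAM where DP-QPSK was previously needed).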

Probabilistic Constellation Shaping (PCS)

In standard QAM, all constellation points have equal probability. Shannon’s capacity theorem says this is suboptimal for an AWGN channel: the capacity-achieving distribution is continuous Gaussian. PCS approximates this by drawing symbols from a discrete Maxwell-Boltzmann distribution that prefers low-amplitude (inner) constellation points and de-emphasises high-amplitude (outer) corners. The result is a roughly Gaussian-shaped output that recovers about 1 dB of the gap to Shannon — the shaping gain.

xychart-beta
    title "Symbol probability across 64-QAM amplitudes (uniform vs PCS)"
    x-axis "Symbol amplitude bin (1=innermost, 8=outermost)" [1, 2, 3, 4, 5, 6, 7, 8]
    y-axis "Probability" 0 --> 0.30
    bar "Uniform 64-QAM" [0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125]
    bar "PCS-64QAM (Maxwell-Boltzmann)" [0.26, 0.21, 0.16, 0.12, 0.09, 0.07, 0.05, 0.04]

Two practical consequences make PCS the default for 400G+ pluggable optics:

  1. ~1 dB shaping gain at any given constellation order. A PCS-64QAM at the right shaping factor outperforms uniform-32QAM at the same average power.
  2. Continuous rate adaptation. By varying the Maxwell-Boltzmann temperature parameter, the average number of bits per symbol can be tuned continuously between two adjacent QAM orders, for example smoothly between PDM-QPSK (4 bits per dual-polarisation symbol) and PDM-16QAM (8 bits per dual-polarisation symbol). A single transponder design can therefore serve a wide range of OSNR / reach combinations without changing modulation in discrete steps.

| PCS shaping point | Effective bits/symbol (per polarisation) | OSNR requirement (dB, @ 0.1 nm) | Approx. reach (G.652.D, 75 GHz flex-grid slot, 400G class) |
| --- | --- | --- | --- |
| PCS-16QAM (light) | ~3.5 | ~14 | ~3000 km |
| PCS-16QAM (uniform-eq) | 4.0 | ~16 | ~2000 km |
| PCS-64QAM (mid) | ~5.5 | ~21 | ~1500 km |
| PCS-64QAM (uniform-eq) | ~6.0 | ~25 | ~600-800 km |
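The continuous rate adaptation described above follows directly from the Maxwell-Boltzmann form P(x) ∝ exp(-ν · |x|²). The sketch below computes the entropy and mean power of a shaped 64-QAM constellation for a few values of the shaping parameter ν; the chosen ν values and function names are illustrative, and the entropy is an upper bound on bits per symbol per polarisation before FEC overhead is subtracted.

```python
import math

LEVELS = (-7, -5, -3, -1, 1, 3, 5, 7)
QAM64 = [complex(i, q) for i in LEVELS for q in LEVELS]

def maxwell_boltzmann(nu: float):
    """Symbol probabilities P(x) proportional to exp(-nu * |x|^2)."""
    weights = [math.exp(-nu * abs(x) ** 2) for x in QAM64]
    total = sum(weights)
    return [w / total for w in weights]

def entropy_bits(probs):
    """Shannon entropy of the symbol distribution, in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def mean_power(probs):
    """Average symbol energy under the given distribution."""
    return sum(p * abs(x) ** 2 for p, x in zip(probs, QAM64))

for nu in (0.0, 0.01, 0.03, 0.06, 0.10):
    p = maxwell_boltzmann(nu)
    print(f"nu={nu:.2f}  entropy={entropy_bits(p):.2f} bit/symbol  "
          f"mean power={mean_power(p):.1f}")
```

Setting ν = 0 reproduces uniform 64-QAM at 6 bits per symbol; increasing ν lowers both the entropy and the mean power smoothly, which is the knob a PCS transponder turns to trade rate against reach.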

Rule of Thumb

PCS is a “free” 1 dB. Any serious 400G+ design uses PCS or an equivalent signal-shaping technique (geometric constellation shaping is the close cousin). Any comparison of PCS-N-QAM against uniform N-QAM at the same average power that does not account for the shaping gain is wrong.

Real-Time vs Offline DSP

In the research literature, DSP algorithms are evaluated in MATLAB / Python on offline-captured samples. In the field, every DSP block must run in real time, in parallel, on ASIC silicon, at the symbol rate (32-96 Gbaud → trillions of multiply-accumulates per second). This forces several engineering compromises that academic algorithms do not face:

| Constraint | Offline DSP | Real-time DSP |
| --- | --- | --- |
| FFT size for CD comp | Unlimited (millions of samples) | Limited to ~2048-4096 (latency, area) |
| Equaliser tap count | Unlimited | ~21-31 (ASIC area scales with tap × parallelism) |
| Algorithm choice | Optimal for SNR | Lowest-complexity that meets the OSNR target |
| Iteration count | High | Bounded by clock cycles per symbol |
| ASIC power | Irrelevant | Dominant; drives form-factor selection |
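The "trillions of multiply-accumulates per second" figure can be checked with simple arithmetic for the adaptive equaliser alone. This sketch assumes the 2×2 MIMO FIR produces one output pair per symbol (so each of its four filters runs `taps` complex MACs per symbol) and that one complex MAC costs four real multiplies; the function name and the 64 Gbaud / 21-tap operating point are illustrative.

```python
def equaliser_macs_per_second(symbol_rate_baud: float, taps: int, filters: int = 4) -> float:
    """Complex MACs per second for a 2x2 MIMO FIR with one output pair per symbol."""
    return filters * taps * symbol_rate_baud

complex_macs = equaliser_macs_per_second(64e9, 21)
real_mults = 4 * complex_macs   # one complex multiply = four real multiplies
print(f"{complex_macs:.3e} complex MAC/s, {real_mults:.3e} real multiplies/s")
```

Even this single block lands in the trillions-per-second range before CD compensation, carrier recovery, and FEC decoding are counted.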

ASIC power for a 400G coherent DSP is roughly 6-10 W today (5 nm node); 800G roughly doubles the symbol rate and pushes this to 12-20 W; co-packaged optics (see 09-photonic-integration-and-pluggable-optics) targets sub-15 W for 800G to fit pluggable thermal envelopes.

Summary

Coherent reception is a software-defined chain of linear-impairment removal: front-end calibration → bulk CD → 2×2 MIMO equalisation for polarisation/PMD/PDL → frequency-offset estimation → carrier-phase recovery → QAM demap → SD-FEC. PCS layers on top of QAM to recover the last dB toward Shannon and enables fine-grained rate adaptation. The whole chain must run in real time on power-budgeted ASIC silicon — and that constraint, more than the algorithms themselves, drives the modulation/symbol-rate/form-factor tradeoffs of every modern coherent product.

References

Standards (ITU-T / OIF)

  1. ITU-T G.709/Y.1331, Interfaces for the Optical Transport Network (06/2020). https://www.itu.int/rec/T-REC-G.709
  2. ITU-T G.975.1, Forward error correction for high bit-rate DWDM submarine systems (02/2004). https://www.itu.int/rec/T-REC-G.975.1
  3. ITU-T G.798, Characteristics of optical transport network hierarchy equipment functional blocks (12/2017). https://www.itu.int/rec/T-REC-G.798
  4. OIF 400ZR Implementation Agreement (OIF-400ZR-01.0). https://www.oiforum.com/documents/

Books and tutorial articles

  1. S. J. Savory, “Digital Coherent Optical Receivers: Algorithms and Subsystems,” IEEE J. Sel. Top. Quantum Electron. 16, 1164 (2010).
  2. K. Kikuchi, “Fundamentals of Coherent Optical Fiber Communications,” J. Lightwave Technol. 34, 157 (2016).
  3. G. P. Agrawal, Fiber-Optic Communication Systems, 5th ed., Wiley, 2021.

Papers

  1. T. Pfau, S. Hoffmann, R. Noé, “Hardware-efficient coherent digital receiver concept with feedforward carrier recovery for M-QAM constellations,” J. Lightwave Technol. 27, 989 (2009). [Blind Phase Search]
  2. D. Godard, “Self-recovering equalization and carrier tracking in two-dimensional data communication systems,” IEEE Trans. Commun. 28, 1867 (1980). [CMA]
  3. F. Buchali, F. Steiner, G. Böcherer, et al., “Rate Adaptation and Reach Increase by Probabilistically Shaped 64-QAM: An Experimental Demonstration,” J. Lightwave Technol. 34, 1599 (2016).
  4. J. Cho et al., “Probabilistic Constellation Shaping for Optical Fiber Communications,” J. Lightwave Technol. 37, 1590 (2019).
  5. K. Roberts et al., “Beyond 100 Gb/s: Capacity, Flexibility, and Network Optimization,” J. Lightwave Technol. 35, 1 (2017).