IP Multicast and MVPN in Service Provider Networks

IP multicast replaces N-way unicast replication with a single source stream that routers duplicate only where paths diverge, turning linear bandwidth growth into tree-shaped growth [16]. This chapter covers the fundamentals (Class D addressing [5], IGMP [4], PIM variants [1, 2]), the MVPN service models that extend multicast into customer L3VPNs — Draft-Rosen [9] and the modern BGP/MPLS MVPN per RFC 6513/6514 [7, 8] — and the real-world workloads (IPTV, financial market data, software distribution) that depend on these mechanisms.

Cards: 4

IGMP

Host-to-router signalling. End devices declare “I want group G” to their local router; leaves and joins drive edge replication state.

PIM

Router-to-router multicast routing. Builds distribution trees between routers in three flavours: Dense Mode, Sparse Mode, and Source-Specific Multicast.

Draft-Rosen MVPN

First-generation customer-multicast-over-VPN. GRE tunnels between PEs carry customer PIM. Simple, but every PE joins the default MDT whether or not it has receivers.

BGP/MPLS MVPN

Modern RFC 6513/6514 design. MP-BGP mvpn address family signals state; P2MP MPLS LSPs (built by mLDP or RSVP-TE) replicate in the data plane. Only interested PEs join the tree.

Content

Why Multicast Exists: The Replication Problem

Consider an IPTV operator streaming a live match to 100,000 subscribers at 5 Mbps each. Unicast delivery means the head-end originates 100,000 independent copies of the same content, demanding roughly 500 Gbps of egress bandwidth from a single source. This scales linearly with subscriber count and collapses the moment a popular event arrives.

Multicast inverts the model. The source emits one 5 Mbps stream into a distribution tree. Branch routers duplicate packets only at divergence points — where one downstream interface leads to two or more receiver segments. The edge router delivers a single copy per interested local listener. Total backbone carriage is bounded by tree shape, not by receiver count.

Highlight: Key Insight Multicast turns an O(N) replication problem into an O(branching factor) problem. A stream to 100,000 receivers costs the backbone the same as a stream to 100, provided the tree shape is similar.

| Delivery model | Source egress | Backbone carriage | Receiver scaling | Typical use |
|---|---|---|---|---|
| Unicast | N x stream | N x stream | Linear (O(N)) | Web, VoD, most apps |
| Multicast | 1 x stream | ~tree x stream | Near-constant | IPTV, market data |
| Broadcast | 1 x stream | Floods everywhere | Domain-wide | L2 discovery only |

IP Multicast Fundamentals

Multicast destination addresses live in 224.0.0.0/4 (the former Class D range) [5]. A multicast group is identified by its group address — for example, 232.1.1.1 might represent “Channel 5 HD” [2, 5]. Any receiver that joins the group receives traffic sent to it; any source that transmits to it contributes packets into the tree. There is no concept of “owning” a group at the IP layer — membership is open, subject to network policy [5, 6].

Two protocols cooperate to deliver multicast end-to-end. IGMP operates between hosts and their first-hop router [4]. PIM operates between routers across the network [1]. IGMP builds edge state; PIM stitches those edges into a network-wide tree.

graph LR
    Source["Multicast<br/>Source<br/>S"]
    R1["R1<br/>(ingress)"]
    R2["R2<br/>(branch)"]
    R3["R3<br/>(branch)"]
    E1["Edge 1"]
    E2["Edge 2"]
    E3["Edge 3"]
    H1["Host A<br/>(IGMP join G)"]
    H2["Host B<br/>(IGMP join G)"]
    H3["Host C<br/>(IGMP join G)"]
    Source --> R1
    R1 -- "1 copy" --> R2
    R2 -- "1 copy" --> E1 --> H1
    R2 -- "1 copy" --> R3
    R3 -- "1 copy" --> E2 --> H2
    R3 -- "1 copy" --> E3 --> H3

One source stream reaches three receivers; routers replicate only at branching points R2 and R3.

IGMP: Host Membership

IGMP (Internet Group Management Protocol) is how an end-host signals local group interest [4]. When a subscriber’s set-top box tunes to a channel, it issues an IGMP Join for the corresponding group [4]. The first-hop router installs the host’s interface in its local outgoing interface list (OIL) for that group. When the host leaves or its query timer expires, the interface is pruned [4].

IGMP has no awareness of the network beyond the host’s LAN. It is the trigger that tells the first-hop router “pull this group toward me” — PIM then handles the pull across the backbone.
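
On the first-hop router, little IGMP-specific configuration is usually needed beyond enabling multicast on the interface. A minimal IOS-XE sketch (interface name is illustrative):

interface GigabitEthernet0/0/1
 description Subscriber-facing LAN segment
 ! Enabling PIM on the interface also activates the IGMP querier toward hosts
 ip pim sparse-mode
 ! IGMPv3 lets hosts report source-specific (S,G) interest, needed for SSM
 ip igmp version 3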

PIM: Building the Tree

PIM (Protocol Independent Multicast) is the routing protocol that constructs distribution trees between routers [1]. “Protocol independent” means PIM uses whatever unicast routing table is already present (IGP, BGP) to perform its reverse-path forwarding checks — it does not run its own routing computation [1]. Three operating modes exist, and SP relevance varies sharply [1, 2]:

| PIM Mode | Tree Construction | Scalability | SP Relevance |
|---|---|---|---|
| Dense Mode | Flood to all PIM neighbours, prune back where no receivers exist | Poor (floods first) | Never used in SP cores; suits small LANs only |
| Sparse Mode | Receivers explicitly join an RP-rooted shared tree; optional SPT switchover | Good | Used in enterprise and some legacy SP MVPN |
| SSM | Receivers join a specific (S,G) directly; no Rendezvous Point involvement | Best | Preferred for IPTV and new SP deployments |

Highlight: Warning PIM Dense Mode assumes receivers are everywhere and floods first, pruning later. In an SP backbone this produces exactly the bandwidth waste multicast is meant to avoid. Dense Mode is almost never correct outside small isolated LANs.

PIM Sparse Mode (PIM-SM) uses a Rendezvous Point (RP) as a well-known meeting place [1]. Receivers send (*,G) joins toward the RP, building a shared tree [1]. Sources register with the RP, which forwards traffic down the shared tree [1]. Once traffic is flowing, receivers may switch to a source-specific shortest-path tree (S,G) for efficiency [1]. PIM-SM works, but it requires RP management (anycast RP, MSDP, static vs auto-RP) which adds operational overhead.
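
A minimal static-RP sketch for PIM-SM on IOS-XE (RP address illustrative; production designs usually pair anycast RP with MSDP for redundancy):

! The 'distributed' keyword is required on some IOS-XE platforms
ip multicast-routing distributed
! Every PIM-SM router must agree on the RP for a given group range
ip pim rp-address 10.0.0.1
!
interface GigabitEthernet0/0/0
 ip pim sparse-mode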

PIM Source-Specific Multicast (PIM-SSM) skips the RP entirely [2]. Hosts signal the specific (S,G) they want (via IGMPv3 source-filtering or statically configured mappings) [3, 4]. Routers build the shortest-path tree rooted at the source directly [2]. Most modern IPTV deployments use SSM in the 232.0.0.0/8 range precisely because there is no RP to operate [2, 5]. Signalling is simpler, failure modes are narrower, and scaling is cleaner.
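
SSM configuration is notably terse because there is no RP to define. A sketch assuming the default 232.0.0.0/8 SSM range:

ip multicast-routing distributed
! Treat 232.0.0.0/8 as SSM: no RP, no shared trees, (S,G) joins only
ip pim ssm default
!
interface GigabitEthernet0/0/0
 ip pim sparse-mode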

stateDiagram-v2
    [*] --> NoInterest
    NoInterest --> JoinPending: IGMP Report<br/>received on LAN
    JoinPending --> OnSharedTree: PIM (*,G) Join<br/>to RP (PIM-SM)
    JoinPending --> OnShortestPathTree: PIM (S,G) Join<br/>to source (SSM)
    OnSharedTree --> OnShortestPathTree: SPT-switch<br/>threshold
    OnShortestPathTree --> NoInterest: IGMP Leave<br/>or timeout
    OnSharedTree --> NoInterest: IGMP Leave<br/>or timeout

Multicast VPN (MVPN): Customer Multicast Across an SP Core

A customer who runs multicast inside their own L3VPN — for example, a financial firm delivering real-time market data from a head-office feed handler to branch-office trading desks — needs those groups to traverse the SP backbone with the same isolation guarantees that the L3VPN unicast plane already provides [7]. Simply flooding customer multicast into the global table would break everything: isolation, scaling, and policy.

MVPN is the set of mechanisms that let an SP carry customer multicast across the provider MPLS core [7]. Two generations exist in production networks today [7, 15].

Generation 1: Draft-Rosen MVPN

The first MVPN design (often called “Rosen draft” after the lead author) predates MPLS P2MP LSPs [9]. It glues customer multicast to the core using GRE tunnels between PEs [9]:

  1. Every PE in a given MVPN joins a default MDT (Multicast Distribution Tree) — effectively a group address in the provider’s global multicast space [9].
  2. GRE tunnels between PEs carry PIM signalling and customer multicast data [9].
  3. Inside the GRE tunnel, the PEs run PIM as if they were directly adjacent [9].
  4. High-bandwidth groups can migrate to a data MDT to avoid flooding the default MDT [9].

Highlight: Warning Draft-Rosen’s scaling limitation is structural: every PE belonging to the MVPN participates in the default MDT whether or not it has any local receivers [9]. A 200-PE MVPN with two interested PEs still burns state and link bandwidth on the other 198. This is why modern deployments use BGP/MPLS MVPN [7, 8].
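
Draft-Rosen health is typically checked from the PE's VRF-scoped PIM state. An illustrative IOS-XE sketch (VRF name reused from the configuration later in this chapter):

! Remote PEs appear as PIM neighbours over the MDT tunnel interface
show ip pim vrf CUST-A neighbor
! Data MDTs this PE has initiated (high-rate groups moved off the default MDT)
show ip pim vrf CUST-A mdt send
! Data MDTs this PE has joined as a receiver
show ip pim vrf CUST-A mdt receive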

Generation 2: BGP/MPLS MVPN (RFC 6513/6514)

The modern design eliminates GRE tunnels and the “every PE joins” problem [7]. Two planes do the work:

  • Control plane: MP-BGP gains a new address family, ipv4 mvpn (and ipv6 mvpn) [8]. PEs exchange multicast state as BGP routes using the same RD/RT machinery that already isolates and targets L3VPN unicast routes [7, 8].
  • Data plane: Customer multicast rides P2MP MPLS LSPs, which branch in the core and replicate at LSR branch points [7]. These LSPs are typically built by mLDP (multicast LDP extensions) [10] or by RSVP-TE P2MP when the TE infrastructure is already in place [11].
graph TD
    subgraph "Customer VRF CUST-A"
        CSrc["C-Source<br/>(market data feed)"]
        CRcv1["C-Receiver 1<br/>(branch trading desk)"]
        CRcv2["C-Receiver 2<br/>(branch trading desk)"]
    end

    subgraph "SP MPLS Core"
        PE_Ingress["Ingress PE<br/>(C-Source attached)"]
        P1["P (branch LSR)"]
        PE_Egress1["Egress PE 1"]
        PE_Egress2["Egress PE 2"]
    end

    CSrc --> PE_Ingress
    PE_Ingress -- "P2MP LSP<br/>(transport + MVPN label)" --> P1
    P1 -- "replicate" --> PE_Egress1
    P1 -- "replicate" --> PE_Egress2
    PE_Egress1 --> CRcv1
    PE_Egress2 --> CRcv2

    PE_Ingress -. "MP-BGP mvpn<br/>Type 5 Source Active A-D" .-> PE_Egress1
    PE_Ingress -. "MP-BGP mvpn<br/>Type 5 Source Active A-D" .-> PE_Egress2
    PE_Egress1 -. "MP-BGP mvpn<br/>Type 7 Source Tree Join" .-> PE_Ingress
    PE_Egress2 -. "MP-BGP mvpn<br/>Type 7 Source Tree Join" .-> PE_Ingress

BGP-MVPN Route Types (relevant to forwarding)

| Type | Name | Originator | Purpose |
|---|---|---|---|
| 1 | Intra-AS I-PMSI A-D | Every PE in the MVPN | Discover other PEs belonging to the same MVPN |
| 2 | Inter-AS I-PMSI A-D | ASBR PEs | Inter-AS PE discovery |
| 3 | S-PMSI A-D | Ingress PE | Bind a specific (C-S,C-G) to a provider tunnel |
| 4 | Leaf A-D | Egress PE or ASBR | Respond to an S-PMSI A-D or Inter-AS I-PMSI A-D route: “include me in the tree for this source/group” |
| 5 | Source Active A-D | Ingress PE | Advertise “a customer source is active here” |
| 6 | Shared Tree Join | Egress PE | C-multicast Shared Tree Join (C-*,C-G) carried via BGP |
| 7 | Source Tree Join | Egress PE | C-multicast Source Tree Join (C-S,C-G) carried via BGP |

Route types 5 and 7 drive the two-party conversation most operators care about: the ingress PE advertises “a live customer source for (C-S,C-G) sits behind me” (Source Active A-D), and each egress PE with interested receivers answers with a Source Tree Join [8]. Types 3 and 4 handle the tunnel binding: the ingress PE binds the flow to a provider tunnel with an S-PMSI A-D route, and egress PEs reply with Leaf A-D routes when the tunnel type requires explicit leaf tracking [8].
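
These route types can be inspected directly on a PE. An illustrative IOS-XE check (output format varies by release):

! All MVPN routes known to this PE, keyed by route type and RD
show bgp ipv4 mvpn all
! Restrict the view to one customer VRF
show bgp ipv4 mvpn vrf CUST-A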

End-to-End Forwarding Sequence
sequenceDiagram
    participant CRcv as C-Receiver
    participant PEe as Egress PE (VRF)
    participant BGP as MP-BGP (mvpn AF)
    participant PEi as Ingress PE (VRF)
    participant Core as P2MP MPLS LSP
    participant CSrc as C-Source

    CSrc->>PEi: Customer multicast (C-S,C-G)
    PEi->>BGP: Type 5 Source Active A-D
    BGP->>PEe: Propagate Type 5
    CRcv->>PEe: PIM (*,G) or IGMP Join
    PEe->>BGP: Type 7 Source Tree Join (C-S,C-G)
    BGP->>PEi: Deliver Source Tree Join
    PEi->>Core: Build / graft P2MP LSP leaf
    CSrc->>PEi: Data packet
    PEi->>Core: Push [transport label | MVPN label]
    Core->>PEe: Replicate at branches, deliver
    PEe->>CRcv: Pop labels, IP multicast forward in VRF

Label Stack on the Wire

| Layer | Label role | Installed by | Swapped by |
|---|---|---|---|
| Outer (top) | Transport label | mLDP / RSVP-TE P2MP | Core LSRs |
| Inner (bottom) | MVPN / VPN label | MP-BGP (mvpn AF) | Unchanged |
| Payload | Customer IP mcast | Customer | Unchanged |

The inner label identifies the MVPN context at the egress PE so it can forward into the correct customer VRF [7, 8]. The outer label drives the P2MP LSP through the core and is popped at the penultimate hop, same as unicast MPLS [10, 11].
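
When the P2MP LSPs are mLDP-built, the replication state is observable hop by hop. An illustrative IOS-XE check:

! P2MP/MP2MP FECs this LSR participates in, with upstream and downstream branches
show mpls mldp database
! Label forwarding entries; P2MP entries list one outgoing label per branch
show mpls forwarding-table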

Configuration: BGP-MVPN on a PE (Cisco IOS-XE)
vrf definition CUST-A
 rd 65000:100
 address-family ipv4
  mdt default 239.1.1.1
  mdt data    232.1.1.0 0.0.0.255 threshold 10
  route-target export 65000:100
  route-target import 65000:100
 exit-address-family
!
router bgp 65000
 address-family ipv4 mvpn
  neighbor 10.0.0.100 activate

| Stanza | Purpose |
|---|---|
| rd 65000:100 | Route distinguisher: the same RD the L3VPN unicast plane already uses |
| mdt default 239.1.1.1 | Default MDT group (PIM/GRE-based, Profile 0) for low-rate customer multicast |
| mdt data 232.1.1.0 0.0.0.255 threshold 10 | Pool of data MDT groups a flow is switched to when it exceeds 10 kbps |
| route-target export/import | Same RT machinery as unicast L3VPN; mvpn routes inherit VPN membership |
| address-family ipv4 mvpn + neighbor | Enable the mvpn AF toward RR/PE peers to carry MVPN route types 1-7 |
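
The configuration above carries customer multicast over the PIM/GRE default MDT (Profile 0 in Cisco's profile numbering [14]). Moving the data plane to mLDP P2MP LSPs is, on IOS-XE, largely a change to the mdt stanzas. A minimal sketch, assuming an mLDP-capable core (root address and data-MDT count are illustrative; exact syntax varies by release and profile [14]):

vrf definition CUST-A
 rd 65000:100
 address-family ipv4
  ! BGP auto-discovery of member PEs replaces PIM neighbour discovery over GRE
  mdt auto-discovery mldp
  ! Default MDT becomes an mLDP MP2MP LSP rooted at 10.0.0.1
  mdt default mpls mldp 10.0.0.1
  ! Pool of up to 100 mLDP P2MP data MDTs for high-rate flows
  mdt data mpls mldp 100
  route-target export 65000:100
  route-target import 65000:100
 exit-address-family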

Highlight: Key Insight The elegance of BGP/MPLS MVPN is that it reuses the L3VPN control plane wholesale [7, 8]. Same RD, same RT, same BGP neighbours, same P2MP LSP fabric that traffic engineering already built. There is no parallel overlay — multicast is just another address family in the same VPN [14].

Why BGP/MPLS MVPN Wins in Modern Networks

| Property | Draft-Rosen | BGP/MPLS MVPN (RFC 6513/6514) |
|---|---|---|
| Core tunnels | GRE (IP-in-IP) | P2MP MPLS LSPs (mLDP or RSVP-TE) |
| PE participation | Every PE joins the default MDT | Only PEs with receivers join the tree |
| Control plane | PE-to-PE PIM inside GRE | MP-BGP mvpn address family |
| L3VPN integration | Separate mechanism | Same RD/RT, same BGP machinery |
| Scaling ceiling | Low (every-PE state) | High (thousands of customers, tens of thousands of groups per PE) |
| TE integration | Awkward (GRE does not ride P2MP LSPs) | Native (rides the same P2MP LSPs TE provides) |

Real-World MVPN Workloads

Highlight: Note The following use cases are the commercial drivers for MVPN deployment in Tier-1 SPs. Multicast is not an academic feature; it is how these workloads are economically possible.

| Use Case | Scale | Why Multicast | Typical PIM Mode |
|---|---|---|---|
| IPTV / live TV | 100s-1000s of HD channels | One copy per active channel per region; set-top IGMP-driven | SSM |
| Financial market data | 1000s of feed subscribers | Deterministic fan-out; latency variance matters | SSM |
| Enterprise software push | Millions of endpoints | Bulk distribution windows; OS/patch rollouts | SM or SSM |
| Video conferencing (legacy) | Declining | H.323/SIP multiparty; largely displaced by WebRTC + SFU | SM (historical) |

IPTV / live TV delivery. Operators such as AT&T, BT, Deutsche Telekom, and STC deliver hundreds of HD channels over IP multicast. Each channel is a group; set-top boxes issue IGMP joins to the channel the subscriber is watching. Channel zapping produces a leave-then-join exchange. The backbone carries exactly one copy of each channel that has at least one active viewer on that branch of the tree — a channel nobody is watching consumes no core bandwidth.

Financial market data. Exchanges such as NYSE, NASDAQ, and CME push market data updates via multicast to subscriber firms. Milliseconds matter, and per-subscriber unicast replication would inject serialisation variance that distorts price-feed fairness. MVPN carries these feeds across SP backbones into customer colocation facilities with deterministic fan-out latency.

Software distribution. When a vendor like Microsoft pushes Windows Update waves to millions of endpoints, much of that fan-out can ride multicast inside enterprise networks and CDNs. Some SPs offer multicast-based CDN services for very large file-distribution windows where unicast would saturate peering links.

Video conferencing (legacy). H.323 and SIP multiparty conferences historically used multicast for efficient media distribution. Modern WebRTC architectures have largely shifted to unicast plus media-server (SFU) designs, so multicast video conferencing is a declining workload.

Bandwidth Model: Unicast vs Multicast at Scale

Worked example for a 5 Mbps stream delivered to N receivers. Multicast core carriage depends on the number of edges in the distribution tree (assumed here to be roughly 1 to 10), not on N:

| N (receivers) | Unicast source egress | Unicast core carriage | Multicast source egress | Multicast core carriage |
|---|---|---|---|---|
| 100 | 500 Mbps | ~500 Mbps | 5 Mbps | ~5-50 Mbps |
| 10,000 | 50 Gbps | ~50 Gbps | 5 Mbps | ~5-50 Mbps |
| 100,000 | 500 Gbps | ~500 Gbps | 5 Mbps | ~5-50 Mbps |

Formula (approximate): core_bandwidth_multicast = stream_rate x tree_edges; the number of tree edges grows with topology and receiver distribution, but is independent of receiver count at any single leaf segment.
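
In symbols, with stream rate r, receiver count N, and tree edge count E (a property of topology and receiver placement, not of N), the approximate comparison is:

$$B_{\text{unicast}} \approx N \times r \qquad\qquad B_{\text{multicast}} \approx E \times r$$

For r = 5 Mbps and E = 10 edges, multicast core carriage stays near 50 Mbps whether N is 100 or 100,000, which is exactly the pattern in the table above.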

Highlight: Tip When evaluating whether a workload is a candidate for MVPN, the first question is never “how many receivers?” but “how is the receiver set distributed?” A million receivers behind a single edge switch is an IGMP problem, not a multicast-in-the-core problem. A few hundred receivers scattered across every PE is where MVPN pays for itself.

Relationship to Other SP Mechanisms

  • L3VPN (see [[01-tier1-sp-architecture-l3vpn]]). MVPN reuses the entire L3VPN control plane — RD, RT, MP-BGP, PE/P topology [7, 8]. Think of ipv4 mvpn as a second address family layered on the VPN you already have.
  • L2 multicast (see [[02-sp-services-dia-l2vpn-vpls-evpn]] — VPLS BUM, EVPN Route Type 3). VPLS and EVPN handle L2-plane multicast (broadcast, unknown-unicast, multicast) as part of Ethernet emulation. That is a different problem: L2 services replicate BUM frames across a broadcast domain, while MVPN routes customer IP multicast groups [7]. Both use the same PE infrastructure; neither subsumes the other.
  • MPLS traffic engineering. BGP/MPLS MVPN’s P2MP LSPs can be built by mLDP (simplest) [10] or by RSVP-TE P2MP (when TE already computes constrained paths) [11]. In TE-heavy backbones, reusing the TE P2MP fabric for MVPN is common and avoids a second LSP control plane [15].

References

  1. RFC 7761, Protocol Independent Multicast - Sparse Mode (PIM-SM): Protocol Specification (Revised). B. Fenner et al., IETF, March 2016 (Internet Standard; obsoletes RFC 4601). https://www.rfc-editor.org/rfc/rfc7761
  2. RFC 4607, Source-Specific Multicast for IP. H. Holbrook, B. Cain, IETF, August 2006. https://www.rfc-editor.org/rfc/rfc4607
  3. RFC 4604, Using IGMPv3 and MLDv2 for Source-Specific Multicast. H. Holbrook, B. Cain, B. Haberman, IETF, August 2006. https://www.rfc-editor.org/rfc/rfc4604
  4. RFC 3376, Internet Group Management Protocol, Version 3. B. Cain et al., IETF, October 2002. https://www.rfc-editor.org/rfc/rfc3376
  5. RFC 5771, IANA Guidelines for IPv4 Multicast Address Assignments. M. Cotton, L. Vegoda, D. Meyer, IETF (BCP 51), March 2010. https://www.rfc-editor.org/rfc/rfc5771
  6. RFC 2365, Administratively Scoped IP Multicast. D. Meyer, IETF (BCP 23), July 1998. https://www.rfc-editor.org/rfc/rfc2365
  7. RFC 6513, Multicast in MPLS/BGP IP VPNs. E. Rosen, R. Aggarwal (Eds.), IETF, February 2012. https://www.rfc-editor.org/rfc/rfc6513
  8. RFC 6514, BGP Encodings and Procedures for Multicast in MPLS/BGP IP VPNs. R. Aggarwal et al., IETF, February 2012. https://www.rfc-editor.org/rfc/rfc6514
  9. RFC 6037, Cisco Systems’ Solution for Multicast in BGP/MPLS IP VPNs (Historic). E. Rosen et al., IETF, October 2010. https://www.rfc-editor.org/rfc/rfc6037
  10. RFC 6388, Label Distribution Protocol Extensions for Point-to-Multipoint and Multipoint-to-Multipoint Label Switched Paths. IJ. Wijnands (Ed.) et al., IETF, November 2011. https://www.rfc-editor.org/rfc/rfc6388
  11. RFC 4875, Extensions to RSVP-TE for Point-to-Multipoint TE LSPs. R. Aggarwal et al., IETF, May 2007. https://www.rfc-editor.org/rfc/rfc4875
  12. RFC 7246, Multipoint Label Distribution Protocol In-Band Signaling in a Virtual Routing and Forwarding (VRF) Table Context. IJ. Wijnands et al., IETF, June 2014. https://www.rfc-editor.org/rfc/rfc7246
  13. RFC 7524, Inter-Area P2MP Segmented Label Switched Paths. R. Aggarwal et al., IETF, May 2015. https://www.rfc-editor.org/rfc/rfc7524
  14. Cisco, Configure mVPN Profiles within Cisco IOS XR (Doc ID 200512). https://www.cisco.com/c/en/us/support/docs/ip/multicast/200512-Configure-mVPN-Profiles-within-Cisco-IOS.html
  15. I. Minei & J. Lucek, MPLS-Enabled Applications, 3rd ed., Wiley, 2011.
  16. B. Williamson, Developing IP Multicast Networks, Volume I, Cisco Press, 1999.