IP Multicast and MVPN in Service Provider Networks
IP multicast replaces N-way unicast replication with a single source stream that routers duplicate only where paths diverge, turning linear bandwidth growth into tree-shaped growth [16]. This chapter covers the fundamentals (Class D addressing [5], IGMP [4], PIM variants [1, 2]), the MVPN service models that extend multicast into customer L3VPNs — Draft-Rosen [9] and the modern BGP/MPLS MVPN per RFC 6513/6514 [7, 8] — and the real-world workloads (IPTV, financial market data, software distribution) that depend on these mechanisms.
Cards: 4
IGMP
Host-to-router signalling. End devices declare “I want group G” to their local router; leaves and joins drive edge replication state.
PIM
Router-to-router multicast routing. Builds distribution trees between routers in three flavours: Dense Mode, Sparse Mode, and Source-Specific Multicast.
Draft-Rosen MVPN
First-generation customer-multicast-over-VPN. GRE tunnels between PEs carry customer PIM. Simple, but every PE joins the default MDT whether or not it has receivers.
BGP/MPLS MVPN
Modern RFC 6513/6514 design. MP-BGP mvpn address family signals state; P2MP MPLS LSPs (built by mLDP or RSVP-TE) replicate in the data plane. Only interested PEs join the tree.
Content
Why Multicast Exists: The Replication Problem
Consider an IPTV operator streaming a live match to 100,000 subscribers at 5 Mbps each. Unicast delivery means the head-end originates 100,000 independent copies of the same content, demanding roughly 500 Gbps of egress bandwidth from a single source. This scales linearly with subscriber count and collapses the moment a popular event arrives.
Multicast inverts the model. The source emits one 5 Mbps stream into a distribution tree. Branch routers duplicate packets only at divergence points — where one downstream interface leads to two or more receiver segments. The edge router delivers a single copy per interested local listener. Total backbone carriage is bounded by tree shape, not by receiver count.
Highlight: Key Insight Multicast turns an O(N) replication problem into an O(branching factor) problem. A stream to 100,000 receivers costs the backbone the same as a stream to 100, provided the tree shape is similar.
| Delivery model | Source egress | Backbone carriage | Receiver scaling | Typical use |
|---|---|---|---|---|
| Unicast | N x stream | N x stream | Linear (O(N)) | Web, VoD, most apps |
| Multicast | 1 x stream | ~tree x stream | Near-constant | IPTV, market data |
| Broadcast | 1 x stream | Floods everywhere | Domain-wide | L2 discovery only |
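The table's arithmetic is simple enough to sketch in a few lines of Python. The stream rate and receiver count come from the worked IPTV example above; the 10-edge tree is an assumed topology, not a measured one:

```python
def unicast_source_egress_gbps(receivers: int, stream_mbps: float) -> float:
    """Unicast: the source originates one independent copy per receiver."""
    return receivers * stream_mbps / 1000

def multicast_core_gbps(tree_edges: int, stream_mbps: float) -> float:
    """Multicast: the core carries one copy per tree edge,
    independent of how many receivers sit behind each leaf."""
    return tree_edges * stream_mbps / 1000

# 100,000 subscribers watching a 5 Mbps stream (the example above)
print(unicast_source_egress_gbps(100_000, 5))  # 500.0 (Gbps)
print(multicast_core_gbps(10, 5))              # 0.05 (Gbps) on an assumed 10-edge tree
```

Swapping the receiver count for a million changes only the unicast column; the multicast figure moves only if the tree itself grows.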
IP Multicast Fundamentals
Multicast destination addresses live in 224.0.0.0/4 (the former Class D range) [5]. A multicast group is identified by its group address — for example, 232.1.1.1 might represent “Channel 5 HD” [2, 5]. Any receiver that joins the group receives traffic sent to it; any source that transmits to it contributes packets into the tree. There is no concept of “owning” a group at the IP layer — membership is open, subject to network policy [5, 6].
Two protocols cooperate to deliver multicast end-to-end. IGMP operates between hosts and their first-hop router [4]. PIM operates between routers across the network [1]. IGMP builds edge state; PIM stitches those edges into a network-wide tree.
```mermaid
graph LR
    Source["Multicast<br/>Source<br/>S"]
    R1["R1<br/>(ingress)"]
    R2["R2<br/>(branch)"]
    R3["R3<br/>(branch)"]
    E1["Edge 1"]
    E2["Edge 2"]
    E3["Edge 3"]
    H1["Host A<br/>(IGMP join G)"]
    H2["Host B<br/>(IGMP join G)"]
    H3["Host C<br/>(IGMP join G)"]
    Source --> R1
    R1 -- "1 copy" --> R2
    R2 -- "1 copy" --> E1 --> H1
    R2 -- "1 copy" --> R3
    R3 -- "1 copy" --> E2 --> H2
    R3 -- "1 copy" --> E3 --> H3
```
One source stream reaches three receivers; routers replicate only at branching points R2 and R3.
IGMP: Host Membership
IGMP (Internet Group Management Protocol) is how an end-host signals local group interest [4]. When a subscriber’s set-top box tunes to a channel, it issues an IGMP Join for the corresponding group [4]. The first-hop router installs the host’s interface in its local outgoing interface list (OIL) for that group. When the host leaves or its query timer expires, the interface is pruned [4].
IGMP has no awareness of the network beyond the host’s LAN. It is the trigger that tells the first-hop router “pull this group toward me” — PIM then handles the pull across the backbone.
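The edge state IGMP drives can be modelled as a simple group-to-OIL table. A toy sketch (interface and group names are invented; no vendor's implementation is implied):

```python
from collections import defaultdict

class FirstHopRouter:
    """Toy model of IGMP-driven edge state: group -> outgoing interface list (OIL)."""
    def __init__(self):
        self.oil = defaultdict(set)

    def igmp_join(self, group: str, interface: str) -> None:
        # A host on `interface` reported membership in `group`
        self.oil[group].add(interface)

    def igmp_leave(self, group: str, interface: str) -> None:
        # An explicit leave (or query timeout) prunes the interface
        self.oil[group].discard(interface)
        if not self.oil[group]:
            del self.oil[group]  # no local receivers left: stop pulling the group

r = FirstHopRouter()
r.igmp_join("232.1.1.1", "Gi0/1")
r.igmp_join("232.1.1.1", "Gi0/2")
r.igmp_leave("232.1.1.1", "Gi0/1")
print(dict(r.oil))  # {'232.1.1.1': {'Gi0/2'}}
```

When the last interface leaves, the group entry disappears entirely, which is the moment a real router would send a PIM prune upstream.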
PIM: Building the Tree
PIM (Protocol Independent Multicast) is the routing protocol that constructs distribution trees between routers [1]. “Protocol independent” means PIM uses whatever unicast routing table is already present (IGP, BGP) to perform its reverse-path forwarding checks — it does not run its own routing computation [1]. Three operating modes exist, and SP relevance varies sharply [1, 2]:
| PIM Mode | Tree Construction | Scalability | SP Relevance |
|---|---|---|---|
| Dense Mode | Flood to all PIM neighbours, prune back where no receivers exist | Poor (floods first) | Never used in SPs; suits small LANs only |
| Sparse Mode | Receivers explicitly join an RP-rooted shared tree; optional SPT switchover | Good | Used in enterprise and some legacy SP MVPN |
| SSM | Receivers join a specific (S,G) directly; no Rendezvous Point involvement | Best | Preferred for IPTV and new SP deployments |
Highlight: Warning PIM Dense Mode assumes receivers are everywhere and floods first, pruning later. In an SP backbone this produces exactly the bandwidth waste multicast is meant to avoid. Dense Mode is almost never correct outside small isolated LANs.
PIM Sparse Mode (PIM-SM) uses a Rendezvous Point (RP) as a well-known meeting place [1]. Receivers send (*,G) joins toward the RP, building a shared tree [1]. Sources register with the RP, which forwards traffic down the shared tree [1]. Once traffic is flowing, receivers may switch to a source-specific shortest-path tree (S,G) for efficiency [1]. PIM-SM works, but it requires RP management (anycast RP, MSDP, static vs auto-RP) which adds operational overhead.
PIM Source-Specific Multicast (PIM-SSM) skips the RP entirely [2]. Hosts signal the specific (S,G) they want (via IGMPv3 source-filtering or statically configured mappings) [3, 4]. Routers build the shortest-path tree rooted at the source directly [2]. Most modern IPTV deployments use SSM in the 232.0.0.0/8 range precisely because there is no RP to operate [2, 5]. Signalling is simpler, failure modes are narrower, and scaling is cleaner.
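The "protocol independence" above comes down to a reverse-path-forwarding check against whatever unicast table already exists. A minimal sketch (the route entries and interface names are invented for illustration):

```python
import ipaddress

# Unicast routing table the IGP/BGP already built: prefix -> upstream interface
unicast_rib = {
    ipaddress.ip_network("10.1.0.0/16"): "Gi0/0",
    ipaddress.ip_network("10.2.0.0/16"): "Gi0/1",
}

def rpf_check(source: str, arrival_interface: str) -> bool:
    """Accept a multicast packet only if it arrived on the interface the
    unicast table would use to reach its source (loop prevention)."""
    addr = ipaddress.ip_address(source)
    matches = [n for n in unicast_rib if addr in n]
    if not matches:
        return False  # no route back to the source: drop
    best = max(matches, key=lambda n: n.prefixlen)  # longest-prefix match
    return unicast_rib[best] == arrival_interface

print(rpf_check("10.1.5.9", "Gi0/0"))  # True  - arrived from toward the source
print(rpf_check("10.1.5.9", "Gi0/1"))  # False - fails RPF, packet dropped
```

This is why PIM needs no routing computation of its own: the RPF lookup is the only topology question it ever asks.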
```mermaid
stateDiagram-v2
    [*] --> NoInterest
    NoInterest --> JoinPending: IGMP Report<br/>received on LAN
    JoinPending --> OnSharedTree: PIM (*,G) Join<br/>to RP (PIM-SM)
    JoinPending --> OnShortestPathTree: PIM (S,G) Join<br/>to source (SSM)
    OnSharedTree --> OnShortestPathTree: SPT-switch<br/>threshold
    OnShortestPathTree --> NoInterest: IGMP Leave<br/>or timeout
    OnSharedTree --> NoInterest: IGMP Leave<br/>or timeout
```
Multicast VPN (MVPN): Customer Multicast Across an SP Core
A customer who runs multicast inside their own L3VPN — for example, a financial firm delivering real-time market data from a head-office feed handler to branch-office trading desks — needs those groups to traverse the SP backbone with the same isolation guarantees that the L3VPN unicast plane already provides [7]. Simply flooding customer multicast into the global table would break everything: isolation, scaling, and policy.
MVPN is the set of mechanisms that let an SP carry customer multicast across the provider MPLS core [7]. Two generations exist in production networks today [7, 15].
Generation 1: Draft-Rosen MVPN
The first MVPN design (often called “Rosen draft” after the lead author) predates MPLS P2MP LSPs [9]. It glues customer multicast to the core using GRE tunnels between PEs [9]:
- Every PE in a given MVPN joins a default MDT (Multicast Distribution Tree) — effectively a group address in the provider’s global multicast space [9].
- GRE tunnels between PEs carry PIM signalling and customer multicast data [9].
- Inside the GRE tunnel, the PEs run PIM as if they were directly adjacent [9].
- High-bandwidth groups can migrate to a data MDT to avoid flooding the default MDT [9].
Highlight: Warning Draft-Rosen’s scaling limitation is structural: every PE belonging to the MVPN participates in the default MDT whether or not it has any local receivers [9]. A 200-PE MVPN with two interested PEs still burns state and link bandwidth on the other 198. This is why modern deployments use BGP/MPLS MVPN [7, 8].
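The 200-PE arithmetic is worth making explicit. A back-of-envelope model, assuming a single 5 Mbps customer flow on the default MDT:

```python
def mdt_delivery_mbps(total_pes: int, interested_pes: int, rate_mbps: float):
    """Compare PE-facing delivery bandwidth for one flow:
    Draft-Rosen's default MDT delivers to every PE in the MVPN;
    BGP/MPLS MVPN delivers only to PEs that signalled receivers."""
    rosen = total_pes * rate_mbps        # every PE receives a copy
    modern = interested_pes * rate_mbps  # only interested PEs join the tree
    return rosen, modern

rosen, modern = mdt_delivery_mbps(total_pes=200, interested_pes=2, rate_mbps=5)
print(rosen, modern)  # 1000 Mbps delivered vs 10 Mbps actually wanted: 99% waste
```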
Generation 2: BGP/MPLS MVPN (RFC 6513/6514)
The modern design eliminates GRE tunnels and the “every PE joins” problem [7]. Two planes do the work:
- Control plane: MP-BGP gains a new address family, `ipv4 mvpn` (and `ipv6 mvpn`) [8]. PEs exchange multicast state as BGP routes using the same RD/RT machinery that already isolates and targets L3VPN unicast routes [7, 8].
- Data plane: Customer multicast rides P2MP MPLS LSPs, which branch in the core and replicate at LSR branch points [7]. These LSPs are typically built by mLDP (multicast LDP extensions) [10] or by RSVP-TE P2MP when the TE infrastructure is already in place [11].
```mermaid
graph TD
    subgraph "Customer VRF CUST-A"
        CSrc["C-Source<br/>(market data feed)"]
        CRcv1["C-Receiver 1<br/>(branch trading desk)"]
        CRcv2["C-Receiver 2<br/>(branch trading desk)"]
    end
    subgraph "SP MPLS Core"
        PE_Ingress["Ingress PE<br/>(C-Source attached)"]
        P1["P (branch LSR)"]
        PE_Egress1["Egress PE 1"]
        PE_Egress2["Egress PE 2"]
    end
    CSrc --> PE_Ingress
    PE_Ingress -- "P2MP LSP<br/>(transport + MVPN label)" --> P1
    P1 -- "replicate" --> PE_Egress1
    P1 -- "replicate" --> PE_Egress2
    PE_Egress1 --> CRcv1
    PE_Egress2 --> CRcv2
    PE_Ingress -. "MP-BGP mvpn<br/>Type 5 Source Active A-D" .-> PE_Egress1
    PE_Ingress -. "MP-BGP mvpn<br/>Type 5 Source Active A-D" .-> PE_Egress2
    PE_Egress1 -. "MP-BGP mvpn<br/>Type 4 Leaf A-D" .-> PE_Ingress
    PE_Egress2 -. "MP-BGP mvpn<br/>Type 4 Leaf A-D" .-> PE_Ingress
```
BGP-MVPN Route Types (relevant to forwarding)
| Type | Name | Originator | Purpose |
|---|---|---|---|
| 1 | Intra-AS I-PMSI A-D | Every PE in the MVPN | Discover other PEs belonging to the same MVPN |
| 2 | Inter-AS I-PMSI A-D | ASBR PEs | Inter-AS PE discovery |
| 3 | S-PMSI A-D | Ingress PE | Bind a specific (C-S,C-G) to a provider tunnel |
| 4 | Leaf A-D | Egress PE or ASBR responding to an S-PMSI A-D or Inter-AS I-PMSI A-D route | "Include me in the tree for this source/group" |
| 5 | Source Active A-D | Ingress PE | Advertise “a customer source is active here” |
| 6 | Shared Tree Join | Egress PE | C-multicast Shared Tree Join (C-*,C-G) carried via BGP |
| 7 | Source Tree Join | Egress PE | C-multicast Source Tree Join (C-S,C-G) carried via BGP |
Route types 4 and 5 drive the two-party conversation most operators care about: ingress PE says “I have a live source for (C-S,C-G)”, egress PEs reply “include me in the tree” [8].
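That two-party conversation can be sketched as a toy exchange. The route dictionaries below are simplified labels, not RFC 6514 NLRI encodings, and the PE names are invented:

```python
class IngressPE:
    """Advertises active sources; grafts leaves that reply."""
    def __init__(self):
        self.leaves = set()

    def source_active(self, c_s: str, c_g: str) -> dict:
        # Type 5 Source Active A-D: "a customer source for (C-S,C-G) is live here"
        return {"type": 5, "source": c_s, "group": c_g}

    def on_leaf_ad(self, leaf_ad: dict) -> None:
        # Type 4 Leaf A-D received: graft this egress PE onto the P2MP LSP
        self.leaves.add(leaf_ad["egress_pe"])

class EgressPE:
    def __init__(self, name: str, interested_groups: set):
        self.name = name
        self.interested = interested_groups

    def on_source_active(self, sa: dict):
        # Reply with a Type 4 Leaf A-D only if a local receiver wants the group
        if sa["group"] in self.interested:
            return {"type": 4, "egress_pe": self.name}
        return None

ingress = IngressPE()
pes = [EgressPE("PE1", {"232.1.1.1"}), EgressPE("PE2", set())]
sa = ingress.source_active("10.1.5.9", "232.1.1.1")
for pe in pes:
    leaf = pe.on_source_active(sa)
    if leaf:
        ingress.on_leaf_ad(leaf)
print(ingress.leaves)  # {'PE1'} - only the PE with receivers joins the tree
```

PE2, with no interested receivers, stays silent and never appears in the tree, which is precisely the Draft-Rosen problem this design eliminates.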
End-to-End Forwarding Sequence
```mermaid
sequenceDiagram
    participant CRcv as C-Receiver
    participant PEe as Egress PE (VRF)
    participant BGP as MP-BGP (mvpn AF)
    participant PEi as Ingress PE (VRF)
    participant Core as P2MP MPLS LSP
    participant CSrc as C-Source
    CSrc->>PEi: Customer multicast (C-S,C-G)
    PEi->>BGP: Type 5 Source Active A-D
    BGP->>PEe: Propagate Type 5
    CRcv->>PEe: PIM (*,G) or IGMP Join
    PEe->>BGP: Type 4 Leaf A-D (I want the tree)
    BGP->>PEi: Deliver Leaf A-D
    PEi->>Core: Build / graft P2MP LSP leaf
    CSrc->>PEi: Data packet
    PEi->>Core: Push [transport label | MVPN label]
    Core->>PEe: Replicate at branches, deliver
    PEe->>CRcv: Pop labels, IP multicast forward in VRF
```
Label Stack on the Wire
| Layer | Label role | Installed by | Swapped by |
|---|---|---|---|
| Outer (top) | Transport label | LDP / RSVP-TE | Core LSRs |
| Inner (bottom) | MVPN / VPN label | MP-BGP (mvpn AF) | Unchanged |
| Payload | Customer IP mcast | Customer | Unchanged |
The inner label identifies the MVPN context at the egress PE so it can forward into the correct customer VRF [7, 8]. The outer label drives the P2MP LSP through the core and is popped at the penultimate hop, same as unicast MPLS [10, 11].
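A toy model of the label operations along the path (the label values and VRF mapping are invented; real allocation is per-platform):

```python
# A packet is (label_stack, payload); index 0 is the top of stack.
def ingress_push(payload, transport_label: int, mvpn_label: int):
    """Ingress PE pushes [transport | MVPN] onto the customer packet."""
    return ([transport_label, mvpn_label], payload)

def core_swap(pkt, new_transport_label: int):
    """Core LSRs swap only the outer (transport) label."""
    stack, payload = pkt
    return ([new_transport_label] + stack[1:], payload)

def penultimate_hop_pop(pkt):
    """PHP removes the transport label; the MVPN label survives to the egress PE."""
    stack, payload = pkt
    return (stack[1:], payload)

def egress_forward(pkt, mvpn_label_to_vrf: dict):
    """Egress PE maps the remaining MVPN label to the customer VRF context."""
    stack, payload = pkt
    return mvpn_label_to_vrf[stack[0]], payload

pkt = ingress_push("C-mcast packet", transport_label=3001, mvpn_label=24000)
pkt = core_swap(pkt, 3002)
pkt = penultimate_hop_pop(pkt)
vrf, payload = egress_forward(pkt, {24000: "CUST-A"})
print(vrf)  # CUST-A - forwarded into the correct customer VRF
```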
Configuration: BGP-MVPN on a PE (Cisco IOS-XE)
```
vrf definition CUST-A
 rd 65000:100
 address-family ipv4
  mdt default 239.1.1.1
  mdt data 232.1.1.0 0.0.0.255 threshold 10
  route-target export 65000:100
  route-target import 65000:100
 exit-address-family
!
router bgp 65000
 address-family ipv4 mvpn
  neighbor 10.0.0.100 activate
```
| Stanza | Purpose |
|---|---|
| `rd 65000:100` | Route distinguisher — same RD the L3VPN unicast plane already uses |
| `mdt default 239.1.1.1` | Default MDT group (PIM-based, Profile 0) for low-rate customer multicast |
| `mdt data 232.1.1.0 0.0.0.255 threshold 10` | Pool of data MDT groups switched to when a flow exceeds 10 kbps |
| `route-target export/import` | Same RT machinery as unicast L3VPN — mvpn routes inherit VPN membership |
| `address-family ipv4 mvpn` + `neighbor` | Enable the mvpn AF between RR / PE peers to carry MVPN route types 1-7 |
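The data-MDT switchover in the config above can be sketched as a threshold decision. This is a toy model of the behaviour, not IOS internals; the pool values mirror the config, but real routers allocate data MDT groups from the pool dynamically per flow:

```python
def select_mdt(flow_rate_kbps: float, threshold_kbps: float = 10,
               default_mdt: str = "239.1.1.1",
               data_mdt_pool=("232.1.1.0", "232.1.1.1")) -> str:
    """Flows at or below the threshold stay on the default MDT;
    heavier flows are switched to a group from the data MDT pool."""
    if flow_rate_kbps <= threshold_kbps:
        return default_mdt
    return data_mdt_pool[0]  # simplified: always the first free pool group

print(select_mdt(4))    # 239.1.1.1 - low-rate flow rides the default MDT
print(select_mdt(500))  # 232.1.1.0 - high-rate flow gets its own data MDT
```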
Highlight: Key Insight The elegance of BGP/MPLS MVPN is that it reuses the L3VPN control plane wholesale [7, 8]. Same RD, same RT, same BGP neighbours, same P2MP LSP fabric that traffic engineering already built. There is no parallel overlay — multicast is just another address family in the same VPN [14].
Why BGP/MPLS MVPN Wins in Modern Networks
| Property | Draft-Rosen | BGP/MPLS MVPN (RFC 6513/6514) |
|---|---|---|
| Core tunnels | GRE (IP-in-IP) | P2MP MPLS LSPs (mLDP or RSVP-TE) |
| PE participation | Every PE joins default MDT | Only PEs with receivers join the tree |
| Control plane | PE-to-PE PIM inside GRE | MP-BGP mvpn address family |
| Integrates with L3VPN | Separate mechanism | Same RD/RT, same BGP machinery |
| Scaling ceiling | Low (every-PE state) | High (thousands of customers, tens of thousands of groups per PE) |
| TE integration | Awkward (GRE does not ride P2MP LSPs) | Native (rides the same P2MP LSPs TE provides) |
Real-World MVPN Workloads
Highlight: Note The following use cases are the commercial drivers for MVPN deployment in Tier-1 SPs. Multicast is not an academic feature; it is how these workloads are economically possible.
| Use Case | Scale | Why Multicast | Typical PIM Mode |
|---|---|---|---|
| IPTV / live TV | 100s-1000s of HD channels | One copy per active channel per region; set-top IGMP-driven | SSM |
| Financial market data | 1000s of feed subscribers | Deterministic fan-out; latency variance matters | SSM |
| Enterprise software push | Millions of endpoints | Bulk distribution windows; OS/patch rollouts | SM or SSM |
| Video conferencing (legacy) | Declining | H.323/SIP multiparty; largely displaced by WebRTC + SFU | SM (historical) |
IPTV / live TV delivery. Operators such as AT&T, BT, Deutsche Telekom, and STC deliver hundreds of HD channels over IP multicast. Each channel is a group; set-top boxes issue IGMP joins to the channel the subscriber is watching. Channel zapping produces a leave-then-join exchange. The backbone carries exactly one copy of each channel that has at least one active viewer on that branch of the tree — a channel nobody is watching consumes no core bandwidth.
Financial market data. Exchanges such as NYSE, NASDAQ, and CME push market data updates via multicast to subscriber firms. Milliseconds matter, and per-subscriber unicast replication would inject serialisation variance that distorts price-feed fairness. MVPN carries these feeds across SP backbones into customer colocation facilities with deterministic fan-out latency.
Software distribution. When a vendor like Microsoft pushes Windows Update waves to millions of endpoints, large swaths benefit from multicast inside enterprise networks and CDNs. Some SPs offer multicast-based CDN services for very large file distribution windows where unicast would saturate peering.
Video conferencing (legacy). H.323 and SIP multiparty conferences historically used multicast for efficient media distribution. Modern WebRTC architectures have largely shifted to unicast plus media-server (SFU) designs, so multicast video conferencing is a declining workload.
Bandwidth Model: Unicast vs Multicast at Scale
Worked example for a 5 Mbps stream delivered to N receivers across a core whose distribution tree fans out into B leaf branches at the last-hop P routers:
| N (receivers) | Unicast source egress | Unicast core carriage | Multicast source egress | Multicast core carriage |
|---|---|---|---|---|
| 100 | 500 Mbps | ~500 Mbps | 5 Mbps | ~5-50 Mbps |
| 10,000 | 50 Gbps | ~50 Gbps | 5 Mbps | ~5-50 Mbps |
| 100,000 | 500 Gbps | ~500 Gbps | 5 Mbps | ~5-50 Mbps |
Formula (approximate): core_bandwidth_multicast = stream_rate x tree_edges; the number of tree edges grows with topology and receiver distribution, but is independent of receiver count at any single leaf segment.
Highlight: Tip When evaluating whether a workload is a candidate for MVPN, the first question is never “how many receivers?” but “how is the receiver set distributed?” A million receivers behind a single edge switch is an IGMP problem, not a multicast-in-the-core problem. A few hundred receivers scattered across every PE is where MVPN pays for itself.
Relationship to Other SP Mechanisms
- L3VPN (see [[01-tier1-sp-architecture-l3vpn]]). MVPN reuses the entire L3VPN control plane — RD, RT, MP-BGP, PE/P topology [7, 8]. Think of `ipv4 mvpn` as a second address family layered on the VPN you already have.
- L2 multicast (see [[02-sp-services-dia-l2vpn-vpls-evpn]] — VPLS BUM, EVPN Route Type 3). VPLS and EVPN handle L2-plane multicast (broadcast, unknown-unicast, multicast) as part of Ethernet emulation. That is a different problem: L2 services replicate BUM frames across a broadcast domain, while MVPN routes customer IP multicast groups [7]. Both use the same PE infrastructure; neither subsumes the other.
- MPLS traffic engineering. BGP/MPLS MVPN's P2MP LSPs can be built by mLDP (simplest) [10] or by RSVP-TE P2MP (when TE already computes constrained paths) [11]. In TE-heavy backbones, reusing the TE P2MP fabric for MVPN is common and avoids a second LSP control plane [15].
See Also
- Tier-1 SP Architecture and L3VPN — L3VPN control plane (RD, RT, MP-BGP, PE/P topology) reused wholesale by BGP/MPLS MVPN
- SP Services: DIA, L2VPN, VPLS, and EVPN — L2-plane multicast (VPLS BUM flooding, EVPN Route Type 3) which is a distinct mechanism from L3 MVPN
References
- RFC 7761 — Protocol Independent Multicast - Sparse Mode (PIM-SM): Protocol Specification (Revised). B. Fenner et al., IETF, March 2016 (Internet Standard, obsoletes RFC 4601). https://www.rfc-editor.org/rfc/rfc7761
- RFC 4607 — Source-Specific Multicast for IP. H. Holbrook, B. Cain, IETF, August 2006. https://www.rfc-editor.org/rfc/rfc4607
- RFC 4604 — Using IGMPv3 and MLDv2 for Source-Specific Multicast. H. Holbrook, B. Cain, B. Haberman, IETF, August 2006. https://www.rfc-editor.org/rfc/rfc4604
- RFC 3376 — Internet Group Management Protocol, Version 3. B. Cain et al., IETF, October 2002. https://www.rfc-editor.org/rfc/rfc3376
- RFC 5771 — IANA Guidelines for IPv4 Multicast Address Assignments. M. Cotton, L. Vegoda, D. Meyer, IETF (BCP 51), March 2010. https://www.rfc-editor.org/rfc/rfc5771
- RFC 2365 — Administratively Scoped IP Multicast. D. Meyer, IETF (BCP 23), July 1998. https://www.rfc-editor.org/rfc/rfc2365
- RFC 6513 — Multicast in MPLS/BGP IP VPNs. E. Rosen, R. Aggarwal (Eds.), IETF, February 2012. https://www.rfc-editor.org/rfc/rfc6513
- RFC 6514 — BGP Encodings and Procedures for Multicast in MPLS/BGP IP VPNs. R. Aggarwal et al., IETF, February 2012. https://www.rfc-editor.org/rfc/rfc6514
- RFC 6037 — Cisco Systems’ Solution for Multicast in BGP/MPLS IP VPNs (Historic). E. Rosen et al., IETF, October 2010. https://www.rfc-editor.org/rfc/rfc6037
- RFC 6388 — Label Distribution Protocol Extensions for Point-to-Multipoint and Multipoint-to-Multipoint Label Switched Paths. IJ. Wijnands (Ed.) et al., IETF, November 2011. https://www.rfc-editor.org/rfc/rfc6388
- RFC 4875 — Extensions to RSVP-TE for Point-to-Multipoint TE LSPs. R. Aggarwal et al., IETF, May 2007. https://www.rfc-editor.org/rfc/rfc4875
- RFC 7246 — Multipoint Label Distribution Protocol In-Band Signaling in a Virtual Routing and Forwarding (VRF) Table Context. IJ. Wijnands et al., IETF, June 2014. https://www.rfc-editor.org/rfc/rfc7246
- RFC 7524 — Inter-Area P2MP Segmented Label Switched Paths. R. Aggarwal et al., IETF, May 2015. https://www.rfc-editor.org/rfc/rfc7524
- Cisco — Configure mVPN Profiles within Cisco IOS XR (Doc 200512). https://www.cisco.com/c/en/us/support/docs/ip/multicast/200512-Configure-mVPN-Profiles-within-Cisco-IOS.html
- I. Minei & J. Lucek, MPLS-Enabled Applications, 3rd ed., Wiley, 2011.
- B. Williamson, Developing IP Multicast Networks, Volume I, Cisco Press, 1999.