QoS in Service-Provider Networks
Quality of Service (QoS) is the set of mechanisms an SP uses to treat packets unequally on purpose. Voice and signalling traffic must arrive with low delay and near-zero loss; bulk transfers can tolerate jitter and occasional drops. This chapter covers the DiffServ model that every Tier-1 SP deploys [2], the standard seven-class traffic matrix [5, 6], the classification-policing-queuing-shaping pipeline, Cisco MQC configuration [15], and the DiffServ-aware Traffic Engineering extension that couples per-class bandwidth reservation to MPLS-TE [8].
DiffServ
A scalable marking model: packets are classified into a small number of classes at the network edge, and every core hop applies a per-hop behaviour based solely on that mark. No per-flow state [2].
Seven-class matrix
The de-facto industry layout: Network Control, Voice, Video, Business, Bulk, Best-effort, Scavenger [5]. Each class maps to a DSCP code point and an MPLS EXP/TC value [1, 6]. The DSCP→TC mapping is lossy (6 bits → 3 bits); RFC 3270 specifies the E-LSP and L-LSP mapping models for MPLS DiffServ [7].
DS-TE
DiffServ-aware Traffic Engineering. Bandwidth on each link is partitioned per class, so voice LSPs reserve from a voice pool that best-effort LSPs cannot consume [8].
The scalability problem DiffServ solves
The older IntServ/RSVP model reserved resources per flow end-to-end. Every router on the path held state for every flow. That works in a small enterprise; it collapses at SP scale where a single backbone link carries millions of concurrent flows [2].
DiffServ flips the model: classification happens once at the edge (the PE), every packet gets a small mark, and every core router treats all packets in a given class identically. Core devices hold state per class (6-8 classes) rather than per flow [2]. This is the only QoS model that scales to Tier-1 backbones.
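The state-scaling contrast can be sketched in a few lines of Python. The flow tuples and DSCP values below are illustrative, not from any real deployment:

```python
# Sketch: why per-class state scales where per-flow state does not.
# Hypothetical flows as (src, dst, dport, dscp); DSCP 46 = EF (voice),
# 26 = AF31 (business), 10 = AF11 (bulk).
flows = [
    ("10.0.0.1", "10.9.9.9", 5060, 46),
    ("10.0.0.2", "10.9.9.8", 5061, 46),
    ("10.0.0.3", "10.9.9.7", 443, 26),
    ("10.0.0.4", "10.9.9.6", 21, 10),
]

# IntServ-style core: one state entry per 5-tuple flow -> grows without bound.
intserv_state = {(src, dst, dport): "reserved" for src, dst, dport, _ in flows}

# DiffServ-style core: one counter per class mark -> bounded by class count.
diffserv_state = {}
for *_, dscp in flows:
    diffserv_state[dscp] = diffserv_state.get(dscp, 0) + 1

print(len(intserv_state))   # 4: one entry per flow
print(len(diffserv_state))  # 3: one entry per class, however many flows arrive
```

Add a million more voice flows and the second dictionary still holds one entry for DSCP 46; the first grows by a million.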
Highlight: Key Insight DiffServ trades per-flow precision for scalability. The core cannot distinguish one voice call from another — but it does not need to, because every voice packet gets the same low-latency treatment.
Marking fields
| Plane | Field | Width | Values | Notes |
|---|---|---|---|---|
| IP | DSCP (in the Differentiated Services field, which replaced the IPv4 ToS byte per RFC 2474 §3) | 6 bits | 64 code points | Used on CE-PE and customer-visible segments |
| MPLS | Traffic Class (formerly EXP, renamed by RFC 5462) | 3 bits | 8 classes | What actually drives core forwarding decisions |
The 3-bit MPLS EXP/TC field is the binding constraint: only eight distinct classes can be signalled end-to-end across an MPLS core. RFC 4594 defines 12 service classes; SPs typically collapse them to 6–8 classes in practice to fit under that ceiling [5, 6].
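A common default mapping (Cisco's IP-precedence-to-EXP copy behaves this way) keeps only the three most-significant DSCP bits. A one-line Python sketch makes the collapse, and its collisions, concrete; the code points are the IETF values used in the matrix that follows:

```python
def dscp_to_tc(dscp: int) -> int:
    """Common default DSCP -> MPLS TC mapping: keep the top three bits."""
    assert 0 <= dscp <= 63
    return dscp >> 3

print(dscp_to_tc(46))  # EF   -> 5
print(dscp_to_tc(48))  # CS6  -> 6
print(dscp_to_tc(34))  # AF41 -> 4
print(dscp_to_tc(26))  # AF31 -> 3
print(dscp_to_tc(10))  # AF11 -> 1
print(dscp_to_tc(8))   # CS1  -> 1  (collides with AF11: the mapping is lossy)
```

Sixty-four code points land on eight TC values, so classes that must be distinguished in the core must be planned around the 3-bit space from the start.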
The standard seven-class SP traffic matrix
Most Tier-1 SPs map customer and internal traffic into a matrix very close to the one below. The exact DSCP values are not arbitrary: they are the IETF-recommended code points from RFC 4594 and deployed consistently enough that inter-SP peering usually “just works” [5]. The matrix lists ten DSCP rows, but rows that share an EXP/TC value share a core queue, so it collapses to seven forwarding classes in the MPLS core.
| Class | Priority | DSCP | EXP | Typical traffic |
|---|---|---|---|---|
| Network Control | Highest | CS6 / CS7 | 6, 7 | BGP, OSPF, IS-IS, LDP — must never drop [5] |
| Voice | Very high (strict) | EF | 5 | VoIP RTP, mobile voice bearers [4, 5] |
| Real-Time Interactive | High | CS4 | 4 | Broadcast video, interactive gaming [5] |
| Multimedia Conferencing | High | AF41 | 4 | Adaptive video conferencing [3, 5] |
| Broadcast Video / IPTV | High | CS3 | 3 | Broadcast IPTV (inelastic) [5] |
| Multimedia Streaming | Medium | AF31 | 3 | Buffered IPTV / streaming video [3, 5] |
| Low-Latency Data (business critical) | Medium | AF21 | 2 | Transactional web browsing, enterprise L3VPN [3, 5] |
| High-Throughput Data | Low | AF11 | 1 | Bulk transfers, FTP, large email [3, 5] |
| Best effort | Default | 0 | 0 | Anything unclassified |
| Scavenger | Below best-effort | CS1 | 1 | Backups, P2P, anything that can wait |
Highlight: Note CS6/CS7 (Network Control) is reserved for routing-protocol packets generated by the SP’s own devices. Customer traffic is never marked CS6/CS7 by the PE — a mismarked customer packet is remarked down at the trust boundary.
The QoS pipeline
Every packet entering the SP at a PE traverses five logical stages. Classification and policing happen on ingress; queuing and shaping happen on egress; marking can happen at either boundary.
```mermaid
flowchart LR
    CE[CE ingress] --> CLS[Classify<br/>5-tuple / DSCP / DPI]
    CLS --> POL[Police<br/>token bucket]
    POL --> MRK[Mark / Remark<br/>DSCP + EXP]
    MRK --> CORE[(MPLS core<br/>per-class queuing)]
    CORE --> Q[Queue<br/>PQ / CBWFQ / WRED]
    Q --> SHP[Shape<br/>egress rate smoothing]
    SHP --> OUT[Peer / CE egress]
```
Classification
Identify what the packet is. Inputs: source/destination IP, L4 port, an existing DSCP mark [1], or application-level signatures from Deep Packet Inspection. Customer markings may be trusted (contracts with sophisticated enterprises) or overridden by SP policy at the PE trust boundary [2].
Policing
Enforce the customer’s contracted rate using a token bucket [9, 10]. A customer who bought “100 Mbps with a 10 Mbps voice priority tier” is policed as: up to 10 Mbps of EF-marked traffic flows with priority treatment; EF traffic above 10 Mbps is either dropped or demoted to best-effort; the remaining classes share the other 90 Mbps. Token bucket is simple, precise, and inexpensive in hardware [9, 10].
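A single-rate token bucket of the kind RFC 2697 formalises reduces to a few lines. This sketch models only the committed bucket (no excess bucket); the class name, rate, and burst size are illustrative:

```python
class TokenBucketPolicer:
    """Minimal single-rate token-bucket policer sketch (illustrative)."""

    def __init__(self, rate_bps: float, burst_bytes: float):
        self.rate = rate_bps / 8.0    # refill rate in bytes/second
        self.burst = burst_bytes      # bucket depth (committed burst size)
        self.tokens = burst_bytes     # bucket starts full
        self.last = 0.0               # timestamp of last refill

    def police(self, pkt_bytes: int, now: float) -> str:
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= pkt_bytes:
            self.tokens -= pkt_bytes
            return "conform"          # forward with contracted treatment
        return "exceed"               # drop, or remark down to best-effort

# A 10 Mbps EF tier with a 15 kB burst allowance:
p = TokenBucketPolicer(rate_bps=10_000_000, burst_bytes=15_000)
print(p.police(1500, now=0.0))   # conform: the bucket starts full
print(p.police(1500, now=0.0))   # still conform, until the burst is spent
```

Ten back-to-back 1500-byte packets drain the bucket; the eleventh exceeds and is dropped or remarked, exactly the behaviour described for the EF tier above.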
Marking and remarking
Set the DSCP (IP) and EXP (MPLS) fields per SP policy [1, 6]. Trust-boundary behaviour is the key design decision: trust-dscp on a trusted customer preserves their markings; class-map + set commands on an untrusted customer overwrite them [15].
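The trust-boundary decision reduces to a small function. This sketch uses illustrative names and policy values, and also encodes the CS6/CS7 rule: network-control marks are never accepted from a customer:

```python
CS6, CS7, BEST_EFFORT = 48, 56, 0  # DSCP values

def ingress_mark(dscp: int, trusted: bool, policy_dscp: int = BEST_EFFORT) -> int:
    """PE ingress marking sketch: trust or overwrite a customer's DSCP."""
    if dscp in (CS6, CS7):
        return BEST_EFFORT        # never accept network-control marks from a customer
    return dscp if trusted else policy_dscp

print(ingress_mark(46, trusted=True))    # 46: trusted customer, EF preserved
print(ingress_mark(46, trusted=False))   # 0: untrusted, overwritten by policy
print(ingress_mark(48, trusted=True))    # 0: CS6 remarked down even when trusted
```

On a real PE the same decision is expressed declaratively (trust-dscp, or class-map plus set); the function only shows the logic the hardware applies per packet.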
Queuing — where congestion is actually resolved
Each egress interface has one queue per class. Three queuing disciplines compose the per-hop behaviour [15]:
| Discipline | Purpose | Classes it serves | Key risk |
|---|---|---|---|
| Priority Queue (PQ) | Strict priority — packets here jump ahead of everything else | Voice, Network Control | Starvation of lower classes; must be rate-capped |
| CBWFQ | Guaranteed minimum bandwidth share, proportional distribution of unused bandwidth | Video, Business, Bulk, Best-effort | None if configured with realistic percentages |
| WRED | Drop early and randomly as queue fills, instead of tail-dropping | Any TCP-heavy class | Misconfigured thresholds can under-utilise the link |
Highlight: Warning
A Priority Queue without a rate cap will starve every other class the moment voice traffic exceeds its planned envelope. The priority percent statement (not just priority) is mandatory in production policies.
WRED vs tail drop
When a queue fills, naive tail drop discards every incoming packet at once. Every TCP flow hitting that tail simultaneously backs off, and they all ramp up together — TCP global synchronisation [12]. Utilisation oscillates between full and empty, and throughput collapses.
WRED (Weighted Random Early Detection) starts dropping packets probabilistically as the queue depth crosses a minimum threshold, before the queue is full [12]. Drop probability rises with queue depth. Different classes have different drop profiles: scavenger drops aggressively and early, business-critical drops reluctantly and late [3, 15]. TCP flows see losses staggered in time, back off at different moments, and the aggregate arrival rate stabilises [12].
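The WRED drop curve is simple enough to state directly: zero below the minimum threshold, linear up to a maximum probability between the thresholds, forced drop above the maximum. The thresholds and per-class profiles below are illustrative, not vendor defaults:

```python
def wred_drop_prob(avg_qdepth: float, min_th: float, max_th: float,
                   max_p: float) -> float:
    """WRED drop probability for a given average queue depth (sketch)."""
    if avg_qdepth < min_th:
        return 0.0        # queue shallow: no early drops
    if avg_qdepth >= max_th:
        return 1.0        # beyond max threshold: behaves like tail drop
    # Linear ramp from 0 to max_p between the two thresholds.
    return max_p * (avg_qdepth - min_th) / (max_th - min_th)

# Scavenger drops aggressively and early; business-critical reluctantly and late.
profiles = {
    "scavenger": dict(min_th=10, max_th=30, max_p=0.5),
    "business":  dict(min_th=35, max_th=45, max_p=0.05),
}
for cls, prof in profiles.items():
    print(cls, wred_drop_prob(avg_qdepth=32, **prof))
```

At the same queue depth of 32 packets, the scavenger profile is already in forced-drop territory while the business profile has not started dropping at all; that asymmetry is the "weighted" in WRED.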
Shaping
Smooth bursty traffic by buffering it toward a target rate instead of dropping it. Used where the downstream device would police-drop over-rate traffic, typically at an interconnect toward another carrier. Shaping accepts added delay to avoid loss; policing accepts loss to avoid added delay. Choose the one whose cost the service can tolerate.
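The contrast is easy to see in a toy model: where the policer above returns "exceed" and discards, a shaper queues the excess and releases it at the target rate. The function below is illustrative, not a vendor implementation; it assumes the whole burst arrives at time zero:

```python
from collections import deque

def shape(packet_sizes, rate_Bps, start=0.0):
    """Return (pkt_bytes, departure_time) pairs released at rate_Bps (sketch)."""
    out, t = [], start
    q = deque(packet_sizes)       # shaper buffer: excess waits here, nothing drops
    while q:
        pkt = q.popleft()
        out.append((pkt, t))
        t += pkt / rate_Bps       # serialisation delay at the target rate
    return out

# Three 1500-byte packets arriving in one burst, shaped to 1500 bytes/second:
for pkt, when in shape([1500, 1500, 1500], rate_Bps=1500.0):
    print(pkt, when)              # departures spread one second apart
```

The burst exits at 0, 1, and 2 seconds: zero loss, but the third packet pays two seconds of queuing delay. A policer at the same rate would have forwarded the first packet and dropped the rest.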
Cisco MQC configuration
The Modular QoS CLI is the standard Cisco pattern: class-maps define what to match; policy-maps define what to do with each match; service-policy attaches a policy to an interface [15].
```
class-map match-any VOICE
 match dscp ef
class-map match-any VIDEO
 match dscp af41
class-map match-any BUSINESS
 match dscp af31 af32
!
policy-map CORE-EGRESS
 class VOICE
  priority percent 20
 class VIDEO
  bandwidth percent 30
  random-detect dscp-based
 class BUSINESS
  bandwidth percent 25
  random-detect dscp-based
 class class-default
  bandwidth percent 25
  random-detect
!
interface GigabitEthernet0/0
 service-policy output CORE-EGRESS
```
Allocation summary:
| Class | Share | Discipline | Notes |
|---|---|---|---|
| VOICE | 20% | Strict priority (PQ) | Rate-capped to prevent starvation |
| VIDEO | 30% | CBWFQ + WRED | DSCP-based drop profile |
| BUSINESS | 25% | CBWFQ + WRED | DSCP-based drop profile (AF31 + AF32) |
| class-default | 25% | CBWFQ + WRED | Catches everything not matched above |
Highlight: Tip
Always leave a class class-default with a non-trivial bandwidth allocation. Traffic that fails every explicit match lands here, and starving it produces mysterious outages for unclassified but legitimate flows (DNS, NTP, new applications not yet profiled).
MPLS DiffServ-aware TE (DS-TE)
Ordinary RSVP-TE reserves an aggregate bandwidth pool per link (“2 Gbps available for TE tunnels on this link”). DS-TE partitions that pool per class: “500 Mbps for the voice sub-pool, 1.5 Gbps for the best-effort sub-pool” [8]. Voice LSPs can only draw from the voice sub-pool; best-effort LSPs cannot starve voice reservations by grabbing common bandwidth [8].
```mermaid
graph LR
    subgraph Link["10 Gbps SP backbone link"]
        V["Voice sub-pool<br/>2 Gbps"]
        B["Business sub-pool<br/>3 Gbps"]
        E["Best-effort sub-pool<br/>5 Gbps"]
    end
    VT[Voice LSPs] -.reserve from.-> V
    BT[Business LSPs] -.reserve from.-> B
    ET[Best-effort LSPs] -.reserve from.-> E
```
Highlight: Key Insight DS-TE is what makes hard-SLA voice services mathematically provable. The SP can demonstrate to a regulator or customer that voice traffic cannot exceed its reserved sub-pool on any link, no matter how much best-effort traffic the rest of the network carries.
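The sub-pool admission check is a simple per-class ledger (RFC 4124 calls the pools Bandwidth Constraints). This sketch mirrors the 10 Gbps example above; class names and sizes are illustrative. It shows why a best-effort LSP can never encroach on voice reservations:

```python
class DsTeLink:
    """Per-link DS-TE admission control sketch: one bandwidth pool per class."""

    def __init__(self, sub_pools_mbps: dict):
        self.free = dict(sub_pools_mbps)   # remaining bandwidth per class

    def admit(self, klass: str, mbps: float) -> bool:
        if self.free.get(klass, 0) >= mbps:
            self.free[klass] -= mbps       # reserve from this class's pool only
            return True
        return False                       # CAC rejects: this sub-pool is exhausted

link = DsTeLink({"voice": 2_000, "business": 3_000, "best-effort": 5_000})
print(link.admit("voice", 1_500))          # True: fits in the voice sub-pool
print(link.admit("voice", 1_000))          # False: only 500 Mbps of voice pool left
print(link.admit("best-effort", 1_000))    # True: other pools are untouched
```

The second voice LSP is rejected even though the link as a whole has plenty of free capacity; that per-class isolation is exactly the provable bound the SLA argument rests on.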
See [[01-mpls-traffic-engineering]] for the underlying RSVP-TE bandwidth reservation mechanics that DS-TE extends.
Operational reality — where QoS matters most
Most Tier-1 SP cores run at low-to-moderate utilisation most of the time, because bandwidth is cheap relative to outage risk. During normal operation, every class fits comfortably and the per-class policy is essentially dormant.
QoS earns its keep during congestion events:
| Event | What happens | Why QoS matters |
|---|---|---|
| Fibre cut | Traffic reroutes onto smaller backup paths; links briefly saturate | Voice and network control must survive the 10-30s convergence window |
| DDoS attack | Scavenger/best-effort traffic spikes toward a target customer | WRED drop profiles protect premium classes while the attack is scrubbed |
| Flash crowd | Live-event traffic surges on a single egress | CBWFQ guarantees that business-critical L3VPN traffic keeps its share |
| Maintenance window | Deliberate traffic migration onto a subset of links | DS-TE voice sub-pools prevent SLA violations during the move |
At the access edge, the picture inverts. Customer links (50 Mbps branch circuits, 1 Gbps enterprise tails) saturate during normal business hours. A branch office needs aggressive QoS to keep a VoIP call intelligible when someone kicks off a 40 GB cloud backup. Access-edge QoS is where most operator-visible QoS complaints originate.
Highlight: Note The core is over-provisioned; the access edge is not. Design effort should scale inversely with link size — a 100 Gbps backbone port needs a modest seven-class policy, a 50 Mbps branch circuit needs careful per-application tuning.
See Also
- Internet Infrastructure: Gateways, Exchange Points, CDNs, and Deep Packet Inspection
- Tier-1 SP Architecture and L3VPN
- BGP Fundamentals
References
1. RFC 2474 — Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers. K. Nichols et al., IETF, December 1998. https://www.rfc-editor.org/rfc/rfc2474
2. RFC 2475 — An Architecture for Differentiated Services. S. Blake et al., IETF, December 1998. https://www.rfc-editor.org/rfc/rfc2475
3. RFC 2597 — Assured Forwarding PHB Group. J. Heinanen et al., IETF, June 1999. https://www.rfc-editor.org/rfc/rfc2597
4. RFC 3246 — An Expedited Forwarding PHB (Per-Hop Behavior). B. Davie et al., IETF, March 2002. https://www.rfc-editor.org/rfc/rfc3246
5. RFC 4594 — Configuration Guidelines for DiffServ Service Classes. J. Babiarz, K. Chan, F. Baker, IETF, August 2006. https://www.rfc-editor.org/rfc/rfc4594
6. RFC 5462 — Multiprotocol Label Switching (MPLS) Label Stack Entry: “EXP” Field Renamed to “Traffic Class” Field. L. Andersson, R. Asati, IETF, February 2009. https://www.rfc-editor.org/rfc/rfc5462
7. RFC 3270 — Multi-Protocol Label Switching (MPLS) Support of Differentiated Services. F. Le Faucheur et al., IETF, May 2002. https://www.rfc-editor.org/rfc/rfc3270
8. RFC 4124 — Protocol Extensions for Support of Diffserv-aware MPLS Traffic Engineering. F. Le Faucheur (Ed.), IETF, June 2005. https://www.rfc-editor.org/rfc/rfc4124
9. RFC 2697 — A Single Rate Three Color Marker. J. Heinanen, R. Guérin, IETF, September 1999. https://www.rfc-editor.org/rfc/rfc2697
10. RFC 2698 — A Two Rate Three Color Marker. J. Heinanen, R. Guérin, IETF, September 1999. https://www.rfc-editor.org/rfc/rfc2698
11. RFC 3168 — The Addition of Explicit Congestion Notification (ECN) to IP. K. Ramakrishnan et al., IETF, September 2001. https://www.rfc-editor.org/rfc/rfc3168
12. RFC 7567 — IETF Recommendations Regarding Active Queue Management (BCP 197). F. Baker, G. Fairhurst (Eds.), IETF, July 2015. https://www.rfc-editor.org/rfc/rfc7567
13. ITU-T Y.1541 — Network performance objectives for IP-based services. https://www.itu.int/rec/T-REC-Y.1541
14. IEEE 802.1Q-2018, Annex I — Default PCP-to-traffic-class mappings. https://standards.ieee.org/standard/802_1Q-2018.html
15. T. Szigeti, C. Hattingh, R. Barton, K. Briley, End-to-End QoS Network Design, 2nd ed., Cisco Press, 2013.