QoS in Service-Provider Networks

Quality of Service (QoS) is the set of mechanisms an SP uses to treat packets unequally on purpose. Voice and signalling traffic must arrive with low delay and near-zero loss; bulk transfers can tolerate jitter and occasional drops. This chapter covers the DiffServ model that every Tier-1 SP deploys [2], the seven-class traffic matrix distilled from the 12 service classes of RFC 4594 [5, 6], the classification-policing-queuing-shaping pipeline, Cisco MQC configuration [15], and DiffServ-aware Traffic Engineering (DS-TE), the extension that couples per-class bandwidth reservation to MPLS-TE [8].

DiffServ

A scalable marking model: packets are classified into a small number of classes at the network edge, and every core hop applies a per-hop behaviour based solely on that mark. No per-flow state [2].

Seven-class matrix

The de-facto industry layout: Network Control, Voice, Video, Business, Bulk, Best-effort, Scavenger [5]. Each class maps to a DSCP code point and an MPLS EXP/TC value [1, 6]. The DSCP→TC mapping is lossy (6 bits → 3 bits); RFC 3270 specifies the E-LSP and L-LSP mapping models for MPLS DiffServ [7].

DS-TE

DiffServ-aware Traffic Engineering. Bandwidth on each link is partitioned per class, so voice LSPs reserve from a voice pool that best-effort LSPs cannot consume [8].

The scalability problem DiffServ solves

The older IntServ/RSVP model reserved resources per flow end-to-end. Every router on the path held state for every flow. That works in a small enterprise; it collapses at SP scale where a single backbone link carries millions of concurrent flows [2].

DiffServ flips the model: classification happens once at the edge (the PE), every packet gets a small mark, and every core router treats all packets in a given class identically. Core devices hold state per class (6–8 classes) rather than per flow [2]. This is the only QoS model that scales to Tier-1 backbones.

Highlight: Key Insight DiffServ trades per-flow precision for scalability. The core cannot distinguish one voice call from another — but it does not need to, because every voice packet gets the same low-latency treatment.

Marking fields

| Plane | Field | Width | Values | Notes |
|---|---|---|---|---|
| IP | DSCP (in the Differentiated Services field, which replaced the IPv4 ToS byte per RFC 2474 §3) | 6 bits | 64 code points | Used on CE-PE and customer-visible segments |
| MPLS | Traffic Class (formerly EXP, renamed by RFC 5462) | 3 bits | 8 classes | What actually drives core forwarding decisions |

The 3-bit MPLS EXP/TC field is the binding constraint: only eight distinct classes can be signalled end-to-end across an MPLS core. This ceiling is why, although RFC 4594 defines 12 service classes, SP traffic matrices converge on 6–8 classes in practice [5, 6].
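A convenient property of the recommended code points is that the default E-LSP-style mapping can be computed rather than tabulated: the class-selector (top three) bits of each DSCP yield exactly the EXP/TC values in the matrix below. A minimal Python sketch (illustrative only; any production mapping is policy-configurable per RFC 3270 [7]):

# Default E-LSP-style DSCP -> MPLS TC mapping: keep the class-selector
# (top three) bits of the 6-bit DSCP, discard the drop-precedence bits.
DSCP = {"CS7": 56, "CS6": 48, "EF": 46, "AF41": 34, "CS4": 32, "AF31": 26,
        "CS3": 24, "AF21": 18, "AF11": 10, "CS1": 8, "BE": 0}

def dscp_to_tc(dscp: int) -> int:
    """Compress a 6-bit DSCP into the 3-bit MPLS Traffic Class field."""
    return dscp >> 3   # EF 46 -> 5, AF41 34 -> 4, AF31 26 -> 3, CS1 8 -> 1

for name, value in DSCP.items():
    print(f"{name:4s} DSCP {value:2d} -> TC {dscp_to_tc(value)}")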

The standard seven-class SP traffic matrix

Most Tier-1 SPs map customer and internal traffic into a matrix very close to the one below. The exact DSCP values are not arbitrary — they are the IETF-recommended code points from RFC 4594 and deployed consistently enough that inter-SP peering usually “just works” [5].

| Class | Priority | DSCP | EXP/TC | Typical traffic |
|---|---|---|---|---|
| Network Control | Highest | CS6 / CS7 | 6, 7 | BGP, OSPF, IS-IS, LDP — must never drop [5] |
| Voice | Very high (strict) | EF | 5 | VoIP RTP, mobile voice bearers [4, 5] |
| Real-Time Interactive | High | CS4 | 4 | Inelastic video conferencing, interactive gaming [5] |
| Multimedia Conferencing | High | AF41 | 4 | Adaptive video conferencing [3, 5] |
| Broadcast Video / IPTV | High | CS3 | 3 | Broadcast IPTV (inelastic) [5] |
| Multimedia Streaming | Medium | AF31 | 3 | Buffered IPTV / streaming video [3, 5] |
| Low-Latency Data (business critical) | Medium | AF21 | 2 | Transactional web browsing, enterprise L3VPN [3, 5] |
| High-Throughput Data | Low | AF11 | 1 | Bulk transfers, FTP, large email [3, 5] |
| Best effort | Default | 0 | 0 | Anything unclassified |
| Scavenger | Below best-effort | CS1 | 1 | Backups, P2P, anything that can wait |

Rows that share an EXP/TC value share a queue in the MPLS core (CS4 and AF41 on 4, CS3 and AF31 on 3, AF11 and CS1 on 1), which is how these ten DSCP rows collapse into the seven operational classes of the matrix.

Highlight: Note CS6/CS7 (Network Control) is reserved for routing-protocol packets generated by the SP’s own devices. Customer traffic is never marked CS6/CS7 by the PE — a mismarked customer packet is remarked down at the trust boundary.

The QoS pipeline

Every packet entering the SP at a PE traverses five logical stages. Classification and policing happen on ingress; queuing and shaping happen on egress; marking can happen at either boundary.

flowchart LR
    CE[CE ingress] --> CLS[Classify<br/>5-tuple / DSCP / DPI]
    CLS --> POL[Police<br/>token bucket]
    POL --> MRK[Mark / Remark<br/>DSCP + EXP]
    MRK --> CORE[(MPLS core<br/>per-class queuing)]
    CORE --> Q[Queue<br/>PQ / CBWFQ / WRED]
    Q --> SHP[Shape<br/>egress rate smoothing]
    SHP --> OUT[Peer / CE egress]

Classification

Identify what the packet is. Inputs: source/destination IP, L4 port, an existing DSCP mark [1], or application-level signatures from Deep Packet Inspection. Customer markings may be trusted (contracts with sophisticated enterprises) or overridden by SP policy at the PE trust boundary [2].
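In model form, a PE classifier is a first-match rule list over those inputs. A toy Python sketch (the class names, port numbers, and rule order are illustrative, not taken from any particular platform):

from dataclasses import dataclass

@dataclass
class Packet:
    proto: str     # "tcp" / "udp"
    dport: int     # L4 destination port
    dscp: int      # incoming mark, meaningful only if the customer is trusted

def classify(pkt: Packet, trust_dscp: bool) -> str:
    """First-match classification at the PE trust boundary (illustrative)."""
    if trust_dscp and pkt.dscp == 46:                  # customer pre-marked EF
        return "VOICE"
    if pkt.proto == "udp" and 16384 <= pkt.dport <= 32767:
        return "VOICE"                                 # a common RTP port range
    if pkt.proto == "tcp" and pkt.dport in (179, 646):
        return "NETWORK-CONTROL"                       # BGP / LDP, SP-internal only
    if pkt.proto == "tcp" and pkt.dport == 443:
        return "BUSINESS"                              # stand-in for a DPI verdict
    return "BEST-EFFORT"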

Policing

Enforce the customer’s contracted rate using a token bucket [9, 10]. A customer who bought “100 Mbps with a 10 Mbps voice priority tier” is policed as: up to 10 Mbps of EF-marked traffic flows with priority treatment; EF traffic above 10 Mbps is either dropped or demoted to best-effort; the remaining classes share the other 90 Mbps. Token bucket is simple, precise, and inexpensive in hardware [9, 10].
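The policer itself is a few lines of arithmetic. A minimal single-rate, two-colour sketch in Python; production SP policers implement the single-rate and two-rate three-colour markers of RFC 2697/2698, which add a yellow "exceed" band between conform and violate [9, 10]:

import time

class TokenBucket:
    """Single-rate two-colour policer: conform or exceed (a sketch)."""
    def __init__(self, cir_bps: int, bc_bytes: int):
        self.rate = cir_bps / 8.0        # token refill rate in bytes per second
        self.depth = bc_bytes            # committed burst size = bucket depth
        self.tokens = float(bc_bytes)
        self.last = time.monotonic()

    def conforms(self, pkt_bytes: int) -> bool:
        now = time.monotonic()
        # Refill tokens for the elapsed interval, capped at the bucket depth.
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if pkt_bytes <= self.tokens:
            self.tokens -= pkt_bytes     # conform: forward with class treatment
            return True
        return False                     # exceed: drop, or remark to best-effort

# 10 Mbps EF tier; a 312,500-byte bucket is roughly 250 ms of burst tolerance.
ef_policer = TokenBucket(cir_bps=10_000_000, bc_bytes=312_500)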

Marking and remarking

Set the DSCP (IP) and EXP (MPLS) fields per SP policy [1, 6]. Trust-boundary behaviour is the key design decision: trust-dscp on a trusted customer preserves their markings; class-map + set commands on an untrusted customer overwrite them [15].
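In the same toy model, the trust decision is a one-line function, with the imposed MPLS TC derived from whatever DSCP survives the boundary (pe_ingress_dscp is a hypothetical helper name, not a platform command):

def pe_ingress_dscp(incoming_dscp: int, trusted: bool, policy_dscp: int = 0) -> int:
    """Trust boundary: keep a trusted customer's marking, overwrite an untrusted one."""
    return incoming_dscp if trusted else policy_dscp

# The MPLS mark follows from the surviving DSCP (see dscp_to_tc above):
# tc = dscp_to_tc(pe_ingress_dscp(incoming_dscp=46, trusted=False))   # -> 0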

Queuing — where congestion is actually resolved

Each egress interface has one queue per class. Three queuing disciplines compose the per-hop behaviour [15]:

| Discipline | Purpose | Classes it serves | Key risk |
|---|---|---|---|
| Priority Queue (PQ) | Strict priority — packets here jump ahead of everything else | Voice, Network Control | Starvation of lower classes; must be rate-capped |
| CBWFQ | Guaranteed minimum bandwidth share, proportional distribution of unused bandwidth | Video, Business, Bulk, Best-effort | None if configured with realistic percentages |
| WRED | Drop early and randomly as queue fills, instead of tail-dropping | Any TCP-heavy class | Misconfigured thresholds can under-utilise the link |

Highlight: Warning A Priority Queue without a rate cap will starve every other class the moment voice traffic exceeds its planned envelope. The priority percent statement (not just priority) is mandatory in production policies.
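How PQ and CBWFQ compose can be seen in a toy software model: a strict-priority queue consulted first, then a deficit-weighted round robin over the remaining classes. This is a sketch only; real LLQ/CBWFQ schedulers run in hardware, and the PQ's rate cap is enforced by a policer in front of the queue, per the warning above:

from collections import deque

class EgressScheduler:
    """Strict PQ + deficit-weighted round robin (toy model of LLQ + CBWFQ)."""
    def __init__(self, weights):
        self.pq = deque()                            # voice / network control
        self.queues = {cls: deque() for cls in weights}
        self.weights = dict(weights)                 # analogue of bandwidth percent
        self.deficit = {cls: 0 for cls in weights}

    def enqueue(self, cls, pkt_bytes):
        (self.pq if cls == "PRIORITY" else self.queues[cls]).append(pkt_bytes)

    def dequeue(self):
        if self.pq:                                  # strict priority always wins;
            return self.pq.popleft()                 # hence the mandatory rate cap
        for cls, q in self.queues.items():           # serve the rest by weight
            if not q:
                self.deficit[cls] = 0                # idle queues bank no credit
                continue
            self.deficit[cls] += self.weights[cls]   # one quantum per visit
            if q[0] <= self.deficit[cls]:
                self.deficit[cls] -= q[0]
                return q.popleft()
        return None

For instance, EgressScheduler({"VIDEO": 3000, "BUSINESS": 2500, "DEFAULT": 2500}) approximates the 30/25/25 CBWFQ split of the MQC example later in this chapter, with the 20% priority class fed through a separate policer.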

WRED vs tail drop

When a queue fills, naive tail drop discards every arriving packet until the queue drains. Every TCP flow hitting that tail backs off simultaneously, and they all ramp up together: TCP global synchronisation [12]. Utilisation oscillates between full and empty, and throughput collapses.

WRED (Weighted Random Early Detection) starts dropping packets probabilistically as the queue depth crosses a minimum threshold, before the queue is full [12]. Drop probability rises with queue depth. Different classes have different drop profiles: scavenger drops aggressively and early, business-critical drops reluctantly and late [3, 15]. TCP flows see losses staggered in time, back off at different moments, and the aggregate arrival rate stabilises [12].
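The drop decision itself is a linear ramp between two queue-depth thresholds, weighted per class. A sketch; the threshold and probability numbers are illustrative, not recommended values, and real implementations ramp on an exponentially weighted average of queue depth rather than the instantaneous depth [12]:

import random

# Per-class WRED profile: (min_threshold, max_threshold, max_drop_probability).
# Scavenger drops early and aggressively; business-critical drops late and rarely.
PROFILES = {
    "SCAVENGER":   (10, 30, 0.20),
    "BEST-EFFORT": (20, 40, 0.10),
    "BUSINESS":    (30, 40, 0.05),
}

def wred_drop(cls: str, avg_depth: float) -> bool:
    """Probabilistic early drop as the averaged queue depth grows."""
    lo, hi, p_max = PROFILES[cls]
    if avg_depth < lo:
        return False                                  # below min threshold: no drops
    if avg_depth >= hi:
        return True                                   # above max threshold: tail drop
    return random.random() < p_max * (avg_depth - lo) / (hi - lo)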

Shaping

Smooth bursty traffic by buffering it toward a target rate instead of dropping it. Shaping is used where the downstream device would police and drop over-rate traffic, typically at an interconnect toward another carrier. Shaping trades delay for loss; policing trades loss for delay. Choose the one whose cost the service can tolerate.
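Mechanically, a shaper is the policer from earlier with one change: an over-rate packet waits in a backlog instead of being dropped. A sketch (illustrative; a real shaper releases the backlog on a timer):

from collections import deque
import time

class Shaper:
    """Token-bucket shaper: over-rate packets are delayed, not dropped."""
    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0                   # bytes per second
        self.depth = burst_bytes
        self.tokens = float(burst_bytes)
        self.last = time.monotonic()
        self.backlog = deque()                       # the delay shaping trades for loss

    def send(self, pkt_bytes):
        self.backlog.append(pkt_bytes)               # never drop: queue and wait

    def release(self):
        """Timer tick: emit the head packet once enough tokens have accrued."""
        now = time.monotonic()
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.backlog and self.backlog[0] <= self.tokens:
            self.tokens -= self.backlog[0]
            return self.backlog.popleft()
        return None                                  # head packet waits for tokens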

Cisco MQC configuration

The Modular QoS CLI is the standard Cisco pattern: class-maps define what to match; policy-maps define what to do with each match; service-policy attaches a policy to an interface [15].

class-map match-any VOICE
 match dscp ef
class-map match-any VIDEO
 match dscp af41
class-map match-any BUSINESS
 match dscp af31 af32
!
policy-map CORE-EGRESS
 class VOICE
  priority percent 20
 class VIDEO
  bandwidth percent 30
  random-detect dscp-based
 class BUSINESS
  bandwidth percent 25
  random-detect dscp-based
 class class-default
  bandwidth percent 25
  random-detect
!
interface GigabitEthernet0/0
 service-policy output CORE-EGRESS

Allocation summary:

| Class | Share | Discipline | Notes |
|---|---|---|---|
| VOICE | 20% | Strict priority (PQ) | Rate-capped to prevent starvation |
| VIDEO | 30% | CBWFQ + WRED | DSCP-based drop profile |
| BUSINESS | 25% | CBWFQ + WRED | DSCP-based drop profile (AF31 + AF32) |
| class-default | 25% | CBWFQ + WRED | Catches everything not matched above |

Highlight: Tip Always leave a class class-default with a non-trivial bandwidth allocation. Traffic that fails every explicit match lands here, and starving it produces mysterious outages for unclassified but legitimate flows (DNS, NTP, new applications not yet profiled).

MPLS DiffServ-aware TE (DS-TE)

Ordinary RSVP-TE reserves an aggregate bandwidth pool per link (“2 Gbps available for TE tunnels on this link”). DS-TE partitions that pool per class: “500 Mbps for the voice sub-pool, 1.5 Gbps for the best-effort sub-pool” [8]. Voice LSPs can only draw from the voice sub-pool; best-effort LSPs cannot starve voice reservations by grabbing common bandwidth [8].

graph LR
    subgraph Link["10 Gbps SP backbone link"]
        V["Voice sub-pool<br/>2 Gbps"]
        B["Business sub-pool<br/>3 Gbps"]
        E["Best-effort sub-pool<br/>5 Gbps"]
    end
    VT[Voice LSPs] -.reserve from.-> V
    BT[Business LSPs] -.reserve from.-> B
    ET[Best-effort LSPs] -.reserve from.-> E
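Per-class admission control is simple bookkeeping, which is what makes the guarantee in the highlight below provable. A sketch, assuming independent per-class pools in the style of the Maximum Allocation Model (DS-TE also defines a nested Russian Dolls Model; both are bandwidth-constraint models used with RFC 4124 [8]):

class DsTeLink:
    """Per-class bandwidth sub-pools on one link (illustrative model, Mbps)."""
    def __init__(self, pools):
        self.capacity = dict(pools)
        self.reserved = {cls: 0 for cls in pools}

    def admit_lsp(self, cls, mbps):
        """CAC: an LSP may only draw from its own class's sub-pool."""
        if self.reserved[cls] + mbps > self.capacity[cls]:
            return False              # pool exhausted: RSVP-TE signalling fails,
                                      # however idle the other pools are
        self.reserved[cls] += mbps
        return True

# The 10 Gbps link from the diagram above:
link = DsTeLink({"VOICE": 2_000, "BUSINESS": 3_000, "BEST-EFFORT": 5_000})
assert link.admit_lsp("VOICE", 1_500)
assert not link.admit_lsp("VOICE", 600)        # 1500 + 600 > 2000: rejected
assert link.admit_lsp("BEST-EFFORT", 4_000)    # unaffected by the voice pool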

Highlight: Key Insight DS-TE is what makes hard-SLA voice services mathematically provable. The SP can demonstrate to a regulator or customer that voice traffic cannot exceed its reserved sub-pool on any link, no matter how much best-effort traffic the rest of the network carries.

See [[01-mpls-traffic-engineering]] for the underlying RSVP-TE bandwidth reservation mechanics that DS-TE extends.

Operational reality — where QoS matters most

Most Tier-1 SP cores run at low-to-moderate utilisation most of the time, because bandwidth is cheap relative to outage risk. During normal operation, every class fits comfortably and the per-class policy is essentially dormant.

QoS earns its keep during congestion events:

| Event | What happens | Why QoS matters |
|---|---|---|
| Fibre cut | Traffic reroutes onto smaller backup paths; links briefly saturate | Voice and network control must survive the 10–30 s convergence window |
| DDoS attack | Scavenger/best-effort traffic spikes toward a target customer | WRED drop profiles protect premium classes while the attack is scrubbed |
| Flash crowd | Live-event traffic surges on a single egress | CBWFQ guarantees that business-critical L3VPN traffic keeps its share |
| Maintenance window | Deliberate traffic migration onto a subset of links | DS-TE voice sub-pools prevent SLA violations during the move |

At the access edge, the picture inverts. Customer links (50 Mbps branch circuits, 1 Gbps enterprise tails) saturate during normal business hours. A branch office needs aggressive QoS to keep a VoIP call intelligible when someone kicks off a 40 GB cloud backup. Access-edge QoS is where most operator-visible QoS complaints originate.

Highlight: Note The core is over-provisioned; the access edge is not. Design effort should scale inversely with link size — a 100 Gbps backbone port needs a modest seven-class policy, a 50 Mbps branch circuit needs careful per-application tuning.

References

  1. RFC 2474, Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers. K. Nichols et al., IETF, December 1998. https://www.rfc-editor.org/rfc/rfc2474
  2. RFC 2475, An Architecture for Differentiated Services. S. Blake et al., IETF, December 1998. https://www.rfc-editor.org/rfc/rfc2475
  3. RFC 2597, Assured Forwarding PHB Group. J. Heinanen et al., IETF, June 1999. https://www.rfc-editor.org/rfc/rfc2597
  4. RFC 3246, An Expedited Forwarding PHB (Per-Hop Behavior). B. Davie et al., IETF, March 2002. https://www.rfc-editor.org/rfc/rfc3246
  5. RFC 4594, Configuration Guidelines for DiffServ Service Classes. J. Babiarz, K. Chan, F. Baker, IETF, August 2006. https://www.rfc-editor.org/rfc/rfc4594
  6. RFC 5462, Multiprotocol Label Switching (MPLS) Label Stack Entry: “EXP” Field Renamed to “Traffic Class” Field. L. Andersson, R. Asati, IETF, February 2009. https://www.rfc-editor.org/rfc/rfc5462
  7. RFC 3270, Multi-Protocol Label Switching (MPLS) Support of Differentiated Services. F. Le Faucheur et al., IETF, May 2002. https://www.rfc-editor.org/rfc/rfc3270
  8. RFC 4124, Protocol Extensions for Support of Diffserv-aware MPLS Traffic Engineering. F. Le Faucheur (Ed.), IETF, June 2005. https://www.rfc-editor.org/rfc/rfc4124
  9. RFC 2697, A Single Rate Three Color Marker. J. Heinanen, R. Guérin, IETF, September 1999. https://www.rfc-editor.org/rfc/rfc2697
  10. RFC 2698, A Two Rate Three Color Marker. J. Heinanen, R. Guérin, IETF, September 1999. https://www.rfc-editor.org/rfc/rfc2698
  11. RFC 3168, The Addition of Explicit Congestion Notification (ECN) to IP. K. Ramakrishnan et al., IETF, September 2001. https://www.rfc-editor.org/rfc/rfc3168
  12. RFC 7567, IETF Recommendations Regarding Active Queue Management (BCP 197). F. Baker, G. Fairhurst (Eds.), IETF, July 2015. https://www.rfc-editor.org/rfc/rfc7567
  13. ITU-T Y.1541, Network performance objectives for IP-based services. https://www.itu.int/rec/T-REC-Y.1541
  14. IEEE 802.1Q-2018, Annex I — Default PCP-to-traffic-class mappings. https://standards.ieee.org/standard/802_1Q-2018.html
  15. T. Szigeti, C. Hattingh, R. Barton, K. Briley, End-to-End QoS Network Design, 2nd ed., Cisco Press, 2013.