QoS and Slicing
===============

.. _qos_configuration:

Overview
--------

Network slicing enables sharing the same physical infrastructure between
independent logical networks, each one targeting different use cases while
providing isolation and security guarantees. Slicing permits the implementation
of tailor-made applications with Quality of Service (QoS) specific to the needs
of each slice, rather than a one-size-fits-all approach.

SD-Fabric supports slicing and QoS using dedicated hardware resources such as
scheduling queues and meters. Once a packet enters the fabric, it is associated
with a slice ID and a Traffic Class (TC). The slice ID is an arbitrary identifier,
while the TC determines the QoS parameters. SD-Fabric uses the combination of
slice ID and TC to determine which switch hardware queue to use.

We provide fabric-wide isolation and QoS guarantees. Packets are classified by
the first leaf switch in the path; we then use a custom DSCP-based marking
scheme to apply the same treatment on all switches.

Classification can be achieved either via REST APIs for regular traffic, or via
PFCP integration for GTP-U traffic terminated by P4-UPF.

Traffic Classes
^^^^^^^^^^^^^^^

We support the following traffic classes, which cover the spectrum of
applications from latency-sensitive to throughput-intensive.

Control
"""""""
For applications demanding ultra-low latency and jitter guarantees, with
non-bursty, low throughput requirements on the order of hundreds of packets per
second. Examples of such applications are consensus protocols, industrial
automation, timing, etc. This class uses a queue shared by all slices, serviced
with the highest priority. To enforce isolation between slices, and to avoid
starvation of lower priority classes, each slice is processed through a
single-rate two-color meter. Slices sending at rates higher than the configured
meter rate might observe packet drops.
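
To illustrate the metering behavior described above, here is a minimal token-bucket sketch of a single-rate two-color meter. It is an illustration only, not the switch implementation: the ``TwoColorMeter`` class, its field names, and the packet-based accounting are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class TwoColorMeter:
    """Token-bucket sketch of a single-rate two-color meter (illustrative only)."""

    rate_pps: float      # committed rate, in packets per second
    burst_pkts: float    # bucket depth, in packets
    tokens: float = 0.0  # current bucket level
    last_ts: float = 0.0 # timestamp of the last update, in seconds

    def __post_init__(self) -> None:
        # Start with a full bucket so that an initial burst is allowed.
        self.tokens = self.burst_pkts

    def color(self, now: float) -> str:
        """Return "GREEN" if the packet conforms to the rate, "RED" otherwise."""
        # Refill tokens for the elapsed time, capped at the bucket depth.
        elapsed = now - self.last_ts
        self.tokens = min(self.burst_pkts, self.tokens + elapsed * self.rate_pps)
        self.last_ts = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return "GREEN"
        return "RED"


meter = TwoColorMeter(rate_pps=100, burst_pkts=10)
# 15 back-to-back packets at t=0: only the configured burst conforms.
colors = [meter.color(now=0.0) for _ in range(15)]
print(colors.count("GREEN"), colors.count("RED"))
```

In this sketch, packets colored ``RED`` correspond to the traffic that a slice sends above its configured meter rate and that might therefore be dropped.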

Real-Time
"""""""""
For applications that require both low latency and sustained throughput.
Examples of such applications are video and audio streaming. Each slice gets a
dedicated Real-Time queue serviced in a Round-Robin fashion to guarantee the
lowest latency at all times even with bursty senders. To avoid starvation of
lower priority classes, Real-Time queues are shaped at a maximum rate. Slices
sending at rates higher than the configured one might observe higher latency
because of the shaping. Real-Time queues have priority lower than Control, but
higher than Elastic.

Elastic
"""""""
For throughput-intensive applications with no latency requirements. This class
is best suited for large file transfers, Intranet/enterprise applications,
prioritized Internet access, etc. Each slice gets a dedicated Elastic queue
serviced in Weighted Round-Robin (WRR) fashion with configurable weights. During
congestion, Elastic queues are guaranteed to receive a minimum bandwidth that can
grow up to the link capacity if other queues are empty.

Best-Effort
"""""""""""
This is the default traffic class, used by packets not classified with any of
the above classes. All slices share the same Best-Effort queue with the lowest
priority.

Classification
^^^^^^^^^^^^^^

Slice ID and TC classification can be performed in two ways.

Regular traffic
"""""""""""""""
We provide ACL-like APIs that support specifying wildcard match rules on the
IPv4 5-tuple.

P4-UPF traffic
""""""""""""""
When using the embedded UPF function, for GTP-U mobile traffic terminated by the
fabric, we support integration with PFCP QoS features such as prioritization via
QoS Flow Identifier (QFI), Maximum Bitrate (MBR) limits, and Guaranteed Bitrate
(GBR).

You can configure a static one-to-one mapping between 3GPP’s QFIs and
SD-Fabric’s TCs using the ONOS netcfg JSON file (work-in-progress), while MBR
and GBR configurations are translated into meter configurations.

QoS classification uses the same table used for GTP-U tunnel termination. For
this reason, to achieve fabric-wide QoS enforcement, we recommend enabling the UPF
function on each leaf switch using the distributed UPF mode, such that packets
are classified as soon as they enter the network.

Support for slicing of mobile traffic is work-in-progress and will be added in
the next SD-Fabric release.

Configuration
-------------
.. note:: QoS and slicing configuration is currently static and applied at switch startup.
   Dynamic configuration will be supported in a future SD-Fabric release.

QoS and slicing use the switch queue configuration provided via the
``vendor_config`` portion of the Stratum Chassis Config (see
:ref:`stratum_chassis_config`), where the queues and schedulers can be
configured. For more information on the format of ``vendor_config``, see the
`guide for running Stratum on Tofino-based switches
<https://github.com/stratum/stratum/blob/main/stratum/hal/bin/barefoot/README.run.md>`_
in the Stratum repository.

We provide a convenient `script <https://github.com/stratum/fabric-tna/blob/main/util/gen-stratum-qos-config.py>`_
to generate the configuration starting from a higher-level description provided via a YAML file.
This file lets you configure the parameters for the traffic classes listed in the above section.

Here's a list of parameters that you can configure via the YAML QoS configuration file:

* ``max_cells``: Maximum number of buffer cells, depends on the ASIC SKU/revision.

* ``pool_allocations``: Percentage of buffer cells allocated to each traffic class.
  The sum should be 100. Usually, we leave a portion of the buffer ``unassigned``
  for queues that do not have a pool (yet).
  Examples of such queues are those for the recirculation port, CPU port, etc.

  .. code-block:: yaml

     pool_allocations:
       control: 1
       realtime: 9
       elastic: 80
       besteffort: 9
       unassigned: 1

* **Control** Traffic Class: The available bandwidth dedicated to Control traffic is divided into *slots*.
  Each slot has a maximum rate and burst (in packets of the given MTU).
  A slice can use one or more slots by appropriately configuring meters in the fabric ingress pipeline.

  * ``control_slot_count``: Number of slots.
  * ``control_slot_rate_pps``: Rate of each slot, in packets per second.
  * ``control_slot_burst_pkts``: Number of packets per burst of each slot.
  * ``control_mtu_bytes``: MTU of packets for the PPS and burst values.

  .. code-block:: yaml

     control_slot_count: 50
     control_slot_rate_pps: 100
     control_slot_burst_pkts: 10
     control_mtu_bytes: 1500

* **Real-Time** Traffic Class Configuration:

  * ``realtime_max_rates_bps``: List of maximum shaping rates for Real-Time queues,
    one per slice requesting such service.

  * ``realtime_max_burst_s``: Maximum amount of time that a Real-Time queue can
    burst at the port speed. This parameter is used to limit delay for Elastic
    queues.

  .. code-block:: yaml

     realtime_max_rates_bps:
       - 45000000 # 45 Mbps
       - 30000000 # 30 Mbps
       - 25000000 # 25 Mbps
     realtime_max_burst_s: 0.005 # 5 ms

* **Elastic** Traffic Class Configuration:

  * ``elastic_min_rates_bps``: List of minimum guaranteed rates for Elastic queues,
    one per slice requesting such service.

  .. code-block:: yaml

     elastic_min_rates_bps:
       - 100000000 # 100 Mbps
       - 200000000 # 200 Mbps

* ``port_templates`` section: List of switch ports for which we want to configure
  queues.

  Every ``port_templates`` element contains:

  * ``descr``: Description of the port purpose.

  * ``rate_bps``: Port speed in bits per second.

  * ``is_shaping_enabled``: ``true`` if the rate is enforced using shaping,
    ``false`` if the rate is the channel speed.

  * ``shaping_burst_bytes``: Burst size in bytes, meaningful only if port speed
    is shaped (when ``is_shaping_enabled: true``).

  * ``queue_count``: Number of queues assigned to the port.

  * ``port_ids``: List of Stratum port IDs (:ref:`singleton_port` from Stratum Chassis Config)
    using this port template. Used for ports that correspond to switch front-panel ports.

    Mutually exclusive with the ``sdk_port_ids`` field.

  * ``sdk_port_ids``: List of SDK port numbers (i.e., Tofino ``DP_ID``) using this port template.
    Used for internal ports (e.g., recirculation ports).

    Mutually exclusive with the ``port_ids`` field.

  .. code-block:: yaml

     port_templates:
       - descr: "Base station"
         rate_bps: 1000000000 # 1 Gbps
         is_shaping_enabled: true
         shaping_burst_bytes: 18000 # 2x jumbo frames
         queue_count: 16
         port_ids:
           - 100
       - descr: "Servers"
         port_ids:
           - 200
         rate_bps: 40000000000 # 40 Gbps
         is_shaping_enabled: false
         queue_count: 16
       - descr: "Recirculation"
         sdk_port_ids:
           - 68
         rate_bps: 100000000000 # 100 Gbps
         is_shaping_enabled: false
         queue_count: 16

An example of a complete QoS and Slicing configuration can be found `here <https://github.com/stratum/fabric-tna/blob/main/util/sample-qos-config.yaml>`_.
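
Before generating the Stratum configuration, it can help to sanity-check a few of the constraints stated in the parameter list. The following is a minimal sketch using the example values from this section as a plain Python dictionary. The helper functions are hypothetical and are not part of ``gen-stratum-qos-config.py``; the check that each port template sets exactly one of ``port_ids`` and ``sdk_port_ids`` is an assumption derived from the "mutually exclusive" wording.

```python
# Example values copied from the YAML snippets in this section.
config = {
    "pool_allocations": {
        "control": 1, "realtime": 9, "elastic": 80,
        "besteffort": 9, "unassigned": 1,
    },
    "control_slot_count": 50,
    "control_slot_rate_pps": 100,
    "control_mtu_bytes": 1500,
    "port_templates": [
        {"descr": "Base station", "port_ids": [100]},
        {"descr": "Servers", "port_ids": [200]},
        {"descr": "Recirculation", "sdk_port_ids": [68]},
    ],
}


def pool_allocation_total(cfg: dict) -> int:
    """Sum of the buffer percentages, which should be 100."""
    return sum(cfg["pool_allocations"].values())


def control_max_rate_bps(cfg: dict) -> int:
    """Upper bound on Control traffic if every slot sends MTU-sized packets."""
    return (cfg["control_slot_count"] * cfg["control_slot_rate_pps"]
            * cfg["control_mtu_bytes"] * 8)


def invalid_port_templates(cfg: dict) -> list:
    """Descriptions of templates that do not set exactly one port ID field."""
    return [
        t["descr"] for t in cfg["port_templates"]
        if ("port_ids" in t) == ("sdk_port_ids" in t)
    ]


print(pool_allocation_total(config))
print(control_max_rate_bps(config))
print(invalid_port_templates(config))
```

With the example values, the 50 Control slots at 100 packets per second and 1500-byte packets add up to at most 60 Mbps of Control traffic.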

REST API
--------
We provide REST APIs with support for adding/removing/querying slices and
traffic classes, as well as flow classification.

Slice
^^^^^

Add a slice
"""""""""""
A POST request with Slice ID as path parameter.
``/slicing/slice/{sliceId}``

.. image:: ../images/qos-rest-slice-add.png
   :width: 700px

Remove a slice
""""""""""""""
A DELETE request with Slice ID as path parameter.
``/slicing/slice/{sliceId}``

.. image:: ../images/qos-rest-slice-remove.png
   :width: 700px

Get all slices
""""""""""""""
A GET request.
Returns a collection of slice IDs.
``/slicing/slice``

.. image:: ../images/qos-rest-slice-get.png
   :width: 700px

Traffic Class
^^^^^^^^^^^^^
.. tip::
   Traffic Class can take one of the following values: ``BEST_EFFORT``, ``CONTROL``, ``REAL_TIME``, ``ELASTIC``.

Add a traffic class to a slice
""""""""""""""""""""""""""""""
A POST request with Slice ID and Traffic Class as path parameters.
``/slicing/tc/{sliceId}/{tc}``

.. image:: ../images/qos-rest-tc-add.png
   :width: 700px

Remove a traffic class from a slice
"""""""""""""""""""""""""""""""""""
A DELETE request with Slice ID and Traffic Class as path parameters.
``/slicing/tc/{sliceId}/{tc}``

.. image:: ../images/qos-rest-tc-remove.png
   :width: 700px

Get all traffic classes from a slice
""""""""""""""""""""""""""""""""""""
A GET request with Slice ID as path parameter.
Returns a collection of traffic classes.
``/slicing/tc/{sliceId}``

.. image:: ../images/qos-rest-tc-get.png
   :width: 700px
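
As a rough sketch of how a client might build requests for the slice and traffic class endpoints, the snippet below uses Python's standard ``urllib.request`` module. ``BASE_URL`` is a hypothetical placeholder, the helper names are made up for this example, and integer slice IDs are an assumption; authentication, which your ONOS deployment may require, is not covered.

```python
import urllib.request

# Hypothetical base URL: replace it with the address where your ONOS
# instance exposes the slicing REST API.
BASE_URL = "http://onos.example.com:8181/onos/app"

TRAFFIC_CLASSES = {"BEST_EFFORT", "CONTROL", "REAL_TIME", "ELASTIC"}


def slice_request(method, slice_id=None):
    """Build a request for /slicing/slice or /slicing/slice/{sliceId}."""
    path = "/slicing/slice" if slice_id is None else f"/slicing/slice/{slice_id}"
    return urllib.request.Request(BASE_URL + path, method=method)


def tc_request(method, slice_id, tc=None):
    """Build a request for /slicing/tc/{sliceId} or /slicing/tc/{sliceId}/{tc}."""
    if tc is not None and tc not in TRAFFIC_CLASSES:
        raise ValueError(f"unknown traffic class: {tc}")
    path = f"/slicing/tc/{slice_id}" + ("" if tc is None else f"/{tc}")
    return urllib.request.Request(BASE_URL + path, method=method)


add_slice = slice_request("POST", 1)
add_tc = tc_request("POST", 1, "REAL_TIME")
list_tcs = tc_request("GET", 1)
print(add_tc.get_method(), add_tc.full_url)
```

The requests are only constructed here; passing one of them to ``urllib.request.urlopen`` would actually send it.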

Classify Flow
^^^^^^^^^^^^^

A flow is defined by a list of match criteria. The example below lists every
supported criterion type for illustration; a real flow would normally include
only the TCP or the UDP port criteria, consistent with ``IP_PROTO``.

.. code-block:: json

   {
     "criteria": [
       {
         "type": "IPV4_SRC",
         "ip": "10.0.0.1/32"
       },
       {
         "type": "IPV4_DST",
         "ip": "10.0.0.2/32"
       },
       {
         "type": "IP_PROTO",
         "protocol": 6
       },
       {
         "type": "TCP_SRC",
         "tcpPort": 1000
       },
       {
         "type": "TCP_DST",
         "tcpPort": 80
       },
       {
         "type": "UDP_SRC",
         "udpPort": 1000
       },
       {
         "type": "UDP_DST",
         "udpPort": 1812
       }
     ]
   }

- ``IPV4_SRC``: Source IPv4 prefix

- ``IPV4_DST``: Destination IPv4 prefix

- ``IP_PROTO``: IP protocol, accepts 6 (TCP) and 17 (UDP)

- ``TCP_SRC``: Source L4 (TCP) port

- ``TCP_DST``: Destination L4 (TCP) port

- ``UDP_SRC``: Source L4 (UDP) port

- ``UDP_DST``: Destination L4 (UDP) port

.. note::
   SD-Fabric currently supports matching on the IPv4 5-tuple only.
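
The flow description can also be built programmatically. Here is a minimal sketch, assuming that the L4 port criteria are optional and that omitting one leaves that field unmatched; the ``make_flow`` helper is hypothetical and only uses the criterion types and keys listed above.

```python
import json


def make_flow(src, dst, proto, src_port=None, dst_port=None):
    """Build a flow description from the criterion types listed above."""
    l4_names = {6: "TCP", 17: "UDP"}
    if proto not in l4_names:
        raise ValueError("IP_PROTO must be 6 (TCP) or 17 (UDP)")
    name = l4_names[proto]
    port_key = name.lower() + "Port"  # "tcpPort" or "udpPort"
    criteria = [
        {"type": "IPV4_SRC", "ip": src},
        {"type": "IPV4_DST", "ip": dst},
        {"type": "IP_PROTO", "protocol": proto},
    ]
    # Only add the port criteria that were requested (assumed to be optional).
    if src_port is not None:
        criteria.append({"type": f"{name}_SRC", port_key: src_port})
    if dst_port is not None:
        criteria.append({"type": f"{name}_DST", port_key: dst_port})
    return {"criteria": criteria}


flow = make_flow("10.0.0.1/32", "10.0.0.2/32", proto=6, src_port=1000, dst_port=80)
print(json.dumps(flow, indent=2))
```

Deriving the criterion names from ``IP_PROTO`` keeps the port criteria consistent with the selected protocol.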

Classify a flow to a slice and traffic class
""""""""""""""""""""""""""""""""""""""""""""
A POST request with Slice ID and Traffic Class as path parameters,
and the JSON description of the flow as the request body.
``/slicing/flow/{sliceId}/{tc}``

.. image:: ../images/qos-rest-classifier-add.png
   :width: 700px

Remove a flow from a slice and traffic class
""""""""""""""""""""""""""""""""""""""""""""
A DELETE request with Slice ID and Traffic Class as path parameters,
and the JSON description of the flow as the request body.
``/slicing/flow/{sliceId}/{tc}``

.. image:: ../images/qos-rest-classifier-remove.png
   :width: 700px

Get all classified flows from a slice and traffic class
"""""""""""""""""""""""""""""""""""""""""""""""""""""""
A GET request with Slice ID and Traffic Class as path parameters.
Returns a collection of flows.
``/slicing/flow/{sliceId}/{tc}``

.. image:: ../images/qos-rest-classifier-get.png
   :width: 700px
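
For the endpoints that take a flow as the request body, a client could attach the JSON description as sketched below. This is an illustration under assumptions: ``BASE_URL`` is a hypothetical placeholder, ``flow_request`` is a made-up helper, and the ``Content-Type`` header value is an assumption rather than a documented requirement.

```python
import json
import urllib.request

# Hypothetical base URL: replace it with the address where your ONOS
# instance exposes the slicing REST API.
BASE_URL = "http://onos.example.com:8181/onos/app"

# A UDP flow using criterion types from the "Classify Flow" section.
flow = {
    "criteria": [
        {"type": "IPV4_DST", "ip": "10.0.0.2/32"},
        {"type": "IP_PROTO", "protocol": 17},
        {"type": "UDP_DST", "udpPort": 1812},
    ]
}


def flow_request(method, slice_id, tc, flow_body):
    """Build a request for /slicing/flow/{sliceId}/{tc} with a JSON body."""
    data = json.dumps(flow_body).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/slicing/flow/{slice_id}/{tc}", data=data, method=method
    )
    # Assumed header: declare that the body is JSON.
    req.add_header("Content-Type", "application/json")
    return req


classify = flow_request("POST", 1, "ELASTIC", flow)
remove = flow_request("DELETE", 1, "ELASTIC", flow)
print(classify.get_method(), classify.full_url)
```

As in the sketch, the same flow description is sent in both the POST and the DELETE request, so a client can keep it around to undo a classification later.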