.. _slicing_qos:

Slicing and QoS
===============

.. _qos_configuration:

Overview
--------

Network slicing enables sharing the same physical infrastructure between
independent logical networks, each one targeting different use cases while
providing isolation and security guarantees. Slicing permits the implementation
of tailor-made applications with Quality of Service (QoS) specific to the needs
of each slice, rather than a one-size-fits-all approach.

SD-Fabric supports slicing and QoS using dedicated hardware resources such as
scheduling queues and meters. Once a packet enters the fabric, it is associated
with a slice ID and Traffic Class (TC). Slice ID is an arbitrary identifier,
while TC is used to determine the QoS parameters. The combination of slice ID
and TC is used by SD-Fabric to determine which switch hardware queue to use.

We provide fabric-wide isolation and QoS guarantees. Packets are classified by
the first leaf switch in the path. We then use a custom DSCP-based marking
scheme to apply the same treatment on all switches.

Classification is available both for regular traffic, via REST APIs, and for
GTP-U traffic terminated by P4-UPF, via PFCP integration.

Traffic Classes
^^^^^^^^^^^^^^^

We support the following traffic classes, which cover the spectrum of
applications from latency-sensitive to throughput-intensive.

Control
"""""""
For applications demanding ultra-low latency and jitter guarantees, with
non-bursty, low throughput requirements on the order of hundreds of packets per
second. Examples of such applications are consensus protocols, industrial
automation, timing, etc. This class uses a queue shared by all slices, serviced
with the highest priority. To enforce isolation between slices, and to avoid
starvation of lower priority classes, each slice is processed through a
single-rate two-color meter. Slices sending at rates higher than the configured
meter rate might observe packet drops.

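The behavior of a single-rate two-color meter can be sketched as a token
bucket. The class below is an illustrative model only: it counts packets rather
than bytes, and the names ``TwoColorMeter`` and ``mark`` are hypothetical, not
part of SD-Fabric or of the switch SDK.

.. code-block:: python

   class TwoColorMeter:
       """Token bucket refilled at rate_pps, holding at most burst_pkts tokens."""

       def __init__(self, rate_pps, burst_pkts):
           self.rate_pps = rate_pps
           self.burst_pkts = burst_pkts
           self.tokens = float(burst_pkts)  # start with a full bucket
           self.last_time = 0.0

       def mark(self, now):
           """Return "GREEN" if the packet conforms to the rate, else "RED"."""
           elapsed = now - self.last_time
           self.last_time = now
           # Refill according to elapsed time, never beyond the burst size.
           self.tokens = min(self.burst_pkts,
                             self.tokens + elapsed * self.rate_pps)
           if self.tokens >= 1.0:
               self.tokens -= 1.0
               return "GREEN"
           return "RED"


   # A slice sends 12 back-to-back packets through a 100 pps, 10-packet meter.
   meter = TwoColorMeter(rate_pps=100, burst_pkts=10)
   colors = [meter.mark(now=0.0) for _ in range(12)]

In this model the first 10 packets fit in the burst and are marked green, while
the remaining 2 are marked red, which corresponds to the drops a slice might
observe when it exceeds its configured rate.
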
Real-Time
"""""""""
For applications that require both low latency and sustained throughput.
Examples of such applications are video and audio streaming. Each slice gets a
dedicated Real-Time queue serviced in a Round-Robin fashion to guarantee the
lowest latency at all times, even with bursty senders. To avoid starvation of
lower priority classes, Real-Time queues are shaped at a maximum rate. Slices
sending at rates higher than the configured one might observe higher latency
because of the shaping. Real-Time queues have priority lower than Control, but
higher than Elastic.

Elastic
"""""""
For throughput-intensive applications with no latency requirements. This class
is best suited for large file transfers, Intranet/enterprise applications,
prioritized Internet access, etc. Each slice gets a dedicated Elastic queue
serviced in Weighted Round-Robin (WRR) fashion with configurable weights. During
congestion, Elastic queues are guaranteed to receive a minimum bandwidth, which
can grow up to the link capacity if other queues are empty.

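To make the Elastic behavior concrete, the following sketch computes how an
idealized WRR scheduler would split a link among the Elastic queues that
currently have packets to send. It ignores the higher-priority classes and
treats the weights as plain integers; ``wrr_shares`` is a hypothetical helper
written for this example, not an SD-Fabric API.

.. code-block:: python

   def wrr_shares(capacity_bps, weights, backlogged):
       """Split capacity_bps among backlogged queues in proportion to weight."""
       active_weight = sum(w for w, busy in zip(weights, backlogged) if busy)
       if active_weight == 0:
           return [0.0] * len(weights)
       return [
           capacity_bps * w / active_weight if busy else 0.0
           for w, busy in zip(weights, backlogged)
       ]


   # Two slices with weights 1 and 2 sharing a 300 Mbps link.
   both_busy = wrr_shares(300_000_000, [1, 2], [True, True])
   one_idle = wrr_shares(300_000_000, [1, 2], [True, False])

With both queues backlogged, the slices get 100 Mbps and 200 Mbps respectively.
When the second queue is empty, the first one can use the full 300 Mbps.
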
Best-Effort
"""""""""""
This is the default traffic class, used by packets not classified with any of
the above classes. All slices share the same Best-Effort queue, which has the
lowest priority.

Classification
^^^^^^^^^^^^^^^

Slice ID and TC classification can be performed in two ways.

Regular traffic
"""""""""""""""
We provide an ACL-like API that supports specifying wildcard match rules on the
IPv4 5-tuple.

P4-UPF traffic
""""""""""""""
When using the embedded UPF function, for GTP-U mobile traffic terminated by the
fabric, we support integration with PFCP QoS features such as prioritization via
QoS Flow Identifier (QFI), Maximum Bitrate (MBR) limits, and Guaranteed Bitrate
(GBR).

You can configure a static one-to-one mapping between 3GPP’s QFIs and
SD-Fabric’s TCs using the ONOS netcfg JSON file (work-in-progress), while MBR
and GBR configurations are translated into meter configurations.

QoS classification uses the same table as GTP-U tunnel termination. For this
reason, to achieve fabric-wide QoS enforcement, we recommend enabling the UPF
function on each leaf switch using the distributed UPF mode, such that packets
are classified as soon as they enter the network.

Support for slicing of mobile traffic is work-in-progress and will be added in
the next SD-Fabric release.

Configuration
-------------
.. note:: QoS and slicing are currently configured statically at switch startup.
   Dynamic configuration will be supported in a future SD-Fabric release.

QoS and slicing use the switch queue configuration provided via the
``vendor_config`` portion of the Stratum Chassis Config (see
:ref:`stratum_chassis_config`), where the queues and schedulers can be
configured. For more information on the format of ``vendor_config``, see the
`guide for running Stratum on Tofino-based switches
<https://github.com/stratum/stratum/blob/main/stratum/hal/bin/barefoot/README.run.md>`_
in the Stratum repository.

We provide a convenient `script <https://github.com/stratum/fabric-tna/blob/main/util/gen-stratum-qos-config.py>`_
to generate the configuration starting from a higher-level description provided via a YAML file.
This file lets you configure the parameters for the traffic classes listed in the section above.

Here's a list of parameters that you can configure via the YAML QoS configuration file:

* ``max_cells``: Maximum number of buffer cells, depends on the ASIC SKU/revision.

* ``pool_allocations``: Percentage of buffer cells allocated to each traffic class.
  The sum should be 100. Usually, we leave a portion of the buffer ``unassigned``
  for queues that do not have a pool (yet).
  Examples of such queues are those for the recirculation port, CPU port, etc.

  .. code-block:: yaml

     pool_allocations:
       control: 1
       realtime: 9
       elastic: 80
       besteffort: 9
       unassigned: 1

* **Control** Traffic Class: The available bandwidth dedicated to Control traffic is divided into *slots*.
  Each slot has a maximum rate and burst (in packets of the given MTU).
  A slice can use one or more slots by appropriately configuring meters in the fabric ingress pipeline.

  * ``control_slot_count``: Number of slots.
  * ``control_slot_rate_pps``: Rate of each slot, in packets per second.
  * ``control_slot_burst_pkts``: Number of packets per burst of each slot.
  * ``control_mtu_bytes``: MTU of packets for the PPS and burst values.

  .. code-block:: yaml

     control_slot_count: 50
     control_slot_rate_pps: 100
     control_slot_burst_pkts: 10
     control_mtu_bytes: 1500

* **Real-Time** Traffic Class Configuration:

  * ``realtime_max_rates_bps``: List of maximum shaping rates for Real-Time queues,
    one per slice requesting such service.

  * ``realtime_max_burst_s``: Maximum amount of time that a Real-Time queue can
    burst at the port speed. This parameter is used to limit delay for Elastic
    queues.

  .. code-block:: yaml

     realtime_max_rates_bps:
       - 45000000 # 45 Mbps
       - 30000000 # 30 Mbps
       - 25000000 # 25 Mbps
     realtime_max_burst_s: 0.005 # 5 ms

* **Elastic** Traffic Class Configuration:

  * ``elastic_min_rates_bps``: List of minimum guaranteed rates for Elastic queues,
    one per slice requesting such service.

  .. code-block:: yaml

     elastic_min_rates_bps:
       - 100000000 # 100 Mbps
       - 200000000 # 200 Mbps

* ``port_templates`` section: List of switch ports for which we want to configure
  queues.

  Every ``port_templates`` element contains:

  * ``descr``: Description of the port purpose.

  * ``rate_bps``: Port speed in bits per second.

  * ``is_shaping_enabled``: ``true`` if the rate is enforced using shaping,
    ``false`` if the rate is the channel speed.

  * ``shaping_burst_bytes``: Burst size in bytes, meaningful only if port speed
    is shaped (when ``is_shaping_enabled: true``).

  * ``queue_count``: Number of queues assigned to the port.

  * ``port_ids``: List of Stratum port IDs (:ref:`singleton_port` from Stratum Chassis Config)
    using this port template. Used for ports that correspond to switch front-panel ports.

    Mutually exclusive with the ``sdk_port_ids`` field.

  * ``sdk_port_ids``: List of SDK port numbers (i.e., Tofino ``DP_ID``) using this port template.
    Used for internal ports (e.g., recirculation ports).

    Mutually exclusive with the ``port_ids`` field.

  .. code-block:: yaml

     port_templates:
       - descr: "Base station"
         rate_bps: 1000000000 # 1 Gbps
         is_shaping_enabled: true
         shaping_burst_bytes: 18000 # 2x jumbo frames
         queue_count: 16
         port_ids:
           - 100
       - descr: "Servers"
         port_ids:
           - 200
         rate_bps: 40000000000 # 40 Gbps
         is_shaping_enabled: false
         queue_count: 16
       - descr: "Recirculation"
         sdk_port_ids:
           - 68
         rate_bps: 100000000000 # 100 Gbps
         is_shaping_enabled: false
         queue_count: 16

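To get a feel for what the example values above imply, the following sketch
works out two derived quantities: the aggregate bandwidth reserved for Control
traffic, and the number of bytes a Real-Time queue can send while bursting at
port speed. The formulas are our reading of the parameter descriptions, and the
function names are hypothetical; the generator script may compute these values
differently.

.. code-block:: python

   def control_bandwidth_bps(slot_count, slot_rate_pps, mtu_bytes):
       """Aggregate Control bandwidth if every slot sends MTU-sized packets."""
       return slot_count * slot_rate_pps * mtu_bytes * 8


   def realtime_burst_bytes(port_rate_bps, max_burst_s):
       """Bytes sent when bursting at port speed for max_burst_s seconds."""
       return round(port_rate_bps * max_burst_s / 8)


   # Values taken from the YAML snippets above.
   control_bps = control_bandwidth_bps(50, 100, 1500)
   burst = realtime_burst_bytes(1_000_000_000, 0.005)

With 50 slots of 100 pps each and a 1500-byte MTU, Control traffic can use up
to 60 Mbps in total. A 5 ms burst on the 1 Gbps port corresponds to 625000
bytes.
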
An example of a complete QoS and Slicing configuration can be found `here <https://github.com/stratum/fabric-tna/blob/main/util/sample-qos-config.yaml>`_.

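Before passing a description to the generator script, it can help to
sanity-check the constraints stated in the parameter list. The sketch below
operates on the configuration as a plain Python dictionary (for instance, as
loaded with a YAML parser) and checks only what this section states explicitly.
``check_qos_config`` is a hypothetical helper, not part of the fabric-tna
utilities.

.. code-block:: python

   def check_qos_config(cfg):
       """Return a list of problems found in a parsed QoS description."""
       problems = []

       # The pool percentages, including "unassigned", should add up to 100.
       total = sum(cfg.get("pool_allocations", {}).values())
       if total != 100:
           problems.append(f"pool_allocations sum to {total}, expected 100")

       for tmpl in cfg.get("port_templates", []):
           descr = tmpl.get("descr", "<no descr>")
           if "port_ids" in tmpl and "sdk_port_ids" in tmpl:
               problems.append(
                   f"{descr}: port_ids and sdk_port_ids are mutually exclusive")
           if "shaping_burst_bytes" in tmpl and not tmpl.get("is_shaping_enabled"):
               problems.append(
                   f"{descr}: shaping_burst_bytes has no effect without shaping")

       return problems


   sample = {
       "pool_allocations": {"control": 1, "realtime": 9, "elastic": 80,
                            "besteffort": 9, "unassigned": 1},
       "port_templates": [
           {"descr": "Base station", "rate_bps": 1_000_000_000,
            "is_shaping_enabled": True, "shaping_burst_bytes": 18000,
            "queue_count": 16, "port_ids": [100]},
       ],
   }
   problems = check_qos_config(sample)

For the sample shown, which mirrors the snippets above, the returned list is
empty.
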
REST API
--------
We provide REST APIs with support for adding/removing/querying slices and
traffic classes, as well as flow classification.

Slice
^^^^^

Add a slice
"""""""""""
A POST request to ``/slicing/slice/{sliceId}``, with the slice ID as path parameter.

.. image:: ../images/qos-rest-slice-add.png
   :width: 700px

Remove a slice
"""""""""""""""
A DELETE request to ``/slicing/slice/{sliceId}``, with the slice ID as path parameter.

.. image:: ../images/qos-rest-slice-remove.png
   :width: 700px

Get all slices
""""""""""""""
A GET request to ``/slicing/slice``.
Returns a collection of slice IDs.

.. image:: ../images/qos-rest-slice-get.png
   :width: 700px

Traffic Class
^^^^^^^^^^^^^
.. tip::
   A Traffic Class can take one of the following values: ``BEST_EFFORT``, ``CONTROL``, ``REAL_TIME``, ``ELASTIC``.

Add a traffic class to a slice
""""""""""""""""""""""""""""""
A POST request to ``/slicing/tc/{sliceId}/{tc}``, with the slice ID and traffic
class as path parameters.

.. image:: ../images/qos-rest-tc-add.png
   :width: 700px

Remove a traffic class from a slice
"""""""""""""""""""""""""""""""""""
A DELETE request to ``/slicing/tc/{sliceId}/{tc}``, with the slice ID and traffic
class as path parameters.

.. image:: ../images/qos-rest-tc-remove.png
   :width: 700px

Get all traffic classes from a slice
""""""""""""""""""""""""""""""""""""
A GET request to ``/slicing/tc/{sliceId}``, with the slice ID as path parameter.
Returns a collection of traffic classes.

.. image:: ../images/qos-rest-tc-get.png
   :width: 700px

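When scripting against these endpoints, it may be convenient to build the
request paths with a small helper that rejects unknown traffic class names
early. The sketch below only assembles the paths listed above. The
``TrafficClass`` enum and the function names are hypothetical, and treating the
slice ID as an integer is an assumption.

.. code-block:: python

   from enum import Enum


   class TrafficClass(Enum):
       BEST_EFFORT = "BEST_EFFORT"
       CONTROL = "CONTROL"
       REAL_TIME = "REAL_TIME"
       ELASTIC = "ELASTIC"


   def slice_path(slice_id):
       """Path used to add (POST) or remove (DELETE) a slice."""
       return f"/slicing/slice/{slice_id}"


   def tc_path(slice_id, tc):
       """Path used to add (POST) or remove (DELETE) a TC on a slice."""
       tc = TrafficClass(tc)  # raises ValueError for an unknown name
       return f"/slicing/tc/{slice_id}/{tc.value}"

For example, ``tc_path(1, "REAL_TIME")`` yields ``/slicing/tc/1/REAL_TIME``,
while a misspelled class name raises ``ValueError`` before any request is sent.
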
Classify Flow
^^^^^^^^^^^^^

A flow can be defined as follows:

.. code-block:: json

   {
     "criteria": [
       {
         "type": "IPV4_SRC",
         "ip": "10.0.0.1/32"
       },
       {
         "type": "IPV4_DST",
         "ip": "10.0.0.2/32"
       },
       {
         "type": "IP_PROTO",
         "protocol": 6
       },
       {
         "type": "TCP_SRC",
         "tcpPort": 1000
       },
       {
         "type": "TCP_DST",
         "tcpPort": 80
       },
       {
         "type": "UDP_SRC",
         "udpPort": 1000
       },
       {
         "type": "UDP_DST",
         "udpPort": 1812
       }
     ]
   }

- ``IPV4_SRC``: Source IPv4 prefix

- ``IPV4_DST``: Destination IPv4 prefix

- ``IP_PROTO``: IP protocol; accepted values are 6 (TCP) and 17 (UDP)

- ``TCP_SRC``: Source L4 (TCP) port

- ``TCP_DST``: Destination L4 (TCP) port

- ``UDP_SRC``: Source L4 (UDP) port

- ``UDP_DST``: Destination L4 (UDP) port

.. note::
   SD-Fabric currently supports classification on the 5-tuple only.

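The flow description above can also be assembled programmatically. The sketch
below builds the ``criteria`` list from 5-tuple fields and picks the TCP or UDP
criterion names according to the protocol. It assumes that a field left out of
the list is treated as a wildcard, which this section does not state
explicitly. ``flow_criteria`` is a hypothetical helper.

.. code-block:: python

   import json

   # IP protocol number -> (criterion name prefix, JSON key for the port)
   L4_FIELDS = {6: ("TCP", "tcpPort"), 17: ("UDP", "udpPort")}


   def flow_criteria(ip_src=None, ip_dst=None, proto=None,
                     src_port=None, dst_port=None):
       """Build a flow description; omitted fields are left out entirely."""
       criteria = []
       if ip_src is not None:
           criteria.append({"type": "IPV4_SRC", "ip": ip_src})
       if ip_dst is not None:
           criteria.append({"type": "IPV4_DST", "ip": ip_dst})
       if proto is not None:
           if proto not in L4_FIELDS:
               raise ValueError("IP_PROTO must be 6 (TCP) or 17 (UDP)")
           criteria.append({"type": "IP_PROTO", "protocol": proto})
       if src_port is not None or dst_port is not None:
           if proto is None:
               raise ValueError("L4 ports require an IP protocol")
           name, key = L4_FIELDS[proto]
           if src_port is not None:
               criteria.append({"type": f"{name}_SRC", key: src_port})
           if dst_port is not None:
               criteria.append({"type": f"{name}_DST", key: dst_port})
       return {"criteria": criteria}


   # HTTP traffic from 10.0.0.1 to 10.0.0.2, any source port.
   flow = flow_criteria("10.0.0.1/32", "10.0.0.2/32", proto=6, dst_port=80)
   body = json.dumps(flow, indent=2)

The resulting ``body`` string can be used as the request body of the flow
classification endpoints.
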
Classify a flow to a slice and traffic class
""""""""""""""""""""""""""""""""""""""""""""
A POST request to ``/slicing/flow/{sliceId}/{tc}``, with the slice ID and traffic
class as path parameters and the JSON description of the flow as the request body.

.. image:: ../images/qos-rest-classifier-add.png
   :width: 700px

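As an illustration, a classification request could be prepared with the Python
standard library as sketched below. The base URL is a placeholder: the host,
port, and path prefix under which the API is exposed are not given in this
section, and authentication is left out. The function only builds the request
object and does not send it.

.. code-block:: python

   import json
   import urllib.request

   # Hypothetical base URL; replace with the address of your deployment.
   BASE_URL = "http://onos.example:8181/api"


   def classify_flow_request(base_url, slice_id, tc, flow, method="POST"):
       """Build (but do not send) a flow classification request."""
       url = f"{base_url}/slicing/flow/{slice_id}/{tc}"
       body = json.dumps(flow).encode("utf-8")
       return urllib.request.Request(
           url,
           data=body,
           method=method,
           headers={"Content-Type": "application/json"},
       )


   flow = {
       "criteria": [
           {"type": "IPV4_DST", "ip": "10.0.0.2/32"},
           {"type": "IP_PROTO", "protocol": 6},
           {"type": "TCP_DST", "tcpPort": 80},
       ]
   }
   request = classify_flow_request(BASE_URL, 1, "REAL_TIME", flow)

The request can then be sent with ``urllib.request.urlopen(request)``. Passing
``method="DELETE"`` builds the corresponding removal request with the same body.
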
Remove a flow from a slice and traffic class
""""""""""""""""""""""""""""""""""""""""""""
A DELETE request to ``/slicing/flow/{sliceId}/{tc}``, with the slice ID and traffic
class as path parameters and the JSON description of the flow as the request body.

.. image:: ../images/qos-rest-classifier-remove.png
   :width: 700px

Get all classified flows from a slice and traffic class
"""""""""""""""""""""""""""""""""""""""""""""""""""""""
A GET request to ``/slicing/flow/{sliceId}/{tc}``, with the slice ID and traffic
class as path parameters.
Returns a collection of flows.

.. image:: ../images/qos-rest-classifier-get.png
   :width: 700px