.. SPDX-FileCopyrightText: 2021 Open Networking Foundation <info@opennetworking.org>
.. SPDX-License-Identifier: Apache-2.0

.. _slicing_qos:

Slicing and QoS
===============

.. _qos_configuration:

Overview
--------

Network slicing enables sharing the same physical infrastructure between
independent logical networks, each one targeting different use cases while
providing isolation and security guarantees. Slicing permits the implementation
of tailor-made applications with Quality of Service (QoS) specific to the needs
of each slice, rather than a one-size-fits-all approach.

SD-Fabric supports slicing and QoS using dedicated hardware resources such as
scheduling queues and meters. Once a packet enters the fabric, it is associated
with a slice ID and a Traffic Class (TC). The slice ID is an arbitrary
identifier, while the TC is used to determine the QoS parameters. The
combination of slice ID and TC is used by SD-Fabric to determine which switch
hardware queue to use.

We provide fabric-wide isolation and QoS guarantees. Packets are classified by
the first leaf switch in the path. A custom DSCP-based marking scheme is then
used to apply the same treatment on all switches.

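The marking scheme itself is not specified in this section. As a minimal
sketch, assuming the 6-bit DSCP field carries a 4-bit slice ID in its upper bits
and a 2-bit TC in its lower bits (both widths are assumptions made for
illustration), the value written by the first leaf and read back by downstream
switches could be computed as follows:

.. code-block:: python

    from typing import Tuple

    SLICE_ID_BITS = 4  # assumed width, not stated in this document
    TC_BITS = 2        # assumed width, not stated in this document


    def encode_dscp(slice_id: int, tc: int) -> int:
        """Pack a slice ID and a TC into a single DSCP value."""
        if not 0 <= slice_id < (1 << SLICE_ID_BITS):
            raise ValueError(f"slice_id out of range: {slice_id}")
        if not 0 <= tc < (1 << TC_BITS):
            raise ValueError(f"tc out of range: {tc}")
        return (slice_id << TC_BITS) | tc


    def decode_dscp(dscp: int) -> Tuple[int, int]:
        """Recover (slice_id, tc) from a DSCP value."""
        if not 0 <= dscp < (1 << (SLICE_ID_BITS + TC_BITS)):
            raise ValueError(f"dscp out of range: {dscp}")
        return dscp >> TC_BITS, dscp & ((1 << TC_BITS) - 1)


    dscp = encode_dscp(slice_id=3, tc=2)
    slice_id, tc = decode_dscp(dscp)

Because the pair travels in the IP header, switches after the first leaf only
need to decode it, without repeating the classification.
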
Classification is supported both for regular traffic, via REST APIs, and for
GTP-U traffic terminated by P4-UPF, via PFCP integration.

Traffic Classes
^^^^^^^^^^^^^^^

We support the following traffic classes, which cover the spectrum of
applications from latency-sensitive to throughput-intensive.

Control
"""""""
For applications demanding ultra-low latency and jitter guarantees, with
non-bursty, low throughput requirements on the order of 100s of packets per
second. Examples of such applications are consensus protocols, industrial
automation, timing, etc. This class uses a queue shared by all slices, serviced
with the highest priority. To enforce isolation between slices, and to avoid
starvation of lower priority classes, each slice is processed through a
single-rate two-color meter. Slices sending at rates higher than the configured
meter rate might observe packet drops.

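To illustrate the metering behavior described for the Control class, here is a
minimal software model of a single-rate two-color meter, written as a token
bucket that counts packets. The class name, the ``GREEN``/``RED`` labels, and
the use of explicit timestamps are choices made for this sketch. The switch
implements metering in hardware.

.. code-block:: python

    class TwoColorMeter:
        """Token-bucket model of a single-rate two-color meter (in packets)."""

        def __init__(self, rate_pps: float, burst_pkts: int) -> None:
            self.rate_pps = rate_pps
            self.burst_pkts = burst_pkts
            self.tokens = float(burst_pkts)  # the bucket starts full
            self.last_ts = 0.0

        def mark(self, now: float) -> str:
            """Return the color of a packet arriving at time ``now`` (seconds)."""
            elapsed = max(0.0, now - self.last_ts)
            self.last_ts = now
            # Refill at the configured rate, never above the burst size.
            self.tokens = min(float(self.burst_pkts),
                              self.tokens + elapsed * self.rate_pps)
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return "GREEN"  # within the configured rate
            return "RED"        # above the rate: candidate for dropping


    meter = TwoColorMeter(rate_pps=100, burst_pkts=10)
    colors = [meter.mark(now=0.0) for _ in range(12)]  # 12 back-to-back packets

In this model the first 10 back-to-back packets fit in the burst allowance and
the remaining ones are marked ``RED`` until enough time passes for the bucket
to refill.
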
Real-Time
"""""""""
For applications that require both low latency and sustained throughput.
Examples of such applications are video and audio streaming. Each slice gets a
dedicated Real-Time queue serviced in a Round-Robin fashion to guarantee the
lowest latency at all times, even with bursty senders. To avoid starvation of
lower priority classes, Real-Time queues are shaped at a maximum rate. Slices
sending at rates higher than the configured one might observe higher latency
because of the shaping. Real-Time queues have priority lower than Control, but
higher than Elastic.

Elastic
"""""""
For throughput-intensive applications with no latency requirements. This class
is best suited for large file transfers, Intranet/enterprise applications,
prioritized Internet access, etc. Each slice gets a dedicated Elastic queue
serviced in Weighted Round-Robin (WRR) fashion with configurable weights. During
congestion, Elastic queues are guaranteed to receive minimum bandwidth that can
grow up to the link capacity if other queues are empty.

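The bandwidth sharing described for Elastic queues can be approximated with an
idealized WRR model, where the capacity is split among the queues that
currently have packets to send, in proportion to their weights. The function
below is a sketch under simplifying assumptions: it ignores higher-priority
classes and packet granularity, and it uses each queue's minimum rate as its
weight, which is our choice for the example.

.. code-block:: python

    from typing import Dict, Iterable


    def elastic_shares(link_bps: float,
                       weights: Dict[str, float],
                       backlogged: Iterable[str]) -> Dict[str, float]:
        """Split ``link_bps`` among backlogged queues in proportion to weight."""
        active = sorted(set(q for q in backlogged if q in weights))
        total = sum(weights[q] for q in active)
        if total <= 0:
            return {}
        return {q: link_bps * weights[q] / total for q in active}


    # Hypothetical slices, weighted by their minimum guaranteed rates.
    weights = {"slice-a": 100e6, "slice-b": 200e6}
    congested = elastic_shares(1e9, weights, ["slice-a", "slice-b"])
    alone = elastic_shares(1e9, weights, ["slice-a"])

When both queues are backlogged, each receives more than its minimum rate
because the weights add up to less than the link capacity. When only one queue
is backlogged, it can use the whole link.
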
Best-Effort
"""""""""""
This is the default traffic class, used by packets not classified with any of
the above classes. All slices share the same Best-Effort queue, which has the
lowest priority.

Classification
^^^^^^^^^^^^^^

Slice ID and TC classification can be performed in two ways.

Regular traffic
"""""""""""""""
We provide ACL-like APIs that support specifying wildcard match rules on the
IPv4 5-tuple.

P4-UPF traffic
""""""""""""""
When using the embedded UPF function, for GTP-U mobile traffic terminated by the
fabric, we support integration with PFCP QoS features such as prioritization via
QoS Flow Identifier (QFI), Maximum Bitrate (MBR) limits, and Guaranteed Bitrate
(GBR).

You can configure a static one-to-one mapping between 3GPP’s QFIs and
SD-Fabric’s TCs using the ONOS netcfg JSON file (work-in-progress), while MBR
and GBR configurations are translated into meter configurations.

QoS classification uses the same table as GTP-U tunnel termination. For this
reason, to achieve fabric-wide QoS enforcement, we recommend enabling the UPF
function on each leaf switch using the distributed UPF mode, such that packets
are classified as soon as they enter the network.

Support for slicing of mobile traffic is work-in-progress and will be added in
the next SD-Fabric release.

Configuration
-------------
.. note:: QoS and slicing are currently configured statically at switch startup.
    Dynamic configuration will be supported in a future SD-Fabric release.

QoS and slicing use the switch queue configuration provided via the
``vendor_config`` portion of the Stratum Chassis Config (see
:ref:`stratum_chassis_config`), where the queues and schedulers can be
configured. For more information on the format of ``vendor_config``, see the
`guide for running Stratum on Tofino-based switches
<https://github.com/stratum/stratum/blob/main/stratum/hal/bin/barefoot/README.run.md>`_
in the Stratum repository.

We provide a convenient `script <https://github.com/stratum/fabric-tna/blob/main/util/gen-qos-config.py>`_
to generate the configuration starting from a higher-level description provided via a YAML file.
This file lets you configure the parameters for the traffic classes listed in the section above.

Here's a list of parameters that you can configure via the YAML QoS configuration file:

* ``max_cells``: Maximum number of buffer cells, which depends on the ASIC SKU/revision.

* ``pool_allocations``: Percentage of buffer cells allocated to each traffic class.
  The sum should be 100. Usually, we leave a portion of the buffer ``unassigned``
  for queues that do not have a pool (yet).
  Examples of such queues are those for the recirculation port, CPU port, etc.

  .. code-block:: yaml

    pool_allocations:
      control: 1
      realtime: 9
      elastic: 80
      besteffort: 9
      unassigned: 1

* **Control** Traffic Class: The available bandwidth dedicated to Control traffic is divided into *slots*.
  Each slot has a maximum rate and burst (in packets of the given MTU).
  A slice can use one or more slots by appropriately configuring meters in the fabric ingress pipeline.

  * ``control_slot_count``: Number of slots.
  * ``control_slot_rate_pps``: Rate of each slot, in packets per second.
  * ``control_slot_burst_pkts``: Number of packets per burst of each slot.
  * ``control_mtu_bytes``: MTU of packets for the PPS and burst values.

  .. code-block:: yaml

    control_slot_count: 50
    control_slot_rate_pps: 100
    control_slot_burst_pkts: 10
    control_mtu_bytes: 1500

* **Real-Time** Traffic Class Configuration:

  * ``realtime_max_rates_bps``: List of maximum shaping rates for Real-Time queues,
    one per slice requesting such service.

  * ``realtime_max_burst_s``: Maximum amount of time that a Real-Time queue can
    burst at the port speed. This parameter is used to limit delay for Elastic
    queues.

  .. code-block:: yaml

    realtime_max_rates_bps:
      - 45000000 # 45 Mbps
      - 30000000 # 30 Mbps
      - 25000000 # 25 Mbps
    realtime_max_burst_s: 0.005 # 5 ms

* **Elastic** Traffic Class Configuration:

  * ``elastic_min_rates_bps``: List of minimum guaranteed rates for Elastic queues,
    one per slice requesting such service.

  .. code-block:: yaml

    elastic_min_rates_bps:
      - 100000000 # 100 Mbps
      - 200000000 # 200 Mbps

* ``port_templates`` section: List of switch ports for which we want to configure
  queues.

  Every ``port_templates`` element contains:

  * ``descr``: Description of the port purpose.

  * ``rate_bps``: Port speed in bits per second.

  * ``is_shaping_enabled``: ``true`` if the rate is enforced using shaping,
    ``false`` if the rate is the channel speed.

  * ``shaping_burst_bytes``: Burst size in bytes, meaningful only if the port speed
    is shaped (when ``is_shaping_enabled: true``).

  * ``queue_count``: Number of queues assigned to the port.

  * ``port_ids``: List of Stratum port IDs (:ref:`singleton_port` from Stratum Chassis Config)
    using this port template. Used for ports that correspond to switch front-panel ports.

    Mutually exclusive with the ``sdk_port_ids`` field.

  * ``sdk_port_ids``: List of SDK port numbers (i.e., Tofino ``DP_ID``) using this port template.
    Used for internal ports (e.g., recirculation ports).

    Mutually exclusive with the ``port_ids`` field.

  .. code-block:: yaml

    port_templates:
      - descr: "Base station"
        rate_bps: 1000000000 # 1 Gbps
        is_shaping_enabled: true
        shaping_burst_bytes: 18000 # 2x jumbo frames
        queue_count: 16
        port_ids:
          - 100
      - descr: "Servers"
        port_ids:
          - 200
        rate_bps: 40000000000 # 40 Gbps
        is_shaping_enabled: false
        queue_count: 16
      - descr: "Recirculation"
        sdk_port_ids:
          - 68
        rate_bps: 100000000000 # 100 Gbps
        is_shaping_enabled: false
        queue_count: 16

An example of a complete QoS and Slicing configuration can be found `here <https://github.com/stratum/fabric-tna/blob/main/util/sample-qos-config.yaml>`_.

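Before running the generator, it can help to sanity-check the values in the
YAML file. The helpers below are an independent sketch and do not reproduce
what ``gen-qos-config.py`` does. The function names are ours, the
``max_cells`` value is made up, and requiring exactly one of ``port_ids`` and
``sdk_port_ids`` is our reading of "mutually exclusive". The configuration is
passed as plain Python values, so no YAML parser is needed.

.. code-block:: python

    from typing import Any, Dict


    def cells_per_pool(max_cells: int, pools: Dict[str, int]) -> Dict[str, int]:
        """Convert pool percentages into buffer cells, checking the sum is 100."""
        total = sum(pools.values())
        if total != 100:
            raise ValueError(f"pool_allocations must sum to 100, got {total}")
        return {name: max_cells * pct // 100 for name, pct in pools.items()}


    def control_bandwidth_bps(slot_count: int, rate_pps: int, mtu_bytes: int) -> int:
        """Sustained bandwidth of all Control slots at full-MTU packets."""
        return slot_count * rate_pps * mtu_bytes * 8


    def realtime_burst_bytes(port_rate_bps: float, max_burst_s: float) -> float:
        """Bytes a Real-Time queue can send while bursting at port speed."""
        return port_rate_bps * max_burst_s / 8


    def check_port_template(template: Dict[str, Any]) -> None:
        """Reject templates that set both, or neither, of the port ID lists."""
        if ("port_ids" in template) == ("sdk_port_ids" in template):
            raise ValueError("set exactly one of port_ids and sdk_port_ids")


    pools = {"control": 1, "realtime": 9, "elastic": 80,
             "besteffort": 9, "unassigned": 1}
    cells = cells_per_pool(200000, pools)            # 200000 is a made-up value
    control_bps = control_bandwidth_bps(50, 100, 1500)
    burst = realtime_burst_bytes(1000000000, 0.005)  # 1 Gbps port, 5 ms burst
    check_port_template({"descr": "Recirculation", "sdk_port_ids": [68]})

With the sample values used in this section, the 50 Control slots add up to
60 Mbps of sustained traffic, and a 5 ms Real-Time burst on a 1 Gbps port
corresponds to 625000 bytes.
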
REST API
--------
We provide REST APIs with support for adding/removing/querying slices and
traffic classes, as well as flow classification.

Slice
^^^^^

Add a slice
"""""""""""
A POST request with the Slice ID as path parameter.
``/slicing/slice/{sliceId}``

.. image:: ../images/qos-rest-slice-add.png
    :width: 700px

Remove a slice
""""""""""""""
A DELETE request with the Slice ID as path parameter.
``/slicing/slice/{sliceId}``

.. image:: ../images/qos-rest-slice-remove.png
    :width: 700px

Get all slices
""""""""""""""
A GET request.
Returns a collection of slice IDs.
``/slicing/slice``

.. image:: ../images/qos-rest-slice-get.png
    :width: 700px

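The three slice operations differ only in HTTP method and path, so a client
can build them with a single helper. The sketch below uses Python's standard
``urllib`` and only constructs the requests, without sending them.
``BASE_URL`` is a placeholder, since the address and path prefix of the REST
endpoint depend on the deployment, and authentication is left out.

.. code-block:: python

    import urllib.request
    from typing import Optional

    # Hypothetical base URL: replace with the REST endpoint of your deployment.
    BASE_URL = "http://onos.example.com:8181/api"


    def slice_request(method: str,
                      slice_id: Optional[int] = None) -> urllib.request.Request:
        """Build (without sending) a request for the slice endpoints."""
        path = "/slicing/slice"
        if slice_id is not None:
            path += f"/{slice_id}"
        return urllib.request.Request(BASE_URL + path, method=method)


    add_slice = slice_request("POST", 1)       # POST   /slicing/slice/1
    remove_slice = slice_request("DELETE", 1)  # DELETE /slicing/slice/1
    list_slices = slice_request("GET")         # GET    /slicing/slice

Passing one of these objects to ``urllib.request.urlopen()`` would send the
request.
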
Traffic Class
^^^^^^^^^^^^^
.. tip::
    A Traffic Class takes one of the following values: ``BEST_EFFORT``, ``CONTROL``, ``REAL_TIME``, ``ELASTIC``.

Add a traffic class to a slice
""""""""""""""""""""""""""""""
A POST request with the Slice ID and Traffic Class as path parameters.
``/slicing/tc/{sliceId}/{tc}``

.. image:: ../images/qos-rest-tc-add.png
    :width: 700px

Remove a traffic class from a slice
"""""""""""""""""""""""""""""""""""
A DELETE request with the Slice ID and Traffic Class as path parameters.
``/slicing/tc/{sliceId}/{tc}``

.. image:: ../images/qos-rest-tc-remove.png
    :width: 700px

Get all traffic classes from a slice
""""""""""""""""""""""""""""""""""""
A GET request with the Slice ID as path parameter.
Returns a collection of traffic classes.
``/slicing/tc/{sliceId}``

.. image:: ../images/qos-rest-tc-get.png
    :width: 700px

Classify Flow
^^^^^^^^^^^^^

A flow is defined as a list of match criteria. The example below shows every
supported criterion type. In practice, a single flow would carry either the TCP
or the UDP port criteria, matching its ``IP_PROTO``.

.. code-block:: json

    {
        "criteria": [
            {
                "type": "IPV4_SRC",
                "ip": "10.0.0.1/32"
            },
            {
                "type": "IPV4_DST",
                "ip": "10.0.0.2/32"
            },
            {
                "type": "IP_PROTO",
                "protocol": 6
            },
            {
                "type": "TCP_SRC",
                "tcpPort": 1000
            },
            {
                "type": "TCP_DST",
                "tcpPort": 80
            },
            {
                "type": "UDP_SRC",
                "udpPort": 1000
            },
            {
                "type": "UDP_DST",
                "udpPort": 1812
            }
        ]
    }

- ``IPV4_SRC``: Source IPv4 prefix

- ``IPV4_DST``: Destination IPv4 prefix

- ``IP_PROTO``: IP protocol; accepts 6 (TCP) and 17 (UDP)

- ``TCP_SRC``: Source L4 (TCP) port

- ``TCP_DST``: Destination L4 (TCP) port

- ``UDP_SRC``: Source L4 (UDP) port

- ``UDP_DST``: Destination L4 (UDP) port

.. note::
    SD-Fabric currently supports 5-tuple matching only.

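The criteria listed above can be assembled programmatically from a 5-tuple.
The helper below is a sketch: the function name and arguments are ours, and
treating an omitted field as a wildcard is an assumption about how the API
interprets missing criteria. It emits only the port criteria that match the
chosen protocol.

.. code-block:: python

    import json
    from typing import Any, Dict, List, Optional

    TCP, UDP = 6, 17


    def build_flow(src_prefix: Optional[str] = None,
                   dst_prefix: Optional[str] = None,
                   ip_proto: Optional[int] = None,
                   src_port: Optional[int] = None,
                   dst_port: Optional[int] = None) -> Dict[str, List[Dict[str, Any]]]:
        """Build the flow JSON body; omitted fields add no criterion."""
        criteria: List[Dict[str, Any]] = []
        if src_prefix is not None:
            criteria.append({"type": "IPV4_SRC", "ip": src_prefix})
        if dst_prefix is not None:
            criteria.append({"type": "IPV4_DST", "ip": dst_prefix})
        if ip_proto is not None:
            if ip_proto not in (TCP, UDP):
                raise ValueError("IP_PROTO accepts only 6 (TCP) and 17 (UDP)")
            criteria.append({"type": "IP_PROTO", "protocol": ip_proto})
        if src_port is not None or dst_port is not None:
            if ip_proto is None:
                raise ValueError("L4 ports require ip_proto to be set")
            name, key = ("TCP", "tcpPort") if ip_proto == TCP else ("UDP", "udpPort")
            if src_port is not None:
                criteria.append({"type": f"{name}_SRC", key: src_port})
            if dst_port is not None:
                criteria.append({"type": f"{name}_DST", key: dst_port})
        return {"criteria": criteria}


    flow = build_flow("10.0.0.1/32", "10.0.0.2/32", TCP, 1000, 80)
    body = json.dumps(flow, indent=4)  # string to send as the request body

Deriving the port criterion names from ``ip_proto`` avoids building a flow
that mixes TCP and UDP criteria.
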
Classify a flow to a slice and traffic class
""""""""""""""""""""""""""""""""""""""""""""
A POST request with the Slice ID and Traffic Class as path parameters,
and the JSON description of a flow as the request body.
``/slicing/flow/{sliceId}/{tc}``

.. image:: ../images/qos-rest-classifier-add.png
    :width: 700px

Remove a flow from a slice and traffic class
""""""""""""""""""""""""""""""""""""""""""""
A DELETE request with the Slice ID and Traffic Class as path parameters,
and the JSON description of a flow as the request body.
``/slicing/flow/{sliceId}/{tc}``

.. image:: ../images/qos-rest-classifier-remove.png
    :width: 700px

Get all classified flows from a slice and traffic class
"""""""""""""""""""""""""""""""""""""""""""""""""""""""
A GET request with the Slice ID and Traffic Class as path parameters.
Returns a collection of flows.
``/slicing/flow/{sliceId}/{tc}``

.. image:: ../images/qos-rest-classifier-get.png
    :width: 700px
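
As a sketch of how a client could prepare the flow classification calls, the
helper below builds requests with Python's standard ``urllib`` and attaches
the flow JSON as the body. It does not send them. ``BASE_URL`` is a
placeholder for the deployment-specific REST endpoint, the ``Content-Type``
header is an assumption about what the server expects, and authentication is
left out.

.. code-block:: python

    import json
    import urllib.request
    from typing import Any, Dict, Optional

    # Hypothetical base URL: replace with the REST endpoint of your deployment.
    BASE_URL = "http://onos.example.com:8181/api"
    TRAFFIC_CLASSES = ("BEST_EFFORT", "CONTROL", "REAL_TIME", "ELASTIC")


    def flow_request(method: str, slice_id: int, tc: str,
                     flow: Optional[Dict[str, Any]] = None) -> urllib.request.Request:
        """Build (without sending) a request for /slicing/flow/{sliceId}/{tc}."""
        if tc not in TRAFFIC_CLASSES:
            raise ValueError(f"unknown traffic class: {tc}")
        url = f"{BASE_URL}/slicing/flow/{slice_id}/{tc}"
        data = None if flow is None else json.dumps(flow).encode("utf-8")
        request = urllib.request.Request(url, data=data, method=method)
        if data is not None:
            request.add_header("Content-Type", "application/json")
        return request


    flow = {"criteria": [{"type": "IPV4_DST", "ip": "10.0.0.2/32"},
                         {"type": "IP_PROTO", "protocol": 17},
                         {"type": "UDP_DST", "udpPort": 1812}]}
    classify = flow_request("POST", 1, "REAL_TIME", flow)
    unclassify = flow_request("DELETE", 1, "REAL_TIME", flow)

Validating the traffic class name on the client side catches typos before a
request reaches the controller.
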