| QoS and Slicing |
| =============== |
| |
| .. _qos_configuration: |
| |
| Overview |
| -------- |
| |
| Network slicing enables sharing the same physical infrastructure between |
| independent logical networks, each one targeting different use cases while |
| providing isolation and security guarantees. Slicing permits the implementation |
| of tailor-made applications with Quality of Service (QoS) specific to the needs |
| of each slice, rather than a one-size-fits-all approach. |
| |
| SD-Fabric supports slicing and QoS using dedicated hardware resources such as |
| scheduling queues and meters. Once a packet enters the fabric, it is associated |
| with a slice ID and Traffic Class (TC). Slice ID is an arbitrary identifier, |
| while TC is used to determine the QoS parameters. The combination of slice ID |
| and TC is used by SD-Fabric to determine which switch hardware queue to use. |
| |
| We provide fabric-wide isolation and QoS guarantees. Packets are classified by |
| the first leaf switch in the path; we then use a custom DSCP-based marking |
| scheme to apply the same treatment on all switches. |
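
As a rough illustration of how a slice ID and TC could be carried together in the 6-bit DSCP field, the sketch below packs both into a single value. The 4-bit/2-bit split and the function names are assumptions made for this example only; the text above states just that a custom DSCP-based marking scheme is used.

```python
from typing import Tuple

# Illustrative only: these bit widths are assumed, not taken from this page.
SLICE_ID_BITS = 4  # assumed width of the slice ID
TC_BITS = 2        # assumed width of the traffic class


def encode_dscp(slice_id: int, tc: int) -> int:
    """Pack slice ID (high bits) and TC (low bits) into a 6-bit DSCP value."""
    if not 0 <= slice_id < (1 << SLICE_ID_BITS):
        raise ValueError("slice_id out of range")
    if not 0 <= tc < (1 << TC_BITS):
        raise ValueError("tc out of range")
    return (slice_id << TC_BITS) | tc


def decode_dscp(dscp: int) -> Tuple[int, int]:
    """Recover (slice_id, tc) from a value produced by encode_dscp."""
    if not 0 <= dscp < (1 << (SLICE_ID_BITS + TC_BITS)):
        raise ValueError("dscp out of range")
    return dscp >> TC_BITS, dscp & ((1 << TC_BITS) - 1)
```

With a scheme of this kind, any switch downstream of the classifying leaf can recover both values from the IP header alone, without repeating the classification lookup.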
| |
| Classification is supported both for regular traffic, via REST APIs, and for |
| GTP-U traffic terminated by P4-UPF, via PFCP integration. |
| |
| Traffic Classes |
| ^^^^^^^^^^^^^^^ |
| |
| We support the following traffic classes, which cover the spectrum of |
| applications from latency-sensitive to throughput-intensive. |
| |
| Control |
| """"""" |
| For applications demanding ultra-low latency and jitter guarantees, with |
| non-bursty, low-throughput requirements on the order of hundreds of packets per |
| second. Examples of such applications are consensus protocols, industrial |
| automation, timing, etc. This class uses a queue shared by all slices, serviced |
| with the highest priority. To enforce isolation between slices, and to avoid |
| starvation of lower priority classes, each slice is processed through a |
| single-rate two-color meter. Slices sending at rates higher than the configured |
| meter rate might observe packet drops. |
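
To make the metering behavior concrete, here is a minimal sketch of a single-rate two-color meter modeled as a token bucket. It counts packets rather than bytes for simplicity, and the class and method names are hypothetical; it illustrates the general technique, not the switch implementation.

```python
class TwoColorMeter:
    """Single-rate two-color meter modeled as a token bucket (in packets)."""

    def __init__(self, rate_pps: float, burst_pkts: int) -> None:
        self.rate_pps = rate_pps
        self.burst_pkts = burst_pkts
        self.tokens = float(burst_pkts)  # the bucket starts full
        self.last_ts = 0.0

    def mark(self, now: float) -> str:
        """Return "GREEN" if the packet conforms to the rate, "RED" otherwise."""
        if now > self.last_ts:
            # Refill tokens for the elapsed time, capped at the burst size.
            refill = (now - self.last_ts) * self.rate_pps
            self.tokens = min(float(self.burst_pkts), self.tokens + refill)
            self.last_ts = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return "GREEN"
        return "RED"
```

A slice that sends a burst larger than ``burst_pkts`` at a single instant sees the excess packets marked red, which in this traffic class corresponds to the drops described above.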
| |
| Real-Time |
| """"""""" |
| For applications that require both low latency and sustained throughput. |
| Examples of such applications are video and audio streaming. Each slice gets a |
| dedicated Real-Time queue serviced in a Round-Robin fashion to guarantee the |
| lowest latency at all times even with bursty senders. To avoid starvation of |
| lower priority classes, Real-Time queues are shaped at a maximum rate. Slices |
| sending at rates higher than the configured one might observe higher latency |
| because of the shaping. Real-Time queues have priority lower than Control, but |
| higher than Elastic. |
| |
| Elastic |
| """"""" |
| For throughput-intensive applications with no latency requirements. This class |
| is best suited for large file transfers, Intranet/enterprise applications, |
| prioritized Internet access, etc. Each slice gets a dedicated Elastic queue |
| serviced in Weighted Round-Robin (WRR) fashion with configurable weights. During |
| congestion, Elastic queues are guaranteed to receive a minimum bandwidth that can |
| grow up to the link capacity if other queues are empty. |
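
The following sketch shows, in an idealized form, how weighted sharing lets an Elastic queue grow beyond its minimum when other queues are idle. It assumes that the weights are proportional to the configured minimum rates and ignores higher-priority classes; the function name is hypothetical.

```python
from typing import List


def elastic_shares(min_rates_bps: List[int], backlogged: List[bool],
                   capacity_bps: int) -> List[int]:
    """Split capacity among backlogged queues in proportion to their weights.

    Weights are assumed to equal the configured minimum rates.
    """
    total = sum(r for r, b in zip(min_rates_bps, backlogged) if b)
    if total == 0:
        # No queue has traffic to send, or all weights are zero.
        return [0] * len(min_rates_bps)
    return [capacity_bps * r // total if b else 0
            for r, b in zip(min_rates_bps, backlogged)]
```

For example, with minimum rates of 100 Mbps and 200 Mbps on a 900 Mbps link, both queues backlogged get 300 Mbps and 600 Mbps respectively, while a single backlogged queue can use the whole link.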
| |
| Best-Effort |
| """"""""""" |
| This is the default traffic class, used by packets not classified with any of |
| the above classes. All slices share the same Best-Effort queue, which has the |
| lowest priority. |
| |
| Classification |
| ^^^^^^^^^^^^^^^ |
| |
| Slice ID and TC classification can be performed in two ways. |
| |
| Regular traffic |
| """"""""""""""" |
| We provide ACL-like APIs that support specifying wildcard match rules on the |
| IPv4 5-tuple. |
| |
| P4-UPF traffic |
| """""""""""""" |
| When using the embedded UPF function, for GTP-U mobile traffic terminated by the |
| fabric, we support integration with PFCP QoS features such as prioritization via |
| QoS Flow Identifier (QFI), Maximum Bitrate (MBR) limits, and Guaranteed Bitrate |
| (GBR). |
| |
| You can configure a static one-to-one mapping between 3GPP’s QFIs and |
| SD-Fabric’s TCs using the ONOS netcfg JSON file (work-in-progress), while MBR |
| and GBR configurations are translated into meter configurations. |
| |
| QoS classification uses the same table as GTP-U tunnel termination. For this |
| reason, to achieve fabric-wide QoS enforcement, we recommend enabling the UPF |
| function on each leaf switch using the distributed UPF mode, so that packets |
| are classified as soon as they enter the network. |
| |
| Support for slicing of mobile traffic is work-in-progress and will be added in |
| the next SD-Fabric release. |
| |
| Configuration |
| ------------- |
| .. note:: QoS and slicing are currently configured statically at switch startup. |
| Dynamic configuration will be supported in a future SD-Fabric release. |
| |
| QoS and slicing use the switch queue configuration provided via the |
| ``vendor_config`` portion of the Stratum Chassis Config (see |
| :ref:`stratum_chassis_config`), where the queues and schedulers can be |
| configured. For more information on the format of ``vendor_config``, see the |
| `guide for running Stratum on Tofino-based switches |
| <https://github.com/stratum/stratum/blob/main/stratum/hal/bin/barefoot/README.run.md>`_ |
| in the Stratum repository. |
| |
| We provide a convenient `script <https://github.com/stratum/fabric-tna/blob/main/util/gen-stratum-qos-config.py>`_ |
| to generate the configuration starting from a higher-level description provided via a YAML file. |
| This file lets you configure the parameters for the traffic classes described above. |
| |
| Here's a list of parameters that you can configure via the YAML QoS configuration file: |
| |
| * ``max_cells``: Maximum number of buffer cells; the value depends on the ASIC SKU/revision. |
| |
| * ``pool_allocations``: Percentage of buffer cells allocated to each traffic class. |
| The sum should be 100. Usually, we leave a portion of the buffer ``unassigned`` |
| for queues that do not have a pool (yet). |
| Examples of such queues are those for the recirculation port, CPU port, etc. |
| |
| .. code-block:: yaml |
| |
| pool_allocations: |
| control: 1 |
| realtime: 9 |
| elastic: 80 |
| besteffort: 9 |
| unassigned: 1 |
| |
| * **Control** Traffic Class: The available bandwidth dedicated to Control traffic is divided into *slots*. |
| Each slot has a maximum rate and burst (in packets of the given MTU). |
| A slice can use one or more slots by appropriately configuring meters in the fabric ingress pipeline. |
| |
| * ``control_slot_count``: Number of slots. |
| * ``control_slot_rate_pps``: Rate of each slot, in packets per second. |
| * ``control_slot_burst_pkts``: Number of packets per burst of each slot. |
| * ``control_mtu_bytes``: MTU of packets for the PPS and burst values. |
| |
| .. code-block:: yaml |
| |
| control_slot_count: 50 |
| control_slot_rate_pps: 100 |
| control_slot_burst_pkts: 10 |
| control_mtu_bytes: 1500 |
| |
| * **Real-Time** Traffic Class Configuration: |
| |
| * ``realtime_max_rates_bps``: List of maximum shaping rates for Real-Time queues, |
| one per slice requesting such service. |
| |
| * ``realtime_max_burst_s``: Maximum amount of time that a Real-Time queue can |
| burst at the port speed. This parameter is used to limit delay for Elastic |
| queues. |
| |
| .. code-block:: yaml |
| |
| realtime_max_rates_bps: |
| - 45000000 # 45 Mbps |
| - 30000000 # 30 Mbps |
| - 25000000 # 25 Mbps |
| realtime_max_burst_s: 0.005 # 5 ms |
| |
| * **Elastic** Traffic Class Configuration: |
| |
| * ``elastic_min_rates_bps``: List of minimum guaranteed rates for Elastic queues, |
| one per slice requesting such service. |
| |
| .. code-block:: yaml |
| |
| elastic_min_rates_bps: |
| - 100000000 # 100 Mbps |
| - 200000000 # 200 Mbps |
| |
| * ``port_templates`` section: List of switch ports for which we want to configure |
| queues. |
| |
| Every ``port_templates`` element contains: |
| |
| * ``descr``: Description of the port purpose. |
| |
| * ``rate_bps``: Port speed in bits per second. |
| |
| * ``is_shaping_enabled``: ``true`` if the rate is enforced using shaping, |
| ``false`` if the rate is the channel speed. |
| |
| * ``shaping_burst_bytes``: Burst size in bytes, meaningful only if port speed |
| is shaped (when ``is_shaping_enabled: true``). |
| |
| * ``queue_count``: Number of queues assigned to the port. |
| |
| * ``port_ids``: List of Stratum port IDs (:ref:`singleton_port` from Stratum Chassis Config), |
| using this port template. Used for ports that correspond to switch front-panel ports. |
| |
| Mutually exclusive with ``sdk_port_ids`` field. |
| |
| * ``sdk_port_ids``: List of SDK port numbers (i.e., Tofino ``DP_ID``) using this port template. |
| Used for internal ports (e.g., recirculation ports). |
| |
| Mutually exclusive with ``port_ids`` field. |
| |
| .. code-block:: yaml |
| |
| port_templates: |
| - descr: "Base station" |
| rate_bps: 1000000000 # 1 Gbps |
| is_shaping_enabled: true |
| shaping_burst_bytes: 18000 # 2x jumbo frames |
| queue_count: 16 |
| port_ids: |
| - 100 |
| - descr: "Servers" |
| port_ids: |
| - 200 |
| rate_bps: 40000000000 # 40 Gbps |
| is_shaping_enabled: false |
| queue_count: 16 |
| - descr: "Recirculation" |
| sdk_port_ids: |
| - 68 |
| rate_bps: 100000000000 # 100 Gbps |
| is_shaping_enabled: false |
| queue_count: 16 |
| |
| An example of a complete QoS and Slicing configuration can be found `here <https://github.com/stratum/fabric-tna/blob/main/util/sample-qos-config.yaml>`_. |
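
The parameters above lend themselves to a few sanity checks before generating the Stratum configuration. The sketch below is not the logic of the generator script; the helper names are hypothetical, and it operates on plain Python values that mirror the YAML snippets shown in this section.

```python
from typing import Any, Dict


def check_pool_allocations(pool_allocations: Dict[str, int]) -> None:
    """Raise if the buffer pool percentages do not add up to 100."""
    total = sum(pool_allocations.values())
    if total != 100:
        raise ValueError(f"pool_allocations sum to {total}, expected 100")


def control_slot_rate_bps(slot_rate_pps: int, mtu_bytes: int) -> int:
    """Bit rate of one Control slot, assuming every packet is MTU-sized."""
    return slot_rate_pps * mtu_bytes * 8


def realtime_burst_bytes(port_rate_bps: int, max_burst_s: float) -> int:
    """Bytes a Real-Time queue can send while bursting at port speed."""
    return round(port_rate_bps * max_burst_s / 8)


def check_port_template(template: Dict[str, Any]) -> None:
    """Raise if both mutually exclusive port ID fields are present."""
    if "port_ids" in template and "sdk_port_ids" in template:
        raise ValueError("port_ids and sdk_port_ids are mutually exclusive")
```

With the example values used in this section, one Control slot corresponds to 1.2 Mbps (100 pps at 1500 bytes), all 50 slots together to 60 Mbps, and a 5 ms Real-Time burst on a 1 Gbps port to 625000 bytes.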
| |
| REST API |
| -------- |
| We provide REST APIs with support for adding/removing/querying slices and |
| traffic classes, as well as flow classification. |
| |
| Slice |
| ^^^^^ |
| |
| Add a slice |
| """"""""""" |
| A POST request with Slice ID as path parameter. |
| ``/slicing/slice/{sliceId}`` |
| |
| .. image:: ../images/qos-rest-slice-add.png |
| :width: 700px |
| |
| Remove a slice |
| """"""""""""""" |
| A DELETE request with Slice ID as path parameter. |
| ``/slicing/slice/{sliceId}`` |
| |
| .. image:: ../images/qos-rest-slice-remove.png |
| :width: 700px |
| |
| Get all slices |
| """""""""""""" |
| A GET request. |
| Returns a collection of slice IDs. |
| ``/slicing/slice`` |
| |
| .. image:: ../images/qos-rest-slice-get.png |
| :width: 700px |
| |
| Traffic Class |
| ^^^^^^^^^^^^^ |
| .. tip:: |
| A Traffic Class takes one of the following values: ``BEST_EFFORT``, ``CONTROL``, ``REAL_TIME``, ``ELASTIC``. |
| |
| Add a traffic class to a slice |
| """""""""""""""""""""""""""""" |
| A POST request with Slice ID and Traffic Class as path parameters. |
| ``/slicing/tc/{sliceId}/{tc}`` |
| |
| .. image:: ../images/qos-rest-tc-add.png |
| :width: 700px |
| |
| Remove a traffic class from a slice |
| """"""""""""""""""""""""""""""""""" |
| A DELETE request with Slice ID and Traffic Class as path parameters. |
| ``/slicing/tc/{sliceId}/{tc}`` |
| |
| .. image:: ../images/qos-rest-tc-remove.png |
| :width: 700px |
| |
| Get all traffic classes from a slice |
| """""""""""""""""""""""""""""""""""" |
| A GET request with Slice ID as path parameter. |
| Returns a collection of traffic classes. |
| ``/slicing/tc/{sliceId}`` |
| |
| .. image:: ../images/qos-rest-tc-get.png |
| :width: 700px |
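
As a sketch of how the slice and traffic class endpoints above might be called from a script, the helpers below build the HTTP requests using only the Python standard library. ``BASE_URL`` is a hypothetical placeholder (it should include whatever host, port, and REST prefix your ONOS deployment uses), and the helper names are illustrative. The requests are only constructed, not sent.

```python
import urllib.request
from typing import Optional

# Hypothetical placeholder: replace with your controller address and REST prefix.
BASE_URL = "http://localhost:8181"
TRAFFIC_CLASSES = ("BEST_EFFORT", "CONTROL", "REAL_TIME", "ELASTIC")


def slice_request(method: str, slice_id: Optional[int] = None) -> urllib.request.Request:
    """Build a request for /slicing/slice or /slicing/slice/{sliceId}."""
    url = f"{BASE_URL}/slicing/slice"
    if slice_id is not None:
        url += f"/{slice_id}"
    return urllib.request.Request(url, method=method)


def tc_request(method: str, slice_id: int, tc: Optional[str] = None) -> urllib.request.Request:
    """Build a request for /slicing/tc/{sliceId} or /slicing/tc/{sliceId}/{tc}."""
    url = f"{BASE_URL}/slicing/tc/{slice_id}"
    if tc is not None:
        if tc not in TRAFFIC_CLASSES:
            raise ValueError(f"unknown traffic class: {tc}")
        url += f"/{tc}"
    return urllib.request.Request(url, method=method)
```

A request built this way can be sent with ``urllib.request.urlopen(req)``; any authentication required by your deployment has to be added separately.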
| |
| Classify Flow |
| ^^^^^^^^^^^^^ |
| |
| A flow can be defined as follows: |
| |
| .. code-block:: json |
| |
| { |
| "criteria": [ |
| { |
| "type": "IPV4_SRC", |
| "ip": "10.0.0.1/32" |
| }, |
| { |
| "type": "IPV4_DST", |
| "ip": "10.0.0.2/32" |
| }, |
| { |
| "type": "IP_PROTO", |
| "protocol": 6 |
| }, |
| { |
| "type": "TCP_SRC", |
| "tcpPort": 1000 |
| }, |
| { |
| "type": "TCP_DST", |
| "tcpPort": 80 |
| }, |
| { |
| "type": "UDP_SRC", |
| "udpPort": 1000 |
| }, |
| { |
| "type": "UDP_DST", |
| "udpPort": 1812 |
| } |
| ] |
| } |
| |
| - ``IPV4_SRC``: Source IPv4 prefix |
| |
| - ``IPV4_DST``: Destination IPv4 prefix |
| |
| - ``IP_PROTO``: IP protocol; accepted values are 6 (TCP) and 17 (UDP) |
| |
| - ``TCP_SRC``: Source L4 (TCP) port |
| |
| - ``TCP_DST``: Destination L4 (TCP) port |
| |
| - ``UDP_SRC``: Source L4 (UDP) port |
| |
| - ``UDP_DST``: Destination L4 (UDP) port |
| |
| .. note:: |
| SD-Fabric currently supports classification on the IPv4 5-tuple only. |
| |
| Classify a flow to a slice and traffic class |
| """""""""""""""""""""""""""""""""""""""""""" |
| A POST request with Slice ID and Traffic Class as path parameters, |
| and the flow JSON as the request body. |
| ``/slicing/flow/{sliceId}/{tc}`` |
| |
| .. image:: ../images/qos-rest-classifier-add.png |
| :width: 700px |
| |
| Remove a flow from a slice and traffic class |
| """""""""""""""""""""""""""""""""""""""""""" |
| A DELETE request with Slice ID and Traffic Class as path parameters, |
| and the flow JSON as the request body. |
| ``/slicing/flow/{sliceId}/{tc}`` |
| |
| .. image:: ../images/qos-rest-classifier-remove.png |
| :width: 700px |
| |
| Get all classified flows from a slice and traffic class |
| """"""""""""""""""""""""""""""""""""""""""""""""""""""" |
| A GET request with Slice ID and Traffic Class as path parameters. |
| Returns a collection of flows. |
| ``/slicing/flow/{sliceId}/{tc}`` |
| |
| .. image:: ../images/qos-rest-classifier-get.png |
| :width: 700px |
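
To tie the flow definition and the classification endpoint together, the sketch below builds a flow JSON body from a 5-tuple and wraps it in a POST request. ``BASE_URL`` and the helper names are hypothetical, and the request is constructed but not sent.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

# Hypothetical placeholder: replace with your controller address and REST prefix.
BASE_URL = "http://localhost:8181"
L4_NAMES = {6: "TCP", 17: "UDP"}  # IP protocol numbers accepted by IP_PROTO


def make_flow(src: str, dst: str, proto: int,
              src_port: Optional[int] = None,
              dst_port: Optional[int] = None) -> Dict[str, Any]:
    """Build the flow JSON for a 5-tuple; unset ports are left as wildcards."""
    if proto not in L4_NAMES:
        raise ValueError("proto must be 6 (TCP) or 17 (UDP)")
    l4 = L4_NAMES[proto]
    port_key = l4.lower() + "Port"  # "tcpPort" or "udpPort"
    criteria = [
        {"type": "IPV4_SRC", "ip": src},
        {"type": "IPV4_DST", "ip": dst},
        {"type": "IP_PROTO", "protocol": proto},
    ]
    if src_port is not None:
        criteria.append({"type": f"{l4}_SRC", port_key: src_port})
    if dst_port is not None:
        criteria.append({"type": f"{l4}_DST", port_key: dst_port})
    return {"criteria": criteria}


def classify_flow_request(slice_id: int, tc: str,
                          flow: Dict[str, Any]) -> urllib.request.Request:
    """Build the POST request for /slicing/flow/{sliceId}/{tc}."""
    url = f"{BASE_URL}/slicing/flow/{slice_id}/{tc}"
    body = json.dumps(flow).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"})
```

For instance, ``classify_flow_request(1, "ELASTIC", make_flow("10.0.0.1/32", "10.0.0.2/32", 6, 1000, 80))`` yields a request whose body contains only the TCP port criteria, matching the chosen ``IP_PROTO`` value.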