QoS and Slicing
===============
.. _qos_configuration:
Overview
--------
Network slicing enables sharing the same physical infrastructure between
independent logical networks, each one targeting different use cases while
providing isolation and security guarantees. Slicing permits the implementation
of tailor-made applications with Quality of Service (QoS) specific to the needs
of each slice, rather than a one-size-fits-all approach.
SD-Fabric supports slicing and QoS using dedicated hardware resources such as
scheduling queues and meters. Once a packet enters the fabric, it is associated
with a slice ID and Traffic Class (TC). Slice ID is an arbitrary identifier,
while TC is used to determine the QoS parameters. The combination of slice ID
and TC is used by SD-Fabric to determine which switch hardware queue to use.
We provide fabric-wide isolation and QoS guarantees. Packets are classified by
the first leaf switch in the path. We then use a custom DSCP-based marking
scheme to apply the same treatment on all switches.
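
To illustrate how a slice ID and a TC could be carried together in the DSCP
field, the sketch below packs both into the 6 DSCP bits. The bit split (4 bits
for the slice ID, 2 bits for the TC) and the TC numbering are assumptions made
for this example only; they are not a statement of the encoding SD-Fabric uses
on the wire.

```python
# Assumed layout of the 6-bit DSCP field: [ slice ID (4 bits) | TC (2 bits) ].
SLICE_ID_BITS = 4
TC_BITS = 2

# Hypothetical TC numbering, chosen only for this sketch.
TC_VALUES = {"BEST_EFFORT": 0, "CONTROL": 1, "REAL_TIME": 2, "ELASTIC": 3}


def encode_dscp(slice_id: int, tc: str) -> int:
    """Pack a slice ID and a traffic class into a 6-bit DSCP value."""
    if not 0 <= slice_id < (1 << SLICE_ID_BITS):
        raise ValueError(f"slice_id {slice_id} does not fit in {SLICE_ID_BITS} bits")
    return (slice_id << TC_BITS) | TC_VALUES[tc]


def decode_dscp(dscp: int) -> tuple:
    """Recover (slice_id, tc) from a DSCP value produced by encode_dscp."""
    if not 0 <= dscp < (1 << (SLICE_ID_BITS + TC_BITS)):
        raise ValueError(f"dscp {dscp} is not a 6-bit value")
    tc_names = {value: name for name, value in TC_VALUES.items()}
    return dscp >> TC_BITS, tc_names[dscp & ((1 << TC_BITS) - 1)]
```

Because every switch can decode the same value, only the first leaf needs the
full classification rules; downstream switches can rely on the marking alone.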
Classification is available both for regular traffic, via REST APIs, and for
GTP-U traffic terminated by P4-UPF, via PFCP integration.
Traffic Classes
^^^^^^^^^^^^^^^
We support the following traffic classes, which cover the spectrum of
applications from latency-sensitive to throughput-intensive.
Control
"""""""
For applications demanding ultra-low latency and jitter guarantees, with
non-bursty, low-throughput requirements on the order of hundreds of packets per
second. Examples of such applications are consensus protocols, industrial
automation, timing, etc. This class uses a queue shared by all slices, serviced
with the highest priority. To enforce isolation between slices, and to avoid
starvation of lower priority classes, each slice is processed through a
single-rate two-color meter. Slices sending at rates higher than the configured
meter rate might observe packet drops.
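
The behavior of a single-rate two-color meter can be sketched as a token
bucket. The class below is an illustrative software model, not the switch
implementation: it counts in packets (mirroring the packets-per-second slot
rates used for the Control class), and the class and method names are
hypothetical.

```python
class SingleRateTwoColorMeter:
    """Token-bucket meter: packets are GREEN while tokens last, RED otherwise."""

    def __init__(self, rate_pps: float, burst_pkts: int):
        self.rate_pps = rate_pps
        self.burst_pkts = burst_pkts
        self.tokens = float(burst_pkts)  # the bucket starts full
        self.last_time = 0.0

    def color(self, now: float) -> str:
        """Return the color of one packet arriving at time `now` (in seconds)."""
        elapsed = now - self.last_time
        self.last_time = now
        # Refill at the configured rate, never beyond the burst size.
        self.tokens = min(self.burst_pkts, self.tokens + elapsed * self.rate_pps)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return "GREEN"
        return "RED"
```

With a rate of 100 pps and a burst of 10 packets, a sender that emits 12
back-to-back packets sees the last 2 marked RED, which is where drops may
occur.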
Real-Time
"""""""""
For applications that require both low-latency and sustained throughput.
Examples of such applications are video and audio streaming. Each slice gets a
dedicated Real-Time queue serviced in a Round-Robin fashion to guarantee the
lowest latency at all times even with bursty senders. To avoid starvation of
lower priority classes, Real-Time queues are shaped at a maximum rate. Slices
sending at rates higher than the configured one might observe higher latency
because of the shaping. Real-Time queues have priority lower than Control, but
higher than Elastic.
Elastic
"""""""
For throughput-intensive applications with no latency requirements. This class
is best suited for large file transfers, Intranet/enterprise applications,
prioritized Internet access, etc. Each slice gets a dedicated Elastic queue
serviced in Weighted Round-Robin (WRR) fashion with configurable weights. During
congestion, Elastic queues are guaranteed to receive a minimum bandwidth that can
grow up to the link capacity if other queues are empty.
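
The effect of WRR weights can be approximated with an idealized fluid model in
which the link capacity is split among the queues that currently have traffic,
in proportion to their weights. This is a simplified sketch: it ignores packet
sizes and the higher-priority classes, and the queue names are hypothetical.

```python
def wrr_shares(weights: dict, backlogged: set, link_bps: float) -> dict:
    """Split link capacity among backlogged queues in proportion to their weights."""
    active = {queue: w for queue, w in weights.items() if queue in backlogged}
    total = sum(active.values())
    if total == 0:
        return {}
    return {queue: link_bps * w / total for queue, w in active.items()}
```

When every queue is backlogged, each one receives its weighted minimum; when
the others are empty, a single queue can grow up to the full link capacity.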
Best-Effort
"""""""""""
This is the default traffic class, used by packets not classified with any of
the above classes. All slices share the same best-effort queue with lowest
priority.
Classification
^^^^^^^^^^^^^^^
Slice ID and TC classification can be performed in two ways.
Regular traffic
"""""""""""""""
We provide ACL-like APIs that support wildcard match rules on the
IPv4 5-tuple.
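
As a rough model of what a wildcard match on the 5-tuple means, the sketch
below treats every unset field as "match anything". The ``FlowRule`` class and
its field names are hypothetical and exist only for illustration; they are not
part of the SD-Fabric API.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowRule:
    """Hypothetical rule; a field left as None acts as a wildcard."""
    ipv4_src: Optional[str] = None  # prefix, e.g. "10.0.0.0/24"
    ipv4_dst: Optional[str] = None
    ip_proto: Optional[int] = None  # 6 = TCP, 17 = UDP
    l4_sport: Optional[int] = None
    l4_dport: Optional[int] = None

    def matches(self, src: str, dst: str, proto: int, sport: int, dport: int) -> bool:
        def in_prefix(addr, prefix):
            return prefix is None or ipaddress.ip_address(addr) in ipaddress.ip_network(prefix)

        def equal(value, expected):
            return expected is None or value == expected

        return (in_prefix(src, self.ipv4_src) and in_prefix(dst, self.ipv4_dst)
                and equal(proto, self.ip_proto)
                and equal(sport, self.l4_sport) and equal(dport, self.l4_dport))
```

For example, ``FlowRule(ipv4_dst="10.0.0.0/24", ip_proto=6, l4_dport=80)``
matches TCP traffic to port 80 of any host in that prefix, regardless of the
source address and source port.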
P4-UPF traffic
""""""""""""""
When using the embedded UPF function, for GTP-U mobile traffic terminated by the
fabric, we support integration with PFCP QoS features such as prioritization via
QoS Flow Identifier (QFI), Maximum Bitrate (MBR) limits, and Guaranteed Bitrate
(GBR).
You can configure a static one-to-one mapping between 3GPP’s QFIs and
SD-Fabric’s TCs using the ONOS netcfg JSON file (work-in-progress), while MBR
and GBR settings are translated into meter configurations.
QoS classification uses the same table as GTP-U tunnel termination. For this
reason, to achieve fabric-wide QoS enforcement, we recommend enabling the UPF
function on each leaf switch using the distributed UPF mode, so that packets
are classified as soon as they enter the network.
Support for slicing of mobile traffic is work-in-progress and will be added in
the next SD-Fabric release.
Configuration
-------------
.. note:: QoS and slicing configuration is currently applied statically at switch startup.
   Dynamic configuration will be supported in a future SD-Fabric release.

QoS and slicing use the switch queue configuration provided via the
``vendor_config`` portion of the Stratum Chassis Config (see
:ref:`stratum_chassis_config`), where the queues and schedulers can be
configured. For more information on the format of ``vendor_config``, see the
`guide for running Stratum on Tofino-based switches
<https://github.com/stratum/stratum/blob/main/stratum/hal/bin/barefoot/README.run.md>`_
in the Stratum repository.
We provide a convenient `script <https://github.com/stratum/fabric-tna/blob/main/util/gen-stratum-qos-config.py>`_
to generate the configuration starting from a higher-level description provided via a YAML file.
This file lets you configure the parameters for the traffic classes listed in the above section.
Here's a list of parameters that you can configure via the YAML QoS configuration file:
* ``max_cells``: Maximum number of buffer cells, depends on the ASIC SKU/revision.
* ``pool_allocations``: Percentage of buffer cells allocated to each traffic class.
The sum should be 100. Usually, we leave a portion of the buffer ``unassigned``
for queues that do not have a pool (yet).
Examples of such queues are those for the recirculation port, CPU port, etc.

.. code-block:: yaml

   pool_allocations:
     control: 1
     realtime: 9
     elastic: 80
     besteffort: 9
     unassigned: 1

* **Control** Traffic Class: The available bandwidth dedicated to Control traffic is divided in *slots*.
Each slot has a maximum rate and burst (in packets of the given MTU).
A slice can use one or more slots by appropriately configuring meters in the fabric ingress pipeline.
* ``control_slot_count``: Number of slots.
* ``control_slot_rate_pps``: Rate of each slot, in packets per second.
* ``control_slot_burst_pkts``: Number of packets per burst of each slot.
* ``control_mtu_bytes``: MTU of packets for the PPS and burst values.

.. code-block:: yaml

   control_slot_count: 50
   control_slot_rate_pps: 100
   control_slot_burst_pkts: 10
   control_mtu_bytes: 1500

* **Real-Time** Traffic Class Configuration:
* ``realtime_max_rates_bps``: List of maximum shaping rates for Real-Time queues,
one per slice requesting such service.
* ``realtime_max_burst_s``: Maximum amount of time that a Real-Time queue can
burst at the port speed. This parameter is used to limit delay for Elastic
queues.

.. code-block:: yaml

   realtime_max_rates_bps:
     - 45000000 # 45 Mbps
     - 30000000 # 30 Mbps
     - 25000000 # 25 Mbps
   realtime_max_burst_s: 0.005 # 5 ms

* **Elastic** Traffic Class Configuration:
* ``elastic_min_rates_bps``: List of minimum guaranteed rates for Elastic queues,
one per slice requesting such service.

.. code-block:: yaml

   elastic_min_rates_bps:
     - 100000000 # 100 Mbps
     - 200000000 # 200 Mbps

* ``port_templates`` section: List of switch ports for which we want to configure
  queues.
Every ``port_templates`` element contains:
* ``descr``: Description of the port purpose.
* ``rate_bps``: Port speed in bits per second.
* ``is_shaping_enabled``: ``true`` if the rate is enforced using shaping,
``false`` if the rate is the channel speed.
* ``shaping_burst_bytes``: Burst size in bytes, meaningful only if port speed
is shaped (when ``is_shaping_enabled: true``).
* ``queue_count``: Number of queues assigned to the port.
* ``port_ids``: List of Stratum port IDs (:ref:`singleton_port` from Stratum Chassis Config)
  using this port template. Used for ports that correspond to switch front-panel ports.
Mutually exclusive with ``sdk_port_ids`` field.
* ``sdk_port_ids``: List of SDK port numbers (i.e., Tofino ``DP_ID``) using this port template.
Used for internal ports (e.g., recirculation ports).
Mutually exclusive with ``port_ids`` field.

.. code-block:: yaml

   port_templates:
     - descr: "Base station"
       rate_bps: 1000000000 # 1 Gbps
       is_shaping_enabled: true
       shaping_burst_bytes: 18000 # 2x jumbo frames
       queue_count: 16
       port_ids:
         - 100
     - descr: "Servers"
       port_ids:
         - 200
       rate_bps: 40000000000 # 40 Gbps
       is_shaping_enabled: false
       queue_count: 16
     - descr: "Recirculation"
       sdk_port_ids:
         - 68
       rate_bps: 100000000000 # 100 Gbps
       is_shaping_enabled: false
       queue_count: 16

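
Some of the constraints listed above can be checked before generating the
Stratum configuration. The function below is a minimal sketch, not part of the
provided script: it takes the configuration as an already-parsed dictionary
(for instance, as a YAML parser would return it), and treating a missing
``shaping_burst_bytes`` on a shaped port as an error is an assumption of this
example.

```python
def validate_qos_config(config: dict) -> list:
    """Return a list of human-readable problems found in a parsed QoS config."""
    problems = []

    # The pool percentages, including "unassigned", should add up to 100.
    pool_sum = sum(config.get("pool_allocations", {}).values())
    if pool_sum != 100:
        problems.append(f"pool_allocations sum to {pool_sum}, expected 100")

    for template in config.get("port_templates", []):
        descr = template.get("descr", "<no descr>")
        # port_ids and sdk_port_ids are mutually exclusive.
        if ("port_ids" in template) == ("sdk_port_ids" in template):
            problems.append(f"{descr}: set exactly one of port_ids or sdk_port_ids")
        # Assumption: a shaped port must state its burst size explicitly.
        if template.get("is_shaping_enabled") and "shaping_burst_bytes" not in template:
            problems.append(f"{descr}: shaping enabled but shaping_burst_bytes missing")

    return problems
```

An empty list means that none of these checks failed; it does not guarantee
that the configuration is accepted by the switch.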
An example of a complete QoS and Slicing configuration can be found `here <https://github.com/stratum/fabric-tna/blob/main/util/sample-qos-config.yaml>`_.
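
To get a feel for the example values used in this section, the helpers below
convert them into aggregate figures. This is back-of-the-envelope arithmetic
under the assumption that every Control packet is a full-MTU packet; the
function names are hypothetical and the generation script may compute these
quantities differently.

```python
def control_bandwidth_bps(slot_count: int, slot_rate_pps: int, mtu_bytes: int) -> int:
    """Worst-case Control bandwidth if every slot sends full-MTU packets."""
    return slot_count * slot_rate_pps * mtu_bytes * 8


def realtime_burst_bytes(port_rate_bps: int, max_burst_s: float) -> float:
    """Bytes a Real-Time queue can send while bursting at port speed for max_burst_s."""
    return port_rate_bps * max_burst_s / 8
```

With 50 slots of 100 pps and a 1500-byte MTU, the Control class can use at most
60 Mbps. On a 1 Gbps port, a ``realtime_max_burst_s`` of 5 ms corresponds to a
burst of 625000 bytes.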
REST API
--------
We provide REST APIs with support for adding/removing/querying slices and
traffic classes, as well as flow classification.
Slice
^^^^^
Add a slice
"""""""""""
A POST request with Slice ID as path parameter.
``/slicing/slice/{sliceId}``
.. image:: ../images/qos-rest-slice-add.png
   :width: 700px
Remove a slice
"""""""""""""""
A DELETE request with Slice ID as path parameter.
``/slicing/slice/{sliceId}``
.. image:: ../images/qos-rest-slice-remove.png
   :width: 700px
Get all slices
""""""""""""""
A GET request.
Returns a collection of slice IDs.

``/slicing/slice``

.. image:: ../images/qos-rest-slice-get.png
   :width: 700px

Traffic Class
^^^^^^^^^^^^^
.. tip::
   A Traffic Class takes one of the following values: ``BEST_EFFORT``, ``CONTROL``, ``REAL_TIME``, ``ELASTIC``.

Add a traffic class to a slice
""""""""""""""""""""""""""""""
A POST request with Slice ID and Traffic Class as path parameters.
``/slicing/tc/{sliceId}/{tc}``
.. image:: ../images/qos-rest-tc-add.png
   :width: 700px
Remove a traffic class from a slice
"""""""""""""""""""""""""""""""""""
A DELETE request with Slice ID and Traffic Class as path parameters.
``/slicing/tc/{sliceId}/{tc}``
.. image:: ../images/qos-rest-tc-remove.png
   :width: 700px
Get all traffic classes from a slice
""""""""""""""""""""""""""""""""""""
A GET request with Slice ID as path parameter.
Returns a collection of traffic classes.

``/slicing/tc/{sliceId}``

.. image:: ../images/qos-rest-tc-get.png
   :width: 700px

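
The slice and traffic class endpoints above can be exercised from a script.
The sketch below only builds the requests and does not send them. The base URL
(host, port, and path prefix) is a placeholder that depends on your ONOS
deployment, and authentication is omitted; both are assumptions of this
example.

```python
import urllib.request

# Hypothetical base URL; adjust host, port, and path prefix to your deployment.
BASE_URL = "http://localhost:8181/onos/fabric-tna"

TRAFFIC_CLASSES = {"BEST_EFFORT", "CONTROL", "REAL_TIME", "ELASTIC"}


def slice_request(method: str, slice_id: int) -> urllib.request.Request:
    """Build a POST (add) or DELETE (remove) request for /slicing/slice/{sliceId}."""
    if method not in ("POST", "DELETE"):
        raise ValueError(f"unsupported method {method!r}")
    return urllib.request.Request(f"{BASE_URL}/slicing/slice/{slice_id}", method=method)


def tc_request(method: str, slice_id: int, tc: str) -> urllib.request.Request:
    """Build a POST (add) or DELETE (remove) request for /slicing/tc/{sliceId}/{tc}."""
    if method not in ("POST", "DELETE"):
        raise ValueError(f"unsupported method {method!r}")
    if tc not in TRAFFIC_CLASSES:
        raise ValueError(f"unknown traffic class {tc!r}")
    return urllib.request.Request(f"{BASE_URL}/slicing/tc/{slice_id}/{tc}", method=method)
```

A request built this way can be sent with ``urllib.request.urlopen(request)``
once the base URL and credentials match your environment.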
Classify Flow
^^^^^^^^^^^^^
A flow is defined by a JSON object with a list of match criteria.
The example below shows every supported criterion type:

.. code-block:: json

   {
     "criteria": [
       {
         "type": "IPV4_SRC",
         "ip": "10.0.0.1/32"
       },
       {
         "type": "IPV4_DST",
         "ip": "10.0.0.2/32"
       },
       {
         "type": "IP_PROTO",
         "protocol": 6
       },
       {
         "type": "TCP_SRC",
         "tcpPort": 1000
       },
       {
         "type": "TCP_DST",
         "tcpPort": 80
       },
       {
         "type": "UDP_SRC",
         "udpPort": 1000
       },
       {
         "type": "UDP_DST",
         "udpPort": 1812
       }
     ]
   }

- ``IPV4_SRC``: Source IPv4 prefix
- ``IPV4_DST``: Destination IPv4 prefix
- ``IP_PROTO``: IP protocol; accepted values are 6 (TCP) and 17 (UDP)
- ``TCP_SRC``: Source L4 (TCP) port
- ``TCP_DST``: Destination L4 (TCP) port
- ``UDP_SRC``: Source L4 (UDP) port
- ``UDP_DST``: Destination L4 (UDP) port

.. note::
   SD-Fabric currently supports 5-tuple only.

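
The flow JSON can also be generated programmatically. The helper below is a
minimal sketch with a hypothetical name and signature. It assumes that a
criterion left out of the list acts as a wildcard, and that port criteria
should use the type matching the chosen protocol (``TCP_*`` for 6, ``UDP_*``
for 17).

```python
import json


def flow_criteria(src_prefix=None, dst_prefix=None, proto=None,
                  src_port=None, dst_port=None) -> dict:
    """Build the flow JSON object; arguments left as None are omitted."""
    l4_names = {6: "TCP", 17: "UDP"}
    criteria = []
    if src_prefix is not None:
        criteria.append({"type": "IPV4_SRC", "ip": src_prefix})
    if dst_prefix is not None:
        criteria.append({"type": "IPV4_DST", "ip": dst_prefix})
    if proto is not None:
        if proto not in l4_names:
            raise ValueError("IP_PROTO must be 6 (TCP) or 17 (UDP)")
        criteria.append({"type": "IP_PROTO", "protocol": proto})
    for port, direction in ((src_port, "SRC"), (dst_port, "DST")):
        if port is not None:
            if proto not in l4_names:
                raise ValueError("an L4 port requires proto 6 (TCP) or 17 (UDP)")
            name = l4_names[proto]
            criteria.append({"type": f"{name}_{direction}", f"{name.lower()}Port": port})
    return {"criteria": criteria}


def to_json(flow: dict) -> str:
    """Serialize a flow object for use as a request body."""
    return json.dumps(flow, indent=2)
```

For instance, ``flow_criteria(dst_prefix="10.0.0.2/32", proto=17, dst_port=1812)``
describes UDP traffic to port 1812 of a single host.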
Classify a flow to a slice and traffic class
""""""""""""""""""""""""""""""""""""""""""""
A POST request with Slice ID and Traffic Class as path parameters,
and the JSON description of a flow as the request body.

``/slicing/flow/{sliceId}/{tc}``

.. image:: ../images/qos-rest-classifier-add.png
   :width: 700px

Remove a flow from a slice and traffic class
""""""""""""""""""""""""""""""""""""""""""""
A DELETE request with Slice ID and Traffic Class as path parameters,
and the JSON description of a flow as the request body.

``/slicing/flow/{sliceId}/{tc}``

.. image:: ../images/qos-rest-classifier-remove.png
   :width: 700px

Get all classified flows from a slice and traffic class
"""""""""""""""""""""""""""""""""""""""""""""""""""""""
A GET request with Slice ID and Traffic Class as path parameters.
Returns a collection of flows.

``/slicing/flow/{sliceId}/{tc}``

.. image:: ../images/qos-rest-classifier-get.png
   :width: 700px
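
Classifying or removing a flow differs from the other calls in that the
request carries a JSON body. The sketch below builds such a request without
sending it. As before, the base URL is a placeholder that depends on your ONOS
deployment, authentication is omitted, and the function name is hypothetical.

```python
import json
import urllib.request

# Hypothetical base URL; adjust host, port, and path prefix to your deployment.
BASE_URL = "http://localhost:8181/onos/fabric-tna"


def flow_request(method: str, slice_id: int, tc: str, flow: dict) -> urllib.request.Request:
    """Build a POST (classify) or DELETE (remove) request with a flow JSON body."""
    if method not in ("POST", "DELETE"):
        raise ValueError(f"unsupported method {method!r}")
    return urllib.request.Request(
        f"{BASE_URL}/slicing/flow/{slice_id}/{tc}",
        data=json.dumps(flow).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
```

The ``flow`` argument is a dictionary with a ``criteria`` list in the format
described in the Classify Flow section.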