.. SPDX-FileCopyrightText: 2021 Open Networking Foundation <info@opennetworking.org>
.. SPDX-License-Identifier: Apache-2.0
.. _slicing_qos:

Slicing and QoS
===============

Overview
--------
Network slicing enables sharing the same physical infrastructure between
independent logical networks, each one targeting different use cases while
providing isolation guarantees. Slicing permits the implementation of
tailor-made applications with Quality of Service (QoS) specific to the needs of
each slice, rather than a one-size-fits-all approach.
SD-Fabric supports slicing and QoS using dedicated hardware resources such as
scheduling queues and meters. Once a packet enters the fabric, it is associated
with a slice ID and Traffic Class (TC). The slice ID is an arbitrary identifier,
while TC is used to determine the QoS parameters. The combination of slice ID
and TC is used by SD-Fabric to determine which switch hardware queue to use.
We provide fabric-wide isolation and QoS guarantees. Packets are classified by
the first leaf switch in the path; we then use a custom DSCP-based marking
scheme to apply the same treatment on all switches.
Classification can be performed both on regular traffic, via REST APIs, and on
GTP-U traffic terminated by the P4-UPF, via PFCP integration.
Traffic Classes
^^^^^^^^^^^^^^^
We support the following traffic classes to cover the spectrum of potential
applications, from latency-sensitive to throughput-intensive.
Control
"""""""
For applications demanding ultra-low latency and jitter guarantees, with
non-bursty, low-throughput requirements on the order of hundreds of packets per
second. Examples of such applications are consensus protocols, industrial
automation, timing, etc. This class uses a queue shared by all slices, serviced
with the highest priority. To enforce isolation between slices, and to avoid
starvation of lower priority classes, each slice is processed through a
single-rate two-color meter. Slices sending at rates higher than the configured
meter rate might observe packet drops.
Real-Time
"""""""""
For applications that require both low latency and sustained throughput.
Examples of such applications are video and audio streaming. Each slice gets a
dedicated Real-Time queue serviced in a Round-Robin fashion to guarantee the
lowest latency at all times even with bursty senders. To avoid starvation of
lower priority classes, Real-Time queues are shaped at a maximum rate. Slices
sending at rates higher than the configured maximum rate might observe higher
latency because of the queue shaping enforced by the scheduler. Real-Time queues
have priority lower than Control, but higher than Elastic.
Elastic
"""""""
For throughput-intensive applications with no latency requirements. This class
is best suited for large file transfers, Intranet/enterprise applications,
prioritized Internet access, etc. Each slice gets a dedicated Elastic queue
serviced in Weighted Round-Robin (WRR) fashion with configurable weights. During
congestion, Elastic queues are guaranteed to receive a minimum bandwidth, which
can grow up to the link capacity if other queues are empty.
Best-Effort
"""""""""""
This is the default traffic class, used by packets not classified with any of
the above classes. All slices share the same Best-Effort queue, which has the
lowest priority.
Classification
^^^^^^^^^^^^^^^
Slice ID and TC classification can be performed in two ways.
Regular traffic
"""""""""""""""
We provide ACL-like APIs that support specifying wildcard match rules on the
IPv4 5-tuple.
P4-UPF traffic
""""""""""""""
For GTP-U traffic terminated by the embedded P4-UPF function, selection of a
slice ID and TC is based on PFCP-Agent's configuration (upf.json or Helm
values). QoS classification uses the same table as GTP-U tunnel termination;
for this reason, to achieve fabric-wide QoS enforcement, we recommend enabling
the UPF function on each leaf switch using the distributed UPF mode, so that
packets are classified as soon as they enter the fabric.
The slice ID is specified using the ``p4rtciface.slice_id`` property in
PFCP-Agent's ``upf.json``. All packets terminated by the P4-UPF function will be
associated with the given slice ID.
The TC value is instead derived from the 3GPP's QoS Flow Identifier (QFI) and
requires coordination with the mobile core control plane (e.g., SD-Core). When
deploying PFCP-Agent, you can configure a static many-to-one mapping between
3GPP's QFIs and SD-Fabric's TCs using the ``p4rtciface.qfi_tc_mapping`` property
in ``upf.json``. That is, multiple QFIs can be mapped to the same TC. Then, it's
up to the mobile core control plane to insert PFCP rules classifying traffic
using the specific QFIs.
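
As an illustration, the relevant portion of ``upf.json`` might look like the
following sketch. The ``p4rtciface.slice_id`` and ``p4rtciface.qfi_tc_mapping``
property names come from the description above, but the slice ID, QFI, and TC
values shown are examples only, and other PFCP-Agent settings are omitted:

.. code-block:: json

   {
     "p4rtciface": {
       "slice_id": 1,
       "qfi_tc_mapping": {
         "1": 2,
         "5": 2,
         "9": 3
       }
     }
   }

In this sketch, QFIs 1 and 5 would share one TC while QFI 9 maps to another;
the numeric TC identifiers must match those defined in the fabric's QoS
configuration.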
.. _qos_configuration:

Configuration
-------------
.. note:: Currently we only support static configuration at switch startup. To
   add new slices or modify TC parameters, you will need to reboot the switch.
   Dynamic configuration will be supported in future SD-Fabric releases.
Stratum allows configuring switch queues and schedulers using the
``vendor_config`` portion of the Chassis Config file (see
:ref:`stratum_chassis_config`). For more information on the format of
``vendor_config``, see the `guide for running Stratum on Tofino-based switches
<https://github.com/stratum/stratum/blob/main/stratum/hal/bin/barefoot/README.run.md>`_
in the Stratum repository.
The ONOS apps are responsible for inserting the switch rules that map packets
into different queues. For this reason, apps need to be aware of how queues are
mapped to the different slices and TCs.
We provide a convenient `script
<https://github.com/stratum/fabric-tna/blob/main/util/gen-qos-config.py>`_ to
generate both the Stratum and ONOS configuration starting from a high-level
description provided via a YAML file. This file allows you to define slices and
configure TC parameters.
An example of such a YAML file can be found `here <https://github.com/stratum/fabric-tna/blob/main/util/sample-qos-config.yaml>`_.
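
The sketch below illustrates the general idea; the field names used here are
placeholders, so refer to the sample file above for the authoritative schema:

.. code-block:: yaml

   # Illustrative sketch only: field names are placeholders; see
   # sample-qos-config.yaml for the authoritative schema.
   slices:
     0:
       name: Default
     1:
       name: Slice1
       # Hypothetical per-TC parameters for this slice.
       real_time:
         max_rate_bps: 50000000    # shaping rate for the Real-Time queue
       elastic:
         min_rate_bps: 10000000    # guaranteed minimum rate under congestion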
To generate the Stratum config:
.. code-block:: console

   $ ./gen-qos-config.py -t stratum sample-qos-config.yaml
The script will output a ``vendor_config`` section which is meant to be appended
to an existing Chassis Config file.
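
For example, assuming the script prints the generated section to standard
output, you could append it directly to a copy of your Chassis Config (the
file name below is illustrative):

.. code-block:: console

   $ ./gen-qos-config.py -t stratum sample-qos-config.yaml >> chassis_config.pb.txt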
To generate the ONOS config:
.. code-block:: console

   $ ./gen-qos-config.py -t onos sample-qos-config.yaml
The script will output a JSON snippet representing a complete ONOS netcfg file
with just the ``slicing`` portion of the ``fabric-tna`` app config. You will
have to manually integrate this into the existing ONOS netcfg used for
deployment.
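
For reference, the generated snippet has roughly the shape sketched below
(queue IDs, slice names, and the inner structure are illustrative; always use
the script's actual output):

.. code-block:: json

   {
     "apps": {
       "org.stratumproject.fabric-tna": {
         "slicing": {
           "slices": {
             "0": {
               "name": "Default",
               "tcs": {
                 "REAL_TIME": { "queueId": 1 }
               }
             }
           }
         }
       }
     }
   }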
REST API
--------
Adding and removing slices in ONOS can be performed only via netcfg. We provide
REST APIs to:
- Get information on slices and TCs currently in the system
- Add/remove classifier rules
For the up-to-date documentation and example API calls, please refer to the
auto-generated documentation on a live ONOS instance at the URL
``http://<ONOS-host>:<ONOS-port>/onos/v1/docs``.
Make sure to select the Fabric-TNA REST API view:
.. image:: ../images/fabric-tna-rest-api-select.png
:width: 700px
Classifier Flows
^^^^^^^^^^^^^^^^
We provide REST APIs to add/remove classifier flows. A classifier flow is used
to instruct switches on how to associate packets with slices and TCs. It is
based on an abstraction similar to an ACL table, describing rules that match on
the IPv4 5-tuple.
Here's an example classifier flow in JSON format to be used in REST API calls.
For the actual API methods, please refer to the live ONOS documentation.
.. code-block:: json

{
"criteria": [
{
"type": "IPV4_SRC",
"ip": "10.0.0.1/32"
},
{
"type": "IPV4_DST",
"ip": "10.0.0.2/32"
},
{
"type": "IP_PROTO",
"protocol": 6
},
{
"type": "TCP_SRC",
"tcpPort": 1000
},
{
"type": "TCP_DST",
"tcpPort": 80
},
{
"type": "UDP_SRC",
"udpPort": 1000
},
{
"type": "UDP_DST",
"udpPort": 1812
}
]
}
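
For instance, such a JSON body could be submitted with ``curl`` along the
following lines. This is a hypothetical invocation: the endpoint placeholder
and the default ONOS credentials shown must be adapted to your deployment.

.. code-block:: console

   # Hypothetical example: replace <fabric-tna-endpoint> with the actual
   # classifier-flow path from the live ONOS REST API documentation.
   $ curl -u onos:rocks -X POST -H 'Content-Type: application/json' \
       -d @classifier-flow.json \
       http://<ONOS-host>:<ONOS-port>/onos/<fabric-tna-endpoint>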