.. SPDX-FileCopyrightText: 2021 Open Networking Foundation <info@opennetworking.org>
.. SPDX-License-Identifier: Apache-2.0

.. _p4_upf:

P4-based User Plane Function (P4-UPF)
=====================================

Overview
--------

SD-Fabric supports running a 4G/5G mobile core User Plane Function (UPF) as part
of the switches' packet processing pipeline. Like the rest of the pipeline, this
is realized using P4, and for this reason we call it P4-UPF.

P4-UPF is integrated with the ONF's SD-Core project. By default, SD-Core ships
with BESS-UPF, a containerized UPF implementation based on the Berkeley
Software Switch (BESS).

SD-Fabric can be used with BESS-UPF or any other UPF implementation that runs on
servers. In this case, the fabric switches route GTP-U packets between radio
base stations and servers. When P4-UPF is enabled, the same fabric switches
perform GTP-U tunnel termination.

.. image:: ../images/bess-p4-upf.png
   :width: 700px

**Supported Features**

SD-Fabric's P4-UPF implements a core set of features capable of supporting
requirements for a broad range of enterprise use cases:

* GTP-U tunnel encap/decap: including support for 5G extensions such as PDU
  Session Container carrying QoS Flow Information.
* Accounting: we use switch counters to collect per-flow stats and support usage
  reporting and volume-based triggers.
* Downlink buffering: when a user device radio goes idle (power-save mode) or
  during a handover, switches are updated to forward all downlink traffic for
  the specific device (UE) to DBUF, a K8s-managed buffering service running on
  servers. Then, when the device radio becomes ready to receive traffic,
  packets are drained from the software buffers back to the switch to be
  delivered to base stations.
* QoS: support for enforcement of maximum bitrate (MBR) at the application,
  session, and slice level; and prioritization using switch queues and
  scheduling policy.
* Slicing: multiple logical UPFs can be instantiated on the same switch, each
  one with its own QoS model and isolation guarantees enforced at the hardware
  level using separate queues.
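
To make the GTP-U encap/decap feature listed above more concrete, here is a
minimal software sketch of the basic 8-byte GTP-U header, assuming a plain
G-PDU with no optional fields. It is only an illustration: in P4-UPF this
logic runs in the switch pipeline, not in Python, and the function names
``gtpu_encap`` and ``gtpu_decap`` are hypothetical. The PDU Session Container
extension is deliberately left out.

.. code-block:: python

   import struct
   from typing import Tuple

   GTPU_FLAGS = 0x30     # version 1, protocol type GTP, no optional fields
   GTPU_MSG_GPDU = 0xFF  # G-PDU: the payload is a user packet
   GTPU_HDR_LEN = 8      # mandatory header size in bytes


   def gtpu_encap(teid: int, inner_packet: bytes) -> bytes:
       """Prepend a basic GTP-U header carrying the given tunnel ID (TEID)."""
       # The length field counts the bytes that follow the mandatory header.
       header = struct.pack(
           "!BBHI", GTPU_FLAGS, GTPU_MSG_GPDU, len(inner_packet), teid
       )
       return header + inner_packet


   def gtpu_decap(packet: bytes) -> Tuple[int, bytes]:
       """Strip the GTP-U header and return (TEID, inner packet)."""
       flags, msg_type, length, teid = struct.unpack(
           "!BBHI", packet[:GTPU_HDR_LEN]
       )
       if flags != GTPU_FLAGS or msg_type != GTPU_MSG_GPDU:
           raise ValueError("this sketch only handles plain G-PDU packets")
       return teid, packet[GTPU_HDR_LEN:GTPU_HDR_LEN + length]

For example, ``gtpu_decap(gtpu_encap(0x1234, b"hello"))`` returns the original
TEID and payload.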

**Distributed UPF**

.. image:: ../images/upf-distributed.png
   :width: 700px

In SD-Fabric we support different topologies to meet the requirements of
different deployment sizes: from a single rack with just one leaf switch, or
paired leaves for redundancy, to an N x M leaf-spine fabric for multi-rack
deployments. For this reason, P4-UPF is realized with a "distributed" data plane
implementation where all leaf switches are programmed with the same UPF
rules, such that any leaf can terminate any GTP-U tunnel. This provides several
benefits:

* Simplified deployment: base stations can be connected via any leaf switch.
* Minimum latency: the UPF function is applied as soon as packets enter the
  fabric, without going through additional devices before reaching their final
  destination.
* Fast failover: when using paired leaves, if one switch fails, the other can
  immediately take over as it is already programmed with the same UPF state.
* Fabric-wide slicing & QoS guarantees: packets are classified as soon as they
  hit the first leaf. We then use a custom DSCP-based marking to enforce the
  same QoS rules on all hops. In case of congestion, flows deemed high priority
  are treated as such by all switches.

**Control Architecture and Integration with SD-Core**

SD-Fabric's P4-UPF is integrated with the ONF SD-Core project to provide a
high-performance 3GPP-compliant mobile core solution.

The integration with SD-Core is achieved via an ONOS application called UP4,
which is in charge of populating the UPF tables of the switch pipeline.

.. image:: ../images/up4-arch.png
   :width: 600px

The interface between the mobile core control plane and the UPF is defined by
the 3GPP standard Packet Forwarding Control Protocol (PFCP). This is a complex
protocol that can be difficult to understand, even though at its essence the
rules that it installs are simple match-action rules. Much of the protocol
implementation, such as message parsing, state machines, and other bookkeeping,
can be common to many different UPF realizations. For this reason, SD-Fabric
relies on an implementation of the PFCP protocol realized as an external
microservice named “PFCP Agent”, which is provided by the SD-Core project.

The UP4 App abstracts the whole fabric as one virtual big switch with UPF
capabilities. We call this the One-Big-UPF abstraction. This abstraction allows
the upper layers to be independent of the underlying physical topology.
Communication between the PFCP Agent and the UP4 App is done via P4Runtime. This
is the same API that ONOS uses to communicate with the actual switches. However,
in the former case it is used between two control planes: the mobile core and
the SDN controller. As a result, the deployment can be scaled up and down,
adding or removing racks and switches, without changing the mobile core control
plane, which instead is provided with the illusion of controlling just one
switch.

The One-Big-UPF abstraction is realized with a ``virtual-upf.p4``
program that formalizes the forwarding model described by PFCP as a series of
match-action tables. This program doesn't run on switches, but it's used as the
schema to define the content of the P4Runtime messages between the PFCP Agent
and the UP4 App. On switches, we use a different program, ``fabric.p4``, which
implements tables similar to the virtual UPF but optimized to satisfy the
resource constraints of Tofino, as well as tables for basic bridging, IP
routing, ECMP, and more. The UP4 App implements a P4Runtime server, as if it
were a switch, but internally it takes care of translating P4Runtime
rules from ``virtual-upf.p4`` to rules for the multiple physical switches running
``fabric.p4``, based on an up-to-date global view of the topology.
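
The translation step described above can be pictured with the following
sketch. It is not the UP4 implementation: the classes ``VirtualRule`` and
``PhysicalRule``, the ``TABLE_MAP`` dictionary, and the table names in it are
all hypothetical, chosen only to show how one rule written against the virtual
schema could fan out into one rule per leaf switch.

.. code-block:: python

   from dataclasses import dataclass
   from typing import Dict, List


   @dataclass
   class VirtualRule:
       """One entry written against the virtual UPF schema (hypothetical)."""
       table: str
       match: Dict[str, int]
       action: str


   @dataclass
   class PhysicalRule:
       """One entry destined for a specific physical switch (hypothetical)."""
       device_id: str
       table: str
       match: Dict[str, int]
       action: str


   # Hypothetical mapping from virtual table names to physical table names.
   TABLE_MAP = {"virtual_sessions": "fabric_sessions"}


   def translate(rule: VirtualRule, leaf_ids: List[str]) -> List[PhysicalRule]:
       """Replicate one virtual rule onto every leaf in the current topology."""
       physical_table = TABLE_MAP[rule.table]
       return [
           PhysicalRule(device_id, physical_table, dict(rule.match), rule.action)
           for device_id in leaf_ids
       ]

Calling ``translate`` with two leaf IDs yields two physical rules with
identical match fields, which mirrors the idea that every leaf holds the same
UPF state.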

Downlink Buffering (DBUF)
-------------------------

A UPF is required to buffer packets when UEs are in idle mode or during
handovers. This is usually called *downlink buffering*, as it applies only to
the downlink direction of traffic. Although most switches provide buffering
capabilities to handle congestion, they cannot hold packets indefinitely. For
this reason, we provide DBUF, a microservice responsible for providing
downlink buffering capabilities to P4-UPF.

.. image:: ../images/dbuf.png
   :width: 400px

When a UE goes idle and turns off its radio, or during handovers, the mobile
core control plane uses PFCP to update the Forwarding Action Rules (FARs) for
that UE to enter buffering mode. When this happens, UP4 updates the switch rules
to steer packets to DBUF using GTP-U tunnels.

UP4 uses gRPC to communicate with DBUF. DBUF notifies UP4 about buffering
events, which are relayed to the mobile core control plane as Downlink Data
Notifications (DDN). When a UE becomes available again, UP4 triggers a buffer
drain on DBUF and updates the switch rules to start sending traffic to the UE again.
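
The buffer-then-drain behavior described above can be sketched as a small
in-memory model. This is not the DBUF code: the class name
``DownlinkBuffer``, the choice to notify only on the first buffered packet of
a UE, and the drop-when-full policy are assumptions made for illustration.

.. code-block:: python

   from collections import defaultdict, deque
   from typing import Callable, Deque, Dict, List


   class DownlinkBuffer:
       """Per-UE FIFO buffer with a notification hook (illustrative only)."""

       def __init__(self, notify: Callable[[str], None], max_packets: int = 4):
           self._queues: Dict[str, Deque[bytes]] = defaultdict(deque)
           self._notify = notify            # stands in for the buffering event
           self._max_packets = max_packets  # assumed per-UE capacity

       def enqueue(self, ue_addr: str, packet: bytes) -> bool:
           """Buffer a downlink packet; return False if it had to be dropped."""
           queue = self._queues[ue_addr]
           if len(queue) >= self._max_packets:
               return False
           if not queue:
               # First packet for this UE: report that buffering has started.
               self._notify(ue_addr)
           queue.append(packet)
           return True

       def drain(self, ue_addr: str) -> List[bytes]:
           """Release all packets for a UE in arrival order and forget them."""
           return list(self._queues.pop(ue_addr, deque()))

In this model the ``notify`` callback plays the role of the event that UP4
relays as a DDN, and ``drain`` corresponds to the drain that UP4 triggers once
the UE is reachable again.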

Deploying DBUF is optional (it can be enabled in the SD-Fabric Helm Chart).
The DBUF feature requires SR-IOV and DHCP support on NICs and Kubernetes CNIs.

ONOS Configuration
------------------

The UPF configuration is split into two configurations that can be provided
independently to ONOS. The first is used to configure the UP4 ONOS application
and defines UPF-related information such as the S1U address and the network
devices implementing the UPF. The second is used to configure parameters
related to the DBUF functionality.

Here's a list of fields that you can configure via the UPF Network Configuration
for UP4:

* ``devices``: A list of device IDs that implement the UPF data plane. This
  list must include all the leaf switches in the topology. The UPF state is
  replicated on all devices specified in this configuration field. The devices
  specified in this list must use a P4 pipeline implementing the UPF
  functionality. **Required**

* ``s1uAddr``: **Deprecated**. Use ``n3Addr`` instead.

* ``n3Addr``: The IP address of the N3 interface (equivalent to S1-U for 4G).
  It can be an arbitrary IP address. **Optional**
  (The PFCP agent can insert the interface table entry if not supplied.)

* ``uePools``: A list of subnets that are in use by the UEs. **Optional**
  (The PFCP agent can insert the interface table entry if not supplied.)

* ``sliceId``: Network slice ID used by mobile traffic. **Optional**
  Required only when either ``n3Addr`` or ``uePools`` is specified.

* ``dbufDrainAddr``: The IP address of the UPF data plane interface that the
  DBUF service will drain packets towards. **Optional**

* ``pscEncapEnabled``: Set whether the UPF should use the GTP-U extension PDU
  Session Container when doing encapsulation of downlink packets. **Optional**
  (Should be set to true for 5G.)

* ``defaultQfi``: The default QoS Flow Identifier to use when the PDU Session
  Container encapsulation is enabled. **Optional**

Here is an example of netcfg JSON for UP4:

.. code-block:: json

   {
     "apps": {
       "org.omecproject.up4": {
         "up4": {
           "devices": [
             "device:leaf1",
             "device:leaf2"
           ],
           "n3Addr": "10.32.11.126",
           "uePools": [
             "10.240.0.0/16"
           ],
           "sliceId": 0,
           "dbufDrainAddr": "10.32.11.126",
           "pscEncapEnabled": false,
           "defaultQfi": 0
         }
       }
     }
   }
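
Before pushing a configuration like the one above, it can help to check it
against the field rules listed earlier. The following is a minimal sketch of
such a check, not part of UP4 or ONOS; the function name
``validate_up4_config`` and the error messages are hypothetical, and only the
rules stated in this section are enforced.

.. code-block:: python

   import ipaddress
   from typing import List


   def validate_up4_config(netcfg: dict) -> List[str]:
       """Return a list of problems found in the "up4" netcfg block."""
       up4 = netcfg["apps"]["org.omecproject.up4"]["up4"]
       errors = []

       # "devices" is the only required field.
       if not up4.get("devices"):
           errors.append("devices: required, must list all leaf switches")

       # "sliceId" becomes required once n3Addr or uePools is present.
       if ("n3Addr" in up4 or "uePools" in up4) and "sliceId" not in up4:
           errors.append("sliceId: required when n3Addr or uePools is set")

       # Address fields must parse as IP addresses.
       for field in ("n3Addr", "dbufDrainAddr"):
           if field in up4:
               try:
                   ipaddress.ip_address(up4[field])
               except ValueError:
                   errors.append(f"{field}: not a valid IP address")

       # Each UE pool must parse as a subnet in CIDR notation.
       for pool in up4.get("uePools", []):
           try:
               ipaddress.ip_network(pool)
           except ValueError:
               errors.append(f"uePools: {pool!r} is not a valid subnet")

       return errors

An empty list means the block satisfies these checks. A block containing only
``n3Addr`` would produce two errors: one for the missing ``devices`` and one
for the missing ``sliceId``.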

The entire DBUF configuration block is **Optional**: UP4 can be used without the
downlink buffering functionality. Here's a list of fields that you can
configure:

* ``serviceAddr``: The address of the DBUF service management interface in the
  form IP:port. This address is used to communicate with the DBUF service via
  gRPC (for example, to trigger the drain operation, or to receive notifications
  for buffered packets).

* ``dataplaneAddr``: The address of the DBUF service data plane interface in the
  form IP:port. Packets sent to this address by the UPF switches will be
  buffered by DBUF. The IP address must be a routable fabric address.

Here is an example of netcfg for DBUF:

.. code-block:: json

   {
     "apps": {
       "org.omecproject.up4": {
         "dbuf": {
           "serviceAddr": "10.76.28.72:10000",
           "dataplaneAddr": "10.32.11.3:2152"
         }
       }
     }
   }
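
Both DBUF fields use the IP:port form. As a rough sketch of how such a value
could be checked before use, assuming IPv4 addresses as in the example above,
the hypothetical helper ``parse_endpoint`` below splits and validates the two
parts. It is not taken from UP4 or DBUF.

.. code-block:: python

   import ipaddress
   from typing import Tuple


   def parse_endpoint(value: str) -> Tuple[str, int]:
       """Split an "IP:port" string into (ip, port), validating both parts."""
       host, sep, port_text = value.rpartition(":")
       if not sep:
           raise ValueError(f"{value!r} is not in IP:port form")
       ipaddress.IPv4Address(host)  # raises ValueError on a malformed address
       port = int(port_text)        # raises ValueError on a non-numeric port
       if not 0 < port < 65536:
           raise ValueError(f"port {port} is out of range")
       return host, port

With the example configuration, ``parse_endpoint("10.32.11.3:2152")`` returns
``("10.32.11.3", 2152)``; port 2152 is the UDP port registered for GTP-U.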

.. note::
   When deploying DBUF using the SD-Fabric Helm Chart you do **NOT** need to
   provide the ``"dbuf"`` part of the UP4 config. That will be pushed
   automatically by the DBUF Kubernetes pod.

.. _pfcp_agent_config:

PFCP Agent Configuration
------------------------

PFCP Agent can be deployed as part of the SD-Fabric Helm Chart.

See the `SD-Fabric Helm Chart README <https://gerrit.opencord.org/plugins/gitiles/sdfabric-helm-charts/+/HEAD/sdfabric/README.md>`_ for more information on the configuration
parameters used in SD-Fabric.
See the `PFCP agent configuration guide <https://github.com/omec-project/upf/blob/master/docs/configuration_guide.md>`_ for other parameters provided by PFCP agent.
Once deployed, use ``kubectl get services -n sdfabric`` to find out
the exact UDP endpoint used to listen for PFCP connection requests.

UP4 Troubleshooting
-------------------

See :ref:`troubleshooting_guide`.