P4-based User Plane Function (P4-UPF)
=====================================

Overview
--------

SD-Fabric supports running a 4G/5G mobile core User Plane Function (UPF) as part
of the switches' packet processing pipeline. Like the rest of the pipeline, this
is realized using P4, and for this reason we call it P4-UPF.

P4-UPF is integrated with the ONF's SD-Core project. By default, SD-Core ships
with BESS-UPF, a containerized UPF implementation based on the Berkeley
Software Switch (BESS).

SD-Fabric can be used with BESS-UPF or any other UPF implementation that runs on
servers. In this case, the fabric switches route GTP-U packets between radio
base stations and servers. When P4-UPF is enabled, the same fabric switches
perform GTP-U tunnel termination.

.. image:: ../images/bess-p4-upf.png
    :width: 700px

**Supported Features**

SD-Fabric's P4-UPF implements a core set of features capable of supporting
requirements for a broad range of enterprise use cases:

* GTP-U tunnel encap/decap: including support for 5G extensions such as PDU
  Session Container carrying QoS Flow Information.
* Accounting: we use switch counters to collect per-flow stats and support usage
  reporting and volume-based triggers.
* Downlink buffering: when a user device radio goes idle (power-save mode) or
  during a handover, switches are updated to forward all downlink traffic for
  the specific device (UE) to DBUF, a K8s-managed buffering service running on
  servers. Then, when the device radio becomes ready to receive traffic,
  packets are drained from the software buffers back to the switch to be
  delivered to base stations.
* QoS: support for enforcement of maximum bitrate (MBR), minimum guaranteed
  bitrate (GBR, via admission control), and prioritization using switch
  queues and scheduling policy.
* Slicing: multiple logical UPFs can be instantiated on the same switch, each
  one with its own QoS model and isolation guarantees enforced at the hardware
  level using separate queues.
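To make the GTP-U encap/decap feature listed above more concrete, here is a
minimal Python sketch of the 8-byte mandatory GTP-U header. It is only an
illustration and not the P4 implementation used by SD-Fabric: the function
names are hypothetical, the outer IP/UDP headers are omitted, and optional
fields such as the PDU Session Container extension are not handled.

```python
import struct
from typing import Tuple

GTPU_FLAGS = 0x30     # version 1, protocol type GTP, no optional fields
GTPU_MSG_GPDU = 0xFF  # G-PDU: the message type that carries a user packet
GTPU_HDR_LEN = 8      # size of the mandatory header in bytes


def gtpu_encap(teid: int, inner: bytes) -> bytes:
    """Prepend the mandatory GTP-U header to an inner packet (hypothetical helper)."""
    # The length field counts the bytes that follow the mandatory header.
    header = struct.pack("!BBHI", GTPU_FLAGS, GTPU_MSG_GPDU, len(inner), teid)
    return header + inner


def gtpu_decap(packet: bytes) -> Tuple[int, bytes]:
    """Return (TEID, inner packet) for a G-PDU without optional fields."""
    flags, msg_type, length, teid = struct.unpack("!BBHI", packet[:GTPU_HDR_LEN])
    if flags != GTPU_FLAGS or msg_type != GTPU_MSG_GPDU:
        raise ValueError("unsupported GTP-U header")
    return teid, packet[GTPU_HDR_LEN:GTPU_HDR_LEN + length]
```

In this sketch the tunnel endpoint identifier (TEID) is the only per-tunnel
state carried in the header, which is why tunnel termination can be expressed
as match-action rules keyed on it.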

**Distributed UPF**

.. image:: ../images/upf-distributed.png
    :width: 700px

In SD-Fabric we support different topologies to meet the requirements of
different deployment sizes: from a single rack with just one leaf switch, or
paired leaves for redundancy, to an N x M leaf-spine fabric for multi-rack
deployments. For this reason, P4-UPF is realized with a "distributed" data plane
implementation where all leaf switches are programmed with the same UPF
rules, such that any leaf can terminate any GTP-U tunnel. This provides several
benefits:

* Simplified deployment: base stations can be connected via any leaf switch.
* Minimum latency: the UPF function is applied as soon as packets enter the
  fabric, without going through additional devices before reaching their final
  destination.
* Fast failover: when using paired leaves, if one switch fails, the other can
  immediately take over as it is already programmed with the same UPF state.
* Fabric-wide slicing & QoS guarantees: packets are classified as soon as they
  hit the first leaf. We then use a custom DSCP-based marking to enforce the
  same QoS rules on all hops. In case of congestion, flows deemed high priority
  are treated as such by all switches.
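The DSCP-based marking mentioned above can be pictured with a small Python
sketch. The text does not specify the encoding, so the bit layout below (a
4-bit slice ID followed by a 2-bit traffic class in the 6-bit DSCP field) is
purely an assumption made for illustration, and the function names are
hypothetical.

```python
SLICE_BITS = 4  # assumed width of the slice ID (not stated in this document)
TC_BITS = 2     # assumed width of the traffic class (not stated in this document)


def encode_dscp(slice_id: int, traffic_class: int) -> int:
    """Pack a slice ID and traffic class into a 6-bit DSCP value."""
    if not 0 <= slice_id < (1 << SLICE_BITS):
        raise ValueError("slice_id out of range")
    if not 0 <= traffic_class < (1 << TC_BITS):
        raise ValueError("traffic_class out of range")
    return (slice_id << TC_BITS) | traffic_class


def decode_dscp(dscp: int) -> tuple:
    """Recover (slice ID, traffic class) from a DSCP value on a later hop."""
    if not 0 <= dscp < (1 << (SLICE_BITS + TC_BITS)):
        raise ValueError("dscp out of range")
    return dscp >> TC_BITS, dscp & ((1 << TC_BITS) - 1)
```

The point of the sketch is that the first leaf classifies once, and every
later hop only needs to read the marking to apply the same queueing decision.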

**Control Architecture and Integration with SD-Core**

SD-Fabric's P4-UPF is integrated with the ONF SD-Core project to provide a
high-performance 3GPP-compliant mobile core solution.

The integration with SD-Core is achieved via an ONOS application called UP4,
which is in charge of populating the UPF tables of the switch pipeline.

.. image:: ../images/up4-arch.png
    :width: 600px

The interface between the mobile core control plane and the UPF is defined by
the 3GPP standard Packet Forwarding Control Protocol (PFCP). This is a complex
protocol that can be difficult to understand, even though at its essence the
rules that it installs are simple match-action rules. The implementation of such
a protocol, including message parsing, state machines, and other bookkeeping,
can be common to many different UPF realizations. For this reason, SD-Fabric
relies on an implementation of the PFCP protocol realized as an external
microservice named "PFCP Agent", which is provided by the SD-Core project.

The UP4 App abstracts the whole fabric as one virtual big switch with UPF
capabilities; we call this the One-Big-UPF abstraction. This abstraction allows
the upper layers to be independent of the underlying physical topology.
Communication between the PFCP Agent and the UP4 App is done via P4Runtime. This
is the same API that ONOS uses to communicate with the actual switches. However,
in the former case, it is used between two control planes: the mobile core and
the SDN controller. By doing this, the deployment can be scaled up and down,
adding or removing racks and switches, without changing the mobile core control
plane, which instead is provided with the illusion of controlling just one
switch.

The One-Big-UPF abstraction is realized with a ``virtual-upf.p4``
program that formalizes the forwarding model described by PFCP as a series of
match-action tables. This program doesn't run on switches, but it's used as the
schema to define the content of the P4Runtime messages between PFCP Agent and
the UP4 App. On switches, we use a different program, ``fabric.p4``, which
implements tables similar to the virtual UPF but optimized to satisfy the
resource constraints of Tofino, as well as tables for basic bridging, IP
routing, ECMP, and more. The UP4 App implements a P4Runtime server, as if it
were a switch, but instead it internally takes care of translating P4Runtime
rules from ``virtual-upf.p4`` to rules for the multiple physical switches running
``fabric.p4``, based on an up-to-date global view of the topology.
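The translation step described above can be sketched in a few lines of Python.
This is a simplified illustration of the idea, not the UP4 code: the ``Rule``
type, the table names in ``TABLE_MAP``, and the ``translate`` function are all
hypothetical, and real translation involves more than renaming a table.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Rule:
    """A simplified match-action rule (hypothetical type)."""
    table: str
    match: Tuple[Tuple[str, int], ...]
    action: str


# Hypothetical mapping from a logical table name to a physical table name.
TABLE_MAP = {"virtual_upf.sessions_uplink": "fabric.sessions_uplink"}


def translate(rule: Rule, leaves: List[str]) -> Dict[str, Rule]:
    """Turn one logical rule into one physical rule per leaf switch."""
    physical_table = TABLE_MAP[rule.table]
    # Every leaf receives the same rule, so any leaf can terminate the tunnel.
    return {leaf: Rule(physical_table, rule.match, rule.action) for leaf in leaves}
```

In this sketch the caller on the mobile core side only ever writes one rule;
the fan-out to leaves is hidden behind ``translate``, which mirrors the
illusion of controlling just one switch.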

Downlink Buffering (DBUF)
-------------------------

A UPF is required to buffer packets when UEs are in idle mode or during
handovers. This is usually called *downlink buffering*, as it applies only to
the downlink direction of traffic. Although most switches provide buffering
capabilities to handle congestion, they cannot hold packets indefinitely. For
this reason, we provide DBUF, a microservice responsible for providing the
downlink buffering capabilities to P4-UPF.

.. image:: ../images/dbuf.png
    :width: 400px

When a UE goes idle and turns off its radio, or during handovers, the mobile
core control plane uses PFCP to update the Forwarding Action Rules (FARs) for
that UE to enter buffering mode. When this happens, UP4 updates the switch rules
to steer packets to DBUF using GTP-U tunnels.

UP4 uses gRPC to communicate with DBUF. DBUF notifies UP4 about buffering
events, which are relayed to the mobile core control plane as Downlink Data
Notifications (DDN). When a UE becomes available again, UP4 triggers a buffer
drain on DBUF and updates the switch rules to start sending traffic to the UE
again.
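The buffer-then-drain sequence described above can be modeled with a small
in-memory Python sketch. It is a toy model under stated assumptions, not the
DBUF or UP4 implementation: the class and method names are hypothetical, gRPC
and GTP-U are left out, and the choice to raise a notification only for the
first buffered packet of a UE is an assumption.

```python
from collections import deque
from typing import Deque, Dict, List, Set


class BufferingSketch:
    """Toy model of downlink buffering for idle UEs (hypothetical)."""

    def __init__(self) -> None:
        self.buffering: Set[str] = set()           # UEs currently in buffering mode
        self.queues: Dict[str, Deque[bytes]] = {}  # per-UE buffered packets
        self.notifications: List[str] = []         # stand-in for DDNs sent upstream

    def set_buffering(self, ue: str) -> None:
        """Control plane puts the UE in buffering mode."""
        self.buffering.add(ue)
        self.queues.setdefault(ue, deque())

    def downlink(self, ue: str, packet: bytes) -> List[bytes]:
        """Return the packets delivered to the UE right now."""
        if ue not in self.buffering:
            return [packet]
        queue = self.queues[ue]
        if not queue:
            # Assumption: notify once, when the first packet is buffered.
            self.notifications.append(ue)
        queue.append(packet)
        return []

    def drain(self, ue: str) -> List[bytes]:
        """UE is reachable again: release buffered packets in arrival order."""
        self.buffering.discard(ue)
        return list(self.queues.pop(ue, deque()))
```

A short usage example: after ``set_buffering("ue1")``, calls to
``downlink("ue1", ...)`` return an empty list until ``drain("ue1")`` hands back
the queued packets, after which traffic flows directly again.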

Deploying DBUF is optional (it can be enabled in the SD-Fabric Helm Chart).
The DBUF feature requires SR-IOV and DHCP support on NICs and Kubernetes CNIs.

ONOS Configuration
------------------

The UPF configuration is split into two configurations that can be provided
independently to ONOS. The first is used to configure the UP4 ONOS application
and defines UPF-related information such as the S1U address and the network
devices implementing the UPF. The second is used to configure parameters
related to the DBUF functionality.

Here's a list of fields that you can configure via the UPF Network Configuration
for UP4:

* ``devices``: A list of device IDs that implement the UPF data plane. This
  list must include all the leaf switches in the topology. The UPF state is
  replicated on all devices specified in this configuration field. The devices
  specified in this list must use a P4 pipeline implementing the UPF
  functionality. *Required*

* ``s1uAddr``: The IP address of the S1-U interface (equivalent to N3 for 5G).
  It can be an arbitrary IP address. *Required*

* ``uePools``: A list of subnets that are in use by the UEs. *Required*

* ``dbufDrainAddr``: The IP address of the UPF data plane interface that the
  DBUF service will drain packets towards. *Optional*

* ``pscEncapEnabled``: Set whether the UPF should use the GTP-U extension PDU
  Session Container when doing encapsulation of downlink packets. *Optional*

* ``defaultQfi``: The default QoS Flow Identifier to use when the PDU Session
  Container encapsulation is enabled. *Optional*

Here is an example of netcfg JSON for UP4:

.. code-block:: json

    {
      "apps": {
        "org.omecproject.up4": {
          "up4": {
            "devices": [
              "device:leaf1",
              "device:leaf2"
            ],
            "s1uAddr": "10.32.11.126",
            "uePools": [
              "10.240.0.0/16"
            ],
            "dbufDrainAddr": "10.32.11.126",
            "pscEncapEnabled": false,
            "defaultQfi": 0
          }
        }
      }
    }
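Before pushing a netcfg like the one above, it can be useful to sanity-check it
locally. The following Python sketch checks only what the field list states
(which fields are required, and that addresses and subnets parse). The function
name ``check_up4_netcfg`` is hypothetical and is not part of ONOS or UP4.

```python
import ipaddress
import json
from typing import List

REQUIRED_FIELDS = ("devices", "s1uAddr", "uePools")
ADDRESS_FIELDS = ("s1uAddr", "dbufDrainAddr")


def check_up4_netcfg(text: str) -> List[str]:
    """Return a list of problems found in a UP4 netcfg JSON document."""
    cfg = json.loads(text)["apps"]["org.omecproject.up4"]["up4"]
    problems = [f"missing required field: {name}"
                for name in REQUIRED_FIELDS if name not in cfg]
    for name in ADDRESS_FIELDS:
        if name in cfg:
            try:
                ipaddress.ip_address(cfg[name])
            except ValueError:
                problems.append(f"{name} is not an IP address")
    for pool in cfg.get("uePools", []):
        try:
            ipaddress.ip_network(pool)
        except ValueError:
            problems.append(f"uePools entry {pool!r} is not a subnet")
    return problems
```

An empty list means the checks passed. The sketch does not verify that the
listed devices exist or run a UPF-capable pipeline, which only ONOS can tell.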

The DBUF configuration block is entirely *optional*: UP4 can be used without
the downlink buffering functionality. Here's a list of fields that you can
configure:

* ``serviceAddr``: The address of the DBUF service management interface in the
  form IP:port. This address is used to communicate with the DBUF service via
  gRPC (for example, to trigger the drain operation, or to receive
  notifications for buffered packets).

* ``dataplaneAddr``: The address of the DBUF service data plane interface in the
  form IP:port. Packets sent to this address by the UPF switches will be
  buffered by DBUF. The IP address must be a routable fabric address.

Here is an example of netcfg for DBUF:

.. code-block:: json

    {
      "apps": {
        "org.omecproject.up4": {
          "dbuf": {
            "serviceAddr": "10.76.28.72:10000",
            "dataplaneAddr": "10.32.11.3:2152"
          }
        }
      }
    }
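Both DBUF fields use the IP:port form. As a rough illustration of what that
form implies, the Python sketch below splits and validates such a value. The
helper name ``parse_endpoint`` is hypothetical, and the sketch assumes plain
IPv4 addresses as in the example above.

```python
import ipaddress
from typing import Tuple


def parse_endpoint(value: str) -> Tuple[str, int]:
    """Split an 'IP:port' string into (ip, port), raising ValueError if invalid."""
    host, sep, port_text = value.rpartition(":")
    if not sep:
        raise ValueError(f"expected IP:port, got {value!r}")
    ipaddress.ip_address(host)  # raises ValueError if host is not an IP address
    port = int(port_text)       # raises ValueError if port is not a number
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return host, port
```

For example, ``parse_endpoint("10.32.11.3:2152")`` yields the address and the
port as separate values, while a value without a port is rejected.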

.. note::
    When deploying DBUF using the SD-Fabric Helm Chart you do **NOT** need to
    provide the ``"dbuf"`` part of the UP4 config. That will be pushed
    automatically by the DBUF Kubernetes pod.

PFCP Agent Configuration
------------------------

PFCP Agent can be deployed as part of the SD-Fabric Helm Chart.

See the Helm Chart documentation for more information on the configuration
parameters. Once deployed, use ``kubectl get services -n sdfabric`` to find out
the exact UDP endpoint used to listen for PFCP connection requests.

UP4 Troubleshooting
-------------------

See :ref:`troubleshooting_guide`.