P4-based User Plane Function (P4-UPF)
=====================================

Overview
--------

SD-Fabric supports running a 4G/5G mobile core User Plane Function (UPF) as part
of the switches' packet processing pipeline. Like the rest of the pipeline, this
is realized using P4, and for this reason we call it P4-UPF.

P4-UPF is integrated with ONF's SD-Core project. By default, SD-Core ships with
BESS-UPF, a containerized UPF implementation based on the Berkeley Extensible
Software Switch (BESS).

SD-Fabric can be used with BESS-UPF or any other UPF implementation that runs on
servers. In this case, the fabric switches can provide routing of GTP-U packets
to and from radio base stations and servers. When P4-UPF is enabled, the same
fabric switches perform GTP-U tunnel termination.

.. image:: ../images/bess-p4-upf.png
   :width: 700px

**Supported Features**

SD-Fabric's P4-UPF implements a core set of features capable of supporting the
requirements of a broad range of enterprise use cases:

* GTP-U tunnel encap/decap: including support for 5G extensions such as the PDU
  Session Container carrying QoS Flow Information.
* Accounting: we use switch counters to collect per-flow stats and support usage
  reporting and volume-based triggers.
* Downlink buffering: when a user device radio goes idle (power-save mode) or
  during a handover, switches are updated to forward all downlink traffic for
  the specific device (UE) to DBUF, a K8s-managed buffering service running on
  servers. Then, when the device radio becomes ready to receive traffic,
  packets are drained from the software buffers back to the switch to be
  delivered to base stations.
* QoS: support for enforcement of maximum bitrate (MBR), minimum guaranteed
  bitrate (GBR, via admission control), and prioritization using switch
  queues and scheduling policies.
* Slicing: multiple logical UPFs can be instantiated on the same switch, each
  one with its own QoS model and isolation guarantees enforced at the hardware
  level using separate queues.

**Distributed UPF**

.. image:: ../images/upf-distributed.png
   :width: 700px

In SD-Fabric we support different topologies to meet the requirements of
different deployment sizes: from a single rack with just one leaf switch, or
paired leaves for redundancy, to an NxM leaf-spine fabric for multi-rack
deployments. For this reason, P4-UPF is realized with a "distributed" data plane
implementation where all leaf switches are programmed with the same UPF rules,
such that any leaf can terminate any GTP-U tunnel. This provides several
benefits:

* Simplified deployment: base stations can be connected via any leaf switch.
* Minimal latency: the UPF function is applied as soon as packets enter the
  fabric, without going through additional devices before reaching their final
  destination.
* Fast failover: when using paired leaves, if one switch fails, the other can
  immediately take over as it is already programmed with the same UPF state.
* Fabric-wide slicing & QoS guarantees: packets are classified as soon as they
  hit the first leaf. We then use a custom DSCP-based marking to enforce the
  same QoS rules on all hops. In case of congestion, flows deemed high priority
  are treated as such by all switches (a simplified sketch of this marking
  follows this list).

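To make the DSCP-based marking concrete, here is a minimal sketch. The exact bit
layout (4 bits of slice ID and 2 bits of traffic class packed into the 6-bit
DSCP field) is an assumption made for illustration only; the actual encoding is
defined by the SD-Fabric slicing and QoS implementation.

.. code-block:: python

   # Illustrative sketch: the 4-bit slice ID / 2-bit traffic class layout inside
   # the 6-bit DSCP field is an assumption for illustration only.

   def mark_dscp(slice_id: int, traffic_class: int) -> int:
       """Marking applied at the first leaf: pack QoS intent into DSCP."""
       assert 0 <= slice_id < 16 and 0 <= traffic_class < 4
       return (slice_id << 2) | traffic_class

   def classify_dscp(dscp: int):
       """Reverse mapping applied on every subsequent hop to pick the queue."""
       return dscp >> 2, dscp & 0b11

   dscp = mark_dscp(slice_id=1, traffic_class=2)  # done once, at the first leaf
   print(dscp, classify_dscp(dscp))               # same (slice, TC) on all hops
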
**Control Architecture and Integration with SD-Core**

SD-Fabric's P4-UPF is integrated with the ONF SD-Core project to provide a
high-performance 3GPP-compliant mobile core solution.

The integration with SD-Core is achieved via an ONOS application called UP4,
which is in charge of populating the UPF tables of the switch pipeline.

.. image:: ../images/up4-arch.png
   :width: 600px

The interface between the mobile core control plane and the UPF is defined by
the 3GPP standard Packet Forwarding Control Protocol (PFCP). This is a complex
protocol that can be difficult to understand, even though in essence the rules
it installs are simple match-action rules. The implementation of such a protocol
(message parsing, state machines, and other bookkeeping) can be common to many
different UPF realizations. For this reason, SD-Fabric relies on an
implementation of PFCP realized as an external microservice named "PFCP Agent",
which is provided by the SD-Core project.

The UP4 App abstracts the whole fabric as one big virtual switch with UPF
capabilities; we call this the One-Big-UPF abstraction. This abstraction allows
the upper layers to be independent of the underlying physical topology.
Communication between the PFCP Agent and the UP4 App is done via P4Runtime. This
is the same API that ONOS uses to communicate with the actual switches; however,
in the former case it is used between two control planes: the mobile core and
the SDN controller. By doing this, the deployment can be scaled up and down,
adding or removing racks and switches, without changing the mobile core control
plane, which is instead given the illusion of controlling just one switch.

The One-Big-UPF abstraction is realized with a ``virtual-upf.p4`` program that
formalizes the forwarding model described by PFCP as a series of match-action
tables. This program doesn't run on switches, but is used as the schema that
defines the content of the P4Runtime messages exchanged between the PFCP Agent
and the UP4 App. On switches, we use a different program, ``fabric.p4``, which
implements tables similar to the virtual UPF but optimized to satisfy the
resource constraints of Tofino, as well as tables for basic bridging, IP
routing, ECMP, and more. The UP4 App implements a P4Runtime server, as if it
were a switch, but internally it takes care of translating P4Runtime rules from
``virtual-upf.p4`` into rules for the multiple physical switches running
``fabric.p4``, based on an up-to-date global view of the topology.

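The snippet below is a simplified, hypothetical sketch of this fan-out: one
logical rule expressed against ``virtual-upf.p4`` is replicated to every device
listed in the UP4 configuration (described in the ONOS Configuration section
below), so that any leaf can terminate the corresponding GTP-U tunnel. The
class, table, and field names are invented for illustration; the real UP4 App
is an ONOS (Java) application and also handles buffering state, QoS, and
topology changes.

.. code-block:: python

   # Conceptual sketch only: names (VirtualUpfEntry, table names, device IDs)
   # are hypothetical and do not reflect the actual UP4/ONOS implementation.
   from dataclasses import dataclass
   from typing import List

   @dataclass
   class VirtualUpfEntry:
       """One logical rule written by the PFCP Agent against virtual-upf.p4."""
       table: str    # e.g. a downlink termination table
       ue_addr: str  # match: UE IP address
       teid: int     # action parameter: tunnel ID used for encap
       qfi: int      # action parameter: QoS Flow Identifier

   @dataclass
   class PhysicalEntry:
       """The same rule, targeted at one physical switch running fabric.p4."""
       device_id: str
       table: str
       match: dict
       params: dict

   def translate(entry: VirtualUpfEntry, devices: List[str]) -> List[PhysicalEntry]:
       """Replicate one virtual UPF rule to every leaf listed in the netcfg,
       so that any leaf can terminate the GTP-U tunnel for this UE."""
       return [
           PhysicalEntry(
               device_id=dev,
               table="fabric_downlink_terminations",  # hypothetical fabric.p4 table
               match={"ue_addr": entry.ue_addr},
               params={"teid": entry.teid, "qfi": entry.qfi},
           )
           for dev in devices
       ]

   # Example: one PFCP-derived rule fans out to both leaves from the netcfg.
   rule = VirtualUpfEntry("virtual_upf_terminations", "10.240.0.10", 0x1001, 0)
   for phys in translate(rule, ["device:leaf1", "device:leaf2"]):
       print(phys.device_id, phys.match, phys.params)
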
Downlink Buffering (DBUF)
-------------------------

TODO Carmelo: overview of DBUF

ONOS Configuration
------------------

The UPF configuration is split into two parts that can be provided independently
to ONOS. The first is used to configure the UP4 ONOS application and defines
UPF-related information such as the S1-U address, the network devices
implementing the UPF, etc. The second is used to configure parameters related to
the Downlink Buffering (DBUF) functionality.

Here's a list of fields that you can configure via the UPF Network Configuration
for UP4:

* ``devices``: A list of device IDs that implement the UPF data plane. This
  list must include all the leaf switches in the topology. The UPF state is
  replicated on all devices specified in this configuration field. The devices
  specified in this list must use a P4 pipeline implementing the UPF
  functionality. *Required*

* ``s1uAddr``: The IP address of the S1-U interface (equivalent to N3 for 5G).
  It can be an arbitrary IP address. *Required*

* ``uePools``: A list of subnets that are in use by the UEs. *Required*

* ``dbufDrainAddr``: The IP address of the UPF data plane interface that the
  DBUF service will drain packets towards. *Optional*

* ``pscEncapEnabled``: Whether the UPF should use the GTP-U PDU Session
  Container extension when encapsulating downlink packets. *Optional*

* ``defaultQfi``: The default QoS Flow Identifier to use when PDU Session
  Container encapsulation is enabled. *Optional*

Here is an example of netcfg JSON for UP4:

.. code-block:: json

   {
     "apps": {
       "org.omecproject.up4": {
         "up4": {
           "devices": [
             "device:leaf1",
             "device:leaf2"
           ],
           "s1uAddr": "10.32.11.126",
           "uePools": [
             "10.240.0.0/16"
           ],
           "dbufDrainAddr": "10.32.11.126",
           "pscEncapEnabled": false,
           "defaultQfi": 0
         }
       }
     }
   }

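The netcfg can be pushed to ONOS using the usual methods, for example the ONOS
REST API. The snippet below is a sketch that assumes ONOS is reachable at
``127.0.0.1:8181`` with the default ``onos``/``rocks`` credentials and that the
JSON above is saved as ``netcfg-up4.json``; adjust these for your deployment.
The same approach applies to the DBUF block shown later, and the ``onos-netcfg``
helper script, where available, achieves the same result.

.. code-block:: python

   # Example only: host, port, credentials, and file name are assumptions about
   # a typical ONOS deployment; adjust them to match your environment.
   import json
   import requests

   ONOS_URL = "http://127.0.0.1:8181/onos/v1/network/configuration"
   AUTH = ("onos", "rocks")  # default ONOS REST credentials

   with open("netcfg-up4.json") as f:
       netcfg = json.load(f)

   resp = requests.post(ONOS_URL, json=netcfg, auth=AUTH)
   resp.raise_for_status()
   print("netcfg applied:", resp.status_code)
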
The DBUF configuration block is entirely *optional*; UP4 can be used without the
downlink buffering functionality. Here's a list of fields that you can
configure:

* ``serviceAddr``: The address of the DBUF service management interface, in the
  form IP:port. This address is used to communicate with the DBUF service via
  gRPC (for example, to trigger the drain operation, or to receive notifications
  of buffered packets).

* ``dataplaneAddr``: The address of the DBUF service data plane interface, in
  the form IP:port. Packets sent to this address by the UPF will be buffered by
  DBUF. The IP address must be a routable fabric address.

Here is an example of netcfg for DBUF:

.. code-block:: json

   {
     "apps": {
       "org.omecproject.up4": {
         "dbuf": {
           "serviceAddr": "10.76.28.72:10000",
           "dataplaneAddr": "10.32.11.3:2152"
         }
       }
     }
   }

SD-Core Configuration
---------------------

TODO Carmelo:

* Assuming SD-Core is already installed...
* Instructions to install PFCP Agent for UP4
* Reference for helm values configuration

Should be similar to BESS install instructions (where the same helm chart
installs both PFCP agent and BESS):
https://docs.aetherproject.org/master/edge_deployment/bess_upf_deployment.html

But using this helm chart (without BESS), just PFCP Agent:
https://gerrit.opencord.org/plugins/gitiles/aether-helm-charts/+/refs/heads/master/omec/omec-upf-pfcp-agent/

UP4 Troubleshooting
-------------------

``TODO Daniele``

Example of UP4 CLI commands to debug the state of UP4.

DBUF
----


``TODO Carmelo`` overview


``TODO Hung-Wei`` deployment instructions (helm chart)