P4-based User Plane Function (P4-UPF)
=====================================

Overview
--------

SD-Fabric supports running a 4G/5G mobile core User Plane Function (UPF) as part
of the switches' packet processing pipeline. Like the rest of the pipeline, this
is realized using P4, and for this reason we call it P4-UPF.

P4-UPF is integrated with the ONF's SD-Core project. By default, SD-Core ships
with BESS-UPF, a containerized UPF implementation based on the Berkeley
Software Switch (BESS).

SD-Fabric can be used with BESS-UPF or any other UPF implementation that runs on
servers. In this case, the fabric switches provide routing of GTP-U packets
to and from radio base stations and servers. When P4-UPF is enabled, the same
fabric switches perform GTP-U tunnel termination.

.. image:: ../images/bess-p4-upf.png
   :width: 700px

**Supported Features**

SD-Fabric's P4-UPF implements a core set of features capable of supporting
requirements for a broad range of enterprise use cases:

* GTP-U tunnel encap/decap: including support for 5G extensions such as PDU
  Session Container carrying QoS Flow Information.
* Accounting: we use switch counters to collect per-flow stats and support usage
  reporting and volume-based triggers.
* Downlink buffering: when a user device radio goes idle (power-save mode) or
  during a handover, switches are updated to forward all downlink traffic for
  the specific device (UE) to DBUF, a K8s-managed buffering service running on
  servers. Then, when the device radio becomes ready to receive traffic,
  packets are drained from the software buffers back to the switch to be
  delivered to base stations.
* QoS: support for enforcement of maximum bitrate (MBR), minimum guaranteed
  bitrate (GBR, via admission control), and prioritization using switch
  queues and scheduling policy.
* Slicing: multiple logical UPFs can be instantiated on the same switch, each
  one with its own QoS model and isolation guarantees enforced at the hardware
  level using separate queues.

**Distributed UPF**

.. image:: ../images/upf-distributed.png
   :width: 700px

SD-Fabric supports different topologies to meet the requirements of
different deployment sizes: from a single rack with just one leaf
switch, or paired leaves for redundancy, to an N x M leaf-spine fabric for
multi-rack deployments. For this reason, P4-UPF is realized with a "distributed"
data plane implementation where all leaf switches are programmed with the same
UPF rules, such that any leaf can terminate any GTP-U tunnel. This provides
several benefits:

* Simplified deployment: base stations can be connected via any leaf switch.
* Minimum latency: the UPF function is applied as soon as packets enter the
  fabric, without going through additional devices before reaching their final
  destination.
* Fast failover: when using paired leaves, if one switch fails, the other can
  immediately take over as it is already programmed with the same UPF state.
* Fabric-wide slicing & QoS guarantees: packets are classified as soon as they
  hit the first leaf. We then use a custom DSCP-based marking to enforce the
  same QoS rules on all hops. In case of congestion, flows deemed high priority
  are treated as such by all switches.

**Control Architecture and Integration with SD-Core**

SD-Fabric's P4-UPF is integrated with the ONF SD-Core project to provide a
high-performance 3GPP-compliant mobile core solution.

The integration with SD-Core is achieved via an ONOS application called UP4,
which is in charge of populating the UPF tables of the switch pipeline.

.. image:: ../images/up4-arch.png
   :width: 600px

The interface between the mobile core control plane and the UPF is defined by
the 3GPP standard Packet Forwarding Control Protocol (PFCP). This is a complex
protocol that can be difficult to understand, even though at its essence the
rules that it installs are simple match-action rules. The implementation of the
protocol (message parsing, state machines, and other bookkeeping) can be
common to many different UPF realizations. For this reason, SD-Fabric relies on
an implementation of PFCP realized as an external microservice
named “PFCP Agent”, which is provided by the SD-Core project.

The UP4 App abstracts the whole fabric as one virtual big switch with UPF
capabilities; we call this the One-Big-UPF abstraction. This abstraction allows
the upper layers to be independent of the underlying physical topology.
Communication between the PFCP Agent and the UP4 App is done via P4Runtime. This
is the same API that ONOS uses to communicate with the actual switches. However,
in the former case, it is used between two control planes: the mobile core and
the SDN controller. As a result, the deployment can be scaled up and down,
adding or removing racks and switches, without changing the mobile core control
plane, which instead is provided with the illusion of controlling just one
switch.

The One-Big-UPF abstraction is realized with a ``virtual-upf.p4``
program that formalizes the forwarding model described by PFCP as a series of
match-action tables. This program doesn't run on switches, but it's used as the
schema to define the content of the P4Runtime messages between PFCP Agent and
the UP4 App. On switches, we use a different program, ``fabric.p4``, which
implements tables similar to the virtual UPF but optimized to satisfy the
resource constraints of Tofino, as well as tables for basic bridging, IP
routing, ECMP, and more. The UP4 App implements a P4Runtime server, as if it
were a switch, but internally it takes care of translating P4Runtime
rules from ``virtual-upf.p4`` to rules for the multiple physical switches running
``fabric.p4``, based on an up-to-date global view of the topology.
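
The translation step can be pictured with a minimal Python sketch. It is not
the UP4 implementation: the rule shapes, the ``translate`` helper, and the
table names are all hypothetical, and it assumes the simplest possible mapping,
where one virtual rule becomes one equivalent rule on every leaf switch.

.. code-block:: python

   from dataclasses import dataclass


   @dataclass(frozen=True)
   class VirtualRule:
       """A rule written against the virtual UPF schema (hypothetical shape)."""
       table: str
       match: tuple
       action: str


   @dataclass(frozen=True)
   class PhysicalRule:
       """A rule targeted at one physical switch (hypothetical shape)."""
       device_id: str
       table: str
       match: tuple
       action: str


   def translate(rule, leaf_ids, table_map):
       """Expand one virtual rule into one physical rule per leaf switch."""
       physical_table = table_map[rule.table]
       return [
           PhysicalRule(device_id, physical_table, rule.match, rule.action)
           for device_id in leaf_ids
       ]


   # Made-up table names, used only to show the mapping step.
   table_map = {"virtual.sessions": "fabric.upf_sessions"}
   leaves = ["device:leaf1", "device:leaf2"]
   virtual = VirtualRule("virtual.sessions", (("teid", 100),), "set_session")
   physical = translate(virtual, leaves, table_map)

Because the list of leaves comes from the controller's view of the topology,
adding a rack only changes ``leaves``; the caller that produced ``virtual``
does not need to know.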

Downlink Buffering (DBUF)
-------------------------

A UPF is required to buffer packets when UEs are in idle mode or during
handovers. This is usually called *downlink buffering*, as it applies only to
the downlink direction of traffic. Although most switches provide buffering
capabilities to handle congestion, they cannot hold packets indefinitely. For
this reason, we provide DBUF, a microservice
responsible for providing the downlink buffering capabilities to P4-UPF.

.. image:: ../images/dbuf.png
   :width: 400px

When a UE goes idle and turns off its radio, or during handovers, the mobile
core control plane uses PFCP to update the Forwarding Action Rules (FARs) for
that UE to enter buffering mode. When this happens, UP4 updates the switch rules
to steer packets to DBUF using GTP-U tunnels.

UP4 uses gRPC to communicate with DBUF. DBUF notifies UP4 about buffering
events, which are relayed to the mobile core control plane as Downlink Data
Notifications (DDN). When a UE becomes available again, UP4 triggers a buffer
drain on DBUF and updates the switch rules to start sending traffic to the UE
again.
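
The buffer-then-drain sequence can be illustrated with a toy in-memory model.
This is a sketch only: ``BufferModel`` and its methods are hypothetical and do
not reflect the DBUF gRPC API, and it assumes that a notification is raised
once, on the first packet buffered for a UE.

.. code-block:: python

   from collections import defaultdict, deque


   class BufferModel:
       """Toy model of per-UE downlink buffering (not DBUF's real API)."""

       def __init__(self):
           self.buffering = set()            # UEs whose FAR is in buffering mode
           self.queues = defaultdict(deque)  # UE address -> buffered packets
           self.notifications = []           # stand-in for DDNs sent to the core

       def set_buffering(self, ue):
           self.buffering.add(ue)

       def on_downlink(self, ue, packet):
           """Return the packet if forwarded, or None if it was buffered."""
           if ue not in self.buffering:
               return packet
           if not self.queues[ue]:
               # Assumption: notify only on the first buffered packet.
               self.notifications.append(ue)
           self.queues[ue].append(packet)
           return None

       def drain(self, ue):
           """Leave buffering mode and release packets in arrival order."""
           self.buffering.discard(ue)
           return list(self.queues.pop(ue, ()))


   model = BufferModel()
   model.set_buffering("10.240.0.1")
   model.on_downlink("10.240.0.1", b"p1")
   model.on_downlink("10.240.0.1", b"p2")
   drained = model.drain("10.240.0.1")

After ``drain`` the UE is no longer in buffering mode, so later downlink
packets are returned by ``on_downlink`` unchanged instead of being queued.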

Deploying DBUF is optional (it can be enabled in the SD-Fabric Helm Chart).
The DBUF feature requires SR-IOV and DHCP support on NICs and Kubernetes CNIs.

ONOS Configuration
------------------

The UPF configuration is split into two parts that can be provided
independently to ONOS. The first is used to configure the UP4 ONOS application
and defines UPF-related information such as the S1U address and the network
devices implementing the UPF. The second is used to configure parameters
related to the DBUF functionality.

Here's a list of fields that you can configure via the UPF Network Configuration
for UP4:

* ``devices``: A list of device IDs that implement the UPF data plane. This
  list must include all the leaf switches in the topology. The UPF state is
  replicated on all devices specified in this configuration field. The devices
  specified in this list must use a P4 pipeline implementing the UPF
  functionality. *Required*

* ``s1uAddr``: The IP address of the S1-U interface (equivalent to N3 for 5G).
  It can be an arbitrary IP address. *Required*

* ``uePools``: A list of subnets that are in use by the UEs. *Required*

* ``dbufDrainAddr``: The IP address of the UPF data plane interface that the
  DBUF service will drain packets towards. *Optional*

* ``pscEncapEnabled``: Set whether the UPF should use GTP-U extension PDU
  Session Container when doing encapsulation of downlink packets. *Optional*

* ``defaultQfi``: The default QoS Flow Identifier to use when the PDU Session
  Container encapsulation is enabled. *Optional*
170Here is an example of netcfg JSON for UP4:
171
172.. code-block:: json
173
174 {
175 "apps": {
176 "org.omecproject.up4": {
177 "up4": {
178 "devices": [
179 "device:leaf1",
180 "device:leaf2"
181 ],
182 "s1uAddr": "10.32.11.126",
183 "uePools": [
184 "10.240.0.0/16"
185 ],
186 "dbufDrainAddr": "10.32.11.126",
187 "pscEncapEnabled": false,
188 "defaultQfi": 0
189 }
190 }
191 }
192 }
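
Before pushing such a file to ONOS, the required fields can be checked locally.
The following is a minimal sketch, not part of UP4: ``check_up4_config`` is a
hypothetical helper, and the checks only cover what the field list above
states (three required fields, an IP address, and a list of subnets).

.. code-block:: python

   import ipaddress
   import json

   REQUIRED = ("devices", "s1uAddr", "uePools")


   def check_up4_config(netcfg):
       """Return a list of problems found in the "up4" block (empty if none)."""
       up4 = netcfg["apps"]["org.omecproject.up4"]["up4"]
       problems = [
           f"missing required field: {name}"
           for name in REQUIRED
           if name not in up4
       ]
       if problems:
           return problems
       if not up4["devices"]:
           problems.append("devices must not be empty")
       try:
           ipaddress.ip_address(up4["s1uAddr"])
       except ValueError:
           problems.append("s1uAddr is not a valid IP address")
       for pool in up4["uePools"]:
           try:
               ipaddress.ip_network(pool)
           except ValueError:
               problems.append(f"uePools entry is not a valid subnet: {pool}")
       return problems


   netcfg = json.loads("""
   {
     "apps": {
       "org.omecproject.up4": {
         "up4": {
           "devices": ["device:leaf1", "device:leaf2"],
           "s1uAddr": "10.32.11.126",
           "uePools": ["10.240.0.0/16"]
         }
       }
     }
   }
   """)
   problems = check_up4_config(netcfg)

The helper returns messages instead of raising, so a deployment script can
report every problem in one pass.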

The DBUF configuration block is entirely *optional*: UP4 can be used without
the downlink buffering functionality. Here's a list of fields that you can
configure:

* ``serviceAddr``: The address of the DBUF service management interface in the
  form IP:port. This address is used to communicate with the DBUF service via
  gRPC (for example, to trigger the drain operation, or to receive notifications
  for buffered packets).

* ``dataplaneAddr``: The address of the DBUF service data plane interface in the
  form IP:port. Packets sent to this address by the UPF switches will be
  buffered by DBUF. The IP address must be a routable fabric address.

Here is an example of netcfg for DBUF:

.. code-block:: json

   {
     "apps": {
       "org.omecproject.up4": {
         "dbuf": {
           "serviceAddr": "10.76.28.72:10000",
           "dataplaneAddr": "10.32.11.3:2152"
         }
       }
     }
   }

.. note::
   When deploying DBUF using the SD-Fabric Helm Chart you do **NOT** need to
   provide the ``"dbuf"`` part of the UP4 config. That will be pushed
   automatically by the DBUF Kubernetes pod.
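
When the ``"dbuf"`` block is written by hand, the ``IP:port`` form of
``serviceAddr`` and ``dataplaneAddr`` can be checked with a few lines of
Python. This is a sketch under stated assumptions: ``parse_endpoint`` is a
hypothetical helper, and it only accepts unbracketed addresses such as the
IPv4 ones used in the example.

.. code-block:: python

   import ipaddress


   def parse_endpoint(value):
       """Split an "IP:port" string into (address, port) or raise ValueError."""
       host, sep, port_text = value.rpartition(":")
       if not sep:
           raise ValueError(f"expected IP:port, got {value!r}")
       address = ipaddress.ip_address(host)  # raises ValueError if malformed
       port = int(port_text)                 # raises ValueError if not a number
       if not 0 < port < 65536:
           raise ValueError(f"port out of range: {port}")
       return address, port


   service = parse_endpoint("10.76.28.72:10000")
   dataplane = parse_endpoint("10.32.11.3:2152")

Port 2152 in the example is the registered UDP port for GTP-U, which matches
the use of GTP-U tunnels to steer packets to DBUF.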

PFCP Agent Configuration
------------------------

PFCP Agent can be deployed as part of the SD-Fabric Helm Chart.

See the Helm Chart documentation for more information on the configuration
parameters. Once deployed, use ``kubectl get services -n sdfabric`` to find out
the exact UDP endpoint used to listen for PFCP connection requests.

UP4 Troubleshooting
-------------------

See :ref:`troubleshooting_guide`.