.. SPDX-FileCopyrightText: 2021 Open Networking Foundation <info@opennetworking.org>
.. SPDX-License-Identifier: Apache-2.0

Specification
=============

This section provides an exhaustive list of the features SD-Fabric supports.

SDN Features
------------
- ONOS cluster of N all-active instances providing N-way redundancy and scale, where N = 3 or N = 5
- Unified operations interface (GUI/REST/CLI)
- Centralized configuration: all configuration is done on the controller instead of on each individual switch
- Centralized role-based access control (RBAC)
- Automatic host (end-point) discovery: attached hosts, access devices, appliances (PNFs), routers, etc.,
  based on ARP, DHCP, NDP, etc.
- Automatic switch, link, and topology discovery and maintenance (keepalives, failure recovery)

L2 Features
-----------
Various L2 connectivity and tunneling support

- VLAN-based bridging

  - Access, Trunk and Native VLAN support

- VLAN cross connect

  - Forward traffic based on outer VLAN ID
  - Forward traffic based on outer and inner VLAN ID (QinQ)

- Pseudowire

  - L2 tunneling across the L3 fabric
  - Support tunneling based on double-tagged and single-tagged traffic
  - Support VLAN translation of outer tag

L3 Features
-----------
IP connectivity

- IPv4 and IPv6 [#f1]_ unicast routing (internal use of MPLS Segment Routing)
- Subnet configuration on all non-spine-facing leaf ports; no configuration required on any spine port
- Equal-Cost Multi-Path (ECMP) for traffic across spine switches
- IPv6 router advertisement
- ARP, NDP, IGMP handling
- Number of flows in spines greatly reduced by MPLS Segment Routing
- Further reduction of per-leaf flows with route optimization logic

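The ECMP item above can be illustrated with a minimal sketch of hash-based path selection.
It assumes the common approach of hashing a flow's 5-tuple and taking the result modulo the
number of available spines; the function name, spine names, and hash choice are hypothetical
and do not describe the hash actually used by the switch hardware or by SD-Fabric.

.. code-block:: python

   import hashlib

   def ecmp_next_hop(five_tuple, spines):
       """Pick one spine for a flow by hashing its 5-tuple (illustrative only)."""
       # Serialize (src IP, dst IP, protocol, src port, dst port) into bytes.
       key = "|".join(str(field) for field in five_tuple).encode()
       digest = hashlib.sha256(key).digest()
       # Map the first 4 bytes of the digest onto the list of spines.
       index = int.from_bytes(digest[:4], "big") % len(spines)
       return spines[index]

   # Hypothetical spine names and flow, used only for this example.
   spines = ["spine1", "spine2"]
   flow = ("10.0.1.5", "10.0.2.7", 6, 49152, 443)
   chosen = ecmp_next_hop(flow, spines)
   # Packets of the same flow always hash to the same spine, which avoids reordering.
   assert chosen == ecmp_next_hop(flow, spines)
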
DHCP Relay
----------
DHCP L3 relay

- DHCPv4 and DHCPv6
- DHCP server either directly attached to fabric leaves, or indirectly connected via upstream router
- DHCP client either directly attached to fabric leaves, or indirectly connected via a Lightweight DHCPv6 Relay Agent (LDRA)
- Multiple DHCP servers for HA

vRouter
-------
vRouter presents the entire SD-Fabric as a single router (or dual routers for HA),
with disaggregated control and data planes.

- Uses open-source protocol implementations like Quagga (or FRR)
- BGPv4 and BGPv6
- Static routes
- Route blackholing
- ACLs based on port, L2, L3 and L4 headers

Multicast
---------
Centralized multicast tree computation, programming and management

- Support for both IPv4 and IPv6 multicast
- Dual-homed multicast sinks for HA
- Multiple multicast sources for HA

API
---
- Provides easy access for 3rd party edge application developers and for the Aether centralized management platform
- Support for traffic redirection, dropping, network slicing and QoS

Data Plane Programmability
--------------------------
- Support for Stratum, P4Runtime/gNMI, and P4 programs
- Open source fabric-tna P4 program that can be modified for additional features

4G & 5G
-------
- Two User Plane Function (UPF) implementations:

  - Switch-based with fast path realized on Tofino with P4 (P4-UPF)
  - CPU-based with fast path realized with the Berkeley Extensible Software Switch framework (BESS-UPF)

- Integration with mobile core control plane via PFCP (3GPP standard interface)

- Supported features:

  - GTP encap/decap, including support for 5G QFI extension header
  - Usage reporting rules (URR)
  - Downlink buffering and data notifications
  - Application filtering (via SDF filters)
  - Per-application, per-session, per-slice rate limiting (via QER)
  - Per-flow QoS metrics (BESS-UPF only)

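As a rough illustration of the GTP encap/decap feature listed above, the following sketch
builds and parses the mandatory 8-byte GTP-U header for a G-PDU. It is a simplification
under stated assumptions: the outer IP/UDP headers are omitted, the 5G QFI extension header
is not handled, and the function names and sample values are hypothetical rather than taken
from P4-UPF or BESS-UPF.

.. code-block:: python

   import struct

   GTPU_FLAGS = 0x30     # version 1, protocol type GTP, no optional fields
   GTPU_MSG_GPDU = 0xFF  # G-PDU: the payload is an encapsulated user packet

   def gtpu_encap(teid, inner_packet):
       """Prepend a minimal GTP-U header (no extension headers) to a user packet."""
       # Fields: flags (1 byte), message type (1), payload length (2), TEID (4).
       header = struct.pack("!BBHI", GTPU_FLAGS, GTPU_MSG_GPDU, len(inner_packet), teid)
       return header + inner_packet

   def gtpu_decap(packet):
       """Return (teid, inner_packet) for a plain G-PDU; reject anything else."""
       flags, msg_type, length, teid = struct.unpack("!BBHI", packet[:8])
       if flags != GTPU_FLAGS or msg_type != GTPU_MSG_GPDU:
           raise ValueError("only plain G-PDU headers are handled in this sketch")
       return teid, packet[8:8 + length]

   # Hypothetical tunnel ID and payload, used only for this example.
   tunneled = gtpu_encap(0x1234, b"user-ip-packet")
   teid, inner = gtpu_decap(tunneled)
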
Visibility
----------
- In-band Network Telemetry (INT):

  - INT-XD mode with support for flow reports, drop reports, and
    queue congestion reports
  - Smart triggers/filters to reduce the volume of reports ingested by the INT collector

Troubleshooting & Diagnostics
-----------------------------
- T3: Troubleshooting tool to diagnose broken forwarding paths fabric-wide (work in progress)
- ONOS-diags: One-click diagnostics collection tool for issue reporting

.. _Topology:

Topology
--------
SD-Fabric can start at the smallest scale (single leaf) and grow horizontally.

.. image:: images/topology-scale.png
   :width: 900px


Single Leaf (ToR)
^^^^^^^^^^^^^^^^^
This is the minimum SD-Fabric setup. In this setup, all servers are connected to a single switch.

.. image:: images/topology-single.png
   :width: 160px

Single Leaf Pair (Dual-Homing)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Compared to a single switch, a leaf pair adds redundancy against server NIC failures and link failures.

.. image:: images/topology-pair.png
   :width: 225px

Leaf-Spine (without pairing)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Provides horizontal scalability for multi-rack deployments, with redundancy against spine switch failures.

.. image:: images/topology-2x2.png
   :width: 300px

Leaf-Spine (with pairing)
^^^^^^^^^^^^^^^^^^^^^^^^^
This topology supports all the redundancy and scalability features mentioned above.

.. image:: images/topology-2x4.png
   :width: 450px

Multi-Stage Leaf-Spine
^^^^^^^^^^^^^^^^^^^^^^
Multi-stage is specifically designed for telco service providers.
The first stage can be installed in the central office, while the second stage
can be installed in a field office that is closer to the subscribers.
The two stages are typically connected via long-distance optical transport.

.. image:: images/topology-full.png
   :width: 700px

Resiliency
----------
Provides HA in the following scenarios:

- Controller instance failure (requires a 3- or 5-node ONOS cluster)
- Leaf-spine link failures
- Spine switch failure

Provides further HA in the following failure scenarios when dual-homing is enabled:

- Leaf switch failure
- Upstream router failure
- Host NIC failure

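The 3- or 5-node cluster requirement above can be made concrete with a small arithmetic
sketch. It assumes the cluster stays available only while a strict majority of instances is
alive, which is typical of consensus-based clusters; this specification does not state the
mechanism, so treat the model and the function names as illustrative assumptions.

.. code-block:: python

   def majority_quorum(cluster_size):
       """Smallest number of instances that forms a strict majority."""
       return cluster_size // 2 + 1

   def tolerated_failures(cluster_size):
       """Instances that can fail while a majority is still alive (assumed model)."""
       return cluster_size - majority_quorum(cluster_size)

   # Cluster sizes named in this specification.
   for n in (3, 5):
       print(f"N={n}: quorum={majority_quorum(n)}, tolerated failures={tolerated_failures(n)}")
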
Scalability
-----------
In Production
  - Up to 80k routes (with route optimization)
  - 170k flows
  - 600 direct-attached hosts
  - 8 leaf switches
  - 2 spine switches

In Pre-Production
  - Up to 120k routes (with route optimization)
  - 250k flows
  - 600 direct-attached hosts
  - 8 leaf switches
  - 2 spine switches

4G/5G specific
  - 5000 active UEs, 10 calls per second

Security
--------
- TLS-secured gRPC connection between controllers and switches (work in progress)

Aether-ready
------------
Fully integrated with Aether (5G/4G private enterprise edge cloud solution),
including deployment automation, CI/CD, logging, monitoring, and alerting.

Overlay Support
---------------
Can be used or integrated with 3rd party overlay networks (e.g., OpenStack Neutron, Kubernetes CNI).

Orchestrator Support
--------------------
Can be integrated with an external orchestrator, optionally running from the public cloud.
Supports logging, telemetry, monitoring and alarm services via
REST APIs and Elastic/Fluentbit/Kibana, Prometheus/Grafana.

Controller Server Specs
-----------------------
Recommendation (per ONOS instance) based on 50K routes:
  - CPU: 32 cores
  - RAM: 128 GB, with 64 GB dedicated to the ONOS JVM heap

Recommendation (per ONOS instance) for 5K UEs when enabling UPF:
  - CPU: 1 core
  - RAM: 4 GB

.. _all_switch:

White Box Switch Hardware
-------------------------
- Multi-vendor: APS Networks™, Dell™, Delta Networks™, Edgecore Networks™, Inventec™, Netburg™, QCT™
- Multi-chipset:

  - Intel Tofino (supports all features, including UPF & INT)
  - Broadcom Tomahawk®, Tomahawk+®, Trident2 (traditional fabric features only)

- 1/10G, 25G, 40G, and 100G ports
- Refer to the Supported Devices list in https://github.com/stratum/stratum for the most up-to-date hardware list

.. _verified_switch:

Aether-verified Switch Hardware
-------------------------------
- `EdgeCore DCS800 <https://www.edge-core.com/productsInfo.php?cls=1&cls2=180&cls3=181&id=335>`_
  with Dual Pipe Tofino ASIC (formerly Wedge100BF-32X)

- `EdgeCore DCS801 <https://www.edge-core.com/productsInfo.php?cls=1&cls2=180&cls3=181&id=770>`_
  with Quad Pipe Tofino ASIC (formerly Wedge100BF-32QS)

White Box Switch Software
-------------------------
- Open source ONL, ONIE, Docker, Kubernetes
- Stratum available from ONF

.. rubric:: Footnotes

.. [#f1] IPv6 support on the data plane (P4 program) is still a work in progress.