.. SPDX-FileCopyrightText: 2021 Open Networking Foundation <info@opennetworking.org>
.. SPDX-License-Identifier: Apache-2.0

Specification
=============

In the following we provide an exhaustive list of all features supported.

SDN Features
------------
 - ONOS cluster of all-active N instances affording N-way redundancy and scale, where N = 3 or N = 5
 - Unified operations interface (GUI/REST/CLI)
 - Centralized configuration: all configuration is done on the controller instead of each individual switch
 - Centralized role-based access control (RBAC)
 - Automatic host (end-point) discovery of attached hosts, access devices, appliances (PNFs), routers, etc.,
   based on ARP, DHCP, NDP, etc.
 - Automatic switch, link and topology discovery and maintenance (keepalives, failure recovery)

L2 Features
-----------
Various L2 connectivity and tunneling support

 - VLAN-based bridging

   - Access, Trunk and Native VLAN support

 - VLAN cross connect

   - Forward traffic based on outer VLAN id
   - Forward traffic based on outer and inner VLAN id (QinQ)

 - Pseudowire

   - L2 tunneling across the L3 fabric
   - Support tunneling based on double tagged and single tagged traffic
   - Support VLAN translation of outer tag

L3 Features
-----------
IP connectivity

 - IPv4 and IPv6 [#f1]_ unicast routing (internal use of MPLS Segment Routing)
 - Subnetting configuration on all non-spine facing leaf ports; no configuration required on any spine port
 - Equal Cost Multi-Path (ECMP) for traffic across spine switches
 - IPv6 router advertisement
 - ARP, NDP, IGMP handling
 - Number of flows in spines greatly simplified by MPLS Segment Routing
 - Further reduction of per-leaf flows with route optimization logic

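As a rough illustration of the ECMP item above, the sketch below picks one spine per flow by hashing the flow's 5-tuple.
It is a simplified model only: in a real fabric the hashing is performed by the switch data plane, and the function name,
spine names and choice of hash are assumptions made for this example, not SD-Fabric's actual implementation.

.. code-block:: python

   import hashlib

   def pick_spine(flow, spines):
       """Pick one spine for a flow by hashing its 5-tuple (hypothetical helper)."""
       # Serialize the 5-tuple into a stable byte string.
       key = "|".join(str(field) for field in flow).encode()
       digest = hashlib.sha256(key).digest()
       # Use the first 4 bytes of the digest as an index into the spine list.
       index = int.from_bytes(digest[:4], "big") % len(spines)
       return spines[index]

   # Hypothetical spine names and flow (src IP, dst IP, protocol, src port, dst port).
   spines = ["spine1", "spine2"]
   flow = ("10.0.1.5", "10.0.2.9", 6, 40000, 443)

   # Packets of the same flow always hash to the same spine, which avoids reordering.
   chosen = pick_spine(flow, spines)
   print(chosen)
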
DHCP Relay
----------
DHCP L3 relay

 - DHCPv4 and DHCPv6
 - DHCP server either directly attached to fabric leaves, or indirectly connected via upstream router
 - DHCP client either directly attached to fabric leaves, or indirectly connected via LDRA
 - Multiple DHCP servers for HA

vRouter
-------
vRouter presents the entire SD-Fabric as a single router (or dual-routers for HA),
with disaggregated control/data plane

 - Uses open-source protocol implementations like Quagga (or FRR)
 - BGPv4 and BGPv6
 - Static routes
 - Route blackholing
 - ACLs based on port, L2, L3 and L4 headers

Multicast
---------
Centralized multicast tree computation, programming and management

 - Support both IPv4 and IPv6 multicast
 - Dual-homed multicast sinks for HA
 - Multiple multicast sources for HA

API
---
- Provide easy access for 3rd party edge application developers and for the Aether centralized management platform
- Support for traffic redirecting, dropping, network slicing and QoS

Data Plane Programmability
--------------------------
- Support for Stratum, P4Runtime/gNMI, and P4 programs
- Open source fabric-tna P4 program that can be modified for additional features

4G & 5G
-------
- Two User Plane Function (UPF) implementations:

  - Switch-based with fast path realized on Tofino with P4 (P4-UPF)
  - CPU-based with fast path realized with Berkeley Extensible Software Switch framework (BESS-UPF)

- Integration with mobile core control plane via PFCP protocol (3GPP standard interface)

- Supported features:

  - GTP encap/decap, including support for 5G QFI extension header
  - Usage reporting rules (URR)
  - Downlink buffering and data notifications
  - Application filtering (via SDF filters)
  - Per-application, per-session, per-slice rate limiting (via QER)
  - Per-flow QoS metric (BESS-UPF only)

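To make the GTP encap/decap item above more concrete, here is a minimal sketch of the basic 8-byte GTP-U header
(version 1, message type G-PDU) wrapped around a user packet. It deliberately omits the outer IP/UDP headers
(GTP-U uses UDP port 2152) and the 5G QFI extension header, and the helper names are hypothetical. The UPFs
perform the equivalent work in P4 or BESS, not in Python.

.. code-block:: python

   import struct

   GTPU_FLAGS = 0x30     # version 1, protocol type GTP, no optional fields
   GTPU_MSG_GPDU = 0xFF  # G-PDU: the message type that carries a user packet

   def gtpu_encap(teid, inner_packet):
       """Prepend a basic GTP-U header (hypothetical helper, no extension headers)."""
       # Fields: flags (1 byte), message type (1), payload length (2), tunnel ID (4).
       header = struct.pack("!BBHI", GTPU_FLAGS, GTPU_MSG_GPDU, len(inner_packet), teid)
       return header + inner_packet

   def gtpu_decap(packet):
       """Strip a basic GTP-U header and return (teid, inner_packet)."""
       flags, msg_type, length, teid = struct.unpack("!BBHI", packet[:8])
       if flags != GTPU_FLAGS or msg_type != GTPU_MSG_GPDU:
           raise ValueError("not a plain G-PDU without optional fields")
       return teid, packet[8:8 + length]

   # Round trip with a placeholder payload standing in for the user's IP packet.
   tunnelled = gtpu_encap(0x1234, b"user-ip-packet")
   print(gtpu_decap(tunnelled))
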
Visibility
----------
 - In-band Network Telemetry (INT):

   - INT-XD mode with support for flow reports, drop reports,
     and queue congestion reports
   - Smart triggers/filters to reduce volume of reports ingested by the INT collector

Troubleshooting & Diagnostics
-----------------------------
- T3: Troubleshooting tool to diagnose broken forwarding paths fabric-wide (work in progress)
- ONOS-diags: One-click diagnostics collection tool for issue reporting

.. _Topology:

Topology
--------
SD-Fabric can start at the smallest scale (single leaf) and grow horizontally.

.. image:: images/topology-scale.png
   :width: 900px


Single Leaf (ToR)
^^^^^^^^^^^^^^^^^
This is the minimum SD-Fabric setup. In this setup, all servers are connected to a single switch.

.. image:: images/topology-single.png
   :width: 160px

Single Leaf Pair (Dual-Homing)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Compared to a single switch, a leaf pair provides redundancy against server NIC failure and link failure.

.. image:: images/topology-pair.png
   :width: 225px

Leaf-Spine (without pairing)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Provides horizontal scalability for multi-rack deployments, with redundancy against spine switch failures.

.. image:: images/topology-2x2.png
   :width: 300px

Leaf-Spine (with pairing)
^^^^^^^^^^^^^^^^^^^^^^^^^
It supports all the redundancy and scalability features mentioned above.

.. image:: images/topology-2x4.png
   :width: 450px

Multi-Stage Leaf-Spine
^^^^^^^^^^^^^^^^^^^^^^
Multi-stage is specifically designed for telco service providers.
The first stage can be installed in the central office, while the second stage
can be installed in a field office that is closer to the subscribers.
The two stages are typically connected via long-distance optical transport.

.. image:: images/topology-full.png
   :width: 700px

Resiliency
----------
Provides HA in the following scenarios:

 - Controller instance failure (requires a 3- or 5-node ONOS cluster)
 - Leaf-spine link failures
 - Spine switch failure

Further HA support in the following failure scenarios with dual-homing enabled:

 - Leaf switch failure
 - Upstream router failure
 - Host NIC failure

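The failure scenarios above can be explored with a small reachability check. The sketch below models a hypothetical
2x4 fabric (two spines, two leaf pairs, dual-homed hosts) as an undirected graph and tests whether two hosts can
still reach each other after some nodes fail. Node names and the helper function are assumptions for illustration;
this does not model how ONOS actually recomputes paths.

.. code-block:: python

   from collections import deque

   def reachable(links, src, dst, failed=frozenset()):
       """Breadth-first search over an undirected topology, skipping failed nodes."""
       if src in failed or dst in failed:
           return False
       # Build an adjacency map from links whose endpoints are both alive.
       adjacency = {}
       for a, b in links:
           if a in failed or b in failed:
               continue
           adjacency.setdefault(a, set()).add(b)
           adjacency.setdefault(b, set()).add(a)
       seen, queue = {src}, deque([src])
       while queue:
           node = queue.popleft()
           if node == dst:
               return True
           for neighbor in adjacency.get(node, ()):
               if neighbor not in seen:
                   seen.add(neighbor)
                   queue.append(neighbor)
       return False

   # Hypothetical topology: every leaf connects to every spine, hosts are dual-homed.
   links = [(leaf, spine)
            for leaf in ("leaf1", "leaf2", "leaf3", "leaf4")
            for spine in ("spine1", "spine2")]
   links += [("hostA", "leaf1"), ("hostA", "leaf2"),
             ("hostB", "leaf3"), ("hostB", "leaf4")]

   print(reachable(links, "hostA", "hostB", failed={"leaf1"}))           # one leaf of the pair down
   print(reachable(links, "hostA", "hostB", failed={"leaf1", "leaf2"}))  # both leaves of the pair down
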
Scalability
-----------
In Production
 - Up to 80k routes (with route optimization)
 - 170k flows
 - 600 direct-attached hosts
 - 8 leaf switches
 - 2 spine switches

In Pre-Production
 - Up to 120k routes (with route optimization)
 - 250k flows
 - 600 direct-attached hosts
 - 8 leaf switches
 - 2 spine switches

4G/5G specific
 - 5000 active UEs, 10 calls per second

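When planning a deployment, the figures above can be used as a quick sanity check. The following sketch compares a
hypothetical plan against the production and pre-production numbers listed in this section. The dictionary layout,
key names and helper function are assumptions made for this example.

.. code-block:: python

   # Figures copied from the lists above; the key names are hypothetical.
   VERIFIED_LIMITS = {
       "production": {"routes": 80_000, "flows": 170_000, "hosts": 600,
                      "leaves": 8, "spines": 2},
       "pre-production": {"routes": 120_000, "flows": 250_000, "hosts": 600,
                          "leaves": 8, "spines": 2},
   }

   def exceeded_limits(plan, profile="production"):
       """Return {name: (planned, limit)} for every planned value above the limit."""
       limits = VERIFIED_LIMITS[profile]
       return {name: (value, limits[name])
               for name, value in plan.items()
               if name in limits and value > limits[name]}

   # Hypothetical deployment plan.
   plan = {"routes": 100_000, "flows": 150_000, "hosts": 400, "leaves": 6, "spines": 2}

   print(exceeded_limits(plan))                    # compared with the production figures
   print(exceeded_limits(plan, "pre-production"))  # compared with the pre-production figures
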
Security
--------
 - TLS-secured gRPC connection between controllers and switches (work-in-progress)

Aether-ready
------------
Fully integrated with Aether (5G/4G private enterprise edge cloud solution)
including deployment automation, CI/CD, logging, monitoring, and alerting.

Overlay Support
---------------
Can be used/integrated with 3rd party overlay networks (e.g., OpenStack Neutron, Kubernetes CNI).

Orchestrator Support
--------------------
Can be integrated with an external orchestrator, optionally running from the public cloud.
Supports logging, telemetry, monitoring and alarm services via
REST APIs and Elastic/Fluentbit/Kibana, Prometheus/Grafana.

Controller Server Specs
-----------------------
Recommendation (per ONOS instance) based on 50K routes:
 - CPU: 32 Cores
 - RAM: 128GB RAM, with 64GB dedicated to ONOS JVM heap

Recommendation (per ONOS instance) for 5K UEs when enabling UPF:
 - CPU: 1 Core
 - RAM: 4GB RAM

.. _all_switch:

White Box Switch Hardware
-------------------------
- Multi-vendor: APS Networks™, Dell™, Delta Networks™, Edgecore Networks™, Inventec™, Netburg™, QCT
- Multi-chipset:

  - Intel Tofino (supports all features, including UPF & INT)
  - Broadcom Tomahawk®, Tomahawk+®, Trident2 (traditional fabric features only)

- 1/10G, 25G, 40G, and 100G ports
- Refer to the Supported Devices list at https://github.com/stratum/stratum for the most up-to-date hardware list

.. _verified_switch:

Aether-verified Switch Hardware
-------------------------------
 - `EdgeCore DCS800 <https://www.edge-core.com/productsInfo.php?cls=1&cls2=180&cls3=181&id=335>`_
   with Dual Pipe Tofino ASIC (formerly Wedge100BF-32X)

 - `EdgeCore DCS801 <https://www.edge-core.com/productsInfo.php?cls=1&cls2=180&cls3=181&id=770>`_
   with Quad Pipe Tofino ASIC (formerly Wedge100BF-32QS)

White Box Switch Software
-------------------------
- Open source ONL, ONIE, Docker, Kubernetes
- Stratum available from ONF

.. rubric:: Footnotes

.. [#f1] IPv6 support on the data plane (P4 program) is still work-in-progress.