.. SPDX-FileCopyrightText: 2021 Open Networking Foundation <info@opennetworking.org>
.. SPDX-License-Identifier: Apache-2.0

.. _deployment_guide:

Deployment Guide
================

Switch Hardware Selection
-------------------------
We have verified, and therefore recommend, the switch models listed in :ref:`verified_switch`.
Other Stratum-enabled switches listed in :ref:`all_switch` should also work in theory,
but more integration work may be required.

To use the P4 UPF, you must use fabric switches based on the `Intel (formerly Barefoot) Tofino chipset
<https://www.intel.com/content/www/us/en/products/network-io/programmable-ethernet-switch/tofino-series.html>`_.
There are two variants of this switching chipset, with different resources and capabilities.
The **Dual Pipe** Tofino ASIC is less expensive,
while the **Quad Pipe** Tofino ASIC has more chip resources and a faster embedded system with more memory and storage.

The P4 UPF and SD-Fabric features run within the constraints of the Dual Pipe
system for production deployments, but for development of features in P4, the
larger capacity of the Quad Pipe is desirable.

These switches feature 32 QSFP+ ports capable of running in 100GbE, 40GbE, or
4x 10GbE mode (using a split DAC or fiber cable) and have a 1GbE management
network interface.

See also the :ref:`Rackmount of Equipment
<aether:edge_deployment/site_planning:rackmount of equipment>` for how the Fabric
switches should be rack-mounted to ensure proper airflow within a rack.

Deployment Overview
-------------------
SD-Fabric is released as a Helm chart and container images.
We recommend using **Kubernetes** and **Helm** to deploy SD-Fabric.
The high-level steps required to deploy SD-Fabric are:

1. **Provision switch**

   We first need to install an operating system with Docker and Kubernetes on the bare-metal switches.

2. **Prepare switches as special Kubernetes nodes**

   Kubernetes ``label`` and ``taint`` are used to configure switches as special Kubernetes worker nodes.
   This makes sure we deploy Stratum (and only Stratum) on switches.

3. **Prepare ONOS network configuration**

   Network configuration defines properties such as switch pipeconf, subnet and VLAN.

4. **Prepare Stratum chassis configuration for each switch**

   Chassis config defines switch properties such as port speed and breakout.

5. **Install SD-Fabric** using Helm

   Finally, we install SD-Fabric with the information we prepared in Steps 1 to 4.

Step 1: Provision Switches
--------------------------

We use the Open Network Install Environment (ONIE) to install an Open Network Linux (ONL) image on the switch.
To work with the SD-Fabric environment, we have customized the ONL image to support related packages and dependencies.

The image source can be found in the ONF repository `opennetworkinglab/OpenNetworkLinux <https://github.com/opennetworkinglab/OpenNetworkLinux>`_.
You can also download pre-compiled artifacts from the `GitHub Release page <https://github.com/opennetworkinglab/OpenNetworkLinux/releases>`_.


.. note::
   If you're not familiar with the ONIE/ONL environment, please check `Getting Started <https://github.com/opencomputeproject/OpenNetworkLinux/blob/master/docs/GettingStarted.md>`_ to
   see how to install the ONL image on an ONIE-supported switch.

Below is an example of how to install the ONL image.

1. Prepare a server that is reachable from the switch, download the
   pre-compiled installer from the release page, and serve it over HTTP.

   .. code-block::

      # -O (capital) sets the output file name; it must match the name in the
      # URL passed to onie-nos-install later (onie-installer).
      wget https://github.com/opennetworkinglab/OpenNetworkLinux/releases/download/v1.4.3/ONL-onf-ONLPv2_ONL-OS_2021-07-16.2159-5195444_AMD64_INSTALLED_INSTALLER -O onie-installer
      # Serve the current directory on port 80 (http.server requires Python 3).
      sudo python3 -m http.server 80

2. Reboot the switch to enter ONIE installation mode

   In order to reinstall an ONL image, you must change the ONIE bootloader to
   "Rescue Mode".

   Once the switch is powered on, it should retrieve an IP address on the OpenBMC
   interface with DHCP. Here we use ``10.0.0.131`` as an example.
   OpenBMC uses these default credentials:

   .. code-block::

      username: root
      password: 0penBmc

   Log in to OpenBMC with SSH:

   .. code-block::

      $ ssh root@10.0.0.131
      The authenticity of host '10.0.0.131 (10.0.0.131)' can't be established.
      ECDSA key fingerprint is SHA256:...
      Are you sure you want to continue connecting (yes/no)? yes
      Warning: Permanently added '10.0.0.131' (ECDSA) to the list of known hosts.
      root@10.0.0.131's password:
      root@bmc:~#

   Using the Serial-over-LAN console, enter ONL:

   .. code-block::

      root@bmc:~# /usr/local/bin/sol.sh
      You are in SOL session.
      Use ctrl-x to quit.
      -----------------------

      root@onl:~#

   .. note::

      If ``sol.sh`` is unresponsive, please try to restart the mainboard from
      the OpenBMC shell with

      .. code-block::

         root@bmc:~# wedge_power.sh reset

   Change the boot mode to rescue mode and reboot:

   .. code-block::

      root@onl:~# onl-onie-boot-mode rescue
      [1053033.768512] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null)
      [1053033.936893] EXT4-fs (sda3): re-mounted. Opts: (null)
      [1053033.996727] EXT4-fs (sda3): re-mounted. Opts: (null)
      The system will boot into ONIE rescue mode at the next restart.

      root@onl:~# reboot

   At this point, ONL will go through its shutdown sequence and ONIE will start.
   If it does not start right away, press the Enter/Return key a few times; it
   may show you a boot selection screen. Pick ``ONIE`` and ``Rescue`` if given a
   choice.

3. Install ONL with the installer

   Now that the switch is in Rescue mode, run the ``onie-nos-install`` command
   with the URL of the installer on the management server (here we use
   ``10.0.0.129`` as an example), which must be on the management network segment:

   .. code-block::

      ONIE:/ # onie-nos-install http://10.0.0.129/onie-installer
      discover: Rescue mode detected. No discover stopped.
      ONIE: Unable to find 'Serial Number' TLV in EEPROM data.
      Info: Fetching http://10.0.0.129/onie-installer ...
      Connecting to 10.0.0.129 (10.0.0.129:80)
      installer 100% |*******************************| 322M 0:00:00 ETA
      ONIE: Executing installer: http://10.0.0.129/onie-installer
      installer: computing checksum of original archive
      installer: checksum is OK
      ...

   The installation will now start. ONL then boots, ending at the login prompt:

   .. code-block::

      Open Network Linux OS ONL-wedge100bf-32qs, 2020-11-04.19:44-64100e9

      localhost login:

   The default ONL login is::

      username: root
      password: onl

   After you log in, you can verify that the switch is getting its IP address via DHCP:

   .. code-block::

      root@localhost:~# ip addr
      ...
      3: ma1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
          link/ether 00:90:fb:5c:e1:97 brd ff:ff:ff:ff:ff:ff
          inet 10.0.0.130/25 brd 10.0.0.255 scope global ma1
      ...

4. (Optional) Set up the switch IP address and hostname after the installation if DHCP is not available

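The guide does not spell out the optional step 4, so the following is only a minimal sketch of one way to do it with standard Linux tools. The function name ``set_static_mgmt`` and the ``RUN`` dry-run variable are hypothetical; the address ``10.0.0.130/25``, the interface ``ma1`` and the hostname ``leaf1`` are taken from the examples on this page. These settings do not survive a reboot, and making them persistent on ONL is outside the scope of this sketch.

```shell
#!/bin/sh
# Non-persistent sketch: assign a static management address and a hostname.
# Set RUN=echo to print the commands instead of executing them (dry run).
set_static_mgmt() {
    addr="$1"            # address with prefix length, e.g. 10.0.0.130/25
    name="$2"            # hostname to assign, e.g. leaf1
    iface="${3:-ma1}"    # management interface; ma1 in this guide's output

    ${RUN:-} ip addr add "$addr" dev "$iface"
    ${RUN:-} ip link set "$iface" up
    ${RUN:-} hostname "$name"
}

# Usage (as root on the switch):
#   set_static_mgmt 10.0.0.130/25 leaf1
```

Running it once with ``RUN=echo`` shows exactly which commands would be issued before anything on the switch is changed.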

.. warning::

   Stop and return to :ref:`Post-ONL configuration <aether:edge_deployment/fabric_switch_bootstrap:post-onl configuration>`
   and continue the remaining steps there if you came from the Aether docs.
   Otherwise, please continue with the rest of this page.


Step 2: Configure switches as special Kubernetes nodes
------------------------------------------------------

Our `ONL <https://github.com/opennetworkinglab/OpenNetworkLinux>`_ version
includes all packages required to run Kubernetes on top of it.
Once Kubernetes is ready, the `Stratum <https://opennetworking.org/stratum/>`_ application will be deployed to the switch to manage it.

Unlike servers, switches have limited CPU and memory resources, so we should avoid
deploying unnecessary workloads to them.
In addition, the Stratum application should be deployed to every switch, and only to switches.

To achieve these goals, apply the following settings to your Kubernetes cluster:

1. Set a label on every switch node, e.g. ``node-role.kubernetes.io=switch``
2. Set a taint with the ``NoSchedule`` effect on every switch node, e.g. ``node-role.kubernetes.io=switch:NoSchedule``
3. Properly configure the ``NodeSelector`` and ``Toleration`` when deploying Stratum via DaemonSet

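As a rough illustration of items 1 and 2, the label and taint could be applied with ``kubectl label`` and ``kubectl taint``. The wrapper function ``prepare_switch_node`` and the ``KUBECTL`` override are hypothetical helpers introduced here for illustration; the label and taint values are the ones given in the list above.

```shell
#!/bin/sh
# Sketch: mark a node as a switch so that only pods tolerating the taint
# (Stratum, in this guide) get scheduled on it.
# Set KUBECTL=echo to print the commands instead of running them.
ROLE_KEY="node-role.kubernetes.io"   # label/taint key from the list above

prepare_switch_node() {
    node="$1"
    # --overwrite makes both commands safe to repeat on the same node.
    ${KUBECTL:-kubectl} label nodes "$node" "${ROLE_KEY}=switch" --overwrite
    ${KUBECTL:-kubectl} taint nodes "$node" "${ROLE_KEY}=switch:NoSchedule" --overwrite
}

# Usage, assuming the switch nodes are named leaf1 and leaf2:
#   for n in leaf1 leaf2; do prepare_switch_node "$n"; done
```

The taint keeps ordinary workloads off the switches, while the label gives the Stratum DaemonSet something to select on.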

Example of a five-node Kubernetes cluster with two switches and three servers:

.. code-block::

   $ kubectl get node -o custom-columns=NAME:.metadata.name,TAINT:.spec.taints
   NAME       TAINT
   compute1   <none>
   compute2   <none>
   compute3   <none>
   leaf1      [map[effect:NoSchedule key:node-role.kubernetes.io value:switch]]
   leaf2      [map[effect:NoSchedule key:node-role.kubernetes.io value:switch]]
   $ kubectl get nodes -lnode-role.kubernetes.io=switch
   NAME    STATUS   ROLES    AGE   VERSION
   leaf1   Ready    worker   27d   v1.18.8
   leaf2   Ready    worker   27d   v1.18.8

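For the ``NodeSelector`` and ``Toleration`` mentioned in item 3, the relevant part of a DaemonSet pod spec might look like the fragment printed below. This is a sketch using standard Kubernetes pod spec fields that match the label and taint shown above; it is not taken from the Stratum chart, and the function name ``stratum_scheduling_fragment`` is hypothetical. How these fields are exposed when deploying through the SD-Fabric Helm chart is not covered here.

```shell
#!/bin/sh
# Print the scheduling-related fields that would sit under
# spec.template.spec of a DaemonSet meant to run only on switch nodes.
stratum_scheduling_fragment() {
    cat <<'EOF'
nodeSelector:
  node-role.kubernetes.io: switch      # matches the switch label
tolerations:
  - key: node-role.kubernetes.io       # matches the switch taint
    operator: Equal
    value: switch
    effect: NoSchedule
EOF
}

# Usage: print the fragment for review or for pasting into a manifest.
#   stratum_scheduling_fragment
```

The node selector restricts the pods to switch nodes, and the toleration lets them pass the ``NoSchedule`` taint that keeps every other workload away.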

Step 3: Prepare ONOS network configuration
------------------------------------------
See :ref:`onos_network_config` for instructions.

Step 4: Prepare Stratum chassis configuration
---------------------------------------------
See :ref:`stratum_chassis_config` for instructions.

.. _install_sd_fabric:

Step 5: Install SD-Fabric with Helm
-----------------------------------

To install SD-Fabric into your Kubernetes cluster, follow the instructions
described in the `SD-Fabric Helm Chart README <https://gerrit.opencord.org/plugins/gitiles/sdfabric-helm-charts/+/HEAD/sdfabric/README.md>`_.