.. SPDX-FileCopyrightText: 2021 Open Networking Foundation <info@opennetworking.org>
.. SPDX-License-Identifier: Apache-2.0
.. _deployment_guide:
Deployment Guide
================
Switch Hardware Selection
-------------------------
We have verified and therefore recommend using the switch model listed in :ref:`verified_switch`.
Other Stratum-enabled switches listed in :ref:`all_switch` should also work in theory
but more integration work may be required.
To use the P4 UPF, you must use fabric switches based on the `Intel (formerly Barefoot) Tofino chipset
<https://www.intel.com/content/www/us/en/products/network-io/programmable-ethernet-switch/tofino-series.html>`_.
There are two variants of this switching chipset, with different resources and capabilities.
The **Dual Pipe** Tofino ASIC is less expensive,
while the **Quad Pipe** Tofino ASIC has more chip resources and a faster embedded system with more memory and storage.
The P4 UPF and SD-Fabric features run within the constraints of the Dual Pipe
system for production deployments, but for development of features in P4, the
larger capacity of the Quad Pipe is desirable.
These switches feature 32 QSFP28 ports capable of running in 100GbE, 40GbE, or
4x 10GbE mode (using a split DAC or fiber cable) and have a 1GbE management
network interface.
See also the :ref:`Rackmount of Equipment
<aether:edge_deployment/site_planning:rackmount of equipment>` for how the Fabric
switches should be rack-mounted to ensure proper airflow within a rack.
Deployment Overview
-------------------
SD-Fabric is released as Helm charts and container images.
We recommend using **Kubernetes** and **Helm** to deploy SD-Fabric.
Here's a list of high level steps required to deploy SD-Fabric:
1. **Provision switch**
We first need to install an operating system with Docker and Kubernetes on the bare-metal switches.
2. **Prepare switches as special Kubernetes nodes**
Kubernetes ``label`` and ``taint`` are used to configure switches as special Kubernetes worker nodes.
This is to make sure we deploy Stratum (and only Stratum) on switches.
3. **Prepare ONOS network configuration**
Network configuration defines properties such as switch pipeconf, subnet and VLAN.
4. **Prepare Stratum chassis configuration for each switch**
Chassis config defines switch properties such as port speed and breakout.
5. **Install SD-Fabric** using Helm
Finally, we install SD-Fabric with the information prepared in Steps 1 to 4.
Step 1: Provision Switches
--------------------------
We use the Open Network Install Environment (ONIE) to install an Open Network Linux (ONL) image on the switch.
To work with the SD-Fabric environment, we have customized the ONL image to include the required packages and dependencies.
The image source can be found in the ONF repository `opennetworkinglab/OpenNetworkLinux <https://github.com/opennetworkinglab/OpenNetworkLinux>`_.
You can also download pre-compiled artifacts from the `GitHub Release page <https://github.com/opennetworkinglab/OpenNetworkLinux/releases>`_.
.. note::
If you're not familiar with the ONIE/ONL environment, please check `Getting Started <https://github.com/opencomputeproject/OpenNetworkLinux/blob/master/docs/GettingStarted.md>`_ to
see how to install the ONL image on an ONIE-supported switch.
Below is an example of how to install the ONL image.
1. Prepare a server which is accessible by the switch and then download the
pre-compiled installer from the release page.
.. code-block::
wget https://github.com/opennetworkinglab/OpenNetworkLinux/releases/download/v1.4.3/ONL-onf-ONLPv2_ONL-OS_2021-07-16.2159-5195444_AMD64_INSTALLED_INSTALLER -O onie-installer
sudo python3 -m http.server 80
2. Reboot the switch to enter ONIE installation mode
In order to reinstall an ONL image, you must change the ONIE bootloader to
"Rescue Mode".
Once the switch is powered on, it should retrieve an IP address on the OpenBMC
interface with DHCP. Here we use ``10.0.0.131`` as an example.
OpenBMC uses these default credentials:
.. code-block::
username: root
password: 0penBmc
Log in to OpenBMC with SSH:
.. code-block::
$ ssh root@10.0.0.131
The authenticity of host '10.0.0.131 (10.0.0.131)' can't be established.
ECDSA key fingerprint is SHA256:...
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.0.0.131' (ECDSA) to the list of known hosts.
root@10.0.0.131's password:
root@bmc:~#
Using the Serial-over-LAN console, enter ONL:
.. code-block::
root@bmc:~# /usr/local/bin/sol.sh
You are in SOL session.
Use ctrl-x to quit.
-----------------------
root@onl:~#
.. note::
If ``sol.sh`` is unresponsive, please try to restart the mainboard with:
.. code-block::
root@bmc:~# wedge_power.sh reset
Change the boot mode to rescue mode and reboot:
.. code-block::
root@onl:~# onl-onie-boot-mode rescue
[1053033.768512] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null)
[1053033.936893] EXT4-fs (sda3): re-mounted. Opts: (null)
[1053033.996727] EXT4-fs (sda3): re-mounted. Opts: (null)
The system will boot into ONIE rescue mode at the next restart.
root@onl:~# reboot
At this point, ONL will go through its shutdown sequence and ONIE will start.
If it does not start right away, press the Enter/Return key a few times - it
may show you a boot selection screen. Pick ``ONIE`` and ``Rescue`` if given a
choice.
3. Run the ONL installer
Now that the switch is in rescue mode, run the ``onie-nos-install`` command with the URL of the
installer on the management server (here we use ``10.0.0.129`` as an example) on the management network segment:
.. code-block::
ONIE:/ # onie-nos-install http://10.0.0.129/onie-installer
discover: Rescue mode detected. No discover stopped.
ONIE: Unable to find 'Serial Number' TLV in EEPROM data.
Info: Fetching http://10.0.0.129/onie-installer ...
Connecting to 10.0.0.129 (10.0.0.129:80)
installer 100% |*******************************| 322M 0:00:00 ETA
ONIE: Executing installer: http://10.0.0.129/onie-installer
installer: computing checksum of original archive
installer: checksum is OK
...
The installation will now start. When it completes, ONL boots and shows a login prompt:
.. code-block::
Open Network Linux OS ONL-wedge100bf-32qs, 2020-11-04.19:44-64100e9
localhost login:
The default ONL login is::
username: root
password: onl
After logging in, you can verify that the switch is getting its IP address via DHCP:
.. code-block::
root@localhost:~# ip addr
...
3: ma1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:90:fb:5c:e1:97 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.130/25 brd 10.0.0.255 scope global ma1
...
4. (Optional) Set up the switch IP address and hostname after the installation if DHCP is not available
.. warning::
Stop and return to :ref:`Post-ONL configuration <aether:edge_deployment/fabric_switch_bootstrap:post-onl configuration>`
and continue the remaining steps there if you came from the Aether docs.
Otherwise, continue with the rest of this page.
Step 2: Configure switches as special Kubernetes nodes
------------------------------------------------------
Our `ONL <https://github.com/opennetworkinglab/OpenNetworkLinux>`_ version
includes all packages required to run Kubernetes on top of it.
Once Kubernetes is ready, the `Stratum <https://opennetworking.org/stratum/>`_ application is deployed to the switch to manage it.
Unlike a server, a switch has limited CPU and memory resources, so we should avoid
deploying unnecessary workloads onto it.
In addition, Stratum should be deployed to every switch, and only to switches.
To achieve these goals, apply the following configuration to your Kubernetes cluster.
1. Add a label to every switch node, e.g. ``node-role.kubernetes.io=switch``
2. Add a taint with the ``NoSchedule`` effect to every switch node, e.g. ``node-role.kubernetes.io=switch:NoSchedule``
3. Configure a matching ``nodeSelector`` and ``tolerations`` when deploying Stratum via DaemonSet
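The first two items can be sketched with ``kubectl label`` and ``kubectl taint``. The snippet below is only an illustration, not part of the SD-Fabric tooling: the helper name ``mark_switch_node`` and the ``KUBECTL`` variable are made up for this example, and the node names ``leaf1`` and ``leaf2`` are placeholders. By default it only prints the commands (dry run); set ``KUBECTL=kubectl`` to actually apply them.

```shell
#!/bin/sh
# Dry run by default: print the kubectl commands instead of executing them.
# Set KUBECTL=kubectl in the environment to apply them to the cluster.
KUBECTL="${KUBECTL:-echo kubectl}"

# mark_switch_node NODE
# Hypothetical helper: labels NODE as a switch and taints it with NoSchedule
# so that only pods tolerating the taint can be scheduled there.
mark_switch_node() {
    node="$1"
    $KUBECTL label nodes "$node" node-role.kubernetes.io=switch --overwrite
    $KUBECTL taint nodes "$node" node-role.kubernetes.io=switch:NoSchedule --overwrite
}

# Placeholder node names; replace with the switch nodes in your cluster.
for node in leaf1 leaf2; do
    mark_switch_node "$node"
done
```

Printing the commands first makes it easy to review them before touching the cluster, which is useful because a ``NoSchedule`` taint immediately stops new workloads without a matching toleration from landing on the node.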
Example of a five-node Kubernetes cluster with two switches and three servers:
.. code-block::
╰─$ kubectl get node -o custom-columns=NAME:.metadata.name,TAINT:.spec.taints
NAME TAINT
compute1 <none>
compute2 <none>
compute3 <none>
leaf1 [map[effect:NoSchedule key:node-role.kubernetes.io value:switch]]
leaf2 [map[effect:NoSchedule key:node-role.kubernetes.io value:switch]]
╰─$ kubectl get nodes -lnode-role.kubernetes.io=switch
NAME STATUS ROLES AGE VERSION
leaf1 Ready worker 27d v1.18.8
leaf2 Ready worker 27d v1.18.8
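For a workload to land on the tainted switch nodes shown above, its pod template needs a node selector matching the label and a toleration matching the taint. The following sketch writes such a fragment to a file for reference. The file name ``switch-scheduling.yaml`` is hypothetical, and the fragment is a generic Kubernetes pod spec excerpt rather than the actual values used by the SD-Fabric Helm chart, which may expose these settings differently.

```shell
#!/bin/sh
# Write a pod-spec fragment (hypothetical file name) that:
#  - selects only nodes labeled node-role.kubernetes.io=switch
#  - tolerates the node-role.kubernetes.io=switch:NoSchedule taint
cat > switch-scheduling.yaml <<'EOF'
# Fragment of a DaemonSet pod template: spec.template.spec
nodeSelector:
  node-role.kubernetes.io: switch
tolerations:
  - key: node-role.kubernetes.io
    operator: Equal
    value: switch
    effect: NoSchedule
EOF

# Show the generated fragment.
cat switch-scheduling.yaml
```

The selector keeps the workload off the servers, while the toleration lets it through the taint on the switches; both are needed to run Stratum on switches and nowhere else.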
Step 3: Prepare ONOS network configuration
------------------------------------------
See :ref:`onos_network_config` for instructions.
Step 4: Prepare Stratum chassis configuration
---------------------------------------------
See :ref:`stratum_chassis_config` for instructions.
.. _install_sd_fabric:
Step 5: Install SD-Fabric with Helm
-----------------------------------
To install SD-Fabric into your Kubernetes cluster, follow the instructions
in the `SD-Fabric Helm Chart README <https://gerrit.opencord.org/plugins/gitiles/sdfabric-helm-charts/+/HEAD/sdfabric/README.md>`_.