..
   SPDX-FileCopyrightText: © 2020 Open Networking Foundation <support@opennetworking.org>
   SPDX-License-Identifier: Apache-2.0

Fabric Switch Bootstrap
=======================

The installation of the ONL OS image on the fabric switches uses the DHCP and
HTTP server set up on the management server.

The default image is downloaded during that installation process by the
``onieboot`` role. Make changes to that role and rerun the management playbook
to download a newer switch image.

Preparation
-----------

The switches have a single Ethernet port that is shared between OpenBMC and
ONL. Find the MAC addresses of both the OpenBMC and the ONL interface on that
port and enter them into NetBox.

Change boot mode to ONIE Rescue mode
------------------------------------

In order to reinstall an ONL image, you must change the ONIE bootloader to
"Rescue Mode".

Once the switch is powered on, it should retrieve an IP address on the OpenBMC
interface with DHCP. OpenBMC uses these default credentials::

  username: root
  password: 0penBmc

Log in to OpenBMC with SSH::

  $ ssh root@10.0.0.131
  The authenticity of host '10.0.0.131 (10.0.0.131)' can't be established.
  ECDSA key fingerprint is SHA256:...
  Are you sure you want to continue connecting (yes/no)? yes
  Warning: Permanently added '10.0.0.131' (ECDSA) to the list of known hosts.
  root@10.0.0.131's password:
  root@bmc:~#

Using the Serial-over-LAN Console, enter ONL::

  root@bmc:~# /usr/local/bin/sol.sh
  You are in SOL session.
  Use ctrl-x to quit.
  -----------------------

  root@onl:~#

.. note::
   If ``sol.sh`` is unresponsive, try restarting the mainboard from the
   OpenBMC shell with::

      root@bmc:~# wedge_power.sh reset

Change the boot mode to rescue mode with the command ``onl-onie-boot-mode
rescue``, and reboot::

  root@onl:~# onl-onie-boot-mode rescue
  [1053033.768512] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null)
  [1053033.936893] EXT4-fs (sda3): re-mounted. Opts: (null)
  [1053033.996727] EXT4-fs (sda3): re-mounted. Opts: (null)
  The system will boot into ONIE rescue mode at the next restart.
  root@onl:~# reboot

At this point, ONL will go through its shutdown sequence and ONIE will start.
If it does not start right away, press the Enter/Return key a few times; it
may show you a boot selection screen. Pick ``ONIE`` and ``Rescue`` if given a
choice.

Installing an ONL image over HTTP
---------------------------------

Now that the switch is in Rescue mode, activate the console by pressing Enter::

  discover: Rescue mode detected. Installer disabled.

  Please press Enter to activate this console.
  To check the install status inspect /var/log/onie.log.
  Try this: tail -f /var/log/onie.log

  ** Rescue Mode Enabled **
  ONIE:/ #

Then run the ``onie-nos-install`` command, with the URL of the management
server on the management network segment::

  ONIE:/ # onie-nos-install http://10.0.0.129/onie-installer
  discover: Rescue mode detected. No discover stopped.
  ONIE: Unable to find 'Serial Number' TLV in EEPROM data.
  Info: Fetching http://10.0.0.129/onie-installer ...
  Connecting to 10.0.0.129 (10.0.0.129:80)
  installer 100% |*******************************| 322M 0:00:00 ETA
  ONIE: Executing installer: http://10.0.0.129/onie-installer
  installer: computing checksum of original archive
  installer: checksum is OK
  ...

The installation will now start. When it finishes, ONL boots and ends at a
login prompt::

  Open Network Linux OS ONL-wedge100bf-32qs, 2020-11-04.19:44-64100e9

  localhost login:

The default ONL login is::

  username: root
  password: onl

If you log in, you can verify that the switch is getting its IP address via
DHCP::

  root@localhost:~# ip addr
  ...
  3: ma1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
  link/ether 00:90:fb:5c:e1:97 brd ff:ff:ff:ff:ff:ff
  inet 10.0.0.130/25 brd 10.0.0.255 scope global ma1
  ...


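To look at only the management interface instead of reading the full
``ip addr`` listing, the one-line output mode of ``ip`` can be filtered. This
is a minimal sketch, assuming the management interface is named ``ma1`` as in
the output above and that the ``ip`` on the switch supports the ``-4`` and
``-o`` options. The helper name ``mgmt_ipv4`` is hypothetical. It shows which
address is assigned, not by itself that the address came from DHCP::

  # Hypothetical helper: print the IPv4 address/prefix found in
  # `ip -4 -o addr` output read from standard input.
  mgmt_ipv4() {
      # In one-line mode the fields are: index, interface, "inet", address/prefix, ...
      awk '$3 == "inet" { print $4 }'
  }

  # Usage on the switch:
  #   ip -4 -o addr show dev ma1 | mgmt_ipv4

For the example switch above this would print ``10.0.0.130/25``.
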
Post-ONL Configuration
----------------------

A ``terraform`` user must be created on the switches to allow them to be
configured.

This is done using Ansible. Verify that your inventory (created earlier from the
``inventory/example-aether.ini`` file) includes an ``[aetherfabric]`` section
that lists the names and IP addresses of all the fabric switches.

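Before running Ansible against the switches, it can help to confirm which
hosts the ``[aetherfabric]`` section actually contains. This is a minimal
sketch, assuming a plain INI inventory with one host per line. The function
name ``list_fabric_hosts`` is hypothetical and not part of the playbooks::

  # Hypothetical helper: list the host names under [aetherfabric]
  # in an INI-style inventory file given as the first argument.
  list_fabric_hosts() {
      awk '
          # A section header switches the "inside aetherfabric" flag on or off
          /^\[/ { in_group = ($0 == "[aetherfabric]"); next }
          # Inside the section, skip blank lines and comments, print the host name
          in_group && NF && $1 !~ /^[#;]/ { print $1 }
      ' "$1"
  }

  # Usage:
  #   list_fabric_hosts inventory/sitename.ini

If Ansible is already installed, ``ansible -i inventory/sitename.ini
--list-hosts aetherfabric`` gives a similar listing as resolved by Ansible
itself.
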
Then run a ping test::

  ansible -i inventory/sitename.ini -m ping aetherfabric

This may fail with the error::

  "msg": "Using a SSH password instead of a key is not possible because Host Key checking is enabled and sshpass does not support this. Please add this host's fingerprint to your known_hosts file to manage this host."

Comment out the ``ansible_ssh_pass="onl"`` line, then rerun the ping test. It
may ask you to confirm the SSH host key of each switch. Answer ``yes`` for each
host to trust its key::

  The authenticity of host '10.0.0.138 (<no hostip for proxy command>)' can't be established.
  ECDSA key fingerprint is SHA256:...
  Are you sure you want to continue connecting (yes/no/[fingerprint])? yes

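As an alternative to answering the prompt for each host, host keys can be
collected ahead of time with ``ssh-keyscan``. This is a sketch under
assumptions: the switches must be reachable directly from the machine running
Ansible, because ``ssh-keyscan`` does not go through an SSH proxy command
(which the sample prompt above suggests may be in use). The function name
``trust_host_keys`` and the ``KNOWN_HOSTS`` and ``KEYSCAN`` override variables
are hypothetical. As with answering ``yes``, this trusts whatever key each
host presents::

  # Hypothetical helper: append the SSH host keys of the given hosts
  # to a known_hosts file (default: ~/.ssh/known_hosts).
  trust_host_keys() {
      known_hosts="${KNOWN_HOSTS:-$HOME/.ssh/known_hosts}"
      for host in "$@"; do
          # -H hashes the host names and addresses in the scanned output
          "${KEYSCAN:-ssh-keyscan}" -H "$host" >> "$known_hosts"
      done
  }

  # Usage:
  #   trust_host_keys 10.0.0.138
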
Once you've trusted the host keys, the ping test should succeed::

  spine1.role1.site | SUCCESS => {
      "changed": false,
      "ping": "pong"
  }
  leaf1.role1.site | SUCCESS => {
      "changed": false,
      "ping": "pong"
  }
  ...

Then run the playbook to create the ``terraform`` user::

  ansible-playbook -i inventory/sitename.ini playbooks/aetherfabric-playbook.yml

Once the playbook completes, the switches should be ready for the TOST runtime
install.