Merge "Verify fabric connectivity of the servers once TOST is ready"
diff --git a/pronto_deployment_guide/hw_installation.rst b/pronto_deployment_guide/hw_installation.rst
index df60ea7..a55f64e 100644
--- a/pronto_deployment_guide/hw_installation.rst
+++ b/pronto_deployment_guide/hw_installation.rst
@@ -97,6 +97,15 @@
* Set the description to ``fab2.<deployment>.<site>.aetherproject.net`` (or
``prontoproject.net``).
+ A parent range that covers the two fabric ranges must also be added:
+
+ * ``10.0.1.0/24``
+
+ * This range lets DHCP provide the correct routes, DNS, and TFTP
+ servers to equipment connected to the fabric leaf switch that is
+ not connected to the management server (which provides those
+ services).
+
Additionally, these edge prefixes are used for Kubernetes but don't need to
be created in NetBox:
@@ -221,14 +230,14 @@
14. Add router IP reservations to the IP Prefix for both Fabric prefixes. These
are IP addresses that:
- - Have the last usable address in range (in a ``/25``, this would be
- ``.126`` or ``.254``)
+ - Have the last usable address in range (in the ``/25`` fabric examples
+ above, these would be ``10.0.1.126/25`` and ``10.0.1.254/25``)
- Have a ``Status`` of ``Reserved``, and the VRF, Tenant Group, and Tenant
set.
- The Description must start with the word ``router``, such as: ``router
- for for leaf1 Fabric``
+ for leaf1 Fabric``
15. Add Cables between physical interfaces on the devices
@@ -360,10 +369,11 @@
In the case of the Fabric that has two leaves and IP ranges, add the Management
server IP address used for the leaf that it is connected to, and then add a
route for the other IP address range for the non-Management-connected leaf that
-is via the Fabric router address in that range.
+is via the Fabric router address in the connected leaf range.
This configures the `netplan <https://netplan.io>`_ on the management server,
-and will be automated away soon::
+and creates a SNAT rule for the UE range route, and will be automated away
+soon::
# added manually
netprep_netplan:
@@ -385,7 +395,11 @@
routes:
- to: 10.0.1.0/25
via: 10.0.1.254
- weight: 100
+ metric: 100
+
+ netprep_nftables_nat_postrouting: >
+ ip saddr 10.0.1.0/25 ip daddr 10.168.0.0/20 counter snat to 10.0.1.129;
+
Using the ``inventory/example-aether.ini`` as a template, create an
:doc:`ansible inventory <ansible:user_guide/intro_inventory>` file for the
diff --git a/pronto_deployment_guide/troubleshooting.rst b/pronto_deployment_guide/troubleshooting.rst
index 7828b76..e9cd463 100644
--- a/pronto_deployment_guide/troubleshooting.rst
+++ b/pronto_deployment_guide/troubleshooting.rst
@@ -5,6 +5,42 @@
Troubleshooting
===============
+Unable to access a system
+-------------------------
+
+If it's a system behind another system (e.g. the compute nodes behind a
+management server) and you're trying to log in to it interactively, make sure
+that you've enabled SSH Agent Forwarding in your ``~/.ssh/config`` file::
+
+ Host mgmtserver1.prod.site.aetherproject.net
+ ForwardAgent yes
+
+If you still have problems after verifying that this is set up, run the first
+``ssh`` (to the management server) with the ``-v`` option. Its debug output
+shows the connection details and whether an agent is used on the second ssh::
+
+ onfadmin@mgmtserver1:~$ ssh onfadmin@node2.mgmt.prod.site.aetherproject.net
+ debug1: client_input_channel_open: ctype auth-agent@openssh.com rchan 2 win 65536 max 16384
+ debug1: channel 1: new [authentication agent connection]
+ debug1: confirm auth-agent@openssh.com
+ Welcome to Ubuntu 18.04.5 LTS (GNU/Linux 5.4.0-56-generic x86_64)
+ ...
+ onfadmin@node2:~$
+
+Problems with OS installation
+-----------------------------
+
+OS installs, but doesn't boot
+"""""""""""""""""""""""""""""
+
+If you've completed the installation but the system won't start the OS, check
+these BIOS settings:
+
+- If the startup disk is NVMe, under ``Advanced -> PCIe/PCI/PnP Configuration``
+ the option ``NVMe Firmware Source`` should be set to ``AMI Native Support``,
+ per `this Supermicro FAQ entry
+ <https://supermicro.com/support/faqs/faq.cfm?faq=28248>`_.
+
Unknown MAC addresses
---------------------
@@ -23,7 +59,9 @@
4. If you don't have a login to the server, but can get to the management
server, ``ip neighbor`` will show the arp table of MAC addresses known to
that system. It's output is unsorted - ``ip neigh | sort`` is easier to
- read.
+ read. This can be useful for determining if there's a cabling problem -
+ a device plugged into the wrong port of the management switch could show up
+ in the DHCP pool range for a different segment.
Cabling issues
--------------