..
   SPDX-FileCopyrightText: © 2020 Open Networking Foundation <support@opennetworking.org>
   SPDX-License-Identifier: Apache-2.0

Troubleshooting
===============

Firewalls and other host network issues
---------------------------------------

Unable to access a system
"""""""""""""""""""""""""

If the system is behind another system (for example, a compute node behind a
management server) and you're trying to log in to it interactively, make sure
that you've enabled SSH agent forwarding in your ``~/.ssh/config`` file::

    Host mgmtserver1.prod.site.aetherproject.net
      ForwardAgent yes

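Agent forwarding only helps if an agent holding your key is running on your
workstation. As a quick check, sketched here with the user and hostname from
the examples in this section, list the loaded keys locally and again on the
management server; the same key fingerprints should appear in both places. The
``-A`` option requests forwarding for a single session, which can be handy for
testing before editing the config file::

    # On your workstation: list the keys held by the local agent
    ssh-add -l

    # Connect with agent forwarding enabled for this session only
    ssh -A onfadmin@mgmtserver1.prod.site.aetherproject.net

    # On the management server: the forwarded agent should list the same keys
    ssh-add -l
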
If you still have problems after verifying that this is set up, run the first
``ssh`` with the ``-v`` option. Its debug output shows the connection details
and, when you start the second ``ssh``, whether the forwarded agent is used::

    onfadmin@mgmtserver1:~$ ssh onfadmin@node2.mgmt.prod.site.aetherproject.net
    debug1: client_input_channel_open: ctype auth-agent@openssh.com rchan 2 win 65536 max 16384
    debug1: channel 1: new [authentication agent connection]
    debug1: confirm auth-agent@openssh.com
    Welcome to Ubuntu 18.04.5 LTS (GNU/Linux 5.4.0-56-generic x86_64)
    ...
    onfadmin@node2:~$

Root/Public DNS port is blocked
"""""""""""""""""""""""""""""""

In some cases access to the public DNS root servers and other public
nameservers is blocked, which prevents DNS lookups from working within the pod.

To resolve this, provide forwarding addresses on the local network in the
Ansible YAML ``host_vars`` file, using the ``unbound_forward_zones`` list to
configure the Unbound recursive nameserver. An example::

    unbound_forward_zones:
      - name: "."
        servers:
          - "8.8.8.8"
          - "8.8.4.4"

The items in the ``servers`` list should be the nameservers that are
accessible from the local network.

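Before listing a nameserver, it can help to confirm that it answers queries
from the management network, and afterwards that Unbound has picked up the
forwarders. A minimal sketch, assuming that ``dig`` is installed, that
``10.0.0.1`` stands in for a local nameserver (a hypothetical address), and
that ``unbound-control`` remote control is enabled on the host running
Unbound::

    # Query the candidate nameserver directly (10.0.0.1 is a placeholder);
    # an answer means it is reachable and resolving
    dig +short @10.0.0.1 aetherproject.net NS

    # After applying the configuration, list the forward zones Unbound uses
    sudo unbound-control list_forwards
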
Problems with OS installation
-----------------------------

OS installs, but doesn't boot
"""""""""""""""""""""""""""""

If you've completed the installation but the system won't start the OS, check
these BIOS settings:

- If the startup disk is NVMe, under ``Advanced -> PCIe/PCI/PnP Configuration``
  the option ``NVMe Firmware Source`` should be set to ``AMI Native Support``,
  per `this Supermicro FAQ entry
  <https://supermicro.com/support/faqs/faq.cfm?faq=28248>`_.

Unknown MAC addresses
---------------------

Sometimes it's hard to find all the MAC addresses assigned to network cards.
They can be found in a variety of ways:

1. On servers, the BMC webpage lists the MAC addresses of the built-in network
   cards.

2. If you log in to a server, ``ip link`` or ``ip addr`` will show the MAC
   address of each interface, including those on add-in cards.

3. If you can log in to a server but don't know the BMC IP or MAC address for
   that server, you can find it with ``sudo ipmitool lan print``.

4. If you don't have a login to the server, but can get to the management
   server, ``ip neighbor`` will show the ARP table of MAC addresses known to
   that system. Its output is unsorted, so ``ip neigh | sort`` is easier to
   read. This can help determine whether there's a cabling problem: a device
   plugged into the wrong port of the management switch could show up in the
   DHCP pool range for a different segment.

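To look up a specific card, it can be easier to list the neighbor table keyed
by MAC address. The following is a minimal sketch: ``neigh_by_mac`` is a
hypothetical helper name, and it assumes the usual ``ip neigh`` line layout of
``<ip> dev <interface> lladdr <mac> <state>``, skipping entries that have no
``lladdr`` field (for example ``FAILED`` ones)::

    # Hypothetical helper: read `ip neigh` output on stdin and print
    # "MAC IP INTERFACE" for every entry that has a link-layer address,
    # sorted by MAC address.
    neigh_by_mac() {
        awk '{
            for (i = 1; i < NF; i++)
                if ($i == "lladdr") print $(i + 1), $1, $3
        }' | sort
    }

    # Usage on the management server:
    #   ip neigh | neigh_by_mac

Sorting by MAC address groups cards from the same vendor together, since they
share an address prefix.
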
Cabling issues
--------------

The system may not come up correctly if cabling isn't connected properly.
If you don't have hands-on access to the cabling, here are some ways to check
it remotely:

1. On servers you can check which ports are connected with ``ip link show``::

      $ ip link show
      ...
      3: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
          link/ether 3c:ec:ef:4d:55:a8 brd ff:ff:ff:ff:ff:ff
      ...
      5: eno2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
          link/ether 3c:ec:ef:4d:55:a9 brd ff:ff:ff:ff:ff:ff

   Ports that are up will show ``state UP``.

2. You can determine which remote ports are connected with LLDP, assuming that
   the remote switch supports LLDP and has it enabled. This can be done with
   ``networkctl lldp``, which shows both the name and the MAC address of the
   connected switch on a per-link basis::

      $ networkctl lldp
      LINK  CHASSIS ID         SYSTEM NAME      CAPS         PORT ID  PORT DESCRIPTION
      eno1  10:4f:58:e7:d5:60  Aruba-2540-24PP  ..b........  10       10
      eno2  10:4f:58:e7:d5:60  Aruba-2540-24PP  ..b........  1        1
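
When a server has many interfaces, a one-line-per-interface summary is quicker
to scan than the full ``ip link show`` listing. A minimal sketch, assuming the
one-line output format of ``ip -o link show`` and using ``link_states`` as a
hypothetical helper name, prints each interface with its state::

    # Hypothetical helper: read `ip -o link show` output on stdin and print
    # "INTERFACE STATE" for each line that reports a state.
    link_states() {
        awk '{
            name = $2
            sub(/:$/, "", name)   # strip the trailing colon from the name
            for (i = 3; i < NF; i++)
                if ($i == "state") print name, $(i + 1)
        }'
    }

    # Usage on a server:
    #   ip -o link show | link_states

An interface that reports ``DOWN`` on a port that should be cabled is a good
candidate for a closer look.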