..
   SPDX-FileCopyrightText: © 2020 Open Networking Foundation <support@opennetworking.org>
   SPDX-License-Identifier: Apache-2.0

Hardware Installation
=====================

Hardware installation breaks down into a few steps:

1. `Planning`_
2. `Inventory`_
3. `Rackmount of Equipment`_
4. `Cabling and Network Topology`_
5. `Management Switch Bootstrap`_
6. `Management Server Bootstrap`_
7. `Server Software Bootstrap`_

Installation of the fabric switch hardware is covered in :ref:`OS Installation
- Switches <switch-install>`.

Installation of the radio hardware is covered in :ref:`eNB Installation
<enb-installation>`.

Planning
--------

Planning covers the network topology, the devices involved, and the required
cabling between them.

Once planning is complete, equipment is ordered to match the plan.

Network Cable Plan
""""""""""""""""""

If a 2x2 TOST fabric is used, it should be configured as a :doc:`Single-Stage
Leaf-Spine <trellis:supported-topology>`.

- The links between each leaf and spine switch must be made up of two separate
  cables.

- Each compute server is dual-homed via a separate cable to two different leaf
  switches (as in the "paired switches" diagrams).

If only a single P4 switch is used, the :doc:`Simple
<trellis:supported-topology>` topology is used, with two connections from each
compute server to the single switch.

Additionally, a non-fabric switch is required to provide a set of management
networks. This management switch is configured with multiple VLANs to separate
the management plane, fabric, and the out-of-band and lights-out management
connections on the equipment.

Device Naming
"""""""""""""

Site Design and Bookkeeping
"""""""""""""""""""""""""""

The following items need to be added to `NetBox
<https://netbox.readthedocs.io/en/stable>`_ to describe each edge site (a
scripted example follows this list):

1. Add a Site for the edge (if one doesn't already exist), which has the
   physical location and contact information for the edge.

2. Add Racks to the Site (if they don't already exist).

3. Add a Tenant for the edge (who owns/manages it), assigned to the ``Pronto``
   or ``Aether`` Tenant Group.

4. Add a VRF (Routing Table) for the edge site.

5. Add a VLAN Group to the edge site, which groups the site's VLANs and
   prevents duplication.

6. Add VLANs for the edge site. These should be assigned a VLAN Group, the
   Site, and Tenant.

   There can be multiple of the same VLAN in NetBox (VLANs are layer 2 and
   local to the site), but not within the same VLAN Group.

   The minimal list of VLANs:

   * ADMIN 1
   * UPLINK 10
   * MGMT 800
   * FAB 801

   If you have multiple deployments at a site using the same management
   server, add additional VLANs incremented by 10 for the MGMT/FAB - for
   example:

   * DEVMGMT 810
   * DEVFAB 811

7. Add IP Prefixes for the site. This should have the Tenant and VRF assigned.

   All edge IP prefixes fit into a ``/22`` sized block.

   The description of the Prefix contains the DNS suffix for all Devices that
   have IP addresses within this Prefix. The full DNS names are generated by
   combining the first ``<devname>`` component of the Device names with this
   suffix.

   An example using the ``10.0.0.0/22`` block: there are 4 edge prefixes, with
   the following purposes:

   * ``10.0.0.0/25``

     * Has the Server BMC/LOM and Management Switch
     * Assign the ADMIN 1 VLAN
     * Set the description to ``admin.<deployment>.<site>.aetherproject.net``
       (or ``prontoproject.net``).

   * ``10.0.0.128/25``

     * Has the Server Management plane, Fabric Switch Management/BMC
     * Assign the MGMT 800 VLAN
     * Set the description to ``<deployment>.<site>.aetherproject.net`` (or
       ``prontoproject.net``).

   * ``10.0.1.0/25``

     * Has the IP addresses of the qsfp0 ports of the Compute Nodes to the
       Fabric switches, and of devices connected to the Fabric such as the eNB
     * Assign the FAB 801 VLAN
     * Set the description to ``fab1.<deployment>.<site>.aetherproject.net``
       (or ``prontoproject.net``).

   * ``10.0.1.128/25``

     * Has the IP addresses of the qsfp1 ports of the Compute Nodes to the
       fabric switches
     * Assign the FAB 801 VLAN
     * Set the description to ``fab2.<deployment>.<site>.aetherproject.net``
       (or ``prontoproject.net``).

   Additionally, these edge prefixes are used for Kubernetes but don't need to
   be created in NetBox:

   * ``10.0.2.0/24``

     * Kubernetes Pod IPs

   * ``10.0.3.0/24``

     * Kubernetes Cluster IPs

8. Add Devices to the site, for each piece of equipment. These are named with a
   scheme similar to the DNS names used for the pod, given in this format::

     <devname>.<deployment>.<site>

   Examples::

     mgmtserver1.ops1.tucson
     node1.stage1.menlo

   Note that these names are transformed into DNS names using the Prefixes, and
   may have additional components - ``admin`` or ``fabric`` may be added after
   the ``<devname>`` for devices on those networks.

   Set the following fields when creating a device:

   * Site
   * Tenant
   * Rack & Rack Position
   * Serial number

   If a specific Device Type doesn't exist for the device, it must be created,
   as detailed in the NetBox documentation, or ask the Ops team for help.

9. Add Services to the management server:

   * name: ``dns``
     protocol: UDP
     port: 53

   * name: ``tftp``
     protocol: UDP
     port: 69

   These are used by the DHCP and DNS config to determine which servers offer
   DNS and TFTP service.

10. Set the MAC address for the physical interfaces on the device.

    You may also need to add physical network interfaces if they aren't
    already created by the Device Type. An example would be if additional
    add-in network cards were installed.

11. Add any virtual interfaces to the Devices. When creating a virtual
    interface, it should have its ``label`` field set to the physical network
    interface that it is assigned to.

    These are needed in two cases for the Pronto deployment:

    1. On the Management Server, there should be (at least) two VLAN
       interfaces created attached to the ``eno2`` network port, which
       are used to provide connectivity to the management plane and fabric.
       These should be named ``<name of vlan><vlan ID>``, so the MGMT 800 VLAN
       would become a virtual interface named ``mgmt800``, with the label
       ``eno2``.

    2. On the Fabric switches, the ``eth0`` port is shared between the OpenBMC
       interface and the ONIE/ONL installation. Add a ``bmc`` virtual
       interface with a label of ``eth0`` on each fabric switch, and check the
       ``OOB Management`` checkbox.

12. Create IP addresses for the physical and virtual interfaces. These should
    have the Tenant and VRF set.

    The Management Server should always have the first IP address in each
    range, and they should be incremental, in this order. Examples are given
    as if there were a single instance of each device - adding additional
    devices would increment the later IP addresses.

    * Management Server

      * ``eno1`` - site-provided public IP address, or blank if DHCP provided
      * ``eno2`` - 10.0.0.1/25 (first of ADMIN) - set as primary IP
      * ``bmc`` - 10.0.0.2/25 (next of ADMIN)
      * ``mgmt800`` - 10.0.0.129/25 (first of MGMT)
      * ``fab801`` - 10.0.1.1/25 (first of FAB)

    * Management Switch

      * ``gbe1`` - 10.0.0.3/25 (next of ADMIN) - set as primary IP

    * Fabric Switch

      * ``eth0`` - 10.0.0.130/25 (next of MGMT) - set as primary IP
      * ``bmc`` - 10.0.0.131/25

    * Compute Server

      * ``eth0`` - 10.0.0.132/25 (next of MGMT) - set as primary IP
      * ``bmc`` - 10.0.0.4/25 (next of ADMIN)
      * ``qsfp0`` - 10.0.1.2/25 (next of FAB)
      * ``qsfp1`` - 10.0.1.3/25

    * Other Fabric devices (eNB, etc.)

      * ``eth0`` or other primary interface - 10.0.1.4/25 (next of FAB)

13. Add DHCP ranges to the IP Prefixes for IPs that aren't reserved. These are
    added like any other IP Address, but with the ``Status`` field set to
    ``DHCP``, and they'll consume the entire range of IP addresses given in
    the CIDR mask.

    For example, ``10.0.0.32/27`` as a DHCP block would take up 1/4 of the
    ADMIN prefix.

14. Add Cables between physical interfaces on the devices.

    TODO: Explain the cabling topology

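Most of these records can be entered through the NetBox web UI, but they can
also be scripted against the NetBox REST API. The sketch below uses ``curl``
against a hypothetical NetBox instance; the URL, token, and numeric object IDs
are placeholders, and field names should be checked against the API browser of
the NetBox version in use::

   # Assumes $TOKEN holds a valid NetBox API key, and that the Site, Tenant,
   # VLAN Group, VRF, Device Type, and Device Role with ID 1 already exist.

   # Create the MGMT 800 VLAN (step 6)
   curl -s -X POST https://netbox.example.org/api/ipam/vlans/ \
     -H "Authorization: Token $TOKEN" -H "Content-Type: application/json" \
     -d '{"name": "MGMT", "vid": 800, "site": 1, "group": 1, "tenant": 1}'

   # Create the MGMT prefix, with the DNS suffix as its description (step 7)
   curl -s -X POST https://netbox.example.org/api/ipam/prefixes/ \
     -H "Authorization: Token $TOKEN" -H "Content-Type: application/json" \
     -d '{"prefix": "10.0.0.128/25", "vrf": 1, "tenant": 1, "description": "stage1.menlo.aetherproject.net"}'

   # Create a compute node Device (step 8)
   curl -s -X POST https://netbox.example.org/api/dcim/devices/ \
     -H "Authorization: Token $TOKEN" -H "Content-Type: application/json" \
     -d '{"name": "node1.stage1.menlo", "device_type": 1, "device_role": 1, "site": 1, "tenant": 1, "serial": "S1234567"}'
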
Hardware
""""""""

Fabric Switches
'''''''''''''''

Pronto currently uses fabric switches based on the Intel (formerly Barefoot)
Tofino chipset. There are multiple variants of this switching chipset, with
different speeds and capabilities.

The specific hardware models in use in Pronto:

* `EdgeCore Wedge100BF-32X
  <https://www.edge-core.com/productsInfo.php?cls=1&cls2=180&cls3=181&id=335>`_
  - a "Dual Pipe" chipset variant, used for the Spine switches

* `EdgeCore Wedge100BF-32QS
  <https://www.edge-core.com/productsInfo.php?cls=1&cls2=180&cls3=181&id=770>`_
  - a "Quad Pipe" chipset variant, used for the Leaf switches

Compute Servers
'''''''''''''''

These servers run Kubernetes and edge applications.

The requirements for these servers:

* AMD64 (aka x86-64) architecture
* Sufficient resources to run Kubernetes
* Two 40GbE or 100GbE Ethernet connections to the fabric switches
* One management 1GbE port

The specific hardware models in use in Pronto:

* `Supermicro 6019U-TRTP2
  <https://www.supermicro.com/en/products/system/1U/6019/SYS-6019U-TRTP2.cfm>`_
  1U server

* `Supermicro 6029U-TR4
  <https://www.supermicro.com/en/products/system/2U/6029/SYS-6029U-TR4.cfm>`_
  2U server

These servers are configured with:

* 2x `Intel Xeon 5220R CPUs
  <https://ark.intel.com/content/www/us/en/ark/products/199354/intel-xeon-gold-5220r-processor-35-75m-cache-2-20-ghz.html>`_,
  each with 24 cores, 48 threads
* 384GB of DDR4 memory, made up of 12x 16GB ECC DIMMs
* 2TB of NVMe flash storage
* 2x 6TB SATA disk storage
* 2x 40GbE ports using an XL710QDA2 NIC

The 1U servers additionally have:

- 2x 1GbE copper network ports
- 2x 10GbE SFP+ network ports

The 2U servers have:

- 4x 1GbE copper network ports

Management Server
'''''''''''''''''

One management server is required. It must have at least two 1GbE network
ports, and runs a variety of network services to support the edge.

The model used in Pronto is a `Supermicro 5019D-FTN4
<https://www.supermicro.com/en/Aplus/system/Embedded/AS-5019D-FTN4.cfm>`_,
which is configured with:

* AMD Epyc 3251 CPU with 8 cores, 16 threads
* 32GB of DDR4 memory, in 2x 16GB ECC DIMMs
* 1TB of NVMe flash storage
* 4x 1GbE copper network ports

Management Switch
'''''''''''''''''

This switch connects the configuration interfaces and management networks on
all the servers and switches together.

In the Pronto deployment this hardware is an `HP/Aruba 2540 Series JL356A
<https://www.arubanetworks.com/products/switches/access/2540-series/>`_.

Inventory
---------

Once equipment arrives, any device needs to be recorded in inventory if it:

1. Connects to the network (has a MAC address)
2. Has a serial number
3. Isn't a subcomponent (disk, add-in card, linecard, etc.) of a larger device.

The following information should be recorded for every device:

- Manufacturer
- Model
- Serial Number
- MAC address (for the primary and any management/BMC/IPMI interfaces)

This information should be added to the corresponding Devices in the ONF
NetBox instance. The accuracy of this information is very important, as it is
used in bootstrapping the systems.

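On a device that can be booted into Linux, this information can be collected
with standard tools; a sketch (``dmidecode`` requires root, and ``ipmitool``
only applies to machines with a BMC)::

   # Manufacturer, model, and serial number
   sudo dmidecode -s system-manufacturer
   sudo dmidecode -s system-product-name
   sudo dmidecode -s system-serial-number

   # MAC addresses of all physical network interfaces
   ip -br link show

   # MAC address of the BMC/IPMI interface
   sudo ipmitool lan print 1 | grep "MAC Address"
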
Once inventory has been completed, let the Infra team know, and the pxeboot
configuration will be generated with the OS preseed files corresponding to the
new servers, based on their serial numbers.

Rackmount of Equipment
----------------------

Most of the Pronto equipment is in a 19" rackmount form factor.

Guidelines for mounting this equipment:

- The EdgeCore Wedge switches have a front-to-back (aka "port-to-power") fan
  configuration, so hot air exhaust is out the back of the switch near the
  power inlets, away from the 32 QSFP network ports on the front of the switch.

- The full-depth 1U and 2U Supermicro servers also have front-to-back airflow,
  but have most of their ports on the rear of the device.

- Airflow through the rack should be in one direction to avoid heat being
  pulled from one device into another. This means that to connect the QSFP
  network ports from the servers to the switches, cabling should be routed
  through the rack from front (switch) to back (server).

- The short-depth management HP switch and 1U Supermicro servers should be
  mounted to the rear of the rack. Neither generates an appreciable amount of
  heat, so the airflow direction isn't a significant factor in racking them.

Cabling and Network Topology
----------------------------

TODO: Add diagrams of network here, and cabling plan

Management Switch Bootstrap
---------------------------

TODO: Add instructions for bootstrapping management switch, from document that
has the linked config file.

Server Software Bootstrap
-------------------------

Management Server Bootstrap
"""""""""""""""""""""""""""

The management server is bootstrapped into a customized version of the
standard Ubuntu 18.04 OS installer.

The `iPXE boot firmware <https://ipxe.org/>`_ is used to start this process,
and is built using the steps detailed in the `ipxe-build
<https://gerrit.opencord.org/plugins/gitiles/ipxe-build>`_ repo, which
generates both USB and PXE-chainloadable boot images.

Once a system has been booted using these images, it will download a
customized script from an external webserver to continue the boot process.
This iPXE-to-webserver connection is secured with mutual TLS authentication,
enforced by the nginx webserver.

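The exact nginx configuration is site-specific, but mutual TLS enforcement
generally takes the shape of the following sketch (the hostname and
certificate paths are placeholders)::

   server {
       listen 443 ssl;
       server_name pxe.example.org;

       # Server-side certificate presented to iPXE clients
       ssl_certificate     /etc/nginx/certs/server.crt;
       ssl_certificate_key /etc/nginx/certs/server.key;

       # Reject clients that can't present a certificate signed by this CA
       ssl_client_certificate /etc/nginx/certs/client-ca.crt;
       ssl_verify_client on;

       location / {
           root /srv/pxe;
       }
   }
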
The iPXE scripts are created by the `pxeboot
<https://gerrit.opencord.org/plugins/gitiles/ansible/role/pxeboot>`_ role,
which creates a boot menu, downloads the appropriate binaries for
bootstrapping an OS installation, and creates per-node installation preseed
files.

The preseed files contain configuration steps to install the OS from the
upstream Ubuntu repos, as well as customization of packages and creating the
``onfadmin`` user.

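For orientation, debian-installer preseed directives for these steps look like
the following sketch; the actual files are generated per-node, and the
password hash and mirror shown here are placeholders::

   # Create the onfadmin user instead of a default account
   d-i passwd/user-fullname string ONF Admin
   d-i passwd/username string onfadmin
   d-i passwd/user-password-crypted password <crypted password hash>

   # Install from the upstream Ubuntu mirror
   d-i mirror/http/hostname string archive.ubuntu.com
   d-i mirror/http/directory string /ubuntu

   # Additional packages installed on every node
   d-i pkgsel/include string openssh-server
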
TODO: convert instructions for bootstrapping the management server with iPXE
here.

Once the OS is installed on the management server, Ansible is used to remotely
install software on the management server.

To check out the ONF ansible repo and enter the virtualenv with the tooling::

   mkdir infra
   cd infra
   repo init -u ssh://<your gerrit username>@gerrit.opencord.org:29418/infra-manifest
   repo sync
   cd ansible
   make galaxy
   source venv_onfansible/bin/activate

Obtain the ``undionly.kpxe`` iPXE artifact for bootstrapping the compute
servers, and put it in the ``files`` directory.

Next, create an inventory file to access the NetBox API. An example is given
in ``inventory/example-netbox.yml`` - duplicate this file and modify it. Fill
in the ``api_endpoint`` address and ``token`` with an API key you get out of
the NetBox instance. List the IP Prefixes used by the site in the
``ip_prefixes`` list.

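Purely as an illustration of the fields named above, such a file might contain
something like the sketch below - always start from the checked-in
``inventory/example-netbox.yml`` rather than this hypothetical layout::

   # Hypothetical sketch; the real structure is in inventory/example-netbox.yml
   api_endpoint: "https://netbox.example.org"
   token: "<API token generated in the NetBox web UI>"
   ip_prefixes:
     - "10.0.0.0/25"
     - "10.0.0.128/25"
     - "10.0.1.0/25"
     - "10.0.1.128/25"
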
Next, run the ``scripts/netbox_edgeconfig.py`` script to generate a host_vars
file for the management server. Assuming that the management server in the
edge is named ``mgmtserver1.stage1.menlo``, you'd run::

   python scripts/netbox_edgeconfig.py inventory/my-netbox.yml > inventory/host_vars/mgmtserver1.stage1.menlo.yml

One manual change needs to be made to this output - edit the
``inventory/host_vars/mgmtserver1.stage1.menlo.yml`` file and add the
following to the bottom of the file, replacing the IP addresses with the ones
that the management server is configured with on each VLAN. This configures
the `netplan <https://netplan.io>`_ on the management server, and will be
automated away soon::

   # added manually
   netprep_netplan:
     ethernets:
       eno2:
         addresses:
           - 10.0.0.1/25
     vlans:
       mgmt800:
         id: 800
         link: eno2
         addresses:
           - 10.0.0.129/25
       fabr801:
         id: 801
         link: eno2
         addresses:
           - 10.0.1.1/25

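Once the playbook below has applied this netplan, the interfaces can be
sanity-checked on the management server with standard tools (interface names
per the example above)::

   # show addresses for the physical port and the two VLAN interfaces
   ip -br addr show | grep -E 'eno2|mgmt800|fabr801'

   # validate the rendered netplan, rolling back if connectivity is lost
   sudo netplan try
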
Create an inventory file for the management server in
``inventory/menlo-staging.ini`` which contains::

   [mgmt]
   mgmtserver1.stage1.menlo ansible_host=<public ip address> ansible_user="onfadmin" ansible_become_password=<password>

Then, to configure the management server, run::

   ansible-playbook -i inventory/menlo-staging.ini playbooks/aethermgmt-playbook.yml

This installs software with the following functionality (an illustrative
configuration sketch follows the list):

- VLANs on the second Ethernet port to provide connectivity to the rest of the
  pod.
- Firewall with NAT for routing traffic
- DHCP and TFTP for bootstrapping servers and switches
- DNS for host naming and identification
- HTTP server for serving files used for bootstrapping switches

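The playbook's roles determine the actual implementation; purely as an
illustration, a dnsmasq-style configuration covering the DHCP/TFTP/DNS pieces
might look like this (the domain, range, and paths are placeholders)::

   # Hypothetical sketch - the real config is generated by Ansible
   domain=stage1.menlo.aetherproject.net

   # DHCP pool matching a range reserved in NetBox (step 13 above)
   dhcp-range=10.0.0.32,10.0.0.63,12h

   # Serve iPXE boot artifacts over TFTP
   enable-tftp
   tftp-root=/var/lib/tftpboot
   dhcp-boot=undionly.kpxe
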
Compute Server Bootstrap
""""""""""""""""""""""""

Once the management server has finished installation, it will be set to offer
the same iPXE bootstrap file to the compute servers.

Each node will be booted, and when iPXE loads, select the ``Ubuntu 18.04
Installer (fully automatic)`` option.

The nodes can be controlled remotely via their BMC management interfaces - if
the BMC is at ``10.0.0.4`` (the first Compute Server BMC in the address plan
above), a remote user can SSH into it with::

   ssh -L 2443:10.0.0.4:443 onfadmin@<mgmt server ip>

And then use their web browser to access the BMC at::

   https://localhost:2443

The default BMC credentials for the Pronto nodes are::

   login: ADMIN
   password: Admin123

Once these nodes are brought up, the installation can continue.