..
   SPDX-FileCopyrightText: © 2020 Open Networking Foundation <support@opennetworking.org>
   SPDX-License-Identifier: Apache-2.0

Hardware Installation
=====================

Hardware installation breaks down into a few steps:

1. `Planning`_
2. `Inventory`_
3. `Rackmount of Equipment`_
4. `Cabling and Network Topology`_
5. `Management Switch Bootstrap`_
6. `Management Server Bootstrap`_
7. `Server Software Bootstrap`_

Installation of the fabric switch hardware is covered in :ref:`OS Installation
- Switches <switch-install>`.

Installation of the radio hardware is covered in :ref:`eNB Installation
<enb-installation>`.

Planning
--------

Planning covers the network topology, the devices required, and the cabling
between them.

Once planning is complete, equipment is ordered to match the plan.

Network Cable Plan
""""""""""""""""""

If a 2x2 TOST fabric is used, it should be configured as a :doc:`Single-Stage
Leaf-Spine <trellis:supported-topology>`.

- The links between each leaf and spine switch must be made up of two separate
  cables.

- Each compute server is dual-homed via a separate cable to two different leaf
  switches (as in the "paired switches" diagrams).

If only a single P4 switch is used, the :doc:`Simple
<trellis:supported-topology>` topology is used, with two connections from each
compute server to the single switch.

Additionally, a non-fabric switch is required to provide a set of management
networks. This management switch is configured with multiple VLANs to separate
the management plane, fabric, and the out-of-band and lights-out management
connections on the equipment.

Device Naming
"""""""""""""

Site Design and Bookkeeping
"""""""""""""""""""""""""""

The following items need to be added to `NetBox
<https://netbox.readthedocs.io/en/stable>`_ to describe each edge site:

1. Add a Site for the edge (if one doesn't already exist), which has the
   physical location and contact information for the edge.

2. Add Racks to the Site (if they don't already exist).

3. Add a Tenant for the edge (who owns/manages it), assigned to the ``Pronto``
   or ``Aether`` Tenant Group.

4. Add a VRF (Routing Table) for the edge site.

5. Add a VLAN Group to the edge site, which groups the site's VLANs and
   prevents duplication.

6. Add VLANs for the edge site. These should be assigned a VLAN Group, the
   Site, and Tenant.

   There can be multiple VLANs with the same ID in NetBox (VLANs are layer 2
   and local to the site), but not within the same VLAN Group.

   The minimal list of VLANs:

   * ADMIN 1
   * UPLINK 10
   * MGMT 800
   * FAB 801

   If you have multiple deployments at a site using the same management server,
   add additional MGMT/FAB VLANs, incrementing the IDs by 10 for each
   deployment - for example:

   * DEVMGMT 810
   * DEVFAB 811
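
   As a sanity check on the numbering, the per-deployment VLAN IDs can be
   derived from the base MGMT/FAB IDs; a minimal Python sketch (the deployment
   names here are hypothetical)::

      # Sketch only: derive MGMT/FAB VLAN IDs for additional deployments,
      # stepping the base IDs (MGMT 800, FAB 801) by 10 per deployment.
      BASE = {"MGMT": 800, "FAB": 801}

      def deployment_vlans(deployments):
          """Return {deployment: {vlan_name: vlan_id}}, offset by 10 per deployment."""
          return {
              name: {vlan: vid + 10 * offset for vlan, vid in BASE.items()}
              for offset, name in enumerate(deployments)
          }

      # The first deployment keeps 800/801; a second "dev" deployment gets 810/811.
      print(deployment_vlans(["prod", "dev"]))
      # {'prod': {'MGMT': 800, 'FAB': 801}, 'dev': {'MGMT': 810, 'FAB': 811}}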

7. Add IP Prefixes for the site. This should have the Tenant and VRF assigned.

   All edge IP prefixes fit into a ``/22`` sized block.

   The description of the Prefix contains the DNS suffix for all Devices that
   have IP addresses within this Prefix. The full DNS names are generated by
   combining the first ``<devname>`` component of the Device names with this
   suffix.

   An example using the ``10.0.0.0/22`` block: there are 5 edge prefixes, with
   the following purposes:

   * ``10.0.0.0/25``

     * Has the Server BMC/LOM and Management Switch
     * Assign the ADMIN 1 VLAN
     * Set the description to ``admin.<deployment>.<site>.aetherproject.net`` (or
       ``prontoproject.net``).

   * ``10.0.0.128/25``

     * Has the Server Management plane, Fabric Switch Management/BMC
     * Assign the MGMT 800 VLAN
     * Set the description to ``<deployment>.<site>.aetherproject.net`` (or
       ``prontoproject.net``).

   * ``10.0.1.0/24``

     * Has the Compute Node Fabric connections and devices connected to the
       Fabric, such as the eNB
     * Assign the FAB 801 VLAN
     * Set the description to ``fabric.<deployment>.<site>.aetherproject.net`` (or
       ``prontoproject.net``).

   * ``10.0.2.0/24``

     * Kubernetes Pod IPs

   * ``10.0.3.0/24``

     * Kubernetes Cluster IPs
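
   The carve-up of the ``/22`` into these five prefixes can be checked with the
   standard ``ipaddress`` module; a minimal sketch of the layout described
   above::

      import ipaddress

      # Sketch only: the five edge prefixes carved out of a site's /22 block,
      # matching the example layout above.
      edge_block = ipaddress.ip_network("10.0.0.0/22")
      prefixes = {
          "ADMIN (BMC/LOM, management switch)": ipaddress.ip_network("10.0.0.0/25"),
          "MGMT (server/switch management)": ipaddress.ip_network("10.0.0.128/25"),
          "FAB (fabric-connected devices)": ipaddress.ip_network("10.0.1.0/24"),
          "Kubernetes Pod IPs": ipaddress.ip_network("10.0.2.0/24"),
          "Kubernetes Cluster IPs": ipaddress.ip_network("10.0.3.0/24"),
      }

      # Every prefix must fall inside the /22, and no two prefixes may overlap.
      assert all(p.subnet_of(edge_block) for p in prefixes.values())
      nets = list(prefixes.values())
      assert not any(a.overlaps(b) for i, a in enumerate(nets) for b in nets[i + 1:])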

8. Add Devices to the site, for each piece of equipment. These are named with a
   scheme similar to the DNS names used for the pod, given in this format::

      <devname>.<deployment>.<site>

   Examples::

      mgmtserver1.ops1.tucson
      node1.stage1.menlo

   Note that these names are transformed into DNS names using the Prefixes, and
   may have additional components - ``admin`` or ``fabric`` may be added after
   the ``<devname>`` for devices on those networks.

   Set the following fields when creating a device:

   * Site
   * Tenant
   * Rack & Rack Position
   * Serial number

   If a specific Device Type doesn't exist for the device, it must be created,
   as detailed in the NetBox documentation - or ask the Ops team for help.
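
   To make the naming rule concrete, the following sketch shows how a full DNS
   name is formed from a Device name and the description of the Prefix its
   address falls in (illustrative only; the actual names are generated from the
   NetBox data by the site tooling)::

      # Sketch only: combine the first component of a Device name with the
      # DNS suffix stored in the matching IP Prefix description.
      def dns_name(device_name, prefix_description):
          devname = device_name.split(".")[0]
          return f"{devname}.{prefix_description}"

      # A compute node with an address in the FAB prefix (whose description is
      # "fabric.stage1.menlo.aetherproject.net") would be named:
      print(dns_name("node1.stage1.menlo", "fabric.stage1.menlo.aetherproject.net"))
      # node1.fabric.stage1.menlo.aetherproject.net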

9. Set the MAC address for the physical interfaces on the device.

   You may also need to add physical network interfaces if they aren't already
   created by the Device Type. An example would be if additional add-in
   network cards were installed.

10. Add any virtual interfaces to the Devices. When creating a virtual
    interface, it should have its ``label`` field set to the physical network
    interface that it is assigned to.

    These are needed in two cases for the Pronto deployment:

    1. On the Management Server, there should be (at least) two VLAN
       interfaces created and attached to the ``eno2`` network port, which
       are used to provide connectivity to the management plane and fabric.
       These should be named ``<name of vlan><vlan ID>``, so the MGMT 800 VLAN
       would become a virtual interface named ``mgmt800``, with the label
       ``eno2``.

    2. On the Fabric switches, the ``eth0`` port is shared between the OpenBMC
       interface and the ONIE/ONL installation. Add a ``bmc`` virtual
       interface with a label of ``eth0`` on each fabric switch.

11. Create IP addresses for the physical and virtual interfaces. These should
    have the Tenant and VRF set.

    The Management Server should always have the first IP address in each
    range, and subsequent addresses should be assigned incrementally, in this
    order. The examples below assume a single instance of each device - adding
    more devices would increment the later IP addresses.

    * Management Server

      * ``eno1`` - site-provided public IP address, or blank if DHCP
      * ``eno2`` - 10.0.0.1/25 (first of ADMIN) - set as primary IP
      * ``bmc`` - 10.0.0.2/25 (next of ADMIN)
      * ``mgmt800`` - 10.0.0.129/25 (first of MGMT)
      * ``fab801`` - 10.0.1.1/24 (first of FAB)

    * Management Switch

      * ``gbe1`` - 10.0.0.3/25 (next of ADMIN) - set as primary IP

    * Fabric Switch

      * ``eth0`` - 10.0.0.130/25 (next of MGMT) - set as primary IP
      * ``bmc`` - 10.0.0.131/25

    * Compute Server

      * ``eth0`` - 10.0.0.132/25 (next of MGMT) - set as primary IP
      * ``bmc`` - 10.0.0.4/25 (next of ADMIN)
      * ``qsfp0`` - 10.0.1.2/24 (next of FAB)
      * ``qsfp1`` - 10.0.1.3/24

    * Other Fabric devices (eNB, etc.)

      * ``eth0`` or other primary interface - 10.0.1.4/24 (next of FAB)
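
    The "first of" / "next of" pattern above is simply sequential allocation
    from each prefix; a minimal sketch of that bookkeeping, using the example
    prefixes from the plan above::

       import ipaddress

       # Sketch only: hand out host addresses in order from each prefix, mirroring
       # the "first of ADMIN", "next of ADMIN", ... assignments listed above.
       admin = ipaddress.ip_network("10.0.0.0/25").hosts()
       mgmt = ipaddress.ip_network("10.0.0.128/25").hosts()
       fab = ipaddress.ip_network("10.0.1.0/24").hosts()

       print(next(admin))  # 10.0.0.1   -> Management Server eno2    (first of ADMIN)
       print(next(admin))  # 10.0.0.2   -> Management Server bmc     (next of ADMIN)
       print(next(mgmt))   # 10.0.0.129 -> Management Server mgmt800 (first of MGMT)
       print(next(fab))    # 10.0.1.1   -> Management Server fab801  (first of FAB)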

12. Add Cables between the physical interfaces on the devices.

    TODO: Explain the cabling topology
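
The records above can be entered through the NetBox web UI; for bulk setup they
can also be created via the NetBox API. Below is a minimal sketch using the
``pynetbox`` client (the endpoint, token, and names are hypothetical
placeholders)::

   import pynetbox

   # Sketch only: create a Site, a VLAN, and a Prefix through the NetBox API.
   # The endpoint, token, and names below are hypothetical placeholders.
   nb = pynetbox.api("https://netbox.example.com", token="<api token>")

   site = nb.dcim.sites.create(name="menlo", slug="menlo")
   mgmt_vlan = nb.ipam.vlans.create(name="MGMT", vid=800, site=site.id)
   nb.ipam.prefixes.create(
       prefix="10.0.0.128/25",
       site=site.id,
       vlan=mgmt_vlan.id,
       description="stage1.menlo.aetherproject.net",
   )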

Hardware
""""""""

Fabric Switches
'''''''''''''''

Pronto currently uses fabric switches based on the Intel (formerly Barefoot)
Tofino chipset. There are multiple variants of this switching chipset, with
different speeds and capabilities.

The specific hardware models in use in Pronto:

* `EdgeCore Wedge100BF-32X
  <https://www.edge-core.com/productsInfo.php?cls=1&cls2=180&cls3=181&id=335>`_
  - a "Dual Pipe" chipset variant, used for the Spine switches

* `EdgeCore Wedge100BF-32QS
  <https://www.edge-core.com/productsInfo.php?cls=1&cls2=180&cls3=181&id=770>`_
  - a "Quad Pipe" chipset variant, used for the Leaf switches

Compute Servers
'''''''''''''''

These servers run Kubernetes and edge applications.

The requirements for these servers:

* AMD64 (aka x86-64) architecture
* Sufficient resources to run Kubernetes
* Two 40GbE or 100GbE Ethernet connections to the fabric switches
* One management 1GbE port

The specific hardware models in use in Pronto:

* `Supermicro 6019U-TRTP2
  <https://www.supermicro.com/en/products/system/1U/6019/SYS-6019U-TRTP2.cfm>`_
  1U server

* `Supermicro 6029U-TR4
  <https://www.supermicro.com/en/products/system/2U/6029/SYS-6029U-TR4.cfm>`_
  2U server

These servers are configured with:

* 2x `Intel Xeon 5220R CPUs
  <https://ark.intel.com/content/www/us/en/ark/products/199354/intel-xeon-gold-5220r-processor-35-75m-cache-2-20-ghz.html>`_,
  each with 24 cores, 48 threads
* 384GB of DDR4 memory, made up of 12x 32GB ECC DIMMs
* 2TB of NVMe flash storage
* 2x 6TB SATA disk storage
* 2x 40GbE ports using an XL710QDA2 NIC

The 1U servers additionally have:

- 2x 1GbE copper network ports
- 2x 10GbE SFP+ network ports

The 2U servers have:

- 4x 1GbE copper network ports

Management Server
'''''''''''''''''

One management server is required, which must have at least two 1GbE network
ports, and runs a variety of network services to support the edge.

The model used in Pronto is a `Supermicro 5019D-FTN4
<https://www.supermicro.com/en/Aplus/system/Embedded/AS-5019D-FTN4.cfm>`_,
which is configured with:

* AMD Epyc 3251 CPU with 8 cores, 16 threads
* 32GB of DDR4 memory, in 2x 16GB ECC DIMMs
* 1TB of NVMe flash storage
* 4x 1GbE copper network ports

Management Switch
'''''''''''''''''

This switch connects the configuration interfaces and management networks on
all the servers and switches together.

In the Pronto deployment this hardware is an `HP/Aruba 2540 Series JL356A
<https://www.arubanetworks.com/products/switches/access/2540-series/>`_.

Inventory
---------

Once equipment arrives, any device needs to be recorded in inventory if it:

1. Connects to the network (has a MAC address)
2. Has a serial number
3. Isn't a subcomponent (disk, add-in card, linecard, etc.) of a larger device

The following information should be recorded for every device:

- Manufacturer
- Model
- Serial Number
- MAC address (for the primary and any management/BMC/IPMI interfaces)
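
For example, a per-device record handed over for entry into NetBox might be
kept as a simple structure like the following (a sketch with hypothetical
values)::

   # Sketch only: one inventory record per device; all values are hypothetical.
   record = {
       "manufacturer": "Supermicro",
       "model": "6019U-TRTP2",
       "serial_number": "S123456789",
       "mac_addresses": {
           "eth0": "0c:c4:7a:00:00:01",  # primary interface
           "bmc": "0c:c4:7a:00:00:02",   # management/BMC/IPMI interface
       },
   }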

This information should be added to the corresponding Devices in the ONF
NetBox instance. The accuracy of this information is very important, as it is
used in bootstrapping the systems.

Once inventory has been completed, let the Infra team know; the pxeboot
configuration will then be generated with OS preseed files corresponding to
the new servers, based on their serial numbers.

Rackmount of Equipment
----------------------

Most of the Pronto equipment is in a 19" rackmount form factor.

Guidelines for mounting this equipment:

- The EdgeCore Wedge Switches have a front-to-back (aka "port-to-power") fan
  configuration, so hot air exhaust is out the back of the switch near the
  power inlets, away from the 32 QSFP network ports on the front of the switch.

- The full-depth 1U and 2U Supermicro servers also have front-to-back airflow
  but have most of their ports on the rear of the device.

- Airflow through the rack should be in one direction to avoid heat being
  pulled from one device into another. This means that to connect the QSFP
  network ports from the servers to the switches, cabling should be routed
  through the rack from front (switch) to back (server).

- The short-depth management HP Switch and 1U Supermicro servers should be
  mounted to the rear of the rack. Neither generates an appreciable amount of
  heat, so the airflow direction isn't a significant factor in racking them.

Cabling and Network Topology
----------------------------

TODO: Add diagrams of network here, and cabling plan

Management Switch Bootstrap
---------------------------

TODO: Add instructions for bootstrapping management switch, from document that
has the linked config file.

Server Software Bootstrap
-------------------------

Management Server Bootstrap
"""""""""""""""""""""""""""

The management server is bootstrapped into a customized version of the standard
Ubuntu 18.04 OS installer.

The `iPXE boot firmware <https://ipxe.org/>`_ is used to start this process.
It is built using the steps detailed in the `ipxe-build
<https://gerrit.opencord.org/plugins/gitiles/ipxe-build>`_ repo, which
generates both USB and PXE chainloadable boot images.

Once a system has been booted using these images, it downloads a customized
script from an external webserver to continue the boot process. This
iPXE-to-webserver connection is secured with mutual TLS authentication,
enforced by the nginx webserver.

The iPXE scripts are created by the `pxeboot
<https://gerrit.opencord.org/plugins/gitiles/ansible/role/pxeboot>`_ role,
which creates a boot menu, downloads the appropriate binaries for
bootstrapping an OS installation, and creates per-node installation preseed
files.

The preseed files contain configuration steps to install the OS from the
upstream Ubuntu repos, as well as customizing packages and creating the
``onfadmin`` user.

TODO: convert instructions for bootstrapping the management server with iPXE here.

Once the OS is installed on the management server, Ansible is used to remotely
install software on it.

To check out the ONF ansible repo and enter the virtualenv with the tooling::

  mkdir infra
  cd infra
  repo init -u ssh://<your gerrit username>@gerrit.opencord.org:29418/infra-manifest
  repo sync
  cd ansible
  make galaxy
  source venv_onfansible/bin/activate

Next, create an inventory file to access the NetBox API. An example is given
in ``inventory/example-netbox.yml`` - duplicate this file and modify it. Fill
in the ``api_endpoint`` address and ``token`` with an API key you get out of
the NetBox instance. List the IP Prefixes used by the site in the
``ip_prefixes`` list.

Next, run the ``scripts/netbox_edgeconfig.py`` script to generate a host_vars
file for the management server. Assuming that the management server in the
edge is named ``mgmtserver1.stage1.menlo``, you'd run::

  python scripts/netbox_edgeconfig.py inventory/my-netbox.yml > inventory/host_vars/mgmtserver1.stage1.menlo.yml

Then create an inventory file for the management server in
``inventory/menlo-staging.ini``, which contains::

  [mgmt]
  mgmtserver1.stage1.menlo ansible_host=<public ip address> ansible_user="onfadmin" ansible_become_password=<password>

Finally, to configure the management server, run::

  ansible-playbook -i inventory/menlo-staging.ini playbooks/prontomgmt-playbook.yml

This installs software with the following functionality:

- VLANs on the second Ethernet port to provide connectivity to the rest of the pod
- Firewall with NAT for routing traffic
- DHCP and TFTP for bootstrapping servers and switches
- DNS for host naming and identification
- HTTP server for serving files used for bootstrapping other equipment

Compute Server Bootstrap
""""""""""""""""""""""""

Once the management server has finished installation, it will be set to offer
the same iPXE bootstrap file to the compute servers.

Each node will be booted, and when iPXE loads, select the ``Ubuntu 18.04
Installer (fully automatic)`` option.