..
   SPDX-FileCopyrightText: © 2020 Open Networking Foundation <support@opennetworking.org>
   SPDX-License-Identifier: Apache-2.0

=============
Bootstrapping
=============

.. _switch-install:

OS Installation - Switches
==========================

.. note::

   This part will be done automatically once we have a DHCP and HTTP server set up in the infrastructure.
   For now, we need to download and install the ONL image manually.

Install ONL with Docker
-----------------------
First, enter **ONIE rescue mode**.

Set up IP and route
^^^^^^^^^^^^^^^^^^^
.. code-block:: console

   # ip addr add 10.92.1.81/24 dev eth0
   # ip route add default via 10.92.1.1

- `10.92.1.81/24` should be replaced by the actual IP and subnet of the ONL.
- `10.92.1.1` should be replaced by the actual default gateway.

Download and install ONL
^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: console

   # wget https://github.com/opennetworkinglab/OpenNetworkLinux/releases/download/v1.3.2/ONL-onf-ONLPv2_ONL-OS_2020-10-09.1741-f7428f2_AMD64_INSTALLED_INSTALLER
   # sh ONL-onf-ONLPv2_ONL-OS_2020-10-09.1741-f7428f2_AMD64_INSTALLED_INSTALLER

The switch will reboot automatically once the installer is done.

.. note::

   Alternatively, we can `scp` the ONL installer into ONIE manually.

Setup BMC for remote console access
-----------------------------------
Log in to the BMC from ONL over the `usb0` interface:

.. code-block:: console

   # ssh root@192.168.0.1 # pass: 0penBmc

Once you are in the BMC, run the following commands to set up the IP address and route
(or assign the BMC a fixed IP address through DHCP):

.. code-block:: console

   # ip addr add 10.92.1.85/24 dev eth0
   # ip route add default via 10.92.1.1

- `10.92.1.85/24` should be replaced by the actual IP and subnet of the BMC.
  Note that it should be different from the ONL IP.
- `10.92.1.1` should be replaced by the actual default gateway.

The BMC uses the same Ethernet port as the ONL management interface, so give it an IP address in the same subnet.
The BMC address persists across ONL reboots, but it is lost on a power outage.
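
The same-subnet requirement can be checked before the addresses are applied. The following is a minimal sketch meant to run on a workstation with a POSIX shell, not in ONIE or on the BMC. The helper names `ip_to_int` and `same_subnet` are hypothetical and are not part of ONL or OpenBMC.

```shell
# ip_to_int ADDR: convert a dotted-quad IPv4 address to a 32-bit integer.
# (Hypothetical helper; octets must not have leading zeros.)
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# same_subnet ADDR1 ADDR2 PREFIXLEN: succeed if both addresses share the prefix.
same_subnet() (
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( a & mask )) -eq $(( b & mask )) ]
)

# Sample values from this section: ONL at 10.92.1.81 and BMC at 10.92.1.85, both /24.
if same_subnet 10.92.1.81 10.92.1.85 24; then
    echo "BMC and ONL are in the same subnet"
else
    echo "BMC and ONL are in different subnets"
fi
```

Replace the sample addresses and prefix length with the values used in your deployment.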

To log in to the ONL console from the BMC, run

.. code-block:: console

   # /usr/local/bin/sol.sh

If `sol.sh` is unresponsive, try restarting the mainboard with

.. code-block:: console

   # wedge_power.sh restart

Setup network and host name for ONL
-----------------------------------

.. code-block:: console

   # hostnamectl set-hostname <host-name>

   # vim.tiny /etc/hosts # update accordingly
   # cat /etc/hosts # example
   127.0.0.1 localhost
   10.92.1.81 menlo-staging-spine-1

   # vim.tiny /etc/network/interfaces.d/ma1 # update accordingly
   # cat /etc/network/interfaces.d/ma1 # example
   auto ma1
   iface ma1 inet static
       address 10.92.1.81
       netmask 255.255.255.0
       gateway 10.92.1.1
       dns-nameservers 8.8.8.8

VPN
===
This section walks you through how to set up a VPN between ACE and Aether Central in GCP.
We will be using the GitOps-based Aether CD pipeline for this,
so we just need to create a patch to the **aether-pod-configs** repository.
Note that some of the steps described here are not directly related to setting up a VPN,
but rather are prerequisites for adding a new ACE.

.. attention::

   If you are adding another ACE to an existing VPN connection, go to
   :ref:`Add ACE to an existing VPN connection <add_ace_to_vpn>`

Before you begin
----------------
* Make sure the firewall in front of the ACE allows UDP port 500, UDP port 4500, and ESP packets
  from **gcpvpn1.infra.aetherproject.net(35.242.47.15)** and **gcpvpn2.infra.aetherproject.net(34.104.68.78)**
* Make sure that the external IP on the ACE side is owned by or routed to the management node

To help your understanding, the following sample ACE environment will be used in the rest of this section.
Make sure to replace the sample values when you actually create a review request.

+-----------------------------+----------------------------------+
| Management node external IP | 128.105.144.189                  |
+-----------------------------+----------------------------------+
| ASN                         | 65003                            |
+-----------------------------+----------------------------------+
| GCP BGP IP address          | Tunnel 1: 169.254.0.9/30         |
|                             +----------------------------------+
|                             | Tunnel 2: 169.254.1.9/30         |
+-----------------------------+----------------------------------+
| ACE BGP IP address          | Tunnel 1: 169.254.0.10/30        |
|                             +----------------------------------+
|                             | Tunnel 2: 169.254.1.10/30        |
+-----------------------------+----------------------------------+
| PSK                         | UMAoZA7blv6gd3IaArDqgK2s0sDB8mlI |
+-----------------------------+----------------------------------+
| Management Subnet           | 10.91.0.0/24                     |
+-----------------------------+----------------------------------+
| K8S Subnet                  | Pod IP: 10.66.0.0/17             |
|                             +----------------------------------+
|                             | Cluster IP: 10.66.128.0/17       |
+-----------------------------+----------------------------------+

Download aether-pod-configs repository
--------------------------------------
.. code-block:: shell

   $ cd $WORKDIR
   $ git clone "ssh://[username]@gerrit.opencord.org:29418/aether-pod-configs"

.. _update_global_resource:

Update global resource maps
---------------------------
Add the new ACE's information at the end of the following global resource maps.

* user_map.tfvars
* cluster_map.tfvars
* vpn_map.tfvars

As a note, you can find several other global resource maps under the `production` directory.
Resource definitions that need to be shared among clusters, or that are better managed in a
single file to avoid configuration conflicts, are maintained in this way.

.. code-block:: diff

   $ cd $WORKDIR/aether-pod-configs/production
   $ vi user_map.tfvars

   # Add the new cluster admin user at the end of the map
   $ git diff user_map.tfvars
   --- a/production/user_map.tfvars
   +++ b/production/user_map.tfvars
   @@ user_map = {
        username = "menlo"
        password = "changeme"
        global_roles = ["user-base", "catalogs-use"]
   +  },
   +  test_admin = {
   +    username = "test"
   +    password = "changeme"
   +    global_roles = ["user-base", "catalogs-use"]
      }
    }

.. code-block:: diff

   $ cd $WORKDIR/aether-pod-configs/production
   $ vi cluster_map.tfvars

   # Add the new K8S cluster information at the end of the map
   $ git diff cluster_map.tfvars
   --- a/production/cluster_map.tfvars
   +++ b/production/cluster_map.tfvars
   @@ cluster_map = {
          kube_dns_cluster_ip = "10.53.128.10"
          cluster_domain = "prd.menlo.aetherproject.net"
          calico_ip_detect_method = "can-reach=www.google.com"
   +    },
   +    ace-test = {
   +      cluster_name = "ace-test"
   +      management_subnets = ["10.91.0.0/24"]
   +      k8s_version = "v1.18.8-rancher1-1"
   +      k8s_pod_range = "10.66.0.0/17"
   +      k8s_cluster_ip_range = "10.66.128.0/17"
   +      kube_dns_cluster_ip = "10.66.128.10"
   +      cluster_domain = "prd.test.aetherproject.net"
   +      calico_ip_detect_method = "can-reach=www.google.com"
        }
      }
    }

.. code-block:: diff

   $ cd $WORKDIR/aether-pod-configs/production
   $ vi vpn_map.tfvars

   # Add VPN and tunnel information at the end of the map
   $ git diff vpn_map.tfvars
   --- a/production/vpn_map.tfvars
   +++ b/production/vpn_map.tfvars
   @@ vpn_map = {
        bgp_peer_ip_address_1 = "169.254.0.6"
        bgp_peer_ip_range_2 = "169.254.1.5/30"
        bgp_peer_ip_address_2 = "169.254.1.6"
   +  },
   +  ace-test = {
   +    peer_name = "production-ace-test"
   +    peer_vpn_gateway_address = "128.105.144.189"
   +    tunnel_shared_secret = "UMAoZA7blv6gd3IaArDqgK2s0sDB8mlI"
   +    bgp_peer_asn = "65003"
   +    bgp_peer_ip_range_1 = "169.254.0.9/30"
   +    bgp_peer_ip_address_1 = "169.254.0.10"
   +    bgp_peer_ip_range_2 = "169.254.1.9/30"
   +    bgp_peer_ip_address_2 = "169.254.1.10"
      }
    }

.. note::
   Unless you have a specific requirement, set ASN and BGP addresses to the next available values in the map.
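
As an illustration of picking the next available BGP addresses, the sketch below derives the next /30 block from the last `bgp_peer_ip_range` value in `vpn_map.tfvars`. It assumes that blocks are allocated consecutively and that the last map entry holds the highest block, which matches the sample values but is not guaranteed by the repository. The function name `next_bgp_pair` is hypothetical.

```shell
# next_bgp_pair LAST_RANGE: given the last bgp_peer_ip_range value in the map
# (for example 169.254.0.9/30), print the next range and the next peer address.
# Assumption: /30 blocks are handed out consecutively within one /24.
next_bgp_pair() (
    addr=${1%/*}            # strip the /30 suffix
    base=${addr%.*}         # first three octets
    last=${addr##*.}        # last octet
    next=$(( last + 4 ))    # each /30 block spans four addresses
    if [ "$next" -gt 253 ]; then
        echo "no /30 block left in $base.0/24" >&2
        exit 1
    fi
    echo "$base.$next/30 $base.$(( next + 1 ))"
)

# Tunnel 1 and tunnel 2 of the sample ACE in this section
next_bgp_pair 169.254.0.9/30
next_bgp_pair 169.254.1.9/30
```

The first field is the candidate value for `bgp_peer_ip_range_N` and the second is the candidate for `bgp_peer_ip_address_N`. Check them against the current map before using them.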

Create ACE specific configurations
----------------------------------
In this step, we will create a directory under `production` named after the ACE,
and add several Terraform configurations and an Ansible inventory needed to configure a VPN connection.
Throughout the deployment procedure, this directory will contain all ACE-specific configurations.

Run the following commands to auto-generate the necessary files under the target ACE directory.

.. code-block:: shell

   $ cd $WORKDIR/aether-pod-configs/tools
   $ cp ace_env /tmp/ace_env
   $ vi /tmp/ace_env
   # Set environment variables

   $ source /tmp/ace_env
   $ make vpn
   Created ../production/ace-test
   Created ../production/ace-test/main.tf
   Created ../production/ace-test/variables.tf
   Created ../production/ace-test/gcp_fw.tf
   Created ../production/ace-test/gcp_ha_vpn.tf
   Created ../production/ace-test/ansible
   Created ../production/ace-test/backend.tf
   Created ../production/ace-test/cluster_val.tfvars
   Created ../production/ace-test/ansible/hosts.ini
   Created ../production/ace-test/ansible/extra_vars.yml

.. attention::
   The predefined templates are tailored to Pronto BOM. You'll need to fix `cluster_val.tfvars` and `ansible/extra_vars.yml`
   when using a different BOM.
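
Before creating the review request, it can help to confirm that every generated file is present. This is a minimal sketch: the file list is copied from the sample `make vpn` output above, and the function name `check_ace_files` is hypothetical, not a tool shipped in the repository.

```shell
# check_ace_files DIR: print every expected file that is missing under DIR
# and return non-zero if any file is missing.
# The file list mirrors the sample `make vpn` output and may differ in your checkout.
check_ace_files() (
    missing=0
    for f in main.tf variables.tf gcp_fw.tf gcp_ha_vpn.tf backend.tf \
             cluster_val.tfvars ansible/hosts.ini ansible/extra_vars.yml; do
        if [ ! -f "$1/$f" ]; then
            echo "missing: $1/$f"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
)

# Example, run from the tools directory:
#   check_ace_files ../production/ace-test
```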

Create a review request
-----------------------
.. code-block:: shell

   $ cd $WORKDIR/aether-pod-configs/production
   $ git status
   On branch tools
   Changes not staged for commit:

     modified: cluster_map.tfvars
     modified: user_map.tfvars
     modified: vpn_map.tfvars

   Untracked files:
     (use "git add <file>..." to include in what will be committed)

     ace-test/

   $ git add .
   $ git commit -m "Add test ACE"
   $ git review

Once the review request is accepted and merged,
the CD pipeline will create VPN tunnels on both GCP and the management node.

Verify VPN connection
---------------------
You can verify the VPN connections after the post-merge job succeeds
by checking the routing table on the management node and pinging one of the central cluster VMs.
Make sure that two tunnel interfaces, `gcp_tunnel1` and `gcp_tunnel2`, exist
and that there are three additional routing entries via one of the tunnel interfaces.

.. code-block:: shell

   # Verify routes
   $ netstat -rn
   Kernel IP routing table
   Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
   0.0.0.0         128.105.144.1   0.0.0.0         UG        0 0          0 eno1
   10.45.128.0     169.254.0.9     255.255.128.0   UG        0 0          0 gcp_tunnel1
   10.52.128.0     169.254.0.9     255.255.128.0   UG        0 0          0 gcp_tunnel1
   10.66.128.0     10.91.0.8       255.255.128.0   UG        0 0          0 eno1
   10.91.0.0       0.0.0.0         255.255.255.0   U         0 0          0 eno1
   10.168.0.0      169.254.0.9     255.255.240.0   UG        0 0          0 gcp_tunnel1
   128.105.144.0   0.0.0.0         255.255.252.0   U         0 0          0 eno1
   169.254.0.8     0.0.0.0         255.255.255.252 U         0 0          0 gcp_tunnel1
   169.254.1.8     0.0.0.0         255.255.255.252 U         0 0          0 gcp_tunnel2

   # Verify ACC VM access
   $ ping 10.168.0.6

   # Verify ACC K8S cluster access
   $ nslookup kube-dns.kube-system.svc.prd.acc.gcp.aetherproject.net 10.52.128.10

You can further verify that the ACE routes are propagated correctly to GCP
by checking the GCP dashboard under **VPC Network > Routes > Dynamic**.
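
The routing-table check can also be scripted. The sketch below counts gateway routes that leave through either tunnel interface. It assumes the `netstat -rn` column layout shown in the sample output, and the function name `count_tunnel_routes` is hypothetical.

```shell
# count_tunnel_routes: read `netstat -rn` output on stdin and print the number
# of gateway routes (flags UG) whose interface is gcp_tunnel1 or gcp_tunnel2.
# Assumes the 8-column layout shown in the sample output (Flags = 4th, Iface = 8th).
count_tunnel_routes() {
    awk '$4 == "UG" && $8 ~ /^gcp_tunnel[12]$/ { n++ } END { print n + 0 }'
}

# On the management node:
#   netstat -rn | count_tunnel_routes
```

For the sample routing table in this section the count is 3. The expected number in your environment depends on how many central subnets are advertised.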

Post VPN setup
--------------
Once you verify the VPN connections, rename the `ansible` directory to `_ansible` to prevent
the Ansible playbook from running again.
Re-running the Ansible playbook does no harm, but it is not recommended.

.. code-block:: shell

   $ cd $WORKDIR/aether-pod-configs/production/$ACE_NAME
   $ mv ansible _ansible
   $ git add .
   $ git commit -m "Mark ansible done for test ACE"
   $ git review

.. _add_ace_to_vpn:

Add another ACE to an existing VPN connection
---------------------------------------------
VPN connections can be shared when there are multiple ACE clusters in a site.
To add an ACE to an existing VPN connection,
you'll have to SSH into the management node and manually update the BIRD configuration.

.. note::

   This step needs improvements in the future.

.. code-block:: shell

   $ sudo vi /etc/bird/bird.conf
   protocol static {
      ...
      route 10.66.128.0/17 via 10.91.0.10;

      # Add routes for the new ACE's K8S cluster IP range via cluster nodes
      # TODO: Configure iBGP peering with Calico nodes and dynamically learn these routes
      route <NEW-ACE-CLUSTER-IP-RANGE> via <SERVER1>;
      route <NEW-ACE-CLUSTER-IP-RANGE> via <SERVER2>;
      route <NEW-ACE-CLUSTER-IP-RANGE> via <SERVER3>;
   }

   filter gcp_tunnel_out {
      # Add the new ACE's K8S cluster IP range and the management subnet if required to the list
      if (net ~ [ 10.91.0.0/24, 10.66.128.0/17, <NEW-ACE-CLUSTER-IP-RANGE> ]) then accept;
      else reject;
   }
   # Save and exit

   $ sudo birdc configure

   # Confirm the static routes are added
   $ sudo birdc show route
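
To avoid typos when adding the static routes by hand, the route lines can be generated and then pasted into the `protocol static` block. This is a sketch only: the function name `bird_static_routes` is hypothetical, and the cluster IP range and node addresses in the example are made-up values, not taken from a real deployment.

```shell
# bird_static_routes CIDR SERVER...: print one BIRD static route line per server.
bird_static_routes() (
    cidr=$1
    shift
    for server in "$@"; do
        echo "route $cidr via $server;"
    done
)

# Hypothetical new ACE: cluster IP range 10.67.128.0/17 served by three nodes
bird_static_routes 10.67.128.0/17 10.91.0.11 10.91.0.12 10.91.0.13
```

Remember to add the same range to the `gcp_tunnel_out` filter so that the new routes are advertised to GCP.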