..
   SPDX-FileCopyrightText: © 2020 Open Networking Foundation <support@opennetworking.org>
   SPDX-License-Identifier: Apache-2.0

=============
Bootstrapping
=============

OS Installation - Switches
==========================

.. note::

   This part will be done automatically once we have a DHCP and HTTP server set up in the infrastructure.
   For now, we need to download and install the ONL image manually.

Install ONL with Docker
-----------------------
First, enter **ONIE rescue mode**.

Set up IP and route
^^^^^^^^^^^^^^^^^^^
.. code-block:: console

   # ip addr add 10.92.1.81/24 dev eth0
   # ip route add default via 10.92.1.1

- `10.92.1.81/24` should be replaced by the actual IP and subnet of the ONL.
- `10.92.1.1` should be replaced by the actual default gateway.

Download and install ONL
^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: console

   # wget https://github.com/opennetworkinglab/OpenNetworkLinux/releases/download/v1.3.2/ONL-onf-ONLPv2_ONL-OS_2020-10-09.1741-f7428f2_AMD64_INSTALLED_INSTALLER
   # sh ONL-onf-ONLPv2_ONL-OS_2020-10-09.1741-f7428f2_AMD64_INSTALLED_INSTALLER

The switch will reboot automatically once the installer is done.

.. note::

   Alternatively, we can `scp` the ONL installer into ONIE manually.

Setup BMC for remote console access
-----------------------------------
Log in to the BMC from ONL over the `usb0` interface:

.. code-block:: console

   # ssh root@192.168.0.1 # pass: 0penBmc

Once you are in the BMC, run the following commands to set up the IP address and route
(or have a DHCP server offer it a fixed IP):

.. code-block:: console

   # ip addr add 10.92.1.85/24 dev eth0
   # ip route add default via 10.92.1.1

- `10.92.1.85/24` should be replaced by the actual IP and subnet of the BMC.
  Note that it should be different from the ONL IP.
- `10.92.1.1` should be replaced by the actual default gateway.

The BMC uses the same Ethernet port as the ONL management interface, so give it an IP address in the same subnet.
The BMC address is preserved across ONL reboots, but it is lost on a power outage.

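
The two address constraints above (same subnet as ONL, but a different IP) can be checked before the
values are typed into the BMC. The following is a minimal sketch in POSIX shell. The helper names
`ip_to_int`, `network_of`, and `check_bmc_ip` are hypothetical and are not part of ONL or OpenBMC.

.. code-block:: shell

   # Convert a dotted-quad IPv4 address to a 32-bit integer.
   ip_to_int() {
       old_ifs=$IFS
       IFS=.
       set -- $1          # split the address into four octets
       IFS=$old_ifs
       echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
   }

   # Print the network address (as an integer) of IP $1 with prefix length $2.
   network_of() {
       ip=$(ip_to_int "$1")
       mask=$(( (0xFFFFFFFF << (32 - $2)) & 0xFFFFFFFF ))
       echo $(( ip & mask ))
   }

   # Succeed only if the ONL IP ($1) and BMC IP ($2) differ
   # but fall in the same subnet of prefix length $3.
   check_bmc_ip() {
       onl_ip=$1 bmc_ip=$2 prefix=$3
       [ "$onl_ip" != "$bmc_ip" ] &&
           [ "$(network_of "$onl_ip" "$prefix")" = "$(network_of "$bmc_ip" "$prefix")" ]
   }

For the sample addresses used in this section, `check_bmc_ip 10.92.1.81 10.92.1.85 24` succeeds,
while passing the same address twice or an address from another subnet fails.
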
To log in to the ONL console from the BMC, run

.. code-block:: console

   # /usr/local/bin/sol.sh

If `sol.sh` is unresponsive, try restarting the mainboard with

.. code-block:: console

   # wedge_power.sh restart

Setup network and host name for ONL
-----------------------------------

.. code-block:: console

   # hostnamectl set-hostname <host-name>

   # vim.tiny /etc/hosts # update accordingly
   # cat /etc/hosts # example
   127.0.0.1 localhost
   10.92.1.81 menlo-staging-spine-1

   # vim.tiny /etc/network/interfaces.d/ma1 # update accordingly
   # cat /etc/network/interfaces.d/ma1 # example
   auto ma1
   iface ma1 inet static
       address 10.92.1.81
       netmask 255.255.255.0
       gateway 10.92.1.1
       dns-nameservers 8.8.8.8

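
The ONIE step earlier uses CIDR notation (`10.92.1.81/24`), while the `ma1` stanza above expects a
dotted netmask. As a rough sketch, the conversion and the stanza itself could be generated as follows.
The function names `prefix_to_netmask` and `render_ma1` are hypothetical, and the DNS server is simply
copied from the example above.

.. code-block:: shell

   # Print the dotted-quad netmask for a prefix length between 0 and 32.
   prefix_to_netmask() {
       mask=$(( (0xFFFFFFFF << (32 - $1)) & 0xFFFFFFFF ))
       echo "$(( (mask >> 24) & 255 )).$(( (mask >> 16) & 255 )).$(( (mask >> 8) & 255 )).$(( mask & 255 ))"
   }

   # Print an interfaces stanza for ma1.
   # $1 = address, $2 = prefix length, $3 = default gateway
   render_ma1() {
       printf 'auto ma1\niface ma1 inet static\n'
       printf '    address %s\n' "$1"
       printf '    netmask %s\n' "$(prefix_to_netmask "$2")"
       printf '    gateway %s\n' "$3"
       printf '    dns-nameservers 8.8.8.8\n'
   }

Running `render_ma1 10.92.1.81 24 10.92.1.1` prints a stanza equivalent to the example above, which
can then be reviewed and saved as `/etc/network/interfaces.d/ma1`.
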
VPN
===
This section walks you through how to set up a VPN between ACE and Aether Central in GCP.
We will use the GitOps-based Aether CD pipeline for this,
so we just need to create a patch to the **aether-pod-configs** repository.
Note that some of the steps described here are not directly related to setting up a VPN,
but rather are prerequisites for adding a new ACE.

.. attention::

   If you are adding another ACE to an existing VPN connection, go to
   :ref:`Add ACE to an existing VPN connection <add_ace_to_vpn>`

Before you begin
----------------
* Make sure the firewall in front of ACE allows UDP port 500, UDP port 4500, and ESP packets
  from **gcpvpn1.infra.aetherproject.net (35.242.47.15)** and **gcpvpn2.infra.aetherproject.net (34.104.68.78)**
* Make sure that the external IP on the ACE side is owned by or routed to the management node

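
As an illustration of the firewall requirement above, the sketch below prints the matching allow rules
for a Linux firewall managed with `iptables`. This is only an assumption about the environment: the
firewall in front of ACE may be a different product entirely, and the `FORWARD` chain is a guess that
fits a firewall sitting between the Internet and the management node. The function name
`print_vpn_fw_rules` is hypothetical. The rules are printed rather than applied so they can be reviewed first.

.. code-block:: shell

   # GCP VPN gateway addresses, taken from the list above.
   GCP_VPN_PEERS="35.242.47.15 34.104.68.78"

   # Print (do not apply) one iptables rule per peer and protocol.
   # $1 = chain name, defaults to FORWARD (an assumption, see the text above).
   print_vpn_fw_rules() {
       chain=${1:-FORWARD}
       for peer in $GCP_VPN_PEERS; do
           echo "iptables -A $chain -s $peer -p udp --dport 500 -j ACCEPT"   # IKE
           echo "iptables -A $chain -s $peer -p udp --dport 4500 -j ACCEPT"  # NAT traversal
           echo "iptables -A $chain -s $peer -p esp -j ACCEPT"               # ESP payload
       done
   }

Calling `print_vpn_fw_rules INPUT` emits the same six rules for the `INPUT` chain instead.
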
To help your understanding, the following sample ACE environment will be used in the rest of this section.
Make sure to replace the sample values when you actually create a review request.

+-----------------------------+----------------------------------+
| Management node external IP | 128.105.144.189                  |
+-----------------------------+----------------------------------+
| ASN                         | 65003                            |
+-----------------------------+----------------------------------+
| GCP BGP IP address          | Tunnel 1: 169.254.0.9/30         |
|                             +----------------------------------+
|                             | Tunnel 2: 169.254.1.9/30         |
+-----------------------------+----------------------------------+
| ACE BGP IP address          | Tunnel 1: 169.254.0.10/30        |
|                             +----------------------------------+
|                             | Tunnel 2: 169.254.1.10/30        |
+-----------------------------+----------------------------------+
| PSK                         | UMAoZA7blv6gd3IaArDqgK2s0sDB8mlI |
+-----------------------------+----------------------------------+
| Management Subnet           | 10.91.0.0/24                     |
+-----------------------------+----------------------------------+
| K8S Subnet                  | Pod IP: 10.66.0.0/17             |
|                             +----------------------------------+
|                             | Cluster IP: 10.66.128.0/17       |
+-----------------------------+----------------------------------+

Download aether-pod-configs repository
--------------------------------------
.. code-block:: shell

   $ cd $WORKDIR
   $ git clone "ssh://[username]@gerrit.opencord.org:29418/aether-pod-configs"

.. _update_global_resource:

Update global resource maps
---------------------------
Add the new ACE's information at the end of the following global resource maps.

* user_map.tfvars
* cluster_map.tfvars
* vpn_map.tfvars

As a note, you can find several other global resource maps under the `production` directory.
Resource definitions that need to be shared among clusters or are better managed in a
single file to avoid configuration conflicts are maintained in this way.

.. code-block:: diff

   $ cd $WORKDIR/aether-pod-configs/production
   $ vi user_map.tfvars

   # Add the new cluster admin user at the end of the map
   $ git diff user_map.tfvars
   --- a/production/user_map.tfvars
   +++ b/production/user_map.tfvars
   @@ user_map = {
          username = "menlo"
          password = "changeme"
          global_roles = ["user-base", "catalogs-use"]
   +    },
   +    test_admin = {
   +      username = "test"
   +      password = "changeme"
   +      global_roles = ["user-base", "catalogs-use"]
        }
      }

.. code-block:: diff

   $ cd $WORKDIR/aether-pod-configs/production
   $ vi cluster_map.tfvars

   # Add the new K8S cluster information at the end of the map
   $ git diff cluster_map.tfvars
   --- a/production/cluster_map.tfvars
   +++ b/production/cluster_map.tfvars
   @@ cluster_map = {
            kube_dns_cluster_ip = "10.53.128.10"
            cluster_domain = "prd.menlo.aetherproject.net"
            calico_ip_detect_method = "can-reach=www.google.com"
   +      },
   +      ace-test = {
   +        cluster_name = "ace-test"
   +        management_subnets = ["10.91.0.0/24"]
   +        k8s_version = "v1.18.8-rancher1-1"
   +        k8s_pod_range = "10.66.0.0/17"
   +        k8s_cluster_ip_range = "10.66.128.0/17"
   +        kube_dns_cluster_ip = "10.66.128.10"
   +        cluster_domain = "prd.test.aetherproject.net"
   +        calico_ip_detect_method = "can-reach=www.google.com"
          }
        }
      }

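
In the sample entry above, the pod range and the cluster IP range are the lower and upper halves of
one `10.x.0.0/16` block, and the kube-dns address is host `.10` of the cluster IP range. Assuming a new
ACE follows the same pattern (this is inferred from the sample values, not a documented rule), the three
values could be derived from the second octet alone. The function name `k8s_ranges` is hypothetical.

.. code-block:: shell

   # Print the three address-related cluster_map values for 10.<octet>.0.0/16.
   # $1 = second octet of the block, e.g. 66
   k8s_ranges() {
       echo "k8s_pod_range = \"10.$1.0.0/17\""          # lower half of the /16
       echo "k8s_cluster_ip_range = \"10.$1.128.0/17\""  # upper half of the /16
       echo "kube_dns_cluster_ip = \"10.$1.128.10\""     # host .10 in the cluster IP range
   }

`k8s_ranges 66` reproduces the three values shown in the diff above.
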
.. code-block:: diff

   $ cd $WORKDIR/aether-pod-configs/production
   $ vi vpn_map.tfvars

   # Add VPN and tunnel information at the end of the map
   $ git diff vpn_map.tfvars
   --- a/production/vpn_map.tfvars
   +++ b/production/vpn_map.tfvars
   @@ vpn_map = {
          bgp_peer_ip_address_1 = "169.254.0.6"
          bgp_peer_ip_range_2 = "169.254.1.5/30"
          bgp_peer_ip_address_2 = "169.254.1.6"
   +    },
   +    ace-test = {
   +      peer_name = "production-ace-test"
   +      peer_vpn_gateway_address = "128.105.144.189"
   +      tunnel_shared_secret = "UMAoZA7blv6gd3IaArDqgK2s0sDB8mlI"
   +      bgp_peer_asn = "65003"
   +      bgp_peer_ip_range_1 = "169.254.0.9/30"
   +      bgp_peer_ip_address_1 = "169.254.0.10"
   +      bgp_peer_ip_range_2 = "169.254.1.9/30"
   +      bgp_peer_ip_address_2 = "169.254.1.10"
        }
      }

.. note::
   Unless you have a specific requirement, set ASN and BGP addresses to the next available values in the map.

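
In the diff above, the previous entry uses `169.254.1.5/30` with peer `169.254.1.6`, and the new entry
uses `169.254.1.9/30` with peer `169.254.1.10`, so each entry takes the next `/30` block. A small
sketch of that arithmetic follows. The function name `next_bgp_pair` is hypothetical, and the helper
does not handle running past the end of the last octet.

.. code-block:: shell

   # Given the GCP-side address of the last /30 in the map (without the /30 suffix),
   # print the range and peer address of the next /30 block.
   next_bgp_pair() {
       prefix=${1%.*}    # first three octets, e.g. 169.254.1
       last=${1##*.}     # last octet, e.g. 5
       echo "bgp_peer_ip_range = \"$prefix.$(( last + 4 ))/30\""
       echo "bgp_peer_ip_address = \"$prefix.$(( last + 5 ))\""
   }

For example, `next_bgp_pair 169.254.1.5` prints the range `169.254.1.9/30` and the peer address `169.254.1.10`.
The ASN still has to be picked by looking at the highest value already present in the map.
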
Create ACE specific configurations
----------------------------------
In this step, we will create a directory under `production` with the same name as ACE,
and add several Terraform configurations and Ansible inventory needed to configure a VPN connection.
Throughout the deployment procedure, this directory will contain all ACE specific configurations.

Run the following commands to auto-generate necessary files under the target ACE directory.

.. code-block:: shell

   $ cd $WORKDIR/aether-pod-configs/tools
   $ cp ace_env /tmp/ace_env
   $ vi /tmp/ace_env
   # Set environment variables

   $ source /tmp/ace_env
   $ make vpn
   Created ../production/ace-test
   Created ../production/ace-test/main.tf
   Created ../production/ace-test/variables.tf
   Created ../production/ace-test/gcp_fw.tf
   Created ../production/ace-test/gcp_ha_vpn.tf
   Created ../production/ace-test/ansible
   Created ../production/ace-test/backend.tf
   Created ../production/ace-test/cluster_val.tfvars
   Created ../production/ace-test/ansible/hosts.ini
   Created ../production/ace-test/ansible/extra_vars.yml

.. attention::
   The predefined templates are tailored to Pronto BOM. You'll need to fix `cluster_val.tfvars` and `ansible/extra_vars.yml`
   when using a different BOM.

Create a review request
-----------------------
.. code-block:: shell

   $ cd $WORKDIR/aether-pod-configs/production
   $ git status
   On branch tools
   Changes not staged for commit:

       modified:   cluster_map.tfvars
       modified:   user_map.tfvars
       modified:   vpn_map.tfvars

   Untracked files:
     (use "git add <file>..." to include in what will be committed)

       ace-test/

   $ git add .
   $ git commit -m "Add test ACE"
   $ git review

Once the review request is accepted and merged,
the CD pipeline will create VPN tunnels on both GCP and the management node.

Verify VPN connection
---------------------
You can verify the VPN connections after a successful post-merge job
by checking the routing table on the management node and pinging one of the central cluster VMs.
Make sure that two tunnel interfaces, `gcp_tunnel1` and `gcp_tunnel2`, exist
and that there are three additional routing entries via one of the tunnel interfaces.

.. code-block:: shell

   # Verify routes
   $ netstat -rn
   Kernel IP routing table
   Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
   0.0.0.0         128.105.144.1   0.0.0.0         UG        0 0          0 eno1
   10.45.128.0     169.254.0.9     255.255.128.0   UG        0 0          0 gcp_tunnel1
   10.52.128.0     169.254.0.9     255.255.128.0   UG        0 0          0 gcp_tunnel1
   10.66.128.0     10.91.0.8       255.255.128.0   UG        0 0          0 eno1
   10.91.0.0       0.0.0.0         255.255.255.0   U         0 0          0 eno1
   10.168.0.0      169.254.0.9     255.255.240.0   UG        0 0          0 gcp_tunnel1
   128.105.144.0   0.0.0.0         255.255.252.0   U         0 0          0 eno1
   169.254.0.8     0.0.0.0         255.255.255.252 U         0 0          0 gcp_tunnel1
   169.254.1.8     0.0.0.0         255.255.255.252 U         0 0          0 gcp_tunnel2

   # Verify ACC VM access
   $ ping 10.168.0.6

   # Verify ACC K8S cluster access
   $ nslookup kube-dns.kube-system.svc.prd.acc.gcp.aetherproject.net 10.52.128.10

You can further verify that the ACE routes are propagated to GCP
by checking the GCP dashboard under **VPC Network > Routes > Dynamic**.

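
The routing-table check described in this section can also be scripted. The sketch below reads
`netstat -rn` output on standard input and succeeds only when both tunnel interfaces appear and at least
three gateway (`UG`) routes go through a tunnel interface. The function name `check_vpn_routes` is
hypothetical, and the column positions assume the `netstat -rn` layout shown above.

.. code-block:: shell

   # Read `netstat -rn` output on stdin.
   # Exit status 0 if gcp_tunnel1 and gcp_tunnel2 both appear and at least
   # three UG routes use one of them; non-zero otherwise.
   check_vpn_routes() {
       awk '
           $NF == "gcp_tunnel1" { t1 = 1 }                       # interface is the last column
           $NF == "gcp_tunnel2" { t2 = 1 }
           $NF ~ /^gcp_tunnel[12]$/ && $4 == "UG" { via++ }      # flags are the fourth column
           END { exit !(t1 && t2 && via >= 3) }
       '
   }

On the management node this would be used as `netstat -rn | check_vpn_routes && echo "VPN routes present"`.
It does not replace the `ping` and `nslookup` checks, which confirm that traffic actually flows through the tunnels.
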
Post VPN setup
--------------
Once you verify the VPN connections, rename the `ansible` directory to `_ansible` to prevent
the Ansible playbook from running again.
Re-running the Ansible playbook does no harm, but it is not recommended.

.. code-block:: shell

   $ cd $WORKDIR/aether-pod-configs/production/$ACE_NAME
   $ mv ansible _ansible
   $ git add .
   $ git commit -m "Mark ansible done for test ACE"
   $ git review

.. _add_ace_to_vpn:

Add another ACE to an existing VPN connection
---------------------------------------------
VPN connections can be shared when there are multiple ACE clusters in a site.
To add an ACE to an existing VPN connection,
you'll have to SSH into the management node and manually update the BIRD configuration.

.. note::

   This step needs improvements in the future.

.. code-block:: shell

   $ sudo vi /etc/bird/bird.conf
   protocol static {
     ...
     route 10.66.128.0/17 via 10.91.0.10;

     # Add routes for the new ACE's K8S cluster IP range via cluster nodes
     # TODO: Configure iBGP peering with Calico nodes and dynamically learn these routes
     route <NEW-ACE-CLUSTER-IP-RANGE> via <SERVER1>;
     route <NEW-ACE-CLUSTER-IP-RANGE> via <SERVER2>;
     route <NEW-ACE-CLUSTER-IP-RANGE> via <SERVER3>;
   }

   filter gcp_tunnel_out {
     # Add the new ACE's K8S cluster IP range and the management subnet if required to the list
     if (net ~ [ 10.91.0.0/24, 10.66.128.0/17, <NEW-ACE-CLUSTER-IP-RANGE> ]) then accept;
     else reject;
   }
   # Save and exit

   $ sudo birdc configure

   # Confirm the static routes are added
   $ sudo birdc show route

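
To avoid typos when editing `bird.conf` by hand, the static route lines for the new ACE can be generated
and then pasted into the `protocol static` block. This is only a convenience sketch. The function name
`bird_static_routes` is hypothetical, and the addresses in the example are made-up placeholders.

.. code-block:: shell

   # Print one BIRD static route line per cluster node.
   # $1 = new ACE cluster IP range, remaining arguments = cluster node IPs
   bird_static_routes() {
       range=$1
       shift
       for node in "$@"; do
           echo "  route $range via $node;"
       done
   }

For example, `bird_static_routes 10.67.128.0/17 10.91.1.8 10.91.1.9 10.91.1.10` prints three `route` lines,
one per node, each ending with the semicolon that BIRD requires.
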