..
   SPDX-FileCopyrightText: © 2020 Open Networking Foundation <support@opennetworking.org>
   SPDX-License-Identifier: Apache-2.0

===============
TOST Deployment
===============

Update aether-pod-configs
=========================

Aether-pod-configs is a git project hosted on **gerrit.opencord.org**. It contains the following materials:

- Terraform scripts that install the TOST applications (ONOS, Stratum, and Telegraf) on Rancher.
- Customized configuration for each application (Helm values).
- Application-specific configuration files, including the ONOS network configuration and the Stratum chassis config.

Here is an example folder structure:

.. code-block:: console

   ╰─$ tree staging/ace-menlo/tost
   staging/ace-menlo/tost
   ├── app_map.tfvars
   ├── backend.tf
   ├── common
   │   ├── main.tf
   │   └── variables.tf
   ├── hostpath.yaml
   ├── main.tf
   ├── onos
   │   ├── app_map.tfvars
   │   ├── backend.tf
   │   ├── main.tf -> ../common/main.tf
   │   ├── onos-netcfg.json
   │   ├── onos-netcfg.json.license
   │   ├── onos.yaml
   │   └── variables.tf -> ../common/variables.tf
   ├── stratum
   │   ├── app_map.tfvars
   │   ├── backend.tf
   │   ├── main.tf -> ../common/main.tf
   │   ├── menlo-staging-leaf-1-chassis-config.pb.txt
   │   ├── menlo-staging-leaf-2-chassis-config.pb.txt
   │   ├── menlo-staging-spine-1-chassis-config.pb.txt
   │   ├── menlo-staging-spine-2-chassis-config.pb.txt
   │   ├── stratum.yaml
   │   ├── tost-dev-chassis-config.pb.txt
   │   └── variables.tf -> ../common/variables.tf
   ├── telegraf
   │   ├── app_map.tfvars
   │   ├── backend.tf
   │   ├── main.tf -> ../common/main.tf
   │   ├── telegraf.yaml
   │   └── variables.tf -> ../common/variables.tf
   └── variables.tf

The **tost** directory contains four sets of Terraform scripts (the root folder plus the **onos**, **stratum**, and **telegraf** folders), and each set manages one service.

Root folder
^^^^^^^^^^^
Terraform reads **app_map.tfvars** to determine which applications to install on Rancher,
and which chart version and custom values to apply to each of them.

The following **app_map.tfvars** installs the **hostpath** storage provisioner.
It pins the Helm chart version to **0.2.9** and uses **hostpath.yaml** as the custom values file.

.. code-block::

   project_name = "tost"
   namespace_name = "tost"

   app_map = {
     hostpath-provisioner = {
       app_name = "hostpath-provisioner"
       target_namespace = "tost"
       catalog_name = "rimusz"
       template_name = "hostpath-provisioner"
       template_version = "0.2.9"
       values_yaml = ["hostpath.yaml"]
     }
   }

The content of **hostpath.yaml** is shown below.
It follows the standard YAML format and customizes the **hostpath** Helm chart.

.. code-block::

   storageClass:
     name: fast-disks

ONOS folder
^^^^^^^^^^^
All files under the **onos** directory are related to the ONOS application.
As mentioned above, **app_map.tfvars** describes the ONOS Helm chart.

In this example, we set the **onos-tost** Helm chart version to **0.1.18** and load **onos.yaml**
as the custom values file.

.. code-block::

   apps = ["onos"]

   app_map = {
     onos = {
       app_name = "onos-tost"
       project_name = "tost"
       target_namespace = "onos-tost"
       catalog_name = "onos"
       template_name = "onos-tost"
       template_version = "0.1.18"
       values_yaml = ["onos.yaml"]
     }
   }

**onos.yaml** customizes the values of the **onos-tost** Helm chart. Pay particular attention to the last section, **config**.

.. code-block:: yaml

   onos-classic:
     image:
       tag: master
       pullPolicy: Always
     replicas: 1
     atomix:
       replicas: 1
     logging:
       config: |
         # Common pattern layout for appenders
         log4j2.stdout.pattern = %d{RFC3339} %-5level [%c{1}] %msg%n%throwable

         # Root logger
         log4j2.rootLogger.level = INFO

         # OSGi appender
         log4j2.rootLogger.appenderRef.PaxOsgi.ref = PaxOsgi
         log4j2.appender.osgi.type = PaxOsgi
         log4j2.appender.osgi.name = PaxOsgi
         log4j2.appender.osgi.filter = *

         # stdout appender
         log4j2.rootLogger.appenderRef.Console.ref = Console
         log4j2.appender.console.type = Console
         log4j2.appender.console.name = Console
         log4j2.appender.console.layout.type = PatternLayout
         log4j2.appender.console.layout.pattern = ${log4j2.stdout.pattern}

         # SSHD logger
         log4j2.logger.sshd.name = org.apache.sshd
         log4j2.logger.sshd.level = INFO

         # Spifly logger
         log4j2.logger.spifly.name = org.apache.aries.spifly
         log4j2.logger.spifly.level = WARN

         # SegmentRouting logger
         log4j2.logger.segmentrouting.name = org.onosproject.segmentrouting
         log4j2.logger.segmentrouting.level = DEBUG

   config:
     server: gerrit.opencord.org
     repo: aether-pod-configs
     folder: staging/ace-menlo/tost/onos
     file: onos-netcfg.json
     netcfgUrl: http://onos-tost-onos-classic-hs.tost.svc:8181/onos/v1/network/configuration
     clusterUrl: http://onos-tost-onos-classic-hs.tost.svc:8181/onos/v1/cluster

Once the **onos-tost** containers are deployed into Kubernetes,
they read the **onos-netcfg.json** file from the **aether-pod-configs** repository.
Change the **folder** value if the file lives in a different location.

**onos-netcfg.json** is environment dependent, so change it to fit your environment.

..
   TODO: Add an example based on the recommended topology

Stratum folder
^^^^^^^^^^^^^^
Stratum uses a directory structure similar to that of ONOS for its Terraform scripts and configuration files.

The custom values file is named **stratum.yaml**.

.. code-block::

   app_map = {
     stratum = {
       app_name = "stratum"
       project_name = "tost"
       target_namespace = "stratum"
       catalog_name = "stratum"
       template_name = "stratum"
       template_version = "0.1.9"
       values_yaml = ["stratum.yaml"]
     }
   }

Like **onos.yaml**, **stratum.yaml** customizes the Stratum Helm chart. Pay attention to the **config** section.

.. code-block:: yaml

   image:
     registry: registry.aetherproject.org
     repository: tost/stratum-bfrt
     tag: 9.2.0-4.14.49
     pullPolicy: Always
     pullSecrets:
       - aether-registry-credential

   extraParams:
     - "-max_log_size=0"
     - '-write_req_log_file=""'
     - '-read_req_log_file=""'
     - "-v=0"
     - "-stderrthreshold=0"
     - "-bf_switchd_background=false"

   nodeSelector:
     node-role.aetherproject.org: switch

   tolerations:
     - effect: NoSchedule
       value: switch
       key: node-role.aetherproject.org

   config:
     server: gerrit.opencord.org
     repo: aether-pod-configs
     folder: staging/ace-onf-menlo/tost/stratum

Stratum has the same deployment workflow as ONOS.
Once it is deployed to Kubernetes, it reads the switch-dependent config files from the aether-pod-configs repo.
The **folder** key gives the relative path of the config files within the repo.

.. attention::

   The switch-dependent config file must be named **${hostname}-chassis-config.pb.txt**.
   For example, if the host name of your Tofino switch is **my-leaf**, name the config file **my-leaf-chassis-config.pb.txt**.

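The naming rule can be checked before a change is submitted for review.
The following is a minimal sketch only: the helper names ``chassis_config_name`` and ``missing_chassis_configs`` are hypothetical and not part of aether-pod-configs.
The only rule taken from this document is the ``${hostname}-chassis-config.pb.txt`` pattern.

.. code-block:: python

   from pathlib import Path


   def chassis_config_name(hostname):
       """Return the chassis config file name expected for a switch host name."""
       return f"{hostname}-chassis-config.pb.txt"


   def missing_chassis_configs(stratum_dir, hostnames):
       """Return the expected file names that are absent from stratum_dir."""
       folder = Path(stratum_dir)
       return [
           chassis_config_name(host)
           for host in hostnames
           if not (folder / chassis_config_name(host)).is_file()
       ]

An empty list from ``missing_chassis_configs`` means every listed switch has a config file with the expected name.
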
..
   TODO: Add an example based on the recommended topology

Telegraf folder
^^^^^^^^^^^^^^^

**app_map.tfvars** specifies the Helm chart version and the file name of the custom Helm values file.

.. code-block::

   apps = ["telegraf"]

   app_map = {
     telegraf = {
       app_name = "telegraf"
       project_name = "tost"
       target_namespace = "telegraf"
       catalog_name = "influxdata"
       template_name = "telegraf"
       template_version = "1.7.23"
       values_yaml = ["telegraf.yaml"]
     }
   }

The **telegraf.yaml** file overrides the Telegraf Helm chart values and is environment dependent.
Pay attention to the **addresses** list in the **inputs** section.
Telegraf reads data from Stratum, so the IP addresses of all Tofino switches must be listed here.
Taking the Menlo staging pod as an example, there are four switches, so we list four IP addresses.

.. code-block:: yaml

   podAnnotations:
     field.cattle.io/workloadMetrics: '[{"path":"/metrics","port":9273,"schema":"HTTP"}]'

   config:
     outputs:
       - prometheus_client:
           metric_version: 2
           listen: ":9273"
     inputs:
       - cisco_telemetry_gnmi:
           addresses:
             - 10.92.1.81:9339
             - 10.92.1.82:9339
             - 10.92.1.83:9339
             - 10.92.1.84:9339
           redial: 10s
       - cisco_telemetry_gnmi.subscription:
           name: stratum_counters
           origin: openconfig-interfaces
           path: /interfaces/interface[name=*]/state/counters
           sample_interval: 5000ns
           subscription_mode: sample

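When a pod has many switches, the **addresses** list can be generated rather than typed by hand.
The following is a minimal sketch: the function name ``gnmi_addresses`` is hypothetical, and the default port 9339 is simply the one used in the example above.
It handles IPv4 addresses only.

.. code-block:: python

   import ipaddress


   def gnmi_addresses(switch_ips, port=9339):
       """Build "ip:port" entries for the Telegraf addresses list."""
       addresses = []
       for ip in switch_ips:
           # Raises ValueError if ip is not a well-formed IPv4 address.
           ipaddress.IPv4Address(ip)
           addresses.append(f"{ip}:{port}")
       return addresses

Each returned entry can be pasted as one item of the **addresses** list in **telegraf.yaml**.
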
Quick recap
^^^^^^^^^^^

To recap, most of the files in the **tost** folder can be copied from existing examples.
However, a few files need extra attention:

- **onos-netcfg.json** in the **onos** folder.
- The chassis configs in the **stratum** folder.
  There should be one chassis config for each switch, named **${hostname}-chassis-config.pb.txt**.
- **telegraf.yaml** in the **telegraf** folder, which needs to be updated with all switch IP addresses.

Double-check these files and make sure they have been updated accordingly.


Create a review request
^^^^^^^^^^^^^^^^^^^^^^^
We also need to create a Gerrit review request, as we did in **Aether Run-Time Deployment**.
Refer to :doc:`Aether Run-Time Deployment <run_time_deployment>` for the steps to create a review request.


Create TOST deployment job in Jenkins
=====================================
There are three major components in the Jenkins system: the Jenkins pipeline, Jenkins Job Builder, and the Jenkins job.

.. note::

   All Jenkins-related files are placed in a `temporary repository <https://github.com/hwchiu/stratum-example/tree/master/pipelines>`_ and will move to another repo once the Aether Jenkins is ready.


Jenkins pipeline
^^^^^^^^^^^^^^^^
The Jenkins pipeline runs the Terraform scripts to install the desired applications into the specified Kubernetes cluster.

Both ONOS and Stratum read their configuration files (network config, chassis config) from aether-pod-configs.
The default git branch is master.
For testing purposes, we also provide two parameters that specify a Gerrit review number and patchset.
They are explained in the next section.

.. note::

   Currently, we do not perform incremental upgrades of the TOST applications.
   Instead, we perform a clean installation.
   In the pipeline script, Terraform destroys all existing resources and then creates them again.

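The clean installation can be pictured as a fixed sequence of Terraform commands.
The following is a sketch under stated assumptions, not the actual pipeline script, which is not shown in this document.
The function names are hypothetical, and the use of ``app_map.tfvars`` as the variable file is assumed from the folder layout above.
The ``-auto-approve`` and ``-var-file`` options are standard Terraform CLI options.

.. code-block:: python

   import subprocess


   def clean_install_commands(var_file="app_map.tfvars"):
       """Return the Terraform commands for a destroy-then-create cycle."""
       return [
           ["terraform", "init"],
           ["terraform", "destroy", "-auto-approve", f"-var-file={var_file}"],
           ["terraform", "apply", "-auto-approve", f"-var-file={var_file}"],
       ]


   def run_clean_install(workdir, var_file="app_map.tfvars"):
       """Run the commands in workdir and stop at the first failure."""
       for command in clean_install_commands(var_file):
           subprocess.run(command, cwd=workdir, check=True)

Because ``destroy`` runs before ``apply``, the application is unavailable between the two steps, which is the trade-off of not doing incremental upgrades.
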
Jenkins jobs
^^^^^^^^^^^^

A Jenkins job is the unit of work in the Jenkins system. A Jenkins job contains the following information:

- Jenkins pipeline
- Parameters for the Jenkins pipeline
- Build trigger
- Source code management

We created one Jenkins job for each TOST component, per Aether edge.
As of today, each edge has four Jenkins jobs (HostPath provisioner, ONOS, Stratum, and Telegraf).

Each Jenkins job takes more than ten parameters, which can be divided into two groups: cluster level and application level.
Here is an example of the supported parameters.

.. image:: images/jenkins-onos-params.png
   :width: 480px

Application level
"""""""""""""""""

- **config_review/config_patchset**: tell the pipeline script to read the ONOS config from the specified
  Gerrit review instead of the HEAD of the branch. This lets developers test a change before it is merged.
- **onos_user/onos_password**: used to log in to the ONOS controller.
  **onos_password** is a key that loads the real password from the Jenkins Credential system.
- **onos_ns**: the namespace in which the secret file for ONOS is installed (to be refactored in the future).
- **git_repo/git_server/git_user/git_password_env**: information about the git repository. **git_password_env** is a key for
  the Jenkins Credential system.

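Gerrit publishes every patchset of a review under a git ref of the form ``refs/changes/<last two digits of the review number>/<review number>/<patchset>``.
The sketch below builds that ref from **config_review** and **config_patchset**.
The function name ``gerrit_change_ref`` is hypothetical, and whether the pipeline fetches the config through this ref is an assumption, since the pipeline script is not shown here.

.. code-block:: python

   def gerrit_change_ref(review, patchset):
       """Return the Gerrit git ref for a given review number and patchset."""
       if review <= 0 or patchset <= 0:
           raise ValueError("review and patchset must be positive integers")
       # Gerrit shards refs by the last two digits of the review number.
       return f"refs/changes/{review % 100:02d}/{review}/{patchset}"

For example, review 12345 with patchset 2 maps to ``refs/changes/45/12345/2``.
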
Cluster level
"""""""""""""
- **gcp_credential**: Google Cloud Platform credential for remote storage, used by Terraform.
- **terraform_dir**: the root directory of the TOST directory.
- **rancher_cluster**: the name of the target Rancher cluster.
- **rancher_api_env**: Rancher credential used by Terraform to access Rancher.
- **k8s_config**: Kubernetes config used to access the remote Kubernetes cluster.

.. note::

   Typically, developers only need to set **config_review** and **config_patchset**. The remaining parameters are managed by Ops.

Jenkins Job Builder (JJB)
^^^^^^^^^^^^^^^^^^^^^^^^^
We prefer to apply IaC (Infrastructure as Code) to everything.
We use JJB (Jenkins Job Builder) to create new Jenkins jobs, including the Jenkins pipeline.
A set of Jenkins jobs needs to be cloned whenever a new edge is deployed.

..
   TODO: Automate Jenkins job creation with JJB once the Aether Jenkins is set up

Trigger TOST deployment in Jenkins
==================================
Ideally, whenever a change is merged into **aether-pod-configs**,
the Jenkins job should be triggered automatically to (re)deploy TOST.
This trigger is still being set up.
For now, we need to trigger the deployment manually by clicking the **Build** button
of each Jenkins job and providing the parameters accordingly.

..
   TODO: Update this once the gerrit trigger is implemented


Troubleshooting
===============

The deployment process involves the following steps:

1. Jenkins job
2. Jenkins pipeline
3. Clone the git repository
4. Execute the Terraform scripts
5. Rancher starts to install the applications
6. The applications are deployed into the Kubernetes cluster
7. ONOS/Stratum read their configuration (network config, chassis config)
8. The pods reach the running state

Taking ONOS as an example, here's what you can do to troubleshoot.

You can see the log messages of the first four steps in the Jenkins console.
If something goes wrong, the status of the Jenkins job turns red.
If Jenkins doesn't report any error, the next step is to go to Rancher's portal
and make sure the Answers are the same as *onos.yaml* in *aether-pod-configs*.