..
   SPDX-FileCopyrightText: © 2020 Open Networking Foundation <support@opennetworking.org>
   SPDX-License-Identifier: Apache-2.0

===============
TOST Deployment
===============

Update aether-pod-configs
=========================

**aether-pod-configs** is a Git project hosted on **gerrit.opencord.org**. It contains the following materials:

- Terraform scripts to install TOST applications on Rancher, including ONOS, Stratum and Telegraf.
- Customized configuration for each application (Helm values).
- Application-specific configuration files, including the ONOS network configuration and the Stratum chassis config.

Here is an example folder structure:

.. code-block:: console

   $ tree staging/ace-menlo/tost
   staging/ace-menlo/tost
   ├── app_map.tfvars
   ├── backend.tf
   ├── deepinsight
   │   ├── README.md
   │   ├── deepinsight-topo.json
   │   └── deepinsight-topo.json.license
   ├── main.tf -> ../../../common/tost/main.tf
   ├── onos
   │   ├── app_map.tfvars
   │   ├── backend.tf
   │   ├── main.tf -> ../../../../common/tost/apps/onos/main.tf
   │   ├── onos-netcfg.json
   │   ├── onos-netcfg.json.license
   │   ├── onos.yaml
   │   └── variables.tf -> ../../../../common/tost/apps/onos/variables.tf
   ├── stratum
   │   ├── app_map.tfvars
   │   ├── backend.tf
   │   ├── main.tf -> ../../../../common/tost/apps/stratum/main.tf
   │   ├── menlo-staging-leaf-1-chassis-config.pb.txt
   │   ├── menlo-staging-leaf-2-chassis-config.pb.txt
   │   ├── menlo-staging-spine-1-chassis-config.pb.txt
   │   ├── menlo-staging-spine-2-chassis-config.pb.txt
   │   ├── stratum.yaml
   │   ├── tost-dev-chassis-config.pb.txt
   │   └── variables.tf -> ../../../../common/tost/apps/stratum/variables.tf
   ├── telegraf
   │   ├── app_map.tfvars
   │   ├── backend.tf
   │   ├── main.tf -> ../../../../common/tost/apps/telegraf/main.tf
   │   ├── telegraf.yaml
   │   └── variables.tf -> ../../../../common/tost/apps/telegraf/variables.tf
   └── variables.tf -> ../../../common/tost/variables.tf

The **tost** directory contains four Terraform scripts, one in the root folder and one each under **onos**, **stratum** and **telegraf**.
Each script manages its own service.

Root folder
^^^^^^^^^^^
Terraform reads **app_map.tfvars** to determine which applications to install on Rancher,
which version of each to use, and which customized values to apply.

Here is an example of **app_map.tfvars**, which defines the prerequisite apps for TOST
as well as the project and namespace in which the TOST apps will be provisioned.
Note that there are currently no prerequisites, so **app_map** is intentionally left empty.
It can be used to specify prerequisites in the future.

.. code-block::

   project_name = "tost"
   namespace_name = "tost"

   app_map = {}

ONOS folder
^^^^^^^^^^^
All files under the **onos** directory are related to the ONOS application.
The **app_map.tfvars** in this folder describes the ONOS Helm chart.

In this example, we set the **onos-tost** Helm chart version to **0.1.18** and load **onos.yaml**
as the custom values file.

.. code-block::

   apps = ["onos"]

   app_map = {
     onos = {
       app_name = "onos-tost"
       project_name = "tost"
       target_namespace = "onos-tost"
       catalog_name = "onos"
       template_name = "onos-tost"
       template_version = "0.1.18"
       values_yaml = ["onos.yaml"]
     }
   }

**onos.yaml** is used to customize the values of the **onos-tost** Helm chart. Pay attention to the last section, **config**.

.. code-block:: yaml

   onos-classic:
     image:
       tag: master
       pullPolicy: Always
     replicas: 1
     atomix:
       replicas: 1
     logging:
       config: |
         # Common pattern layout for appenders
         log4j2.stdout.pattern = %d{RFC3339} %-5level [%c{1}] %msg%n%throwable

         # Root logger
         log4j2.rootLogger.level = INFO

         # OSGi appender
         log4j2.rootLogger.appenderRef.PaxOsgi.ref = PaxOsgi
         log4j2.appender.osgi.type = PaxOsgi
         log4j2.appender.osgi.name = PaxOsgi
         log4j2.appender.osgi.filter = *

         # stdout appender
         log4j2.rootLogger.appenderRef.Console.ref = Console
         log4j2.appender.console.type = Console
         log4j2.appender.console.name = Console
         log4j2.appender.console.layout.type = PatternLayout
         log4j2.appender.console.layout.pattern = ${log4j2.stdout.pattern}

         # SSHD logger
         log4j2.logger.sshd.name = org.apache.sshd
         log4j2.logger.sshd.level = INFO

         # Spifly logger
         log4j2.logger.spifly.name = org.apache.aries.spifly
         log4j2.logger.spifly.level = WARN

         # SegmentRouting logger
         log4j2.logger.segmentrouting.name = org.onosproject.segmentrouting
         log4j2.logger.segmentrouting.level = DEBUG

   config:
     server: gerrit.opencord.org
     repo: aether-pod-configs
     folder: staging/ace-menlo/tost/onos
     file: onos-netcfg.json
     netcfgUrl: http://onos-tost-onos-classic-hs.tost.svc:8181/onos/v1/network/configuration
     clusterUrl: http://onos-tost-onos-classic-hs.tost.svc:8181/onos/v1/cluster

Once the **onos-tost** containers are deployed into Kubernetes,
they read the **onos-netcfg.json** file from the **aether-pod-configs** repo. Change **folder** to a different location if necessary.

**onos-netcfg.json** is environment-dependent; change it to fit your environment.

..
   TODO: Add an example based on the recommended topology

Stratum folder
^^^^^^^^^^^^^^
Stratum uses a directory structure similar to that of ONOS for Terraform and its configuration files.

The custom values file is named **stratum.yaml**.

.. code-block::

   app_map = {
     stratum = {
       app_name = "stratum"
       project_name = "tost"
       target_namespace = "stratum"
       catalog_name = "stratum"
       template_name = "stratum"
       template_version = "0.1.9"
       values_yaml = ["stratum.yaml"]
     }
   }

Like ONOS, **stratum.yaml** is used to customize the Stratum Helm chart. Pay attention to the **config** section.

.. code-block:: yaml

   image:
     registry: registry.aetherproject.org
     repository: tost/stratum-bfrt
     tag: 9.2.0-4.14.49
     pullPolicy: Always
     pullSecrets:
       - aether-registry-credential

   extraParams:
     - "-max_log_size=0"
     - '-write_req_log_file=""'
     - '-read_req_log_file=""'
     - "-v=0"
     - "-stderrthreshold=0"
     - "-bf_switchd_background=false"

   nodeSelector:
     node-role.aetherproject.org: switch

   tolerations:
     - effect: NoSchedule
       value: switch
       key: node-role.aetherproject.org

   config:
     server: gerrit.opencord.org
     repo: aether-pod-configs
     folder: staging/ace-onf-menlo/tost/stratum

Stratum has the same deployment workflow as ONOS.
Once it is deployed to Kubernetes, it reads the switch-dependent config files from the **aether-pod-configs** repo.
The **folder** key gives the relative path of the configs.

.. attention::

   The switch-dependent config file should be named **${hostname}-chassis-config.pb.txt**.
   For example, if the host name of your Tofino switch is **my-leaf**, name the config file **my-leaf-chassis-config.pb.txt**.

..
   TODO: Add an example based on the recommended topology

Telegraf folder
^^^^^^^^^^^^^^^

The **app_map.tfvars** specifies the Helm chart version and the file name of the custom Helm values file.

.. code-block::

   apps = ["telegraf"]

   app_map = {
     telegraf = {
       app_name = "telegraf"
       project_name = "tost"
       target_namespace = "telegraf"
       catalog_name = "influxdata"
       template_name = "telegraf"
       template_version = "1.7.23"
       values_yaml = ["telegraf.yaml"]
     }
   }

**telegraf.yaml** is used to override the values of the Telegraf Helm chart, and its content is environment-dependent.
Pay attention to the **inputs.addresses** section.
Telegraf reads data from Stratum, so the IP addresses of all Tofino switches must be listed here.
Taking the Menlo staging pod as an example, there are four switches, so we fill in four IP addresses.

.. code-block:: yaml

   podAnnotations:
     field.cattle.io/workloadMetrics: '[{"path":"/metrics","port":9273,"schema":"HTTP"}]'

   config:
     outputs:
       - prometheus_client:
           metric_version: 2
           listen: ":9273"
     inputs:
       - cisco_telemetry_gnmi:
           addresses:
             - 10.92.1.81:9339
             - 10.92.1.82:9339
             - 10.92.1.83:9339
             - 10.92.1.84:9339
           redial: 10s
       - cisco_telemetry_gnmi.subscription:
           name: stratum_counters
           origin: openconfig-interfaces
           path: /interfaces/interface[name=*]/state/counters
           sample_interval: 5000ns
           subscription_mode: sample
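
Since every switch contributes one entry to **addresses**, the list can be generated instead of typed by hand.
The following is a minimal sketch and not part of aether-pod-configs: the helper name ``gnmi_addresses`` is hypothetical,
and it assumes IPv4 switch addresses and the port 9339 used in the example above.

.. code-block:: python

   import ipaddress

   GNMI_PORT = 9339  # port taken from the telegraf.yaml example above


   def gnmi_addresses(switch_ips, port=GNMI_PORT):
       """Return "ip:port" strings for the cisco_telemetry_gnmi addresses list."""
       addresses = []
       for ip in switch_ips:
           # Raises a ValueError subclass if ip is not a valid IPv4 address.
           ipaddress.IPv4Address(ip)
           addresses.append(f"{ip}:{port}")
       return addresses

Validating each address before formatting it catches truncated or mistyped IPs before they reach the values file.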


Create Your Own Configs
^^^^^^^^^^^^^^^^^^^^^^^

The easiest way to create your own configs is to run the template script.

Assume we would like to set up the **ace-example** pod in the production environment.

1. Open **tools/ace_env**.
2. Fill out all required variables.
3. Import the environment variables from **tools/ace_env**.
4. Run the make command to generate the configuration and directory for TOST.
5. Update **onos-netcfg.json** for ONOS.
6. Update **${hostname}-chassis-config.pb.txt** for Stratum.
7. Update all switch IPs in **telegraf.yaml**.
8. Commit your change and open the Gerrit patch.

.. code-block:: console

   vim tools/ace_env
   source tools/ace_env
   make -C tools/ tost
   vim production/ace-example/tost/onos/onos-netcfg.json
   vim production/ace-example/tost/stratum/${hostname}-chassis-config.pb.txt
   vim production/ace-example/tost/telegraf/telegraf.yaml
   git add production/ace-example
   git commit
   git review


Quick recap
^^^^^^^^^^^

To recap, most of the files in the **tost** folder can be copied from existing examples.
However, a few files need extra attention:

- **onos-netcfg.json** in the **onos** folder.
- Chassis config in the **stratum** folder.
  There should be one chassis config for each switch, and the file name must be **${hostname}-chassis-config.pb.txt**.
- **telegraf.yaml** in the **telegraf** folder, which needs to be updated with all switch IP addresses.

Double-check these files and make sure they have been updated accordingly.
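
The existence part of this check can be sketched as a small script.
This is an illustrative helper rather than a tool shipped in aether-pod-configs: the function name ``missing_tost_files``
and its arguments are hypothetical, and it only verifies that the files exist, not that their content has been updated.

.. code-block:: python

   from pathlib import Path


   def missing_tost_files(tost_dir, hostnames):
       """Return the expected TOST config files that are absent from tost_dir.

       Paths are returned relative to tost_dir, using forward slashes.
       """
       root = Path(tost_dir)
       expected = [
           Path("onos") / "onos-netcfg.json",
           Path("telegraf") / "telegraf.yaml",
       ]
       # One chassis config per switch, named ${hostname}-chassis-config.pb.txt.
       expected += [
           Path("stratum") / f"{hostname}-chassis-config.pb.txt"
           for hostname in hostnames
       ]
       return [p.as_posix() for p in expected if not (root / p).is_file()]

For example, ``missing_tost_files("production/ace-example/tost", ["my-leaf"])`` returns an empty list
when the ONOS, Telegraf and Stratum files for that pod are all present.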


Create a review request
^^^^^^^^^^^^^^^^^^^^^^^
We also need to create a Gerrit review request, similar to what we did in **Aether Run-Time Deployment**.
Please refer to :doc:`Aether Run-Time Deployment <run_time_deployment>` to create a review request.


Create TOST deployment job in Jenkins
=====================================
There are three major components in the Jenkins system: the Jenkins pipeline, Jenkins Job Builder and the Jenkins job.

We follow the Infrastructure as Code principle and keep all three components in a Git repo, **aether-ci-management**.
Download the **aether-ci-management** repository:

.. code-block:: shell

   $ cd $WORKDIR
   $ git clone "ssh://[username]@gerrit.opencord.org:29418/aether-ci-management"


Here is an example of the folder structure. Everything related to the three major components is under the **jjb** folder.

.. code-block:: console

   $ tree -d jjb
   jjb
   ├── ci-management
   ├── global
   │   ├── jenkins-admin -> ../../global-jjb/jenkins-admin
   │   ├── jenkins-init-scripts -> ../../global-jjb/jenkins-init-scripts
   │   ├── jjb -> ../../global-jjb/jjb
   │   └── shell -> ../../global-jjb/shell
   ├── pipeline
   ├── repos
   ├── shell
   └── templates


Jenkins pipeline
^^^^^^^^^^^^^^^^
The Jenkins pipeline runs the Terraform scripts to install the desired applications into the specified Kubernetes cluster.

Both ONOS and Stratum read their configuration files (network config, chassis config) from **aether-pod-configs**.
The default Git branch is master.
For testing purposes, we also provide two parameters to specify the Gerrit review (change) number and the patchset number.
We explain them in the next section.

.. note::

   Currently, we don't perform incremental upgrades of the TOST applications.
   Instead, we perform a clean installation.
   In the pipeline script, Terraform destroys all existing resources and then creates them again.


We put all pipeline scripts under the **pipeline** directory. The pipeline scripts are written in Groovy.

.. code-block:: console

   $ tree pipeline
   pipeline
   ├── aether-in-a-box.groovy
   ├── artifact-release.groovy
   ├── cd-pipeline-charts-postrelease.groovy
   ├── cd-pipeline-dockerhub-postrelease.groovy
   ├── cd-pipeline-postrelease.groovy
   ├── cd-pipeline-terraform.groovy
   ├── docker-publish.groovy
   ├── ng40-func.groovy
   ├── ng40-scale.groovy
   ├── reuse-scan-gerrit.groovy
   ├── reuse-scan-github.groovy
   ├── tost-onos.groovy
   ├── tost-stratum.groovy
   ├── tost-telegraf.groovy
   └── tost.groovy

Currently, there are four pipeline scripts for TOST deployment:

1. tost-onos.groovy
2. tost-stratum.groovy
3. tost-telegraf.groovy
4. tost.groovy

tost-[onos/stratum/telegraf].groovy each deploy one individual application.
tost.groovy is a high-level script that deploys the whole TOST application
by executing the other three scripts in its pipeline.


Jenkins jobs
^^^^^^^^^^^^

A Jenkins job is the task unit in the Jenkins system. A Jenkins job contains the following information:

- Jenkins pipeline
- Parameters for Jenkins pipeline
- Build trigger
- Source code management

We created one Jenkins job for each TOST component, per Aether edge.
As of today, there are four Jenkins jobs (HostPath provisioner, ONOS, Stratum and Telegraf) for each edge.

There are more than 10 parameters in the Jenkins jobs, and they can be divided into two groups: cluster-level and application-level.
Here is an example of the supported parameters.

.. image:: images/jenkins-onos-params.png
   :width: 480px

Application level
"""""""""""""""""

- **GERRIT_CHANGE_NUMBER/GERRIT_PATCHSET_NUMBER**: tell the pipeline script to read
  the config in the aether-pod-configs repo from the specified Gerrit review instead of the
  HEAD of the default branch. This lets developers test a change before it is merged.
- **onos_user**: used to log in to the ONOS controller.
- **git_repo/git_server/git_user/git_password_env**: information about the Git repository. **git_password_env** is a key in the
  Jenkins Credential system.
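
Gerrit publishes every patchset of a change under a ref of the form
``refs/changes/<last two digits of the change number>/<change number>/<patchset number>``.
The sketch below builds that ref from the two parameters.
Whether the pipeline script constructs the ref in exactly this way is an assumption, and the function name is hypothetical.

.. code-block:: python

   def gerrit_change_ref(change_number: int, patchset_number: int) -> str:
       """Build the Gerrit ref that points at one patchset of a change."""
       if change_number <= 0 or patchset_number <= 0:
           raise ValueError("change and patchset numbers must be positive")
       # The first path component is the change number modulo 100, zero padded.
       shard = change_number % 100
       return f"refs/changes/{shard:02d}/{change_number}/{patchset_number}"

A ref built this way can be passed to ``git fetch`` against the Gerrit remote to check out the change under review.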

Cluster level
"""""""""""""
- **gcp_credential**: Google Cloud Platform credential for remote storage, used by Terraform.
- **terraform_dir**: the root directory of the TOST directory.
- **rancher_cluster**: the target Rancher cluster name.
- **rancher_api_env**: Rancher credential used by Terraform to access Rancher.

.. note::

   Typically, developers only need to care about **GERRIT_CHANGE_NUMBER** and **GERRIT_PATCHSET_NUMBER**. The rest are managed by Ops.

Jenkins Job Builder (JJB)
^^^^^^^^^^^^^^^^^^^^^^^^^

We prefer to apply IaC (Infrastructure as Code) to everything.
We use JJB (Jenkins Job Builder) to create new Jenkins jobs, including the Jenkins pipeline.
We need to clone a set of Jenkins jobs whenever a new edge is deployed.

To provide flexibility and avoid reinventing the wheel, we use job templates to declare jobs.
Thanks to JJB, we can use the parameters in a job template to render different kinds of jobs easily.

All the template files are placed under the **templates** directory.

.. code-block:: console

   $ tree templates
   templates
   ├── aether-in-a-box.yaml
   ├── archive-artifacts.yaml
   ├── artifact-release.yml
   ├── cd-pipeline-terraform.yaml
   ├── docker-publish-github.yaml
   ├── docker-publish.yaml
   ├── helm-lint.yaml
   ├── make-test.yaml
   ├── ng40-nightly.yaml
   ├── ng40-test.yaml
   ├── private-docker-publish.yaml
   ├── private-make-test.yaml
   ├── publish-helm-repo.yaml
   ├── reuse-gerrit.yaml
   ├── reuse-github.yaml
   ├── sync-dir.yaml
   ├── tost.yaml
   ├── verify-licensed.yaml
   └── versioning.yaml


We define all job templates required by TOST in **tost.yaml**. Here is part of its content.

.. code-block:: yaml

   - job-template:
       name: "{name}-onos"
       id: "deploy-onos"
       project-type: pipeline
       dsl: !include-raw-escape: jjb/pipeline/tost-onos.groovy
       triggers:
         - onf-infra-tost-gerrit-trigger:
             gerrit-server-name: '{gerrit-server-name}'
             trigger_command: "apply"
             pattern: "{terraform_dir}/tost/onos/.*"
       logrotate:
         daysToKeep: 7
         numToKeep: 10
         artifactDaysToKeep: 7
         artifactNumToKeep: 10
       parameters:
         - string:
             name: gcp_credential
             default: "{google_bucket_access}"
         - string:
             name: rancher_cluster
             default: "{rancher_cluster}"
         - string:
             name: rancher_api_env
             default: "{rancher_api}"
         - string:
             name: git_repo
             default: "aether-pod-configs"
         - string:
             name: git_server
             default: "gerrit.opencord.org"
         - string:
             name: git_ssh_user
             default: "jenkins"


Once we have the job template, we need to tell JJB to use it to create our own jobs.
This is where the concept of a project comes in: a project defines the job templates you want to use and the values of all parameters.


We put all project YAML files under the **repos** directory. Here is an example:

.. code-block:: console

   $ tree repos
   repos
   ├── aether-helm-charts.yaml
   ├── aether-in-a-box.yaml
   ├── cd-pipeline-terraform.yaml
   ├── ng40-test.yaml
   ├── spgw.yaml
   └── tost.yaml


The following is an example of the TOST projects. We define three projects here, and each project has its own
parameters and the Jenkins jobs it wants to use.

.. code-block:: yaml

   - project:
       name: deploy-menlo-tost-dev
       rancher_cluster: "menlo-tost-dev"
       terraform_dir: "testing/menlo-tost"
       rancher_api: "{rancher_testing_access}"
       jobs:
         - "deploy"
         - "deploy-onos"
         - "deploy-stratum"
         - "deploy-telegraf"
   - project:
       name: deploy-menlo-tost-staging
       rancher_cluster: "ace-menlo"
       terraform_dir: "staging/ace-menlo"
       rancher_api: "{rancher_staging_access}"
       jobs:
         - "deploy"
         - "deploy-onos"
         - "deploy-stratum"
         - "deploy-telegraf"
   - project:
       name: deploy-menlo-production
       rancher_cluster: "ace-menlo"
       terraform_dir: "production/ace-menlo"
       rancher_api: "{rancher_production_access}"
       jobs:
         - "deploy"
         - "deploy-onos"
         - "deploy-stratum"
         - "deploy-telegraf"
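
To illustrate how a project and a job template combine, the sketch below substitutes one project's parameters into the
``name`` and ``pattern`` fields of the **deploy-onos** template shown earlier.
It mimics JJB's placeholder substitution with Python's ``str.format`` and is not JJB itself.
Treating the trigger pattern as a regular expression matched against changed file paths is also an assumption
about the ``onf-infra-tost-gerrit-trigger`` macro, and the helper names are hypothetical.

.. code-block:: python

   import re

   # Fields copied from the deploy-onos job template in tost.yaml.
   ONOS_TEMPLATE = {
       "name": "{name}-onos",
       "pattern": "{terraform_dir}/tost/onos/.*",
   }


   def render_job(template, project):
       """Fill the {placeholders} of a job template with project parameters."""
       return {key: value.format(**project) for key, value in template.items()}


   def triggers_on(job, changed_file):
       """Return True if changed_file fully matches the job's trigger pattern."""
       return re.fullmatch(job["pattern"], changed_file) is not None


   # Parameters taken from the deploy-menlo-tost-staging project above.
   staging = {
       "name": "deploy-menlo-tost-staging",
       "terraform_dir": "staging/ace-menlo",
   }
   onos_job = render_job(ONOS_TEMPLATE, staging)

With the staging project, the rendered job is named ``deploy-menlo-tost-staging-onos`` and its pattern becomes
``staging/ace-menlo/tost/onos/.*``, so a change to ``staging/ace-menlo/tost/onos/onos.yaml`` matches
while a change under the **stratum** folder does not.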


Create Your Own Jenkins Job
^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you don't need to customize the Jenkins pipeline script or the job configuration, the only thing
you need to do is modify **repos/tost.yaml** to add your project.

For example, suppose we would like to deploy TOST to our production pod, named "tost-example".
Add the following content to **repos/tost.yaml**:

.. code-block:: yaml

   - project:
       name: deploy-tost-example-production
       rancher_cluster: "ace-test-example"
       terraform_dir: "production/tost-example"
       rancher_api: "{rancher_production_access}"
       jobs:
         - "deploy"
         - "deploy-onos"
         - "deploy-stratum"
         - "deploy-telegraf"


.. note::

   **terraform_dir** indicates the directory location in the aether-pod-configs repo. Make sure your Terraform scripts
   are already there before running the Jenkins job.


Trigger TOST deployment in Jenkins
==================================
Whenever a change is merged into **aether-pod-configs**,
the Jenkins job should be triggered automatically to (re)deploy TOST.

You can also post the comment **apply** on the Gerrit patch to trigger the Jenkins jobs that deploy TOST.

Troubleshooting
===============

The deployment process involves the following steps:

1. Jenkins job
2. Jenkins pipeline
3. Clone the Git repository
4. Execute the Terraform scripts
5. Rancher starts to install the applications
6. The applications are deployed into the Kubernetes cluster
7. ONOS/Stratum read their configuration (network config, chassis config)
8. The pods become running

Taking ONOS as an example, here's what you can do to troubleshoot.

You can see the log messages of the first four steps in the Jenkins console.
If something goes wrong, the status of the Jenkins job turns red.
If Jenkins doesn't report any error message, the next step is to go to Rancher's portal
and make sure the Answers are the same as *onos.yaml* in *aether-pod-configs*.