Reorganization pass on Aether Docs

Change-Id: I0653109d6fe8d340278580ff5c7758ca264b512e
diff --git a/edge_deployment/tost_deployment.rst b/edge_deployment/tost_deployment.rst
new file mode 100644
index 0000000..2ac0225
--- /dev/null
+++ b/edge_deployment/tost_deployment.rst
@@ -0,0 +1,668 @@
+..
+   SPDX-FileCopyrightText: © 2020 Open Networking Foundation <support@opennetworking.org>
+   SPDX-License-Identifier: Apache-2.0
+
+===============
+TOST Deployment
+===============
+
+Update aether-pod-configs
+=========================
+
+The aether-pod-configs repository is a Git project hosted on **gerrit.opencord.org** and contains the following materials:
+
+- Terraform scripts to install TOST applications on Rancher, including ONOS, Stratum and Telegraf.
+- Customized configuration for each application (Helm values).
+- Application-specific configuration files, including the ONOS network configuration and the Stratum chassis config.
+
+Here is an example folder structure:
+
+.. code-block:: console
+
+   ╰─$ tree staging/ace-menlo/tost
+   staging/ace-menlo/tost
+   ├── app_map.tfvars
+   ├── backend.tf
+   ├── deepinsight
+   │   ├── README.md
+   │   ├── deepinsight-topo.json
+   │   └── deepinsight-topo.json.license
+   ├── main.tf -> ../../../common/tost/main.tf
+   ├── onos
+   │   ├── app_map.tfvars
+   │   ├── backend.tf
+   │   ├── main.tf -> ../../../../common/tost/apps/onos/main.tf
+   │   ├── onos-netcfg.json
+   │   ├── onos-netcfg.json.license
+   │   ├── onos.yaml
+   │   └── variables.tf -> ../../../../common/tost/apps/onos/variables.tf
+   ├── stratum
+   │   ├── app_map.tfvars
+   │   ├── backend.tf
+   │   ├── main.tf -> ../../../../common/tost/apps/stratum/main.tf
+   │   ├── menlo-staging-leaf-1-chassis-config.pb.txt
+   │   ├── menlo-staging-leaf-2-chassis-config.pb.txt
+   │   ├── menlo-staging-spine-1-chassis-config.pb.txt
+   │   ├── menlo-staging-spine-2-chassis-config.pb.txt
+   │   ├── stratum.yaml
+   │   ├── tost-dev-chassis-config.pb.txt
+   │   └── variables.tf -> ../../../../common/tost/apps/stratum/variables.tf
+   ├── telegraf
+   │   ├── app_map.tfvars
+   │   ├── backend.tf
+   │   ├── main.tf -> ../../../../common/tost/apps/telegraf/main.tf
+   │   ├── telegraf.yaml
+   │   └── variables.tf -> ../../../../common/tost/apps/telegraf/variables.tf
+   └── variables.tf -> ../../../common/tost/variables.tf
+
+There are four Terraform scripts inside the **tost** directory, one in the root folder and one in each application folder, and each manages its own set of resources.
+
+Root folder
+^^^^^^^^^^^
+Terraform reads **app_map.tfvars** to determine which applications to install on Rancher,
+and which version and custom values to apply to each of them.
+
+Here is an example of **app_map.tfvars**, which defines the prerequisite apps for TOST
+as well as the project and namespace in which the TOST apps will be provisioned.
+Note that there are currently no prerequisites, so **app_map** is intentionally left empty.
+It can be used to specify prerequisites in the future.
+
+.. code-block::
+
+   project_name     = "tost"
+   namespace_name   = "tost"
+
+   app_map = {}
+
+ONOS folder
+^^^^^^^^^^^
+All files under the **onos** directory are related to the ONOS application.
+The **app_map.tfvars** in this folder describes the ONOS Helm chart.
+
+In this example, we set the **onos-tost** Helm chart version to **0.1.18** and load **onos.yaml**
+as the custom values file.
+
+.. code-block::
+
+   apps = ["onos"]
+
+   app_map = {
+      onos = {
+         app_name         = "onos-tost"
+         project_name     = "tost"
+         target_namespace = "onos-tost"
+         catalog_name     = "onos"
+         template_name    = "onos-tost"
+         template_version = "0.1.18"
+         values_yaml      = ["onos.yaml"]
+      }
+   }
+
+**onos.yaml** is used to customize the **onos-tost** Helm chart values. Pay particular attention to the last section, **config**.
+
+.. code-block:: yaml
+
+   onos-classic:
+      image:
+         tag: master
+         pullPolicy: Always
+      replicas: 1
+      atomix:
+         replicas: 1
+      logging:
+         config: |
+            # Common pattern layout for appenders
+            log4j2.stdout.pattern = %d{RFC3339} %-5level [%c{1}] %msg%n%throwable
+
+            # Root logger
+            log4j2.rootLogger.level = INFO
+
+            # OSGi appender
+            log4j2.rootLogger.appenderRef.PaxOsgi.ref = PaxOsgi
+            log4j2.appender.osgi.type = PaxOsgi
+            log4j2.appender.osgi.name = PaxOsgi
+            log4j2.appender.osgi.filter = *
+
+            # stdout appender
+            log4j2.rootLogger.appenderRef.Console.ref = Console
+            log4j2.appender.console.type = Console
+            log4j2.appender.console.name = Console
+            log4j2.appender.console.layout.type = PatternLayout
+            log4j2.appender.console.layout.pattern = ${log4j2.stdout.pattern}
+
+            # SSHD logger
+            log4j2.logger.sshd.name = org.apache.sshd
+            log4j2.logger.sshd.level = INFO
+
+            # Spifly logger
+            log4j2.logger.spifly.name = org.apache.aries.spifly
+            log4j2.logger.spifly.level = WARN
+
+            # SegmentRouting logger
+            log4j2.logger.segmentrouting.name = org.onosproject.segmentrouting
+            log4j2.logger.segmentrouting.level = DEBUG
+
+      config:
+         server: gerrit.opencord.org
+         repo: aether-pod-configs
+         folder: staging/ace-menlo/tost/onos
+         file: onos-netcfg.json
+         netcfgUrl: http://onos-tost-onos-classic-hs.tost.svc:8181/onos/v1/network/configuration
+         clusterUrl: http://onos-tost-onos-classic-hs.tost.svc:8181/onos/v1/cluster
+
+Once the **onos-tost** containers are deployed into Kubernetes,
+they read the **onos-netcfg.json** file from the **aether-pod-configs** repo. Change the **folder** value to a different location if necessary.
+
+**onos-netcfg.json** is environment dependent, so change it to fit your environment.
+
+..
+   TODO: Add an example based on the recommended topology
+
+Stratum folder
+^^^^^^^^^^^^^^
+Stratum uses a directory structure similar to that of ONOS for Terraform and its configuration files.
+
+The custom values file is named **stratum.yaml**.
+
+.. code-block::
+
+   app_map = {
+      stratum= {
+         app_name         = "stratum"
+         project_name     = "tost"
+         target_namespace = "stratum"
+         catalog_name     = "stratum"
+         template_name    = "stratum"
+         template_version = "0.1.9"
+         values_yaml      = ["stratum.yaml"]
+      }
+   }
+
+As with ONOS, **stratum.yaml** is used to customize the Stratum Helm chart. Pay attention to the **config** section.
+
+.. code-block:: yaml
+
+   image:
+      registry: registry.aetherproject.org
+      repository: tost/stratum-bfrt
+      tag: 9.2.0-4.14.49
+      pullPolicy: Always
+      pullSecrets:
+         - aether-registry-credential
+
+   extraParams:
+      - "-max_log_size=0"
+      - '-write_req_log_file=""'
+      - '-read_req_log_file=""'
+      - "-v=0"
+      - "-stderrthreshold=0"
+      - "-bf_switchd_background=false"
+
+   nodeSelector:
+      node-role.aetherproject.org: switch
+
+   tolerations:
+      - effect: NoSchedule
+        value: switch
+        key: node-role.aetherproject.org
+
+   config:
+      server: gerrit.opencord.org
+      repo: aether-pod-configs
+      folder: staging/ace-onf-menlo/tost/stratum
+
+Stratum has the same deployment workflow as ONOS.
+Once it is deployed to Kubernetes, it reads the switch-dependent config files from the aether-pod-configs repo.
+The **folder** key indicates the relative path of the config files within the repo.
+
+.. attention::
+
+   The switch-dependent config file should be named **${hostname}-chassis-config.pb.txt**.
+   For example, if the host name of your Tofino switch is **my-leaf**, name the config file **my-leaf-chassis-config.pb.txt**.
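+
+As a minimal sketch of this naming rule, the shell function below builds the expected
+file name from a switch host name. The function name ``chassis_config_name`` is
+hypothetical and is not part of aether-pod-configs.
+
+.. code-block:: shell
+
+   # Hypothetical helper: print the chassis config file name for a switch.
+   chassis_config_name() {
+      printf '%s-chassis-config.pb.txt\n' "$1"
+   }
+
+   # Example: a switch whose host name is my-leaf.
+   chassis_config_name my-leaf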
+
+..
+   TODO: Add an example based on the recommended topology
+
+Telegraf folder
+^^^^^^^^^^^^^^^
+
+The **app_map.tfvars** file specifies the Helm chart version and the filename of the custom Helm values file.
+
+.. code-block::
+
+   apps=["telegraf"]
+
+   app_map = {
+      telegraf= {
+         app_name         = "telegraf"
+         project_name     = "tost"
+         target_namespace = "telegraf"
+         catalog_name     = "influxdata"
+         template_name    = "telegraf"
+         template_version = "1.7.23"
+         values_yaml      = ["telegraf.yaml"]
+      }
+   }
+
+The **telegraf.yaml** file is used to override the Telegraf Helm chart values and is environment dependent.
+Pay attention to the **inputs.addresses** section.
+Telegraf reads data from Stratum, so the IP addresses of all Tofino switches must be listed here.
+Taking the Menlo staging pod as an example, there are four switches, so we fill in four IP addresses.
+
+.. code-block:: yaml
+
+   podAnnotations:
+      field.cattle.io/workloadMetrics: '[{"path":"/metrics","port":9273,"schema":"HTTP"}]'
+
+   config:
+      outputs:
+         - prometheus_client:
+            metric_version: 2
+            listen: ":9273"
+      inputs:
+         - cisco_telemetry_gnmi:
+            addresses:
+               - 10.92.1.81:9339
+               - 10.92.1.82:9339
+               - 10.92.1.83:9339
+               - 10.92.1.84:9339
+            redial: 10s
+         - cisco_telemetry_gnmi.subscription:
+            name: stratum_counters
+            origin: openconfig-interfaces
+            path: /interfaces/interface[name=*]/state/counters
+            sample_interval: 5000ns
+            subscription_mode: sample
+
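+The address list above can be generated rather than typed by hand. The sketch below
+assumes the port ``9339`` used in the example, and the function name
+``telegraf_addresses`` is hypothetical.
+
+.. code-block:: shell
+
+   # Hypothetical helper: print one YAML list item per switch IP address.
+   telegraf_addresses() {
+      for ip in "$@"; do
+         printf '%s\n' "- ${ip}:9339"
+      done
+   }
+
+   # Example with the four Menlo staging switches.
+   telegraf_addresses 10.92.1.81 10.92.1.82 10.92.1.83 10.92.1.84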
+
+Create Your Own Configs
+^^^^^^^^^^^^^^^^^^^^^^^
+
+The easiest way to create your own configs is to run the template script.
+
+Assume we would like to set up the **ace-example** pod in the production environment.
+
+1. Open **tools/ace_env**.
+2. Fill out all required variables.
+3. Import the environment variables from **tools/ace_env**.
+4. Run the make command to generate the configuration and directory for TOST.
+5. Update **onos-netcfg.json** for ONOS.
+6. Update **${hostname}-chassis-config.pb.txt** for Stratum.
+7. Update all switch IPs in **telegraf.yaml**.
+8. Commit your change and open the Gerrit patch.
+
+.. code-block:: console
+
+  vim tools/ace_env
+  source tools/ace_env
+  make -C tools/ tost
+  vim production/ace-example/tost/onos/onos-netcfg.json
+  vim production/ace-example/tost/stratum/${hostname}-chassis-config.pb.txt
+  vim production/ace-example/tost/telegraf/telegraf.yaml
+  git add production/ace-example/tost
+  git commit
+  git review
+
+
+Quick recap
+^^^^^^^^^^^
+
+To recap, most of the files in the **tost** folder can be copied from existing examples.
+However, there are a few files that need extra attention.
+
+- **onos-netcfg.json** in the **onos** folder
+- Chassis config in the **stratum** folder.
+  There should be one chassis config for each switch. The file name needs to be
+  **${hostname}-chassis-config.pb.txt**
+- **telegraf.yaml** in the **telegraf** folder needs to be updated with all switch
+  IP addresses
+
+Double check these files and make sure they have been updated accordingly.
+
+
+Create a review request
+^^^^^^^^^^^^^^^^^^^^^^^
+We also need to create a Gerrit review request, similar to what we did in
+the **Aether Runtime Deployment**.
+
+Please refer to :doc:`Aether Runtime Deployment <runtime_deployment>` to
+create a review request.
+
+
+Create TOST deployment job in Jenkins
+=====================================
+There are three major components in the Jenkins system: the Jenkins pipeline,
+Jenkins Job Builder, and the Jenkins job.
+
+We follow the Infrastructure as Code principle and keep all three components
+in a Git repo, ``aether-ci-management``.
+
+Download the ``aether-ci-management`` repository.
+
+.. code-block:: shell
+
+   $ cd $WORKDIR
+   $ git clone "ssh://[username]@gerrit.opencord.org:29418/aether-ci-management"
+
+
+Here is an example of the folder structure. Everything related to the three
+major components is placed under the jjb folder.
+
+.. code-block:: console
+
+   $ tree -d jjb
+   jjb
+   ├── ci-management
+   ├── global
+   │   ├── jenkins-admin -> ../../global-jjb/jenkins-admin
+   │   ├── jenkins-init-scripts -> ../../global-jjb/jenkins-init-scripts
+   │   ├── jjb -> ../../global-jjb/jjb
+   │   └── shell -> ../../global-jjb/shell
+   ├── pipeline
+   ├── repos
+   ├── shell
+   └── templates
+
+
+Jenkins pipeline
+^^^^^^^^^^^^^^^^
+Jenkins pipeline runs the Terraform scripts to install desired applications
+into the specified Kubernetes cluster.
+
+Both ONOS and Stratum read their configuration files (network config, chassis
+config) from aether-pod-configs.
+
+The default Git branch is master.  For testing purposes, we also provide two
+parameters to specify the Gerrit change number and the patchset number.
+
+These parameters are explained in the next section.
+
+.. note::
+
+   Currently, we do not perform incremental upgrades of the TOST applications.
+   Instead, we perform a clean installation.
+   In the pipeline script, Terraform destroys all existing resources and
+   then creates them again.
+
+
+We put all pipeline scripts under the pipeline directory. The pipeline
+scripts are written in Groovy.
+
+.. code-block:: console
+
+   $ tree pipeline
+   pipeline
+   ├── aether-in-a-box.groovy
+   ├── artifact-release.groovy
+   ├── cd-pipeline-charts-postrelease.groovy
+   ├── cd-pipeline-dockerhub-postrelease.groovy
+   ├── cd-pipeline-postrelease.groovy
+   ├── cd-pipeline-terraform.groovy
+   ├── docker-publish.groovy
+   ├── ng40-func.groovy
+   ├── ng40-scale.groovy
+   ├── reuse-scan-gerrit.groovy
+   ├── reuse-scan-github.groovy
+   ├── tost-onos.groovy
+   ├── tost-stratum.groovy
+   ├── tost-telegraf.groovy
+   └── tost.groovy
+
+Currently, we have four pipeline scripts for TOST deployment.
+
+1. tost-onos.groovy
+2. tost-stratum.groovy
+3. tost-telegraf.groovy
+4. tost.groovy
+
+The tost-[onos/stratum/telegraf].groovy scripts each deploy one individual
+application. tost.groovy is a high-level script that deploys the whole TOST
+application by executing the other three scripts from within its own
+pipeline.
+
+
+Jenkins jobs
+^^^^^^^^^^^^
+
+A Jenkins job is the task unit in the Jenkins system. A Jenkins job contains the following information:
+
+- Jenkins pipeline
+- Parameters for Jenkins pipeline
+- Build trigger
+- Source code management
+
+We created one Jenkins job for each TOST component, per Aether edge.
+
+We have four Jenkins jobs (HostPath provisioner, ONOS, Stratum and Telegraf)
+for each edge as of today.
+
+Each Jenkins job takes more than ten parameters, which fall into two
+groups: cluster-level and application-level.
+
+Here is an example of supported parameters.
+
+.. image:: images/jenkins-onos-params.png
+   :width: 480px
+
+Application level
+"""""""""""""""""
+
+- **GERRIT_CHANGE_NUMBER/GERRIT_PATCHSET_NUMBER**: tell the pipeline script to read
+  the config from a specified Gerrit review of the aether-pod-configs repo, instead of
+  the HEAD of the branch. This lets developers test their changes before merging.
+- **onos_user**: used to log in to the ONOS controller.
+- **git_repo/git_server/git_user/git_password_env**: information about the Git
+  repository. **git_password_env** is a key in the Jenkins credential system.
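+
+For reference, Gerrit publishes each patchset under a ref of the form
+``refs/changes/NN/CHANGE/PATCHSET``, where ``NN`` is the last two digits of the change
+number. The sketch below builds that ref from the two Gerrit parameters. It only
+illustrates the Gerrit convention and is not taken from the pipeline script, and the
+function name ``gerrit_ref`` is hypothetical.
+
+.. code-block:: shell
+
+   # Hypothetical helper: print the Gerrit ref for a change number and patchset number.
+   gerrit_ref() {
+      change="$1"
+      patchset="$2"
+      # Gerrit shards refs by the last two digits of the change number.
+      printf 'refs/changes/%02d/%s/%s\n' "$((change % 100))" "$change" "$patchset"
+   }
+
+   # Example usage (the change and patchset numbers are placeholders):
+   # git fetch origin "$(gerrit_ref 12345 2)" && git checkout FETCH_HEAD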
+
+Cluster level
+"""""""""""""
+- **gcp_credential**: Google Cloud Platform credential for remote storage, used
+  by Terraform.
+- **terraform_dir**: The root directory of the TOST directory.
+- **rancher_cluster**: target Rancher cluster name.
+- **rancher_api_env**: Rancher credential to access Rancher, used by Terraform.
+
+.. note::
+
+   Typically, developers only need to care about **GERRIT_CHANGE_NUMBER** and **GERRIT_PATCHSET_NUMBER**. The rest are managed by Ops.
+
+Jenkins Job Builder (JJB)
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+We prefer to apply IaC (Infrastructure as Code) to everything.  We use
+JJB (Jenkins Job Builder) to create new Jenkins jobs, including the Jenkins
+pipeline.  We need to clone a set of Jenkins jobs when a new edge is deployed.
+
+To provide flexibility and avoid reinventing the wheel, we use
+job templates to declare jobs.  With JJB, we can use the
+parameters in a job template to render different kinds of jobs easily.
+
+All the template files are placed under the templates directory.
+
+.. code-block:: console
+
+   ╰─$ tree templates
+   templates
+   ├── aether-in-a-box.yaml
+   ├── archive-artifacts.yaml
+   ├── artifact-release.yml
+   ├── cd-pipeline-terraform.yaml
+   ├── docker-publish-github.yaml
+   ├── docker-publish.yaml
+   ├── helm-lint.yaml
+   ├── make-test.yaml
+   ├── ng40-nightly.yaml
+   ├── ng40-test.yaml
+   ├── private-docker-publish.yaml
+   ├── private-make-test.yaml
+   ├── publish-helm-repo.yaml
+   ├── reuse-gerrit.yaml
+   ├── reuse-github.yaml
+   ├── sync-dir.yaml
+   ├── tost.yaml
+   ├── verify-licensed.yaml
+   └── versioning.yaml
+
+We defined all TOST required job templates in tost.yaml and here is its partial
+content.
+
+.. code-block:: yaml
+
+   - job-template:
+      name: "{name}-onos"
+      id: "deploy-onos"
+      project-type: pipeline
+      dsl: !include-raw-escape: jjb/pipeline/tost-onos.groovy
+      triggers:
+        - onf-infra-tost-gerrit-trigger:
+           gerrit-server-name: '{gerrit-server-name}'
+           trigger_command: "apply"
+           pattern: "{terraform_dir}/tost/onos/.*"
+      logrotate:
+          daysToKeep: 7
+          numToKeep: 10
+          artifactDaysToKeep: 7
+          artifactNumToKeep: 10
+      parameters:
+          - string:
+                name: gcp_credential
+                default: "{google_bucket_access}"
+          - string:
+                name: rancher_cluster
+                default: "{rancher_cluster}"
+          - string:
+                name: rancher_api_env
+                default: "{rancher_api}"
+          - string:
+                name: git_repo
+                default: "aether-pod-configs"
+          - string:
+                name: git_server
+                default: "gerrit.opencord.org"
+          - string:
+                name: git_ssh_user
+                default: "jenkins"
+
+
+
+
+Once we have the job template, we need to tell JJB that we want to use it to create our own jobs.
+This is where the concept of a project comes in: a project defines the job templates you want to use and the values of all parameters.
+
+
+We put all project YAML files under the repos directory. Here is an example:
+
+.. code-block:: console
+
+   ╰─$ tree repos
+   repos
+   ├── aether-helm-charts.yaml
+   ├── aether-in-a-box.yaml
+   ├── cd-pipeline-terraform.yaml
+   ├── ng40-test.yaml
+   ├── spgw.yaml
+   └── tost.yaml
+
+
+The following is an example of TOST projects. We define three projects here, each with its own
+parameters and its own list of Jenkins jobs to use.
+
+.. code-block:: yaml
+
+   - project:
+         name: deploy-menlo-tost-dev
+         rancher_cluster: "menlo-tost-dev"
+         terraform_dir: "testing/menlo-tost"
+         rancher_api: "{rancher_testing_access}"
+         jobs:
+            - "deploy"
+            - "deploy-onos"
+            - "deploy-stratum"
+            - "deploy-telegraf"
+   - project:
+         name: deploy-menlo-tost-staging
+         rancher_cluster: "ace-menlo"
+         terraform_dir: "staging/ace-menlo"
+         rancher_api: "{rancher_staging_access}"
+         jobs:
+            - "deploy"
+            - "deploy-onos"
+            - "deploy-stratum"
+            - "deploy-telegraf"
+   - project:
+         name: deploy-menlo-production
+         rancher_cluster: "ace-menlo"
+         terraform_dir: "production/ace-menlo"
+         rancher_api: "{rancher_production_access}"
+         jobs:
+            - "deploy"
+            - "deploy-onos"
+            - "deploy-stratum"
+            - "deploy-telegraf"
+
+
+Create Your Own Jenkins Job
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If you don't need to customize the Jenkins pipeline script or the job configuration, the only thing
+you need to do is modify repos/tost.yaml to add your project.
+
+For example, suppose we would like to deploy TOST to our production pod, named "tost-example".
+Add the following content to repos/tost.yaml:
+
+.. code-block:: yaml
+
+   - project:
+         name: deploy-tost-example-production
+         rancher_cluster: "ace-test-example"
+         terraform_dir: "production/tost-example"
+         rancher_api: "{rancher_production_access}"
+         jobs:
+            - "deploy"
+            - "deploy-onos"
+            - "deploy-stratum"
+            - "deploy-telegraf"
+
+
+.. note::
+
+   The **terraform_dir** indicates the directory location in the aether-pod-configs repo. Please ensure your Terraform scripts
+   are already there before running the Jenkins job.
+
+
+Trigger TOST deployment in Jenkins
+==================================
+Whenever a change is merged into **aether-pod-configs**,
+the Jenkins job should be triggered automatically to (re)deploy TOST.
+
+You can also post the comment **apply** on the Gerrit patch to trigger the Jenkins jobs that deploy TOST.
+
+
+Verification
+============
+Fabric connectivity should be fully ready at this point.
+We should verify that **all servers**, including compute nodes and the management server,
+have an IP address and are **able to reach each other via the fabric interface** before continuing to the next step.
+
+This can be done simply by running a **ping** command from one server to another server's fabric IP.
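+
+A minimal sketch of such a check is shown below. It loops over a list of fabric IP
+addresses and reports which ones answer a ping. The function name ``check_fabric`` is
+hypothetical, and the addresses in the usage line come from a documentation-only range,
+so replace them with the fabric IPs of your own servers.
+
+.. code-block:: shell
+
+   # Hypothetical helper: ping each address and report the result.
+   # Returns non-zero if at least one address is unreachable.
+   check_fabric() {
+      status=0
+      for ip in "$@"; do
+         if ping -c 3 "$ip" >/dev/null 2>&1; then
+            echo "OK   $ip"
+         else
+            echo "FAIL $ip"
+            status=1
+         fi
+      done
+      return "$status"
+   }
+
+   # Example usage (placeholder addresses):
+   # check_fabric 192.0.2.11 192.0.2.12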
+
+
+Troubleshooting
+===============
+
+The deployment process involves the following steps:
+
+1. Jenkins Job
+2. Jenkins Pipeline
+3. Clone Git Repository
+4. Execute Terraform scripts
+5. Rancher starts to install the applications
+6. Applications are deployed into the Kubernetes cluster
+7. ONOS/Stratum read the configuration (network config, chassis config)
+8. Pods become running
+
+Taking ONOS as an example, here's what you can do to troubleshoot.
+
+You can see the log messages of the first four steps in the Jenkins console.
+If something goes wrong, the status of the Jenkins job will be red.
+If Jenkins doesn't report any error message, the next step is to go to Rancher's portal
+and ensure that the Answers are the same as the *onos.yaml* in *aether-pod-configs*.