Merge "Subscriber Management"
diff --git a/.gitignore b/.gitignore
index 9a7c442..27d92e8 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,2 +1,3 @@
venv-docs
_build
+.vscode
diff --git a/amp/monitoring.rst b/amp/monitoring.rst
new file mode 100644
index 0000000..b9d2ff4
--- /dev/null
+++ b/amp/monitoring.rst
@@ -0,0 +1,91 @@
+..
+ SPDX-FileCopyrightText: © 2021 Open Networking Foundation <support@opennetworking.org>
+ SPDX-License-Identifier: Apache-2.0
+
+Monitoring and Alerts
+=====================
+
+Aether leverages `Prometheus <https://prometheus.io/docs/introduction/overview/>`_ to collect
+and store platform and service metrics, `Grafana <https://grafana.com/docs/grafana/latest/getting-started/>`_
+to visualize metrics over time, and `Alertmanager <https://prometheus.io/docs/alerting/latest/alertmanager/>`_ to
+notify Aether OPs staff of events requiring attention. This monitoring stack runs on each Aether cluster.
+This section describes how an Aether component can "opt in" to the Aether monitoring stack so that its metrics can be
+collected and graphed, and can trigger alerts.
+
+
+Exporting Service Metrics to Prometheus
+---------------------------------------
+An Aether component implements a `Prometheus exporter <https://prometheus.io/docs/instrumenting/writing_exporters/>`_
+to expose its metrics to Prometheus. An exporter provides the current values of a component's
+metrics via HTTP using a simple text format. Prometheus scrapes the exporter's HTTP endpoint and stores the metrics
+in its Time Series Database (TSDB) for querying and analysis. Many `client libraries <https://prometheus.io/docs/instrumenting/clientlibs/>`_
+are available for instrumenting code to export metrics in Prometheus format. If a component's metrics are available
+in some other format, tools like `Telegraf <https://docs.influxdata.com/telegraf>`_ can be used to convert the metrics
+into Prometheus format and export them.
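+
+For illustration, scraping an exporter's HTTP endpoint returns plain text in the Prometheus
+exposition format. The metric names below are hypothetical and are not taken from an actual
+Aether component:
+
+.. code-block:: text
+
+   # HELP upf_packets_total Total number of packets processed by the UPF.
+   # TYPE upf_packets_total counter
+   upf_packets_total{direction="uplink"} 1027043
+   upf_packets_total{direction="downlink"} 987211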
+
+A component that exposes a Prometheus exporter HTTP endpoint via a Service can tell Prometheus to scrape
+this endpoint by defining a
+`ServiceMonitor <https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/user-guides/running-exporters.md>`_
+custom resource. The ServiceMonitor is typically created by the Helm chart that installs the component.
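+
+A minimal ServiceMonitor sketch is shown below. The names and labels are illustrative
+assumptions: the label selector must match the component's Service, and the resource must carry
+whatever labels the cluster's Prometheus is configured to select ServiceMonitors by.
+
+.. code-block:: yaml
+
+   apiVersion: monitoring.coreos.com/v1
+   kind: ServiceMonitor
+   metadata:
+     name: my-component                # hypothetical component name
+     namespace: my-component
+   spec:
+     selector:
+       matchLabels:
+         app: my-component             # must match the Service's labels
+     endpoints:
+       - port: metrics                 # name of the Service port exposing the exporter
+         interval: 30s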
+
+
+Working with Grafana Dashboards
+--------------------------------
+Once the local cluster's Prometheus is collecting a component's metrics, they can be visualized using Grafana
+dashboards. The Grafana instance running on the AMP cluster is able to send queries to the Prometheus
+servers running on all Aether clusters. This means that component metrics can be visualized on the AMP Grafana
+regardless of where the component is actually running.
+
+To create a new Grafana dashboard or modify an existing one, first log in to the AMP Grafana using an account
+with admin privileges. To add a new dashboard, click the **+** at left. To make a copy of an existing dashboard for
+editing, click the **Dashboard Settings** (gear) icon at the upper right of the existing dashboard, and then
+click the **Save as…** button at left.
+
+Next, add panels to the dashboard. Since Grafana can access Prometheus on all the clusters in the environment,
+each cluster is available as a data source. For example, when adding a panel showing metrics collected on the
+ace-menlo cluster, choose ace-menlo as the data source.
+
+Clicking the floppy disk icon at the top saves the dashboard only *temporarily* (the dashboard is not
+saved to persistent storage and is deleted as soon as Grafana is restarted). To save the dashboard *permanently*,
+click the **Share Dashboard** icon next to the title and save its JSON to a file. Then add the file to the
+aether-app-configs repository so that it will be deployed by Fleet:
+
+* Change to directory ``aether-app-configs/infrastructure/rancher-monitoring/overlays/<amp-cluster>/``
+* Copy the dashboard JSON file to the ``dashboards/`` sub-directory
+* Edit ``kustomization.yaml`` and add the new dashboard JSON under ``configMapGenerator``, as sketched below
+* Commit the changes and submit a patchset to Gerrit
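+
+A sketch of the corresponding ``kustomization.yaml`` entry is shown below; the dashboard and file
+names are illustrative, and any additional generator options should follow the entries already
+present in the file:
+
+.. code-block:: yaml
+
+   configMapGenerator:
+     - name: my-component-dashboard            # hypothetical dashboard name
+       files:
+         - dashboards/my-component-dashboard.json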
+
+Once the patchset is merged, the AMP Grafana will automatically detect and deploy the new dashboard.
+
+Adding Service-specific Alerts
+------------------------------
+An alert can be triggered in Prometheus when a component metric crosses a threshold. The Alertmanager
+then routes the alert to one or more receivers (e.g., an email address or Slack channel).
+
+To add an alert for a component, create a
+`PrometheusRule <https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/user-guides/alerting.md>`_
+custom resource, for example in the Helm chart that deploys the component. This resource describes one or
+more `rules <https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/>`_ using Prometheus expressions;
+if the expression is true for the time indicated, then the alert is raised. Once the PrometheusRule
+resource is instantiated, the cluster's Prometheus will pick up the rule and start evaluating it.
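+
+A minimal PrometheusRule sketch is shown below. The group, metric, threshold, and labels are
+illustrative assumptions and must be adapted to the component and to whatever labels the
+cluster's Prometheus uses to discover rules:
+
+.. code-block:: yaml
+
+   apiVersion: monitoring.coreos.com/v1
+   kind: PrometheusRule
+   metadata:
+     name: my-component-rules                  # hypothetical name
+   spec:
+     groups:
+       - name: my-component
+         rules:
+           - alert: MyComponentHighErrorRate   # hypothetical alert
+             expr: rate(my_component_errors_total[5m]) > 0.1
+             for: 10m
+             labels:
+               severity: warning
+             annotations:
+               summary: "my-component error rate has exceeded 0.1/s for 10 minutes"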
+
+The Alertmanager is configured to send alerts with *critical* or *warning* severity to e-mail and Slack channels
+monitored by Aether OPs staff. To route a specific alert to a different receiver
+(e.g., a component-specific Slack channel), you need to change the Alertmanager configuration, which is stored in
+a `SealedSecret <https://github.com/bitnami-labs/sealed-secrets>`_ custom resource in the aether-app-configs repository.
+To update the configuration:
+
+* Change to directory ``aether-app-configs/infrastructure/rancher-monitoring/overlays/<cluster>/``
+* Update the ``receivers`` and ``route`` sections of the ``alertmanager-config.yaml`` file (see the sketches after this list)
+* Encode the ``alertmanager-config.yaml`` file as a Base64 string
+* Create a file ``alertmanager-config-secret.yaml`` that defines the Secret resource using the Base64-encoded string
+* Run the following command using a valid ``PUBLICKEY``:
+
+.. code-block:: shell
+
+ $ kubeseal --cert "${PUBLICKEY}" --scope cluster-wide --format yaml < alertmanager-config-secret.yaml > alertmanager-config-sealed-secret.yaml
+
+* Commit the changes and submit a patchset to Gerrit
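+
+As a rough sketch, the ``route`` and ``receivers`` additions to ``alertmanager-config.yaml`` might
+look like the following; the matcher, receiver name, channel, and webhook URL are hypothetical:
+
+.. code-block:: yaml
+
+   route:
+     routes:
+       - match:
+           app: my-component                   # hypothetical alert label to match on
+         receiver: my-component-slack
+   receivers:
+     - name: my-component-slack
+       slack_configs:
+         - api_url: https://hooks.slack.com/services/XXX/YYY/ZZZ   # placeholder webhook
+           channel: '#my-component-alerts'
+
+The intermediate ``alertmanager-config-secret.yaml`` wraps the Base64-encoded configuration in a
+Secret. The resource name, namespace, and data key below are assumptions and must match what the
+monitoring chart deployed on the cluster expects:
+
+.. code-block:: yaml
+
+   apiVersion: v1
+   kind: Secret
+   metadata:
+     name: alertmanager-rancher-monitoring-alertmanager   # assumed Secret name
+     namespace: cattle-monitoring-system                  # assumed namespace
+   data:
+     alertmanager.yaml: <Base64-encoded alertmanager-config.yaml>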
+
+Once the patchset is merged, verify that the SealedSecret was successfully unsealed and converted to a Secret
+by looking at the logs of the *sealed-secrets-controller* pod running on the cluster in the *kube-system* namespace.
diff --git a/edge_deployment/bess_upf_deployment.rst b/edge_deployment/bess_upf_deployment.rst
new file mode 100644
index 0000000..fb8e101
--- /dev/null
+++ b/edge_deployment/bess_upf_deployment.rst
@@ -0,0 +1,143 @@
+..
+ SPDX-FileCopyrightText: © 2021 Open Networking Foundation <support@opennetworking.org>
+ SPDX-License-Identifier: Apache-2.0
+
+BESS UPF Deployment
+===================
+
+This section describes how to configure and deploy BESS UPF.
+
+
+Network Settings
+----------------
+
+BESS UPF requires three networks, **enb**, **access**, and **core**, and all
+three networks must use different subnets. To illustrate, the following example ACE environment
+is used throughout the rest of this guide.
+
+.. image:: images/bess-upf-example-network.png
+
++-----------+-----------+------------------------------------+-------------------+---------------+
+| Network | VLAN | Subnet | Interface | IP address |
++-----------+-----------+------------------------------------+-------------------+---------------+
+| enb | 2 | 192.168.2.0/24 (gw: 192.168.2.254) | mgmt server vlan2 | 192.168.2.254 |
+| | | +-------------------+---------------+
+| | | | enb | 192.168.2.10 |
++-----------+-----------+------------------------------------+-------------------+---------------+
+| access | 3 | 192.168.3.0/24 (gw: 192.168.3.254) | mgmt server vlan3 | 192.168.3.254 |
+| | | +-------------------+---------------+
+| | | | upf access | 192.168.3.1 |
++-----------+-----------+------------------------------------+-------------------+---------------+
+| core | 4 | 192.168.4.0/24 (gw: 192.168.4.254) | mgmt server vlan4 | 192.168.4.254 |
+| | | +-------------------+---------------+
+| | | | upf core | 192.168.4.1 |
++-----------+-----------+------------------------------------+-------------------+---------------+
+
+.. note::
+
+ The management plane and out-of-band network are not depicted in the diagram.
+
+
+Note that the management server has the only externally routable address and acts as a router for
+all networks in the Aether pod.
+So in order for UEs to access the Internet, two things need to be done on the management server
+(a sketch follows the list below).
+
+* For outgoing traffic, masquerade the internal address with the external address of the management server.
+* For response traffic to the UEs, forward it to the UPF's **core** interface.
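+
+A minimal sketch of these two steps, assuming the example addresses above, that ``eno1`` is the
+management server's external interface, and that ``<UE-POOL-SUBNET>`` stands for the UE address
+pool routed by the UPF (all of which must be replaced with the actual values):
+
+.. code-block:: shell
+
+   # Masquerade outgoing traffic from the internal networks behind the external address
+   $ sudo iptables -t nat -A POSTROUTING -s 192.168.0.0/16 -o eno1 -j MASQUERADE
+
+   # Forward return traffic destined for the UE pool to the UPF's core interface
+   $ sudo ip route add <UE-POOL-SUBNET> via 192.168.4.1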
+
+
+Check Cluster Resources
+-----------------------
+
+Before proceeding with the deployment, make sure the cluster has enough resources to run BESS UPF.
+
+* 2 dedicated cores (``"cpu"``)
+* 2 1Gi HugePages (``"hugepages-1Gi"``)
+* 2 SRIOV Virtual Functions bound to **vfio-pci** driver (``"intel.com/intel_sriov_vfio"``)
+
+Strictly speaking, these resources are not mandatory to run BESS UPF, but they are recommended for best performance.
+You can use the following command to check the allocatable resources on the cluster nodes.
+
+.. code-block:: shell
+
+ $ kubectl get nodes -o json | jq '.items[].status.allocatable'
+ {
+ "cpu": "95",
+ "ephemeral-storage": "1770223432846",
+ "hugepages-1Gi": "32Gi",
+ "intel.com/intel_sriov_netdevice": "32",
+ "intel.com/intel_sriov_vfio": "32",
+ "memory": "360749956Ki",
+ "pods": "110"
+ }
+
+
+Configure and Deploy
+--------------------
+
+Download ``aether-app-configs`` if you don't already have it on your development machine.
+
+.. code-block:: shell
+
+ $ cd $WORKDIR
+ $ git clone "ssh://[username]@gerrit.opencord.org:29418/aether-app-configs"
+
+Move to the ``apps/bess-upf/upf1`` directory and create a Helm values file for the new cluster as shown below.
+Don't forget to replace the IP addresses in the example configuration with the actual addresses of the cluster.
+
+.. code-block:: yaml
+
+ $ cd $WORKDIR/aether-app-configs/apps/bess-upf/upf1
+ $ mkdir overlays/prd-ace-test
+ $ vi overlays/prd-ace-test/values.yaml
+ # SPDX-FileCopyrightText: 2020-present Open Networking Foundation <info@opennetworking.org>
+
+ config:
+ upf:
+ enb:
+ subnet: "192.168.2.0/24"
+ access:
+ ip: "192.168.3.1/24"
+ gateway: "192.168.3.254"
+ vlan: 3
+ core:
+ ip: "192.168.4.1/24"
+ gateway: "192.168.4.254"
+ vlan: 4
+ # Below is required only when connecting to 5G core
+ cfgFiles:
+ upf.json:
+ cpiface:
+ dnn: "8internet"
+ hostname: "upf"
+
+
+Update ``fleet.yaml`` in the same directory to let Fleet use the custom configuration when deploying
+BESS UPF to the new cluster.
+
+.. code-block:: yaml
+
+ $ vi fleet.yaml
+ # add following block at the end
+ - name: prd-ace-test
+ clusterSelector:
+ matchLabels:
+ management.cattle.io/cluster-display-name: ace-test
+ helm:
+ valuesFiles:
+ - overlays/prd-ace-test/values.yaml
+
+
+Submit your changes.
+
+.. code-block:: shell
+
+ $ cd $WORKDIR/aether-app-configs
+ $ git status
+ $ git add .
+ $ git commit -m "Add BESS UPF configs for test ACE"
+ $ git review
+
+
+Go to the Fleet dashboard and wait until the cluster status becomes **Active**.
+It can take up to a minute for Fleet to fetch the configuration updates.
diff --git a/edge_deployment/images/bess-upf-example-network.png b/edge_deployment/images/bess-upf-example-network.png
new file mode 100644
index 0000000..c6fff20
--- /dev/null
+++ b/edge_deployment/images/bess-upf-example-network.png
Binary files differ
diff --git a/edge_deployment/images/fleet-move-workspace.png b/edge_deployment/images/fleet-move-workspace.png
new file mode 100644
index 0000000..accfb6d
--- /dev/null
+++ b/edge_deployment/images/fleet-move-workspace.png
Binary files differ
diff --git a/edge_deployment/runtime_deployment.rst b/edge_deployment/runtime_deployment.rst
index c97f55d..7b3e700 100644
--- a/edge_deployment/runtime_deployment.rst
+++ b/edge_deployment/runtime_deployment.rst
@@ -2,19 +2,35 @@
SPDX-FileCopyrightText: © 2020 Open Networking Foundation <support@opennetworking.org>
SPDX-License-Identifier: Apache-2.0
-Aether Runtime Deployment
-=========================
+Runtime Deployment
+==================
-This section describes how to install Aether edge runtime and Aether managed
-applications.
+This section describes how to install and configure the Aether Edge Runtime, including Kubernetes
+and the system-level applications listed below.
-We will be using GitOps based Aether CD pipeline for this, so we just need to
-create a patch to **aether-pod-configs** repository.
+* sealed-secrets
+* rancher-monitoring
+* fluent-bit
+* opendistro-es
+* hostpath-provisioner
+* edge-maintenance-agent
+* sriov-device-plugin
+* uedns
-Download aether-pod-configs repository
---------------------------------------
+We will be using GitOps-based CI/CD for this, so all you need to do is
+create patches in the Aether GitOps repositories, **aether-pod-configs** and **aether-app-configs**,
+to provide the cluster configurations to the CI/CD system.
-Download the ``aether-pod-configs`` repository if you don't have it already in
+.. attention::
+
+ If you skipped the VPN bootstrap step and didn't add the deployment jobs for the new edge,
+ go to the :ref:`Add deployment jobs <add_deployment_jobs>` step and finish it
+ before proceeding.
+
+K8S cluster deployment
+----------------------
+
+Download the ``aether-pod-configs`` repository if you don't already have it on
your development machine.
.. code-block:: shell
@@ -22,135 +38,206 @@
$ cd $WORKDIR
$ git clone "ssh://[username]@gerrit.opencord.org:29418/aether-pod-configs"
-Update global resource maps
----------------------------
-
.. attention::
- Skip this section if you have already done the same step in the
- :ref:`Update Global Resources Map for VPN <update_global_resource>` section.
+ If you skipped the VPN bootstrap step and didn't update the global resource maps for the new edge,
+ go to the :ref:`Update global resource maps <update_global_resource>` step and
+ finish the ``cluster_map.tfvars`` and ``user_map.tfvars`` updates before proceeding.
-Add a new ACE information at the end of the following global resource maps.
-
-* user_map.tfvars
-* cluster_map.tfvars
-
-As a note, you can find several other global resource maps under the
-`production` directory. Resource definitions that need to be shared among
-clusters or are better managed in a single file to avoid configuration
-conflicts are maintained in this way.
-
-.. code-block:: diff
-
- $ cd $WORKDIR/aether-pod-configs/production
- $ vi user_map.tfvars
-
- # Add the new cluster admin user at the end of the map
- $ git diff user_map.tfvars
- --- a/production/user_map.tfvars
- +++ b/production/user_map.tfvars
- @@ user_map = {
- username = "menlo"
- password = "changeme"
- global_roles = ["user-base", "catalogs-use"]
- + },
- + test_admin = {
- + username = "test"
- + password = "changeme"
- + global_roles = ["user-base", "catalogs-use"]
- }
- }
-
-.. code-block:: diff
-
- $ cd $WORKDIR/aether-pod-configs/production
- $ vi cluster_map.tfvars
-
- # Add the new K8S cluster information at the end of the map
- $ git diff cluster_map.tfvars
- --- a/production/cluster_map.tfvars
- +++ b/production/cluster_map.tfvars
- @@ cluster_map = {
- kube_dns_cluster_ip = "10.53.128.10"
- cluster_domain = "prd.menlo.aetherproject.net"
- calico_ip_detect_method = "can-reach=www.google.com"
- + },
- + ace-test = {
- + cluster_name = "ace-test"
- + management_subnets = ["10.91.0.0/24"]
- + k8s_version = "v1.18.8-rancher1-1"
- + k8s_pod_range = "10.66.0.0/17"
- + k8s_cluster_ip_range = "10.66.128.0/17"
- + kube_dns_cluster_ip = "10.66.128.10"
- + cluster_domain = "prd.test.aetherproject.net"
- + calico_ip_detect_method = "can-reach=www.google.com"
- }
- }
- }
-
-You'll have to get this change merged before proceeding.
+Run the following commands to automatically generate Terraform configurations needed to
+create a new cluster in `Rancher <https://rancher.aetherproject.org>`_ and add the servers
+and switches to the cluster.
.. code-block:: shell
- $ git status
- On branch tools
- Changes not staged for commit:
-
- modified: cluster_map.tfvars
- modified: user_map.tfvars
-
- $ git add .
- $ git commit -m "Add test ACE"
- $ git review
-
-Create runtime configurations
------------------------------
-
-In this step, we will add several Terraform configurations and overriding
-values for the managed applications.
-
-Run the following commands to auto-generate necessary files under the target
-ACE directory.
-
-.. code-block:: shell
-
+ # Create ace_config.yaml file if you haven't yet
$ cd $WORKDIR/aether-pod-configs/tools
- $ cp ace_env /tmp/ace_env
- $ vi /tmp/ace_env
- # Set environment variables
+ $ cp ace_config.yaml.example ace_config.yaml
+ $ vi ace_config.yaml
+ # Set all values
- $ source /tmp/ace_env
$ make runtime
- Created ../production/ace-test
- Created ../production/ace-test/main.tf
- Created ../production/ace-test/variables.tf
- Created ../production/ace-test/gcp_fw.tf
+ Created ../production/ace-test/provider.tf
Created ../production/ace-test/cluster.tf
- Created ../production/ace-test/alerts.tf
+ Created ../production/ace-test/rke-bare-metal.tf
+ Created ../production/ace-test/addon-manifests.yml.tpl
+ Created ../production/ace-test/project.tf
+ Created ../production/ace-test/member.tf
Created ../production/ace-test/backend.tf
Created ../production/ace-test/cluster_val.tfvars
- Created ../production/ace-test/app_values
- Created ../production/ace-test/app_values/ace-coredns.yml
- Created ../production/ace-test/app_values/omec-upf-pfcp-agent.yml
-Create a review request
------------------------
+Create a review request.
.. code-block:: shell
$ cd $WORKDIR/aether-pod-configs
- $ git status
-
- Untracked files:
- (use "git add <file>..." to include in what will be committed)
-
- production/ace-test/alerts.tf
- production/ace-test/app_values/
- production/ace-test/cluster.tf
-
$ git add .
$ git commit -m "Add test ACE runtime configs"
$ git review
-Once the review request is accepted and merged,
-CD pipeline will start to deploy K8S and Aether managed applications on it.
+Once your review request is accepted and merged, the Aether CI/CD system starts to deploy K8S.
+Wait until the cluster status changes to **Active** in `Rancher <https://rancher.aetherproject.org>`_.
+It normally takes 10 to 15 minutes, depending on how fast the edge can download the
+container images.
+
+System Application Deployment
+-----------------------------
+
+For the system application deployment, we will be using Rancher's built-in GitOps tool, **Fleet**.
+Fleet uses a git repository as a single source of truth to manage applications in the clusters.
+For Aether, **aether-app-configs** is the repository where all Aether applications
+are defined.
+
+Most of the Aether system applications do not require cluster-specific configurations,
+except **rancher-monitoring** and **uedns**.
+For these two applications, you will have to manually create custom configurations and
+commit them to aether-app-configs.
+
+First, download ``aether-app-configs`` if you don't already have it on your development machine.
+
+.. code-block:: shell
+
+ $ cd $WORKDIR
+ $ git clone "ssh://[username]@gerrit.opencord.org:29418/aether-app-configs"
+
+Configure rancher-monitoring
+############################
+
+Open ``fleet.yaml`` under ``infrastructure/rancher-monitoring``, add a custom target
+with the new cluster name as a selector, and provide the cluster-specific Helm values and
+kustomize overlay directory path as shown below.
+
+.. code-block:: yaml
+
+ $ cd $WORKDIR/aether-app-configs/infrastructure/rancher-monitoring
+ $ vi fleet.yaml
+ # add following block at the end
+ - name: ace-test
+ clusterSelector:
+ matchLabels:
+ management.cattle.io/cluster-display-name: ace-test
+ helm:
+ values:
+ prometheus:
+ prometheusSpec:
+ additionalAlertRelabelConfigs:
+ - source_labels: [__address__]
+ target_label: cluster
+ replacement: ace-test
+ kustomize:
+ dir: overlays/prd-ace
+
+.. note::
+
+ The step above will not be required in Rancher v2.6, which supports using cluster labels as Helm values in a list.
+
+Configure ue-dns
+################
+
+For UE-DNS, you need to create a Helm values file for the new cluster.
+You'll need the cluster domain and the kube-dns ClusterIP, both of which can be found in
+``aether-pod-configs/production/cluster_map.tfvars``.
+Be sure to replace the ``[ ]`` placeholders in the example configuration below with the actual cluster values.
+
+.. code-block:: yaml
+
+ $ cd $WORKDIR/aether-app-configs/infrastructure/coredns
+ $ mkdir overlays/prd-ace-test
+ $ vi overlays/prd-ace-test/values.yaml
+ # SPDX-FileCopyrightText: 2021-present Open Networking Foundation <info@opennetworking.org>
+
+ serviceType: ClusterIP
+ service:
+ clusterIP: [next address of the kube-dns ip]
+ servers:
+ - zones:
+ - zone: .
+ port: 53
+ plugins:
+ - name: errors
+ - name: health
+ configBlock: |-
+ lameduck 5s
+ - name: ready
+ - name: prometheus
+ parameters: 0.0.0.0:9153
+ - name: forward
+ parameters: . /etc/resolv.conf
+ - name: cache
+ parameters: 30
+ - name: loop
+ - name: reload
+ - name: loadbalance
+ - zones:
+ - zone: aetherproject.net
+ port: 53
+ plugins:
+ - name: errors
+ - name: rewrite continue
+ configBlock: |-
+ name regex (.*)\.aetherproject.net {1}.svc.[cluster domain]
+ answer name (.*)\.svc\.[cluster domain] {1}.aetherproject.net
+ - name: forward
+ parameters: . [kube-dns ip]
+ configBlock: |-
+ except kube-system.svc.[cluster domain] aether-sdcore.svc.[cluster domain] tost.svc.[cluster domain]
+ - name: cache
+ parameters: 30
+
+
+Next, update ``fleet.yaml`` under ``infrastructure/coredns`` so that Fleet can use the custom configuration
+you just created when deploying UE-DNS to the cluster.
+
+.. code-block:: yaml
+
+ $ cd $WORKDIR/aether-app-configs/infrastructure/coredns
+ $ vi fleet.yaml
+ # add following block at the end
+ - name: prd-ace-test
+ clusterSelector:
+ matchLabels:
+ management.cattle.io/cluster-display-name: ace-test
+ helm:
+ valuesFiles:
+ - overlays/prd-ace-test/values.yaml
+
+
+Submit your changes.
+
+.. code-block:: shell
+
+ $ cd $WORKDIR/aether-app-configs
+ $ git status
+ $ git add .
+ $ git commit -m "Add test ACE application configs"
+ $ git review
+
+
+Assign Fleet workspace
+######################
+
+By default, all new clusters are assigned to a default Fleet workspace called **fleet-default**.
+To make a cluster part of Aether and have the applications defined in aether-app-configs deployed,
+you must assign the cluster to either the **aether-stable** or **aether-alpha** workspace.
+Assign clusters that expect minimal downtime to **aether-stable**, and clusters used for
+development or for previewing an upcoming release to **aether-alpha**.
+
+Log in to `Rancher <https://rancher.aetherproject.org>`_ as the ``admin`` or ``onfadmin`` user
+and go to the **Cluster Explorer**.
+In the top-left dropdown menu, click **Cluster Explorer > Continuous Delivery**.
+
+.. image:: images/fleet-move-workspace.png
+
+
+1) Click the second dropdown menu from the left at the top and select **fleet-default**.
+2) Select **Clusters** in the left menu and you'll see the new cluster.
+3) Click the checkbox in front of the cluster name.
+4) Click the **Assign to...** button and assign the cluster to the Aether workspace.
+
+Switch to the Aether workspace, click **Clusters** in the left menu, and confirm that the
+new cluster exists.
+Wait until the cluster state becomes **Active**.
+
+.. attention::
+
+ Ignore any BESS UPF failures at this point.
diff --git a/edge_deployment/vpn_bootstrap.rst b/edge_deployment/vpn_bootstrap.rst
index 223e3d8..7c57367 100644
--- a/edge_deployment/vpn_bootstrap.rst
+++ b/edge_deployment/vpn_bootstrap.rst
@@ -5,19 +5,65 @@
VPN Bootstrap
=============
-This section walks you through how to set up a VPN between ACE and Aether
-Central in GCP. We will be using GitOps based Aether CD pipeline for this, so
-we just need to create a patch to **aether-pod-configs** repository. Note that
-some of the steps described here are not directly related to setting up a VPN,
+This section guides you through setting up a VPN connection between Aether Central in GCP and ACE.
+We will be using the GitOps-based Aether CI/CD system for this, so all you need to do is
+create a patch to the Aether GitOps repository, **aether-pod-configs**, with the edge-specific information.
+Note that some of the steps described here are not directly related to setting up a VPN,
but rather are a prerequisite for adding a new ACE.
-.. attention::
+.. _add_deployment_jobs:
- If you are adding another ACE to an existing VPN connection, go to
- :ref:`Add ACE to an existing VPN connection <add_ace_to_vpn>`
+Add deployment jobs
+-------------------
+First, you need to add Jenkins jobs to the Aether CI/CD system that build and apply infrastructure change
+plans for the new edge. This is done by creating a patch to the **aether-ci-management** repository.
-Before you begin
-----------------
+Download the **aether-ci-management** repository.
+
+.. code-block:: shell
+
+ $ cd $WORKDIR
+ $ git clone "ssh://[username]@gerrit.opencord.org:29418/aether-ci-management"
+
+Add the jobs for the new cluster at the end of the ``cd-pipeline-terraform-ace`` project job list.
+Make sure to add both pre-merge and post-merge jobs.
+Note that the cluster name specified here will be used throughout the rest of the deployment procedure.
+
+.. code-block:: diff
+
+ $ cd $WORKDIR/aether-ci-management
+ $ vi jjb/repos/cd-pipeline-terraform.yaml
+
+ # Add jobs for the new cluster
+ $ git diff jjb/repos/cd-pipeline-terraform.yaml
+ --- a/jjb/repos/cd-pipeline-terraform.yaml
+ +++ b/jjb/repos/cd-pipeline-terraform.yaml
+ @@ -227,3 +227,9 @@
+ - 'cd-pipeline-terraform-postmerge-cluster':
+ pod: 'production'
+ cluster: 'ace-eks'
+ + - 'cd-pipeline-terraform-premerge-cluster':
+ + pod: 'production'
+ + cluster: 'ace-test'
+ + - 'cd-pipeline-terraform-postmerge-cluster':
+ + pod: 'production'
+ + cluster: 'ace-test'
+
+Submit your change and wait until the jobs you just added become available in Aether Jenkins.
+
+.. code-block:: shell
+
+ $ git status
+ Changes not staged for commit:
+
+ modified: jjb/repos/cd-pipeline-terraform.yaml
+
+ $ git add .
+ $ git commit -m "Add test ACE deployment job"
+ $ git review
+
+Gather VPN information
+----------------------
* Make sure firewall in front of ACE allows UDP port 500, UDP port 4500, and
ESP packets from **gcpvpn1.infra.aetherproject.net(35.242.47.15)** and
@@ -31,7 +77,7 @@
actually create a review request.
+-----------------------------+----------------------------------+
-| Management node external IP | 128.105.144.189 |
+| Management node external IP | 66.201.42.222 |
+-----------------------------+----------------------------------+
| ASN | 65003 |
+-----------------------------+----------------------------------+
@@ -45,37 +91,53 @@
+-----------------------------+----------------------------------+
| PSK | UMAoZA7blv6gd3IaArDqgK2s0sDB8mlI |
+-----------------------------+----------------------------------+
-| Management Subnet | 10.91.0.0/24 |
+| Management Subnet | 10.32.4.0/24 |
+-----------------------------+----------------------------------+
-| K8S Subnet | Pod IP: 10.66.0.0/17 |
+| K8S Subnet | Pod IP: 10.33.0.0/17 |
| +----------------------------------+
-| | Cluster IP: 10.66.128.0/17 |
+| | Cluster IP: 10.33.128.0/17 |
+-----------------------------+----------------------------------+
-Download aether-pod-configs repository
---------------------------------------
+.. note::
+
+ Use `this site <https://cloud.google.com/network-connectivity/docs/vpn/how-to/generating-pre-shared-key/>`_ to generate a new strong pre-shared key.
+
+.. attention::
+
+ If you are adding another ACE to an existing VPN connection, go to
+ :ref:`Add ACE to an existing VPN connection <add_ace_to_vpn>`
+
+Get access to encrypted files in aether-pod-configs repository
+--------------------------------------------------------------
+
+`git-crypt <https://github.com/AGWA/git-crypt>`_ is used to securely store encrypted files
+in the aether-pod-configs repository. Before proceeding, (1) install git-crypt and `gpg <https://gnupg.org/>`_,
+(2) create a GPG keypair, and (3) ask a member of the Aether OPs team to add your public key
+to the aether-pod-configs keyring. To create the keypair, follow these steps:
.. code-block:: shell
- $ cd $WORKDIR
- $ git clone "ssh://[username]@gerrit.opencord.org:29418/aether-pod-configs"
+ $ gpg --full-generate-key
+ $ gpg --output <key-name>.gpg --armor --export <your-email-address>
.. _update_global_resource:
Update global resource maps
---------------------------
-Add a new ACE information at the end of the following global resource maps.
+Download the aether-pod-configs repository and unlock the encrypted files.
+
+.. code-block:: shell
+
+ $ cd $WORKDIR
+ $ git clone "ssh://[username]@gerrit.opencord.org:29418/aether-pod-configs"
+ $ git-crypt unlock
+
+Add the new cluster information at the end of the following global resource maps.
* ``user_map.tfvars``
* ``cluster_map.tfvars``
* ``vpn_map.tfvars``
-As a note, you can find several other global resource maps under the
-``production`` directory. Resource definitions that need to be shared among
-clusters or are better managed in a single file to avoid configuration
-conflicts are maintained in this way.
-
.. code-block:: diff
$ cd $WORKDIR/aether-pod-configs/production
@@ -113,11 +175,11 @@
+ },
+ ace-test = {
+ cluster_name = "ace-test"
- + management_subnets = ["10.91.0.0/24"]
+ + management_subnets = ["10.32.4.0/24"]
+ k8s_version = "v1.18.8-rancher1-1"
- + k8s_pod_range = "10.66.0.0/17"
- + k8s_cluster_ip_range = "10.66.128.0/17"
- + kube_dns_cluster_ip = "10.66.128.10"
+ + k8s_pod_range = "10.33.0.0/17"
+ + k8s_cluster_ip_range = "10.33.128.0/17"
+ + kube_dns_cluster_ip = "10.33.128.10"
+ cluster_domain = "prd.test.aetherproject.net"
+ calico_ip_detect_method = "can-reach=www.google.com"
}
@@ -140,7 +202,7 @@
+ },
+ ace-test = {
+ peer_name = "production-ace-test"
- + peer_vpn_gateway_address = "128.105.144.189"
+ + peer_vpn_gateway_address = "66.201.42.222"
+ tunnel_shared_secret = "UMAoZA7blv6gd3IaArDqgK2s0sDB8mlI"
+ bgp_peer_asn = "65003"
+ bgp_peer_ip_range_1 = "169.254.0.9/30"
@@ -154,45 +216,35 @@
Unless you have a specific requirement, set ASN and BGP addresses to the next available values in the map.
-Create ACE specific configurations
-----------------------------------
+Create Terraform and Ansible configurations
+-------------------------------------------
-In this step, we will create a directory under `production` with the same name
-as ACE, and add several Terraform configurations and Ansible inventory needed
-to configure a VPN connection.
-
-Throughout the deployment procedure, this directory will contain all ACE
-specific configurations.
-
-Run the following commands to auto-generate necessary files under the target
-ACE directory.
+In this step, we will create a directory under ``production`` with the same name
+as the cluster, and add Terraform configurations and Ansible inventory needed
+to configure a VPN in GCP and ACE accordingly.
.. code-block:: shell
$ cd $WORKDIR/aether-pod-configs/tools
- $ cp ace_env /tmp/ace_env
- $ vi /tmp/ace_env
- # Set environment variables
+ $ cp ace_config.yaml.example ace_config.yaml
+ $ vi ace_config.yaml
+ # Set all values
- $ source /tmp/ace_env
$ make vpn
Created ../production/ace-test
- Created ../production/ace-test/main.tf
- Created ../production/ace-test/variables.tf
- Created ../production/ace-test/gcp_fw.tf
+ Created ../production/ace-test/provider.tf
+ Created ../production/ace-test/cluster.tf
Created ../production/ace-test/gcp_ha_vpn.tf
- Created ../production/ace-test/ansible
+ Created ../production/ace-test/gcp_fw.tf
Created ../production/ace-test/backend.tf
Created ../production/ace-test/cluster_val.tfvars
+ Created ../production/ace-test/ansible
Created ../production/ace-test/ansible/hosts.ini
Created ../production/ace-test/ansible/extra_vars.yml
-.. attention::
- The predefined templates are tailored to Pronto BOM. You'll need to fix `cluster_val.tfvars` and `ansible/extra_vars.yml`
- when using a different BOM.
-Create a review request
------------------------
+Submit your change
+------------------
.. code-block:: shell
@@ -214,18 +266,17 @@
$ git commit -m "Add test ACE"
$ git review
-Once the review request is accepted and merged,
-CD pipeline will create VPN tunnels on both GCP and the management node.
+After the change is merged, wait until the post-merge job finishes.
Verify VPN connection
---------------------
-You can verify the VPN connections after successful post-merge job by checking
+You can verify the VPN connections by checking
the routing table on the management node and trying to ping to one of the
central cluster VMs.
-Make sure two tunnel interfaces, `gcp_tunnel1` and `gcp_tunnel2`, exist
-and three additional routing entries via one of the tunnel interfaces.
+Make sure that the two tunnel interfaces, ``gcp_tunnel1`` and ``gcp_tunnel2``, exist
+and that there are three routing entries that go via one of the tunnel interfaces.
.. code-block:: shell
@@ -233,39 +284,39 @@
$ netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
- 0.0.0.0 128.105.144.1 0.0.0.0 UG 0 0 0 eno1
+ 0.0.0.0 66.201.42.209 0.0.0.0 UG 0 0 0 eno1
+ 10.32.4.0 0.0.0.0 255.255.255.128 U 0 0 0 eno2
+ 10.32.4.128 0.0.0.0 255.255.255.128 U 0 0 0 mgmt800
10.45.128.0 169.254.0.9 255.255.128.0 UG 0 0 0 gcp_tunnel1
10.52.128.0 169.254.0.9 255.255.128.0 UG 0 0 0 gcp_tunnel1
- 10.66.128.0 10.91.0.8 255.255.128.0 UG 0 0 0 eno1
- 10.91.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eno1
+ 10.33.128.0 10.32.4.138 255.255.128.0 UG 0 0 0 mgmt800
10.168.0.0 169.254.0.9 255.255.240.0 UG 0 0 0 gcp_tunnel1
- 128.105.144.0 0.0.0.0 255.255.252.0 U 0 0 0 eno1
+ 66.201.42.208 0.0.0.0 255.255.252.0 U 0 0 0 eno1
169.254.0.8 0.0.0.0 255.255.255.252 U 0 0 0 gcp_tunnel1
169.254.1.8 0.0.0.0 255.255.255.252 U 0 0 0 gcp_tunnel2
# Verify ACC VM access
$ ping 10.168.0.6
- # Verify ACC K8S cluster access
+ # Verify ACC K8S Service access
$ nslookup kube-dns.kube-system.svc.prd.acc.gcp.aetherproject.net 10.52.128.10
-You can further verify whether the ACE routes are propagated well to GCP
-by checking GCP dashboard **VPC Network > Routes > Dynamic**.
+You can also log in to the GCP console and check that the edge subnets appear under
+**VPC Network > Routes > Dynamic**.
Post VPN setup
--------------
-Once you verify the VPN connections, please update `ansible` directory name to
-`_ansible` to prevent the ansible playbook from running again. Note that it is
-no harm to re-run the ansible playbook but not recommended.
+Once you have verified the VPN connections, rename the ``ansible`` directory to
+``_ansible`` to prevent the Ansible playbook from being rerun.
.. code-block:: shell
$ cd $WORKDIR/aether-pod-configs/production/$ACE_NAME
$ mv ansible _ansible
$ git add .
- $ git commit -m "Mark ansible done for test ACE"
+ $ git commit -m "Ansible done for test ACE"
$ git review
.. _add_ace_to_vpn:
@@ -274,7 +325,7 @@
"""""""""""""""""""""""""""""""""""""""""""""
VPN connections can be shared when there are multiple ACE clusters in a site.
-In order to add ACE to an existing VPN connection, you'll have to SSH into the
+In order to add another cluster to an existing VPN connection, you'll have to SSH into the
management node and manually update BIRD configuration.
.. note::
@@ -285,8 +336,9 @@
$ sudo vi /etc/bird/bird.conf
protocol static {
+ # Routings for the existing cluster
...
- route 10.66.128.0/17 via 10.91.0.10;
+ route 10.33.128.0/17 via 10.32.4.138;
# Add routings for the new ACE's K8S cluster IP range via cluster nodes
# TODO: Configure iBGP peering with Calico nodes and dynamically learn these routings
@@ -297,7 +349,7 @@
filter gcp_tunnel_out {
# Add the new ACE's K8S cluster IP range and the management subnet if required to the list
- if (net ~ [ 10.91.0.0/24, 10.66.128.0/17, <NEW-ACE-CLUSTER-IP-RANGE> ]) then accept;
+ if (net ~ [ 10.32.4.0/24, 10.33.128.0/17, <NEW-ACE-CLUSTER-MGMT-SUBNET>, <NEW-ACE-CLUSTER-IP-RANGE> ]) then accept;
else reject;
}
# Save and exit
diff --git a/index.rst b/index.rst
index f7ce0cd..4aaff92 100644
--- a/index.rst
+++ b/index.rst
@@ -30,6 +30,7 @@
edge_deployment/fabric_switch_bootstrap
edge_deployment/vpn_bootstrap
edge_deployment/runtime_deployment
+ edge_deployment/bess_upf_deployment
edge_deployment/tost_deployment
edge_deployment/connectivity_service_update
edge_deployment/enb_installation
@@ -42,6 +43,7 @@
:glob:
amp/roc
+ amp/monitoring
.. toctree::
:maxdepth: 3