Merge "Subscriber Management"
diff --git a/.gitignore b/.gitignore
index 9a7c442..27d92e8 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,2 +1,3 @@
venv-docs
_build
+.vscode
diff --git a/amp/monitoring.rst b/amp/monitoring.rst
new file mode 100644
index 0000000..b9d2ff4
--- /dev/null
+++ b/amp/monitoring.rst
@@ -0,0 +1,91 @@
+..
+ SPDX-FileCopyrightText: © 2021 Open Networking Foundation <support@opennetworking.org>
+ SPDX-License-Identifier: Apache-2.0
+
+Monitoring and Alerts
+=====================
+
+Aether leverages `Prometheus <https://prometheus.io/docs/introduction/overview/>`_ to collect
+and store platform and service metrics, `Grafana <https://grafana.com/docs/grafana/latest/getting-started/>`_
+to visualize metrics over time, and `Alertmanager <https://prometheus.io/docs/alerting/latest/alertmanager/>`_ to
+notify Aether OPs staff of events requiring attention. This monitoring stack runs on each Aether cluster.
+This section describes how an Aether component can "opt in" to the Aether monitoring stack so that its metrics can be
+collected and graphed, and can trigger alerts.
+
+
+Exporting Service Metrics to Prometheus
+---------------------------------------
+An Aether component implements a `Prometheus exporter <https://prometheus.io/docs/instrumenting/writing_exporters/>`_
+to expose its metrics to Prometheus. An exporter provides the current values of a component's
+metrics via HTTP using a simple text format. Prometheus scrapes the exporter's HTTP endpoint and stores the metrics
+in its Time Series Database (TSDB) for querying and analysis. Many `client libraries <https://prometheus.io/docs/instrumenting/clientlibs/>`_
+are available for instrumenting code to export metrics in Prometheus format. If a component's metrics are available
+in some other format, tools like `Telegraf <https://docs.influxdata.com/telegraf>`_ can be used to convert the metrics
+into Prometheus format and export them.
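+
+For illustration, scraping an exporter's HTTP endpoint returns plain text in the Prometheus
+exposition format. The metric names below are hypothetical and are not taken from an actual
+Aether component:
+
+.. code-block:: text
+
+   # HELP upf_packets_total Total number of packets processed by the UPF.
+   # TYPE upf_packets_total counter
+   upf_packets_total{direction="uplink"} 1027043
+   upf_packets_total{direction="downlink"} 987211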
+
+A component that exposes a Prometheus exporter HTTP endpoint via a Service can tell Prometheus to scrape
+this endpoint by defining a
+`ServiceMonitor <https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/user-guides/running-exporters.md>`_
+custom resource. The ServiceMonitor is typically created by the Helm chart that installs the component.
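+
+A minimal ServiceMonitor sketch is shown below. The names and labels are illustrative
+assumptions: the label selector must match the component's Service, and the resource must carry
+whatever labels the cluster's Prometheus is configured to select ServiceMonitors by.
+
+.. code-block:: yaml
+
+   apiVersion: monitoring.coreos.com/v1
+   kind: ServiceMonitor
+   metadata:
+     name: my-component                # hypothetical component name
+     namespace: my-component
+   spec:
+     selector:
+       matchLabels:
+         app: my-component             # must match the Service's labels
+     endpoints:
+       - port: metrics                 # name of the Service port exposing the exporter
+         interval: 30s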
+
+
+Working with Grafana Dashboards
+--------------------------------
+Once the local cluster's Prometheus is collecting a component's metrics, they can be visualized using Grafana
+dashboards. The Grafana instance running on the AMP cluster is able to send queries to the Prometheus
+servers running on all Aether clusters. This means that component metrics can be visualized on the AMP Grafana
+regardless of where the component is actually running.
+
+To create a new Grafana dashboard or modify an existing one, first log in to the AMP Grafana using an account
+with admin privileges. To add a new dashboard, click the **+** at left. To make a copy of an existing dashboard for
+editing, click the **Dashboard Settings** (gear) icon at the upper right of the existing dashboard, and then
+click the **Save as…** button at left.
+
+Next, add panels to the dashboard. Since Grafana can access Prometheus on all the clusters in the environment,
+each cluster is available as a data source. For example, when adding a panel showing metrics collected on the
+ace-menlo cluster, choose ace-menlo as the data source.
+
+Clicking the floppy disk icon at the top saves the dashboard only *temporarily* (the dashboard is not
+saved to persistent storage and is deleted as soon as Grafana is restarted). To save the dashboard *permanently*,
+click the **Share Dashboard** icon next to the title and save its JSON to a file. Then add the file to the
+aether-app-configs repository so that it will be deployed by Fleet:
+
+* Change to directory ``aether-app-configs/infrastructure/rancher-monitoring/overlays/<amp-cluster>/``
+* Copy the dashboard JSON file to the ``dashboards/`` sub-directory
+* Edit ``kustomization.yaml`` and add the new dashboard JSON under ``configMapGenerator``, as sketched below
+* Commit the changes and submit a patchset to Gerrit
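+
+A sketch of the corresponding ``kustomization.yaml`` entry is shown below; the dashboard and file
+names are illustrative, and any additional generator options should follow the entries already
+present in the file:
+
+.. code-block:: yaml
+
+   configMapGenerator:
+     - name: my-component-dashboard            # hypothetical dashboard name
+       files:
+         - dashboards/my-component-dashboard.json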
+
+Once the patchset is merged, the AMP Grafana will automatically detect and deploy the new dashboard.
+
+Adding Service-specific Alerts
+------------------------------
+An alert can be triggered in Prometheus when a component metric crosses a threshold. The Alertmanager
+then routes the alert to one or more receivers (e.g., an email address or Slack channel).
+
+To add an alert for a component, create a
+`PrometheusRule <https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/user-guides/alerting.md>`_
+custom resource, for example in the Helm chart that deploys the component. This resource describes one or
+more `rules <https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/>`_ using Prometheus expressions;
+if the expression is true for the time indicated, then the alert is raised. Once the PrometheusRule
+resource is instantiated, the cluster's Prometheus will pick up the rule and start evaluating it.
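+
+A minimal PrometheusRule sketch is shown below. The group, metric, threshold, and labels are
+illustrative assumptions and must be adapted to the component and to whatever labels the
+cluster's Prometheus uses to discover rules:
+
+.. code-block:: yaml
+
+   apiVersion: monitoring.coreos.com/v1
+   kind: PrometheusRule
+   metadata:
+     name: my-component-rules                  # hypothetical name
+   spec:
+     groups:
+       - name: my-component
+         rules:
+           - alert: MyComponentHighErrorRate   # hypothetical alert
+             expr: rate(my_component_errors_total[5m]) > 0.1
+             for: 10m
+             labels:
+               severity: warning
+             annotations:
+               summary: "my-component error rate has exceeded 0.1/s for 10 minutes"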
+
+The Alertmanager is configured to send alerts with *critical* or *warning* severity to e-mail and Slack channels
+monitored by Aether OPs staff. To route a specific alert to a different receiver
+(e.g., a component-specific Slack channel), you need to change the Alertmanager configuration, which is stored in
+a `SealedSecret <https://github.com/bitnami-labs/sealed-secrets>`_ custom resource in the aether-app-configs repository.
+To update the configuration:
+
+* Change to directory ``aether-app-configs/infrastructure/rancher-monitoring/overlays/<cluster>/``
+* Update the ``receivers`` and ``route`` sections of the ``alertmanager-config.yaml`` file (see the sketches after this list)
+* Encode the ``alertmanager-config.yaml`` file as a Base64 string
+* Create a file ``alertmanager-config-secret.yaml`` that defines the Secret resource using the Base64-encoded string
+* Run the following command using a valid ``PUBLICKEY``:
+
+.. code-block:: shell
+
+ $ kubeseal --cert "${PUBLICKEY}" --scope cluster-wide --format yaml < alertmanager-config-secret.yaml > alertmanager-config-sealed-secret.yaml
+
+* Commit the changes and submit a patchset to Gerrit
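+
+As a rough sketch, the ``route`` and ``receivers`` additions to ``alertmanager-config.yaml`` might
+look like the following; the matcher, receiver name, channel, and webhook URL are hypothetical:
+
+.. code-block:: yaml
+
+   route:
+     routes:
+       - match:
+           app: my-component                   # hypothetical alert label to match on
+         receiver: my-component-slack
+   receivers:
+     - name: my-component-slack
+       slack_configs:
+         - api_url: https://hooks.slack.com/services/XXX/YYY/ZZZ   # placeholder webhook
+           channel: '#my-component-alerts'
+
+The intermediate ``alertmanager-config-secret.yaml`` wraps the Base64-encoded configuration in a
+Secret. The resource name, namespace, and data key below are assumptions and must match what the
+monitoring chart deployed on the cluster expects:
+
+.. code-block:: yaml
+
+   apiVersion: v1
+   kind: Secret
+   metadata:
+     name: alertmanager-rancher-monitoring-alertmanager   # assumed Secret name
+     namespace: cattle-monitoring-system                  # assumed namespace
+   data:
+     alertmanager.yaml: <Base64-encoded alertmanager-config.yaml>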
+
+Once the patchset is merged, verify that the SealedSecret was successfully unsealed and converted to a Secret
+by looking at the logs of the *sealed-secrets-controller* pod running on the cluster in the *kube-system* namespace.
diff --git a/edge_deployment/bess_upf_deployment.rst b/edge_deployment/bess_upf_deployment.rst
new file mode 100644
index 0000000..fb8e101
--- /dev/null
+++ b/edge_deployment/bess_upf_deployment.rst
@@ -0,0 +1,143 @@
+..
+ SPDX-FileCopyrightText: © 2021 Open Networking Foundation <support@opennetworking.org>
+ SPDX-License-Identifier: Apache-2.0
+
+BESS UPF Deployment
+===================
+
+This section describes how to configure and deploy BESS UPF.
+
+
+Network Settings
+----------------
+
+BESS UPF requires three networks, **enb**, **access**, and **core**, and all
+three networks must use different subnets. To illustrate, the following example ACE environment
+is used throughout the rest of this guide.
+
+.. image:: images/bess-upf-example-network.png
+
++-----------+-----------+------------------------------------+-------------------+---------------+
+| Network | VLAN | Subnet | Interface | IP address |
++-----------+-----------+------------------------------------+-------------------+---------------+
+| enb | 2 | 192.168.2.0/24 (gw: 192.168.2.254) | mgmt server vlan2 | 192.168.2.254 |
+| | | +-------------------+---------------+
+| | | | enb | 192.168.2.10 |
++-----------+-----------+------------------------------------+-------------------+---------------+
+| access | 3 | 192.168.3.0/24 (gw: 192.168.3.254) | mgmt server vlan3 | 192.168.3.254 |
+| | | +-------------------+---------------+
+| | | | upf access | 192.168.3.1 |
++-----------+-----------+------------------------------------+-------------------+---------------+
+| core | 4 | 192.168.4.0/24 (gw: 192.168.4.254) | mgmt server vlan4 | 192.168.4.254 |
+| | | +-------------------+---------------+
+| | | | upf core | 192.168.4.1 |
++-----------+-----------+------------------------------------+-------------------+---------------+
+
+.. note::
+
+ The management plane and out-of-band network are not depicted in the diagram.
+
+
+Note that the management server has the only externally routable address and acts as a router for
+all networks in the Aether pod.
+So in order for UEs to access the Internet, two things need to be done on the management server
+(a sketch follows the list below).
+
+* For outgoing traffic, masquerade the internal address with the external address of the management server.
+* For response traffic to the UEs, forward it to the UPF's **core** interface.
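+
+A minimal sketch of these two steps, assuming the example addresses above, that ``eno1`` is the
+management server's external interface, and that ``<UE-POOL-SUBNET>`` stands for the UE address
+pool routed by the UPF (all of which must be replaced with the actual values):
+
+.. code-block:: shell
+
+   # Masquerade outgoing traffic from the internal networks behind the external address
+   $ sudo iptables -t nat -A POSTROUTING -s 192.168.0.0/16 -o eno1 -j MASQUERADE
+
+   # Forward return traffic destined for the UE pool to the UPF's core interface
+   $ sudo ip route add <UE-POOL-SUBNET> via 192.168.4.1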
+
+
+Check Cluster Resources
+-----------------------
+
+Before proceeding with the deployment, make sure the cluster has enough resources to run BESS UPF.
+
+* 2 dedicated cores (``"cpu"``)
+* 2 1Gi HugePages (``"hugepages-1Gi"``)
+* 2 SRIOV Virtual Functions bound to **vfio-pci** driver (``"intel.com/intel_sriov_vfio"``)
+
+Strictly speaking, these resources are not mandatory to run BESS UPF, but they are recommended for best performance.
+You can use the following command to check the allocatable resources on the cluster nodes.
+
+.. code-block:: shell
+
+ $ kubectl get nodes -o json | jq '.items[].status.allocatable'
+ {
+ "cpu": "95",
+ "ephemeral-storage": "1770223432846",
+ "hugepages-1Gi": "32Gi",
+ "intel.com/intel_sriov_netdevice": "32",
+ "intel.com/intel_sriov_vfio": "32",
+ "memory": "360749956Ki",
+ "pods": "110"
+ }
+
+
+Configure and Deploy
+--------------------
+
+Download ``aether-app-configs`` if you don't already have it on your development machine.
+
+.. code-block:: shell
+
+ $ cd $WORKDIR
+ $ git clone "ssh://[username]@gerrit.opencord.org:29418/aether-app-configs"
+
+Move to the ``apps/bess-upf/upf1`` directory and create a Helm values file for the new cluster as shown below.
+Don't forget to replace the IP addresses in the example configuration with the actual addresses of the cluster.
+
+.. code-block:: yaml
+
+ $ cd $WORKDIR/aether-app-configs/apps/bess-upf/upf1
+ $ mkdir overlays/prd-ace-test
+ $ vi overlays/prd-ace-test/values.yaml
+ # SPDX-FileCopyrightText: 2020-present Open Networking Foundation <info@opennetworking.org>
+
+ config:
+ upf:
+ enb:
+ subnet: "192.168.2.0/24"
+ access:
+ ip: "192.168.3.1/24"
+ gateway: "192.168.3.254"
+ vlan: 3
+ core:
+ ip: "192.168.4.1/24"
+ gateway: "192.168.4.254"
+ vlan: 4
+ # Below is required only when connecting to 5G core
+ cfgFiles:
+ upf.json:
+ cpiface:
+ dnn: "8internet"
+ hostname: "upf"
+
+
+Update ``fleet.yaml`` in the same directory to let Fleet use the custom configuration when deploying
+BESS UPF to the new cluster.
+
+.. code-block:: yaml
+
+ $ vi fleet.yaml
+ # add following block at the end
+ - name: prd-ace-test
+ clusterSelector:
+ matchLabels:
+ management.cattle.io/cluster-display-name: ace-test
+ helm:
+ valuesFiles:
+ - overlays/prd-ace-test/values.yaml
+
+
+Submit your changes.
+
+.. code-block:: shell
+
+ $ cd $WORKDIR/aether-app-configs
+ $ git status
+ $ git add .
+ $ git commit -m "Add BESS UPF configs for test ACE"
+ $ git review
+
+
+Go to the Fleet dashboard and wait until the cluster status becomes **Active**.
+It can take up to a minute for Fleet to fetch the configuration updates.
diff --git a/edge_deployment/images/bess-upf-example-network.png b/edge_deployment/images/bess-upf-example-network.png
new file mode 100644
index 0000000..c6fff20
--- /dev/null
+++ b/edge_deployment/images/bess-upf-example-network.png
Binary files differ
diff --git a/edge_deployment/images/fleet-move-workspace.png b/edge_deployment/images/fleet-move-workspace.png
new file mode 100644
index 0000000..accfb6d
--- /dev/null
+++ b/edge_deployment/images/fleet-move-workspace.png
Binary files differ
diff --git a/edge_deployment/runtime_deployment.rst b/edge_deployment/runtime_deployment.rst
index c97f55d..7b3e700 100644
--- a/edge_deployment/runtime_deployment.rst
+++ b/edge_deployment/runtime_deployment.rst
@@ -2,19 +2,35 @@
SPDX-FileCopyrightText: © 2020 Open Networking Foundation <support@opennetworking.org>
SPDX-License-Identifier: Apache-2.0
-Aether Runtime Deployment
-=========================
+Runtime Deployment
+==================
-This section describes how to install Aether edge runtime and Aether managed
-applications.
+This section describes how to install and configure the Aether Edge Runtime, including Kubernetes
+and the system-level applications listed below.
-We will be using GitOps based Aether CD pipeline for this, so we just need to
-create a patch to **aether-pod-configs** repository.
+* sealed-secrets
+* rancher-monitoring
+* fluent-bit
+* opendistro-es
+* hostpath-provisioner
+* edge-maintenance-agent
+* sriov-device-plugin
+* uedns
-Download aether-pod-configs repository
---------------------------------------
+We will be using GitOps-based CI/CD for this, so all you need to do is
+create patches in the Aether GitOps repositories, **aether-pod-configs** and **aether-app-configs**,
+to provide the cluster configurations to the CI/CD system.
-Download the ``aether-pod-configs`` repository if you don't have it already in
+.. attention::
+
+ If you skipped the VPN bootstrap step and didn't add the deployment jobs for the new edge,
+ go to the :ref:`Add deployment jobs <add_deployment_jobs>` step and finish it
+ before proceeding.
+
+K8S cluster deployment
+----------------------
+
+Download the ``aether-pod-configs`` repository if you don't already have it on
your development machine.
.. code-block:: shell
@@ -22,135 +38,206 @@
$ cd $WORKDIR
$ git clone "ssh://[username]@gerrit.opencord.org:29418/aether-pod-configs"
-Update global resource maps
----------------------------
-
.. attention::
- Skip this section if you have already done the same step in the
- :ref:`Update Global Resources Map for VPN <update_global_resource>` section.
+ If you skipped the VPN bootstrap step and didn't update the global resource maps for the new edge,
+ go to the :ref:`Update global resource maps <update_global_resource>` step and
+ finish the ``cluster_map.tfvars`` and ``user_map.tfvars`` updates before proceeding.
-Add a new ACE information at the end of the following global resource maps.
-
-* user_map.tfvars
-* cluster_map.tfvars
-
-As a note, you can find several other global resource maps under the
-`production` directory. Resource definitions that need to be shared among
-clusters or are better managed in a single file to avoid configuration
-conflicts are maintained in this way.
-
-.. code-block:: diff
-
- $ cd $WORKDIR/aether-pod-configs/production
- $ vi user_map.tfvars
-
- # Add the new cluster admin user at the end of the map
- $ git diff user_map.tfvars
- --- a/production/user_map.tfvars
- +++ b/production/user_map.tfvars
- @@ user_map = {
- username = "menlo"
- password = "changeme"
- global_roles = ["user-base", "catalogs-use"]
- + },
- + test_admin = {
- + username = "test"
- + password = "changeme"
- + global_roles = ["user-base", "catalogs-use"]
- }
- }
-
-.. code-block:: diff
-
- $ cd $WORKDIR/aether-pod-configs/production
- $ vi cluster_map.tfvars
-
- # Add the new K8S cluster information at the end of the map
- $ git diff cluster_map.tfvars
- --- a/production/cluster_map.tfvars
- +++ b/production/cluster_map.tfvars
- @@ cluster_map = {
- kube_dns_cluster_ip = "10.53.128.10"
- cluster_domain = "prd.menlo.aetherproject.net"
- calico_ip_detect_method = "can-reach=www.google.com"
- + },
- + ace-test = {
- + cluster_name = "ace-test"
- + management_subnets = ["10.91.0.0/24"]
- + k8s_version = "v1.18.8-rancher1-1"
- + k8s_pod_range = "10.66.0.0/17"
- + k8s_cluster_ip_range = "10.66.128.0/17"
- + kube_dns_cluster_ip = "10.66.128.10"
- + cluster_domain = "prd.test.aetherproject.net"
- + calico_ip_detect_method = "can-reach=www.google.com"
- }
- }
- }
-
-You'll have to get this change merged before proceeding.
+Run the following commands to automatically generate Terraform configurations needed to
+create a new cluster in `Rancher <https://rancher.aetherproject.org>`_ and add the servers
+and switches to the cluster.
.. code-block:: shell
- $ git status
- On branch tools
- Changes not staged for commit:
-
- modified: cluster_map.tfvars
- modified: user_map.tfvars
-
- $ git add .
- $ git commit -m "Add test ACE"
- $ git review
-
-Create runtime configurations
------------------------------
-
-In this step, we will add several Terraform configurations and overriding
-values for the managed applications.
-
-Run the following commands to auto-generate necessary files under the target
-ACE directory.
-
-.. code-block:: shell
-
+ # Create ace_config.yaml file if you haven't yet
$ cd $WORKDIR/aether-pod-configs/tools
- $ cp ace_env /tmp/ace_env
- $ vi /tmp/ace_env
- # Set environment variables
+ $ cp ace_config.yaml.example ace_config.yaml
+ $ vi ace_config.yaml
+ # Set all values
- $ source /tmp/ace_env
$ make runtime
- Created ../production/ace-test
- Created ../production/ace-test/main.tf
- Created ../production/ace-test/variables.tf
- Created ../production/ace-test/gcp_fw.tf
+ Created ../production/ace-test/provider.tf
Created ../production/ace-test/cluster.tf
- Created ../production/ace-test/alerts.tf
+ Created ../production/ace-test/rke-bare-metal.tf
+ Created ../production/ace-test/addon-manifests.yml.tpl
+ Created ../production/ace-test/project.tf
+ Created ../production/ace-test/member.tf
Created ../production/ace-test/backend.tf
Created ../production/ace-test/cluster_val.tfvars
- Created ../production/ace-test/app_values
- Created ../production/ace-test/app_values/ace-coredns.yml
- Created ../production/ace-test/app_values/omec-upf-pfcp-agent.yml
-Create a review request
------------------------
+Create a review request.
.. code-block:: shell
$ cd $WORKDIR/aether-pod-configs
- $ git status
-
- Untracked files:
- (use "git add <file>..." to include in what will be committed)
-
- production/ace-test/alerts.tf
- production/ace-test/app_values/
- production/ace-test/cluster.tf
-
$ git add .
$ git commit -m "Add test ACE runtime configs"
$ git review
-Once the review request is accepted and merged,
-CD pipeline will start to deploy K8S and Aether managed applications on it.
+Once your review request is accepted and merged, the Aether CI/CD system starts to deploy K8S.
+Wait until the cluster status changes to **Active** in `Rancher <https://rancher.aetherproject.org>`_.
+It normally takes 10 to 15 minutes, depending on how fast the edge can download the
+container images.
+
+System Application Deployment
+-----------------------------
+
+For the system application deployment, we will be using Rancher's built-in GitOps tool, **Fleet**.
+Fleet uses a git repository as a single source of truth to manage applications in the clusters.
+For Aether, **aether-app-configs** is the repository where all Aether applications
+are defined.
+
+Most of the Aether system applications do not require cluster-specific configurations,
+except **rancher-monitoring** and **uedns**.
+For these two applications, you will have to manually create custom configurations and
+commit them to aether-app-configs.
+
+First, download ``aether-app-configs`` if you don't already have it on your development machine.
+
+.. code-block:: shell
+
+ $ cd $WORKDIR
+ $ git clone "ssh://[username]@gerrit.opencord.org:29418/aether-app-configs"
+
+Configure rancher-monitoring
+############################
+
+Open ``fleet.yaml`` under ``infrastructure/rancher-monitoring``, add a custom target
+with the new cluster name as a selector, and provide the cluster-specific Helm values and
+kustomize overlay directory path as shown below.
+
+.. code-block:: yaml
+
+ $ cd $WORKDIR/aether-app-configs/infrastructure/rancher-monitoring
+ $ vi fleet.yaml
+ # add following block at the end
+ - name: ace-test
+ clusterSelector:
+ matchLabels:
+ management.cattle.io/cluster-display-name: ace-test
+ helm:
+ values:
+ prometheus:
+ prometheusSpec:
+ additionalAlertRelabelConfigs:
+ - source_labels: [__address__]
+ target_label: cluster
+ replacement: ace-test
+ kustomize:
+ dir: overlays/prd-ace
+
+.. note::
+
+ The step above will not be required in Rancher v2.6, which supports using cluster labels as Helm values in a list.
+
+Configure ue-dns
+################
+
+For UE-DNS, you need to create a Helm values file for the new cluster.
+You'll need the cluster domain and the kube-dns ClusterIP, both of which can be found in
+``aether-pod-configs/production/cluster_map.tfvars``.
+Be sure to replace the ``[ ]`` placeholders in the example configuration below with the actual cluster values.
+
+.. code-block:: yaml
+
+ $ cd $WORKDIR/aether-app-configs/infrastructure/coredns
+ $ mkdir overlays/prd-ace-test
+ $ vi overlays/prd-ace-test/values.yaml
+ # SPDX-FileCopyrightText: 2021-present Open Networking Foundation <info@opennetworking.org>
+
+ serviceType: ClusterIP
+ service:
+ clusterIP: [next address of the kube-dns ip]
+ servers:
+ - zones:
+ - zone: .
+ port: 53
+ plugins:
+ - name: errors
+ - name: health
+ configBlock: |-
+ lameduck 5s
+ - name: ready
+ - name: prometheus
+ parameters: 0.0.0.0:9153
+ - name: forward
+ parameters: . /etc/resolv.conf
+ - name: cache
+ parameters: 30
+ - name: loop
+ - name: reload
+ - name: loadbalance
+ - zones:
+ - zone: aetherproject.net
+ port: 53
+ plugins:
+ - name: errors
+ - name: rewrite continue
+ configBlock: |-
+ name regex (.*)\.aetherproject.net {1}.svc.[cluster domain]
+ answer name (.*)\.svc\.[cluster domain] {1}.aetherproject.net
+ - name: forward
+ parameters: . [kube-dns ip]
+ configBlock: |-
+ except kube-system.svc.[cluster domain] aether-sdcore.svc.[cluster domain] tost.svc.[cluster domain]
+ - name: cache
+ parameters: 30
+
+
+Next, update ``fleet.yaml`` under ``infrastructure/coredns`` so that Fleet can use the custom configuration
+you just created when deploying UE-DNS to the cluster.
+
+.. code-block:: yaml
+
+ $ cd $WORKDIR/aether-app-configs/infrastructure/coredns
+ $ vi fleet.yaml
+ # add following block at the end
+ - name: prd-ace-test
+ clusterSelector:
+ matchLabels:
+ management.cattle.io/cluster-display-name: ace-test
+ helm:
+ valuesFiles:
+ - overlays/prd-ace-test/values.yaml
+
+
+Submit your changes.
+
+.. code-block:: shell
+
+ $ cd $WORKDIR/aether-app-configs
+ $ git status
+ $ git add .
+ $ git commit -m "Add test ACE application configs"
+ $ git review
+
+
+Assign Fleet workspace
+######################
+
+By default, all new clusters are assigned to a default Fleet workspace called **fleet-default**.
+To make a cluster part of Aether and have the applications defined in aether-app-configs deployed,
+you must assign the cluster to either the **aether-stable** or **aether-alpha** workspace.
+Assign clusters that expect minimal downtime to **aether-stable**, and clusters used for
+development or for previewing an upcoming release to **aether-alpha**.
+
+Log in to `Rancher <https://rancher.aetherproject.org>`_ as the ``admin`` or ``onfadmin`` user
+and go to the **Cluster Explorer**.
+In the top-left dropdown menu, click **Cluster Explorer > Continuous Delivery**.
+
+.. image:: images/fleet-move-workspace.png
+
+
+1) Click the second dropdown menu from the left at the top and select **fleet-default**.
+2) Select **Clusters** in the left menu and you'll see the new cluster.
+3) Click the checkbox in front of the cluster name.
+4) Click the **Assign to...** button and assign the cluster to the Aether workspace.
+
+Switch to the Aether workspace, click **Clusters** in the left menu, and confirm that the
+new cluster exists.
+Wait until the cluster state becomes **Active**.
+
+.. attention::
+
+ Ignore any BESS UPF failures at this point.
diff --git a/edge_deployment/vpn_bootstrap.rst b/edge_deployment/vpn_bootstrap.rst
index 223e3d8..7c57367 100644
--- a/edge_deployment/vpn_bootstrap.rst
+++ b/edge_deployment/vpn_bootstrap.rst
@@ -5,19 +5,65 @@
VPN Bootstrap
=============
-This section walks you through how to set up a VPN between ACE and Aether
-Central in GCP. We will be using GitOps based Aether CD pipeline for this, so
-we just need to create a patch to **aether-pod-configs** repository. Note that
-some of the steps described here are not directly related to setting up a VPN,
+This section guides you through setting up a VPN connection between Aether Central in GCP and ACE.
+We will be using the GitOps-based Aether CI/CD system for this, so all you need to do is
+create a patch to the Aether GitOps repository, **aether-pod-configs**, with the edge-specific information.
+Note that some of the steps described here are not directly related to setting up a VPN,
but rather are a prerequisite for adding a new ACE.
-.. attention::
+.. _add_deployment_jobs:
- If you are adding another ACE to an existing VPN connection, go to
- :ref:`Add ACE to an existing VPN connection <add_ace_to_vpn>`
+Add deployment jobs
+-------------------
+First, you need to add Jenkins jobs to the Aether CI/CD system that build and apply infrastructure change
+plans for the new edge. This is done by creating a patch to the **aether-ci-management** repository.
-Before you begin
-----------------
+Download the **aether-ci-management** repository.
+
+.. code-block:: shell
+
+ $ cd $WORKDIR
+ $ git clone "ssh://[username]@gerrit.opencord.org:29418/aether-ci-management"
+
+Add the jobs for the new cluster at the end of the ``cd-pipeline-terraform-ace`` project job list.
+Make sure to add both pre-merge and post-merge jobs.
+Note that the cluster name specified here will be used throughout the rest of the deployment procedure.
+
+.. code-block:: diff
+
+ $ cd $WORKDIR/aether-ci-management
+ $ vi jjb/repos/cd-pipeline-terraform.yaml
+
+ # Add jobs for the new cluster
+ $ git diff jjb/repos/cd-pipeline-terraform.yaml
+ --- a/jjb/repos/cd-pipeline-terraform.yaml
+ +++ b/jjb/repos/cd-pipeline-terraform.yaml
+ @@ -227,3 +227,9 @@
+ - 'cd-pipeline-terraform-postmerge-cluster':
+ pod: 'production'
+ cluster: 'ace-eks'
+ + - 'cd-pipeline-terraform-premerge-cluster':
+ + pod: 'production'
+ + cluster: 'ace-test'
+ + - 'cd-pipeline-terraform-postmerge-cluster':
+ + pod: 'production'
+ + cluster: 'ace-test'
+
+Submit your change and wait until the jobs you just added become available in Aether Jenkins.
+
+.. code-block:: shell
+
+ $ git status
+ Changes not staged for commit:
+
+ modified: jjb/repos/cd-pipeline-terraform.yaml
+
+ $ git add .
+ $ git commit -m "Add test ACE deployment job"
+ $ git review
+
+Gather VPN information
+----------------------
* Make sure firewall in front of ACE allows UDP port 500, UDP port 4500, and
ESP packets from **gcpvpn1.infra.aetherproject.net(35.242.47.15)** and
@@ -31,7 +77,7 @@
actually create a review request.
+-----------------------------+----------------------------------+
-| Management node external IP | 128.105.144.189 |
+| Management node external IP | 66.201.42.222 |
+-----------------------------+----------------------------------+
| ASN | 65003 |
+-----------------------------+----------------------------------+
@@ -45,37 +91,53 @@
+-----------------------------+----------------------------------+
| PSK | UMAoZA7blv6gd3IaArDqgK2s0sDB8mlI |
+-----------------------------+----------------------------------+
-| Management Subnet | 10.91.0.0/24 |
+| Management Subnet | 10.32.4.0/24 |
+-----------------------------+----------------------------------+
-| K8S Subnet | Pod IP: 10.66.0.0/17 |
+| K8S Subnet | Pod IP: 10.33.0.0/17 |
| +----------------------------------+
-| | Cluster IP: 10.66.128.0/17 |
+| | Cluster IP: 10.33.128.0/17 |
+-----------------------------+----------------------------------+
-Download aether-pod-configs repository
---------------------------------------
+.. note::
+
+ Use `this site <https://cloud.google.com/network-connectivity/docs/vpn/how-to/generating-pre-shared-key/>`_ to generate a new strong pre-shared key.
+
+.. attention::
+
+ If you are adding another ACE to an existing VPN connection, go to
+ :ref:`Add ACE to an existing VPN connection <add_ace_to_vpn>`
+
+Get access to encrypted files in aether-pod-configs repository
+--------------------------------------------------------------
+
+`git-crypt <https://github.com/AGWA/git-crypt>`_ is used to securely store encrypted files
+in the aether-pod-configs repository. Before proceeding, (1) install git-crypt and `gpg <https://gnupg.org/>`_,
+(2) create a GPG keypair, and (3) ask a member of the Aether OPs team to add your public key
+to the aether-pod-configs keyring. To create the keypair, follow these steps:
.. code-block:: shell
- $ cd $WORKDIR
- $ git clone "ssh://[username]@gerrit.opencord.org:29418/aether-pod-configs"
+ $ gpg --full-generate-key
+ $ gpg --output <key-name>.gpg --armor --export <your-email-address>
.. _update_global_resource:
Update global resource maps
---------------------------
-Add a new ACE information at the end of the following global resource maps.
+Download the aether-pod-configs repository and unlock the encrypted files.
+
+.. code-block:: shell
+
+ $ cd $WORKDIR
+ $ git clone "ssh://[username]@gerrit.opencord.org:29418/aether-pod-configs"
+ $ git-crypt unlock
+
+Add the new cluster information at the end of the following global resource maps.
* ``user_map.tfvars``
* ``cluster_map.tfvars``
* ``vpn_map.tfvars``
-As a note, you can find several other global resource maps under the
-``production`` directory. Resource definitions that need to be shared among
-clusters or are better managed in a single file to avoid configuration
-conflicts are maintained in this way.
-
.. code-block:: diff
$ cd $WORKDIR/aether-pod-configs/production
@@ -113,11 +175,11 @@
+ },
+ ace-test = {
+ cluster_name = "ace-test"
- + management_subnets = ["10.91.0.0/24"]
+ + management_subnets = ["10.32.4.0/24"]
+ k8s_version = "v1.18.8-rancher1-1"
- + k8s_pod_range = "10.66.0.0/17"
- + k8s_cluster_ip_range = "10.66.128.0/17"
- + kube_dns_cluster_ip = "10.66.128.10"
+ + k8s_pod_range = "10.33.0.0/17"
+ + k8s_cluster_ip_range = "10.33.128.0/17"
+ + kube_dns_cluster_ip = "10.33.128.10"
+ cluster_domain = "prd.test.aetherproject.net"
+ calico_ip_detect_method = "can-reach=www.google.com"
}
@@ -140,7 +202,7 @@
+ },
+ ace-test = {
+ peer_name = "production-ace-test"
- + peer_vpn_gateway_address = "128.105.144.189"
+ + peer_vpn_gateway_address = "66.201.42.222"
+ tunnel_shared_secret = "UMAoZA7blv6gd3IaArDqgK2s0sDB8mlI"
+ bgp_peer_asn = "65003"
+ bgp_peer_ip_range_1 = "169.254.0.9/30"
@@ -154,45 +216,35 @@
Unless you have a specific requirement, set ASN and BGP addresses to the next available values in the map.
-Create ACE specific configurations
-----------------------------------
+Create Terraform and Ansible configurations
+-------------------------------------------
-In this step, we will create a directory under `production` with the same name
-as ACE, and add several Terraform configurations and Ansible inventory needed
-to configure a VPN connection.
-
-Throughout the deployment procedure, this directory will contain all ACE
-specific configurations.
-
-Run the following commands to auto-generate necessary files under the target
-ACE directory.
+In this step, we will create a directory under ``production`` with the same name
+as the cluster, and add Terraform configurations and Ansible inventory needed
+to configure a VPN in GCP and ACE accordingly.
.. code-block:: shell
$ cd $WORKDIR/aether-pod-configs/tools
- $ cp ace_env /tmp/ace_env
- $ vi /tmp/ace_env
- # Set environment variables
+ $ cp ace_config.yaml.example ace_config.yaml
+ $ vi ace_config.yaml
+ # Set all values
- $ source /tmp/ace_env
$ make vpn
Created ../production/ace-test
- Created ../production/ace-test/main.tf
- Created ../production/ace-test/variables.tf
- Created ../production/ace-test/gcp_fw.tf
+ Created ../production/ace-test/provider.tf
+ Created ../production/ace-test/cluster.tf
Created ../production/ace-test/gcp_ha_vpn.tf
- Created ../production/ace-test/ansible
+ Created ../production/ace-test/gcp_fw.tf
Created ../production/ace-test/backend.tf
Created ../production/ace-test/cluster_val.tfvars
+ Created ../production/ace-test/ansible
Created ../production/ace-test/ansible/hosts.ini
Created ../production/ace-test/ansible/extra_vars.yml
-.. attention::
- The predefined templates are tailored to Pronto BOM. You'll need to fix `cluster_val.tfvars` and `ansible/extra_vars.yml`
- when using a different BOM.
-Create a review request
------------------------
+Submit your change
+------------------
.. code-block:: shell
@@ -214,18 +266,17 @@
$ git commit -m "Add test ACE"
$ git review
-Once the review request is accepted and merged,
-CD pipeline will create VPN tunnels on both GCP and the management node.
+After the change is merged, wait until the post-merge job finishes.
Verify VPN connection
---------------------
-You can verify the VPN connections after successful post-merge job by checking
+You can verify the VPN connections by checking
the routing table on the management node and trying to ping to one of the
central cluster VMs.
-Make sure two tunnel interfaces, `gcp_tunnel1` and `gcp_tunnel2`, exist
-and three additional routing entries via one of the tunnel interfaces.
+Make sure that the two tunnel interfaces, ``gcp_tunnel1`` and ``gcp_tunnel2``, exist
+and that there are three routing entries that go via one of the tunnel interfaces.
.. code-block:: shell
@@ -233,39 +284,39 @@
$ netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
- 0.0.0.0 128.105.144.1 0.0.0.0 UG 0 0 0 eno1
+ 0.0.0.0 66.201.42.209 0.0.0.0 UG 0 0 0 eno1
+ 10.32.4.0 0.0.0.0 255.255.255.128 U 0 0 0 eno2
+ 10.32.4.128 0.0.0.0 255.255.255.128 U 0 0 0 mgmt800
10.45.128.0 169.254.0.9 255.255.128.0 UG 0 0 0 gcp_tunnel1
10.52.128.0 169.254.0.9 255.255.128.0 UG 0 0 0 gcp_tunnel1
- 10.66.128.0 10.91.0.8 255.255.128.0 UG 0 0 0 eno1
- 10.91.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eno1
+ 10.33.128.0 10.32.4.138 255.255.128.0 UG 0 0 0 mgmt800
10.168.0.0 169.254.0.9 255.255.240.0 UG 0 0 0 gcp_tunnel1
- 128.105.144.0 0.0.0.0 255.255.252.0 U 0 0 0 eno1
+ 66.201.42.208 0.0.0.0 255.255.252.0 U 0 0 0 eno1
169.254.0.8 0.0.0.0 255.255.255.252 U 0 0 0 gcp_tunnel1
169.254.1.8 0.0.0.0 255.255.255.252 U 0 0 0 gcp_tunnel2
# Verify ACC VM access
$ ping 10.168.0.6
- # Verify ACC K8S cluster access
+ # Verify ACC K8S Service access
$ nslookup kube-dns.kube-system.svc.prd.acc.gcp.aetherproject.net 10.52.128.10
-You can further verify whether the ACE routes are propagated well to GCP
-by checking GCP dashboard **VPC Network > Routes > Dynamic**.
+You can also log in to the GCP console and check that the edge subnets appear under
+**VPC Network > Routes > Dynamic**.
Post VPN setup
--------------
-Once you verify the VPN connections, please update `ansible` directory name to
-`_ansible` to prevent the ansible playbook from running again. Note that it is
-no harm to re-run the ansible playbook but not recommended.
+Once you have verified the VPN connections, rename the ``ansible`` directory to
+``_ansible`` to prevent the Ansible playbook from being rerun.
.. code-block:: shell
$ cd $WORKDIR/aether-pod-configs/production/$ACE_NAME
$ mv ansible _ansible
$ git add .
- $ git commit -m "Mark ansible done for test ACE"
+ $ git commit -m "Ansible done for test ACE"
$ git review
.. _add_ace_to_vpn:
@@ -274,7 +325,7 @@
"""""""""""""""""""""""""""""""""""""""""""""
VPN connections can be shared when there are multiple ACE clusters in a site.
-In order to add ACE to an existing VPN connection, you'll have to SSH into the
+In order to add another cluster to an existing VPN connection, you'll have to SSH into the
management node and manually update BIRD configuration.
.. note::
@@ -285,8 +336,9 @@
$ sudo vi /etc/bird/bird.conf
protocol static {
+ # Routings for the existing cluster
...
- route 10.66.128.0/17 via 10.91.0.10;
+ route 10.33.128.0/17 via 10.32.4.138;
# Add routings for the new ACE's K8S cluster IP range via cluster nodes
# TODO: Configure iBGP peering with Calico nodes and dynamically learn these routings
@@ -297,7 +349,7 @@
filter gcp_tunnel_out {
# Add the new ACE's K8S cluster IP range and the management subnet if required to the list
- if (net ~ [ 10.91.0.0/24, 10.66.128.0/17, <NEW-ACE-CLUSTER-IP-RANGE> ]) then accept;
+ if (net ~ [ 10.32.4.0/24, 10.33.128.0/17, <NEW-ACE-CLUSTER-MGMT-SUBNET>, <NEW-ACE-CLUSTER-IP-RANGE> ]) then accept;
else reject;
}
# Save and exit
diff --git a/index.rst b/index.rst
index f7ce0cd..4aaff92 100644
--- a/index.rst
+++ b/index.rst
@@ -30,6 +30,7 @@
edge_deployment/fabric_switch_bootstrap
edge_deployment/vpn_bootstrap
edge_deployment/runtime_deployment
+ edge_deployment/bess_upf_deployment
edge_deployment/tost_deployment
edge_deployment/connectivity_service_update
edge_deployment/enb_installation
@@ -42,6 +43,7 @@
:glob:
amp/roc
+ amp/monitoring
.. toctree::
:maxdepth: 3