[SEBA-154] and [SEBA-104] Docs

Change-Id: I449edbb78c0900af0d4ce6bb6a1a80a77c625faf
diff --git a/SUMMARY.md b/SUMMARY.md
index ee4e0ec..cd55421 100644
--- a/SUMMARY.md
+++ b/SUMMARY.md
@@ -35,7 +35,8 @@
             * [VTN Setup](prereqs/vtn-setup.md)
         * [M-CORD](charts/mcord.md)
         * [XOSSH](charts/xossh.md)
-        * [MONITORING](charts/monitoring.md)
+        * [Logging and Monitoring](charts/logging-monitoring.md)
+        * [Persistent Storage](charts/storage.md)
 * [Operations Guide](operating_cord/operating_cord.md)
     * [General Info](operating_cord/general.md)
         * [GUI](operating_cord/gui.md)
diff --git a/charts/kafka.md b/charts/kafka.md
index 4004be0..99cfe0e 100644
--- a/charts/kafka.md
+++ b/charts/kafka.md
@@ -7,22 +7,11 @@
 
 ```shell
 helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator
-helm install --name cord-kafka \
---set replicas=1 \
---set persistence.enabled=false \
---set zookeeper.servers=1 \
---set zookeeper.persistence.enabled=false \
-incubator/kafka
+helm install -f examples/kafka-single.yaml --version 0.8.8 -n cord-kafka incubator/kafka
+helm install -f examples/kafka-single.yaml --version 0.8.8 -n voltha-kafka incubator/kafka
 ```
 
-If you are experierencing problems with a multi instance installation of kafka,
-you can try to install a single instance of it:
-
-```shell
-helm install --name cord-kafka incubator/kafka -f examples/kafka-single.yaml
-```
-
-## Viewing events on the bus
+## Viewing events with kafkacat
 
 As a debugging tool you can deploy a container containing `kafkacat` and use
 that to listen for events:
@@ -31,21 +20,27 @@
 helm install -n kafkacat xos-tools/kafkacat/
 ```
 
-Once the container is up and running you can exec into the pod and use this 
-command to listen for events on a particular topic:
+Once the container is up and running, you can exec into the pod and use various
+commands. For a complete reference, please refer to the [`kafkacat`
+guide](https://github.com/edenhill/kafkacat).
 
-```shell
-kafkacat -C -b <kafka-service> -t <kafka-topic>
-```
+A few examples:
 
-For a complete reference, please refer to the [`kafkacat` guide](https://github.com/edenhill/kafkacat)
+- List available topics:
+  ```shell
+  kafkacat -L -b <kafka-service>
+  ```
 
-### Most common topics
+- Listen for events on a particular topic:
+  ```shell
+  kafkacat -C -b <kafka-service> -t <kafka-topic>
+  ```
 
-Here are some of the most common topic you can listen to on `cord-kafka`:
+- Some common topics to listen for on `cord-kafka` and `voltha-kafka`:
 
-```shell
-kafkacat -b cord-kafka -t onu.events
-kafkacat -b cord-kafka -t authentication.events
-kafkacat -b cord-kafka -t dhcp.events
-```
\ No newline at end of file
+  ```shell
+  kafkacat -b cord-kafka -t onu.events
+  kafkacat -b cord-kafka -t authentication.events
+  kafkacat -b cord-kafka -t dhcp.events
+  kafkacat -b voltha-kafka -t voltha.events
+  ```
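+
+- Print each event with its topic, partition, and offset (a sketch using
+  `kafkacat`'s format string; adjust the topic as needed):
+  ```shell
+  kafkacat -C -b cord-kafka -t onu.events -f 'Topic %t [%p] at offset %o: %s\n'
+  ```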
diff --git a/charts/local-persistent-volume.md b/charts/local-persistent-volume.md
deleted file mode 100644
index b2bd8d8..0000000
--- a/charts/local-persistent-volume.md
+++ /dev/null
@@ -1,40 +0,0 @@
-# Local Persistent Volume Helm chart
-
-## Introduction
-
-The `local-persistent-volume` helm chart is a utility helm chart. It was
-created mainly to persist the `xos-core` DB data but this helm can be used
-to persist any data.
-
-It uses a relatively new kubernetes feature (it's a beta feature
-in Kubernetes 1.10.x) that allows us to define an independent persistent
-store in a kubernetes cluster.
-
-The helm chart mainly consists of the following kubernetes resources:
-
-- A storage class resource representing a local persistent volume
-- A persistent volume resource associated with the storage class and a specific directory on a specific node
-- A persistent volume claim resource that claims certain portion of the persistent volume on behalf of a pod
-
-The following variables are configurable in the helm chart:
-
-- `storageClassName`: The name of the storage class resource
-- `persistentVolumeName`: The name of the persistent volume resource
-- `pvClaimName`: The name of the persistent volume claim resource
-- `volumeHostName`: The name of the kubernetes node on which the data will be persisted
-- `hostLocalPath`: The directory or volume mount path on the chosen chosen node where data will be persisted
-- `pvStorageCapacity`: The capacity of the volume available to the persistent volume resource (e.g. 10Gi)
-
-Note: For this helm chart to work, the volume mount path or directory specified in the `hostLocalPath` variable needs to exist before the helm chart is deployed.
-
-## Standard Install
-
-```shell
-helm install -n local-store local-persistent-volume
-```
-
-## Standard Uninstall
-
-```shell
-helm delete --purge local-store
-```
diff --git a/charts/logging-monitoring.md b/charts/logging-monitoring.md
new file mode 100644
index 0000000..8b80de4
--- /dev/null
+++ b/charts/logging-monitoring.md
@@ -0,0 +1,46 @@
+# Deploy Logging and Monitoring components
+
+To read more about logging and monitoring in CORD, please refer to [the design
+document](https://docs.google.com/document/d/1hCljvKzsNW9D2Y1cbvOTNOCbTy1AgH33zXvVjbicjH8/edit).
+
+There are currently two charts that deploy logging and monitoring
+functionality: `nem-monitoring` and `logging`.  Both of these charts depend on
+having [kafka](kafka.md) instances running in order to pass messages.
+
+
+## `nem-monitoring` charts
+
+```shell
+helm dep update nem-monitoring
+helm install -n nem-monitoring nem-monitoring
+```
+
+> NOTE: In order to display `voltha` KPIs, you need to have `voltha`
+> and `voltha-kafka` installed.
+
+### Monitoring Dashboards
+
+This chart exposes two dashboards:
+
+- [Grafana](http://docs.grafana.org/) on port `31300`
+- [Prometheus](https://prometheus.io/docs/) on port `31301`
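+
+Assuming these are exposed as Kubernetes NodePort services (an assumption based
+on the port range), they can be reached on any cluster node's IP:
+
+```shell
+kubectl get nodes -o wide
+# then browse to http://<node-ip>:31300 (Grafana) or http://<node-ip>:31301 (Prometheus)
+```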
+
+## `logging` charts
+
+```shell
+helm dep up logging
+helm install -n logging logging
+```
+
+For smaller developer/test environments without persistent storage, please use
+the `examples/logging-single.yaml` file to run the logging chart, which doesn't
+create PVCs.
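+
+A minimal invocation using that file (assuming you're running from the
+`helm-charts` directory) might look like:
+
+```shell
+helm install -f examples/logging-single.yaml -n logging logging
+```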
+
+### Logging Dashboard
+
+The [Kibana](https://www.elastic.co/guide/en/kibana/current/index.html)
+dashboard can be found on port `30601`.
+
+To start using Kibana, you must create an index pattern under *Management >
+Index Patterns*.  Create one with a name of `logstash-*`, then you can search
+for events in the *Discover* section.
diff --git a/charts/monitoring.md b/charts/monitoring.md
deleted file mode 100644
index 60af004..0000000
--- a/charts/monitoring.md
+++ /dev/null
@@ -1,20 +0,0 @@
-# Deploy Monitoring
-
-To read more about the monitoring in CORD, please refer to this [document](https://docs.google.com/document/d/1hCljvKzsNW9D2Y1cbvOTNOCbTy1AgH33zXvVjbicjH8/edit).
-
-To install the required components in you cluster:
-
-```shell
-helm dep update nem-monitoring
-helm install -n nem-monitoring nem-monitoring
-```
-
-> NOTE: In order to display `voltha` kpis you need to have `voltha`
-> and `voltha-kafka` installed.
-
-## Access the monitoring dashboard
-
-This chart exposes two dashboards:
-
-- grafana on port `31300`
-- prometheus on port `31301`
diff --git a/charts/storage.md b/charts/storage.md
new file mode 100644
index 0000000..6f8301b
--- /dev/null
+++ b/charts/storage.md
@@ -0,0 +1,339 @@
+# Persistent Storage charts
+
+These charts provide persistent storage within Kubernetes.
+
+See the Kubernetes documentation for background material on how persistent
+storage works:
+
+- [StorageClass](https://kubernetes.io/docs/concepts/storage/storage-classes/)
+- [PersistentVolume](https://kubernetes.io/docs/concepts/storage/persistent-volumes/)
+
+Using persistent storage is optional during development, but it should be
+provisioned and configured for production and realistic testing scenarios.
+
+## Local Directory
+
+The `local-directory` chart creates
+[local](https://kubernetes.io/docs/concepts/storage/volumes/#local) volumes on
+specific nodes, from directories. As there are no enforced limits on volume
+size and the node names are preconfigured, this chart is intended only for
+development and testing.
+
+Multiple directories can be specified in the `volumes` list - an example is
+given in the `values.yaml` file of the chart.
+
+The `StorageClass` created for all volumes is `local-directory`.
+
+There is an ansible playbook that automates the creation of these directories
+on all the kubernetes nodes.  Make sure that the inventory name in ansible
+matches the one given as `host` in the `volumes` list, then invoke it with:
+
+```shell
+ansible-playbook -i <path to ansible inventory> --extra-vars "helm_values_file=<path to values.yaml>" local-directory-playbook.yaml
+```
+
+## Local Provisioner
+
+The `local-provisioner` chart provides a
+[local](https://kubernetes.io/docs/concepts/storage/volumes/#local),
+non-distributed `PersistentVolume` that is usable on one specific node.  It
+does this by running the k8s [external storage local volume
+provisioner](https://github.com/kubernetes-incubator/external-storage/tree/master/local-volume/helm/provisioner).
+
+This type of storage is useful for workloads that have their own intrinsic HA
+or redundancy strategies, and only need storage on multiple nodes.
+
+This provisioner is not "dynamic" in the sense that it can't create a new
+`PersistentVolume` on demand from a storage pool, but it will automatically
+create volumes as disks/partitions are mounted on the nodes.
+
+To create a new PV, a disk or partition on a node has to be formatted and
+mounted in specific locations, after which the provisioner will automatically
+create a `PersistentVolume` for the mount. As these volumes can't be split or
+resized, care must be taken to ensure that the correct quantity, types, and
+sizes of mounts are created for all the `PersistentVolumeClaim`s required to
+be bound for a specific workload.
+
+By default, two `StorageClasses` are created to differentiate between hard
+disks and SSDs:
+
+- `local-hdd`, which offers PVs on volumes mounted in `/mnt/local-storage/hdd/*`
+- `local-ssd`, which offers PVs on volumes mounted in `/mnt/local-storage/ssd/*`
+
+### Adding a new local volume on a node
+
+If you wanted to add a new volume to a node, you'd physically install a new
+disk in the system, then determine the device file it uses. Assuming that it's
+a hard disk and the device file is `/dev/sdb`, you might partition, format, and
+mount the disk like this:
+
+```shell
+$ sudo parted -s /dev/sdb \
+    mklabel gpt \
+    mkpart primary ext4 1MiB 100%
+$ sudo mkfs.ext4 /dev/sdb1
+$ echo "/dev/sdb1 /mnt/local-storage/hdd/sdb1 ext4 defaults 0 0" | sudo tee -a /etc/fstab
+$ sudo mount /mnt/local-storage/hdd/sdb1
+```
+
+Then check that the `PersistentVolume` is created by the `local-provisioner`:
+
+```shell
+$ kubectl get pv
+NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                  STORAGECLASS     REASON    AGE
+local-pv-2bfa2c43   19Gi       RWO            Delete           Available                          local-hdd                  6h
+
+$ kubectl describe pv local-pv-2bfa2c43
+Name:              local-pv-2bfa2c43
+Labels:            <none>
+Annotations:       pv.kubernetes.io/provisioned-by=local-volume-provisioner-node1-...
+Finalizers:        [kubernetes.io/pv-protection]
+StorageClass:      local-hdd
+Status:            Available
+Claim:
+Reclaim Policy:    Delete
+Access Modes:      RWO
+Capacity:          19Gi
+Node Affinity:
+  Required Terms:
+    Term 0:        kubernetes.io/hostname in [node1]
+Message:
+Source:
+    Type:  LocalVolume (a persistent volume backed by local storage on a node)
+    Path:  /mnt/local-storage/hdd/sdb1
+Events:    <none>
+```
+
+## Ceph deployed with Rook
+
+[Rook](https://rook.github.io/) provides an abstraction layer for Ceph and
+other distributed persistent data storage systems.
+
+There are 3 Rook charts included with CORD:
+
+- `rook-operator`, which runs the volume provisioning portion of Rook (and is a
+  thin wrapper around the upstream [rook-ceph
+  chart](https://rook.github.io/docs/rook/v0.8/helm-operator.html))
+
+- `rook-cluster`, which defines the Ceph cluster and creates these
+  `StorageClass` objects usable by other charts:
+
+    - `cord-ceph-rbd`, which dynamically creates `PersistentVolumes` when a
+      `PersistentVolumeClaim` is created. These volumes are only usable by a
+      single container at a time.
+
+    - `cord-cephfs`, a single shared filesystem which is mountable
+      `ReadWriteMany` on multiple containers via a `PersistentVolumeClaim`. Its
+      size is predetermined.
+
+- `rook-tools`, which provides a toolbox container for troubleshooting problems
+  with Rook/Ceph
+
+To create persistent volumes, you will need to load the first two charts; the
+third is only needed for troubleshooting and diagnostics.
+
+### Rook Node Prerequisites
+
+By default, all the nodes running k8s are expected to have a directory named
+`/mnt/ceph` where the Ceph data is stored (the `cephDataDir` variable can be
+used to change this path).
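+
+One way to create that directory on every node is an ad-hoc ansible command
+(the inventory path is an assumption, matching the kubespray-installer layout
+used in the cleanup example later on this page):
+
+```shell
+ansible -i inventories/test/inventory.cfg -b -m shell -a "mkdir -p /mnt/ceph" all
+```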
+
+In a production deployment, this would ideally be located on its own block
+storage device.
+
+There should be at least 3 nodes with storage available to provide data
+redundancy.
+
+### Loading Rook Charts
+
+First, add the `rook-beta` repo to helm, then load the `rook-operator` chart
+into the `rook-ceph-system` namespace:
+
+```shell
+cd helm-charts/storage
+helm repo add rook-beta https://charts.rook.io/beta
+helm dep update rook-operator
+helm install --namespace rook-ceph-system -n rook-operator rook-operator
+```
+
+Check that it's running (it will start the `rook-ceph-agent` and
+`rook-discover` DaemonSets):
+
+```shell
+$ kubectl -n rook-ceph-system get pods
+NAME                                  READY     STATUS    RESTARTS   AGE
+rook-ceph-agent-4c66b                 1/1       Running   0          6m
+rook-ceph-agent-dsdsr                 1/1       Running   0          6m
+rook-ceph-agent-gwjlk                 1/1       Running   0          6m
+rook-ceph-operator-687b7bb6ff-vzjsl   1/1       Running   0          7m
+rook-discover-9f87r                   1/1       Running   0          6m
+rook-discover-lmhz9                   1/1       Running   0          6m
+rook-discover-mxsr5                   1/1       Running   0          6m
+```
+
+Next, load the `rook-cluster` chart, which connects the storage on the nodes to
+the Ceph pool, and the CephFS filesystem:
+
+```shell
+helm install -n rook-cluster rook-cluster
+```
+
+Check that the cluster is running - this may take a few minutes. Look for the
+`rook-ceph-mds-*` containers to start:
+
+```shell
+$ kubectl -n rook-ceph get pods
+NAME                                                  READY     STATUS      RESTARTS   AGE
+rook-ceph-mds-cord-ceph-filesystem-7564b648cf-4wxzn   1/1       Running     0          1m
+rook-ceph-mds-cord-ceph-filesystem-7564b648cf-rcvnx   1/1       Running     0          1m
+rook-ceph-mgr-a-75654fb698-zqj67                      1/1       Running     0          5m
+rook-ceph-mon0-v9d2t                                  1/1       Running     0          5m
+rook-ceph-mon1-4sxgc                                  1/1       Running     0          5m
+rook-ceph-mon2-6b6pj                                  1/1       Running     0          5m
+rook-ceph-osd-id-0-85d887f76c-44w9d                   1/1       Running     0          4m
+rook-ceph-osd-id-1-866fb5c684-lmxfp                   1/1       Running     0          4m
+rook-ceph-osd-id-2-557dd69c5c-qdnmb                   1/1       Running     0          4m
+rook-ceph-osd-prepare-node1-bfzzm                     0/1       Completed   0          4m
+rook-ceph-osd-prepare-node2-dt4gx                     0/1       Completed   0          4m
+rook-ceph-osd-prepare-node3-t5fnn                     0/1       Completed   0          4m
+
+$ kubectl -n rook-ceph get storageclass
+NAME            PROVISIONER                    AGE
+cord-ceph-rbd   ceph.rook.io/block             6m
+cord-cephfs     kubernetes.io/no-provisioner   6m
+
+$ kubectl -n rook-ceph get filesystems
+NAME                   AGE
+cord-ceph-filesystem   6m
+
+$ kubectl -n rook-ceph get pools
+NAME             AGE
+cord-ceph-pool   6m
+
+$ kubectl -n rook-ceph get persistentvolume
+NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM     STORAGECLASS   REASON    AGE
+cord-cephfs-pv      20Gi       RWX            Retain           Available             cord-cephfs              7m
+```
+
+At this point you can create a `PersistentVolumeClaim` on `cord-ceph-rbd`, and
+a corresponding `PersistentVolume` will be created by the `rook-ceph-operator`
+(acting as a volume provisioner) and bound to the PVC.
+
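+As a sketch (the claim name and size here are illustrative), a claim against
+`cord-ceph-rbd` can be created directly with `kubectl`:
+
+```shell
+cat <<EOF | kubectl apply -f -
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: example-rbd-claim
+spec:
+  storageClassName: cord-ceph-rbd
+  accessModes:
+    - ReadWriteOnce
+  resources:
+    requests:
+      storage: 5Gi
+EOF
+
+# verify the claim was bound to a dynamically created PersistentVolume
+kubectl get pvc example-rbd-claim
+```
+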
+Creating a `PersistentVolumeClaim` on `cord-cephfs` will mount the same CephFS
+filesystem on every container that requests it. The CephFS PV implementation
+currently isn't as mature as the Ceph RBD volumes, and may not remount properly
+when used with a PVC.
+
+### Troubleshooting Rook
+
+Checking the `rook-ceph-operator` logs can be enlightening:
+
+```shell
+kubectl -n rook-ceph-system logs -f rook-ceph-operator-...
+```
+
+The [Rook toolbox container](https://rook.io/docs/rook/v0.8/toolbox.html) is
+packaged as the `rook-tools` chart, and provides a variety of tools for
+debugging Rook and Ceph.
+
+Load the `rook-tools` chart:
+
+```shell
+helm install -n rook-tools rook-tools
+```
+
+Once the container is running (check with `kubectl -n rook-ceph get pods`),
+exec into it to run a shell to access all tools:
+
+```shell
+kubectl -n rook-ceph exec -it rook-ceph-tools bash
+```
+
+or run a one-off command:
+
+```shell
+kubectl -n rook-ceph exec rook-ceph-tools -- ceph status
+```
+
+or mount the CephFS volume:
+
+```shell
+kubectl -n rook-ceph exec -it rook-ceph-tools bash
+mkdir /mnt/cephfs
+mon_endpoints=$(grep mon_host /etc/ceph/ceph.conf | awk '{print $3}')
+my_secret=$(grep key /etc/ceph/keyring | awk '{print $3}')
+mount -t ceph -o name=admin,secret=$my_secret $mon_endpoints:/ /mnt/cephfs
+ls /mnt/cephfs
+```
+
+### Cleaning up after Rook
+
+The `rook-operator` chart will leave a few `DaemonSet`s behind after it's
+removed. Clean these up with the following commands:
+
+```shell
+kubectl -n rook-ceph-system delete daemonset rook-ceph-agent
+kubectl -n rook-ceph-system delete daemonset rook-discover
+helm delete --purge rook-operator
+```
+
+If you have other charts that create `PersistentVolumeClaims`, you may need to
+clean them up manually (for example, if you've changed the `StorageClass` they
+use). List them with:
+
+```shell
+kubectl get pvc --all-namespaces
+```
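+
+A stale claim can then be removed by hand (the namespace and claim name below
+are placeholders):
+
+```shell
+kubectl -n <namespace> delete pvc <pvc-name>
+```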
+
+Files may be left behind in the Ceph storage directory and/or the Rook
+configuration that need to be deleted before the `rook-*` charts are started
+again. If you've used the `automation-tools/kubespray-installer` scripts to set
+up an environment named `test`, you can delete all these files with the
+following commands:
+
+```shell
+cd cord/automation-tools/kubespray-installer
+ansible -i inventories/test/inventory.cfg -b -m shell -a "rm -rf /var/lib/rook && rm -rf /mnt/ceph/*" all
+```
+
+The current upgrade process for Rook involves manual intervention and
+inspection using the tools container.
+
+## Using Persistent Storage
+
+The general process for using persistent storage is to create a
+[PersistentVolumeClaim](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims)
+on the appropriate
+[StorageClass](https://kubernetes.io/docs/concepts/storage/storage-classes/)
+for the workload you're trying to run.
+
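+To see which `StorageClass` options are available in your cluster before
+writing a claim, you can list them (the output will vary by deployment):
+
+```shell
+kubectl get storageclass
+```
+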
+### Example: XOS Database on a local directory
+
+For development and testing, it may be useful to persist the XOS database to a
+local directory on a node:
+
+```shell
+helm install -f examples/xos-db-local-dir.yaml -n xos-core xos-core
+```
+
+### Example: XOS Database on a Ceph RBD volume
+
+The XOS Database (Postgres) wants a volume that persists if a node goes down or
+is taken out of service and that isn't shared with other containers running
+Postgres, so the Ceph RBD volume is a reasonable choice to use with it.
+
+```shell
+helm install -f examples/xos-db-ceph-rbd.yaml -n xos-core xos-core
+```
+
+### Example: Docker Registry on CephFS shared filesystem
+
+The Docker Registry wants a filesystem that is shared across all containers, so
+it's a suitable workload for the `cephfs` shared filesystem.
+
+There's an example values file available in `helm-charts/examples/registry-cephfs.yaml`:
+
+```shell
+helm install -f examples/registry-cephfs.yaml -n docker-registry stable/docker-registry
+```
+
diff --git a/charts/voltha.md b/charts/voltha.md
index eb7b47a..03169b5 100644
--- a/charts/voltha.md
+++ b/charts/voltha.md
@@ -1,28 +1,21 @@
 # Deploy VOLTHA
 
+VOLTHA depends on having a [kafka message bus](kafka.md) deployed with a name
+of `voltha-kafka`, so deploy that with helm before deploying the voltha chart.
+
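+For example, reusing the single-instance example values from the [kafka
+page](kafka.md) (this assumes the `incubator` repo has already been added, as
+described below):
+
+```shell
+helm install -f examples/kafka-single.yaml --version 0.8.8 -n voltha-kafka incubator/kafka
+```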
+
 ## First Time Installation
 
-Download the helm charts `incubator` repository
+Add the `incubator` helm chart repository:
 
 ```shell
 helm repo add incubator https://kubernetes-charts-incubator.storage.googleapis.com/
 ```
 
-Build dependencies
+Update dependencies within the voltha chart:
 
 ```shell
-helm dep build voltha
-```
-
-Install the kafka dependency
-
-```shell
-helm install --name voltha-kafka \
---set replicas=1 \
---set persistence.enabled=false \
---set zookeeper.servers=1 \
---set zookeeper.persistence.enabled=false \
-incubator/kafka
+helm dep up voltha
 ```
 
 There is an `etcd-operator` **known bug** that prevents deploying
diff --git a/prereqs/k8s-multi-node.md b/prereqs/k8s-multi-node.md
index 7b70562..40eba73 100644
--- a/prereqs/k8s-multi-node.md
+++ b/prereqs/k8s-multi-node.md
@@ -19,7 +19,8 @@
 * **Operator/Developer Machine** (1x, either physical or virtual machine)
     * Has Git installed
     * Has Python3 installed (<https://www.python.org/downloads/>)
-    * Has a stable version of Ansible installed (<http://docs.ansible.com/ansible/latest/intro_installation.html>)
+    * Has a stable version of Ansible installed (<http://docs.ansible.com/ansible/latest/intro_installation.html>), tested with version `2.5.3`
+    * Has [ansible-modules-hashivault](https://pypi.org/project/ansible-modules-hashivault/) installed where ansible can use it.
     * Is able to reach the target servers (ssh into them)
 * **Target/Cluster Machines** (at least 3x, either physical or virtual machines)
     * Run Ubuntu 16.04 server
diff --git a/profiles/rcord/workflows/att.md b/profiles/rcord/workflows/att.md
index 0d384a1..4e889bb 100644
--- a/profiles/rcord/workflows/att.md
+++ b/profiles/rcord/workflows/att.md
@@ -159,4 +159,4 @@
 
 ### Device monitoring
 
-Please refer to the [monitoring](../../../charts/monitoring.md) chart.
\ No newline at end of file
+Please refer to the [logging and monitoring](../../../charts/logging-monitoring.md) chart.