=============================================
VOLTHA and ONOS software update procedures
=============================================

This document describes the software upgrade procedures for VOLTHA and ONOS in a deployed system.
A distinction is made between a `minor` software upgrade, which can be done in service,
meaning with no dataplane service interruption to existing customers, and a `major` software upgrade,
which in turn requires a full maintenance window during which service is impacted.

Changes to data structures in storage (ETCD for VOLTHA and Atomix for ONOS) are out of scope for in-service upgrades.
Such changes qualify as major software upgrades that require a maintenance window.
The Kafka bus update has its own section, given that its procedure differs from that of the other components.
The following procedures assume a fully working, provisioned VOLTHA and ONOS deployment on top of a Kubernetes cluster,
with the ONOS REST API ports exposed.
It is also expected that new versions of the different components are available to the operator performing
the upgrade.

Minor Software Version Update
=============================
The `minor` software upgrade qualifier refers to an upgrade that does not involve API
changes, which in VOLTHA refers to a change to either the protos or voltha-lib-go,
and in ONOS to a change in the Java interfaces, CLI commands or REST APIs of either the apps or the platform.
A `minor` software update is intended for bug fixes and not for new features.
`Minor` software updates are supported only for ONOS apps and VOLTHA components. No in-service software update
is supported for ETCD or Kafka.

VOLTHA services
---------------
The `minor` software upgrade of VOLTHA components leverages `helm` and `k8s`.
During this process it is expected that no subscriber provisioning calls are executed from the northbound.
In-progress calls will complete thanks to the stored data and/or the persistence of messages over Kafka.

After changes in the code are made and verified, the following steps are needed:

#. Update the minor version of the component.
#. Build a new version of the component that needs to be updated.
#. Update the component's minor version in the helm chart.
#. | Issue the helm upgrade command. If the changes have already been upstreamed to ONF, the upstream chart
   | `onf/<component name>` can be used; otherwise a local copy of the chart is required.

Following is an example of the `helm` command to upgrade the openonu adapter.
Topics, kv store paths and Kafka endpoints need to be adapted to the specific deployment.

.. code:: bash

   helm upgrade --install --create-namespace \
     -n voltha1 openonu-adapter onf/voltha-adapter-openonu \
     --set global.stack_name=voltha1 \
     --set adapter_open_onu.kv_store_data_prefix=service/voltha/voltha1_voltha1 \
     --set adapter_open_onu.topics.core_topic=voltha1_voltha1_rwcore \
     --set adapter_open_onu.topics.adapter_open_onu_topic=voltha1_voltha1_brcm_openomci_onu \
     --set services.kafka.adapter.service=voltha-infra-kafka.infra.svc \
     --set services.kafka.cluster.service=voltha-infra-kafka.infra.svc \
     --set services.etcd.service=voltha-infra-etcd.infra.svc

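After the upgrade, the rollout can be verified with `kubectl`. The deployment name below is illustrative and depends on the release name used in the chart:

```shell
# Wait for the rolling update of the adapter to complete
# (check the actual deployment name with `kubectl get deploy -n voltha1`).
kubectl -n voltha1 rollout status deploy/openonu-adapter --timeout=300s

# Confirm the pods restarted and are Running with the new image.
kubectl -n voltha1 get pods -o wide
```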
ONOS apps
---------
`Minor` software updates are also available for the following ONOS apps - `sadis`, `olt`, `aaa`, `kafka`, `dhcpl2relay`,
`mac-learning`, `igmpproxy`, and `mcast`. These apps can thus be updated with no impact on the dataplane of provisioned
subscribers. The `minor` software update for the ONOS apps leverages existing ONOS REST APIs.

During this process it is expected that no subscriber provisioning calls are executed via the REST APIs.
In-progress calls will complete thanks to the flows stored in Atomix.
Some metrics and/or packet processing might be lost during this procedure; the system relies on the retry mechanisms
present in the services and the dataplane protocols to converge to a stable state (e.g. DHCP retry).

After changes in the code of the ONOS apps are made and verified, the following steps are needed:

#. | Obtain the .oar of the app, either via a local build with `mvn clean install` or, if the code has been upstreamed,
   | by downloading it from `maven central <https://search.maven.org/search?q=g:org.opencord>`_ or Sonatype.
#. Delete the old version of the ONOS app.
#. Upload, install and activate the new `oar` file.

Following is an example of the different `curl` commands to upgrade the olt app. This assumes the .oar to be present in
the directory where the commands are executed.

.. code:: bash

   # download the app
   curl --fail -sSL https://oss.sonatype.org/content/groups/public/org/opencord/olt-app/4.5.0-SNAPSHOT/olt-app-4.5.0-20210504.162620-3.oar > org.opencord.olt-4.5.0.SNAPSHOT.oar
   # delete the app
   curl --fail -sSL -X DELETE http://karaf:karaf@127.0.0.1:8181/onos/v1/applications/org.opencord.olt
   # install and activate the new version of the app
   curl --fail -sSL -H Content-Type:application/octet-stream -X POST http://karaf:karaf@127.0.0.1:8181/onos/v1/applications?activate=true --data-binary @org.opencord.olt-4.5.0.SNAPSHOT.oar 2>&1

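Whether the new version is installed and running can then be checked against the same ONOS REST API; the response JSON carries the app's `state` (expected `ACTIVE`) and `version` fields:

```shell
# Verify the olt app is ACTIVE at the expected version.
curl --fail -sSL http://karaf:karaf@127.0.0.1:8181/onos/v1/applications/org.opencord.olt
```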
Major Software Version Update
=============================
A software update is qualified as `major` when there are changes in the APIs or in the format of the
data stored by a component.

At the moment a major software update in VOLTHA and ONOS requires a maintenance window,
during which the dataplane for the subscribers is interrupted, thus no service is provided.
There are several cases, and they can be handled differently.

VOLTHA services API or Data format changes
------------------------------------------
A `major` update is needed when the VOLTHA APIs between components have changed or when the format of the data being
stored is different, so that a complete wipe-out needs to be performed.
In such a scenario each stack can be updated independently, with no teardown required of the ONOS,
ETCD and Kafka infrastructure.
Different versions of VOLTHA can co-exist over the same infrastructure.

The procedure is iterative on each stack and is performed as follows:

#. Un-provision all the subscribers via the ONOS REST API.
#. Delete all the OLTs managed by the stack via the VOLTHA gRPC API.
#. Upgrade the stack version via the `helm upgrade` command and the correct version of the `voltha-stack` chart.

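As a sketch of the three steps, the device ID, the olt app REST path parameters and the release names below are illustrative and deployment-specific:

```shell
# 1. Un-provision a subscriber via the ONOS olt app REST API
#    (repeat for every provisioned device/port).
curl --fail -sSL -X DELETE \
  http://karaf:karaf@127.0.0.1:8181/onos/olt/oltapp/of:00000a0a0a0a0a0a/16

# 2. Delete the OLTs managed by the stack via the VOLTHA gRPC API
#    (look up the device IDs with `voltctl device list`).
voltctl device delete <olt_device_id>

# 3. Upgrade the stack, repeating the --set overrides used at install time.
helm upgrade --install -n voltha1 voltha1 onf/voltha-stack --version <new-version>
```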
Details on the `helm` commands can be found in the `voltha-helm-charts README file <voltha-helm-charts/README.md>`_.

If the API change is between the `openolt adapter` and the `openolt agent` on the OLT hardware, please refer to section
:ref:`OpenOLT Agent Update <openolt-update>`.

ONOS, Atomix or ONOS apps
-------------------------
A `major` update is needed when changes have been made to the interfaces (Java APIs) or REST APIs of ONOS itself or of
one of the apps, rendering the two subsequent implementations incompatible. A `major` software update is
also needed for changes made to the data stored in Atomix or for an update of the Atomix version itself.
In this scenario all the stacks connected to an ONOS instance need to be cleaned of data before moving them
over to a new ONOS cluster.

The procedure is as follows:

#. Deploy a new ONOS cluster in a new namespace `infra1`.
#. Un-provision all the subscribers via the ONOS REST API.
#. Delete the OLT device (not strictly required, but best to ensure a clean state).
#. Redeploy the of-agent with the new ONOS cluster endpoints.
#. Re-provision the OLT.
#. Re-provision the subscribers.
#. Iterate over steps 2-6 for each of the stacks connected to the ONOS cluster you want to update.

Following is an example of how to deploy ONOS:

.. code:: bash

   helm install --create-namespace \
     --set replicas=3,atomix.replicas=3 \
     --set atomix.persistence.enabled=false \
     --set image.pullPolicy=Always,image.repository=voltha/voltha-onos,image.tag=5.0.0 \
     --namespace infra1 onos onos/onos-classic

Following is an example of how to re-deploy the of-agent, using the `voltha-stack` chart,
pointing to the new controller endpoints. Only the `ofagent` pod will be restarted.

.. code:: bash

   helm upgrade --install --create-namespace \
     --set global.topics.core_topic=voltha1_voltha1_rwcore,defaults.kv_store_data_prefix=service/minimal \
     --set global.kv_store_data_prefix=service/voltha/voltha1_voltha1 \
     --set services.etcd.port=2379 --set services.etcd.address=etcd.default.svc:2379 \
     --set services.kafka.adapter.service=voltha-infra-kafka.infra.svc \
     --set services.kafka.cluster.service=voltha-infra-kafka.infra.svc \
     --set services.etcd.service=voltha-infra-etcd.infra.svc \
     --set 'voltha.services.controller[0].service=voltha-infra1-onos-classic-0.voltha-infra1-onos-classic-hs.infra1.svc' \
     --set 'voltha.services.controller[0].port=6653' \
     --set 'voltha.services.controller[0].address=voltha-infra1-onos-classic-0.voltha-infra1-onos-classic-hs.infra1.svc:6653' \
     --set 'voltha.services.controller[1].service=voltha-infra1-onos-classic-1.voltha-infra1-onos-classic-hs.infra1.svc' \
     --set 'voltha.services.controller[1].port=6653' \
     --set 'voltha.services.controller[1].address=voltha-infra1-onos-classic-1.voltha-infra1-onos-classic-hs.infra1.svc:6653' \
     --set 'voltha.services.controller[2].service=voltha-infra1-onos-classic-2.voltha-infra1-onos-classic-hs.infra1.svc' \
     --set 'voltha.services.controller[2].port=6653' \
     --set 'voltha.services.controller[2].address=voltha-infra1-onos-classic-2.voltha-infra1-onos-classic-hs.infra1.svc:6653' \
     --set global.log_level=WARN --namespace voltha voltha onf/voltha-stack

ETCD
----
A `major` update is needed because tearing down the ETCD cluster means deleting the stored data,
thus requiring a rebuild by the different components.

The procedure is as follows:

#. Deploy a new ETCD cluster.
#. Un-provision all the subscribers via the ONOS REST API.
#. Delete the OLT device (not strictly required, but best to ensure a clean state).
#. Redeploy the VOLTHA stack with the `voltha-stack` `helm` chart, pointing it to the new ETCD endpoints.
#. Re-provision the OLT.
#. Re-provision the subscribers.
#. Iterate over steps 2-6 for each stack connected to the ETCD cluster you want to update.

Details on the `helm` commands for the voltha stack can be found in the `voltha-helm-charts README file <../voltha-helm-charts/README.md>`_.

Following is an example of how to deploy a new 3-node ETCD cluster:

.. code:: bash

   helm install --create-namespace \
     --set auth.rbac.enabled=false,persistence.enabled=false,statefulset.replicaCount=3 \
     --namespace infra etcd bitnami/etcd

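Re-pointing a stack at the new cluster (step 4) is again a `helm upgrade` of the `voltha-stack` chart. As a sketch, only the ETCD overrides change, while the Kafka and controller overrides from the original install must be repeated (service names below are illustrative):

```shell
helm upgrade --install -n voltha1 voltha1 onf/voltha-stack \
  --set services.etcd.service=etcd.infra.svc \
  --set services.etcd.port=2379
  # ...plus the kafka/controller --set overrides used at install time
```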
KAFKA Update
============
An update of Kafka is not considered to be a `major` software upgrade because it can be performed with
no service impact to the user.

Following is an example of how to deploy a new Kafka cluster:

.. code:: bash

   helm install --create-namespace --set global.log_level=WARN --namespace infra kafka bitnami/kafka

Following is an example of how to re-deploy the stack pods, using the `voltha-stack` chart,
pointing to the new Kafka (`voltha-infra-kafka-2.infra.svc`) endpoints.
Each pod will be restarted, but without dataplane interruption, because the effect is the same as a pod restart,
thus leveraging the data stored in ETCD.

.. code:: bash

   helm upgrade --install --create-namespace \
     --set global.topics.core_topic=voltha1_voltha1_rwcore,defaults.kv_store_data_prefix=service/minimal \
     --set global.kv_store_data_prefix=service/voltha/voltha1_voltha1 \
     --set services.etcd.port=2379 --set services.etcd.address=etcd.default.svc:2379 \
     --set services.kafka.adapter.service=voltha-infra-kafka-2.infra.svc \
     --set services.kafka.cluster.service=voltha-infra-kafka-2.infra.svc \
     --set services.etcd.service=voltha-infra-etcd.infra.svc \
     --set 'voltha.services.controller[0].service=voltha-infra-onos-classic-0.voltha-infra-onos-classic-hs.infra.svc' \
     --set 'voltha.services.controller[0].port=6653' \
     --set 'voltha.services.controller[0].address=voltha-infra-onos-classic-0.voltha-infra-onos-classic-hs.infra.svc:6653' \
     --set 'voltha.services.controller[1].service=voltha-infra-onos-classic-1.voltha-infra-onos-classic-hs.infra.svc' \
     --set 'voltha.services.controller[1].port=6653' \
     --set 'voltha.services.controller[1].address=voltha-infra-onos-classic-1.voltha-infra-onos-classic-hs.infra.svc:6653' \
     --set 'voltha.services.controller[2].service=voltha-infra-onos-classic-2.voltha-infra-onos-classic-hs.infra.svc' \
     --set 'voltha.services.controller[2].port=6653' \
     --set 'voltha.services.controller[2].address=voltha-infra-onos-classic-2.voltha-infra-onos-classic-hs.infra.svc:6653' \
     --set global.log_level=WARN --namespace voltha voltha onf/voltha-stack


.. _openolt-update:

OpenOLT Agent Update
====================

The `openolt agent` on the box can be upgraded without having to tear down the whole VOLTHA stack to which the OLT is
connected. Again, here we make the distinction between a minor update and a major update of the openolt agent.
A minor update happens when there is no API change between the `openolt agent` and the `openolt adapter`, meaning the
`openolt.proto` has not been updated in either of those components.
A major update is required when there are changes to the `openolt.proto` API.

Both updates of the OpenOLT agent are service-impacting for the customer.

Minor Update
------------
A minor update will be seen from VOLTHA as a reboot of the OLT.
During a minor update of the openolt agent no northbound calls should be made; in-progress provisioning calls will
reconcile upon OLT reboot. Events, metrics and performance measurement data can be lost and should not be expected
during this procedure.
The procedure is as follows:

#. Place the new openolt agent `.deb` package on the desired OLT.
#. Stop the running `openolt`, `dev_mgmt_daemon` and optionally the `watchdog` processes on the OLT.
#. Install and run the new openolt packages.
#. Reboot the OLT hardware.

After these steps are done, VOLTHA will receive the OLT connection again and re-provision data accordingly.

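The steps above can be sketched as follows, assuming SSH access to the OLT and a package file named `openolt.deb`; host names, paths and process-management details vary with the BAL/ONL image in use:

```shell
# Copy the new agent package to the OLT.
scp openolt.deb root@olt-host:/tmp/

# On the OLT: stop the agent processes, install the package, then reboot.
ssh root@olt-host '
  pkill watchdog || true   # optional
  pkill dev_mgmt_daemon
  pkill openolt
  dpkg -i /tmp/openolt.deb
  reboot
'
```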
Major update
------------
A major update will require the OLT to be deleted from VOLTHA to ensure no inconsistent data is stored.
During a major update of the openolt agent and adapter no northbound calls should be made, and
in-progress calls will fail. Events, metrics and performance measurement data will be lost.
The procedure is as follows:

#. Delete the OLT device from VOLTHA (e.g. `voltctl device delete <olt_id>`).
#. Upgrade the openolt-adapter to the new version via `helm upgrade`.
#. Place the new openolt agent `.deb` package on the desired OLT.
#. Stop the running `openolt`, `dev_mgmt_daemon` and optionally the `watchdog` processes on the OLT.
#. Install and run the new openolt packages.
#. Reboot the OLT hardware.
#. Re-provision the OLT (e.g. `voltctl device provision <ip:port>`).
#. Re-enable the OLT (e.g. `voltctl device enable <olt_id>`).
#. Re-provision the subscribers.

After these steps, VOLTHA effectively treats the OLT as a brand new device of which it had no prior knowledge.
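The adapter upgrade in step 2 follows the same pattern as the openonu adapter example earlier; the release name and the overrides below are illustrative and must match the values used at install time:

```shell
helm upgrade --install -n voltha1 openolt-adapter onf/voltha-adapter-openolt \
  --set global.stack_name=voltha1 \
  --set services.kafka.adapter.service=voltha-infra-kafka.infra.svc \
  --set services.kafka.cluster.service=voltha-infra-kafka.infra.svc \
  --set services.etcd.service=voltha-infra-etcd.infra.svc
```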