VOL-1140: Add support for new KPI message format
Fix for OpenOMCI Comm Channel KPI group name, was causing key error exception during collection
Change-Id: If5a6e2807c9a934d9e35c40f12e1452a3040cbf9
diff --git a/voltha/core/adapter_agent.py b/voltha/core/adapter_agent.py
index ad38d5a..a063533 100644
--- a/voltha/core/adapter_agent.py
+++ b/voltha/core/adapter_agent.py
@@ -35,7 +35,7 @@
from voltha.protos.device_pb2 import Device, Port, PmConfigs
from voltha.protos.events_pb2 import AlarmEvent, AlarmEventType, \
AlarmEventSeverity, AlarmEventState, AlarmEventCategory
-from voltha.protos.events_pb2 import KpiEvent
+from voltha.protos.events_pb2 import KpiEvent, KpiEvent2
from voltha.protos.voltha_pb2 import DeviceGroup, LogicalDevice, \
LogicalPort, AdminState, OperStatus, AlarmFilterRuleKey
from voltha.registry import registry
@@ -913,7 +913,7 @@
def submit_kpis(self, kpi_event_msg):
try:
- assert isinstance(kpi_event_msg, KpiEvent)
+ assert isinstance(kpi_event_msg, (KpiEvent, KpiEvent2))
self.event_bus.publish('kpis', kpi_event_msg)
except Exception as e:
self.log.exception('failed-kpi-submission',
diff --git a/voltha/extensions/kpi/README.md b/voltha/extensions/kpi/README.md
index 305ce3b..0cc9330 100644
--- a/voltha/extensions/kpi/README.md
+++ b/voltha/extensions/kpi/README.md
@@ -2,7 +2,10 @@
This directory provides a common library for the creation of Performance Monitoring groups
within VOLTHA and should be used to insure that KPI information from different adapters use
-the same format
+the same format.
+
+The original KpiEvent protobuf message is still supported for adapters that wish to use theprevious format but device adapter developers are encouraged to support the new format and
+make use of this shared library.
## KPI Manager Creation
@@ -10,13 +13,15 @@
register PM Metric manager. This is typically performed in the device handler's
'activate' method (called in response to the device handler first being enabled)
-1. Create an instance of a **AdapterPmMetrics** manager object. This is typically an
+1. Create an instance of a derived **AdapterPmMetrics** manager object. This is currently an
**OltPmMetrics** object for an _OLT_ adapter, or an **OnuPmMetrics** adapter for an
_ONU_ adapter. If you have additional device specific metrics to report, you can
- derive your own manager object from one of these two derived classes.
+ derive your own manager object from one of these two derived classes. In order to
+ inherit (or modify) the metrics defined in those classes as well as support any new
+ metrics specific to your device.
This call takes a number of device adapter specific arguments and these are detailed
- in the pydoc headers for the managers _\_\_init___() method.
+ in the pydoc headers for the appropriate **AdapterPmMetrics** _\_\_init___() method.
2. Create the ProtoBuf message for your metrics by calling the newly created _manager's_
**_make_proto_**() method.
@@ -28,6 +33,10 @@
manager's _start_collector_() method. You may wish to do this after a short pause
depending on how your adapter is designed.
+**NOTE:** Currently there is only a single collection frequency for all metrics for
+a given device adapter. In the future, individual collection intervals on a per-metric/metric-group
+will be supported by the shared library.
+
The next two subsections provides examples of these steps for both an OLT and an ONU
device adapter
@@ -39,7 +48,7 @@
'nni-ports': self.northbound_ports.values(),
'pon-ports': self.southbound_ports.values()
}
- self.pm_metrics = OltPmMetrics(self.adapter_agent, self.device_id,
+ self.pm_metrics = OltPmMetrics(self.adapter_agent, self.device_id, self.logical_device_id,
grouped=True, freq_override=False,
**kwargs)
@@ -54,11 +63,12 @@
reactor.callLater(10, self.pm_metrics.start_collector)
```
-### ONU Device Adapters PM Manager setup
+### ONU Device Adapters PM Manager Setup
For ONU devices, if you wish to include OpenOMCI 15-minute historical interval
intervals, you will need to register the PM Metrics OpenOMCI Interval PM class
-with OpenOMCI
+with OpenOMCI. This ties in the OpenOMCI PM Interval State Machine with the KPI
+shared library.
```python
@@ -67,7 +77,7 @@
'heartbeat': self.heartbeat,
'omci-cc': self.openomci.omci_cc
}
- self.pm_metrics = OnuPmMetrics(self.adapter_agent, self.device_id,
+ self.pm_metrics = OnuPmMetrics(self.adapter_agent, self.device_id, self.logical_device_id,
grouped=True, freq_override=False,
**kwargs)
@@ -84,89 +94,132 @@
reactor.callLater(30, self.pm_metrics.start_collector)
```
-# Basic KPI Format
+### How metrics are currently collected
-**TODO**: This needs to be defined by the community with assistance from the _SEBA_
-developers.
+Currently, the default behaviour is to collect KPI information on a single periodic
+interval that can be adjusted via the NBI/CLI of VOLTHA. It collects data by extracting
+it from an object provided during the collection request and this object should either
+provide attributes or a property method that matches the metric to be collected.
+For instance, assume that you have an NNI metric called 'tx_packets'. You would pass
+an object during collection that should have one of the two following;
+
+- a _tx_packets_ attribute/member name defined for the object that has the requested
+ value already set (via background poll)
+
+- a _tx_packets_ **property** method that accesses an internal variable with the value
+ already set (via background poll) or that calculates/extracts the value without blockin
+ the call.
+
+### Known Issues in collection
+
+Note that a future story will be created to allow for collection to be requested for
+a metric/metric-group on demand so that background polling of KPI information is not
+required for all reported metrics.
+
+Note that a future story will be created to allow KPI information to be collected on
+per-group/metric intervals.
+
+# Basic KPI Format (**KpiEvent2**)
The KPI information is published on the kafka bus under the _voltha.kpi_ topic. For
VOLTHA PM information, the kafka key is empty and the value is a JSON message composed
of the following key-value pairs.
-| key | value | Notes |
-| :-: | :----- | :---- |
-| type | string | "slice" or "ts". A "slice" is a set of path/metric data for the same time-stamp. A "ts" is a time-series: array of data for same metric |
-| ts | float | UTC time-stamp of data in slice mode (seconds since the epoch of January 1, 1970) |
-| prefixes | list | One or more prefixes. A prefix is a key-value pair described below |
+| key | value | Notes |
+| :--------: | :----- | :---- |
+| type | string | "slice" or "ts". A "slice" is a set of path/metric data for the same time-stamp. A "ts" is a time-series: array of data for same metric |
+| ts | float | UTC time-stamp of when the KpiEvent2 was created (seconds since the epoch of January 1, 1970) |
+| slice_data | list | One or more sets of metrics composed of a _metadata_ section and a _metrics_ section. |
-**NOTE**: The timestamp is currently retrieved as a whole value. It is also possible to easily get
-the floating timestamp which contains the fractional seconds since epoch. **Is this of use**?
+**NOTE**: Time-series metrics and corresponding protobuf messages have not been defined.
-For group PM information, the key composed of a string with the following format:
-```
- voltha.<device-adapter>.<device-id>.<group>[.<group-id>]
-```
-Here is an JSON **example** of a current KPI published on the kafka bus under the
-_voltha.kpi_ topic. In this case, the _device-adapter_ is the **adtran_olt**, the _device-id_ is
-the value **0001c4397d43bc51**, the _group_ is **nni** port statistics, and the _group-id_ is the
-port number is **1**.
+## Slice Data Format
+
+For KPI slice KPI messages, the _slice_data_ portion of the **KpiEvent2** is composed of a _metadata_
+section and a _metrics_ section.
+
+### _metadata_ Section Format
+
+The metadata section is used to:
+ - Define which metric/metric-group is being reported (The _title_ field)
+ - Provide some common fields required by all metrics (_title_, _timestamp_, _device ID_, ...)
+ - Provide metric/metric-group specific context (the _context_ fields)
+
+| key | value | Notes |
+| :--------: | :----- | :---- |
+| title | string | "slice" or "ts". A "slice" is a set of path/metric data for the same time-stamp. A "ts" is a time-series: array of data for same metric |
+| ts | float | UTC time-stamp of data at the time of collection (seconds since the epoch of January 1, 1970) |
+| logical_device_id | string | The logical ID that the device belongs to. This is equivalent to the DPID reported in ONOS for the VOLTHA logical device with the 'of:' prefix removed. |
+| device_id | string | The physical device ID that is reporting the metric. |
+| serial_number | string | The reported serial number for the physical device reporting the metric. |
+| context | map | A key-value map of metric/metric-group specific information.|
+
+The context map is composed of key-value pairs where the key (string) is the label for the context
+specific value and the value (string) is the corresponding context value. While most values may be
+better represented as a float/integer, there may be some that are better represented as text. For
+this reason, values are always represented as strings to allow the ProtoBuf message format to be
+as simple as possible.
+
+Here is an JSON _example_ of a current KPI published on the kafka bus under the
+_voltha.kpi_ topic.
```json
{
"type": "slice",
- "ts": 1532379520.0,
- "prefixes": {
- "voltha.adtran_olt.0001c4397d43bc51.nni.1": {
+ "ts": 1534440704.0,
+ "slice_data": [
+ {
+ "metadata": {
+ "title": "Ethernet",
+ "ts": 1534440704.0,
+ "logical_device_id": "000000139521a269",
+ "device_id": "000115929ed71696",
+ "serial_no": "dummy_sn2209199",
+ "context": {
+ "port_no": "1"
+ }
+ },
"metrics": {
- "tx_dropped": 0.0,
+ "tx_dropped": 0.0, # A COUNTER
"rx_packets": 0.0,
"rx_bytes": 0.0,
- "rx_mcast": 0.0,
- "tx_mcast": 16.0,
- "rx_bcast": 0.0,
+ "rx_mcast_packets": 0.0,
+ "tx_mcast_packets": 16.0,
+ "rx_bcast_packets": 0.0,
+ "oper_status": 4.0, # A STATE
+ "admin_state": 3.0,
+ "rx_errors": 0.0,
+ "tx_bytes": 1436.0,
+ "rx_dropped": 0.0,
+ "tx_packets": 16.0,
+ "tx_bcast": 0.0
+ }
+ },
+ {
+ "metadata": {
+ "title": "PON",
+ "logical_device_id": "000000139521a269",
+ "device_id": "000115929ed71696",
+ "serial_no": "dummy_sn2209199",
+ "ts": 1534440704.0,
+ "context": {
+ "port_no": "5",
+ "pon_id": "0"
+ },
+ },
+ "metrics": {
+ "rx_packets": 0.0,
+ "in_service_onus": 0.0, # A GAUGE
+ "rx_bytes": 0.0,
+ "closest_onu_distance": -1.0,
+ "tx_bip_errors": 0.0,
"oper_status": 4.0,
"admin_state": 3.0,
- "tx_bcast": 5639.0,
- "tx_bytes": 1997642.0,
- "rx_dropped": 0.0,
- "tx_packets": 5655.0,
- "port_no": 1.0,
- "rx_errors": 0.0
- }
- },
- "voltha.adtran_olt.0001c4397d43bc51.pon.0.onu.0": {
- "metrics": {
- "fiber_length": 29.0,
- "onu_id": 0.0,
- "pon_id": 0.0,
- "equalization_delay": 621376.0,
- "rssi": -167.0
- }
- },
- "voltha.adtran_olt.0001c4397d43bc51.pon.0.onu.1": {
- "metrics": {
- "fiber_length": 29.0,
- "onu_id": 1.0,
- "pon_id": 0.0,
- "equalization_delay": 621392.0,
- "rssi": -164.0
- },
- ...
-
- "voltha.adtran_olt.0001c4397d43bc51.pon.0.onu.0.gem.2176": {
- "metrics": {
- "rx_packets": 0.0,
- "rx_bytes": 0.0,
- "alloc_id": 1024.0,
- "gem_id": 2176.0,
- "pon_id": 0.0,
"tx_bytes": 0.0,
- "onu_id": 0.0,
"tx_packets": 0.0
}
},
...
- }
}
```
@@ -208,8 +261,6 @@
This initial code is only a preliminary sample. The following tasks need to be
added to the VOLTHA JIRA or performed in the SEBA group:
-- Get a list from SEBA/VOLTHA on required metrics.
-
- Get feedback from other OLT/ONU developers on any needed changes
- Allow PM groups to have different collection times
@@ -219,4 +270,3 @@
- For statistics groups that have more than one instance, do we need to be able to
enable/disable specific instances? Major refactor of code if so (database work, ...)
-
diff --git a/voltha/extensions/kpi/adapter_pm_metrics.py b/voltha/extensions/kpi/adapter_pm_metrics.py
index d9432a0..54983a3 100644
--- a/voltha/extensions/kpi/adapter_pm_metrics.py
+++ b/voltha/extensions/kpi/adapter_pm_metrics.py
@@ -13,6 +13,7 @@
# limitations under the License.
import structlog
+import arrow
from twisted.internet.task import LoopingCall
@@ -24,13 +25,14 @@
and this base class is primarily used to provide a consistent interface to configure,
start, and stop statistics collection.
"""
- def __init__(self, adapter_agent, device_id,
+ def __init__(self, adapter_agent, device_id, logical_device_id,
grouped=False, freq_override=False, **kwargs):
"""
Initializer for shared Device Adapter PM metrics manager
:param adapter_agent: (AdapterAgent) Adapter agent for the device
:param device_id: (str) Device ID
+ :param logical_device_id: (str) VOLTHA Logical Device ID
:param grouped: (bool) Flag indicating if statistics are managed as a group
:param freq_override: (bool) Flag indicating if frequency collection can be specified
on a per group basis
@@ -40,11 +42,15 @@
self.device_id = device_id
self.adapter_agent = adapter_agent
self.name = adapter_agent.adapter_name
+ # Sanitize the vcore ID in the logical device ID
+ self.logical_device_id = '0000' + logical_device_id[4:]
+ device = self.adapter_agent.get_device(self.device_id)
+ self.serial_number = device.serial_number
+
self.default_freq = 150
self.grouped = grouped
self.freq_override = grouped and freq_override
self.lc = None
- self.prefix = 'voltha.{}.{}'.format(self.name, self.device_id)
self.pm_group_metrics = dict() # name -> PmGroupConfig
def update(self, pm_config):
@@ -75,81 +81,115 @@
if self.lc is not None and self.default_freq > 0:
self.lc.stop()
- def collect_group_metrics(self, group, names, config):
+ def collect_group_metrics(self, group_name, group, names, config):
"""
Collect the metrics for a specific PM group.
- This common collection method expects that the group object provide as the first
+ This common collection method expects that the 'group object' provide as the second
parameter supports an attribute or property with the name of the value to
retrieve.
+ :param group_name: (str) The unique collection name. The name should not contain spaces.
:param group: (object) The object to query for the value of various attributes (PM names)
:param names: (set) A collection of PM names that, if implemented as a property in the object,
will return a value to store in the returned PM dictionary
:param config: (PMConfig) PM Configuration settings. The enabled flag is examined to determine
if the data associated with a PM Name will be collected.
- :return: (dict) collected metrics
+ :return: (MetricInformation) collected metrics
"""
+ from voltha.protos.device_pb2 import PmConfig
+ from voltha.protos.events_pb2 import MetricInformation, MetricMetaData
+ assert ' ' not in group_name, 'Spaces are not allowed in metric titles, use an underscore'
+
+ if group is None:
+ return None
+
metrics = dict()
+ context = dict()
+ now = arrow.utcnow().float_timestamp
for (metric, t) in names:
- if config[metric].enabled and hasattr(group, metric):
- metrics[metric] = getattr(group, metric)
+ if config[metric].type == PmConfig.CONTEXT and hasattr(group, metric):
+ context[metric] = str(getattr(group, metric))
- return metrics
+ elif config[metric].type in (PmConfig.COUNTER, PmConfig.GAUGE, PmConfig.STATE):
+ if config[metric].enabled and hasattr(group, metric):
+ metrics[metric] = getattr(group, metric)
- def collect_metrics(self, metrics=None):
+ # Check length of metric data. Will be zero if if/when individual group
+ # metrics can be disabled and all are (or or not supported by the
+ # underlying adapter)
+ if len(metrics) == 0:
+ return None
+
+ return MetricInformation(metadata=MetricMetaData(title=group_name,
+ ts=now,
+ logical_device_id=self.logical_device_id,
+ serial_no=self.serial_number,
+ device_id=self.device_id,
+ context=context),
+ metrics=metrics)
+
+ def collect_metrics(self, data=None):
"""
Collect metrics for this adapter.
- This method is called for each adapter at a fixed frequency. The adapter type
- (OLT, ONU, ..) should provide a derived class where this method iterates
- through all metrics and collects them up in a dictionary with the group/metric
- name as the key, and the metric values as the contents.
+ The adapter type (OLT, ONU, ..) should provide a derived class where this
+ method iterates through all metrics and collects them up in a dictionary with
+ the group/metric name as the key, and the metric values as the contents.
- For a group, the values are a map where metric_name -> metric_value
- For and individual metric, the values are the metric value
+ The data collected (or passed in) is a list of pairs/tuples. Each
+ pair is composed of a MetricMetaData metadata-portion and list of MetricValuePairs
+ that contains a single individual metric or list of metrics if this is a
+ group metric.
- TODO: Currently all group metrics are collected on a single timer tick. This needs to be fixed.
+ This method is called for each adapter at a fixed frequency.
+ TODO: Currently all group metrics are collected on a single timer tick.
+ This needs to be fixed as independent group or instance collection is
+ desirable.
- :param metrics: (dict) Existing map to add collected metrics to. This is
- provided to allow derived classes to call into further
- encapsulated classes
+ :param data: (list) Existing list of collected metrics (MetricInformation).
+ This is provided to allow derived classes to call into
+ further encapsulated classes.
- :return: (dict) metrics - see description above
+ :return: (list) metadata and metrics pairs - see description above
"""
raise NotImplementedError('Your derived class should override this method')
def collect_and_publish_metrics(self):
""" Request collection of all enabled metrics and publish them """
try:
- metrics = self.collect_metrics()
- self.publish_metrics(metrics)
+ data = self.collect_metrics()
+ self.publish_metrics(data)
except Exception as e:
self.log.exception('failed-to-collect-kpis', e=e)
- def publish_metrics(self, metrics):
+ def publish_metrics(self, data):
"""
- Publish the metrics during a collection
+ Publish the metrics during a collection.
- :param metrics: (dict) Metrics to publish. If empty, no metrics will be published
+ The data collected (or passed in) is a list of dictionary pairs/tuple. Each
+ pair is composed of a metadata-portion and a metrics-portion that contains
+ information for a specific instance of an individual metric or metric group.
+
+ :param data: (list) Existing list of collected metrics (MetricInformation)
+ to convert to a KPIEvent and publish
"""
- self.log.debug('publish-metrics', metrics=metrics)
+ from voltha.protos.events_pb2 import KpiEvent2, KpiEventType
- if len(metrics):
- import arrow
- from voltha.protos.events_pb2 import KpiEvent, KpiEventType, MetricValuePairs
+ self.log.debug('publish-metrics', data=data)
+ if len(data):
try:
- ts = arrow.utcnow().timestamp
- kpi_event = KpiEvent(
+ # TODO: Existing adapters use the KpiEvent, if/when all existing
+ # adapters use the shared KPI library, we may want to
+ # deprecate the KPIEvent
+ kpi_event = KpiEvent2(
type=KpiEventType.slice,
- ts=ts,
- prefixes={
- self.prefix + '.{}'.format(k): MetricValuePairs(metrics=metrics[k])
- for k in metrics.keys()}
+ ts=arrow.utcnow().float_timestamp,
+ slice_data=data
)
self.adapter_agent.submit_kpis(kpi_event)
diff --git a/voltha/extensions/kpi/olt/olt_pm_metrics.py b/voltha/extensions/kpi/olt/olt_pm_metrics.py
index dae1803..8965fcd 100644
--- a/voltha/extensions/kpi/olt/olt_pm_metrics.py
+++ b/voltha/extensions/kpi/olt/olt_pm_metrics.py
@@ -20,17 +20,18 @@
"""
Shared OL Device Adapter PM Metrics Manager
- This class specifically addresses ONU genernal PM (health, ...) area
+ This class specifically addresses ONU general PM (health, ...) area
specific PM (OMCI, PON, UNI) is supported in encapsulated classes accessible
from this object
"""
- def __init__(self, adapter_agent, device_id, grouped=False, freq_override=False,
- **kwargs):
+ def __init__(self, adapter_agent, device_id, logical_device_id,
+ grouped=False, freq_override=False, **kwargs):
"""
Initializer for shared ONU Device Adapter PM metrics
:param adapter_agent: (AdapterAgent) Adapter agent for the device
:param device_id: (str) Device ID
+ :param logical_device_id: (str) VOLTHA Logical Device ID
:param grouped: (bool) Flag indicating if statistics are managed as a group
:param freq_override: (bool) Flag indicating if frequency collection can be specified
on a per group basis
@@ -41,70 +42,63 @@
'nni-ports': List of objects that provide NNI (northbound) port statistics
'pon-ports': List of objects that provide PON port statistics
"""
- super(OltPmMetrics, self).__init__(adapter_agent, device_id,
+ super(OltPmMetrics, self).__init__(adapter_agent, device_id, logical_device_id,
grouped=grouped, freq_override=freq_override,
**kwargs)
- # PM Config Types are COUNTER, GUAGE, and STATE # GAUGE is misspelled device.proto
+ # PM Config Types are COUNTER, GAUGE, and STATE
self.nni_pm_names = {
+ ('intf_id', PmConfig.CONTEXT), # Physical device interface ID/Port number
+
('admin_state', PmConfig.STATE),
('oper_status', PmConfig.STATE),
- ('port_no', PmConfig.GUAGE), # Device and logical_device port numbers same
- ('rx_packets', PmConfig.COUNTER),
+
('rx_bytes', PmConfig.COUNTER),
- ('rx_dropped', PmConfig.COUNTER),
- ('rx_errors', PmConfig.COUNTER),
- ('rx_bcast', PmConfig.COUNTER),
- ('rx_mcast', PmConfig.COUNTER),
- ('tx_packets', PmConfig.COUNTER),
+ ('rx_packets', PmConfig.COUNTER),
+ ('rx_ucast_packets', PmConfig.COUNTER),
+ ('rx_mcast_packets', PmConfig.COUNTER),
+ ('rx_bcast_packets', PmConfig.COUNTER),
+ ('rx_error_packets', PmConfig.COUNTER),
+
('tx_bytes', PmConfig.COUNTER),
- ('tx_dropped', PmConfig.COUNTER),
- ('tx_bcast', PmConfig.COUNTER),
- ('tx_mcast', PmConfig.COUNTER),
- #
- # Commented out are from spec. May not be supported or implemented yet
- # ('rx_64', PmConfig.COUNTER),
- # ('rx_65_127', PmConfig.COUNTER),
- # ('rx_128_255', PmConfig.COUNTER),
- # ('rx_256_511', PmConfig.COUNTER),
- # ('rx_512_1023', PmConfig.COUNTER),
- # ('rx_1024_1518', PmConfig.COUNTER),
- # ('rx_frame_err', PmConfig.COUNTER),
- # ('rx_over_err', PmConfig.COUNTER),
- # ('rx_crc_err', PmConfig.COUNTER),
- # ('rx_64', PmConfig.COUNTER),
- # ('tx_65_127', PmConfig.COUNTER),
- # ('tx_128_255', PmConfig.COUNTER),
- # ('tx_256_511', PmConfig.COUNTER),
- # ('tx_512_1023', PmConfig.COUNTER),
- # ('tx_1024_1518', PmConfig.COUNTER),
- # ('collisions', PmConfig.COUNTER),
+ ('tx_packets', PmConfig.COUNTER),
+ ('tx_ucast_packets', PmConfig.COUNTER),
+ ('tx_mcast_packets', PmConfig.COUNTER),
+ ('tx_bcast_packets', PmConfig.COUNTER),
+ ('tx_error_packets', PmConfig.COUNTER),
+ ('rx_crc_errors', PmConfig.COUNTER),
+ ('bip_errors', PmConfig.COUNTER),
}
self.pon_pm_names = {
+ ('intf_id', PmConfig.CONTEXT), # Physical device port number (PON)
+ ('pon_id', PmConfig.CONTEXT), # PON ID (0..n)
+
('admin_state', PmConfig.STATE),
('oper_status', PmConfig.STATE),
- ('port_no', PmConfig.GUAGE), # Physical device port number
- ('pon_id', PmConfig.GUAGE),
('rx_packets', PmConfig.COUNTER),
('rx_bytes', PmConfig.COUNTER),
('tx_packets', PmConfig.COUNTER),
('tx_bytes', PmConfig.COUNTER),
('tx_bip_errors', PmConfig.COUNTER),
- ('in_service_onus', PmConfig.GUAGE),
- ('closest_onu_distance', PmConfig.GUAGE)
+ ('in_service_onus', PmConfig.GAUGE),
+ ('closest_onu_distance', PmConfig.GAUGE)
}
self.onu_pm_names = {
- ('pon_id', PmConfig.GUAGE),
- ('onu_id', PmConfig.GUAGE),
- ('fiber_length', PmConfig.GUAGE),
- ('equalization_delay', PmConfig.GUAGE),
- ('rssi', PmConfig.GUAGE), #
+ ('intf_id', PmConfig.CONTEXT), # Physical device port number (PON)
+ ('pon_id', PmConfig.CONTEXT),
+ ('onu_id', PmConfig.CONTEXT),
+
+ ('fiber_length', PmConfig.GAUGE),
+ ('equalization_delay', PmConfig.GAUGE),
+ ('rssi', PmConfig.GAUGE),
}
self.gem_pm_names = {
- ('pon_id', PmConfig.GUAGE),
- ('onu_id', PmConfig.GUAGE),
- ('gem_id', PmConfig.GUAGE),
- ('alloc_id', PmConfig.GUAGE),
+ ('intf_id', PmConfig.CONTEXT), # Physical device port number (PON)
+ ('pon_id', PmConfig.CONTEXT),
+ ('onu_id', PmConfig.CONTEXT),
+ ('gem_id', PmConfig.CONTEXT),
+
+ ('alloc_id', PmConfig.GAUGE),
('rx_packets', PmConfig.COUNTER),
('rx_bytes', PmConfig.COUNTER),
('tx_packets', PmConfig.COUNTER),
@@ -239,39 +233,71 @@
return pm_config
- def collect_metrics(self, metrics=None):
- # TODO: Currently PM collection is done for all metrics/groups on a single timer
- if metrics is None:
- metrics = dict()
+ def collect_metrics(self, data=None):
+ """
+ Collect metrics for this adapter.
- if self.pm_group_metrics['Ethernet'].enabled:
+ The data collected (or passed in) is a list of pairs/tuples. Each
+ pair is composed of a MetricMetaData metadata-portion and list of MetricValuePairs
+ that contains a single individual metric or list of metrics if this is a
+ group metric.
+
+ This method is called for each adapter at a fixed frequency.
+ TODO: Currently all group metrics are collected on a single timer tick.
+ This needs to be fixed as independent group or instance collection is
+ desirable.
+
+ :param data: (list) Existing list of collected metrics (MetricInformation).
+ This is provided to allow derived classes to call into
+ further encapsulated classes.
+
+ :return: (list) metadata and metrics pairs - see description above
+ """
+ if data is None:
+ data = list()
+
+ group_name = 'Ethernet'
+ if self.pm_group_metrics[group_name].enabled:
for port in self._nni_ports:
- name = 'nni.{}'.format(port.port_no)
- metrics[name] = self.collect_group_metrics(port,
- self.nni_pm_names,
- self.nni_metrics_config)
+ group_data = self.collect_group_metrics(group_name,
+ port,
+ self.nni_pm_names,
+ self.nni_metrics_config)
+ if group_data is not None:
+ data.append(group_data)
+
for port in self._pon_ports:
- if self.pm_group_metrics['PON'].enabled:
- name = 'pon.{}'.format(port.pon_id)
- metrics[name] = self.collect_group_metrics(port,
- self.pon_pm_names,
- self.pon_metrics_config)
+ group_name = 'PON'
+ if self.pm_group_metrics[group_name].enabled:
+ group_data = self.collect_group_metrics(group_name,
+ port,
+ self.pon_pm_names,
+ self.pon_metrics_config)
+ if group_data is not None:
+ data.append(group_data)
+
for onu_id in port.onu_ids:
onu = port.onu(onu_id)
if onu is not None:
- if self.pm_group_metrics['ONU'].enabled:
- name = 'pon.{}.onu.{}'.format(port.pon_id, onu.onu_id)
- metrics[name] = self.collect_group_metrics(onu,
- self.onu_pm_names,
- self.onu_metrics_config)
- if self.pm_group_metrics['GEM'].enabled:
+ group_name = 'ONU'
+ if self.pm_group_metrics[group_name].enabled:
+ group_data = self.collect_group_metrics(group_name,
+ onu,
+ self.onu_pm_names,
+ self.onu_metrics_config)
+ if group_data is not None:
+ data.append(group_data)
+
+ group_name = 'GEM'
+ if self.pm_group_metrics[group_name].enabled:
for gem in onu.gem_ports:
if not gem.multicast:
- name = 'pon.{}.onu.{}.gem.{}'.format(port.pon_id,
- onu.onu_id,
- gem.gem_id)
- metrics[name] = self.collect_group_metrics(onu,
- self.gem_pm_names,
- self.gem_metrics_config)
+ group_data = self.collect_group_metrics(group_name,
+ onu,
+ self.gem_pm_names,
+ self.gem_metrics_config)
+ if group_data is not None:
+ data.append(group_data)
+
# TODO: Do any multicast GEM PORT metrics here...
- return metrics
+ return data
diff --git a/voltha/extensions/kpi/onu/IntervalMetrics.md b/voltha/extensions/kpi/onu/IntervalMetrics.md
index cdf4e43..85d673c 100644
--- a/voltha/extensions/kpi/onu/IntervalMetrics.md
+++ b/voltha/extensions/kpi/onu/IntervalMetrics.md
@@ -41,15 +41,16 @@
## Common Elements for All Reported MEs
In addition to counter elements (attributes) reported in each ME, every reported
-historical interval report the following Elements. All widths are reported below
-in bytes.
+historical interval report the following Elements as context values in the KPI
+Event metadata field. Each value is reported as a _string_ per the Protobuf structure
+but are actually integer/floats.
-| Label | Width | Description |
-| ----------------: | :---: | :---------- |
-| class_id | 2 | The ME Class ID of the PM Interval ME |
-| entity_id | 2 | The OMCI Entity Instance of the particular PM Interval ME |
-| interval_end_time | 2 | Identifies the most recently finished 15 minute. This attribute is set to zero when a synchronize time request is performed by OpenOMCI. This counter rolls over from 255 to 0 upon saturation.
-
+| Label | Type | Description |
+| ------------------: | :----------: | :---------- |
+| class_id | int, 16-bits | The ME Class ID of the PM Interval ME |
+| entity_id | int, 16-bits | The OMCI Entity Instance of the particular PM Interval ME |
+| interval_end_time | int, 8-bits | Identifies the most recently finished 15 minute. This attribute is set to zero when a synchronize time request is performed by OpenOMCI. This counter rolls over from 255 to 0 upon saturation. |
+| interval_start_time | int, 64-bits | The UTC timestamp (seconds since epoch) rounded down to the start time of the specific interval |
## Ethernet Frame Performance Monitoring MEs
@@ -68,12 +69,12 @@
The table below describes the four Ethernet Frame Performance Monitoring MEs and provides their
counter width (in bytes) and ME Class ID.
-| ME Name | Class ID | Width |
-| ----------------------------------------------------------: | :------: | :---: |
-| Ethernet Frame Extended Performance Monitoring64Bit | 426 | 8 |
-| Ethernet Frame Extended Performance Monitoring | 334 | 8 |
-| Ethernet Frame Upstream Performance MonitoringHistoryData | 322 | 8 |
-| Ethernet Frame Downstream Performance MonitoringHistoryData | 321 | 8 |
+| ME Name | Class ID | Counter Width |
+| ----------------------------------------------------------: | :------: | :---: |
+| Ethernet Frame Extended Performance Monitoring64Bit | 426 | 64-bit |
+| Ethernet Frame Extended Performance Monitoring | 334 | 32-bit |
+| Ethernet Frame Upstream Performance MonitoringHistoryData | 322 | 32-bit |
+| Ethernet Frame Downstream Performance MonitoringHistoryData | 321 | 32-bit |
### Counter Information
@@ -111,7 +112,7 @@
termination point Ethernet UNI.
### Attributes
-All counters are 2 bytes wide.
+All counters are 32-bits wide.
| Attribute Name | Description |
| ------------------: | :-----------|
@@ -144,13 +145,13 @@
### Attributes
-| Attribute Name | Width | Description |
-| -----------------------: | :---: | :-----------|
-| corrected_bytes | 4 | This attribute counts the number of bytes that were corrected by the FEC function. |
-| corrected_code_words | 4 | This attribute counts the code words that were corrected by the FEC function. |
-| uncorrectable_code_words | 4 | This attribute counts errored code words that could not be corrected by the FEC function. |
-| total_code_words | 4 | This attribute counts the total received code words. |
-| fec_seconds | 2 | This attribute counts seconds during which there was a forward error correction anomaly. |
+| Attribute Name | Counter Width | Description |
+| -----------------------: | :-----: | :-----------|
+| corrected_bytes | 32-bits | This attribute counts the number of bytes that were corrected by the FEC function. |
+| corrected_code_words | 32-bits | This attribute counts the code words that were corrected by the FEC function. |
+| uncorrectable_code_words | 32-bits | This attribute counts errored code words that could not be corrected by the FEC function. |
+| total_code_words | 32-bits | This attribute counts the total received code words. |
+| fec_seconds | 16-bits | This attribute counts seconds during which there was a forward error correction anomaly. |
## GEM Port Network CTP Monitoring History Data (Class ID 341)
@@ -173,13 +174,13 @@
### Attributes
-| Attribute Name | Width | Description |
-| ------------------------: | :---: | :-----------|
-| transmitted_gem_frames | 4 | This attribute counts GEM frames transmitted on the monitored GEM port. |
-| received_gem_frames | 4 | This attribute counts GEM frames received correctly on the monitored GEM port. A correctly received GEM frame is one that does not contain uncorrectable errors and has a valid HEC. |
-| received_payload_bytes | 8 | This attribute counts user payload bytes received on the monitored GEM port. |
-| transmitted_payload_bytes | 8 | This attribute counts user payload bytes transmitted on the monitored GEM port. |
-| encryption_key_errors | 4 | This attribute is defined in ITU-T G.987 systems only. It counts GEM frames with erroneous encryption key indexes. If the GEM port is not encrypted, this attribute counts any frame with a key index not equal to 0. If the GEM port is encrypted, this attribute counts any frame whose key index specifies a key that is not known to the ONU. |
+| Attribute Name | Counter Width | Description |
+| ------------------------: | :-----: | :-----------|
+| transmitted_gem_frames | 32-bits | This attribute counts GEM frames transmitted on the monitored GEM port. |
+| received_gem_frames | 32-bits | This attribute counts GEM frames received correctly on the monitored GEM port. A correctly received GEM frame is one that does not contain uncorrectable errors and has a valid HEC. |
+| received_payload_bytes | 64-bits | This attribute counts user payload bytes received on the monitored GEM port. |
+| transmitted_payload_bytes | 64-bits | This attribute counts user payload bytes transmitted on the monitored GEM port. |
+| encryption_key_errors | 32-bits | This attribute is defined in ITU-T G.987 systems only. It counts GEM frames with erroneous encryption key indexes. If the GEM port is not encrypted, this attribute counts any frame with a key index not equal to 0. If the GEM port is encrypted, this attribute counts any frame whose key index specifies a key that is not known to the ONU. |
Note 3: GEM PM ignores idle GEM frames.
@@ -196,10 +197,10 @@
### Attributes
-All counters are 2 bytes wide.
+All counters are 32-bits wide.
-| Attribute Name | Description |
-| ------------------: | :-----------|
+| Attribute Name | Description |
+| ------------------------: | :-----------|
| psbd_hec_error_count | This attribute counts HEC errors in any of the fields of the downstream physical sync block. |
| xgtc_hec_error_count | This attribute counts HEC errors detected in the XGTC header. |
| unknown_profile_count | This attribute counts the number of grants received whose specified profile was not known to the ONU. |
@@ -221,10 +222,10 @@
### Attributes
-All counters are 2 bytes wide.
+All counters are 32-bits wide.
-| Attribute Name | Description |
-| ------------------: | :-----------|
+| Attribute Name | Description |
+| --------------------------------------: | :-----------|
| ploam_mic_error_count | This attribute counts MIC errors detected in downstream PLOAM messages, either directed to this ONU or broadcast to all ONUs. |
| downstream_ploam_messages_count | This attribute counts PLOAM messages received, either directed to this ONU or broadcast to all ONUs. |
| profile_messages_received | This attribute counts the number of profile messages received, either directed to this ONU or broadcast to all ONUs. |
@@ -252,10 +253,10 @@
### Attributes
-All counters are 2 bytes wide.
+All counters are 32-bits wide.
-| Attribute Name | Description |
-| ------------------: | :-----------|
+| Attribute Name | Description |
+| ------------------------------: | :-----------|
| upstream_ploam_message_count | This attribute counts PLOAM messages transmitted upstream, excluding acknowledge messages. |
| serial_number_onu_message_count | This attribute counts Serial_number_ONU PLOAM messages transmitted. |
| registration_message_count | This attribute counts registration PLOAM messages transmitted. |
@@ -265,4 +266,4 @@
# Remaining Work Items
-- The enable/disable of a PM group (CLI/NBI) should control whether or not a PM interval is collected
\ No newline at end of file
+- The enable/disable of a PM group (CLI/NBI) should control whether or not a PM interval ME is created and collected.
\ No newline at end of file
diff --git a/voltha/extensions/kpi/onu/onu_omci_pm.py b/voltha/extensions/kpi/onu/onu_omci_pm.py
index b0424d9..c2e6682 100644
--- a/voltha/extensions/kpi/onu/onu_omci_pm.py
+++ b/voltha/extensions/kpi/onu/onu_omci_pm.py
@@ -24,13 +24,16 @@
DEFAULT_OMCI_CC_ENABLED = False
DEFAULT_OMCI_CC_FREQUENCY = 1200 # 1/10ths of a second
- def __init__(self, adapter_agent, device_id,
+ OMCI_CC_GROUP_NAME = 'OMCI_CC'
+
+ def __init__(self, adapter_agent, device_id, logical_device_id,
grouped=False, freq_override=False, **kwargs):
"""
Initializer for shared ONU Device Adapter OMCI CC PM metrics
:param adapter_agent: (AdapterAgent) Adapter agent for the device
:param device_id: (str) Device ID
+ :param logical_device_id: (str) VOLTHA Logical Device ID
:param grouped: (bool) Flag indicating if statistics are managed as a group
:param freq_override: (bool) Flag indicating if frequency collection can be specified
on a per group basis
@@ -42,7 +45,7 @@
Communications channel statistics. Available from the ONU's
OpenOMCI OnuDeviceEntry object.
"""
- super(OnuOmciPmMetrics, self).__init__(adapter_agent, device_id,
+ super(OnuOmciPmMetrics, self).__init__(adapter_agent, device_id, logical_device_id,
grouped=grouped, freq_override=freq_override,
**kwargs)
self._omci_cc = kwargs.pop('omci-cc', None)
@@ -53,20 +56,17 @@
('rx_frames', PmConfig.COUNTER),
('rx_unknown_tid', PmConfig.COUNTER),
('rx_onu_frames', PmConfig.COUNTER), # Rx ONU autonomous messages
- ('rx_alarm_overflow', PmConfig.COUNTER), # Autonomous ONU generated alarm message overflows
- ('rx_avc_overflow', PmConfig.COUNTER), # Autonomous ONU generated AVC message overflows
- ('rx_onu_discards', PmConfig.COUNTER), # Autonomous ONU message unknown type discards
('rx_unknown_me', PmConfig.COUNTER), # Managed Entities without a decode definition
('rx_timeouts', PmConfig.COUNTER),
('consecutive_errors', PmConfig.COUNTER),
- ('reply_min', PmConfig.GUAGE), # Milliseconds
- ('reply_max', PmConfig.GUAGE), # Milliseconds
- ('reply_average', PmConfig.GUAGE), # Milliseconds
+ ('reply_min', PmConfig.GAUGE), # Milliseconds
+ ('reply_max', PmConfig.GAUGE), # Milliseconds
+ ('reply_average', PmConfig.GAUGE), # Milliseconds
}
self.omci_metrics_config = {m: PmConfig(name=m, type=t, enabled=True)
for (m, t) in self.omci_pm_names}
- self.openomci_interval_pm = OnuPmIntervalMetrics(adapter_agent, device_id)
+ self.openomci_interval_pm = OnuPmIntervalMetrics(adapter_agent, device_id, logical_device_id)
def update(self, pm_config):
# TODO: Test frequency override capability for a particular group
@@ -92,7 +92,7 @@
if self._omci_cc is not None:
if self.grouped:
- pm_omci_stats = PmGroupConfig(group_name='OMCI-CC',
+ pm_omci_stats = PmGroupConfig(group_name=OnuOmciPmMetrics.OMCI_CC_GROUP_NAME,
group_freq=OnuOmciPmMetrics.DEFAULT_OMCI_CC_FREQUENCY,
enabled=OnuOmciPmMetrics.DEFAULT_OMCI_CC_ENABLED)
self.pm_group_metrics[pm_omci_stats.group_name] = pm_omci_stats
@@ -115,35 +115,38 @@
return self.openomci_interval_pm.make_proto(pm_config)
- def collect_metrics(self, metrics=None):
+ def collect_metrics(self, data=None):
"""
Collect metrics for this adapter.
- This method is callfor each adapter at a fixed frequency. The adapter type
- (OLT, ONU, ..) should provide a derived class where this method iterates
- through all metrics and collects them up in a dictionary with the group/metric
- name as the key, and the metric values as the contents.
+ The data collected (or passed in) is a list of pairs/tuples. Each
+ pair is composed of a MetricMetaData metadata-portion and list of MetricValuePairs
+ that contains a single individual metric or list of metrics if this is a
+ group metric.
- For a group, the values are a map where metric_name -> metric_value
- For and individual metric, the values are the metric value
+ This method is called for each adapter at a fixed frequency.
+ TODO: Currently all group metrics are collected on a single timer tick.
+ This needs to be fixed as independent group or instance collection is
+ desirable.
- TODO: Currently all group metrics are collected on a single timer tick. This needs to be fixed.
+ :param data: (list) Existing list of collected metrics (MetricInformation).
+ This is provided to allow derived classes to call into
+ further encapsulated classes.
- :param metrics: (dict) Existing map to add collected metrics to. This is
- provided to allow derived classes to call into further
- encapsulated classes
-
- :return: (dict) metrics - see description above
+ :return: (list) metadata and metrics pairs - see description above
"""
- # TODO: Currently PM collection is done for all metrics/groups on a single timer
- if metrics is None:
- metrics = dict()
+ if data is None:
+ data = list()
# Note: Interval PM is collection done autonomously, not through this method
if self._omci_cc is not None:
- if self.pm_group_metrics['OMCI-CC'].enabled:
- metrics['OMCI-CC'] = self.collect_group_metrics(self._omci_cc,
- self.omci_pm_names,
- self.omci_metrics_config)
- return metrics
+ group_name = OnuOmciPmMetrics.OMCI_CC_GROUP_NAME
+ if self.pm_group_metrics[group_name].enabled:
+ group_data = self.collect_group_metrics(group_name,
+ self._omci_cc,
+ self.omci_pm_names,
+ self.omci_metrics_config)
+ if group_data is not None:
+ data.append(group_data)
+ return data
diff --git a/voltha/extensions/kpi/onu/onu_pm_interval_metrics.py b/voltha/extensions/kpi/onu/onu_pm_interval_metrics.py
index 5b13ab8..9e44393 100644
--- a/voltha/extensions/kpi/onu/onu_pm_interval_metrics.py
+++ b/voltha/extensions/kpi/onu/onu_pm_interval_metrics.py
@@ -12,7 +12,9 @@
# See the License for the specific language governing permissions and
# limitations under the License.
+import arrow
from voltha.protos.device_pb2 import PmConfig, PmGroupConfig
+from voltha.protos.events_pb2 import KpiEvent2, MetricInformation, MetricMetaData, KpiEventType
from voltha.extensions.kpi.adapter_pm_metrics import AdapterPmMetrics
from voltha.extensions.omci.omci_entities import \
EthernetFrameUpstreamPerformanceMonitoringHistoryData, \
@@ -34,26 +36,27 @@
also always managed as a group with a fixed frequency of 15 minutes.
"""
ME_ID_INFO = {
- EthernetFrameUpstreamPerformanceMonitoringHistoryData.class_id: 'Ethernet Bridge Port History',
- EthernetFrameDownstreamPerformanceMonitoringHistoryData.class_id: 'Ethernet Bridge Port History',
- EthernetFrameExtendedPerformanceMonitoring.class_id: 'Ethernet Bridge Port History',
- EthernetFrameExtendedPerformanceMonitoring64Bit.class_id: 'Ethernet Bridge Port History',
- EthernetPMMonitoringHistoryData.class_id: 'Ethernet UNI History',
- FecPerformanceMonitoringHistoryData.class_id: 'FEC History',
- GemPortNetworkCtpMonitoringHistoryData.class_id: 'GEM Port History',
- XgPonTcPerformanceMonitoringHistoryData.class_id: 'xgPON TC History',
- XgPonDownstreamPerformanceMonitoringHistoryData.class_id: 'xgPON Downstream History',
- XgPonUpstreamPerformanceMonitoringHistoryData.class_id: 'xgPON Upstream History'
+ EthernetFrameUpstreamPerformanceMonitoringHistoryData.class_id: 'Ethernet_Bridge_Port_History',
+ EthernetFrameDownstreamPerformanceMonitoringHistoryData.class_id: 'Ethernet_Bridge_Port_History',
+ EthernetFrameExtendedPerformanceMonitoring.class_id: 'Ethernet_Bridge_Port_History',
+ EthernetFrameExtendedPerformanceMonitoring64Bit.class_id: 'Ethernet_Bridge_Port_History',
+ EthernetPMMonitoringHistoryData.class_id: 'Ethernet_UNI_History',
+ FecPerformanceMonitoringHistoryData.class_id: 'FEC_History',
+ GemPortNetworkCtpMonitoringHistoryData.class_id: 'GEM_Port_History',
+ XgPonTcPerformanceMonitoringHistoryData.class_id: 'xgPON_TC_History',
+ XgPonDownstreamPerformanceMonitoringHistoryData.class_id: 'xgPON_Downstream_History',
+ XgPonUpstreamPerformanceMonitoringHistoryData.class_id: 'xgPON_Upstream_History'
}
- def __init__(self, adapter_agent, device_id, **kwargs):
- super(OnuPmIntervalMetrics, self).__init__(adapter_agent, device_id,
+ def __init__(self, adapter_agent, device_id, logical_device_id, **kwargs):
+ super(OnuPmIntervalMetrics, self).__init__(adapter_agent, device_id, logical_device_id,
grouped=True, freq_override=False,
**kwargs)
ethernet_bridge_history = {
- ('class_id', PmConfig.GUAGE),
- ('entity_id', PmConfig.GUAGE),
- ("interval_end_time", PmConfig.GUAGE),
+ ('class_id', PmConfig.CONTEXT),
+ ('entity_id', PmConfig.CONTEXT),
+ ("interval_end_time", PmConfig.CONTEXT),
+
("drop_events", PmConfig.COUNTER),
("octets", PmConfig.COUNTER),
("packets", PmConfig.COUNTER),
@@ -73,9 +76,10 @@
for (m, t) in ethernet_bridge_history}
ethernet_uni_history = { # Ethernet History Data (Class ID 24)
- ('class_id', PmConfig.GUAGE),
- ('entity_id', PmConfig.GUAGE),
- ("interval_end_time", PmConfig.GUAGE),
+ ('class_id', PmConfig.CONTEXT),
+ ('entity_id', PmConfig.CONTEXT),
+ ("interval_end_time", PmConfig.CONTEXT),
+
("fcs_errors", PmConfig.COUNTER),
("excessive_collision_counter", PmConfig.COUNTER),
("late_collision_counter", PmConfig.COUNTER),
@@ -95,9 +99,10 @@
for (m, t) in ethernet_uni_history}
fec_history = { # FEC History Data (Class ID 312)
- ('class_id', PmConfig.GUAGE),
- ('entity_id', PmConfig.GUAGE),
- ("interval_end_time", PmConfig.GUAGE),
+ ('class_id', PmConfig.CONTEXT),
+ ('entity_id', PmConfig.CONTEXT),
+ ("interval_end_time", PmConfig.CONTEXT),
+
("corrected_bytes", PmConfig.COUNTER),
("corrected_code_words", PmConfig.COUNTER),
("uncorrectable_code_words", PmConfig.COUNTER),
@@ -108,9 +113,10 @@
for (m, t) in fec_history}
gem_port_history = { # GEM Port Network CTP History Data (Class ID 341)
- ('class_id', PmConfig.GUAGE),
- ('entity_id', PmConfig.GUAGE),
- ("interval_end_time", PmConfig.GUAGE),
+ ('class_id', PmConfig.CONTEXT),
+ ('entity_id', PmConfig.CONTEXT),
+ ("interval_end_time", PmConfig.CONTEXT),
+
("transmitted_gem_frames", PmConfig.COUNTER),
("received_gem_frames", PmConfig.COUNTER),
("received_payload_bytes", PmConfig.COUNTER),
@@ -121,9 +127,10 @@
for (m, t) in gem_port_history}
xgpon_tc_history = { # XgPon TC History Data (Class ID 344)
- ('class_id', PmConfig.GUAGE),
- ('entity_id', PmConfig.GUAGE),
- ("interval_end_time", PmConfig.GUAGE),
+ ('class_id', PmConfig.CONTEXT),
+ ('entity_id', PmConfig.CONTEXT),
+ ("interval_end_time", PmConfig.CONTEXT),
+
("psbd_hec_error_count", PmConfig.COUNTER),
("xgtc_hec_error_count", PmConfig.COUNTER),
("unknown_profile_count", PmConfig.COUNTER),
@@ -137,9 +144,10 @@
for (m, t) in xgpon_tc_history}
xgpon_downstream_history = { # XgPon Downstream History Data (Class ID 345)
- ('class_id', PmConfig.GUAGE),
- ('entity_id', PmConfig.GUAGE),
- ("interval_end_time", PmConfig.GUAGE),
+ ('class_id', PmConfig.CONTEXT),
+ ('entity_id', PmConfig.CONTEXT),
+ ("interval_end_time", PmConfig.CONTEXT),
+
("ploam_mic_error_count", PmConfig.COUNTER),
("downstream_ploam_messages_count", PmConfig.COUNTER),
("profile_messages_received", PmConfig.COUNTER),
@@ -159,9 +167,10 @@
for (m, t) in xgpon_downstream_history}
xgpon_upstream_history = { # XgPon Upstream History Data (Class ID 346)
- ('class_id', PmConfig.GUAGE),
- ('entity_id', PmConfig.GUAGE),
- ("interval_end_time", PmConfig.GUAGE),
+ ('class_id', PmConfig.CONTEXT),
+ ('entity_id', PmConfig.CONTEXT),
+ ("interval_end_time", PmConfig.CONTEXT),
+
("upstream_ploam_message_count", PmConfig.COUNTER),
("serial_number_onu_message_count", PmConfig.COUNTER),
("registration_message_count", PmConfig.COUNTER),
@@ -319,31 +328,42 @@
self.log.debug('publish-metrics', metrics=interval_data)
try:
- import arrow
- from voltha.protos.events_pb2 import KpiEvent, KpiEventType, MetricValuePairs
# Locate config
-
+ now = arrow.utcnow()
class_id = interval_data['class_id']
config = self._configs.get(class_id)
group = self.pm_group_metrics.get(OnuPmIntervalMetrics.ME_ID_INFO.get(class_id, ''))
if config is not None and group is not None and group.enabled:
# Extract only the metrics we need to publish
- config_keys = config.keys()
- metrics = {
- interval_data['me_name']: {k: v
- for k, v in interval_data.items()
- if k in config_keys and v is not None}
+ metrics = dict()
+ context = {
+ 'interval_start_time': str(now.replace(minute=int(now.minute / 15) * 15,
+ second=0,
+ microsecond=0).timestamp)
}
- # Prepare the KpiEvent for submission
- kpi_event = KpiEvent(
- type=KpiEventType.slice,
- ts=arrow.get(interval_data['interval_utc_time']).timestamp,
- prefixes={
- self.prefix + '.{}'.format(k): MetricValuePairs(metrics=metrics[k])
- for k in metrics.keys()}
- )
- self.adapter_agent.submit_kpis(kpi_event)
+ for metric, config_item in config.items():
+ if config_item.type == PmConfig.CONTEXT and metric in interval_data:
+ context[metric] = str(interval_data[metric])
+
+ elif (config_item.type in (PmConfig.COUNTER, PmConfig.GAUGE, PmConfig.STATE) and
+ metric in interval_data and
+ config_item.enabled):
+ metrics[metric] = interval_data[metric]
+
+ if len(metrics):
+ metadata = MetricMetaData(title=group.group_name,
+ ts=now.float_timestamp,
+ logical_device_id=self.logical_device_id,
+ serial_no=self.serial_number,
+ device_id=self.device_id,
+ context=context)
+ slice_data = [MetricInformation(metadata=metadata, metrics=metrics)]
+
+ kpi_event = KpiEvent2(type=KpiEventType.slice,
+ ts=now.float_timestamp,
+ slice_data=slice_data)
+ self.adapter_agent.submit_kpis(kpi_event)
except Exception as e:
self.log.exception('failed-to-submit-kpis', e=e)
diff --git a/voltha/extensions/kpi/onu/onu_pm_metrics.py b/voltha/extensions/kpi/onu/onu_pm_metrics.py
index 7237a6f..2155d19 100644
--- a/voltha/extensions/kpi/onu/onu_pm_metrics.py
+++ b/voltha/extensions/kpi/onu/onu_pm_metrics.py
@@ -30,26 +30,28 @@
DEFAULT_HEARTBEAT_ENABLED = False
DEFAULT_HEARTBEAT_FREQUENCY = 1200 # 1/10ths of a second
- def __init__(self, adapter_agent, device_id, grouped=False, freq_override=False, **kwargs):
+ def __init__(self, adapter_agent, device_id, logical_device_id,
+ grouped=False, freq_override=False, **kwargs):
"""
Initializer for shared ONU Device Adapter PM metrics
:param adapter_agent: (AdapterAgent) Adapter agent for the device
:param device_id: (str) Device ID
+ :param logical_device_id: (str) VOLTHA Logical Device ID
:param grouped: (bool) Flag indicating if statistics are managed as a group
:param freq_override: (bool) Flag indicating if frequency collection can be specified
on a per group basis
:param kwargs: (dict) Device Adapter specific values. For an ONU Device adapter, the
expected key-value pairs are listed below. If not provided, the
- associated PM statistics are not gathered:
+ associated PMv statistics are not gathered:
'heartbeat': Reference to the a class that provides an ONU heartbeat
- statistics. TODO: This needs to be standardized
+ statistics. TODO: This should be standardized across adapters
"""
- super(OnuPmMetrics, self).__init__(adapter_agent, device_id,
- grouped=grouped, freq_override=freq_override, **kwargs)
+ super(OnuPmMetrics, self).__init__(adapter_agent, device_id, logical_device_id,
+ grouped=grouped, freq_override=freq_override,
+ **kwargs)
- #
# The following HeartBeat PM is only an example. We may want to have a common heartbeat
# object for OLT and ONU DAs that work the same. If so, it could also provide PM information
#
@@ -70,8 +72,9 @@
self.health_metrics_config = {m: PmConfig(name=m, type=t, enabled=True)
for (m, t) in self.health_pm_names}
- self.omci_pm = OnuOmciPmMetrics(adapter_agent, device_id, grouped=grouped,
- freq_override=freq_override, **kwargs)
+ self.omci_pm = OnuOmciPmMetrics(adapter_agent, device_id, logical_device_id,
+ grouped=grouped, freq_override=freq_override,
+ **kwargs)
def update(self, pm_config):
try:
@@ -133,22 +136,34 @@
pm_config = self.omci_pm.make_proto(pm_config)
return pm_config
- def collect_metrics(self, metrics=None):
+ def collect_metrics(self, data=None):
"""
- Collect metrics
- :param metrics:
- :return:
+ Collect metrics for this adapter.
+
+ The data collected (or passed in) is a list of pairs/tuples. Each
+ pair is composed of a MetricMetaData metadata-portion and list of MetricValuePairs
+ that contains a single individual metric or list of metrics if this is a
+ group metric.
+
+ This method is called for each adapter at a fixed frequency.
+ TODO: Currently all group metrics are collected on a single timer tick.
+ This needs to be fixed as independent group or instance collection is
+ desirable.
+
+ :param data: (list) Existing list of collected metrics (MetricInformation).
+ This is provided to allow derived classes to call into
+ further encapsulated classes.
+
+ :return: (list) metadata and metrics pairs - see description above
"""
- # TODO: Currently PM collection is done for all metrics/groups on a single timer
- if metrics is None:
- metrics = dict()
+ if data is None:
+ data = list()
# TODO: Heartbeat stats disabled since it is not a common item on all ONUs (or OLTs)
# if self._heartbeat is not None:
- # metrics['heartbeat'] = self.collect_metrics(self._heartbeat,
- # self.health_pm_names,
- # self.health_metrics_config)
- self.omci_pm.collect_metrics(metrics=metrics)
+ # data.extend(self.collect_metrics(self._heartbeat, self.health_pm_names,
+ # self.health_metrics_config))
+ data.extend(self.omci_pm.collect_metrics(data=data))
# TODO Add PON Port PM
# TODO Add UNI Port PM
- return metrics
+ return data
diff --git a/voltha/northbound/diagnostics.py b/voltha/northbound/diagnostics.py
index b53c901..b1adf62 100644
--- a/voltha/northbound/diagnostics.py
+++ b/voltha/northbound/diagnostics.py
@@ -31,7 +31,7 @@
from zope.interface import implementer
from common.event_bus import EventBusClient
-from voltha.protos.events_pb2 import KpiEvent, KpiEventType, MetricValuePairs
+from voltha.protos.events_pb2 import KpiEvent2, KpiEventType, MetricInformation, MetricMetaData
from voltha.registry import IComponent, registry
log = structlog.get_logger()
@@ -63,7 +63,7 @@
def run_periodic_checks(self):
- ts = arrow.utcnow().timestamp
+ ts = arrow.utcnow().float_timestamp
def deferreds():
return len(gc.get_referrers(Deferred))
@@ -74,17 +74,17 @@
rss /= 1024
return rss
- kpi_event = KpiEvent(
+ kpi_event = KpiEvent2(
type=KpiEventType.slice,
ts=ts,
- prefixes={
- 'voltha.internal.{}'.format(self.instance_id):
- MetricValuePairs(metrics={
- 'deferreds': deferreds(),
- 'rss-mb': rss_mb(),
- })
- }
- )
-
+ slice_data=[
+ MetricInformation(metadata=MetricMetaData(title='voltha.internal',
+ ts=ts,
+ context={'instance_id': self.instance_id}),
+ metrics={'deferreds': deferreds(),
+ 'rss-mb': rss_mb()}
+ )
+ ])
self.event_bus.publish('kpis', kpi_event)
log.debug('periodic-check', ts=ts)
+
diff --git a/voltha/protos/device.proto b/voltha/protos/device.proto
index d8b4b23..58594ed 100644
--- a/voltha/protos/device.proto
+++ b/voltha/protos/device.proto
@@ -41,9 +41,10 @@
message PmConfig {
enum PmType {
- COUNTER = 0;
- GUAGE = 1;
- STATE = 2;
+ COUNTER = 0;
+ GAUGE = 1;
+ STATE = 2;
+ CONTEXT = 3;
}
string name = 1;
PmType type = 2;
diff --git a/voltha/protos/events.proto b/voltha/protos/events.proto
index 883bc68..019f698 100644
--- a/voltha/protos/events.proto
+++ b/voltha/protos/events.proto
@@ -30,6 +30,29 @@
}
/*
+ * Struct to convey a dictionary of metric metadata.
+ */
+message MetricMetaData {
+ string title = 1; // Metric group or individual metric name
+ double ts = 2; // UTC time-stamp of data (seconds since epoch) of
+ // when the metric or metric group was collected.
+ // If this is a 15-min historical group, it is the
+ // time of the collection and reporting, not the
+ // start or end of the 15-min group interval.
+
+ string logical_device_id = 3; // The logical device ID of the VOLTHA
+ // (equivalent to the DPID that ONOS has
+ // for the VOLTHA device without the
+ // 'of:' prefix
+ string serial_no = 4; // The OLT, ONU, ... device serial number
+ string device_id = 5; // The OLT, ONU, ... physical device ID
+
+ map<string, string> context = 6; // Name value pairs that provide additional
+ // context information on the metrics being
+ // reported.
+}
+
+/*
* Struct to convey a dictionary of metric->value pairs. Typically used in
* pure shared-timestamp or shared-timestamp + shared object prefix situations.
*/
@@ -40,7 +63,20 @@
}
+/*
+ * Struct to group metadata for a metric (or group of metrics) with the key-value
+ * pairs of collected metrics
+ */
+message MetricInformation {
+ MetricMetaData metadata = 1;
+ map<string, float> metrics = 2;
+}
+/*
+ * Legacy KPI Event structured. In mid-August, the KPI event format was updated
+ * to a more easily parsable format. See VOL-1140
+ * for more information.
+ */
message KpiEvent {
KpiEventType.KpiEventType type = 1;
@@ -53,6 +89,17 @@
}
+message KpiEvent2 {
+ // Type of KPI Event
+ KpiEventType.KpiEventType type = 1;
+
+ // Fields used when for slice:
+ double ts = 2; // UTC time-stamp of data in slice mode (seconds since epoch)
+ // of the time this entire KpiEvent was published to the kafka bus
+
+ repeated MetricInformation slice_data = 3;
+}
+
/*
* Identify to the area of the system impacted by the alarm
*/