.. SPDX-FileCopyrightText: 2021 Open Networking Foundation <info@opennetworking.org>
.. SPDX-License-Identifier: Apache-2.0
Dual Homing
===========
Overview
--------
.. image:: ../../images/config-dh.png
The dual-homing feature includes several sub-components:

- **Use of "paired" ToRs**: Each rack of compute nodes has exactly two
  Top-of-Rack switches (ToRs), which are connected to each other by a single
  link, referred to as a **pair link**. This pairing should NOT be omitted.

  Currently only a single link between paired ToRs is supported. Future
  releases may include dual pair links. Note that the pair link is used only
  in failure scenarios, not in normal operation.

- **Dual-homed servers (compute nodes)**: Each server is connected to both
  ToRs. The links to the paired ToRs are (Linux) bonded.

- **Dual-homed upstream routers**: The upstream routers MUST be connected to
  the two ToRs that are part of a leaf-pair. You cannot connect them to leafs
  that are not paired. This feature also requires two Quagga instances.

- **Dual-homed access devices**: This component will be added in the future.

Paired ToRs
-----------
The reasoning behind two ToR (leaf) switches is simple. If you have only a
single ToR switch and you lose it, the entire rack goes down. Using two ToR
switches improves the odds of continued connectivity for dual-homed servers.
The reasoning behind pairing the two ToR switches is more involved and is
explained in the Usage section below.
Configure pair ToRs
^^^^^^^^^^^^^^^^^^^
Configuring paired-ToRs involves device configuration. Assume switches of:205
and of:206 are paired ToRs.
.. code-block:: json

    {
      "devices" : {
        "of:0000000000000205" : {
          "segmentrouting" : {
            "name" : "Leaf1-R2",
            "ipv4NodeSid" : 205,
            "ipv4Loopback" : "192.168.0.205",
            "ipv6NodeSid" : 205,
            "ipv6Loopback" : "2000::c0a8:0205",
            "routerMac" : "00:00:02:05:00:01",
            "pairDeviceId" : "of:0000000000000206",
            "pairLocalPort" : 20,
            "isEdgeRouter" : true,
            "adjacencySids" : []
          }
        },
        "of:0000000000000206" : {
          "segmentrouting" : {
            "name" : "Leaf2-R2",
            "ipv4NodeSid" : 206,
            "ipv4Loopback" : "192.168.0.206",
            "ipv6NodeSid" : 206,
            "ipv6Loopback" : "2000::c0a8:0206",
            "routerMac" : "00:00:02:05:00:01",
            "pairDeviceId" : "of:0000000000000205",
            "pairLocalPort" : 30,
            "isEdgeRouter" : true,
            "adjacencySids" : []
          }
        }
      }
    }

There are two new pieces of device configuration:

- Each device in the ToR pair needs to specify the **deviceId of the leaf it
  is paired to** in the ``pairDeviceId`` field. For example, in the ``of:205``
  configuration the ``pairDeviceId`` is ``of:206``, and in the ``of:206``
  configuration the ``pairDeviceId`` is ``of:205``.

- Each device in the ToR pair needs to specify the **port on the device used
  for the pair link** in the ``pairLocalPort`` field. For example, the config
  above shows that port 20 on ``of:205`` is connected to port 30 on ``of:206``.

In addition, one crucial piece of config needs to **match for both ToRs**: the
``routerMac`` address. The paired ToRs MUST have the same ``routerMac``. In the
example above, both use ``00:00:02:05:00:01``.

All other fields are the same as before, as explained in the :doc:`Device
Configuration <../../configuration/network>` section.

Usage of pair link
^^^^^^^^^^^^^^^^^^
.. image:: ../../images/config-dh-pair-link.png
Dual-Homed Servers
------------------
There are a number of things to note when connecting dual-homed servers to paired-ToRs.
- The switch ports on the two ToRs have to be configured the same way, when
connecting a dual-homed server to the two ToRs.
- The server ports have to be Linux-bonded in a particular mode.
Configure Switch Ports
^^^^^^^^^^^^^^^^^^^^^^
Ports are configured in much the same way as described in :doc:`Bridging and
Unicast <../../configuration/network>`. However, there are a couple of things
to note.

**First**, a dual-homed server should have **identical configuration on each
switch port it connects to on the ToR pair**. The example below shows that
the ``vlans`` and ``ips`` configured are the same on both switch ports
``of:205/12`` and ``of:206/29``. Both are configured as access ports in
``VLAN 20``, the subnet ``10.0.2.0/24`` is assigned to these ports, and the
gateway IP is ``10.0.2.254``.

.. code-block:: json

    {
      "ports" : {
        "of:0000000000000205/12" : {
          "interfaces" : [{
            "name" : "h3-intf-1",
            "ips" : [ "10.0.2.254/24" ],
            "vlan-untagged": 20
          }]
        },
        "of:0000000000000206/29" : {
          "interfaces" : [{
            "name" : "h3-intf-2",
            "ips" : [ "10.0.2.254/24" ],
            "vlan-untagged": 20
          }]
        }
      }
    }

It is worth noting the meaning behind the configuration above from a routing
perspective. Simply put, by configuring the same subnets on these switch
ports, the fabric now believes that the entire subnet ``10.0.2.0/24`` is
reachable by BOTH ToR switches ``of:205`` and ``of:206``.
.. caution::
    Configuring different VLANs, or different subnets, or mismatches like
    ``vlan-untagged`` in one switch port and ``vlan-tagged`` in the
    corresponding switch port facing the dual-homed server, will result in
    incorrect behavior.

**Second**, the **pair link ports on both ToR switches** need to be configured
as trunk (``vlan-tagged``) ports that contain **all dual-homed VLANs and
subnets**. This is an extra piece of configuration, the need for which will be
removed in future releases. In the example above, a dual-homed server connects
to the ToR pair on port 12 of ``of:205`` and port 29 of ``of:206``. Assume that
the pair link between the two ToRs is connected to port 5 of both ``of:205``
and ``of:206``. The config for these switch ports is shown below:

.. code-block:: json

    {
      "ports": {
        "of:0000000000000205/5" : {
          "interfaces" : [{
            "name" : "205-pair-port",
            "ips" : [ "10.0.2.254/24" ],
            "vlan-tagged": [20]
          }]
        },
        "of:0000000000000206/5" : {
          "interfaces" : [{
            "name" : "206-pair-port",
            "ips" : [ "10.0.2.254/24" ],
            "vlan-tagged": [20]
          }]
        }
      }
    }

.. note::
    Even though the ports ``of:205/12`` and ``of:206/29`` facing the dual-homed
    server are configured as ``vlan-untagged``, the same VLAN MUST be
    configured as ``vlan-tagged`` on the pair ports.

    If additional subnets and VLANs are configured facing other dual-homed
    servers, they need to be similarly added to the ``ips`` and ``vlan-tagged``
    arrays in the pair port config.

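
As a sketch of how the pair port config grows, suppose a second dual-homed
server is placed in ``VLAN 30`` with subnet ``10.0.3.0/24`` and gateway IP
``10.0.3.254``. These values are hypothetical, chosen only for illustration.
The pair ports would then carry both VLANs and both subnets:

.. code-block:: json

    {
      "ports": {
        "of:0000000000000205/5" : {
          "interfaces" : [{
            "name" : "205-pair-port",
            "ips" : [ "10.0.2.254/24", "10.0.3.254/24" ],
            "vlan-tagged": [20, 30]
          }]
        },
        "of:0000000000000206/5" : {
          "interfaces" : [{
            "name" : "206-pair-port",
            "ips" : [ "10.0.2.254/24", "10.0.3.254/24" ],
            "vlan-tagged": [20, 30]
          }]
        }
      }
    }

The switch ports facing the second server would still need identical
configuration on both ToRs, following the first rule described earlier.
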
Configure Servers
^^^^^^^^^^^^^^^^^
Assume the interfaces used for bonding are ``eth1`` and ``eth2``.
- Bring down interfaces

.. code-block:: console

    $ sudo ifdown eth1
    $ sudo ifdown eth2

- Modify ``/etc/network/interfaces``

.. code-block:: text

    auto bond0
    iface bond0 inet dhcp
        bond-mode balance-xor
        bond-xmit_hash_policy layer2+3
        bond-slaves none

    auto eth1
    iface eth1 inet manual
        bond-master bond0

    auto eth2
    iface eth2 inet manual
        bond-master bond0

- Start interfaces

.. code-block:: console

    $ sudo ifup bond0
    $ sudo ifup eth1
    $ sudo ifup eth2

- Useful command to check bonding status

.. code-block:: console

    # cat /proc/net/bonding/bond0
    Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

    Bonding Mode: load balancing (xor)
    Transmit Hash Policy: layer2+3 (2)
    MII Status: up
    MII Polling Interval (ms): 0
    Up Delay (ms): 0
    Down Delay (ms): 0

    Slave Interface: eth1
    MII Status: up
    Speed: 1000 Mbps
    Duplex: full
    Link Failure Count: 0
    Permanent HW addr: 00:1c:42:5b:07:6a
    Slave queue ID: 0

    Slave Interface: eth2
    MII Status: up
    Speed: Unknown
    Duplex: Unknown
    Link Failure Count: 0
    Permanent HW addr: 00:1c:42:1c:a1:7c
    Slave queue ID: 0

.. caution::
    **Dual-homed hosts should not be statically configured.**

    Currently in ONOS, configured hosts are not updated when the
    ``connectPoint`` is lost. This is not a problem with single-homed hosts
    because there is no other way to reach them if their ``connectPoint`` goes
    down. In dual-homed scenarios, however, the controller should take
    corrective action if one of the connect points goes down, and the trigger
    for this event does not fire when the dual-homed host's connect points are
    configured (not discovered).

.. note::
    We also support static routes with a dual-homed next hop. The way to
    configure it is exactly the same as for a regular single-homed next hop,
    as described in :doc:`External Connectivity <external-connectivity>`.
    ONOS will automatically recognize when the next-hop IP resolves to a
    dual-homed host and program both switches that the host connects to
    accordingly.

    The failure recovery mechanism for dual-homed hosts also applies to static
    routes that point to the host as their next hop.

Dual External Routers
---------------------
.. image:: ../../images/config-dh-vr.png

.. image:: ../../images/config-dh-vr-logical.png
    :width: 200px

In addition to what we describe in :doc:`External Connectivity
<external-connectivity>`, SD-Fabric also supports dual external routers, which
view the SD-Fabric as two individual routers, as shown above.

As before, the vRouter control plane is implemented as a combination of Quagga,
which peers with the upstream routers, and ONOS, which listens to Quagga (via
FPM) and programs the underlying fabric. **In dual-router scenarios, two
instances of Quagga are required**.

As before, the hardware fabric serves as the data plane of vRouter. In
dual-router scenarios, the **external routers MUST be connected to
paired ToRs**.

ToR connects to one upstream
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Let's consider the simpler case where each external router is connected to a
single leaf in a ToR pair. The figure on the left below shows the logical
view, and the figure on the right shows the physical connectivity.

.. image:: ../../images/config-dh-vr-logical-simple.png
    :width: 200px

.. image:: ../../images/config-dh-vr-physical-simple.png
    :width: 400px

One of the upstream routers is connected to ``of:205`` and the other is
connected to ``of:206``. Note that ``of:205`` and ``of:206`` are paired ToRs.
The ToRs are connected via a physical port to separate Quagga VMs or
containers. These Quagga instances can be placed in any compute node. They do
not need to be in the same server, and are only shown to be co-located for
simplicity.
The two Quagga instances do NOT talk to each other.
Switch port configuration
"""""""""""""""""""""""""
The ToRs follow the same rules as the single-router case described in
:doc:`External Connectivity <external-connectivity>`. In the example shown
above, the switch port config would look like this:

.. code-block:: json

    {
      "ports": {
        "of:0000000000000205/1" : {
          "interfaces" : [{
            "ips" : [ "10.0.100.3/29", "2000::6403/125" ],
            "vlan-untagged": 100,
            "name" : "internet-router-1"
          }]
        },
        "of:0000000000000205/48" : {
          "interfaces" : [{
            "ips" : [ "10.0.100.3/29", "2000::6403/125" ],
            "vlan-untagged": 100,
            "name" : "quagga-1"
          }]
        },
        "of:0000000000000206/1" : {
          "interfaces" : [{
            "ips" : [ "10.0.200.3/29", "2000::6503/125" ],
            "vlan-untagged": 200,
            "name" : "internet-router-2"
          }]
        },
        "of:0000000000000206/48" : {
          "interfaces" : [{
            "ips" : [ "10.0.200.3/29", "2000::6503/125" ],
            "vlan-untagged": 200,
            "name" : "quagga-2"
          }]
        }
      }
    }

.. note::
    In the example shown above, switch ``of:205`` uses ``VLAN 100`` for
    bridging the peering session between Quagga1 and ExtRouter1, while switch
    ``of:206`` uses ``VLAN 200`` to do the same for the other peering session.
    But since these VLANs and bridging domains are defined on different
    switches, the VLAN IDs could have been the same.

    This philosophy is consistent with the fabric's use of :doc:`bridging
    <../../configuration/network>`.

Quagga configuration
""""""""""""""""""""
Configuring Quagga for dual external routers is similar to what we described
in :doc:`External Connectivity <external-connectivity>`. However, it is worth
noting that:

- The two Zebra instances **should point to two different ONOS instances** for
  their FPM connections. For example, Zebra in Quagga1 could point to one ONOS
  instance with ``fpm connection ip 10.6.0.1 port 2620``, while the other
  Zebra points to a different ONOS instance with
  ``fpm connection ip 10.6.0.2 port 2620``. It does not matter which ONOS
  instances they point to, as long as they are different.

- The two Quagga BGP sessions should appear to come from different routers but
  still use the same AS number. That is, the two Quagga instances belong to
  the same AS, the one used to represent the entire SD-Fabric.

- The two upstream routers can belong to the same AS or to different ASes, but
  these AS numbers should be different from the one used to represent the
  SD-Fabric.

- Typically both Quagga instances advertise the same routes to the upstream.
  These prefixes, which belong to various infrastructure nodes in the
  deployment, should be reachable from either of the leaf switches connected
  to the upstream routers.

- The upstream routers may or may not advertise the same routes. SD-Fabric will
  ensure that traffic directed to a route reachable via only one upstream
  router is directed to the appropriate leaf.

ToR connects to both upstream
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Now let's consider the **more complicated but more fault-tolerant** case of
each Quagga instance peering with BOTH external routers. Again, the logical
view is shown on the left and the physical view on the right.

.. image:: ../../images/config-dh-vr-logical.png
    :width: 200px

.. image:: ../../images/config-dh-vr-physical.png
    :width: 500px

First, let's look at the physical connectivity:

- Quagga instance 1 peers with external router R1 via port 1 on switch ``of:205``
- Quagga instance 1 peers with external router R2 via port 2 on switch ``of:205``

Similarly:

- Quagga instance 2 peers with external router R1 via port 2 on switch ``of:206``
- Quagga instance 2 peers with external router R2 via port 1 on switch ``of:206``

To distinguish between the two peering sessions on the same physical switch,
say ``of:205``, the physical ports 1 and 2 need to be configured in
**different VLANs and subnets**. For example, port 1 on ``of:205`` is
(untagged) in VLAN 100, while port 2 is in VLAN 101.

Note that peering for **Quagga1 and R1** happens with IPs in the
``10.0.100.0/29`` subnet, and for **Quagga1 and R2** in the ``10.0.101.0/29``
subnet.

Furthermore, the **Quagga-facing port** (port 48) on ``of:205`` carries both
peering sessions to Quagga1. Thus port 48 should now be configured as a
**trunk port** (``vlan-tagged``) **with both VLANs and both subnets**.

Finally, the **Quagga interface** on the VM now needs **sub-interface
configuration for each VLAN ID**.

Similar configuration concepts apply to IPv6 as well. Here is the switch port
config in ONOS for ``of:205``:

.. code-block:: json

    {
      "ports": {
        "of:0000000000000205/1" : {
          "interfaces" : [{
            "ips" : [ "10.0.100.3/29", "2000::6403/125" ],
            "vlan-untagged": 100,
            "name" : "internet-router1"
          }]
        },
        "of:0000000000000205/2" : {
          "interfaces" : [{
            "ips" : [ "10.0.101.3/29", "2000::7403/125" ],
            "vlan-untagged": 101,
            "name" : "internet-router2"
          }]
        },
        "of:0000000000000205/48" : {
          "interfaces" : [{
            "ips" : [ "10.0.100.3/29", "2000::6403/125", "10.0.101.3/29", "2000::7403/125" ],
            "vlan-tagged": [100, 101],
            "name" : "quagga1"
          }]
        }
      }
    }

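
The config for ``of:206`` would presumably mirror the one above. The sketch
below follows the physical connectivity listed earlier (R2 on port 1, R1 on
port 2) and reuses ``VLAN 200`` and ``10.0.200.3/29`` from the simpler case.
Everything else is an assumption made for illustration: ``VLAN 201``, the
``10.0.201.3/29`` and ``2000::7503/125`` addresses, and the use of port 48 as
the port facing Quagga2.

.. code-block:: json

    {
      "ports": {
        "of:0000000000000206/1" : {
          "interfaces" : [{
            "ips" : [ "10.0.200.3/29", "2000::6503/125" ],
            "vlan-untagged": 200,
            "name" : "internet-router2"
          }]
        },
        "of:0000000000000206/2" : {
          "interfaces" : [{
            "ips" : [ "10.0.201.3/29", "2000::7503/125" ],
            "vlan-untagged": 201,
            "name" : "internet-router1"
          }]
        },
        "of:0000000000000206/48" : {
          "interfaces" : [{
            "ips" : [ "10.0.200.3/29", "2000::6503/125", "10.0.201.3/29", "2000::7503/125" ],
            "vlan-tagged": [200, 201],
            "name" : "quagga2"
          }]
        }
      }
    }

As on ``of:205``, the two router-facing ports sit in different VLANs and
subnets, and the Quagga-facing port is a trunk port carrying both.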