| .. SPDX-FileCopyrightText: 2021 Open Networking Foundation <info@opennetworking.org> |
| .. SPDX-License-Identifier: Apache-2.0 |
| |
| Dual Homing |
| =========== |
| |
| Overview |
| -------- |
| |
| .. image:: ../../images/config-dh.png |
| |
| The dual-homing feature includes several sub-components: |
| |
| - **Use of "paired" ToRs**: Each rack of compute nodes has exactly two |
| Top-of-Rack switches (ToRs) that are linked to each other via a single link. |
| Such a link is referred to as a **pair link**. This pairing should NOT be |
| omitted. |
| |
| Currently there is support for only a single link between paired ToRs. In |
| future releases, we may include dual pair links. Note that the pair link is |
| only used in failure scenarios, and not in normal operation. |
| |
| - **Dual-homed servers (compute-nodes)**: Each server is connected to both |
| ToRs. The links to the paired ToRs are (Linux) bonded. |
| |
| - **Dual-homed upstream routers**: The upstream routers MUST be connected to |
| the two ToRs that are part of a leaf-pair. You cannot connect them to leafs |
| that are not paired. This feature also requires two Quagga instances. |
| |
| - **Dual-homed access devices**: This component will be added in the future. |
| |
| Paired ToRs |
| ----------- |
| The reasoning behind two ToR (leaf) switches is simple. If you have only a |
| single ToR switch and you lose it, the entire rack goes down. Using two ToR |
| switches improves the odds of continued connectivity for dual-homed servers. |
| The reasoning behind pairing the two ToR switches is more involved, and is |
| covered in the "Usage of pair link" section below. |
| |
| Configure paired ToRs |
| ^^^^^^^^^^^^^^^^^^^^^ |
| Configuring paired-ToRs involves device configuration. Assume switches of:205 |
| and of:206 are paired ToRs. |
| |
| .. code-block:: json |
| |
| { |
| "devices" : { |
| "of:0000000000000205" : { |
| "segmentrouting" : { |
| "name" : "Leaf1-R2", |
| "ipv4NodeSid" : 205, |
| "ipv4Loopback" : "192.168.0.205", |
| "ipv6NodeSid" : 205, |
| "ipv6Loopback" : "2000::c0a8:0205", |
| "routerMac" : "00:00:02:05:00:01", |
| "pairDeviceId" : "of:0000000000000206", |
| "pairLocalPort" : 20, |
| "isEdgeRouter" : true, |
| "adjacencySids" : [] |
| } |
| }, |
| "of:0000000000000206" : { |
| "segmentrouting" : { |
| "name" : "Leaf2-R2", |
| "ipv4NodeSid" : 206, |
| "ipv4Loopback" : "192.168.0.206", |
| "ipv6NodeSid" : 206, |
| "ipv6Loopback" : "2000::c0a8:0206", |
| "routerMac" : "00:00:02:05:00:01", |
| "pairDeviceId" : "of:0000000000000205", |
| "pairLocalPort" : 30, |
| "isEdgeRouter" : true, |
| "adjacencySids" : [] |
| } |
| } |
| } |
| } |
| |
| There are two new pieces of device configuration: |
| |
| - Each device in the ToR pair needs to specify the **deviceId of the leaf it |
| is paired to** in the ``pairDeviceId`` field. For example, in the |
| configuration of ``of:205`` the ``pairDeviceId`` is ``of:206``, and in the |
| configuration of ``of:206`` the ``pairDeviceId`` is ``of:205``. |
| |
| - Each device in the ToR pair needs to specify the **port on the device used |
| for the pair link** in the ``pairLocalPort`` field. For example, the config |
| above shows that port 20 on ``of:205`` is connected to port 30 on ``of:206``. |
| |
| In addition, there is one crucial piece of config that needs to **match for |
| both ToRs**: the ``routerMac`` address. The paired ToRs MUST have the same |
| ``routerMac``. In the example above, both use ``00:00:02:05:00:01``. |
| |
| All other fields are the same as before, as explained in the :doc:`Device |
| Configuration <../../configuration/network>` section. |
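| |
| The pairing rules above can be sanity-checked before the config is pushed. |
| The following is a minimal Python sketch, not part of ONOS or SD-Fabric: the |
| helper name ``check_tor_pair`` is hypothetical, and it assumes the ``devices`` |
| section of the network config has already been loaded into a dict. |
| |
```python
def check_tor_pair(devices, dev_a, dev_b):
    """Return a list of problems found in the pairing of dev_a and dev_b.

    `devices` is the "devices" section of the network config, as a dict.
    This helper is a hypothetical illustration, not an ONOS API.
    """
    problems = []
    cfg_a = devices[dev_a]["segmentrouting"]
    cfg_b = devices[dev_b]["segmentrouting"]

    # Each ToR must name the other one in pairDeviceId.
    if cfg_a.get("pairDeviceId") != dev_b:
        problems.append(f"{dev_a}: pairDeviceId should be {dev_b}")
    if cfg_b.get("pairDeviceId") != dev_a:
        problems.append(f"{dev_b}: pairDeviceId should be {dev_a}")

    # Each ToR must state which local port carries the pair link.
    for dev, cfg in ((dev_a, cfg_a), (dev_b, cfg_b)):
        if "pairLocalPort" not in cfg:
            problems.append(f"{dev}: pairLocalPort is missing")

    # Paired ToRs must share the same routerMac.
    if cfg_a.get("routerMac") != cfg_b.get("routerMac"):
        problems.append("routerMac differs between the paired ToRs")

    return problems


# Trimmed-down version of the example device config above.
devices = {
    "of:0000000000000205": {"segmentrouting": {
        "routerMac": "00:00:02:05:00:01",
        "pairDeviceId": "of:0000000000000206",
        "pairLocalPort": 20,
    }},
    "of:0000000000000206": {"segmentrouting": {
        "routerMac": "00:00:02:05:00:01",
        "pairDeviceId": "of:0000000000000205",
        "pairLocalPort": 30,
    }},
}
print(check_tor_pair(devices, "of:0000000000000205", "of:0000000000000206"))
```
| |
| An empty list means the pair passes these three checks. The sketch does not |
| validate any other part of the device configuration. |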
| |
| |
| Usage of pair link |
| ^^^^^^^^^^^^^^^^^^ |
| |
| .. image:: ../../images/config-dh-pair-link.png |
| |
| |
| Dual-Homed Servers |
| ------------------ |
| |
| There are a number of things to note when connecting dual-homed servers to paired-ToRs. |
| |
| - The switch ports on the two ToRs have to be configured the same way, when |
| connecting a dual-homed server to the two ToRs. |
| |
| - The server ports have to be Linux-bonded in a particular mode. |
| |
| Configure Switch Ports |
| ^^^^^^^^^^^^^^^^^^^^^^ |
| |
| Ports are configured in a similar way to that described in :doc:`Bridging and |
| Unicast <../../configuration/network>`. However, there are a couple of things to note. |
| |
| **First**, a dual-homed server should have **identical configuration on each |
| switch port it connects to on the ToR pair**. The example below shows that |
| the VLAN and ``ips`` configured are the same on both switch ports |
| ``of:205/12`` and ``of:206/29``. They are both configured to be access ports |
| in ``VLAN 20``, the subnet ``10.0.2.0/24`` is assigned to these ports, and the |
| gateway IP is ``10.0.2.254``. |
| |
| .. code-block:: json |
| |
| { |
| "ports" : { |
| "of:0000000000000205/12" : { |
| "interfaces" : [{ |
| "name" : "h3-intf-1", |
| "ips" : [ "10.0.2.254/24"], |
| "vlan-untagged": 20 |
| }] |
| }, |
| "of:0000000000000206/29" : { |
| "interfaces" : [{ |
| "name" : "h3-intf-2", |
| "ips" : [ "10.0.2.254/24"], |
| "vlan-untagged": 20 |
| }] |
| } |
| } |
| } |
| |
| It is worth noting the meaning behind the configuration above from a routing |
| perspective. Simply put, by configuring the same subnets on these switch |
| ports, the fabric now believes that the entire subnet ``10.0.2.0/24`` is |
| reachable by BOTH ToR switches ``of:205`` and ``of:206``. |
| |
| .. caution:: |
| Configuring different VLANs, or different subnets, or mismatches like |
| ``vlan-untagged`` in one switch port and ``vlan-tagged`` in the corresponding |
| switch port facing the dual-homed server, will result in incorrect |
| behavior. |
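| |
| The "identical configuration" rule can be checked mechanically. Below is a |
| minimal Python sketch under stated assumptions: the helper names are |
| hypothetical, only the ``ips``, ``vlan-untagged`` and ``vlan-tagged`` fields |
| shown in this section are compared, and interface names are ignored because |
| they are expected to differ. |
| |
```python
def interface_signature(interface):
    """Reduce an interface entry to the fields that must match on both ToRs."""
    return (
        sorted(interface.get("ips", [])),
        interface.get("vlan-untagged"),
        sorted(interface.get("vlan-tagged", [])),
    )


def dual_homed_ports_match(ports, port_a, port_b):
    """True if the two switch ports carry the same VLAN and subnet config.

    Interfaces are compared in the order they are listed on each port.
    """
    sigs_a = [interface_signature(i) for i in ports[port_a]["interfaces"]]
    sigs_b = [interface_signature(i) for i in ports[port_b]["interfaces"]]
    return sigs_a == sigs_b


# Same values as the port config shown above.
ports = {
    "of:0000000000000205/12": {"interfaces": [
        {"name": "h3-intf-1", "ips": ["10.0.2.254/24"], "vlan-untagged": 20},
    ]},
    "of:0000000000000206/29": {"interfaces": [
        {"name": "h3-intf-2", "ips": ["10.0.2.254/24"], "vlan-untagged": 20},
    ]},
}
print(dual_homed_ports_match(
    ports, "of:0000000000000205/12", "of:0000000000000206/29"))
```
| |
| A ``vlan-untagged`` port paired with a ``vlan-tagged`` port, or two ports in |
| different subnets, would make the check return ``False``. |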
| |
| **Second**, we need to configure the **pair link ports on both ToR switches to |
| be trunk (vlan-tagged) ports that contain all dual-homed VLANs and subnets**. |
| This is an extra piece of configuration, the need for which will be removed in |
| future releases. In the example above, a dual-homed server connects to the ToR |
| pair on port 12 on ``of:205`` and port 29 on ``of:206``. Assume that the pair |
| link between the two ToRs is connected to port 5 on both ``of:205`` and |
| ``of:206``. The config for these switch ports is shown below: |
| |
| .. code-block:: json |
| |
| { |
| "ports": { |
| "of:0000000000000205/5" : { |
| "interfaces" : [{ |
| "name" : "205-pair-port", |
| "ips" : [ "10.0.2.254/24"], |
| "vlan-tagged": [20] |
| }] |
| }, |
| "of:0000000000000206/5" : { |
| "interfaces" : [{ |
| "name" : "206-pair-port", |
| "ips" : [ "10.0.2.254/24"], |
| "vlan-tagged": [20] |
| }] |
| } |
| } |
| } |
| |
| .. note:: |
| Even though the ports ``of:205/12`` and ``of:206/29`` facing the dual-homed |
| server are configured as ``vlan-untagged``, the same VLAN MUST be |
| configured as ``vlan-tagged`` on the pair-ports. |
| |
| If additional subnets and VLANs are configured facing other dual-homed |
| servers, they need to be similarly added to the ``ips`` and ``vlan-tagged`` |
| arrays in the pair port config. |
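| |
| Because the pair port config is fully determined by the dual-homed access |
| ports, it can be derived rather than typed by hand. The following Python |
| sketch illustrates the idea. The helper name, port ``of:205/13`` and its |
| VLAN 30 and subnet ``10.0.3.0/24`` are hypothetical additions used only to |
| show a second dual-homed VLAN. |
| |
```python
def pair_port_interface(ports, dual_homed_ports, name):
    """Build the pair port interface entry from dual-homed access ports.

    Collects every gateway IP and every VLAN (untagged or tagged) seen on
    the given ports and returns them as a vlan-tagged (trunk) interface.
    """
    ips, vlans = set(), set()
    for port in dual_homed_ports:
        for interface in ports[port]["interfaces"]:
            ips.update(interface.get("ips", []))
            if "vlan-untagged" in interface:
                vlans.add(interface["vlan-untagged"])
            vlans.update(interface.get("vlan-tagged", []))
    return {"name": name, "ips": sorted(ips), "vlan-tagged": sorted(vlans)}


ports = {
    # From the example above.
    "of:0000000000000205/12": {"interfaces": [
        {"name": "h3-intf-1", "ips": ["10.0.2.254/24"], "vlan-untagged": 20},
    ]},
    # Hypothetical second dual-homed server in another VLAN and subnet.
    "of:0000000000000205/13": {"interfaces": [
        {"name": "h4-intf-1", "ips": ["10.0.3.254/24"], "vlan-untagged": 30},
    ]},
}
pair_port = pair_port_interface(ports, list(ports), "205-pair-port")
print(pair_port)
```
| |
| The resulting entry would go under ``of:0000000000000205/5`` in the example |
| above, and the same derivation applies to the pair port on ``of:206``. |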
| |
| |
| Configure Servers |
| ^^^^^^^^^^^^^^^^^ |
| |
| Assume the interfaces used for bonding are ``eth1`` and ``eth2``. |
| |
| - Bring down interfaces |
| |
| .. code-block:: console |
| |
| $ sudo ifdown eth1 |
| $ sudo ifdown eth2 |
| |
| - Modify ``/etc/network/interfaces`` |
| |
| .. code-block:: text |
| |
| auto bond0 |
| iface bond0 inet dhcp |
| bond-mode balance-xor |
| bond-xmit_hash_policy layer2+3 |
| bond-slaves none |
| |
| auto eth1 |
| iface eth1 inet manual |
| bond-master bond0 |
| |
| auto eth2 |
| iface eth2 inet manual |
| bond-master bond0 |
| |
| |
| - Start interfaces |
| |
| .. code-block:: console |
| |
| $ sudo ifup bond0 |
| $ sudo ifup eth1 |
| $ sudo ifup eth2 |
| |
| - Useful command to check bonding status |
| |
| .. code-block:: console |
| |
| # cat /proc/net/bonding/bond0 |
| Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011) |
| |
| Bonding Mode: load balancing (xor) |
| Transmit Hash Policy: layer2+3 (2) |
| MII Status: up |
| MII Polling Interval (ms): 0 |
| Up Delay (ms): 0 |
| Down Delay (ms): 0 |
| |
| Slave Interface: eth1 |
| MII Status: up |
| Speed: 1000 Mbps |
| Duplex: full |
| Link Failure Count: 0 |
| Permanent HW addr: 00:1c:42:5b:07:6a |
| Slave queue ID: 0 |
| |
| Slave Interface: eth2 |
| MII Status: up |
| Speed: Unknown |
| Duplex: Unknown |
| Link Failure Count: 0 |
| Permanent HW addr: 00:1c:42:1c:a1:7c |
| Slave queue ID: 0 |
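| |
| The same status file can be checked from a script, for example to flag a bond |
| member that is down. This is a minimal Python sketch that only assumes the |
| ``key: value`` layout shown above; the function name is hypothetical, and the |
| embedded sample is abbreviated, with ``eth2`` marked down purely for |
| illustration. |
| |
```python
def parse_bonding_status(text):
    """Summarize the text of /proc/net/bonding/<bond>.

    Returns {"mode": <bonding mode>, "slaves": {<interface>: <MII status>}}.
    """
    summary = {"mode": None, "slaves": {}}
    current = None  # slave section currently being read, if any
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            summary["mode"] = value
        elif key == "Slave Interface":
            current = value
            summary["slaves"][current] = None
        elif key == "MII Status" and current is not None:
            # Only per-slave status is recorded; the bond-level MII status
            # appears before the first "Slave Interface" line and is skipped.
            summary["slaves"][current] = value
    return summary


# Abbreviated sample; eth2 is shown as down purely for illustration.
sample = """\
Bonding Mode: load balancing (xor)
MII Status: up

Slave Interface: eth1
MII Status: up

Slave Interface: eth2
MII Status: down
"""
status = parse_bonding_status(sample)
down = [name for name, mii in status["slaves"].items() if mii != "up"]
print(status["mode"], down)
```
| |
| On a real server the text would be read from ``/proc/net/bonding/bond0`` |
| instead of the embedded sample. |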
| |
| .. caution:: |
| **Dual-homed hosts should not be statically configured.** |
| |
| Currently in ONOS, configured hosts are not updated when the ``connectPoint`` |
| is lost. This is not a problem with single-homed hosts because there is no |
| other way to reach them anyway if their ``connectPoint`` goes down. But in |
| dual-homed scenarios, the controller should take corrective action if one |
| of the connect points goes down. The trigger for this event does not fire |
| when the dual-homed host's connect points are configured (not discovered). |
| |
| .. note:: |
| We also support static routes with dual-homed next hop. The way to |
| configure it is exactly the same as regular single-homed next hop, as |
| described in :doc:`External Connectivity <external-connectivity>`. |
| |
| ONOS will automatically recognize when the next-hop IP resolves to a |
| dual-homed host and program both switches that the host connects to |
| accordingly. |
| |
| The failure recovery mechanism for dual-homed hosts also applies to static |
| routes that point to the host as their next hop. |
| |
| Dual External Routers |
| --------------------- |
| |
| .. image:: ../../images/config-dh-vr.png |
| |
| .. image:: ../../images/config-dh-vr-logical.png |
| :width: 200px |
| |
| In addition to what we describe in :doc:`External Connectivity |
| <external-connectivity>`, SD-Fabric also supports dual external routers, which |
| view the SD-Fabric as two individual routers, as shown above. |
| |
| As before, the vRouter control plane is implemented as a combination of Quagga, |
| which peers with the upstream routers, and ONOS, which listens to Quagga (via |
| FPM) and programs the underlying fabric. **In dual-router scenarios, two |
| instances of Quagga are required**. |
| |
| As before, the hardware fabric serves as the data plane of vRouter. In |
| dual-router scenarios, the **external routers MUST be connected to |
| paired ToRs**. |
| |
| ToR connects to one upstream |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| Let's consider the simpler case where the external routers are each connected to |
| a single leaf in a ToR pair. The figure on the left below shows the logical |
| view. The figure on the right shows the physical connectivity. |
| |
| .. image:: ../../images/config-dh-vr-logical-simple.png |
| :width: 200px |
| |
| .. image:: ../../images/config-dh-vr-physical-simple.png |
| :width: 400px |
| |
| One of the upstream routers is connected to ``of:205`` and the other is |
| connected to ``of:206``. Note that ``of:205`` and ``of:206`` are paired ToRs. |
| |
| The ToRs are connected via a physical port to separate Quagga VMs or |
| containers. These Quagga instances can be placed in any compute node. They do |
| not need to be in the same server, and are only shown to be co-located for |
| simplicity. |
| |
| The two Quagga instances do NOT talk to each other. |
| |
| Switch port configuration |
| """"""""""""""""""""""""" |
| |
| The ToRs follow the same rules as in the single-router case described in |
| :doc:`External Connectivity <external-connectivity>`. In the example shown |
| above, the switch port config would look like this: |
| |
| .. code-block:: json |
| |
| { |
| "ports": { |
| "of:0000000000000205/1" : { |
| "interfaces" : [{ |
| "ips" : [ "10.0.100.3/29", "2000::6403/125" ], |
| "vlan-untagged": 100, |
| "name" : "internet-router-1" |
| }] |
| }, |
| |
| "of:0000000000000205/48" : { |
| "interfaces" : [{ |
| "ips" : [ "10.0.100.3/29", "2000::6403/125" ], |
| "vlan-untagged": 100, |
| "name" : "quagga-1" |
| }] |
| }, |
| |
| "of:0000000000000206/1" : { |
| "interfaces" : [{ |
| "ips" : [ "10.0.200.3/29", "2000::6503/125" ], |
| "vlan-untagged": 200, |
| "name" : "internet-router-2" |
| }] |
| }, |
| |
| "of:0000000000000206/48" : { |
| "interfaces" : [{ |
| "ips" : [ "10.0.200.3/29", "2000::6503/125" ], |
| "vlan-untagged": 200, |
| "name" : "quagga2" |
| }] |
| } |
| } |
| } |
| |
| .. note:: |
| In the example shown above, switch ``of:205`` uses ``VLAN 100`` for |
| bridging the peering session between Quagga1 and ExtRouter1, while switch |
| ``of:206`` uses ``VLAN 200`` to do the same for the other peering session. |
| But since these VLANs and bridging domains are defined on different |
| switches, the VLAN ids could have been the same. |
| |
| This philosophy is consistent with the fabric's use of :doc:`bridging |
| <../../configuration/network>`. |
| |
| |
| Quagga configuration |
| """""""""""""""""""" |
| Configuring Quagga for dual external routers is similar to what we described |
| in :doc:`External Connectivity <external-connectivity>`. However, it is worth |
| noting that: |
| |
| - The two Zebra instances **should point to two different ONOS instances** for |
| their FPM connections. For example, Zebra in Quagga1 could point to one ONOS |
| instance with ``fpm connection ip 10.6.0.1 port 2620``, while the other Zebra |
| points to a different ONOS instance with |
| ``fpm connection ip 10.6.0.2 port 2620``. It does not matter which ONOS |
| instances they point to as long as they are different. |
| |
| - The two Quagga BGP sessions should appear to come from different routers but |
| still use the same AS number. That is, both Quagga instances belong to the |
| same AS, the one used to represent the entire SD-Fabric. |
| |
| - The two upstream routers can belong to the same or different AS, but these AS |
| numbers should be different from the one used to represent the SD-Fabric AS. |
| |
| - Typically both Quagga instances advertise the same routes to the upstream. |
| These prefixes belonging to various infrastructure nodes in the deployment |
| should be reachable from either of the leaf switches connected to the |
| upstream routers. |
| |
| - The upstream routers may or may not advertise the same routes. SD-Fabric will |
| ensure that traffic directed to a route reachable via only one upstream router |
| is directed to the appropriate leaf. |
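| |
| The FPM and AS-number rules in the list above can be written down as a small |
| pre-deployment check. This is only a Python sketch over a hypothetical |
| planning structure: the function name, the dict keys ``name``, ``fpm_ip`` and |
| ``local_as``, and the AS numbers are all assumptions made for illustration. |
| Only the two FPM addresses come from the example above. |
| |
```python
def check_quagga_plan(fabric_as, quaggas, upstream_as_numbers):
    """Return a list of violations of the dual-Quagga rules listed above."""
    problems = []

    # The two Zebra instances must use different ONOS instances for FPM.
    fpm_targets = [q["fpm_ip"] for q in quaggas]
    if len(set(fpm_targets)) != len(fpm_targets):
        problems.append("Zebra instances must point to different ONOS instances")

    # Both Quagga instances belong to the AS that represents the fabric.
    for q in quaggas:
        if q["local_as"] != fabric_as:
            problems.append(f"{q['name']}: local AS must be {fabric_as}")

    # Upstream routers may share an AS, but never the fabric's AS.
    for asn in upstream_as_numbers:
        if asn == fabric_as:
            problems.append(f"upstream AS {asn} must differ from the fabric AS")

    return problems


# Hypothetical plan; AS numbers are picked from the private range.
quaggas = [
    {"name": "Quagga1", "fpm_ip": "10.6.0.1", "local_as": 65001},
    {"name": "Quagga2", "fpm_ip": "10.6.0.2", "local_as": 65001},
]
print(check_quagga_plan(65001, quaggas, [65002, 65003]))
```
| |
| An empty list means the plan is consistent with the rules above. The sketch |
| does not inspect the actual Quagga configuration files. |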
| |
| ToR connects to both upstream |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| Now let's consider the **more complicated but more fault-tolerant** case of each |
| Quagga instance peering with BOTH external routers. Again, the logical view is |
| shown on the left and the physical view on the right. |
| |
| .. image:: ../../images/config-dh-vr-logical.png |
| :width: 200px |
| |
| .. image:: ../../images/config-dh-vr-physical.png |
| :width: 500px |
| |
| First, let's look at the physical connectivity: |
| |
| - Quagga instance 1 peers with external router R1 via port 1 on switch of:205 |
| - Quagga instance 1 peers with external router R2 via port 2 on switch of:205 |
| |
| Similarly: |
| |
| - Quagga instance 2 peers with external router R1 via port 2 on switch of:206 |
| - Quagga instance 2 peers with external router R2 via port 1 on switch of:206 |
| |
| To distinguish between the two peering sessions in the same physical switch, |
| say of:205, the physical ports 1 and 2 need to be configured in **different |
| VLANs and subnets**. For example, port 1 on of:205 is (untagged) in VLAN 100, |
| while port 2 is in VLAN 101. |
| |
| Note that peering for **Quagga1 and R1** happens with IPs in the |
| ``10.0.100.0/29`` subnet, and for **Quagga1 and R2** in the ``10.0.101.0/29`` |
| subnet. |
| |
| Furthermore, the **port facing Quagga1** (port 48) on ``of:205`` carries both |
| peering sessions to Quagga1. Thus port 48 should now be configured as a |
| **trunk port (vlan-tagged) with both VLANs and both subnets**. |
| |
| Finally, the **Quagga interface** on the VM now needs **sub-interface |
| configuration for each VLAN ID**. |
| |
| Similar configuration concepts apply to IPv6 as well. Here is a look at the |
| switch port config in ONOS for ``of:205``: |
| |
| .. code-block:: json |
| |
| { |
| "ports": { |
| "of:0000000000000205/1" : { |
| "interfaces" : [{ |
| "ips" : [ "10.0.100.3/29", "2000::6403/125" ], |
| "vlan-untagged": 100, |
| "name" : "internet-router1" |
| }] |
| }, |
| |
| "of:0000000000000205/2" : { |
| "interfaces" : [{ |
| "ips" : [ "10.0.101.3/29", "2000::7403/125" ], |
| "vlan-untagged": 101, |
| "name" : "internet-router2" |
| }] |
| }, |
| "of:0000000000000205/48" : { |
| "interfaces" : [{ |
| "ips" : [ "10.0.100.3/29", "2000::6403/125", "10.0.101.3/29", "2000::7403/125" ], |
| "vlan-tagged": [100, 101], |
| "name" : "quagga1" |
| }] |
| } |
| } |
| } |
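| |
| On the Quagga VM, each VLAN carried by the trunk port needs a matching |
| sub-interface. The Python sketch below generates the ``ip link`` commands for |
| that from the ``vlan-tagged`` list of the trunk port. The helper name and the |
| parent interface ``eth1`` are hypothetical, and IP addressing of the |
| sub-interfaces is left out because the Quagga-side addresses are not given |
| here. |
| |
```python
def vlan_subinterface_commands(parent, interface):
    """Return iproute2 commands creating one sub-interface per tagged VLAN.

    `interface` is one entry of a port's "interfaces" list that uses
    "vlan-tagged"; `parent` is the VM's physical interface name.
    """
    commands = []
    for vlan in interface["vlan-tagged"]:
        sub = f"{parent}.{vlan}"
        commands.append(
            f"ip link add link {parent} name {sub} type vlan id {vlan}")
        commands.append(f"ip link set dev {sub} up")
    return commands


# Trunk port entry for of:205/48 from the config above (IPv4 only, for brevity).
quagga_port = {
    "name": "quagga1",
    "ips": ["10.0.100.3/29", "10.0.101.3/29"],
    "vlan-tagged": [100, 101],
}
commands = vlan_subinterface_commands("eth1", quagga_port)
for command in commands:
    print(command)
```
| |
| The commands are only printed, not executed. Running them requires root |
| privileges on the VM, and a persistent setup would normally live in the |
| distribution's network configuration instead. |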