Border reboot

From Delft Solutions
Revision as of 01:24, 5 August 2024 by Dortund (talk | contribs)

Note: Throughout this guide, replace <ipv4> and <ipv6> with the correct IP addresses. If you don't know them, press the Tab key (twice) after typing 'neighbors' to be shown the options.

Pre-flight checks

These checks are to be done on the OTHER border (the border that will stay online), to ensure that the cluster won't lose network connectivity while the border being rebooted is down. Invoke the commands in `vtysh`.

  • Confirm our IPv4 block is announced over BGP with `show ip bgp neighbors <ipv4> advertised-routes`
  • Confirm our IPv6 block is announced over BGP with `show bgp neighbors <ipv6> advertised-routes`
  • Confirm that the border receives the ROUTED IPv4 routes from the router with `show ip route`
  • Confirm that the border receives the ROUTED & LAN IPv6 routes from the router with `show ipv6 route`
  • Set a maintenance period for the host on Zabbix.
  • Post in Zulip, in the relevant topic (the incident's topic or the 'SRE - General' stream), that the border is going to be rebooted.
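
The four `vtysh` checks above can also be run non-interactively from the border's shell. The following is only a sketch: the function name `preflight_checks` is made up for illustration, and it assumes `vtysh` is on the PATH and that you pass the real neighbor addresses as arguments. It only prints the command output; you still have to read it and confirm that the expected routes are there.

```shell
#!/bin/sh
# Sketch: run the vtysh pre-flight checks on the border that stays online.
# preflight_checks is a hypothetical helper name, not an existing tool.
# $1 = IPv4 BGP neighbor address, $2 = IPv6 BGP neighbor address.
preflight_checks() {
    ipv4="$1"
    ipv6="$2"
    # Each call passes one command to vtysh with -c and prints its output.
    vtysh -c "show ip bgp neighbors $ipv4 advertised-routes"
    vtysh -c "show bgp neighbors $ipv6 advertised-routes"
    vtysh -c "show ip route"
    vtysh -c "show ipv6 route"
}
```

Example call, using documentation-range addresses as stand-ins for the real neighbors: `preflight_checks 192.0.2.1 2001:db8::1`. The Zabbix and Zulip steps remain manual.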

Disabling routing through a border

First, perform the pre-flight checks on the OTHER border.

Then, in `vtysh` on the border that is going to be rebooted, update the running configuration by invoking the following:

  • config
  • router bgp
  • neighbor <ipv4> shutdown
  • neighbor <ipv6> shutdown
  • exit
  • router ospf
  • no default-information originate
  • exit
  • router ospf6
  • no default-information originate
  • exit
  • exit
  • exit
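
The same sequence can be passed to `vtysh` in one go with repeated `-c` options, which are executed in order. This is a sketch rather than the established procedure: the function name `disable_border_routing` is hypothetical, `configure terminal` is used as the long form of `config`, and `router bgp` is given without an AS number exactly as in the list above.

```shell
#!/bin/sh
# Sketch: disable routing through this border in a single vtysh invocation.
# disable_border_routing is a hypothetical helper name.
# $1 = IPv4 BGP neighbor address, $2 = IPv6 BGP neighbor address.
disable_border_routing() {
    ipv4="$1"
    ipv6="$2"
    # Only the running configuration is changed; nothing is written to disk.
    vtysh \
        -c "configure terminal" \
        -c "router bgp" \
        -c "neighbor $ipv4 shutdown" \
        -c "neighbor $ipv6 shutdown" \
        -c "exit" \
        -c "router ospf" \
        -c "no default-information originate" \
        -c "exit" \
        -c "router ospf6" \
        -c "no default-information originate" \
        -c "exit"
}
```

Run it on the border that is about to be rebooted, for example `disable_border_routing 192.0.2.1 2001:db8::1` with the real neighbor addresses in place of the example ones.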

Reboot the border

  • After performing the pre-flight checks and disabling the routing, you can choose to wait until traffic has decreased (e.g. using `bmon` to check bandwidth used on interfaces)
  • Execute the `reboot` command
  • When the border is back online, execute the relevant Zabbix items (system uptime, operating system, reboot required) to ensure these will not activate a trigger after disabling maintenance mode
  • If you do not expect any Zabbix alert related to the reboot to be fired, delete the maintenance period

Troubleshooting

The shutdown of the neighbors can be undone by invoking `no neighbor <ipv4> shutdown` and `no neighbor <ipv6> shutdown` in the `router bgp` part of the configuration.

The `no default-information originate` can be undone by invoking `default-information originate` in the correct OSPF part of the configuration (`router ospf` or `router ospf6`, depending on which one you wish to re-enable).
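
Put together, the undo steps described above might look like the sketch below. The function name `enable_border_routing` is hypothetical, and the plain `default-information originate` line is an assumption: if the saved configuration uses extra options on that line (for example `always` or a metric), use that exact line instead.

```shell
#!/bin/sh
# Sketch: re-enable routing through this border in a single vtysh invocation.
# enable_border_routing is a hypothetical helper name.
# $1 = IPv4 BGP neighbor address, $2 = IPv6 BGP neighbor address.
enable_border_routing() {
    ipv4="$1"
    ipv6="$2"
    vtysh \
        -c "configure terminal" \
        -c "router bgp" \
        -c "no neighbor $ipv4 shutdown" \
        -c "no neighbor $ipv6 shutdown" \
        -c "exit" \
        -c "router ospf" \
        -c "default-information originate" \
        -c "exit" \
        -c "router ospf6" \
        -c "default-information originate" \
        -c "exit"
}
```
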

A reload/restart of the service will also restore the normal configuration, because the changes above are made only to the running configuration.