Renato Westphal [Fri, 3 Mar 2023 16:09:20 +0000 (13:09 -0300)]
ospfd, ospf6d: introduce the "graceful-restart hello-delay" command
This command makes unplanned GR more reliable by manipulating the
sending of Grace-LSAs and Hello packets for a certain amount of time,
increasing the chance that the neighboring routers are aware of
the ongoing graceful restart before resuming normal OSPF operation.
Renato Westphal [Fri, 3 Mar 2023 16:09:20 +0000 (13:09 -0300)]
ospf6d: add support for unplanned graceful restart
In practical terms, unplanned GR refers to the act of recovering
from a software crash without affecting the forwarding plane.
Unplanned GR and Planned GR work virtually the same, except for the
following difference: on planned GR, the router sends the Grace-LSAs
*before* restarting, whereas in unplanned GR the router sends the
Grace-LSAs immediately *after* restarting.
For unplanned GR to work, ospf6d was modified to send a
ZEBRA_CLIENT_GR_CAPABILITIES message to zebra as soon as GR is
enabled. This causes zebra to freeze the OSPF routes in the RIB as
soon as the ospf6d daemon dies, for as long as the configured grace
period (the defaults is 120 seconds). Similarly, ospf6d now stores in
non-volatile memory that GR is enabled as soon as GR is configured.
Those two things are no longer done during the GR preparation phase,
which only happens for planned GRs.
Unplanned GR will only take effect when the daemon is killed
abruptly (e.g. SIGSEGV, SIGKILL), otherwise all OSPF routes will be
uninstalled while ospf6d is exiting. Once ospf6d starts, it will
check whether GR is enabled and enter in the GR mode if necessary,
sending Grace-LSAs out all operational interfaces.
One disadvantage of unplanned GR is that the neighboring routers
might time out their corresponding adjacencies if ospf6d takes too
long to come back up. This is especially the case when short dead
intervals are used (or BFD). For this and other reasons, planned
GR should be preferred whenever possible.
Renato Westphal [Fri, 3 Mar 2023 16:09:20 +0000 (13:09 -0300)]
ospfd: add support for unplanned graceful restart
In practical terms, unplanned GR refers to the act of recovering
from a software crash without affecting the forwarding plane.
Unplanned GR and Planned GR work virtually the same, except for the
following difference: on planned GR, the router sends the Grace-LSAs
*before* restarting, whereas in unplanned GR the router sends the
Grace-LSAs immediately *after* restarting.
For unplanned GR to work, ospf6d was modified to send a
ZEBRA_CLIENT_GR_CAPABILITIES message to zebra as soon as GR is
enabled. This causes zebra to freeze the OSPF routes in the RIB as
soon as the ospfd daemon dies, for as long as the configured grace
period (the defaults is 120 seconds). Similarly, ospfd now stores in
non-volatile memory that GR is enabled as soon as GR is configured.
Those two things are no longer done during the GR preparation phase,
which only happens for planned GRs.
Unplanned GR will only take effect when the daemon is killed
abruptly (e.g. SIGSEGV, SIGKILL), otherwise all OSPF routes will
be uninstalled while ospfd is exiting. Once ospfd starts, it will
check whether GR is enabled and enter in the GR mode if necessary,
sending Grace-LSAs out all operational interfaces.
One disadvantage of unplanned GR is that the neighboring routers
might time out their corresponding adjacencies if ospfd takes too
long to come back up. This is especially the case when short dead
intervals are used (or BFD). For this and other reasons, planned
GR should be preferred whenever possible.
Donald Sharp [Mon, 8 May 2023 12:06:30 +0000 (08:06 -0400)]
tests: ospf_metric_propagation should not look for a specific vrfId
There is no guarantee that the vrfId is going to be the same across
tests, as that the vrfId is chosen based upon the ifindex of the
vrf device. As such we should not be looking for the vrfId, but
the correct vrf name.
Donald Sharp [Mon, 8 May 2023 11:47:49 +0000 (07:47 -0400)]
tests: ospf_metric_propagation is looking for a specific ifindex
The test ospf_metric_propagation is looking for a specific ifindex
this ifindex is not guaranteed to be any particular value by the underlying
OS. So let's remove this test for it. As a side note I am seeing
tests fail in upstream CI because of this.
Jack.zhang [Fri, 5 May 2023 06:58:32 +0000 (14:58 +0800)]
bgpd: fix the issue of connected tag error when BGP subscribes to NHT from Zebra
Imagine the following scenario:
1.Create a multihop ebgp peer and config the ttl as 254 for both side.
2.Call bgp_start and start an active connection.
Bgp will send a nht register with non-connected flag.
3.The function bgp_accept be called by remote connection.
Bgp will create a accept peer as a passive connection with default ttl(1). And then will send a nht register again with connected flag. This register result will cover the first one.
4.The active connection come to establish first. In funciton "peer_xfer_conn", check for "PEER_FLAG_CONFIG_NODE" flag of "from_peer->doppelganger" will not be pass, so we can not repair the nht register error forever.
Then the bgp nexthop will be like this:
2000::60 invalid, #paths 0, peer 2000::60
Must be Connected
Last update: Thu May 4 09:35:14 2023
The route from this peer can not be treat with a vaild nexthop forever.
This change will fix this error.
Intermittently zebra and kernel are out of sync
when interface flaps and the add's/dels are in
same processing queue and zebra assumes no change in nexthop.
Hence we need to bring in a reinstall to kernel
of the nexthops and routes to sync their states.
Upon interface flap kernel would have deleted NHGs
associated to a interface (the one flapped),
zebra retains NHGs for 3 mins even though upper
layer protocol removes the nexthops (associated NHG).
As part of interface address add ,
re-add singleton NHGs associated to interface.
1. No any configuration in FRR, and `ip link add vrf1 type vrf ...`.
Currently, everything is ok.
2. `ip link del vrf1`.
`zebra` will wrongly/redundantly notify clients to add "vrf1" as a normal
interface after correct deletion of "vrf1".
```
ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-listen (NS 0) type RTM_DELLINK(17), len=588, seq=0, pid=0
ZEBRA: [TDJW2-B9KJW] RTM_DELLINK for vrf1(93) <- Wrongly as normal interface, not vrf
ZEBRA: [WEEJX-M4HA0] interface vrf1 vrf vrf1(93) index 93 is now inactive.
ZEBRA: [NXAHW-290AC] MESSAGE: ZEBRA_INTERFACE_DELETE vrf1 vrf vrf1(93)
ZEBRA: [H97XA-ABB3A] MESSAGE: ZEBRA_INTERFACE_VRF_UPDATE/DEL vrf1 VRF Id 93 -> 0
ZEBRA: [HP8PZ-7D6D2] MESSAGE: ZEBRA_INTERFACE_VRF_UPDATE/ADD vrf1 VRF Id 93 -> 0 <-
ZEBRA: [Y6R2N-EF2N4] interface vrf1 is being deleted from the system
ZEBRA: [KNFMR-AFZ53] RTM_DELLINK for VRF vrf1(93)
ZEBRA: [P0CZ5-RF5FH] VRF vrf1 id 93 is now inactive
ZEBRA: [XC3P3-1DG4D] MESSAGE: ZEBRA_VRF_DELETE vrf1
ZEBRA: [ZMS2F-6K837] VRF vrf1 id 4294967295 deleted
OSPF: [JKWE3-97M3J] Zebra: interface add vrf1 vrf default[0] index 0 flags 480 metric 0 mtu 65575 speed 0 <- Wrongly add interface
```
`if_handle_vrf_change()` moved the interface from specific vrf to default
vrf. But it doesn't skip interface of vrf type. So, the wrong/redundant
add operation is done.
Note, the wrong add operation is regarded as an normal interface because
the `ifp->status` is cleared too early, so it is without VRF flag
( `ZEBRA_INTERFACE_VRF_LOOPBACK` ). Now, ospfd will initialize `ifp->type`
to `OSPF_IFTYPE_BROADCAST`.
3. `ip link add vrf1 type vrf ...`, add "vrf1" again. FRR will be with
wrong display:
```
interface vrf1
ip ospf network broadcast
exit
```
Here, zebra will send `ZEBRA_INTERFACE_ADD` again for "vrf1" with
correct `ifp->status`, so it will be updated into vrf type. But
it can't update `ifp->type` from `OSPF_IFTYPE_BROADCAST` to
`OSPF_IFTYPE_LOOPBACK` because it had been already configured in above
step 2.
Two changes to fix it:
1. Skip the procedure of switching VRF for interfaces of vrf type.
It means, don't send `ZEBRA_INTERFACE_ADD` to clients when deleting vrf.
2. Put the deletion of this flag at the last.
It means, clients should get correct `ifp->status`.
Christian Hopps [Fri, 28 Apr 2023 15:11:41 +0000 (11:11 -0400)]
tests: change topotest log timestamp precision to 6.
- Often millisecond precision is not good enough to differentiate things that
occur directly one after another, and things that have some pause in between,
increase to microsecond precision (reporting)
Verify activation and desactivation of per-vrf and per-af
sid export. Modify the configuration of r2 and verify that
changes are reflected in r1 and on connectivity between ce1 and c2.
The `bgp_srv6l3vpn_to_bgp_vrf3` topotest tests the SRv6 L3VPN
functionality. It applies the appropriate configuration in `bgpd` and
`zebra`, and then checks that the RIB is updated correctly.
The topotest expects to find the AS-Path in the RIB, which is only
present if the `bgp send-extra-data zebra` option is enabled in the
`bgpd` configuration.
The `bgp send-extra-data zebra` option has been accidentally commented
out in commit https://github.com/FRRouting/frr/commit/2007e2dbd0d5c42d9fe6cbe92b34be10654834ef.
This commit fixes the `bgp_srv6l3vpn_to_bgp_vrf3` topotest by re-adding
the missing `bgp send-extra-data zebra` option.
Louis Scalbert [Fri, 28 Apr 2023 09:20:26 +0000 (11:20 +0200)]
isisd: fix disabled flex-algo on race condition
A particular flex-algo algorithm may remain in disabled state after
configuring it if its flex-algo definition is being spread in the area.
It happens sometimes that, in isis_sr_flex_algo_topo1 topotest, r3
flex-algo 203 is disabled on test8. It depends on the following
sequence on r3:
1. a LSP containing the flex-algo 203 definition is received from
either r1 or r2 (or both).
2. the local LSP is rebuilt by lsp_build() because of the flex-algo 203
configuration
3. isis_run_spf() recomputes the algo 203 SPF tree
A 1. 2. 3. sequence results in a working test whereas 2. 1. 3. is not
working. The second case issue is because of an inconsistent flex-algo
definition state between the following:
- in lsp_build(), isis_flex_algo_elected_supported_local_fad() returns
false because no flex-algo definition is known.
- in isis_run_spf(), isis_flex_algo_elected_supported() returns true
because a flex-algo definition is found.
Set a flex-algo state lsp_build() depending on flex-algo definition
existence that is used later in isis_run_spf().
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Philippe Guibert [Wed, 19 Apr 2023 14:40:50 +0000 (16:40 +0200)]
bgpd: configure explicit-null for local paths per address family
Until now, the bgp local paths were using the default null label
defined. It was not possible to select the null label for the ipv4
or the ipv6 address families.
This commit addresses this issues by adding two extra-parameters
to the BGP labeled-unicast command.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Louis Scalbert [Thu, 27 Apr 2023 12:50:47 +0000 (14:50 +0200)]
isisd: fix a memory leak in isis_spftree_clear()
isis_spftree_clear() calls:
- _isis_spftree_del() to partially delete a spftree instance
without freeing spftree->route_table and
spftree->route_table_backup.
- then _isis_spftree_init() that allocates new spftree->route_table
and spftree->route_table_backup.
As a consequence, the previous table instances are not referenced and
not freed.
`setsockopt()` should be only called once with `MRT_TABLE`
in "enable" case, otherwise it will fail. In current code,
`mroute_socket` of "pim instance" with VRF can't be correctly
closed.
Skip it in the "disable" case to let `mroute_socket` safely
closed.
tests: Do not try establishing a connection from r1 to r2
If r1 becomes the "server" (= local port 179), then it initiates the connection
after sending BGP Notification (BFD Down) and r2 resets the last error code.
Telling r1 do not connect to r2, fixes the issue.
Tested with `pytest -s -n 48` at least 20 times - no failures.
Donald Sharp [Tue, 25 Apr 2023 19:35:19 +0000 (15:35 -0400)]
bgpd: Fix `received-routes detail`
The command `show bgp ipv4 uni neigh A.B.C.D received-routes detail`
was not displaying anything.
Fix the code to display the received routes from the ones that
have been filtered. In this case we need to fudge up a bgp_dest
and a bgp_path_info to make it work.
Old output:
janelle.pinkbelly.org# show bgp ipv4 uni neighbors 192.168.119.224 received-routes detail
BGP table version is 1711405, local router ID is 192.168.44.1, vrf id 0
Default local pref 100, local AS 64539
Total number of prefixes 3 (3 filtered)
janelle.pinkbelly.org#
New output:
janelle.pinkbelly.org# show bgp ipv4 uni neighbors 192.168.119.224 received-routes detail
BGP table version is 0, local router ID is 192.168.44.1, vrf id 0
Default local pref 100, local AS 64539
BGP routing table entry for 1.2.3.0/24, version 0
Paths: (1 available, no best path)
Not advertised to any peer
3291, (aggregated by 3291 192.168.122.1)
192.168.119.224 (inaccessible, import-check enabled) from 192.168.119.224 (192.168.122.1)
Origin IGP, metric 0, invalid, external, atomic-aggregate, rpki validation-state: not found
Community: 55:66
Last update: Fri Apr 14 08:46:48 2023
BGP routing table entry for 1.2.3.4/32, version 0
Paths: (1 available, no best path)
Not advertised to any peer
3291
192.168.119.224 (inaccessible, import-check enabled) from 192.168.119.224 (192.168.122.1)
Origin IGP, metric 0, invalid, external, rpki validation-state: not found
Community: 33:44
Last update: Fri Apr 14 08:46:48 2023
BGP routing table entry for 1.2.3.5/32, version 0
Paths: (1 available, no best path)
Not advertised to any peer
3291
192.168.119.224 (inaccessible, import-check enabled) from 192.168.119.224 (192.168.122.1)
Origin IGP, metric 0, invalid, external, rpki validation-state: not found
Community: 33:44
Last update: Fri Apr 14 08:46:48 2023
Total number of prefixes 3 (3 filtered)
janelle.pinkbelly.org# show bgp ipv4 uni
No BGP prefixes displayed, 0 exist
janelle.pinkbelly.org#
Donald Sharp [Mon, 24 Apr 2023 17:32:51 +0000 (13:32 -0400)]
tests: Increase the dead interval to be longer for neighbor testing
the ospf_basic_functionality/test_ospf_lan.py script is setting
up a lan env that will have 4 ospf routers on it and shutting/no
shutting interfaces with various priorities to see that ospf
is properly choosing roles. I am consistently seeing the
ospf_basic_functionality/test_ospf_lan.py script failing
where it is saying a neighbor is not in the correct state.
Upon examination of the logs we are seeing this:
2023/04/24 09:16:42 OSPF: [M7Q4P-46WDR] vty[7]@(config)# interface r0-s1-eth0 <----- This is where we no shut the interface
2023/04/24 09:16:47 OSPF: [M7Q4P-46WDR] vty[7]@> enable
2023/04/24 09:16:47 OSPF: [M7Q4P-46WDR] vty[7]@# show ip ospf neighbor all json
2023/04/24 09:16:53 OSPF: [QH9AB-Y4XMZ][EC 100663314] STARVATION: task ospf_ism_event (556af08a5b4c) ran for 6038ms (cpu time 0ms)
2023/04/24 09:16:53 OSPF: [HKQ2F-8D0MY][EC 100663315] Thread Starvation: {(thread *)0x556af19da020 arg=0x556af19c0dd0 timer r=-5.086 ospf_ase_calculate_timer() &ospf->t_ase_calc from ospfd/ospf_ase.c:635} was scheduled to pop greater than 4s ago
2023/04/24 09:16:53 OSPF: [M7Q4P-46WDR] vty[18]@> enable
2023/04/24 09:16:53 OSPF: [M7Q4P-46WDR] vty[18]@# show ip ospf neighbor all
2023/04/24 09:16:55 OSPF: [M7Q4P-46WDR] vty[7]@> enable
2023/04/24 09:16:55 OSPF: [M7Q4P-46WDR] vty[7]@# show ip ospf neighbor all json
2023/04/24 09:16:55 OSPF: [M7Q4P-46WDR] vty[7]@> enable
This test is setting the dead interval to 4 seconds, seeing a 6 second delay where the os has gone to town
(probably because of the high load on the system ) and not choosing the correct neighbor as the DR.
OSPF when coming up and after seeing the first neighbor, goes into a waiting period before
the DR is elected. If the neighbor does send it's hello packets but they are not processed
before the wait timer pops because of the starvation event, then the wrong neighbor
will be elected DR. Let's give this test a bit more time to decide who the
DR is in case everything goes a bit south.