Donald Sharp [Thu, 9 Jan 2025 20:26:46 +0000 (15:26 -0500)]
zebra: Uninstall NHG in some situations
If you have this series of events:
a) Decision to install a NHG is made in zebra, enqueue to DPLANE
b) Changes to the NHG are made and we remove it in the master pthread.
Since this NHG is not marked as installed, it is not removed from the
dataplane, but the NHG data structure is deleted
c) DPLANE installs the NHG
In the end the NHG stays installed but ZEBRA has lost track of it.
Modify the removal code to check to see if the NHG is queued.
There are 2 cases:
a) NHG is kept around for a bit before being deleted. In this case
just see that the NHG is Queued and keep it around too.
b) NHG is not kept around and we are just removing it. In this case
check to see if it is queued and send another deletion event.
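The race and removal case b) can be modelled in a few lines. This is a minimal Python sketch, not zebra code: the `Nhg` class, its `queued` and `installed` flags, and the `dplane_queue` list are hypothetical stand-ins for zebra's data structures.

```python
from dataclasses import dataclass


@dataclass
class Nhg:
    """Hypothetical stand-in for a zebra nexthop group entry."""
    nhg_id: int
    queued: bool = False      # install handed to the dataplane, result pending
    installed: bool = False   # dataplane has confirmed the install


def enqueue_install(nhg, dplane_queue):
    nhg.queued = True
    dplane_queue.append(("install", nhg.nhg_id))


def remove_nhg(nhg, dplane_queue):
    """Case b): the NHG is not kept around and is removed right away."""
    # Checking only `installed` would miss an install that is still in
    # flight, so the queued state is checked as well.
    if nhg.installed or nhg.queued:
        dplane_queue.append(("delete", nhg.nhg_id))


def run_dplane(dplane_queue):
    """Replay queued operations in order; return the IDs left installed."""
    kernel = set()
    for op, nhg_id in dplane_queue:
        if op == "install":
            kernel.add(nhg_id)
        else:
            kernel.discard(nhg_id)
    return kernel


queue = []
nhg = Nhg(nhg_id=42)
enqueue_install(nhg, queue)   # step a)
remove_nhg(nhg, queue)        # step b), before the dataplane has run
leftover = run_dplane(queue)  # step c)
```

Because the delete is queued behind the pending install, nothing is left installed once the dataplane has processed both operations.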
Donald Sharp [Wed, 8 Jan 2025 14:42:49 +0000 (09:42 -0500)]
tests: bgp_srv6l3vpn_to_bgp_vrf3 needs more time
The test starts with checking for rib insertion
of routes that may take some time after system
startup to come up. Under heavy load this may
cause this test to just fail. Give it more time.
Donald Sharp [Thu, 9 Jan 2025 17:34:50 +0000 (12:34 -0500)]
zebra: Fix leaked nhe
During route processing in zebra, Zebra will create a nexthop
group that matches the nexthops passed down from the routing
protocol. Then Zebra will look to see if it can re-use an
nhe from a previous version of the route entry (say an interface
goes down). If Zebra decided to re-use an nhe, it was just dropping
the entry it had created. This led to nexthop groups that had
a refcount of 0, and in some cases these nexthop groups were installed
into the kernel.
Add a bit of code to see if the returned entry is not being used
and it has no reference count and if so, properly dispose of it.
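The disposal rule can be illustrated with a small Python sketch. It is not the zebra implementation: the `Nhe` class, the `freed` flag and `select_nhe()` are hypothetical names used only to show the refcount check.

```python
class Nhe:
    """Hypothetical nexthop-group hash entry with a reference count."""

    def __init__(self, name, refcnt=0):
        self.name = name
        self.refcnt = refcnt
        self.freed = False


def select_nhe(new_nhe, old_nhe, can_reuse):
    """Pick the entry the route will use; dispose of an unused new entry."""
    chosen = old_nhe if can_reuse else new_nhe
    # The freshly created entry is only disposed of when it is not the one
    # being used and nothing else holds a reference to it.
    if chosen is not new_nhe and new_nhe.refcnt == 0:
        new_nhe.freed = True
    chosen.refcnt += 1
    return chosen
```

Without the check, a re-used `old_nhe` would leave `new_nhe` behind with a refcount of 0 and nobody responsible for freeing it.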
Donald Sharp [Wed, 8 Jan 2025 14:41:21 +0000 (09:41 -0500)]
tests: bgp_srv6_sid_reachability should give more time
The test starts right in on check_pings with a 10 second
time out. Any type of delay on startup is going to cause
problems. Give the first check_ping significant time
for the test to be fully brought up.
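As an illustration of why the first check needs a larger budget, here is a generic polling helper in Python. It is a sketch only: `wait_until()` and its parameters are hypothetical and are not the helper used by the topotests.

```python
import time


def wait_until(check, timeout, interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll `check` until it returns True or `timeout` seconds have passed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A test would then call something like `wait_until(first_ping_ok, timeout=60)` for the very first check and keep the shorter timeout for later checks, once the topology is known to be up. The `sleep` and `clock` parameters exist only so the helper can be exercised without real waiting.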
Enke Chen [Thu, 9 Jan 2025 01:34:29 +0000 (17:34 -0800)]
bgpd: apply route-map for aggregate before attribute comparison
Currently when re-evaluating an aggregate route, the full attribute of
the aggregate route is not compared with the existing one in the BGP
table. That can result in unnecessary churn (un-install and then
install) of the aggregate route when a more specific route is added or
deleted, or when the route-map for the aggregate changes. The churn
would impact route installation and route advertisement.
The fix is to apply the route-map for the aggregate first and then
compare the attribute.
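The ordering matters because the route-map may rewrite the attribute. A minimal Python sketch, assuming attributes are plain dictionaries and a route-map is just a dictionary of values to set (both are simplifications, and the function names are hypothetical):

```python
def apply_route_map(attrs, route_map):
    """Return a new attribute dict with the route-map's set actions applied."""
    result = dict(attrs)
    result.update(route_map or {})
    return result


def reevaluate_aggregate(installed, computed, route_map):
    """Return (attrs, changed); `changed` is False when no reinstall is needed."""
    # Apply the route-map first, so the comparison sees the attribute
    # exactly as it would be installed.
    final = apply_route_map(computed, route_map)
    if installed == final:
        return installed, False
    return final, True
```

Comparing `computed` against `installed` before applying the route-map would report a difference whenever the route-map sets anything, even though the final attribute is identical.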
Martin Buck [Wed, 8 Jan 2025 09:38:56 +0000 (10:38 +0100)]
lib: Fix privs syscaps (pset_t) allocation
Don't over-allocate syscaps in zcaps2sys(): This is just a single struct
(pset_t) with a count and a pointer to an array of capabilities, not an
array. So only allocate a single pset_t, not num copies of it.
The allocation size of syscaps->caps then needs to be based on the number of
Linux capabilities (count), but that is already handled properly a few lines
below.
Note that this fix is mostly cosmetic and for correctness. There was no
potential for memory corruption, because num is guaranteed to be nonzero. So
at least the one required pset_t was always allocated (but potentially much
more).
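The size of the over-allocation can be estimated with a short Python sketch using `ctypes`. The `PsetSketch` layout (an `int` count plus a pointer) follows the description above; the field names and the element type are assumptions, not copied from the FRR source.

```python
import ctypes


class PsetSketch(ctypes.Structure):
    """Hypothetical mirror of a pset_t-like struct: a count and a pointer."""
    _fields_ = [
        ("num", ctypes.c_int),
        ("caps", ctypes.POINTER(ctypes.c_int)),
    ]


def header_bytes(copies):
    """Bytes needed for `copies` instances of the struct itself."""
    return copies * ctypes.sizeof(PsetSketch)


num = 3                             # example number of requested capabilities
over_allocated = header_bytes(num)  # old behaviour: num copies of the struct
needed = header_bytes(1)            # fixed behaviour: a single struct
wasted = over_allocated - needed
```

Since `num` is never zero, the old allocation was always at least as large as required, which is why the bug wasted memory but could not corrupt it.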
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
Martin Buck [Fri, 20 Dec 2024 17:55:26 +0000 (18:55 +0100)]
tests: ospf6_ecmp_inter_area, no shutdown r7/r8 eth3
Drop eth3 shutdown from ospf6d.conf - it doesn't do anything there. And it
actually shouldn't do anything: eth3 on r7/r8 are used as loopback-like
interfaces to inject the address on eth2 into OSPFv3. So they need to be up
for eth2 to work as expected.
Based on original PR#16811 commit:
eth3 shutdown is not applied because it is in ospf6d.conf.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
Christian Hopps [Sun, 24 Nov 2024 07:56:22 +0000 (02:56 -0500)]
lib: northbound: add new get() callback to add lyd_node directly
This allows eliminating the superfluous yang_data object (which
is created, used to call lyd_new_term(), and then deleted). Instead
just call lyd_new_term() in the callback directly.
bgpd: Use unique value for BGP_NEXTHOP_EVPN_INCOMPLETE flag
This was reused with BGP_NEXTHOP_ULTIMATE by error.
Fixes: 93fd9cbb5022e0c40827cd6d6ef339624a8b5daa ("bgpd: Validate imported routes next-hop that is in a default VRF")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Jonathan Voss [Fri, 3 Jan 2025 03:19:30 +0000 (03:19 +0000)]
tools: Add missing rpki keyword to vrf in frr-reload
When reloading the following configuration:
```
vrf red
rpki
rpki cache tcp 172.65.0.2 8282 preference 1
exit
exit-vrf
```
frr-reload.py does not properly enter the `rpki` context
within a `vrf`. Because of this, it fails to apply RPKI
configurations.
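The effect of a missing context keyword can be shown with a simplified context-tracking parser. This Python sketch is not the frr-reload.py implementation: `parse_contexts()` and its rules for opening and closing contexts are hypothetical and cover only the configuration above.

```python
def parse_contexts(lines):
    """Return (context_tuple, command) pairs for each non-context line."""

    def opens_context(line, stack):
        if line.startswith("vrf "):
            return True
        # `rpki` on its own opens a sub-context, but only inside a vrf.
        return line == "rpki" and bool(stack) and stack[-1].startswith("vrf ")

    stack = []
    result = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line in ("exit", "exit-vrf"):
            if stack:
                stack.pop()   # leave the innermost open context
            continue
        if opens_context(line, stack):
            stack.append(line)
            continue
        result.append((tuple(stack), line))
    return result
```

With `rpki` recognised as a context, the `rpki cache ...` line is attributed to `("vrf red", "rpki")`. Without that rule, the bare `rpki` line would be treated as an ordinary command under `vrf red`, and the cache line would land in the wrong context.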
Chirag Shah [Wed, 1 Nov 2023 05:11:04 +0000 (22:11 -0700)]
zebra: check DAD freeze action before notifying bgp
If the Duplicate Address Detection action is freeze
(permanent or for a definite time, i.e. not warn-only mode),
then there is no need to notify BGP about the deletion of a
locally duplicate-detected MAC;
instead ask BGP to sync the previous remote MAC entry.
In the freeze case the local MAC event is not known to BGP;
instead BGP is pointing to the remote VTEP for the MAC.
bgpd: bmp, define hook for route distinguisher updates
At startup, if bmp loc-rib is enabled, the peer_id of the
loc-rib per peer header message has the route distinguisher set to 0:0.
In fact, the route distinguisher is updated after the peer up
message is sent, and the information is not refreshed.
Create a hook API to handle route distinguisher config events: pre and
post configuration. Use that hook in BMP module to send peer down, and
peer up events when necessary.
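The pre/post hook pattern can be sketched as follows. This is a Python illustration of the idea, not the bgpd hook API: `Hook`, `rd_config_hook`, `bmp_on_rd_config()` and `set_route_distinguisher()` are hypothetical names.

```python
class Hook:
    """Tiny stand-in for a hook: subscribers run in registration order."""

    def __init__(self):
        self._subscribers = []

    def register(self, fn):
        self._subscribers.append(fn)

    def call(self, *args):
        for fn in self._subscribers:
            fn(*args)


rd_config_hook = Hook()   # hypothetical hook name
bmp_messages = []         # what the BMP module would send to the collector


def bmp_on_rd_config(vrf, rd, phase):
    """BMP side: close the stale peer before the change, reopen it after."""
    if phase == "pre":
        bmp_messages.append(("peer-down", vrf, rd))
    elif phase == "post":
        bmp_messages.append(("peer-up", vrf, rd))


rd_config_hook.register(bmp_on_rd_config)


def set_route_distinguisher(rd_by_vrf, vrf, new_rd):
    old_rd = rd_by_vrf.get(vrf, "0:0")
    if old_rd == new_rd:
        return  # nothing changed, so no peer flap is necessary
    rd_config_hook.call(vrf, old_rd, "pre")
    rd_by_vrf[vrf] = new_rd
    rd_config_hook.call(vrf, new_rd, "post")
```

Calling the hook on both sides of the change lets the subscriber announce peer down with the old value and peer up with the new one, so the collector never keeps a peer header with a stale route distinguisher.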
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Wed, 30 Oct 2024 08:45:47 +0000 (09:45 +0100)]
bgpd: bmp, define hook for router-id updates
At startup, if bmp loc-rib is enabled, the peer_id of the
loc-rib per peer header message has the router-id set to 0.0.0.0.
In fact, the router-id is updated after the peer up
message is sent, and the information is not refreshed.
Create a hook API to handle router id events: withdraw and
updates. Use that hook in BMP module to send peer down, and
peer up events when necessary.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Wed, 18 Dec 2024 15:31:05 +0000 (16:31 +0100)]
topotests: bgp_bmp, add a test to check for bgp vrf peer loc-rib message
Add a test where, when the vrf interface is flapping, a peer down and a
peer up message are sent. This test, when used with ASAN, detects the
memory leak of the open_tx and open_rx messages of the loc-rib.
Refresh the method of updating the SEQ value when reading the peer
messages: only update to the last matching SEQ value.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Tue, 29 Oct 2024 15:20:18 +0000 (16:20 +0100)]
bgpd: fix do re-send post-policy bgp update when not valid
When a BGP listener configured with BMP receives the first BGP
IPv6 update from a connected BGP IPv6 peer, the BMP collector
receives a withdraw post-policy message.
Actually, the BGP update is not valid, and BMP considers it as a
withdraw message. The BGP update is not valid, because the nexthop
reachability is unknown at the time of reception, and no other
BMP message is sent.
Fix this by re-sending a BMP post update message when nexthop
tracking becomes successful. Generalise the re-sending of
messages when nexthop tracking changes.
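The sequence seen by the collector can be modelled in Python. This is a sketch under simplifying assumptions: `BmpSketch` and its methods are hypothetical, and validity is reduced to a single boolean for nexthop reachability.

```python
class BmpSketch:
    """Hypothetical model of post-policy messages sent to a BMP collector."""

    def __init__(self):
        self.sent = []    # (kind, prefix) tuples, in sending order
        self.valid = {}   # prefix -> last known nexthop reachability

    def on_update(self, prefix, nexthop_reachable):
        """A BGP update arrives; validity depends on nexthop reachability."""
        self.valid[prefix] = nexthop_reachable
        self._send_post_policy(prefix)

    def on_nexthop_change(self, prefix, nexthop_reachable):
        """Nexthop tracking reports a result; re-send only on a change."""
        if self.valid.get(prefix) == nexthop_reachable:
            return
        self.valid[prefix] = nexthop_reachable
        self._send_post_policy(prefix)

    def _send_post_policy(self, prefix):
        kind = "update" if self.valid[prefix] else "withdraw"
        self.sent.append((kind, prefix))
```

For a first IPv6 update whose nexthop is not yet resolved, the collector sees a withdraw followed by an update once tracking succeeds, instead of the withdraw alone.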
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
bgpd: fix warning of compilation when using bgp_trace
The following warning can be seen:
> In file included from ./bgpd/bgp_trace.h:21,
> from bgpd/bgp_io.c:27:
> bgpd/bgp_io.c: In function ‘read_ibuf_work’:
> bgpd/bgp_io.c:202:53: warning: passing argument 1 of ‘lttng_ust_tracepoint_cb_frr_bgp___packet_read’ from incompatible pointer type [-Wincompatible-pointer-types]
> 202 | frrtrace(2, frr_bgp, packet_read, connection->peer, pkt);
> | ~~~~~~~~~~^~~~~~
> | |
> | struct peer *
> bgpd/bgp_io.c:202:9: note: in expansion of macro ‘frrtrace’
> 202 | frrtrace(2, frr_bgp, packet_read, connection->peer, pkt);
> | ^~~~~~~~
> In file included from ./bgpd/bgp_trace.h:21,
> from bgpd/bgp_io.c:27:
> ./bgpd/bgp_trace.h:57:43: note: expected ‘struct peer_connection *’ but argument is of type ‘struct peer *’
> 57 | TP_ARGS(struct peer_connection *, connection, struct stream *, pkt),
> | ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
Use the appropriate connection parameter when calling the trace.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Yaroslav Kholod [Mon, 23 Dec 2024 15:35:12 +0000 (17:35 +0200)]
BGP: Clean address-family config on daemon restart
When stopping and restarting the BGP daemon, part of the configuration
remains. It should be cleared.
In particular, these are address-family parameters such as: distance,
ead-es-frag, disable-ead-evi-rx, disable-ead-evi-tx.