Donald Sharp [Thu, 5 Dec 2024 15:16:03 +0000 (10:16 -0500)]
bgpd: When bgp notices a change to shared_network inform bfd of it
When bgp is started up and reads the config in *before* it has
received interface addresses from zebra, shared_network can
be set to false in this case. Later on once bgp attempts to
reconnect it will refigure out the shared_network again( because
it has received the data from zebra now ). In this case
tell bfd about it.
Donald Sharp [Wed, 4 Dec 2024 21:14:34 +0000 (16:14 -0500)]
lib: Speed up reconnection attempts for zapi
Currently the zapi reconnection is once every 10 seconds
for the first 3 times and then once every 60 seconds from then
on out. We are seeing interesting behavior under loaded systems
where zebra is just slow to come up and daemons are spending a long
time waiting to connect. Let's just make things a bit more aggressive.
Change the code to attempt to reconnect once every second for 30 seconds
and then change to once every 5 seconds from then on out.
This should help with non-integrated configuration on system startup.
Donald Sharp [Wed, 4 Dec 2024 15:47:33 +0000 (10:47 -0500)]
pimd: Prevent crash of pim when auto-rp's socket is not initialized
If the socket associated with the auto-rp fails to initialize then
the memory for the auto-rp is just dropped on the floor. Additionally
any type of attempt at using the feature will just cause pimd to crash,
when the pointer is derefed. Since it is derefed all over the place
without checking.
Clearly if you cannot bind/use the socket let's allow continuation.
Fixes: #17540 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Introduced the idea of setting the socket buffer
send/receive sizes. BSD's in general have the fun
issue of not allowing nearly as large as a size as
linux. Since the above commit was developed on linux
and not run on bsd it was never tested. Modify the
codebase to use the backoff setsockopt that we have
in the code base and use the returned values to allow
us to notice what was set and respond appropriately.
Donald Sharp [Tue, 3 Dec 2024 17:08:12 +0000 (12:08 -0500)]
lib: Fix session re-establishment
Currently if you have this sequence of events:
a) BGP starts
b) BGP reads cli that has bfd configuration
c) BGP attempts to install bfd configuration but fails because
zebra is not connected to yet
d) BGP connects to zebra
e) BGP receives resend bfd code from bfdd
f) BGP was not sending down the unsent data to bfd, never causing
the bfd session to be established.
So effectively bfd was attempting to install but failed
and then when it was asked to replay everything it decided
that the bfd information for a particular peer was actually
installed and does not need to be resent. Modify the code
such that the bfd code now tracks failed installation and
allows the resend of data to bfdd.
Mark Stapp [Wed, 30 Oct 2024 15:02:17 +0000 (11:02 -0400)]
zebra: separate zebra ZAPI server open and accept
Separate zebra's ZAPI server socket handling into two phases:
an early phase that opens the socket, and a later phase that
starts listening for client connections.
Philippe Guibert [Mon, 28 Oct 2024 17:20:13 +0000 (18:20 +0100)]
topotests: fix bmp_collector handling of empty as-path
When configuring the bgp_bmp test with an additional
peer that sends empty AS-PATH, the bmp collector is stopping:
> [2024-10-28 17:41:51] Finished dissecting data from ('192.0.2.1', 33922)
> [2024-10-28 17:41:52] Data received from ('192.0.2.1', 33922): length 195
> [2024-10-28 17:41:52] Got message type: <class 'bmp.BMPRouteMonitoring'>
> [2024-10-28 17:41:52] unpack_from requires a buffer of at least 2 bytes for unpacking 2 bytes at offset 0 (actual buffer size is 0)
> [2024-10-28 17:41:52] TCP session closed with ('192.0.2.1', 33922)
> [2024-10-28 17:41:52] Server shutting down on 192.0.2.10:1789
The parser attempts to read an empty AS-path and considers the length
value as a length in bytes, whereas RFC mentions this value as
definining the number of AS-PAths. Fix this in the parser.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Fri, 25 Oct 2024 20:06:11 +0000 (22:06 +0200)]
topotests: bmp, create shared library for bmp
The bgp_bmp and bgp_bmp_vrf tests use similar routines
to test the bmp, but are duplicates. Gather the bgp_bmp
and the bgp_bmp_vrf tests in a single bgp_bmp folder.
- Create a bgpbmp.py library under the bgp_bmp test folder
- The bgp_bmp and bgp_bmp_vrf test are renamed to bgp_bmp_1
and bgp_bmp_2 test.
- Maintain separate folder for config and output results. Adapt
the bgp_bmp library accordingly.
- The json output for bgp_bmp_2 test has no referenc to hostame.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Mon, 11 Mar 2024 10:51:55 +0000 (11:51 +0100)]
bgpd: fix use real SID in BGP nexthop tracking
When receiving an SRv6 BGP update, the nexthop tracking is used
to find out the reachability of the BGP update.
> # show bgp ipv6 vpn fd00:200::/64
> Paths: (1 available, best #1)
> [..]
> 4:4::4:4 from 4:4::4:4 (4.4.4.4)
> Origin incomplete, metric 0, localpref 100, valid, internal, best (First path received)
> Extended Community: RT:52:100
> Remote label: 16
> Remote SID: 2001:db8:f4::
> Last update: Mon Mar 11 11:50:04 2024
The IPv6 address used is the "Remote SID". Actually, this value is
incomplete. Remote SID stands for the attribute value received in BGP,
while the label value determines a complement of SRv6 SID value. The
transposition technique authorises that in BGP, and in the above case,
the incoming BGP update has used the transposition length.
When there is a transposition in the SID attribute available, use the
real SID address. The nexthop tracking will use that forged address.
> # show bgp nexthop
> Current BGP nexthop cache:
> 4:4::4:4 valid [IGP metric 30], #paths 0, peer 4:4::4:4
> gate fe80::dced:1ff:fed6:878c, if ntfp3
> Last update: Mon Mar 11 11:50:02 2024
> 2001:db8:f4:1:: valid [IGP metric 0], #paths 2
> gate fe80::dced:1ff:fed6:878c, if ntfp3
Fixes: 26c747ed6c0b ("bgpd: extend make_prefix to form srv6-based prefix") Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Fri, 22 Nov 2024 14:57:25 +0000 (15:57 +0100)]
topotests: bgp_evpn_rt5, add test for advertise route-map service
Use the advertise route-map command, and check that it
filters out correctly the undesirable prefixes. Reversely,
check that undoing that route-map recovers all prefixes.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Tue, 26 Nov 2024 13:19:34 +0000 (14:19 +0100)]
bgpd: fix use single whitespace when displaying flowspec entries
There is an extra space in the 'Displayed' line of show bgp command,
that should not be present.
Fix this by being consistent with the output of the other address
families.
Fixes: ("a1baf9e84f71") bgpd: Use single whitespace when displaying show bgp summary Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Mark Stapp [Mon, 25 Nov 2024 20:37:39 +0000 (15:37 -0500)]
zebra: avoid a race during FPM dplane plugin shutdown
During zebra shutdown, the main pthread and the FPM pthread can
deadlock if the FPM pthread is in fpm_reconnect(). Each pthread
tries to use event_cancel_async() to cancel tasks that may be
scheduled for the other pthread - this leads to a deadlock as
neither thread can progress.
This adds an atomic boolean that's managed as each pthread
enters and leaves the cleanup code in question, preventing the
two threads from running into the deadlock.
Donald Sharp [Thu, 7 Sep 2023 12:06:28 +0000 (08:06 -0400)]
tests: fix max med on startup
The test is failing because on r2 we are looking for a metric of 777
on startup, but the start of looking for this happens to be after
the 5 second delay that is setup in the config.
Notice that the 5 second delay for the max med expires at 29 seconds but the show routes
on r2 does not even begin until 34 seconds, long after the max med has expired and the
test has moved on.
Let's relax the max-med timer to 30 seconds and modify the test to wait a bit longer for
both finding it and expiring timer.
Donald Sharp [Thu, 7 Sep 2023 11:57:26 +0000 (07:57 -0400)]
tests: Fix ospfapi client to clear ospf process
Test is failing locally:
2023-09-06 18:39:56,865 DEBUG: r1: vtysh result:
Hello, this is FRRouting (version 9.1-dev).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
r1# conf t
r1(config)# router ospf
r1(config-router)# ospf router-id 1.1.1.1
For this router-id change to take effect, use "clear ip ospf process" command
r1(config-router)#
2023-09-06 18:39:56,865 DEBUG: root: GOT LINE: 'SUCCESS: 1.0.0.0'
2023-09-06 18:39:56,866 DEBUG: root: GOT LINE: '2023-09-06 18:39:55,982 INFO: TESTER: root: Waiting for 1.1.1.1'
2023-09-06 18:39:56,867 DEBUG: root: GOT LINE: '2023-09-06 18:39:55,982 DEBUG: TESTER: root: expected '1.1.1.1' != '1.0.0.0''
2023-09-06 18:39:56,867 DEBUG: root: GOT LINE: 'waiting on notify'
Sure looks like the router-id is not allowed to be changed because
neighbors have already been formed. If we are changing the router-id
then let's clear the process to allow it to correctly change.
Donald Sharp [Tue, 24 Sep 2024 14:46:11 +0000 (10:46 -0400)]
tests: Add some test cases for snmp
Noticed that we were not really attempting to even test
large swaths of our snmp infrastructure. Let's load
up some very simple configs for those daemons that
FRR supports and ensure that SNMP is working to
some extent.
Donald Sharp [Fri, 20 Sep 2024 01:39:50 +0000 (21:39 -0400)]
zebra: Remove some unused functions on linux build
The functions:
if_get_flags
if_flags_update
if_flags_mangle
are never invoked from a linux netlink build. Put a #ifdef
around those functions so that they are not included on the
linux build as that they are not needed there.