The seg6local route dumped by 'show ipv6 route' makes think that the USP
flavor is supported, whereas it is not the case. This information is a
context information, and for End, the context information should be
empty.
> # show ipv6 route
> [..]
> I>* fc00:0:4::/128 [115/0] is directly connected, sr0, seg6local End USP, weight 1, 00:49:01
Fix this by suppressing the USP information from the output.
Fixes: e496b4203055 ("bgpd: prefix-sid srv6 l3vpn service tlv") Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Donald Sharp [Fri, 14 Feb 2025 12:55:09 +0000 (07:55 -0500)]
bgpd: When removing the prefix list drop the pointer
We are very very rarely seeing this crash:
0 0x7f36ba48e389 in prefix_list_apply_ext lib/plist.c:789
1 0x55eff3fa4126 in subgroup_announce_check bgpd/bgp_route.c:2334
2 0x55eff3fa858e in subgroup_process_announce_selected bgpd/bgp_route.c:3440
3 0x55eff4016488 in subgroup_announce_table bgpd/bgp_updgrp_adv.c:808
4 0x55eff401664e in subgroup_announce_route bgpd/bgp_updgrp_adv.c:861
5 0x55eff40111df in peer_af_announce_route bgpd/bgp_updgrp.c:2223
6 0x55eff3f884cb in bgp_announce_route_timer_expired bgpd/bgp_route.c:5892
7 0x7f36ba4ec239 in event_call lib/event.c:2019
8 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
9 0x55eff3e668b7 in main bgpd/bgp_main.c:557
10 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
11 0x7f36b9e2d304 in __libc_start_main_impl ../csu/libc-start.c:360
12 0x55eff3e64a30 in _start (/home/ci/cibuild.1407/frr-source/bgpd/.libs/bgpd+0x2fda30)
0x608000037038 is located 24 bytes inside of 88-byte region [0x608000037020,0x608000037078)
freed by thread T0 here:
0 0x7f36ba8b76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
1 0x7f36ba439bd7 in qfree lib/memory.c:131
2 0x7f36ba48d3a3 in prefix_list_free lib/plist.c:156
3 0x7f36ba48d3a3 in prefix_list_delete lib/plist.c:247
4 0x7f36ba48fbef in prefix_bgp_orf_remove_all lib/plist.c:1516
5 0x55eff3f679c4 in bgp_route_refresh_receive bgpd/bgp_packet.c:2841
6 0x55eff3f70bab in bgp_process_packet bgpd/bgp_packet.c:4069
7 0x7f36ba4ec239 in event_call lib/event.c:2019
8 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
9 0x55eff3e668b7 in main bgpd/bgp_main.c:557
10 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
previously allocated by thread T0 here:
0 0x7f36ba8b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
1 0x7f36ba4392e4 in qcalloc lib/memory.c:106
2 0x7f36ba48d0de in prefix_list_new lib/plist.c:150
3 0x7f36ba48d0de in prefix_list_insert lib/plist.c:186
4 0x7f36ba48d0de in prefix_list_get lib/plist.c:204
5 0x7f36ba48f9df in prefix_bgp_orf_set lib/plist.c:1479
6 0x55eff3f67ba6 in bgp_route_refresh_receive bgpd/bgp_packet.c:2920
7 0x55eff3f70bab in bgp_process_packet bgpd/bgp_packet.c:4069
8 0x7f36ba4ec239 in event_call lib/event.c:2019
9 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
10 0x55eff3e668b7 in main bgpd/bgp_main.c:557
11 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
Let's just stop trying to save the pointer around in the peer->orf_plist
data structure. There are other design problems but at least lets
stop the crash from possibly happening.
Donald Sharp [Thu, 5 Dec 2024 15:16:03 +0000 (10:16 -0500)]
bgpd: When bgp notices a change to shared_network inform bfd of it
When bgp is started up and reads the config in *before* it has
received interface addresses from zebra, shared_network can
be set to false in this case. Later on once bgp attempts to
reconnect it will refigure out the shared_network again( because
it has received the data from zebra now ). In this case
tell bfd about it.
Donald Sharp [Wed, 20 Nov 2024 21:07:34 +0000 (16:07 -0500)]
bgpd: Allow bfd to work if peer known but interface address not yet
If bgp is coming up and bgp has not received the interface address yet
but bgp has knowledge about a bfd peering, allow it to set the peering
data appropriately.
With this patch, we reset the session only if it's a _REAL_ BFD down event, which
means we trigger session reset if BFD session is established earlier than BGP.
Louis Scalbert [Wed, 12 Feb 2025 11:50:42 +0000 (12:50 +0100)]
bgpd: fix incorrect json in bgp_show_table_rd
In bgp_show_table_rd(), the is_last argument is determined using the
expression "next == NULL" to check if the RD table is the last one. This
helps ensure proper JSON formatting.
However, if next is not NULL but is no longer associated with a BGP
table, the JSON output becomes malformed.
Updates the condition to also verify the existence of the next bgp_dest
table.
Fixes: 1ae44dfcba ("bgpd: unify 'show bgp' with RD with normal unicast bgp show") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit cf0269649cdd09b8d3f2dd8815caf6ecf9cdeef9)
Philippe Guibert [Mon, 10 Feb 2025 15:15:44 +0000 (16:15 +0100)]
nhrpd: fix dont consider incomplete L2 entry
Sometimes, NHRP receives L2 information on a cache entry with the
0.0.0.0 IP address. NHRP considers it as valid and updates the binding
with the new IP address.
> Feb 09 20:09:54 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x2 cache used 0 type 4
> Feb 09 20:10:35 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x4 cache used 1 type 4
> Feb 09 20:10:48 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: del-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x4 cache used 1 type 4
> Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: who-has 10.2.114.238 dev dmvpn1 lladdr (unspec) nud 0x1 cache used 1 type 4
> Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QVXNM-NVHEQ] Netlink: update binding for 10.2.114.238 dev dmvpn1 from c 162.251.180.10 peer.vc.nbma 162.251.180.10 to lladdr (unspec)
> Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 0.0.0.0 nud 0x2 cache used 1 type 4
> Feb 09 20:11:30 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 0.0.0.0 nud 0x4 cache used 1 type 4
Actually, the 0.0.0.0 IP addressed mentiones in the 'who-has' message is
wrong because the nud state value means that value is incomplete and
should not be handled as a valid entry. Instead of considering it, fix
this by by invalidating the current binding. This step is necessary in
order to permit NHRP to trigger resolution requests again.
David Lamparter [Fri, 7 Feb 2025 12:22:25 +0000 (13:22 +0100)]
lib: crash handlers must be allowed on threads
Blocking all signals on non-main threads is not the way to go, at least
the handlers for SIGSEGV, SIGBUS, SIGILL, SIGABRT and SIGFPE need to run
so we get backtraces. Otherwise the process just exits.
Donald Sharp [Mon, 27 Jan 2025 15:34:31 +0000 (10:34 -0500)]
tests: Add a test that shows the v6 recursive nexthop problem
Currently FRR does not handle v6 recurisive resolution properly
when the route being recursed through changes and the most
significant bits of the route are not changed.
David Lamparter [Wed, 22 Jan 2025 10:19:04 +0000 (11:19 +0100)]
lib: guard against padding garbage in ZAPI read
When reading in a nexthop from ZAPI, only set the fields that actually
have meaning. While it shouldn't happen to begin with, we can otherwise
carry padding garbage into the unused leftover union bytes.
David Lamparter [Wed, 22 Jan 2025 10:17:21 +0000 (11:17 +0100)]
zebra: guard against junk in nexthop->rmap_src
rmap_src wasn't initialized, so for IPv4 the unused 12 bytes would
contain whatever junk is on the stack on function entry. Also move
the IPv4 parse before the IPv6 parse so if it's successful we can be
sure the other bytes haven't been touched.
David Lamparter [Wed, 22 Jan 2025 10:13:21 +0000 (11:13 +0100)]
bgpd: don't reuse nexthop variable in loop/switch
While the loop is currently exited in all cases after using nexthop, it
is a footgun to have "nh" around to be reused in another iteration of
the loop. This would leave nexthop with partial data from the previous
use. Make it local where needed instead.
Issue:
When there is no traffic for a group, the LHR and RP take the default KAT+Join timer expiry of
a maximum of 480 seconds to clear the S,G . However, in the FHR, we update the state from JOINED
to NOT Joined, downstream state from PPto NOINFO. This restarts the ET timer, causing S,G on FHR to
take more than 10 minutes to age out.
In other words,
Consider a case where (S,G) is in Join state. When the traffic stops and the KAT (210) expires,
the Join expiry timer restarts. At this time, if we receive a prune, the expectation is to set
PPT to 0 (RFC 4601 sec 4.5.2).
When the PPT expires, we move to the noinfo state and restart the expiry timer one more time. We remove the
(S,G) entry only after ~10 minutes when there is no active traffic.
Summary:
KAT Join ET 210 + PP ET 210 + NOINFO ET 210.
Solution:
Delete the ifchannel when in noinfo state, and KAT is not running.
bgpd: fix duplicate BGP instance created with unified config
When running the bgp_evpn_rt5 setup with unified config, memory leak
about a non deleted BGP instance happens.
> root@ubuntu2204hwe:~/frr/tests/topotests/bgp_evpn_rt5# cat /tmp/topotests/bgp_evpn_rt5.test_bgp_evpn/r1.asan.bgpd.1164105
>
> =================================================================
> ==1164105==ERROR: LeakSanitizer: detected memory leaks
>
> Indirect leak of 12496 byte(s) in 1 object(s) allocated from:
> #0 0x7f358eeb4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
> #1 0x7f358e877233 in qcalloc lib/memory.c:106
> #2 0x55d06c95680a in bgp_create bgpd/bgpd.c:3405
> #3 0x55d06c95a7b3 in bgp_get bgpd/bgpd.c:3805
> #4 0x55d06c87a9b5 in bgp_get_vty bgpd/bgp_vty.c:603
> #5 0x55d06c68dc71 in bgp_evpn_local_l3vni_add bgpd/bgp_evpn.c:7032
> #6 0x55d06c92989b in bgp_zebra_process_local_l3vni bgpd/bgp_zebra.c:3204
> #7 0x7f358e9e3feb in zclient_read lib/zclient.c:4626
> #8 0x7f358e98082d in event_call lib/event.c:1996
> #9 0x7f358e848931 in frr_run lib/libfrr.c:1232
> #10 0x55d06c60eae1 in main bgpd/bgp_main.c:557
> #11 0x7f358e229d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
Actually, a BGP VRF Instance is created in auto mode when creating the
global BGP instance for the L3 VNI. And again, an other BGP VRF instance
is created. Fix this by ensuring that a non existing BGP instance is not
present. If it is present, and with auto mode or in hidden mode, then
override the AS value.
Fixes: f153b9a9b636 ("bgpd: Ignore auto created VRF BGP instances") Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com> Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When peering with an EVPN device from other vendor, FRR acting as route
reflector is not able to read nor transmit the label value.
Actually, EVPN AD routes completely ignore the label value in the code,
whereas in some functionalities like evpn-vpws, it is authorised to
carry and propagate label value.
Fix this by handling the label value.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Chirag Shah [Mon, 3 Feb 2025 20:00:41 +0000 (12:00 -0800)]
bgpd: fix route-distinguisher in vrf leak json cmd
For auto configured value RD value comes as NULL,
switching back to original change will ensure to cover
for both auto and user configured RD value in JSON.
Chirag Shah [Fri, 31 Jan 2025 01:26:46 +0000 (17:26 -0800)]
zebra: evpn svd hash avoid double free
Upon zebra shutdown hash_clean_and_free is called
where user free function is passed,
The free function should not call hash_release
which lead to double free of hash bucket.
Fix:
The fix is to avoid calling hash_release from
free function if its called from hash_clean_and_free
path.
10 0x00007f0422b7df1f in free () from /lib/x86_64-linux-gnu/libc.so.6
11 0x00007f0422edd779 in qfree (mt=0x7f0423047ca0 <MTYPE_HASH_BUCKET>,
ptr=0x55fc8bc81980) at ../lib/memory.c:130
12 0x00007f0422eb97e2 in hash_clean (hash=0x55fc8b979a60,
free_func=0x55fc8a529478 <svd_nh_del_terminate>) at
../lib/hash.c:290
13 0x00007f0422eb98a1 in hash_clean_and_free (hash=0x55fc8a675920
<svd_nh_table>, free_func=0x55fc8a529478 <svd_nh_del_terminate>) at
../lib/hash.c:305
14 0x000055fc8a5323a5 in zebra_vxlan_terminate () at
../zebra/zebra_vxlan.c:6099
15 0x000055fc8a4c9227 in zebra_router_terminate () at
../zebra/zebra_router.c:276
16 0x000055fc8a4413b3 in zebra_finalize (dummy=0x7fffb881c1d0) at
../zebra/main.c:269
17 0x00007f0422f44387 in event_call (thread=0x7fffb881c1d0) at
../lib/event.c:2011
18 0x00007f0422ecb6fa in frr_run (master=0x55fc8b733cb0) at
../lib/libfrr.c:1243
19 0x000055fc8a441987 in main (argc=14, argv=0x7fffb881c4a8) at
../zebra/main.c:584
Donatas Abraitis [Tue, 28 Jan 2025 15:11:58 +0000 (17:11 +0200)]
bgpd: Do not ignore auto generated VRF instances when deleting
When VRF instance is going to be deleted inside bgp_vrf_disable(), it uses
a helper method that skips auto created VRF instances and that leads to STALE
issue.
When creating a VNI for a particular VRF vrfX with e.g. `advertise-all-vni`,
auto VRF instance is created, and then we do `router bgp ASN vrf vrfX`.
But when we do a reload bgp_vrf_disable() is called, and we miss previously
created auto instance.
The more the vrf green is referenced in the import bgp command, the more
there are instances created. The below configuration shows that the vrf
green is referenced twice, and two BGP instances of vrf green are
created.
The below configuration:
> router bgp 99
> [..]
> import vrf green
> exit
> router bgp 99 vrf blue
> [..]
> import vrf green
> exit
> router bgp 99 vrf green
> [..]
> exit
>
> r4# show bgp vrfs
> Type Id routerId #PeersCfg #PeersEstb Name
> L3-VNI RouterMAC Interface
> DFLT 0 10.0.3.4 0 0 default
> 0 00:00:00:00:00:00 unknown
> VRF 5 10.0.40.4 0 0 blue
> 0 00:00:00:00:00:00 unknown
> VRF 6 0.0.0.0 0 0 green
> 0 00:00:00:00:00:00 unknown
> VRF 6 10.0.94.4 0 0 green
> 0 00:00:00:00:00:00 unknown
Fix this at import command, by looking at an already present bgp
instance.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Tue, 31 Dec 2024 13:38:11 +0000 (14:38 +0100)]
bgpd: fix duplicate BGP instance created with unified config
When running the bgp_evpn_rt5 setup with unified config, memory leak
about a non deleted BGP instance happens.
> root@ubuntu2204hwe:~/frr/tests/topotests/bgp_evpn_rt5# cat /tmp/topotests/bgp_evpn_rt5.test_bgp_evpn/r1.asan.bgpd.1164105
>
> =================================================================
> ==1164105==ERROR: LeakSanitizer: detected memory leaks
>
> Indirect leak of 12496 byte(s) in 1 object(s) allocated from:
> #0 0x7f358eeb4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
> #1 0x7f358e877233 in qcalloc lib/memory.c:106
> #2 0x55d06c95680a in bgp_create bgpd/bgpd.c:3405
> #3 0x55d06c95a7b3 in bgp_get bgpd/bgpd.c:3805
> #4 0x55d06c87a9b5 in bgp_get_vty bgpd/bgp_vty.c:603
> #5 0x55d06c68dc71 in bgp_evpn_local_l3vni_add bgpd/bgp_evpn.c:7032
> #6 0x55d06c92989b in bgp_zebra_process_local_l3vni bgpd/bgp_zebra.c:3204
> #7 0x7f358e9e3feb in zclient_read lib/zclient.c:4626
> #8 0x7f358e98082d in event_call lib/event.c:1996
> #9 0x7f358e848931 in frr_run lib/libfrr.c:1232
> #10 0x55d06c60eae1 in main bgpd/bgp_main.c:557
> #11 0x7f358e229d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
Actually, a BGP VRF Instance is created in auto mode when creating the
global BGP instance for the L3 VNI. And again, an other BGP VRF instance
is created. Fix this by ensuring that a non existing BGP instance is not
present. If it is present, and with auto mode or in hidden mode, then
override the AS value.
Fixes: f153b9a9b636 ("bgpd: Ignore auto created VRF BGP instances") Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com> Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Donald Sharp [Fri, 31 Jan 2025 23:53:30 +0000 (18:53 -0500)]
bgpd: With suppress-fib-pending ensure withdrawal is sent
When you have suppress-fib-pending turned on it is possible
to end up in a situation where the prefix is not withdrawn
from downstream peers.
Here is the timing that I believe is happening:
a) have 2 paths to a peer.
b) receive a withdrawal from 1 path, set BGP_NODE_FIB_INSTALL_PENDING
and send the route install to zebra.
c) receive a withdrawal from the other path.
d) At this point we have a dest->flags set BGP_NODE_FIB_INSTALL_PENDING
old_select the path_info going away, new_select is NULL
e) A bit further down we call group_announce_route() which calls
the code to see if we should advertise the path. It sees the
BGP_NODE_FIB_INSTALL_PENDING flag and says, nope.
f) the route is sent to zebra to withdraw, which unsets the
BGP_NODE_FIB_INSTALL_PENDING.
g) This function winds up and deletes the path_info. Dest now
has no path infos.
h) BGP receives the route install(from step b) and unsets the
BGP_NODE_FIB_INSTALL_PENDING flag
i) BGP receives the route removed from zebra (from step f) and
unsets the flag again.
We know if there is no new_select, let's go ahead and just
unset the PENDING flag to allow the withdrawal to go out
at the time when the second withdrawal is received.
Nobuhiro MIKI [Wed, 29 Jan 2025 04:31:53 +0000 (04:31 +0000)]
tools: Fix frr-reload for ebgp-multihop TTL reconfiguration.
In ebgp-multihop, there is a difference in reload behavior when TTL is
unspecified (meaning default 255) and when 255 is explicitly specified.
For example, when reloading with 'neighbor <neighbor> ebgp-multihop
255' in the config, the following difference is created. This commit
fixes that.
Lines To Delete
===============
router bgp 65001
no neighbor 10.0.0.4 ebgp-multihop
exit
TL;DR; Handling BGP AddPath capability is not trivial (possible) dynamically.
When the sender is AddPath-capable and sends NLRIs encoded with AddPath ID,
and at the same time the receiver sends AddPath capability "disable-addpath-rx"
(flag update) via dynamic capabilities, both peers are out of sync about the
AddPath state. The receiver thinks already he's not AddPath-capable anymore,
hence it tries to parse NLRIs as non-AddPath, while they are actually encoded
as AddPath.
AddPath capability itself does not provide (in RFC) any mechanism on backward
compatible way to handle NLRIs if they come mixed (AddPath + non-AddPath).
This explains why we have failures in our CI periodically.
Donald Sharp [Thu, 24 Oct 2024 21:44:31 +0000 (17:44 -0400)]
bgpd: Fix wrong pthread event cancelling
0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:44
1 __pthread_kill_internal (signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:78
2 __GI___pthread_kill (threadid=130719886083648, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
3 0x000076e399e42476 in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26
4 0x000076e39a34f950 in core_handler (signo=6, siginfo=0x76e3985fca30, context=0x76e3985fc900) at lib/sigevent.c:258
5 <signal handler called>
6 __pthread_kill_implementation (no_tid=0, signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:44
7 __pthread_kill_internal (signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:78
8 __GI___pthread_kill (threadid=130719886083648, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
9 0x000076e399e42476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
10 0x000076e399e287f3 in __GI_abort () at ./stdlib/abort.c:79
11 0x000076e39a39874b in _zlog_assert_failed (xref=0x76e39a46cca0 <_xref.27>, extra=0x0) at lib/zlog.c:789
12 0x000076e39a369dde in cancel_event_helper (m=0x5eda32df5e40, arg=0x5eda33afeed0, flags=1) at lib/event.c:1428
13 0x000076e39a369ef6 in event_cancel_event_ready (m=0x5eda32df5e40, arg=0x5eda33afeed0) at lib/event.c:1470
14 0x00005eda0a94a5b3 in bgp_stop (connection=0x5eda33afeed0) at bgpd/bgp_fsm.c:1355
15 0x00005eda0a94b4ae in bgp_stop_with_notify (connection=0x5eda33afeed0, code=8 '\b', sub_code=0 '\000') at bgpd/bgp_fsm.c:1610
16 0x00005eda0a979498 in bgp_packet_add (connection=0x5eda33afeed0, peer=0x5eda33b11800, s=0x76e3880daf90) at bgpd/bgp_packet.c:152
17 0x00005eda0a97a80f in bgp_keepalive_send (peer=0x5eda33b11800) at bgpd/bgp_packet.c:639
18 0x00005eda0a9511fd in peer_process (hb=0x5eda33c9ab80, arg=0x76e3985ffaf0) at bgpd/bgp_keepalives.c:111
19 0x000076e39a2cd8e6 in hash_iterate (hash=0x76e388000be0, func=0x5eda0a95105e <peer_process>, arg=0x76e3985ffaf0) at lib/hash.c:252
20 0x00005eda0a951679 in bgp_keepalives_start (arg=0x5eda3306af80) at bgpd/bgp_keepalives.c:214
21 0x000076e39a2c9932 in frr_pthread_inner (arg=0x5eda3306af80) at lib/frr_pthread.c:180
22 0x000076e399e94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
23 0x000076e399f26850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) f 12
12 0x000076e39a369dde in cancel_event_helper (m=0x5eda32df5e40, arg=0x5eda33afeed0, flags=1) at lib/event.c:1428
1428 assert(m->owner == pthread_self());
In this decode the attempt to cancel the connection's events from
the wrong thread is causing the crash. Modify the code to create an
event on the bm->master to cancel the events for the connection.
Donald Sharp [Thu, 24 Oct 2024 18:17:51 +0000 (14:17 -0400)]
bgpd: Fix deadlock in bgp_keepalive and master pthreads
(gdb) bt
0 futex_wait (private=0, expected=2, futex_word=0x5c438e9a98d8) at ../sysdeps/nptl/futex-internal.h:146
1 __GI___lll_lock_wait (futex=futex@entry=0x5c438e9a98d8, private=0) at ./nptl/lowlevellock.c:49
2 0x00007af16d698002 in lll_mutex_lock_optimized (mutex=0x5c438e9a98d8) at ./nptl/pthread_mutex_lock.c:48
3 ___pthread_mutex_lock (mutex=0x5c438e9a98d8) at ./nptl/pthread_mutex_lock.c:93
4 0x00005c4369c17e70 in _frr_mtx_lock (mutex=0x5c438e9a98d8, func=0x5c4369dc2750 <__func__.265> "bgp_notify_send_internal") at ./lib/frr_pthread.h:258
5 0x00005c4369c1a07a in bgp_notify_send_internal (connection=0x5c438e9a98c0, code=8 '\b', sub_code=0 '\000', data=0x0, datalen=0, use_curr=true) at bgpd/bgp_packet.c:928
6 0x00005c4369c1a707 in bgp_notify_send (connection=0x5c438e9a98c0, code=8 '\b', sub_code=0 '\000') at bgpd/bgp_packet.c:1069
7 0x00005c4369bea422 in bgp_stop_with_notify (connection=0x5c438e9a98c0, code=8 '\b', sub_code=0 '\000') at bgpd/bgp_fsm.c:1597
8 0x00005c4369c18480 in bgp_packet_add (connection=0x5c438e9a98c0, peer=0x5c438e9b6010, s=0x7af15c06bf70) at bgpd/bgp_packet.c:151
9 0x00005c4369c19816 in bgp_keepalive_send (peer=0x5c438e9b6010) at bgpd/bgp_packet.c:639
10 0x00005c4369bf01fd in peer_process (hb=0x5c438ed05520, arg=0x7af16bdffaf0) at bgpd/bgp_keepalives.c:111
11 0x00007af16dacd8e6 in hash_iterate (hash=0x7af15c000be0, func=0x5c4369bf005e <peer_process>, arg=0x7af16bdffaf0) at lib/hash.c:252
12 0x00005c4369bf0679 in bgp_keepalives_start (arg=0x5c438e0db110) at bgpd/bgp_keepalives.c:214
13 0x00007af16dac9932 in frr_pthread_inner (arg=0x5c438e0db110) at lib/frr_pthread.c:180
14 0x00007af16d694ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
15 0x00007af16d726850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb)
The bgp keepalive pthread gets deadlocked with itself and consequently
the bgp master pthread gets locked when it attempts to lock
the peerhash_mtx, since it is also locked by the keepalive_pthread
The keepalive pthread is locking the peerhash_mtx in
bgp_keepalives_start. Next the connection->io_mtx mutex in
bgp_keepalives_send is locked and then when it notices a problem it invokes
bgp_stop_with_notify which relocks the same mutex ( and of course
the relock causes it to get stuck on itself ). This generates a
deadlock condition.
Modify the code to only hold the connection->io_mtx as short as
possible.
Krishnasamy R [Tue, 21 Jan 2025 09:06:53 +0000 (01:06 -0800)]
bgpd: Fix for local interface MAC cache issue in 'bgp mac hash' table
Issue:
During FRR restart, we fail to add some of the local interface's MAC
to the 'bgp mac hash'. Not having local MAC in the hash table can cause
lookup issues while receiving EVPN RT-2.
Currently, we have code to add local MAC(bgp_mac_add_mac_entry) while handling
interface add/up events in BGP(bgp_ifp_up/bgp_ifp_create). But the code
'bgp_mac_add_mac_entry' in bgp_ifp_create is not getting invoked as it
is placed under a specific check(vrf->bgp link check).
Fix:
We can skip this check 'vrf->bgp link existence' as the tenant VRF might
not have BGP instance but still we want to cache the tenant VRF local
MACs. So keeping this check in bgp_ifp_create inline with bgp_ifp_up.
Enke Chen [Thu, 9 Jan 2025 01:34:29 +0000 (17:34 -0800)]
bgpd: apply route-map for aggregate before attribute comparison
Currently when re-evaluating an aggregate route, the full attribute of
the aggregate route is not compared with the existing one in the BGP
table. That can result in unnecessary churns (un-install and then
install) of the aggregate route when a more specific route is added or
deleted, or when the route-map for the aggregate changes. The churn
would impact route installation and route advertisement.
The fix is to apply the route-map for the aggregate first and then
compare the attribute.
Louis Scalbert [Thu, 9 Jan 2025 17:28:53 +0000 (18:28 +0100)]
bgpd: fix crash in displaying json orf prefix-list
bgpd crashes when there is several entries in the prefix-list. No
backtrace is provided because the issue was catched from a code review.
Fixes: 856ca177c4 ("Added json formating support to show-...-neighbors-... bgp commands.") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 8ccf60921b85893d301186a0f8156fb702da379f)
Louis Scalbert [Thu, 9 Jan 2025 17:24:39 +0000 (18:24 +0100)]
bgpd: fix bgp orf prefix-list json prefix
0x<address>FX was displayed instead of the prefix.
Fixes: b219dda129 ("lib: Convert usage of strings to %pFX and %pRN") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit b7e843d7e8afe57d3815dbb44e30307654e73711)
Jonathan Voss [Fri, 3 Jan 2025 03:19:30 +0000 (03:19 +0000)]
tools: Add missing rpki keyword to vrf in frr-reload
When reloading the following configuration:
```
vrf red
rpki
rpki cache tcp 172.65.0.2 8282 preference 1
exit
exit-vrf
```
frr-reload.py does not properly enter the `rpki` context
within a `vrf`. Because of this, it fails to apply RPKI
configurations.
Signed-off-by: Donald Sharp <sharpd@nvidia.com> Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
(cherry picked from commit 54ec9f38884fb63e045732537c4c1f4a94608987)
Donatas Abraitis [Mon, 23 Dec 2024 21:07:38 +0000 (23:07 +0200)]
FRR Release 10.1.2
- babeld
- Do not remove route when replacing
- Send the route's metric down to zebra.
- bfdd
- Add no variants to interval configurations
- Retain remote dplane client socket
- bgpd
- Actually make ` --v6-with-v4-nexthops` it work
- Add `bgp ipv6-auto-ra` command
- Allow value 0 in aigp-metric setting
- Avoid use-after-free when doing `no router bgp` with auto created instances
- Fix to pop items off zebra_announce FIFO for few EVPN triggers
- Clear all paths including addpath once GR expires
- Compare aigp after local route check in bgp_path_info_cmp()
- Do not filter no-export community for BGP OAD (one administration domain)
- Do not reset peers on suppress-fib toggling
- EVPN fix per rd specific type-2 json output
- Fix bgp core with a possible Intf delete
- Fix blank line in running-config with bmp listener cmd
- Fix crash when polling bgp4v2PathAttrTable
- Fix display of local label in show bgp
- Fix `enforce-first-as` per peer-group removal
- Fix evpn bestpath calculation when path is not established
- Fix evpn mh esi flap remove local routes
- Fix for match source-protocol in route-map for redistribute cmd
- Fix memory leak when creating BMP connection with a source interface
- Fix memory leak when reconfiguring a route distinguisher
- Fix printfrr_bp for non initialized peers
- Fix resolvedPrefix in show nexthop json output
- Fix route selection with AIGP
- Fix several issues in sourcing AIGP attribute
- Fix unconfigure asdot neighbor
- Fix use single whitespace when displaying flowspec entries
- Fix version attribute is an int, not a string
- Include structure when installing End.DT4/6 SID
- Include structure when installing End.DT46 SID
- Include structure when removing End.DT4/6 SID
- Include structure when removing End.DT46 SID
- Move some non BGP-specific route-map functions to lib
- Set LLGR stale routes for all the paths including addpath
- Treat numbered community-list only if it's in a range 1-500
- Validate both nexthop information (NEXTHOP and NLRI)
- Validate only affected RPKI prefixes instead of a full RIB
- isisd
- Fix change flex-algorithm number from uint32 to uint8
- Fix memory leaks when the transition of neighbor state from non-UP to DOWN
- Fix rcap tlv double-free crash
- Fix wrong check for MT commands
- lib
- Attach stdout to child only if --log=stdout and stdout FD is a tty
- Include SID structure in seg6local nexthop
- Take ge/le into consideration when checking the prefix with the prefix-list
- Keep `zebra on-rib-process script` in frr.conf
- nhrpd
- Fixes duplicate auth extension
- ospfd
- Add a hidden command for old `no router-id`
- Fix heap corruption vulnerability when parsing SR-Algorithm TLV
- Fix missing '[no]ip ospf graceful-restart hello-delay <N>' commands
- Interface 'ip ospf neighbor-filter' startup config not applied.
- Use router_id what Zebra has if we remove a static router_id
- pimd
- Allow resolving bsr via directly connected secondary address
- Fix access-list memory leak in pimd
- vrrpd
- Iterate over all ancillary messages
- zebra
- Add missing new line for help string
- Add missing proto translations
- Correctly report metrics
- Fix crash during reconnect
- Fix heap-use-after free on ns shutdown
- Fix snmp walk of zebra rib
- Let's use memset instead of walking bytes and setting to 0
- Separate zebra ZAPI server open and accept
- Unlock node only after operation in zebra_free_rnh()
bgpd: Validate only affected RPKI prefixes instead of a full RIB
Before this fix, if rpki_sync_socket_rtr socket returns EAGAIN, then ALL routes
in the RIB are revalidated which takes lots of CPU and some unnecessary traffic,
e.g. if using BMP servers. With a full feed it would waste 50-80Mbps.
Instead we should try to drain an existing pipe (another end), and revalidate
only affected prefixes.
Philippe Guibert [Wed, 18 Dec 2024 15:53:48 +0000 (16:53 +0100)]
bgpd: fix memory leak when reconfiguring a route distinguisher
A memory leak happens when reconfiguring an already configured route
distinguisher on an L3VPN BGP instance. Fix this by freeing the previous
route distinguisher.
Donald Sharp [Thu, 5 Dec 2024 18:12:00 +0000 (13:12 -0500)]
bgpd: Fix evpn bestpath calculation when path is not established
If you have a bestpath list that looks something like this:
<local evpn mac route>
<learned from peer out swp60>
<learned from peer out swp57>
And a network event happens that causes the peer out swp60
to not be in an established state, yet we still have the
path_info for the destination for swp60, bestpath
will currently end up with this order:
<learned from peer out swp60>
<local evpn mac route>
<learned from peer out swp57>
This causes the local evpn mac route to be deleted in zebra( Wrong! ).
This is happening because swp60 is skipped in bestpath calculation and
not considered to be a path yet it stays at the front of the list.
Modify bestpath calculation such that when pulling the unsorted_list
together to pull path info's into that list when they are also
not in a established state.
Donatas Abraitis [Tue, 10 Dec 2024 14:28:26 +0000 (16:28 +0200)]
lib: Take ge/le into consideration when checking the prefix with the prefix-list
Without the fix:
```
show ip prefix-list test_1 10.20.30.96/27 first-match
<no result>
show ip prefix-list test_2 192.168.1.2/32 first-match
<no result>
```
With the fix:
```
ip prefix-list test_1 seq 10 permit 10.20.30.64/26 le 27
!
end
donatas# show ip prefix-list test_1 10.20.30.96/27
seq 10 permit 10.20.30.64/26 le 27 (hit count: 1, refcount: 0)
donatas# show ip prefix-list test_1 10.20.30.64/27
seq 10 permit 10.20.30.64/26 le 27 (hit count: 2, refcount: 0)
donatas# show ip prefix-list test_1 10.20.30.64/28
donatas# show ip prefix-list test_1 10.20.30.126/26
seq 10 permit 10.20.30.64/26 le 27 (hit count: 3, refcount: 0)
donatas# show ip prefix-list test_1 10.20.30.126/30
donatas#
```
Rajasekar Raja [Tue, 10 Dec 2024 21:45:02 +0000 (13:45 -0800)]
bgpd: Fix bgp core with a possible Intf delete
Although trigger unknown, based on the backtrace in one of the internal
testing, we do see some delete in the Intf where we can have the peer
ifp pointer null and we try to dereference it while trying to install
the route leading to a crash
Skip updating the ifindex in such cases and since the nexthop is not
properly updated, BGP skips sending it to zebra.
BackTrace:
0 0x00007faef05e7ebc in ?? () from /lib/x86_64-linux-gnu/libc.so.6
1 0x00007faef0598fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
2 0x00007faef09900dc in core_handler (signo=11, siginfo=0x7ffdde8cb4b0, context=<optimized out>) at lib/sigevent.c:274
3 <signal handler called>
4 0x00005560aad4b7d8 in update_ipv6nh_for_route_install (api_nh=0x7ffdde8cbe94, is_evpn=false, best_pi=0x5560b21187d0, pi=0x5560b21187d0, ifindex=0, nexthop=0x5560b03cb0dc,
nh_bgp=0x5560ace04df0, nh_othervrf=0) at bgpd/bgp_zebra.c:1273
5 bgp_zebra_announce_actual (dest=dest@entry=0x5560afcfa950, info=0x5560b21187d0, bgp=0x5560ace04df0) at bgpd/bgp_zebra.c:1521
6 0x00005560aad4bc85 in bgp_handle_route_announcements_to_zebra (e=<optimized out>) at bgpd/bgp_zebra.c:1896
7 0x00007faef09a1c0d in thread_call (thread=thread@entry=0x7ffdde8d7580) at lib/thread.c:2008
8 0x00007faef095a598 in frr_run (master=0x5560ac7e5190) at lib/libfrr.c:1223
9 0x00005560aac65db6 in main (argc=<optimized out>, argv=<optimized out>) at bgpd/bgp_main.c:557
(gdb) f 4
4 0x00005560aad4b7d8 in update_ipv6nh_for_route_install (api_nh=0x7ffdde8cbe94, is_evpn=false, best_pi=0x5560b21187d0, pi=0x5560b21187d0, ifindex=0, nexthop=0x5560b03cb0dc,
nh_bgp=0x5560ace04df0, nh_othervrf=0) at bgpd/bgp_zebra.c:1273
1273 in bgpd/bgp_zebra.c
(gdb) p pi->peer->ifp
$26 = (struct interface *) 0x0
Mark Stapp [Wed, 30 Oct 2024 15:02:17 +0000 (11:02 -0400)]
zebra: separate zebra ZAPI server open and accept
Separate zebra's ZAPI server socket handling into two phases:
an early phase that opens the socket, and a later phase that
starts listening for client connections.