isisd: Fix memory leaks when the transition of neighbor state from non-UP to DOWN
When receiving a hello packet, if the neighbor state transitions directly from a non-ISIS_ADJ_UP state (such as ISIS_ADJ_INITIALIZING) to ISIS_ADJ_DOWN state, the neighbor entry cannot be deleted. If the neighbor is removed or the neighbor's System ID changes, it may result in memory leakage in the neighbor entry.
Test Scenario:
LAN link between Router A and Router B is established. Router A does not configure neighbor authentication, while Router B is configured with neighbor authentication. When the neighbor entry on Router B ages out, the neighbor state on Router A transitions to INIT. If Router B is then removed, the neighbor state on Router A transitions to DOWN and persists.
Donald Sharp [Fri, 14 Feb 2025 12:55:09 +0000 (07:55 -0500)]
bgpd: When removing the prefix list drop the pointer
We are very very rarely seeing this crash:
0 0x7f36ba48e389 in prefix_list_apply_ext lib/plist.c:789
1 0x55eff3fa4126 in subgroup_announce_check bgpd/bgp_route.c:2334
2 0x55eff3fa858e in subgroup_process_announce_selected bgpd/bgp_route.c:3440
3 0x55eff4016488 in subgroup_announce_table bgpd/bgp_updgrp_adv.c:808
4 0x55eff401664e in subgroup_announce_route bgpd/bgp_updgrp_adv.c:861
5 0x55eff40111df in peer_af_announce_route bgpd/bgp_updgrp.c:2223
6 0x55eff3f884cb in bgp_announce_route_timer_expired bgpd/bgp_route.c:5892
7 0x7f36ba4ec239 in event_call lib/event.c:2019
8 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
9 0x55eff3e668b7 in main bgpd/bgp_main.c:557
10 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
11 0x7f36b9e2d304 in __libc_start_main_impl ../csu/libc-start.c:360
12 0x55eff3e64a30 in _start (/home/ci/cibuild.1407/frr-source/bgpd/.libs/bgpd+0x2fda30)
0x608000037038 is located 24 bytes inside of 88-byte region [0x608000037020,0x608000037078)
freed by thread T0 here:
0 0x7f36ba8b76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
1 0x7f36ba439bd7 in qfree lib/memory.c:131
2 0x7f36ba48d3a3 in prefix_list_free lib/plist.c:156
3 0x7f36ba48d3a3 in prefix_list_delete lib/plist.c:247
4 0x7f36ba48fbef in prefix_bgp_orf_remove_all lib/plist.c:1516
5 0x55eff3f679c4 in bgp_route_refresh_receive bgpd/bgp_packet.c:2841
6 0x55eff3f70bab in bgp_process_packet bgpd/bgp_packet.c:4069
7 0x7f36ba4ec239 in event_call lib/event.c:2019
8 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
9 0x55eff3e668b7 in main bgpd/bgp_main.c:557
10 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
previously allocated by thread T0 here:
0 0x7f36ba8b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
1 0x7f36ba4392e4 in qcalloc lib/memory.c:106
2 0x7f36ba48d0de in prefix_list_new lib/plist.c:150
3 0x7f36ba48d0de in prefix_list_insert lib/plist.c:186
4 0x7f36ba48d0de in prefix_list_get lib/plist.c:204
5 0x7f36ba48f9df in prefix_bgp_orf_set lib/plist.c:1479
6 0x55eff3f67ba6 in bgp_route_refresh_receive bgpd/bgp_packet.c:2920
7 0x55eff3f70bab in bgp_process_packet bgpd/bgp_packet.c:4069
8 0x7f36ba4ec239 in event_call lib/event.c:2019
9 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
10 0x55eff3e668b7 in main bgpd/bgp_main.c:557
11 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
Let's just stop trying to save the pointer around in the peer->orf_plist
data structure. There are other design problems but at least lets
stop the crash from possibly happening.
David Lamparter [Fri, 7 Feb 2025 12:22:25 +0000 (13:22 +0100)]
lib: crash handlers must be allowed on threads
Blocking all signals on non-main threads is not the way to go, at least
the handlers for SIGSEGV, SIGBUS, SIGILL, SIGABRT and SIGFPE need to run
so we get backtraces. Otherwise the process just exits.
David Lamparter [Wed, 22 Jan 2025 10:19:04 +0000 (11:19 +0100)]
lib: guard against padding garbage in ZAPI read
When reading in a nexthop from ZAPI, only set the fields that actually
have meaning. While it shouldn't happen to begin with, we can otherwise
carry padding garbage into the unused leftover union bytes.
David Lamparter [Wed, 22 Jan 2025 10:17:21 +0000 (11:17 +0100)]
zebra: guard against junk in nexthop->rmap_src
rmap_src wasn't initialized, so for IPv4 the unused 12 bytes would
contain whatever junk is on the stack on function entry. Also move
the IPv4 parse before the IPv6 parse so if it's successful we can be
sure the other bytes haven't been touched.
David Lamparter [Wed, 22 Jan 2025 10:13:21 +0000 (11:13 +0100)]
bgpd: don't reuse nexthop variable in loop/switch
While the loop is currently exited in all cases after using nexthop, it
is a footgun to have "nh" around to be reused in another iteration of
the loop. This would leave nexthop with partial data from the previous
use. Make it local where needed instead.
Donald Sharp [Mon, 27 Jan 2025 15:34:31 +0000 (10:34 -0500)]
tests: Add a test that shows the v6 recursive nexthop problem
Currently FRR does not handle v6 recurisive resolution properly
when the route being recursed through changes and the most
significant bits of the route are not changed.
Louis Scalbert [Thu, 9 Jan 2025 17:28:53 +0000 (18:28 +0100)]
bgpd: fix crash in displaying json orf prefix-list
bgpd crashes when there is several entries in the prefix-list. No
backtrace is provided because the issue was catched from a code review.
Fixes: 856ca177c4 ("Added json formating support to show-...-neighbors-... bgp commands.") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 8ccf60921b85893d301186a0f8156fb702da379f)
Louis Scalbert [Thu, 9 Jan 2025 17:24:39 +0000 (18:24 +0100)]
bgpd: fix bgp orf prefix-list json prefix
0x<address>FX was displayed instead of the prefix.
Fixes: b219dda129 ("lib: Convert usage of strings to %pFX and %pRN") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit b7e843d7e8afe57d3815dbb44e30307654e73711)
Donatas Abraitis [Fri, 27 Dec 2024 20:57:30 +0000 (22:57 +0200)]
FRR Release 9.1.3
- bfdd
- Add no variants to interval configurations
- bgpd
- Actually make ` --v6-with-v4-nexthops` it work
- Add `bgp ipv6-auto-ra` command
- Allow value 0 in aigp-metric setting
- Clear all paths including addpath once GR expires
- Compare aigp after local route check in bgp_path_info_cmp()
- EVPN fix per rd specific type-2 json output
- Fix addressing information of non established outgoing sessions
- Fix bgp core with a possible Intf delete
- Fix blank line in running-config with bmp listener cmd
- Fix crash when polling bgp4v2PathAttrTable
- Fix display of local label in show bgp
- Fix for match source-protocol in route-map for redistribute cmd
- Fix memory leak when creating BMP connection with a source interface
- Fix printfrr_bp for non initialized peers
- Fix route selection with AIGP
- Fix several issues in sourcing AIGP attribute
- Fix unconfigure asdot neighbor
- Include structure when installing End.DT4/6 SID
- Include structure when installing End.DT46 SID
- Include structure when removing End.DT4/6 SID
- Include structure when removing End.DT46 SID
- Move some non BGP-specific route-map functions to lib
- Remove useless control checks about TCP connection
- Set LLGR stale routes for all the paths including addpath
- Treat numbered community-list only if it's in a range 1-500
- Validate both nexthop information (NEXTHOP and NLRI)
- isisd
- Fix rcap tlv double-free crash
- lib
- Include SID structure in seg6local nexthop
- Take ge/le into consideration when checking the prefix with the prefix-list
- Keep `zebra on-rib-process script` in frr.conf
- nhrpd
- Fixes duplicate auth extension
- ospfd
- Fix missing '[no]ip ospf graceful-restart hello-delay <N>' commands
- pimd
- Allow resolving bsr via directly connected secondary address
- Fix access-list memory leak in pimd
- vrrpd
- Iterate over all ancillary messages
- zebra
- Add missing new line for help string
- Add missing proto translations
- Correctly report metrics
- Fix crash during reconnect
- Fix snmp walk of zebra rib
- Let's use memset instead of walking bytes and setting to 0
- Separate zebra ZAPI server open and accept
- Unlock node only after operation in zebra_free_rnh()
Donatas Abraitis [Tue, 10 Dec 2024 14:28:26 +0000 (16:28 +0200)]
lib: Take ge/le into consideration when checking the prefix with the prefix-list
Without the fix:
```
show ip prefix-list test_1 10.20.30.96/27 first-match
<no result>
show ip prefix-list test_2 192.168.1.2/32 first-match
<no result>
```
With the fix:
```
ip prefix-list test_1 seq 10 permit 10.20.30.64/26 le 27
!
end
donatas# show ip prefix-list test_1 10.20.30.96/27
seq 10 permit 10.20.30.64/26 le 27 (hit count: 1, refcount: 0)
donatas# show ip prefix-list test_1 10.20.30.64/27
seq 10 permit 10.20.30.64/26 le 27 (hit count: 2, refcount: 0)
donatas# show ip prefix-list test_1 10.20.30.64/28
donatas# show ip prefix-list test_1 10.20.30.126/26
seq 10 permit 10.20.30.64/26 le 27 (hit count: 3, refcount: 0)
donatas# show ip prefix-list test_1 10.20.30.126/30
donatas#
```
Rajasekar Raja [Tue, 10 Dec 2024 21:45:02 +0000 (13:45 -0800)]
bgpd: Fix bgp core with a possible Intf delete
Although trigger unknown, based on the backtrace in one of the internal
testing, we do see some delete in the Intf where we can have the peer
ifp pointer null and we try to dereference it while trying to install
the route leading to a crash
Skip updating the ifindex in such cases and since the nexthop is not
properly updated, BGP skips sending it to zebra.
BackTrace:
0 0x00007faef05e7ebc in ?? () from /lib/x86_64-linux-gnu/libc.so.6
1 0x00007faef0598fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
2 0x00007faef09900dc in core_handler (signo=11, siginfo=0x7ffdde8cb4b0, context=<optimized out>) at lib/sigevent.c:274
3 <signal handler called>
4 0x00005560aad4b7d8 in update_ipv6nh_for_route_install (api_nh=0x7ffdde8cbe94, is_evpn=false, best_pi=0x5560b21187d0, pi=0x5560b21187d0, ifindex=0, nexthop=0x5560b03cb0dc,
nh_bgp=0x5560ace04df0, nh_othervrf=0) at bgpd/bgp_zebra.c:1273
5 bgp_zebra_announce_actual (dest=dest@entry=0x5560afcfa950, info=0x5560b21187d0, bgp=0x5560ace04df0) at bgpd/bgp_zebra.c:1521
6 0x00005560aad4bc85 in bgp_handle_route_announcements_to_zebra (e=<optimized out>) at bgpd/bgp_zebra.c:1896
7 0x00007faef09a1c0d in thread_call (thread=thread@entry=0x7ffdde8d7580) at lib/thread.c:2008
8 0x00007faef095a598 in frr_run (master=0x5560ac7e5190) at lib/libfrr.c:1223
9 0x00005560aac65db6 in main (argc=<optimized out>, argv=<optimized out>) at bgpd/bgp_main.c:557
(gdb) f 4
4 0x00005560aad4b7d8 in update_ipv6nh_for_route_install (api_nh=0x7ffdde8cbe94, is_evpn=false, best_pi=0x5560b21187d0, pi=0x5560b21187d0, ifindex=0, nexthop=0x5560b03cb0dc,
nh_bgp=0x5560ace04df0, nh_othervrf=0) at bgpd/bgp_zebra.c:1273
1273 in bgpd/bgp_zebra.c
(gdb) p pi->peer->ifp
$26 = (struct interface *) 0x0
Mark Stapp [Wed, 30 Oct 2024 15:02:17 +0000 (11:02 -0400)]
zebra: separate zebra ZAPI server open and accept
Separate zebra's ZAPI server socket handling into two phases:
an early phase that opens the socket, and a later phase that
starts listening for client connections.
Donatas Abraitis [Sun, 17 Nov 2024 09:27:31 +0000 (11:27 +0200)]
bgpd: Validate both nexthop information (NEXTHOP and NLRI)
If we receive an IPv6 prefix e.g.: 2001:db8:100::/64 with nextop: 0.0.0.0, and
mp_nexthop: fc00::2, we should not treat this with an invalid nexthop because
of 0.0.0.0. We MUST check for MP_REACH attribute also and decide later if we
have at least one a valid nexthop.
Rajasekar Raja [Mon, 21 Oct 2024 17:53:27 +0000 (10:53 -0700)]
bgpd: Fix for match source-protocol in route-map for redistribute cmd
A redistribute cmd can have a route-map attached to it and adding the
match source-protocol to that route-map means BGP to filter which
protocol routes to accept among the bunch of routes zebra is sending.
bgpd: fix addressing information of non established outgoing sessions
When trying to connect to a BGP peer that does not respons, the 'show
bgp neighbors' command does not give any indication on the local and
remote addresses used:
> # show bgp neighbors
> BGP neighbor is 192.0.2.150, remote AS 65500, local AS 65500, internal link
> Local Role: undefined
> Remote Role: undefined
> BGP version 4, remote router ID 0.0.0.0, local router ID 192.0.2.1
> BGP state = Connect
> [..]
> Connections established 0; dropped 0
> Last reset 00:00:04, Waiting for peer OPEN (n/a)
> Internal BGP neighbor may be up to 255 hops away.
> BGP Connect Retry Timer in Seconds: 120
> Next connect timer due in 117 seconds
> Read thread: off Write thread: off FD used: 27
The addressing information (address and port) are only available
when TCP session is established, whereas this information is present
at the system level:
Add the display for outgoing BGP session, as the information in
the getsockname() API provides information for connected streams.
When getpeername() API does not give any information, use the peer
configuration (destination port is encoded in peer->port).
> # show bgp neighbors
> BGP neighbor is 192.0.2.150, remote AS 65500, local AS 65500, internal link
> Local Role: undefined
> Remote Role: undefined
> BGP version 4, remote router ID 0.0.0.0, local router ID 192.0.2.1
> BGP state = Connect
> [..]
> Connections established 0; dropped 0
> Last reset 00:00:16, Waiting for peer OPEN (n/a)
> Local host: 192.0.2.1, Local port: 46084
> Foreign host: 192.0.2.150, Foreign port: 179
bgpd: remove useless control checks about TCP connection
When attempting to get the src and destination addresses of a given
connection, the API may return the NULL pointer, but further code
in bgp_zebra_nexthop_set() already does a check about the given
pointer.
Relaxing the error code for all the returned adressing.
Fixes: 1ff9a340588a ("bgpd: bgpd-fsm-fix.patch") Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit ba7130309954fbe8d58854339ca43259149e603a)
bgpd: Set LLGR stale routes for all the paths including addpath
Without this patch we set only the first path for the route (if multiple exist)
as LLGR stale and stop doing that for the rest of the paths, which is wrong.
Donatas Abraitis [Thu, 31 Oct 2024 08:47:48 +0000 (10:47 +0200)]
zebra: Add missing new line for help string
```
-A, --asic-offload FRR is interacting with an asic underneath the linux kernel
--v6-with-v4-nexthops Underlying dataplane supports v6 routes with v4 nexthops -s, --nl-bufsize Set netlink receive buffer size
```
Fixes: 1f5611c06d1c243b42279748788f0627793ead9c ("zebra: Allow zebra cli to accept v6 routes with v4 nexthops") Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 25ae643996d338b8230fb15a9064843fe85de224)
Louis Scalbert [Fri, 25 Oct 2024 15:54:07 +0000 (17:54 +0200)]
bgpd: fix display of local label in show bgp
Fix the display of the local label in show bgp.
> r1# show bgp ipv4 labeled-unicast 172.16.2.2/32
> BGP routing table entry for 172.16.2.2/32, version 2
> Local label: 16 <---- MISSING
> Paths: (1 available, best #1, table default, vrf (null))
> Advertised to non peer-group peers:
> 192.168.1.2
> 65501
> 192.168.1.2 from 192.168.1.2 (172.16.2.2)
> Origin IGP, metric 0, valid, external, best (First path received)
> Remote label: 3
> Last update: Fri Oct 25 17:55:45 2024
Fixes: 67f67ba481 ("bgpd: Drop label_ntop/label_pton functions") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit e7b3276ace65d59edb4d614158d4f2959f12f868)
pimd: allow resolving bsr via directly connected secondary address
This only matters to single hop nodes that are adjacent to the bsr. More common
with IPv6 where LL address is used in PIM as the primary address. If the BSR IP
happens to be an address on the same interface, the receiving pim router
rejects the BSR address because it expects the BSR IP to resolve via the LL address
even if we have a connected route for the same BSR IP subnet. Effectively, we want to
allow rpf to be resolved via secondary IPs with connected routes on the same interface,
and not limit them to primary addresses.
Enke Chen [Sun, 20 Oct 2024 19:25:46 +0000 (12:25 -0700)]
bgpd: allow value 0 in aigp-metric setting
The value of 0 is accepted from peers, and can also be set by the
route-map "set aigp-metric igp-metric". For coonsistency, it should
be allowed in "set aigp-metric <value>" as well.
Enke Chen [Wed, 16 Oct 2024 18:15:28 +0000 (11:15 -0700)]
bgpd: fix several issues in sourcing AIGP attribute
Fix several issues in sourcing AIGP attribute:
1) AIGP should not be set as default for a redistributed route or a
static network. It should be set by config instead.
2) AIGP sourced by "set aigp-metric igp-metric" in a route-map does
not set the correct value for a redistributed route.
3) When redistribute a connected route like loopback, the AGIP (with
value 0) is sourced by "set aigp-metric igp-metric", but the
attribute is not propagated as the attribute flag is not set.
Enke Chen [Tue, 15 Oct 2024 01:47:59 +0000 (18:47 -0700)]
tests: fix and adjust topotest/bgp_aigp
Fix and adjust the topotest post the fix for route selection with
AIGP.
When there are multiple IGP domains (OSPF in this case), the nexthop
for a BGP route with the AIGP attribute must be resolved in its own
IGP domain.
The changes in r2/bgpd.conf and r3/bgpd.conf are needed as incorrect
IGP metrics are received from NHT for the recursive nexthops. Once
the issue is resolved, the changes can be reverted.
Igor Zhukov [Fri, 4 Oct 2024 06:16:02 +0000 (13:16 +0700)]
zebra: Fix crash during reconnect
fpm_enqueue_rmac_table expects an fpm_rmac_arg* as its argument.
The issue can be reproduced by dropping the TCP session using:
ss -K dst 127.0.0.1 dport = 2620
I used Fedora 40 and frr 9.1.2 and I got the gdb backtrace:
(gdb) bt
0 0x00007fdd7d6997ea in fpm_enqueue_rmac_table (bucket=0x2134dd0, arg=0x2132b60) at zebra/dplane_fpm_nl.c:1217
1 0x00007fdd7dd1560d in hash_iterate (hash=0x21335f0, func=0x7fdd7d6997a0 <fpm_enqueue_rmac_table>, arg=0x2132b60) at lib/hash.c:252
2 0x00007fdd7dd1560d in hash_iterate (hash=0x1e5bf10, func=func@entry=0x7fdd7d698900 <fpm_enqueue_l3vni_table>,
arg=arg@entry=0x7ffed983bef0) at lib/hash.c:252
3 0x00007fdd7d698b5c in fpm_rmac_send (t=<optimized out>) at zebra/dplane_fpm_nl.c:1262
4 0x00007fdd7dd6ce22 in event_call (thread=thread@entry=0x7ffed983c010) at lib/event.c:1970
5 0x00007fdd7dd20758 in frr_run (master=0x1d27f10) at lib/libfrr.c:1213
6 0x0000000000425588 in main (argc=10, argv=0x7ffed983c2e8) at zebra/main.c:492
bgpd: Actually make ` --v6-with-v4-nexthops` it work
It was using `-v` which is actually a _version_.
Fixes: 0435b31bb8ed55377f83d0e19bc085abc3c71b44 ("bgpd: Allow bgp to specify if it will allow v6 routing with v4 nexthops") Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 0495cac837ad0f6ff1082746c37e4a48c1068035)
paths key is not there for
'show bgp l2vpn evpn route rd <rd-id> mac <mac> json' uses
evpn prefix as key for each path.
Replace the evpn prefix with "paths".
This aligned with overall EVPN RIB json output like
'show bgp l2vpn evpn route json'
'show bgp l2vpn evpn route rd <> type 2 json'