isisd: Fix memory leaks when the transition of neighbor state from non-UP to DOWN
When receiving a hello packet, if the neighbor state transitions directly from a non-ISIS_ADJ_UP state (such as ISIS_ADJ_INITIALIZING) to ISIS_ADJ_DOWN state, the neighbor entry cannot be deleted. If the neighbor is removed or the neighbor's System ID changes, it may result in memory leakage in the neighbor entry.
Test Scenario:
LAN link between Router A and Router B is established. Router A does not configure neighbor authentication, while Router B is configured with neighbor authentication. When the neighbor entry on Router B ages out, the neighbor state on Router A transitions to INIT. If Router B is then removed, the neighbor state on Router A transitions to DOWN and persists.
isisd: The hold time of hello packets on a P2P link does not match the sending interval.
The hold time filled in the hello packets of a P2P link is calculated based on the level 1 configuration, while the hello timer is based on the level 2 configuration. If the hello interval times in level 1 and level 2 configurations are inconsistent, it may lead to neighbor establishment failure.
Philippe Guibert [Thu, 28 Mar 2024 14:56:19 +0000 (15:56 +0100)]
topotests: fix ignore routes with linkdown
In topotest, a given interface has only the ignore routes bit turned
on for IPv6 only, whereas topotest is expected to turn it on for all
address families.
> # show interface
> Interface r2-r3-eth2 is up, line protocol is up
> [..]
> flags: <UP,BROADCAST,RUNNING,MULTICAST>
> Ignore all v6 routes with linkdown
> Type: Ethernet
> [..]
This is because the only the 'default' ipv6 ignore sysctl is set to
1. Set also the /proc/sys/net/conf/<family>/default/ignore_routes_with_linkdown
flag, to have same behaviour for ipv4 and ipv6.
Fixes: 4958158787ce ("tests: micronet: update infra") Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
---------------------------------------------------------------------------- live log call -----------------------------------------------------------------------------
2024-03-29 18:12:22,608 INFO: r1: checking if daemons are running
2024-03-29 18:12:22,802 INFO: r2: checking if daemons are running
2024-03-29 18:12:22,911 INFO: r3: checking if daemons are running
2024-03-29 18:12:23,015 INFO: topo: Remove bgp peer-group PG1 remote-as neighbor should be retained
2024-03-29 18:12:25,605 INFO: topo: Re-add bgp peer-group PG1 remote-as neighbor should be established
----------------------------------------------------------- generated xml file: /tmp/topotests/topotests.xml -----------------------------------------------------------
========================================================================== 2 passed in 17.63s ==========================================================================
Philippe Guibert [Fri, 29 Mar 2024 11:07:14 +0000 (12:07 +0100)]
bgpd: add resolved_prefix visibility on nht
The nexthop tracking never displays the prefix that
has been used in ZEBRA to resolve its nexthop. This
information will be useful if some decision has to be
taken regarding any loops, that is to say if for instance
a BGP prefix is resolved over a prefix in ZEBRA that is
exactly the same.
Store the value in bgp nexthop context, and display it.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Piotr Suchy [Thu, 28 Mar 2024 11:55:35 +0000 (12:55 +0100)]
vtysh, zebra: Fix malformed json output for multiple vrfs in command 'show ip route vrf all json'
Command 'show ip route vrf <vrf_name> json' returns a valid json object,
however if instead of <vrf_name> we specify 'all', we get an invalid json
object, like:
Philippe Guibert [Fri, 29 Mar 2024 07:35:34 +0000 (08:35 +0100)]
bgpd: fix srv6 memory leak detection
The asan memory leak has been detected:
> Direct leak of 16 byte(s) in 1 object(s) allocated from:
> #0 0x7f9066dadd28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
> #1 0x7f9066779b5d in qcalloc lib/memory.c:105
> #2 0x556d6ca527c2 in vpn_leak_zebra_vrf_sid_update_per_af bgpd/bgp_mplsvpn.c:389
> #3 0x556d6ca530e1 in vpn_leak_zebra_vrf_sid_update bgpd/bgp_mplsvpn.c:451
> #4 0x556d6ca64b3b in vpn_leak_postchange bgpd/bgp_mplsvpn.h:311
> #5 0x556d6ca64b3b in vpn_leak_postchange_all bgpd/bgp_mplsvpn.c:3751
> #6 0x556d6cb9f116 in bgp_zebra_process_srv6_locator_chunk bgpd/bgp_zebra.c:3337
> #7 0x7f906685a6b6 in zclient_read lib/zclient.c:4490
> #8 0x7f9066826a32 in event_call lib/event.c:2011
> #9 0x7f906675c444 in frr_run lib/libfrr.c:1217
> #10 0x556d6c980d52 in main bgpd/bgp_main.c:545
> #11 0x7f9065784c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
Fix this by freeing the previous memory chunk.
Fixes: b72c9e14756f ("bgpd: cli for SRv6 SID alloc to redirect to vrf (step4)") Fixes: 527588aa78b2 ("bgpd: add support for per-VRF SRv6 SID") Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Donatas Abraitis [Wed, 27 Mar 2024 17:08:38 +0000 (19:08 +0200)]
bgpd: Prevent from one more CVE triggering this place
If we receive an attribute that is handled by bgp_attr_malformed(), use
treat-as-withdraw behavior for unknown (or missing to add - if new) attributes.
Donald Sharp [Thu, 28 Mar 2024 16:27:38 +0000 (12:27 -0400)]
bgpd: Arrange peer notification to after zebra announce
Currently BGP attempts to send route change information
to it's peers *before* the route is installed into zebra.
This creates a bug in suppress-fib-pending in the following
scenario:
a) bgp suppress-fib-pending and bgp has a route with
2 way ecmp.
b) bgp receives a route withdraw from peer 1. BGP
will send the route to zebra and mark the route as
FIB_INSTALL_PENDING.
c) bgp receives a route withdraw from peer 2. BGP
will see the route has the FIB_INSTALL_PENDING and
not send the withdrawal of the route to the peer.
bgp will then send the route deletion to zebra and
clean up the bgp_path_info's.
At this point BGP is stuck where it has not sent
a route withdrawal to downstream peers.
Let's modify the code in bgp_process_main_one to
send the route notification to zebra first before
attempting to announce the route. The route withdrawal
will remove the FIB_INSTALL_PENDING flag from the dest
and this will allow group_announce_route to believe
it can send the route withdrawal.
For the master branch this is ok because the recent
backpressure commits are in place and nothing is going
to change from an ordering perspective in that regards.
Ostensibly this fix is also for operators of Sonic and
will be backported to the 8.5 branch as well. This will
change the order of the send to peers to be after the
zebra installation but sonic users are using suppress-fib-pending
anyways so updates won't go out until rib ack has been
received anyways.
Donald Sharp [Thu, 28 Mar 2024 16:25:05 +0000 (12:25 -0400)]
bgpd: Note when receiving but not understanding a route notification
When BGP has been asked to wait for FIB installation, on route
removal a return call is likely to not have the dest since BGP
will have cleaned up the node, entirely. Let's just note that
the prefix cannot be found if debugs are turned on and move on.
Donatas Abraitis [Wed, 27 Mar 2024 16:42:56 +0000 (18:42 +0200)]
bgpd: Fix error handling when receiving BGP Prefix SID attribute
Without this patch, we always set the BGP Prefix SID attribute flag without
checking if it's malformed or not. RFC8669 says that this attribute MUST be discarded.
Also, this fixes the bgpd crash when a malformed Prefix SID attribute is received,
with malformed transitive flags and/or TLVs.
BGP is now keeping a list of dests with the dest having a pointer
to the bgp_path_info that it will be working on.
1) When bgp receives a prefix, process it, add the bgp_dest of the
prefix into the new Fifo list if not present, update the flags (Ex:
earlier if the prefix was advertised and now it is a withdrawn),
increment the ref_count and DO NOT advertise the install/withdraw
to zebra yet.
2) Schedule an event to wake up to invoke the new function which will
walk the list one by one and installs/withdraws the routes into zebra.
a) if BUFFER_EMPTY, process the next item on the list
b) if BUFFER_PENDING, bail out and the callback in
zclient_flush_data() will invoke the same function when BUFFER_EMPTY
Changes
- rename old bgp_zebra_announce to bgp_zebra_announce_actual
- rename old bgp_zebra_withdrw to bgp_zebra_withdraw_actual
- Handle new fifo list cleanup in bgp_exit()
- New funcs: bgp_handle_route_announcements_to_zebra() and
bgp_zebra_route_install()
- Define a callback function to invoke
bgp_handle_route_announcements_to_zebra() when BUFFER_EMPTY in
zclient_flush_data()
The current change deals with bgp installing routes via
bgp_process_main_one()
Since installing/withdrawing routes into zebra is going to be changed
around to be dest based in a list,
- Retrieve the afi/safi to use based upon the dest's afi/safi
instead of passing it in.
- Prefix is known by the dest. Remove this arg as well
Louis Scalbert [Fri, 22 Mar 2024 09:15:45 +0000 (10:15 +0100)]
zebra: fix rejected route due to wrong nexthop-group
A specific sequence of actions involving the addition and removal of IP
routes and network interfaces can lead to a route installation failure.
The issue occurs under the following conditions:
- Initially, there is no route present via the ens3 interface.
- Adds a route: ip route 10.0.0.0/24 192.168.0.100 ens3
- Removes the same route: no ip route 10.0.0.0/24 192.168.0.100 ens3
- Removes the ens3 interface.
- Re-adds the ens3 interface.
- Again adds the same route: ip route 10.0.0.0/24 192.168.0.100 ens3
- And again removes it: no ip route 10.0.0.0/24 192.168.0.100 ens3
- Shuts down the ens3 interface
- Reactivates the interface
- Adds the route once more: ip route 10.0.0.0/24 192.168.0.100 ens3
The route appears to be rejected.
> # show ip route nexthop
> S>r 10.0.0.0/24 [1/0] (6) via 192.168.0.100, ens3, weight 1, 00:00:01
The commit 35729f38fa ("zebra: Add a timer to nexthop group deletion")
introduced a feature to keep a nexthop-group in Zebra for a certain
period even when it is no longer in use. But if a nexthop-group
interface is removed during this period, the association between the
nexthop-group and the interface is lost in zebra memory. If the
interface is later added back and a route is re-established, the
nexthop-group interface dependency is not correctly reestablished.
As a consequence, the nexthop-group flags remain unset when the
interface is down. Upon the interface's reactivation, zebra does not
reinstall the nexthop-group in the kernel because it is marked as valid
and installed, but in reality, it does not exist in the kernel (it was
removed when the interface was down). Thus, attempts to install a route
via this nexthop-group ID fail.
Stop maintaining a nexthop-group when its associated interface is no
longer present.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
bgpd: Enable BGP dynamic capability by default for datacenter profile
Dynamic capability provides more value without resetting the sessions for some
important other capabilities to exchange, like: graceful-restart, addpath, orf,
fqdn, etc.
Since we support it already, enable it by default.
Chirag Shah [Sat, 16 Mar 2024 01:18:42 +0000 (18:18 -0700)]
bgpd: do not del peer upon pg remote as change
Currently, when peer-group remote-as is removed, it
deletes all associated neighbors.
Upon re configuring peer-group remote-as, all neighbors
needs to be reconfigured.
Instead, when peer-group remote-as is remove,
cease associated peer's connection and keep in Idle state.
When the peer-group remote-as is (re)configured, trigger
BGP Peer FSM to form neighbor.
Note the connection will be initiated after start timer
expiry.
Acee Lindem [Thu, 14 Mar 2024 10:05:27 +0000 (10:05 +0000)]
ospfd: Send LS Updates in response to LS Request as unicast.
With this fix, OSPF LS Updates sent in response to OSPF LS Requests during the DB Exchange process will be sent as unicasts. Unless the timing of multiple database exchanges coincides, there is little chance that the LSAs in the LS Update are required by OSPF routers other than the one which elicited the LS Update.
This is somewhat ambigous in RFC 2328 and two errata have been filed for clarification:
FRR OSPFv3 (ospf6d) already does it correctly - see ospf6_lsupdate_send_neighbor(struct event *thread). Also, if there is any doubt, one can refer to the C++ code at ospf.org (John Moy's seminal OSPF reference implementation).
Igor Ryzhov [Sun, 17 Mar 2024 20:44:28 +0000 (22:44 +0200)]
lib: remove nb/yang memory cleanup when daemonizing
We're not calling any other termination functions to free allocated
memory when daemonizing except these two. There's no reason for such an
exception, and because of these calls we have the following libyang
warnings every time FRR is started:
```
MGMTD: libyang: String "15" not freed from the dictionary, refcount 2
MGMTD: libyang: String "200" not freed from the dictionary, refcount 2
MGMTD: libyang: String "mrib-then-urib" not freed from the dictionary, refcount 2
MGMTD: libyang: String "1000" not freed from the dictionary, refcount 2
MGMTD: libyang: String "10" not freed from the dictionary, refcount 2
MGMTD: libyang: String "5" not freed from the dictionary, refcount 2
```
Remove these calls to get rid of the unnecessary warnings.
Donald Sharp [Fri, 15 Mar 2024 16:10:58 +0000 (12:10 -0400)]
lib: Prevent crash then another crash from happening
When a memory operation (malloc/free/... ) causes a crash
and the call to core_handler causes another crash then
instead of actually writing a core dump the alarm is
hit and the daemon in trouble will not cause a core dump.
Modify the shutdown code to just try to dump the buffers
and leave instead of cleaning up after itself.
Back Trace:
(gdb) bt
0 0x00007f17082ec056 in __lll_lock_wait_private () from /lib/x86_64-linux-gnu/libc.so.6
1 0x00007f17082fc8bd in ?? () from /lib/x86_64-linux-gnu/libc.so.6
2 0x00007f17082fee8f in free () from /lib/x86_64-linux-gnu/libc.so.6
3 0x00007f170866c2ea in qfree (mt=<optimized out>, ptr=<optimized out>) at lib/memory.c:141
4 0x00007f17086c156a in zlog_tls_free (arg=0x55584f816fb0) at lib/zlog.c:390
5 zlog_tls_buffer_fini () at lib/zlog.c:346
6 0x00007f1708695e5f in core_handler (signo=11, siginfo=0x7ffd173229f0, context=<optimized out>) at lib/sigevent.c:264
7 <signal handler called>
8 0x00007f17082fd7bc in ?? () from /lib/x86_64-linux-gnu/libc.so.6
9 0x00007f17082ff6e2 in calloc () from /lib/x86_64-linux-gnu/libc.so.6
10 0x00007f1708451e78 in lh_table_new () from /lib/x86_64-linux-gnu/libjson-c.so.5
11 0x00007f170844c979 in json_object_new_object () from /lib/x86_64-linux-gnu/libjson-c.so.5
12 0x000055584e002fd9 in evpn_show_all_routes (vty=vty@entry=0x55584fb5ea00, bgp=bgp@entry=0x55584f82c600, type=<optimized out>, json=json@entry=0x55584f998130, detail=<optimized out>,
self_orig=<optimized out>) at bgpd/bgp_evpn_vty.c:3192
13 0x000055584e009ed6 in show_bgp_l2vpn_evpn_route (self=<optimized out>, vty=0x55584fb5ea00, argc=6, argv=0x55584f998970) at bgpd/bgp_evpn_vty.c:5048
14 0x00007f170863af60 in cmd_execute_command_real (vline=vline@entry=0x55584fa87cb0, vty=vty@entry=0x55584fb5ea00, cmd=cmd@entry=0x0, up_level=up_level@entry=0, filter=FILTER_RELAXED)
at lib/command.c:1030
15 0x00007f170863b2be in cmd_execute_command (vline=vline@entry=0x55584fa87cb0, vty=vty@entry=0x55584fb5ea00, cmd=cmd@entry=0x0, vtysh=vtysh@entry=0) at lib/command.c:1089
16 0x00007f170863b550 in cmd_execute (vty=vty@entry=0x55584fb5ea00, cmd=cmd@entry=0x55584fb65160 "sh bgp l2vpn evpn route json", matched=matched@entry=0x0, vtysh=vtysh@entry=0)
at lib/command.c:1257
17 0x00007f17086acc77 in vty_command (vty=vty@entry=0x55584fb5ea00, buf=0x55584fb65160 "sh bgp l2vpn evpn route json") at lib/vty.c:503
18 0x00007f17086ad444 in vty_execute (vty=vty@entry=0x55584fb5ea00) at lib/vty.c:1266
19 0x00007f17086b06c8 in vtysh_read (thread=<optimized out>) at lib/vty.c:2165
20 0x00007f17086a798d in thread_call (thread=thread@entry=0x7ffd17325ce0) at lib/thread.c:2008
21 0x00007f1708660568 in frr_run (master=0x55584f22a120) at lib/libfrr.c:1223
22 0x000055584dfc8c96 in main (argc=<optimized out>, argv=<optimized out>) at bgpd/bgp_main.c:555
Donatas Abraitis [Fri, 15 Mar 2024 11:49:06 +0000 (13:49 +0200)]
bgpd: Update default-originate route-map actual map structure
If using with `bgp listen range ... peer-group x`, default_rmap[afi][safi] is not
updated, and after the hard-reset in other side, this is flushed and never updated
again without restarting the sender BGP daemon.
Split zebra's vrf_terminate() into disable() and delete() stages.
The former enqueues all events for the dplane thread.
Memory freeing is performed in the second stage.
Signed-off-by: Alexander Skorichenko <askorichenko@netgate.com>