- bfdd: fix ipv4 socket source selection
- bgpd : fix crash for `set ipv4/ipv6 vpn next-hop` command
- bgpd: stop overriding nexthop when bgp unnumbered
- bgpd: fix aggregated routes are withdrawn abnormally
- bgpd: fix a few memory leaks
- build: enable pim6d by default
- build: fix sed regex in lua macro
- doc : add freebsd 13 build docs
- isisd: fix memory leak
- lib: disable vrf before terminating interfaces
- lib: do not log `echo ping` commands from watchfrr
- ospf6d: fix infinite loop when adding asbr route
- ospfd: fix rfc conformance test cases 25.19 and 27.6
- ospfd: fix typo and report the P2P link name in the warning
- ospfd: report the router IP with opaque capability mismatch
- ospfd: fixing memory leak
- pimd: consistently ignore prefix list mask len
- staticd: do not crash when modifying an existing static route with color
- zebra: free all memory associated ctx->u.iptable.interface_name_list
- zebra: fix tracepoint changes for lttng
- zebra: free up route map name memory on vrf deletion event
- zebra: use `mpls enable`, not `mpls` when generating a config
- tools: Ignore agentx command for frr-reload.py
David Lamparter [Fri, 6 Jan 2023 15:59:55 +0000 (16:59 +0100)]
lib: disable xref ELF note on mips64el
mips64el does not have a 64-bit PC-relative relocation, which is needed
to emit the ELF note for xrefs. Disabling the ELF note means clippy
takes the fallback path using section headers, so everything does still
work (... unless you strip the section headers.)
David Lamparter [Fri, 6 Jan 2023 15:55:38 +0000 (16:55 +0100)]
zebra: do not load/store wider-than-ptr atomics
Most 32-bit architectures cannot do atomic loads and stores of data
wider than their pointer size, i.e. 32 bit. Funnily enough they
generally *can* do a CAS2, i.e., 64-bit compare-and-swap, but while a
CAS can emulate atomic add/bitops, loads and stores aren't available.
Replace with a mutex; since this is 99% used from the zserv thread, the
mutex should take the local-to-thread fast path anyway. And while one
atomic might be faster than a mutex lock/unlock, we're doing several
here, and at some point a mutex wins on speed anyway.
This fixes build on armel, mipsel, m68k, powerpc, and sh4.
Donald Sharp [Wed, 21 Dec 2022 13:04:34 +0000 (08:04 -0500)]
vtysh: Remove double retrieve of env VTYSH_HISTFILE
The code is double checking the VTYSH_HISTFILE env variable,
additionally clang-15 when running SA over it doesn't fully
understand the code pattern. Reduce the double check to
one check to reduce program run-time (ha!) and make SA happy.
liuze03 [Wed, 7 Dec 2022 04:22:43 +0000 (12:22 +0800)]
bgpd: Fix aggregated routes are withdrawn abnormally.
The withdraw message and announcement message of a prefix are received continuously within 50ms, which may lead to abnormal aggregation route reference count.
Steps to reproduce:
--------------------------
step1:
local config aggregate route 111.0.0.0/24
received route:111.0.0.1/32 111.0.0.02/32
ref_count:2
step2:
peer withdraw 111.0.0.1/32 and network 111.0.0.1/32 in 50ms
received route:111.0.0.1/32 111.0.0.02/32
ref_count:1
step3:
peer withdraw 111.0.0.1/32
received route:111.0.0.02/32
ref_count:0
aggregate route will be withdrawn abnormally
David Lamparter [Mon, 2 Jan 2023 11:05:22 +0000 (12:05 +0100)]
debian: ditch git-buildpackage, clean changelog
Not enough time to figure out git-buildpackage right now.
Also the changelog contained upstream items that never existed in
Debian, those should never have gone into debian/changelog to begin
with. Strip them out.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Donald Sharp [Thu, 22 Dec 2022 00:26:58 +0000 (19:26 -0500)]
bgpd: bgp_connected_add memory was being leaked in some cases
On shutdown, bgp calls an unlock for the bnc connected table,
via the bgp_connected_cleanup function. This function is
only ever called on shutdown, so we know that bgp is going
away. The refcount for the connected data can be more than
1. Let's not worry about the refcount on shutdown and
just delete the nodes instead of leaving them around.
Donald Sharp [Wed, 21 Dec 2022 17:11:56 +0000 (12:11 -0500)]
bgpd: static routes are leaked on shutdown
Shutdown of bgp results in both the bgp_path_info,
bgp_dest and bgp_table's not being freed because
the bgp_path_info remains locked.
Effectively static routes are scheduled for deletion but bgp_process
skips the work because the work queue sees that the bgp router
is marked for deletion. Effectively not doing any work and leaving
data on the floor.
Modify the code when attempting to put into the work queue to
notice and not do so but just unlock the path info.
This is effectively the same as what goes on for normal peering
as that it checks for shutdown and just calls bgp_path_info_free
too.
Donatas Abraitis [Thu, 22 Dec 2022 15:55:40 +0000 (17:55 +0200)]
tools: Ignore agentx command for frr-reload.py
agentx can't be disabled once enabled, so we should ignore it for frr-reload.py.
```
$ /usr/lib/frr/frr-reload.py --reload /etc/frr/bgpd.conf --bindir /usr/local/bin
"no agentx" we failed to remove this command
SNMP AgentX support cannot be disabled once enabled
```
Ryoga Saito [Tue, 6 Dec 2022 17:27:16 +0000 (02:27 +0900)]
tests: Add topotest bgp_vrf_leaking_5549_routes
To verify previous changes, this PR introduces topotest to verify
whether imported routes learnt from BGP unnumbered peers will be active
on VPN RIB and other VRF RIB.
Ryoga Saito [Sun, 4 Dec 2022 14:51:45 +0000 (23:51 +0900)]
bgpd: Stop overriding nexthop when BGP unnumberred
When we use vrf-to-vrf export, the nexthop has already not been
overridden when the peer is BGP unnumberred. However, when we use normal
export, the nexthop will be oberridden. This behavior will make the VPN
routes invalid in VPN RIB.
This PR stops overriding nexthop even if we use normal export.
rgirada [Tue, 22 Nov 2022 09:59:40 +0000 (09:59 +0000)]
ospfd: Fixing memleak.
Description:
As part of signal handler ospf_finish_final(), lsas are originated
and added to refresh queues are not freed.
One such leak is :
==2869285== 432 (40 direct, 392 indirect) bytes in 1 blocks are definitely lost in loss record 159 of 221
==2869285== at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2869285== by 0x4910EC3: qcalloc (memory.c:116)
==2869285== by 0x199024: ospf_refresher_register_lsa (ospf_lsa.c:4017)
==2869285== by 0x199024: ospf_refresher_register_lsa (ospf_lsa.c:3979)
==2869285== by 0x19A37F: ospf_network_lsa_install (ospf_lsa.c:2680)
==2869285== by 0x19A37F: ospf_lsa_install (ospf_lsa.c:2941)
==2869285== by 0x19C18F: ospf_network_lsa_update (ospf_lsa.c:1099)
==2869285== by 0x1931ED: ism_change_state (ospf_ism.c:556)
==2869285== by 0x1931ED: ospf_ism_event (ospf_ism.c:596)
==2869285== by 0x494E0B0: thread_call (thread.c:2006)
==2869285== by 0x494E395: _thread_execute (thread.c:2098)
==2869285== by 0x19FBC6: nsm_change_state (ospf_nsm.c:695)
==2869285== by 0x19FBC6: ospf_nsm_event (ospf_nsm.c:861)
==2869285== by 0x494E0B0: thread_call (thread.c:2006)
==2869285== by 0x494E395: _thread_execute (thread.c:2098)
==2869285== by 0x19020B: ospf_if_cleanup (ospf_interface.c:322)
==2869285== by 0x192D0C: ism_interface_down (ospf_ism.c:393)
==2869285== by 0x193028: ospf_ism_event (ospf_ism.c:584)
==2869285== by 0x494E0B0: thread_call (thread.c:2006)
==2869285== by 0x494E395: _thread_execute (thread.c:2098)
==2869285== by 0x190F10: ospf_if_down (ospf_interface.c:851)
==2869285== by 0x1911D6: ospf_if_free (ospf_interface.c:341)
==2869285== by 0x1E6E98: ospf_finish_final (ospfd.c:748)
==2869285== by 0x1E6E98: ospf_deferred_shutdown_finish (ospfd.c:578)
==2869285== by 0x1E7727: ospf_finish (ospfd.c:682)
==2869285== by 0x1E7727: ospf_terminate (ospfd.c:652)
==2869285== by 0x18852B: sigint (ospf_main.c:105)
==2869285== by 0x493BE12: frr_sigevent_process (sigevent.c:130)
==2869285== by 0x494DCD4: thread_fetch (thread.c:1775)
==2869285== by 0x4905022: frr_run (libfrr.c:1197)
==2869285== by 0x187891: main (ospf_main.c:235)
Added a fix to cleanup all these queue pointers and corresponing lsas in it.
Rafael Zalamena [Thu, 24 Nov 2022 14:16:18 +0000 (11:16 -0300)]
bfdd: fix IPv4 socket source selection
The imported BFD code had some logic to ignore the source address when
using single hop IPv4. The BFD peer socket function should allow the
source to be selected so we can:
1. Select the source address in the outgoing packets
2. Only receive packets from that specific source
Renato Westphal [Thu, 24 Nov 2022 01:14:51 +0000 (22:14 -0300)]
ospf6d: fix infinite loop when adding ASBR route
Commit 8f359e1593c414322 removed a check that prevented the same route
from being added twice. In certain topologies, that change resulted in
the following infinite loop when adding an ASBR route:
ospf6_route_add
ospf6_top_brouter_hook_add
ospf6_abr_examin_brouter
ospf6_abr_examin_summary
ospf6_route_add
(repeat until stack overflow)
Revert the offending commit and update `ospf6_route_is_identical()` to
not do comparison using `memcmp()`.
Donald Sharp [Sat, 26 Nov 2022 14:23:50 +0000 (09:23 -0500)]
lib: Do not log `echo PING` commands from watchfrr
Since the `echo PING` commands are from watchfrr and are sent
a whole bunch when an operator has `log commands` on the amount
of logging done is quite significant.
ospfd: Fix RFC conformance test cases 25.19 and 27.6
Steps to reproduce:
--------------------------
1. ANVL: Establish full adjacency with DUT for neighbor Rtr-0-A on DIface-0 with DUT as DR.
2. ANVL: Listen (for up to 2 * <RxmtInterval> seconds) on DIface-0.
3. DUT: Send <OSPF-LSU> packet.
4. ANVL: Verify that the received <OSPF-LSU> packet contains a Network- LSA for network N1
originated by DUT, and the LS Sequence Number is set to <InitialSequenceNumber>.
5. ANVL: Establish full adjacency with DUT for neighbor Rtr-0-B on DIface-0 with DUT as DR.
6. ANVL: Listen (for up to 2 * <RxmtInterval> seconds) on DIface-0.
7. DUT: Send <OSPF-LSU> packet.
8. ANVL: Verify that the received <OSPF-LSU> packet contains a new instance of the
Network-LSA for network N1 originated by DUT, and the LS Sequence Number
is set to (<InitialSequenceNumber> + 1).
Both the test cases were failing while verifying the initial sequence number for network LSA.
This is because currently OSPF does not reset its LSA sequence number when it is going down.
Stephen Worley [Tue, 22 Nov 2022 22:18:02 +0000 (17:18 -0500)]
lib: disable vrf before terminating interfaces
We must disable the vrf before we start terminating interfaces.
On termination, we free the 'zebra_if' struct from the interface ->info
pointer. We rely on that for subsystems like vxlan for cleanup when
shutting down.
'''
==497406== Invalid read of size 8
==497406== at 0x47E70A: zebra_evpn_del (zebra_evpn.c:1103)
==497406== by 0x47F004: zebra_evpn_cleanup_all (zebra_evpn.c:1363)
==497406== by 0x4F2404: zebra_evpn_vxlan_cleanup_all (zebra_vxlan.c:1158)
==497406== by 0x4917041: hash_iterate (hash.c:267)
==497406== by 0x4F25E2: zebra_vxlan_cleanup_tables (zebra_vxlan.c:5676)
==497406== by 0x4D52EC: zebra_vrf_disable (zebra_vrf.c:209)
==497406== by 0x49A247F: vrf_disable (vrf.c:340)
==497406== by 0x49A2521: vrf_delete (vrf.c:245)
==497406== by 0x49A2E2B: vrf_terminate_single (vrf.c:533)
==497406== by 0x49A2D8F: vrf_terminate (vrf.c:561)
==497406== by 0x441240: sigint (main.c:192)
==497406== by 0x4981F6D: frr_sigevent_process (sigevent.c:130)
==497406== Address 0x6d68c68 is 200 bytes inside a block of size 272 free'd
==497406== at 0x48470E4: free (vg_replace_malloc.c:872)
==497406== by 0x4942CF0: qfree (memory.c:141)
==497406== by 0x49196A9: if_delete (if.c:293)
==497406== by 0x491C54C: if_terminate (if.c:1031)
==497406== by 0x49A2E22: vrf_terminate_single (vrf.c:532)
==497406== by 0x49A2D8F: vrf_terminate (vrf.c:561)
==497406== by 0x441240: sigint (main.c:192)
==497406== by 0x4981F6D: frr_sigevent_process (sigevent.c:130)
==497406== by 0x499A5F0: thread_fetch (thread.c:1775)
==497406== by 0x492850E: frr_run (libfrr.c:1197)
==497406== by 0x441746: main (main.c:476)
==497406== Block was alloc'd at
==497406== at 0x4849464: calloc (vg_replace_malloc.c:1328)
==497406== by 0x49429A5: qcalloc (memory.c:116)
==497406== by 0x491D971: if_new (if.c:174)
==497406== by 0x491ACC8: if_create_name (if.c:228)
==497406== by 0x491ABEB: if_get_by_name (if.c:613)
==497406== by 0x427052: netlink_interface (if_netlink.c:1178)
==497406== by 0x43BC18: netlink_parse_info (kernel_netlink.c:1188)
==497406== by 0x4266D7: interface_lookup_netlink (if_netlink.c:1288)
==497406== by 0x42B634: interface_list (if_netlink.c:2368)
==497406== by 0x4ABF83: zebra_ns_enable (zebra_ns.c:127)
==497406== by 0x4AC17E: zebra_ns_init (zebra_ns.c:216)
==497406== by 0x44166C: main (main.c:408)
'''
David Lamparter [Fri, 4 Nov 2022 15:55:55 +0000 (16:55 +0100)]
pimd: consistently ignore prefix list mask len
... the prefix length wasn't ignored as expected. Both S and G are
always /32. But expecting "le 32" in prefix-list config is unexpected &
counterintuitive.
- zebra: reuse netinet/if_ether.h to avoid redefinition of struct ethhdr
- zebra: fix build without AF_MPLS
- ospfd: get route-map name for default-information originate
- ospfd: allow unnumbered and numbered addresses to co-exist better
- ospfd: prevent from crashing when processing external lsa
- bgpd: fix "storing the address of local variable"
- bgpd: rpki was decrementing the node lock one time too many
- bgpd: do not send Deconfig/Shutdown message when restarting
- pimd: convert zlog_warn to debug
Ryoga Saito [Sat, 12 Nov 2022 08:45:19 +0000 (17:45 +0900)]
bgpd: fix invalid ipv4-vpn nexthop for IPv6 peer
Given that two routers are connected each other and they have IPv6
addresses and they establish BGP peer with extended-nexthop capability
and one router tries to advertise locally-generated IPv4-VPN routes to
other router.
In this situation, bgpd on the router that tries to advertise IPv4-VPN
routes will be crashed with "invalid MP nexthop length (AFI IP6)".
This issue is happened because MP_REACH_NLRI path attribute is not
generated correctly when ipv4-vpn routes are advertised to IPv6 peer.
When IPv4 routes are leaked from VRF RIB, the nexthop of these routes
are also IPv4 address (0.0.0.0/0 or specific addresses). However,
bgp_packet_mpattr_start only covers the case of IPv6 nexthop (for IPv6
peer).
ipv4-unicast routes were not affected by this issue because the case of
IPv4 nexthop is covered in `else` block.
Louis Scalbert [Thu, 10 Nov 2022 16:18:01 +0000 (17:18 +0100)]
isisd: fix area NULL pointer in isis_route_update
Fix the case area is NULL.
Fixes: acc0029779 ("isisd: fix potential access to NULL pointer in isis_route_update") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 41c2bf853f072008330a929480bf235ac840212a)
bgpd: authorise to select bgp self peer prefix on rr case
This commit addresses an issue that happens when using bgp
peering with a rr client, with a received prefix which is the
local ip address of the bgp session.
When using bgp ipv4 unicast session, the local prefix is
received by a peer, and finds out that the proposed prefix
and its next-hop are the same. To avoid a route loop locally,
no nexthop entry is referenced for that prefix, and the route
will not be selected.
When the received peer is a route reflector, the prefix has
to be selected, even if the route can not be installed locally.
Fixes: ("fb8ae704615c") bgpd: prevent routes loop through itself Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit f874552557cf2577ae91b297f422e307084f22d3)
Donald Sharp [Mon, 14 Nov 2022 13:28:45 +0000 (08:28 -0500)]
zebra: Fix dplane_fpm_nl to allow for fast configuration
If you have this order in your configuration file:
no fpm use-next-hop-groups
fpm address 127.0.0.1
the dplane code was using the same event thread t_event and the second
add event in the code was going, you already have an event scheduled
and as such the second event does not overwrite it. Leaving
no code to actually start the whole processing. There are probably
other cli iterations that will cause this fun as well, but I'm
not going to spend the time sussing them out at the moment.