Donald Sharp [Thu, 1 Dec 2022 18:12:40 +0000 (13:12 -0500)]
bgpd: Fix several use after free's in bgp for the peer
Three fixes:
a) When calling bgp_fsm_change_status with `Deleted` do
not add a new event to the peer's t_event because
we are already in the process of deleting everything
b) When bgp_stop decides to delete a peer return a notification
that it is happening to bgp_event_update so that it does not
set the peer state back to idle or do other processing.
c) bgp_event_update can cause a peer deletion, because
the peer can be deleted in the fsm function but the peer
data structure is still pointed to and used after words.
So lock the peer before entering and prevent a use after
free.
Donald Sharp [Tue, 29 Nov 2022 14:00:39 +0000 (09:00 -0500)]
bgpd: peer creation now takes care of the su
At some point in the past the peer creation was not
properly setting the su and the code had the release
and re-add when setting the su. Since peer_create
got a bit of code to handle the su properly the
need to release then add it back in is negated
so remove the code.
The `behavior usid` command is installed under the SRv6 Locator node in
the zebra VTY. However, in the SRv6 config write function this command
is wrongly put on the same line as the `prefix X:X::X:X/M` command.
This causes a failure when an SRv6 uSID locator is configured in zebra
and `frr-reload.py` is used to reload the FRR configuration.
This commit prepends a newline character to the `behavior usid` command
in the SRv6 config write function. The output of `show running-config`
before and after this commit is shown below.
Before:
```
Building configuration...
Current configuration:
!
frr version 8.5-dev
!
segment-routing
srv6
locators
locator loc1
prefix fc00:0:1::/48 block-len 32 node-len 16 behavior usid
exit
!
exit
!
exit
!
exit
!
end
```
Ryoga Saito [Sun, 4 Dec 2022 07:51:24 +0000 (16:51 +0900)]
tests: Fix topotests for bgp_srv6l3vpn
In bgp_srv6l3vpn tests, check_ping checks reachability. However, this
function have a bug and if we set expect_connected to True, check will
pass even if all ping packets are lost. This commit fixes this issue.
```
r3# sh ip bgp ipv4 labeled-unicast neighbors 192.168.34.4 advertised-routes
BGP table version is 3, local router ID is 192.168.34.3, vrf id 0
Default local pref 100, local AS 65003
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 10.0.0.1/32 0.0.0.0 0 65001 ?
The `bgp_srv6l3vpn_to_bgp_vrf2` topotest tests the SRv6 IPv4 L3VPN
functionality. It applies the appropriate configuration in `bgpd` and
`zebra`, and then checks that the RIB is updated correctly.
The topotest expects to find the AS-Path in the RIB, which is only
present if the `bgp send-extra-data zebra` option is enabled in the
`bgpd` configuration.
Currently, the `bgp send-extra-data zebra` option is not set in the
`bgpd` configuration, which always causes the topotest to fail.
This commit fixes the `bgp_srv6l3vpn_to_bgp_vrf2` topotest by enabling
the `bgp send-extra-data zebra` option for both routers `r1` and `r2`.
show ip/ipv6 nht vrf <all | name> json support added.
Commands enhanced with JSON:
----------------------------
show ip nht json
show ip nht <addr> json
show ipv6 nht json
show ipv6 nht <addr> json
show ip nht vrf <name> json
show ip nht vrf all json
show ipv6 nht vrf <name> json
show ipv6 nht vrf all json
show ip nht vrf default <addr> json
show ipv6 nht vrf default <addr> json
Donald Sharp [Mon, 28 Nov 2022 14:41:03 +0000 (09:41 -0500)]
ospf6d: ospf6_route_cmp_nexthops make return sane
The ospf6_route_cmp_nexthops function was returning 0 for same
and 1 for not same. Let's reverse the polarity and actually make
the returns useful long term.
Donald Sharp [Sat, 26 Nov 2022 14:23:50 +0000 (09:23 -0500)]
lib: Do not log `echo PING` commands from watchfrr
Since the `echo PING` commands are from watchfrr and are sent
a whole bunch when an operator has `log commands` on the amount
of logging done is quite significant.
Renato Westphal [Thu, 24 Nov 2022 01:14:51 +0000 (22:14 -0300)]
ospf6d: fix infinite loop when adding ASBR route
Commit 8f359e1593c414322 removed a check that prevented the same route
from being added twice. In certain topologies, that change resulted in
the following infinite loop when adding an ASBR route:
ospf6_route_add
ospf6_top_brouter_hook_add
ospf6_abr_examin_brouter
ospf6_abr_examin_summary
ospf6_route_add
(repeat until stack overflow)
Revert the offending commit and update `ospf6_route_is_identical()` to
not do comparison using `memcmp()`.
Rafael Zalamena [Thu, 24 Nov 2022 14:16:18 +0000 (11:16 -0300)]
bfdd: fix IPv4 socket source selection
The imported BFD code had some logic to ignore the source address when
using single hop IPv4. The BFD peer socket function should allow the
source to be selected so we can:
1. Select the source address in the outgoing packets
2. Only receive packets from that specific source
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Abhishek N R [Wed, 16 Nov 2022 08:08:05 +0000 (00:08 -0800)]
pimd, pim6d: Using ttable for displaying "show ip/ipv6 pim nexthop" command output
Before:
R4(config)# do show ipv6 pim nexthop
Number of registered addresses: 6
Address Interface Nexthop
---------------------------------------------
3700:1234:1234:1234:1234:1234:1234:1234 ens161 fe80::250:56ff:feb7:d8d5
5101::10 ens224.51 5101::10
3300::5555 ens161 fe80::250:56ff:feb7:d8d5
4400::1 ens161 fe80::250:56ff:feb7:d8d5
1020::10 ens257 1020::10
3000::1 ens192.4010 fe80::250:56ff:feb7:493b
3000::1 ens193.4015 fe80::250:56ff:feb7:b12a
After:
frr# show ipv6 pim nexthop
Number of registered addresses: 2
Address Interface Nexthop
---------------------------------------------
105::105 lo 105::105
12::1 ens192 fe80::250:56ff:feb7:38de
Stephen Worley [Tue, 22 Nov 2022 22:18:02 +0000 (17:18 -0500)]
lib: disable vrf before terminating interfaces
We must disable the vrf before we start terminating interfaces.
On termination, we free the 'zebra_if' struct from the interface ->info
pointer. We rely on that for subsystems like vxlan for cleanup when
shutting down.
'''
==497406== Invalid read of size 8
==497406== at 0x47E70A: zebra_evpn_del (zebra_evpn.c:1103)
==497406== by 0x47F004: zebra_evpn_cleanup_all (zebra_evpn.c:1363)
==497406== by 0x4F2404: zebra_evpn_vxlan_cleanup_all (zebra_vxlan.c:1158)
==497406== by 0x4917041: hash_iterate (hash.c:267)
==497406== by 0x4F25E2: zebra_vxlan_cleanup_tables (zebra_vxlan.c:5676)
==497406== by 0x4D52EC: zebra_vrf_disable (zebra_vrf.c:209)
==497406== by 0x49A247F: vrf_disable (vrf.c:340)
==497406== by 0x49A2521: vrf_delete (vrf.c:245)
==497406== by 0x49A2E2B: vrf_terminate_single (vrf.c:533)
==497406== by 0x49A2D8F: vrf_terminate (vrf.c:561)
==497406== by 0x441240: sigint (main.c:192)
==497406== by 0x4981F6D: frr_sigevent_process (sigevent.c:130)
==497406== Address 0x6d68c68 is 200 bytes inside a block of size 272 free'd
==497406== at 0x48470E4: free (vg_replace_malloc.c:872)
==497406== by 0x4942CF0: qfree (memory.c:141)
==497406== by 0x49196A9: if_delete (if.c:293)
==497406== by 0x491C54C: if_terminate (if.c:1031)
==497406== by 0x49A2E22: vrf_terminate_single (vrf.c:532)
==497406== by 0x49A2D8F: vrf_terminate (vrf.c:561)
==497406== by 0x441240: sigint (main.c:192)
==497406== by 0x4981F6D: frr_sigevent_process (sigevent.c:130)
==497406== by 0x499A5F0: thread_fetch (thread.c:1775)
==497406== by 0x492850E: frr_run (libfrr.c:1197)
==497406== by 0x441746: main (main.c:476)
==497406== Block was alloc'd at
==497406== at 0x4849464: calloc (vg_replace_malloc.c:1328)
==497406== by 0x49429A5: qcalloc (memory.c:116)
==497406== by 0x491D971: if_new (if.c:174)
==497406== by 0x491ACC8: if_create_name (if.c:228)
==497406== by 0x491ABEB: if_get_by_name (if.c:613)
==497406== by 0x427052: netlink_interface (if_netlink.c:1178)
==497406== by 0x43BC18: netlink_parse_info (kernel_netlink.c:1188)
==497406== by 0x4266D7: interface_lookup_netlink (if_netlink.c:1288)
==497406== by 0x42B634: interface_list (if_netlink.c:2368)
==497406== by 0x4ABF83: zebra_ns_enable (zebra_ns.c:127)
==497406== by 0x4AC17E: zebra_ns_init (zebra_ns.c:216)
==497406== by 0x44166C: main (main.c:408)
'''
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Ryoga Saito [Tue, 22 Nov 2022 13:57:24 +0000 (22:57 +0900)]
bgpd: Fix the other of SR locator parameters
The latest FRR's frr-reload.py is broken and we can't reload FRR
gracefully with segment routing locator configuration (if we
execute frr-reload.py, FRR will stop suddenly).
The root cause of this issue is very simple. FRR will display the
current configuration like this (the below is the result of
"show running-configuration").
rgirada [Tue, 22 Nov 2022 09:59:40 +0000 (09:59 +0000)]
ospfd: Fixing memleak.
Description:
As part of signal handler ospf_finish_final(), lsas are originated
and added to refresh queues are not freed.
One such leak is :
==2869285== 432 (40 direct, 392 indirect) bytes in 1 blocks are definitely lost in loss record 159 of 221
==2869285== at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2869285== by 0x4910EC3: qcalloc (memory.c:116)
==2869285== by 0x199024: ospf_refresher_register_lsa (ospf_lsa.c:4017)
==2869285== by 0x199024: ospf_refresher_register_lsa (ospf_lsa.c:3979)
==2869285== by 0x19A37F: ospf_network_lsa_install (ospf_lsa.c:2680)
==2869285== by 0x19A37F: ospf_lsa_install (ospf_lsa.c:2941)
==2869285== by 0x19C18F: ospf_network_lsa_update (ospf_lsa.c:1099)
==2869285== by 0x1931ED: ism_change_state (ospf_ism.c:556)
==2869285== by 0x1931ED: ospf_ism_event (ospf_ism.c:596)
==2869285== by 0x494E0B0: thread_call (thread.c:2006)
==2869285== by 0x494E395: _thread_execute (thread.c:2098)
==2869285== by 0x19FBC6: nsm_change_state (ospf_nsm.c:695)
==2869285== by 0x19FBC6: ospf_nsm_event (ospf_nsm.c:861)
==2869285== by 0x494E0B0: thread_call (thread.c:2006)
==2869285== by 0x494E395: _thread_execute (thread.c:2098)
==2869285== by 0x19020B: ospf_if_cleanup (ospf_interface.c:322)
==2869285== by 0x192D0C: ism_interface_down (ospf_ism.c:393)
==2869285== by 0x193028: ospf_ism_event (ospf_ism.c:584)
==2869285== by 0x494E0B0: thread_call (thread.c:2006)
==2869285== by 0x494E395: _thread_execute (thread.c:2098)
==2869285== by 0x190F10: ospf_if_down (ospf_interface.c:851)
==2869285== by 0x1911D6: ospf_if_free (ospf_interface.c:341)
==2869285== by 0x1E6E98: ospf_finish_final (ospfd.c:748)
==2869285== by 0x1E6E98: ospf_deferred_shutdown_finish (ospfd.c:578)
==2869285== by 0x1E7727: ospf_finish (ospfd.c:682)
==2869285== by 0x1E7727: ospf_terminate (ospfd.c:652)
==2869285== by 0x18852B: sigint (ospf_main.c:105)
==2869285== by 0x493BE12: frr_sigevent_process (sigevent.c:130)
==2869285== by 0x494DCD4: thread_fetch (thread.c:1775)
==2869285== by 0x4905022: frr_run (libfrr.c:1197)
==2869285== by 0x187891: main (ospf_main.c:235)
Added a fix to cleanup all these queue pointers and corresponing lsas in it.
pim6d, pimd: Discard (*,G) prune if WC bit is set but RPT bit is unset.
As per RFC 7761, Section 4.9.1
The RPT (or Rendezvous Point Tree) bit is a 1-bit value for use
with PIM Join/Prune messages (see Section 4.9.5.1). If the
WC bit is 1, the RPT bit MUST be 1.
ANVL conformance test case is trying to verify this and is failing.
pim6d, pimd: Discard (*,G) join if WC bit is set but RPT bit is unset.
As per RFC 7761, Section 4.9.1
The RPT (or Rendezvous Point Tree) bit is a 1-bit value for use
with PIM Join/Prune messages (see Section 4.9.5.1). If the
WC bit is 1, the RPT bit MUST be 1.
ANVL conformance test case is trying to verify this and is failing.