Soman K S [Sun, 18 Oct 2020 11:49:32 +0000 (17:19 +0530)]
ospfd: External LSA not flushed when area is configured as nssa or stub
Issue:
When the ospf area is changed from default to nssa or stub, the previously
advertised external LSAs are not removed from the neighbor.
The LSAs remain in database till maxage timeout.
Fix:
Advertise the external LSAs with age set to maxage and flood to the
nssa or stub area.
Donald Sharp [Fri, 16 Oct 2020 17:51:52 +0000 (13:51 -0400)]
zebra: Fix use after free in debug path
When zebra is running with debugs turned on there
is a use after free reported by the address sanitizer:
2020/10/16 12:58:02 ZEBRA: rib_delnode: (0:254):4.5.6.16/32: rn 0x60b000026f20, re 0x6080000131a0, removing
2020/10/16 12:58:02 ZEBRA: rib_meta_queue_add: (0:254):4.5.6.16/32: queued rn 0x60b000026f20 into sub-queue 3
=================================================================
==3101430==ERROR: AddressSanitizer: heap-use-after-free on address 0x608000011d28 at pc 0x555555705ab6 bp 0x7fffffffdab0 sp 0x7fffffffdaa8
READ of size 8 at 0x608000011d28 thread T0
#0 0x555555705ab5 in re_list_const_first zebra/rib.h:222
#1 0x555555705b54 in re_list_first zebra/rib.h:222
#2 0x555555711a4f in process_subq_route zebra/zebra_rib.c:2248
#3 0x555555711d2e in process_subq zebra/zebra_rib.c:2286
#4 0x555555711ec7 in meta_queue_process zebra/zebra_rib.c:2320
#5 0x7ffff74701f7 in work_queue_run lib/workqueue.c:291
#6 0x7ffff7450e9c in thread_call lib/thread.c:1581
#7 0x7ffff738eaf7 in frr_run lib/libfrr.c:1099
#8 0x55555561a578 in main zebra/main.c:455
#9 0x7ffff7079cc9 in __libc_start_main ../csu/libc-start.c:308
#10 0x5555555e3429 in _start (/usr/lib/frr/zebra+0x8f429)
0x608000011d28 is located 8 bytes inside of 88-byte region [0x608000011d20,0x608000011d78)
freed by thread T0 here:
#0 0x7ffff768bb6f in __interceptor_free (/lib/x86_64-linux-gnu/libasan.so.6+0xa9b6f)
#1 0x7ffff739ccad in qfree lib/memory.c:129
#2 0x555555709ee4 in rib_gc_dest zebra/zebra_rib.c:746
#3 0x55555570ca76 in rib_process zebra/zebra_rib.c:1240
#4 0x555555711a05 in process_subq_route zebra/zebra_rib.c:2245
#5 0x555555711d2e in process_subq zebra/zebra_rib.c:2286
#6 0x555555711ec7 in meta_queue_process zebra/zebra_rib.c:2320
#7 0x7ffff74701f7 in work_queue_run lib/workqueue.c:291
#8 0x7ffff7450e9c in thread_call lib/thread.c:1581
#9 0x7ffff738eaf7 in frr_run lib/libfrr.c:1099
#10 0x55555561a578 in main zebra/main.c:455
#11 0x7ffff7079cc9 in __libc_start_main ../csu/libc-start.c:308
previously allocated by thread T0 here:
#0 0x7ffff768c037 in calloc (/lib/x86_64-linux-gnu/libasan.so.6+0xaa037)
#1 0x7ffff739cb98 in qcalloc lib/memory.c:110
#2 0x555555712ace in zebra_rib_create_dest zebra/zebra_rib.c:2515
#3 0x555555712c6c in rib_link zebra/zebra_rib.c:2576
#4 0x555555712faa in rib_addnode zebra/zebra_rib.c:2607
#5 0x555555715bf0 in rib_add_multipath_nhe zebra/zebra_rib.c:3012
#6 0x555555715f56 in rib_add_multipath zebra/zebra_rib.c:3049
#7 0x55555571788b in rib_add zebra/zebra_rib.c:3327
#8 0x5555555e584a in connected_up zebra/connected.c:254
#9 0x5555555e42ff in connected_announce zebra/connected.c:94
#10 0x5555555e4fd3 in connected_update zebra/connected.c:195
#11 0x5555555e61ad in connected_add_ipv4 zebra/connected.c:340
#12 0x5555555f26f5 in netlink_interface_addr zebra/if_netlink.c:1213
#13 0x55555560f756 in netlink_information_fetch zebra/kernel_netlink.c:350
#14 0x555555612e49 in netlink_parse_info zebra/kernel_netlink.c:941
#15 0x55555560f9f1 in kernel_read zebra/kernel_netlink.c:402
#16 0x7ffff7450e9c in thread_call lib/thread.c:1581
#17 0x7ffff738eaf7 in frr_run lib/libfrr.c:1099
#18 0x55555561a578 in main zebra/main.c:455
#19 0x7ffff7079cc9 in __libc_start_main ../csu/libc-start.c:308
SUMMARY: AddressSanitizer: heap-use-after-free zebra/rib.h:222 in re_list_const_first
This is happening because we are using the dest pointer after a call into
rib_gc_dest. In process_subq_route, we call rib_process() and if the
dest is deleted dest pointer is now garbage. We must reload the
dest pointer in this case.
Donald Sharp [Wed, 14 Oct 2020 15:19:45 +0000 (11:19 -0400)]
bgpd: More bgp_node -> bgp_dest cleanup
Some more of the bgp_node usage snuck in from big commits in
the past month or so from feature work. Do some work
to put it back to bgp_dest for incoming future work.
Pat Ruddy [Thu, 15 Oct 2020 11:24:51 +0000 (12:24 +0100)]
lib: align prefixevpn2str output with bgp_evpn_route2str
We have 2 different routines to turn an evpn route into a string.
This commit aligns the two to the latest maintained version as a
first step in removing one of them.
Babis Chalios [Thu, 1 Oct 2020 09:07:54 +0000 (11:07 +0200)]
ospfd: fix invocation of ospfTrapNbrStateChange
ospfNbrStateChange is generated when the state of neighbor regresses or
it progresses to a terminal state. When transitioning to or from Full
state on non-broadcast multi-access and broadcast networks the trap
should be sent by the designated router. This last condition was not
taken into account when checking for the conditions of generating the
trap.
Igor Ryzhov [Wed, 14 Oct 2020 20:01:49 +0000 (23:01 +0300)]
isisd: fix check for area-tag modification
Interface area-tag is not supposed to be modified once defined, but the
necessary check is currently broken, because the circuit is never in
init_circ_list if the area-tag is already configured for the interface.
Renato Westphal [Thu, 20 Aug 2020 22:55:42 +0000 (19:55 -0300)]
isisd: add support for Topology Independent LFA (TI-LFA)
TI-LFA is a modern fast-reroute (FRR) solution that leverages Segment
Routing to pre-compute backup nexthops for all destinations in the
network, helping to reduce traffic restoration times whenever a
failure occurs. The backup nexthops are expected to be installed
in the FIB so that they can be activated as soon as a failure
is detected, making sub-50ms recovery possible (assuming an
hierarchical FIB).
TI-LFA is a huge step forward compared to prior IP-FRR solutions,
like classic LFA and Remote LFA, as it guarantees 100% coverage
for all destinations. This is possible thanks to the source routing
capabilities of SR, which allows the backup nexthops to steer traffic
around the failures (using as many SIDs as necessary). In addition
to that, the repair paths always follow the post-convergence SPF
tree, which prevents transient congestions and suboptimal routing
from happening.
Deploying TI-LFA is very simple as it only requires a single
configuration command for each interface that needs to be protected
(both link protection and node protection are available). In addition
to IPv4 and IPv6 routes, SR Prefix-SIDs and Adj-SIDs are also
protected by the backup nexthops computed by the TI-LFA algorithms.
Olivier Dugeon [Wed, 14 Oct 2020 12:17:58 +0000 (14:17 +0200)]
ospfd: Store neighbor Adjacency SID in SR database
For TI-LFA, it is necessay to known the Adjacency SID advetise by the nieghbor
routers. However, the current Segment Routing code skip neighbor Adjacency SID
and thus, don't store them into the Segment Routing database.
This PR takes care of neighbor Adjacency SID by allowing to store them in the
Segment Routing database. Corresponding MPLS table entry is only configured if
the advertised Adjacency SID is global i.e. with L-Flag unset.
Donald Sharp [Sun, 11 Oct 2020 15:21:33 +0000 (11:21 -0400)]
*: Consolidate on first git blame ignore revs
The file .git-blame-ignore-revs was put first into
the system and is what was advertised in multiple
places. Since .ignore-revs was just created and
no announcement was made about the creation, let's
consolidate onto the first one created.
tests: Enable evpn_type5_test_topo1 suite to run in CI
1. Suite: evpn_type5_test_topo1 was added to pytest.ini during triaging phase as
there was bug: https://github.com/FRRouting/frr/issues/6867, which is fixed. Enabling
suite to be run in CI.
Donald Sharp [Tue, 13 Oct 2020 12:16:15 +0000 (08:16 -0400)]
ospfd: Prevent crash if transferring config amongst instances
If we enter:
int eth0
ip ospf area 0
ip ospf 10 area 0
!
This will crash ospf. Prevent this from happening.
OSPF instances:
a) Cannot be mixed with non-instance
b) Are their own process.
Since in multi-instance world ospf instances are their own process,
when an ospf processes receives an instance command we must remove
our config( if present ) and allow the new config to be active
in the new process. The problem here is that if you have not
done a `router ospf` above the lookup of the ospf pointer will
fail and we will just crash. Put some code in to prevent a crash
in this case.
Igor Ryzhov [Tue, 13 Oct 2020 11:03:42 +0000 (14:03 +0300)]
ospfd: fix "no ip ospf area"
This commit fixes the following behavior:
```
nfware(config)# interface enp2s0
nfware(config-if)# ip ospf area 0
nfware(config-if)# no ip ospf area 0
% [ospfd]: command ignored as it targets an instance that is not running
```
We should be able to use the command without configuring the instance.
Roy Marples [Mon, 12 Oct 2020 19:53:14 +0000 (20:53 +0100)]
zebra: ifi_link_state is the link state
SIOCGIFMEDIA returns the media state.
SIOCGIFDATA returns interface data which includes the link state.
While the status of the former is usually indicitive of the latter,
this is not always the case.
Ifact some recent net80211 changes in at least NetBSD and OpenBSD
have MONITOR media set to active but the link status set to DOWN.
All interfaces will return link state with SIOCGIFDATA, unlike
SIOCGIFMEDIA. However not all BSD's support SIOCGIFDATA - it has
recently been accepted into FreeBSD-13.
However, all BSD's do report the same structure in ifa_data for
AF_LINK addresses from getifaddrs(3) so the information has always
been available.
Chirag Shah [Sat, 10 Oct 2020 22:09:11 +0000 (15:09 -0700)]
bgpd: fix crash in bgp instance creation
In bgp global commands northbound local-as modify callback
check for backend db for checking existing bgp instance.
In an instance where no router bgp with old ASN cleaned up
followed by new bgp instance with new AS is created,
the nb_running_get_entry in validation phase returns stale
bgp reference, which leads to rejection of the router bgp command.
Trey Aspelund [Mon, 12 Oct 2020 19:39:11 +0000 (15:39 -0400)]
bgpd: fix show bgp neighbor routes for labeled-unicast
bgp_show_neighbor_route() was rewriting safi from LU to uni
before checking if the peer was enabled for LU. This resulted
in the peer's address-family check looking for unicast, which
would always fail for LU peers since unicast + LU are
mutually-exclusive AFIs.
This moves this safi reassignment after the peer AFI check,
ensuring that the peer's address-family check looks for LU
while the call to bgp_show() still uses uni.
spine01# show bgp ipv4 unicast neighbors 1.1.1.1 routes
% No such neighbor or address family
spine01# show bgp ipv4 labeled-unicast neighbors 1.1.1.1 routes
% No such neighbor or address family
after:
spine01# show bgp ipv4 unicast neighbors 1.1.1.1 routes
% No such neighbor or address family
spine01# show bgp ipv4 label neighbors 1.1.1.1 routes
BGP table version is 1, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 11.11.11.11/32 1.1.1.1 0 0 1 i
Displayed 1 routes and 1 total paths
Donald Sharp [Mon, 12 Oct 2020 14:36:37 +0000 (10:36 -0400)]
bgpd: Correctly calculate threshold being reached
if (pcout > (pcount * peer->max_threshold[afi][safi] / 100 ))
is always true. So the very first route received will always
trigger the warning. We actually want the warning to happen
when we hit the threshold.
Donald Sharp [Sun, 11 Oct 2020 16:56:02 +0000 (12:56 -0400)]
ospfd: When failing to set socket options just note the failure
Instead of closing the socket, just note the failure and
continue on. If we actually failed here so many other
things would not be working at all, that actually
closing the fd won't matter.
Donald Sharp [Sun, 11 Oct 2020 16:38:42 +0000 (12:38 -0400)]
ripngd: Intentionally ignore return code for str2prefix_ipv6
We are calling str2prefix_ipv6 for a default route. Since
we know this will always succeed we can safely tell the compiler
that we are ok ignoring the return code.
Donald Sharp [Sun, 11 Oct 2020 15:13:33 +0000 (11:13 -0400)]
ospf6d: Make ospf6_lsa_lock follow normal FRR pattern
The normal ospf6_lsa_lock call should return the pointer
to the lock data structure we are holding. This is the
normal pattern for locking a data structure in FRR.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Sat, 15 Aug 2020 17:41:18 +0000 (13:41 -0400)]
zebra: zevpn cannot be null passed into zebra_evpn_es_evi_show_one_evpn
In zebra_evpn_es_evi_show_vni the zevpn pointer if passed into
zebra_evpn_es_evi_show_one_evi will crash if it is null and
we have code that checks that it is non null and then immediately
calls the function. Add a return to prevent a crash.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
vdhingra [Fri, 9 Oct 2020 16:23:14 +0000 (09:23 -0700)]
staticd: To set the default value of blackhole type correctly
When nexthop is allocated, default value of blockhole type
was not getting set, this leads to below problem. The default
value should be in-sync with the deafult value in yang model.
c t
ip route 131.1.1.0/24 Null0
do show running-config
...
!
ip route 131.1.1.0/24 blackhole
!
end
Donald Sharp [Mon, 5 Oct 2020 20:19:09 +0000 (16:19 -0400)]
ospfclient: Provide some protection against blindly trusting input
Coverity rightly points out that blindly trusting the lsalen
from received data may not be the smartest thing to do. Add
a bit of code to prevent us from blindly malloc'ing
too much memory.
Donald Sharp [Fri, 9 Oct 2020 12:00:44 +0000 (08:00 -0400)]
isisd: circuit->area->isis to circuit->isis
The code in isisd uses `circuit->area->isis` all the time
but we know that circuit now has a valid `circuit->isis` pointer
so let's use that and cleanup the long dereference.
Igor Ryzhov [Fri, 9 Oct 2020 12:14:58 +0000 (15:14 +0300)]
rip(ng)d: fix interfaces cleaning
rip(ng)d_instance_disable unlinks the vrf from the instance which means
that rip(ng)_interfaces_clean never works, because rip(ng)->vrf is
always NULL there. This leads to the crash #6477.
Clean interfaces before disabling the instance to fix the issue.