Igor Ryzhov [Wed, 27 Jan 2021 23:41:07 +0000 (02:41 +0300)]
ospfd: don't rely on instance existence in vty
Store instance index at startup and use it when processing vty commands.
The instance itself may be created and deleted by the user in runtime
using `[no] router ospf X` command.
Igor Ryzhov [Tue, 16 Feb 2021 09:57:01 +0000 (12:57 +0300)]
lib: register dependency between control plane protocol and vrf nb nodes
When the control plane protocol is created, the vrf structure is
allocated, and its address is stored in the northbound node.
The vrf structure may later be deleted by the user, which will lead to
a stale pointer stored in this node.
Instead of this, allow daemons that use the vrf pointer to register the
dependency between the control plane protocol and vrf nodes. This will
guarantee that the nodes will always be created and deleted together, and
there won't be any stale pointers.
Donald Sharp [Thu, 18 Feb 2021 00:58:19 +0000 (19:58 -0500)]
lib: Reduce getrusage/monotime for event handling
When handling a large number of events at one time
FRR will call monotime and getrusage 2 times for each
event. With this change modify the code to change
this to (X events / 2) + 1 calls of getrusage and monotime
David Schweizer [Mon, 22 Feb 2021 09:31:57 +0000 (10:31 +0100)]
tests: JSON comparison command for LabN topotests
The changes add the "jsoncmp_pass" and the "jsoncmp_fail" commands to
compare VTY shell's JSON output to an expected JSON object during
topotests using the LabN testing framework. This helps to eliminate
false negative test results (i.e. due to routes beeing out of order
after convergence or cosmetic changes in VTY shell's text output).
Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
zebra: fix problem with SVI MAC not being sent to BGP
For MH the SVI MAC is advertised to prevent flooding of ARP replies.
But because of a bug the SVI MAC was being added to the zebra database
but not sent to bgpd for advertising.
zebra: drop the SVI MAC cleanup done as a part of interface delete
As a part of FRR shutdown interfaces are force flushed (in an arbitary
order). Interfaces are already down at that point i.e. resources like
SVI-MAC have already been released. Attempting to clean it up again
as a part of the force-flush was resulting in access of freed up memory -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
==26457== Thread 1:
==26457== Invalid read of size 8
==26457== at 0x1AE6B0: zebra_evpn_acc_bd_svi_set (zebra_evpn_mh.c:606)
==26457== by 0x1B1460: zebra_evpn_if_cleanup (zebra_evpn_mh.c:1040)
==26457== by 0x13CA69: if_zebra_delete_hook (interface.c:244)
==26457== by 0x48A0E34: hook_call_if_del (if.c:59)
==26457== by 0x48A0E34: if_delete_retain (if.c:290)
==26457== by 0x48A2F94: if_delete (if.c:313)
==26457== by 0x48A3169: if_terminate (if.c:1217)
==26457== by 0x48E0024: vrf_delete (vrf.c:254)
==26457== by 0x48E0024: vrf_delete (vrf.c:225)
==26457== by 0x48E02FE: vrf_terminate (vrf.c:551)
==26457== by 0x1442E1: sigint (main.c:203)
==26457== by 0x1442E1: sigint (main.c:141)
==26457== by 0x48CF862: quagga_sigevent_process (sigevent.c:103)
==26457== by 0x48DD324: thread_fetch (thread.c:1404)
==26457== by 0x48A926A: frr_run (libfrr.c:1122)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
(gdb) bt
(gdb) fr 5
1037 zebra/zebra_evpn_mh.c: No such file or directory.
(gdb) p zif->ifp->name
$2 = "vlan131", '\000' <repeats 12 times>
(gdb) p zif->link->info
$5 = (void *) 0x1
(gdb) p/x zif->ifp->flags
$7 = 0x1002
(gdb)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Chirag Shah [Tue, 24 Nov 2020 20:22:49 +0000 (12:22 -0800)]
zebra: prevent crash in evpn if cleanup
zebra crash is seen while cleaning up evpn interface
during shutdown event.
evpn interface clean up is called from vrf_delete callback
(gdb) frame 4
(is_up=false, br_zif=0x0, vlan_zif=0x557f31fb36f0) at zebra/zebra_evpn_mh.c:614
614 zebra/zebra_evpn_mh.c: No such file or directory.
(gdb) p tmp_br_zif
$1 = (struct zebra_if *) 0x0
(gdb) p vlan_zif->link
$2 = (struct interface *) 0x557f31fb2d40
(gdb) p vlan_zif->link->info
$3 = (void *) 0x0
(gdb) p zebra_if->ifp->name
No symbol "zebra_if" in current context.
(gdb) p vlan_zif->ifp->name
$4 = "peerlink-3.4094\000\000\000\000"
zebra: changes to advertise SVI mac by default if evpn-mh is enabled
Added support for advertising SVI MAC if EVPN-MH is enabled.
In the case of EVPN MH arp replies from an attached server can be sent to
the ES-peer. To prevent flooding of the reply the SVI MAC needs to be
advertised by default.
Note:
advertise-svi-ip could have been used as an alternate way to advertise
SVI MAC. However that config cannot be turned on if SVI IPs are
re-used (which is done to avoid wasting IP addresses in a subnet).
zebra: fix problem with SVI IP being advertised even if disabled
SVI IP is being advertised unconditionally i.e. even if disabled (and
that is the default config). This can be problematic when the SVI address
is re-used across racks.
Added the user config condition in all the relevant places where the
SVI advertisement is triggered.
Philippe Guibert [Fri, 19 Feb 2021 13:17:05 +0000 (14:17 +0100)]
bgpd: add attribute-unchanged attribute to flowspec
flowspec address family can now use attribute-unchanged attribute.
This parameter is necessary when it comes to play with
route-server-client, as that latter command forces to change
attribute-unchanged nexthop.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
lynne [Wed, 17 Feb 2021 14:21:43 +0000 (09:21 -0500)]
ospf6d: Update logs that indicate why ospf6 adjacency is not coming up.
Add more details to these logs to help make it easier to determine why
ospf6 adjacency is not coming up. Also make these logs show up without
having to turn on debug logging, again making it easier to debug the
misconfiguration.
Mark Stapp [Thu, 18 Feb 2021 14:37:09 +0000 (09:37 -0500)]
lib: don't awaken from poll for every timer
Only ask the event-loop poll() to awaken if a newly-added timer
actually might have changed the required timeout. Also compute
timer deadline outside of mutex locks.
Donald Sharp [Thu, 18 Feb 2021 11:55:29 +0000 (06:55 -0500)]
bgpd: Fix crash when we don't have a nexthop
Recent changes to allow bgpd to handle v6 LL slightly
differently in the nexthop tracking code has not
interacted well with the blackhole nexthop change
for peers. Modify the code to do the right thing
Pat Ruddy [Sun, 14 Feb 2021 12:09:55 +0000 (12:09 +0000)]
bgpd, lib: add oid2in6_addr utility and use it
The existing code was using the oid2in_addr API to copy IPv6
addresses passing an IPv6 length. Create a utility to do this
properly and avoid annoying coverity with type checking.
Donald Sharp [Thu, 17 Dec 2020 14:46:30 +0000 (09:46 -0500)]
bgpd: Switch LL nexthop tracking to be interface based
bgp is currently registering v6 LL as nexthops to be tracked
from zebra. This presents several problems.
a) zebra does not properly track multiple prefixes that match
the same route properly at this point in time.
b) BGP was receiving nexthops that were just incorrect because
of (a).
c) When a nexthop changed that really didn't affect the v6 LL
we were responding incorrectly because of this
Modify the code such that bgp nexthop tracking notices that
we are trying to register a v6 LL. When we do so, shortcut
and watch interface up/down events for this v6 LL and do
the work when an interface goes up / down for this type
of tracking.
Igor Ryzhov [Wed, 17 Feb 2021 12:06:20 +0000 (15:06 +0300)]
staticd: fix vrf enabling
When enabling the VRF, we should not install the nexthops that rely on
non-existent VRF.
For example, if we have route "1.1.1.0/24 2.2.2.2 vrf red nexthop-vrf blue",
and VRF red is enabled, we should not install it if VRF blue doesn't exist.
Igor Ryzhov [Wed, 17 Feb 2021 11:19:40 +0000 (14:19 +0300)]
staticd: fix nexthop creation and installation
Currently, staticd creates a VRF for the nexthop it is trying to install.
Later, when this nexthop is deleted, the VRF stays in the system and can
not be deleted by the user because "no vrf" command doesn't work for this
VRF because it was not created through northbound code.
There is no need to create the VRF. Just set nh_vrf_id to VRF_UNKNOWN
when the VRF doesn't exist.
Donald Sharp [Tue, 16 Feb 2021 20:54:08 +0000 (15:54 -0500)]
zebra: use AF_INET for protocol family
When looking up the conversion from kernel protocol to
internal protocol family make sure we use the correct
AF_INET( what the kernel uses ) instead of AFI_IP (which
is what FRR uses ).
Routes from OSPF will show up from the kernel as OSPF6 instead of
OSPF. Which will cause mayhem
Ticket: CM-33306 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Mark Stapp [Mon, 15 Feb 2021 18:59:02 +0000 (13:59 -0500)]
build: detect ICC, only try ICC options if ICC
Some ICC command-line options can cause confusion for other
compilers; test for ICC specifically, and only try to use those
options if ICC is being used.
Donald Sharp [Sun, 14 Feb 2021 21:08:07 +0000 (16:08 -0500)]
bgpd: Prevent store but never read of i
The new bgp_mplsvpn_snmp.c code increments i and
never uses that value. Comment the code out
to make the clang SA analyzer happy. Shouldn't
be a problem since this code will never probably
be touched again(ha!).
Donald Sharp [Sun, 14 Feb 2021 21:04:16 +0000 (16:04 -0500)]
nhrpd: Fix clang SA about null deref
Clang was complaining when running SA that the nhrpd_privs.change
function was null. It just does not fully understand how things
are setup. Add a assert to make it happy.
Donatas Abraitis [Sun, 14 Feb 2021 15:49:19 +0000 (17:49 +0200)]
bgpd: Check for peer->su_remote if not NULL when handling IPv6 nexthop
```
(gdb) bt
0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
1 0x00007fe57ca4a42a in __GI_abort () at abort.c:89
2 0x00007fe57ddd1935 in core_handler (signo=6, siginfo=0x7ffc81067570, context=<optimized out>) at lib/sigevent.c:255
3 <signal handler called>
4 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
5 0x00007fe57ca4a42a in __GI_abort () at abort.c:89
6 0x00007fe57ddd1935 in core_handler (signo=11, siginfo=0x7ffc81067e30, context=<optimized out>) at lib/sigevent.c:255
7 <signal handler called>
8 0x000055a7b25b923f in bgp_path_info_to_ipv6_nexthop (ifindex=ifindex@entry=0x7ffc810683c0, path=<optimized out>, path=<optimized out>) at bgpd/bgp_zebra.c:909
9 0x000055a7b25bb2e5 in bgp_zebra_announce (dest=dest@entry=0x55a7b5239c10, p=p@entry=0x55a7b5239c10, info=info@entry=0x55a7b5239cd0, bgp=bgp@entry=0x55a7b518b090, afi=afi@entry=AFI_IP6, safi=safi@entry=SAFI_UNICAST) at bgpd/bgp_zebra.c:1358
10 0x000055a7b256af6a in bgp_process_main_one (bgp=0x55a7b518b090, dest=0x55a7b5239c10, afi=AFI_IP6, safi=SAFI_UNICAST) at bgpd/bgp_route.c:2918
11 0x000055a7b256b0ee in bgp_process_wq (wq=<optimized out>, data=0x55a7b5221800) at bgpd/bgp_route.c:3027
12 0x00007fe57ddea2e0 in work_queue_run (thread=0x7ffc8106cd60) at lib/workqueue.c:291
13 0x00007fe57dde0781 in thread_call (thread=thread@entry=0x7ffc8106cd60) at lib/thread.c:1684
14 0x00007fe57dda84b8 in frr_run (master=0x55a7b48aaf00) at lib/libfrr.c:1126
15 0x000055a7b250a7da in main (argc=<optimized out>, argv=<optimized out>) at bgpd/bgp_main.c:540
(gdb)
```
This crashes with configs like:
```
router bgp 65534
no bgp ebgp-requires-policy
no bgp network import-check
!
address-family ipv6 unicast
import vrf donatas <<<<<< Crashes when entering this command
exit-address-family
!
router bgp 65534 vrf donatas
no bgp ebgp-requires-policy
no bgp network import-check
neighbor fe80::c15a:ddab:1689:db86 remote-as 65025
neighbor fe80::c15a:ddab:1689:db86 interface eth2
neighbor fe80::c15a:ddab:1689:db86 update-source eth2
neighbor fe80::c15a:ddab:1689:db86 capability extended-nexthop
!
address-family ipv6 unicast
network 2a02:face::/32 <<<<<< Crashes due to static networks
neighbor fe80::c15a:ddab:1689:db86 activate
exit-address-family
!
```
Locally configured routes do not have peer->su_remote.
```
exit1-debian-9# show bgp ipv6 unicast
BGP table version is 3, local router ID is 192.168.100.1, vrf id 0
Default local pref 100, local AS 65534
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 2a02:abc::/64 fe80::c15a:ddab:1689:db86@5<
0 65025 i
2a02:face::/32 ::@5< 0 32768 i
Displayed 2 routes and 2 total paths
exit1-debian-9#
Donald Sharp [Thu, 11 Feb 2021 14:54:34 +0000 (09:54 -0500)]
bgpd: Blackhole nexthops are not reachable
When bgp registers for a nexthop that is not reachable due
to the nexthop pointing to a blackhole, bgp is never going
to be able to reach it when attempting to open a connection.
Broken behavior:
<show bgp nexthop>
192.168.161.204 valid [IGP metric 0], #paths 0, peer 192.168.161.204
blackhole
Last update: Thu Feb 11 09:46:10 2021
eva# show bgp ipv4 uni summ fail
BGP router identifier 10.10.3.11, local AS number 3235 vrf-id 0
BGP table version 40
RIB entries 78, using 14 KiB of memory
Peers 2, using 54 KiB of memory
Neighbor EstdCnt DropCnt ResetTime Reason
192.168.161.204 0 0 never Waiting for peer OPEN
The log file fills up with this type of message:
2021-02-09T18:53:11.653433+00:00 nq-sjc6c-cor-01 bgpd[6548]: can't connect to 24.51.27.241 fd 26 : Invalid argument
2021-02-09T18:53:21.654005+00:00 nq-sjc6c-cor-01 bgpd[6548]: can't connect to 24.51.27.241 fd 26 : Invalid argument
2021-02-09T18:53:31.654381+00:00 nq-sjc6c-cor-01 bgpd[6548]: can't connect to 24.51.27.241 fd 26 : Invalid argument
2021-02-09T18:53:41.654729+00:00 nq-sjc6c-cor-01 bgpd[6548]: can't connect to 24.51.27.241 fd 26 : Invalid argument
2021-02-09T18:53:51.655147+00:00 nq-sjc6c-cor-01 bgpd[6548]: can't connect to 24.51.27.241 fd 26 : Invalid argument
As that the connect to a blackhole is correctly rejected by the kernel
Fixed behavior:
eva# show bgp ipv4 uni summ
BGP router identifier 10.10.3.11, local AS number 3235 vrf-id 0
BGP table version 40
RIB entries 78, using 14 KiB of memory
Peers 2, using 54 KiB of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
annie(192.168.161.2) 4 64539 126264 39 0 0 0 00:01:36 38 40 N/A
192.168.161.178 4 0 0 0 0 0 0 never Active 0 N/A
Total number of neighbors 2
eva# show bgp ipv4 uni summ fail
BGP router identifier 10.10.3.11, local AS number 3235 vrf-id 0
BGP table version 40
RIB entries 78, using 14 KiB of memory
Peers 2, using 54 KiB of memory
Neighbor EstdCnt DropCnt ResetTime Reason
192.168.161.178 0 0 never Waiting for NHT
Total number of neighbors 2
eva# show bgp nexthop
Current BGP nexthop cache:
192.168.161.2 valid [IGP metric 0], #paths 38, peer 192.168.161.2
if enp39s0
Last update: Thu Feb 11 09:52:05 2021
192.168.161.131 valid [IGP metric 0], #paths 0, peer 192.168.161.131
if enp39s0
Last update: Thu Feb 11 09:52:05 2021
192.168.161.178 invalid, #paths 0, peer 192.168.161.178
Must be Connected
Last update: Thu Feb 11 09:53:37 2021
eva#