Ondřej Surý [Thu, 6 Aug 2020 08:00:28 +0000 (10:00 +0200)]
debian: Work around the sphinx-build error that doesn't copy images to texinfo
The sphinx-build (since version 2.0.0) doesn't install the images into the
texinfo build directory. Workaround the issue, by copying the required
images from the source directory.
Ondřej Surý [Fri, 6 Nov 2020 21:03:33 +0000 (22:03 +0100)]
debian: Use wrap-and-sort -a to unify debian/ wrapping and sorting
While it's ok to use individual wrapping/sorting in the debian/ source files,
it's often simpler to just go with the formatting supported by tools. One such
tool is wrap-and-sort, so this commit re-wraps and re-sorts the debian/ files to
be unified and (-a) always wrapped.
Martin Winter [Tue, 3 Nov 2020 22:47:40 +0000 (23:47 +0100)]
FRRouting Release 7.5
BFD
Profile support
Minimum ttl support
BGP
rpki VRF support
GR fixes
Add wide option to display of routes
Add `maximum-prefix <num> force`
Add `bestpath-routes` to neighbor command
Add `bgp shutdown message MSG...` command
Add v6 Flowspec support
Add `neighbor <neigh> shutdown rtt` command
Allow update-delay to be applied globaly
EVPN
Beginning of MultiHoming Support
ISIS
Segment Routing Support
VRF Support
Guard against adj timer display overflow
Add support for Anycast-SIDs
Add support for Topology Independent LFA (TI-LFA)
Add `lsp-gen-interval 2` to isis configuration
OSPF
Segment Routing support for ECMP
Various LSA fixes
Prevent crash if transferring config amongst instances
PBR
Adding json support to commands
DSCP/ECN based PBR Matching
PIM
Add more json support to commands
Fix missing mesh-group commands
MSDP SA forwarding
Clear (s,g,rpt) ifchannel on (*, G) prune received
Fix igmp querier election and IP address mapping
Crash fix when RP is removed
STATIC
Northbound Support
YANG
Filter and route-map Support
OSPF model definition
BGP model definition
VTYSH
Speed up output across daemons
Fix build-time errors for some --enable flags
Speed up output of configuration across daemons
ZEBRA
nexthop group support for FPM
northbound support for rib model
Backup nexthop support
netlink batching support
Allow upper level protocols to request ARP
Add json output for zebra ES, ES-EVI and access vlan dumps
Upgrade to using libyang1.0.184
RPM
Moved RPKI to subpackage
Added SNMP subpackage
As always there are too many bugfixes to list individually. This release
compromises just over 1k of commits by the community, with contributors from
70 people.
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
Igor Ryzhov [Tue, 13 Oct 2020 22:46:27 +0000 (01:46 +0300)]
ospfd: don't initialize ospf every time "router ospf" is used
Move ospf initialization to the actual place where it is created.
We don't need to do that every time "router ospf" is entered.
Also remove a couple of useless checks that can never be true.
Donald Sharp [Tue, 27 Oct 2020 13:59:10 +0000 (09:59 -0400)]
isisd: Fix usage of uninited memory
valgrind is showing a usage of uninited memory:
==935465== Conditional jump or move depends on uninitialised value(s)
==935465== at 0x159E17: tlvs_area_addresses_to_adj (isis_tlvs.c:4430)
==935465== by 0x15A4BD: isis_tlvs_to_adj (isis_tlvs.c:4568)
==935465== by 0x1377F0: process_p2p_hello (isis_pdu.c:203)
==935465== by 0x1391FD: process_hello (isis_pdu.c:781)
==935465== by 0x13BDBE: isis_handle_pdu (isis_pdu.c:1700)
==935465== by 0x13BECD: isis_receive (isis_pdu.c:1744)
==935465== by 0x49210FF: thread_call (thread.c:1585)
==935465== by 0x48CFACB: frr_run (libfrr.c:1099)
==935465== by 0x1218C9: main (isis_main.c:272)
==935465==
==935465== Conditional jump or move depends on uninitialised value(s)
==935465== at 0x483EEC5: bcmp (vg_replace_strmem.c:1111)
==935465== by 0x15A290: tlvs_ipv4_addresses_to_adj (isis_tlvs.c:4512)
==935465== by 0x15A4EB: isis_tlvs_to_adj (isis_tlvs.c:4570)
==935465== by 0x1377F0: process_p2p_hello (isis_pdu.c:203)
==935465== by 0x1391FD: process_hello (isis_pdu.c:781)
==935465== by 0x13BDBE: isis_handle_pdu (isis_pdu.c:1700)
==935465== by 0x13BECD: isis_receive (isis_pdu.c:1744)
==935465== by 0x49210FF: thread_call (thread.c:1585)
==935465== by 0x48CFACB: frr_run (libfrr.c:1099)
==935465== by 0x1218C9: main (isis_main.c:272)
Effectively we are reallocing memory to hold data. realloc does not
set the new memory to anything. So whatever happens to be in the memory
is what is there. after the realloc happens we are iterating over the
memory just realloced and doing memcmp's to values in it causing these
use of uninitialized memory.
Donald Sharp [Tue, 27 Oct 2020 19:16:32 +0000 (15:16 -0400)]
bgpd: Prevent ecomm memory leak
There are some situations where we create a ecommunity for
comparing to internal state when we are deleting, but in the
failure cases we would not free up the created memory.
Donald Sharp [Tue, 27 Oct 2020 16:40:46 +0000 (12:40 -0400)]
isisd: Fix memory leak in copy_tlv_router_cap
There exists a code path where we would allocate memory
then test a variable and then immediately return NULL.
Prevent memory from leaking in this situation.
Having an NSSA ABR redistributing statics, the type-7 LSA are being
continuously refreshed (every ~14 secs). The LSA Seq number keeps
incrementing and the LSA age is going back to 0 when reaching ~14s.
This PR fixes the issue by not forcing the LSA update
However I ignore if the "force" parameter was used in purpose. With this
PR updates are sent in case the metric or metric type are changed
Sep 24 08:54:48 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.
Sep 24 08:55:02 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.
Sep 24 08:55:16 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.
Sep 24 08:55:30 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.
Sep 24 08:55:44 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.
Sep 24 08:55:58 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.
Sep 24 08:56:12 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.
Sep 24 08:56:26 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.
Sep 24 08:56:40 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.
ip route 2.2.2.2/32 blackhole
router ospf
network 10.0.23.0/24 area 1
area 1 nssa
!
r2# conf t
r2(config)# router ospf
r2(config-router)# redistribute static
r2# sh ip os da
NSSA-external Link States (Area 0.0.0.1 [NSSA])
Link ID ADV Router Age Seq# CkSum Route
2.2.2.2 10.0.25.2 13 0x8000000f 0x3f17 E2 2.2.2.2/32 [0x0] <<< Seq: f, age 13
r2# sh ip os da
NSSA-external Link States (Area 0.0.0.1 [NSSA])
Link ID ADV Router Age Seq# CkSum Route
2.2.2.2 10.0.25.2 0 0x80000010 0x3d18 E2 2.2.2.2/32 [0x0] <<< Seq: 10, age 0
r2# sh ip os da
NSSA-external Link States (Area 0.0.0.1 [NSSA])
Link ID ADV Router Age Seq# CkSum Route
2.2.2.2 10.0.25.2 3 0x8000001b 0x2723 E2 2.2.2.2/32 [0x0] <<< Seq: 1b, age 3
Soman K S [Sun, 18 Oct 2020 11:49:32 +0000 (17:19 +0530)]
ospfd: External LSA not flushed when area is configured as nssa or stub
Issue:
When the ospf area is changed from default to nssa or stub, the previously
advertised external LSAs are not removed from the neighbor.
The LSAs remain in database till maxage timeout.
Fix:
Advertise the external LSAs with age set to maxage and flood to the
nssa or stub area.
Chirag Shah [Tue, 27 Oct 2020 05:18:46 +0000 (22:18 -0700)]
bgpd: fix mem leak in router bgp import vrf check
==916511== 18 bytes in 2 blocks are definitely lost in loss record 7 of 147
==916511== at 0x483877F: malloc (vg_replace_malloc.c:307)
==916511== by 0x4BE0F0A: strdup (strdup.c:42)
==916511== by 0x48D66CE: qstrdup (memory.c:122)
==916511== by 0x1E6E31: bgp_vpn_leak_export (bgp_mplsvpn.c:2690)
==916511== by 0x28E892: bgp_router_create (bgp_nb_config.c:124)
==916511== by 0x48E05AB: nb_callback_create (northbound.c:869)
==916511== by 0x48E0FA2: nb_callback_configuration (northbound.c:1183)
==916511== by 0x48E13D0: nb_transaction_process (northbound.c:1308)
==916511== by 0x48E0137: nb_candidate_commit_apply (northbound.c:741)
==916511== by 0x48E024B: nb_candidate_commit (northbound.c:773)
==916511== by 0x48E6B21: nb_cli_classic_commit (northbound_cli.c:64)
==916511== by 0x48E757E: nb_cli_apply_changes (northbound_cli.c:281)
Don Slice [Wed, 21 Oct 2020 14:46:49 +0000 (07:46 -0700)]
bgpd: delay local routes until update-delay is over
Problem found that turning an update-delay would only delay prefixes
learned from peers by delaying bestpath, but would allow local routes
(network statements or redistributed) to be immediately advertised,
followed by an End of Rib indicator. This fix delays sending local
routes until the update-delay process is completed, which matches
what testing shows other vendors do..
Ticket: CM-31743 Signed-off-by: Don Slice <dslice@nvidia.com>
This problem was accidentally introduced as a part of another fixup -
[
commit e378f5020d1af1aee587659e5602ee8c07eb05f4 (anuradhak/mh-misc-fixes, mh-misc-fixes)
Author: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Date: Tue Sep 15 16:50:14 2020 -0700
zebra: fix use of freed es during zebra shutdown
]
zif->es_info.es is cleared as a part of zebra_evpn_es_local_info_clear so it
cannot be passed around as a pointer from zebra_evpn_local_es_update/del.
Because of this bug removing ES from an interface resulted in
a zebra crash.
==1263169== 304 (40 direct, 264 indirect) bytes in 1 blocks are definitely lost in loss record 61 of 78
==1263169== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==1263169== by 0x48DD878: qcalloc (memory.c:110)
==1263169== by 0x5116D5: bgp_table_init (bgp_table.c:110)
==1263169== by 0x4EB5C4: bgp_static_set_safi (bgp_route.c:5927)
==1263169== by 0x4C3382: vpnv4_network (bgp_mplsvpn.c:1911)
==1263169== by 0x489FBEC: cmd_execute_command_real (command.c:916)
==1263169== by 0x489F7CB: cmd_execute_command (command.c:976)
==1263169== by 0x489FD04: cmd_execute (command.c:1138)
==1263169== by 0x493AF73: vty_command (vty.c:517)
==1263169== by 0x493AA07: vty_execute (vty.c:1282)
==1263169== by 0x4939B54: vtysh_read (vty.c:2115)
==1263169== by 0x492E63C: thread_call (thread.c:1585)
The bgp_static_delete function was not unlocking the right bgp_dest. This
problem goes away after fixing this.
Donald Sharp [Fri, 23 Oct 2020 00:51:24 +0000 (20:51 -0400)]
isisd: Fix memory leak on shutdown
==935465== 40 bytes in 1 blocks are definitely lost in loss record 71 of 546
==935465== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==935465== by 0x48D6611: qcalloc (memory.c:110)
==935465== by 0x48CFE02: list_new (linklist.c:32)
==935465== by 0x15DBF0: isis_new (isisd.c:213)
==935465== by 0x15DAC4: isis_global_instance_create (isisd.c:179)
==935465== by 0x121892: main (isis_main.c:264)
==935465== 64 (40 direct, 24 indirect) bytes in 1 blocks are definitely lost in loss record 101 of 546
==935465== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==935465== by 0x48D6611: qcalloc (memory.c:110)
==935465== by 0x48CFE02: list_new (linklist.c:32)
==935465== by 0x15DBE3: isis_new (isisd.c:212)
==935465== by 0x15DAC4: isis_global_instance_create (isisd.c:179)
==935465== by 0x121892: main (isis_main.c:264)
On isis shutdown we are seeing the above memory leaks. Modify
the code to start cleaning this up.
Stephen Worley [Fri, 23 Oct 2020 18:57:29 +0000 (14:57 -0400)]
zebra: fix unitialized msg header reading at startup
Fixes the valgrind error we were seeing on startup due to
initializing the msg header struct:
```
==2534283== Thread 3 zebra_dplane:
==2534283== Syscall param recvmsg(msg) points to uninitialised byte(s)
==2534283== at 0x4D616DD: recvmsg (in /usr/lib64/libpthread-2.31.so)
==2534283== by 0x43107C: netlink_recv_msg (kernel_netlink.c:744)
==2534283== by 0x4330E4: nl_batch_read_resp (kernel_netlink.c:1070)
==2534283== by 0x431D12: nl_batch_send (kernel_netlink.c:1201)
==2534283== by 0x431E8B: kernel_update_multi (kernel_netlink.c:1369)
==2534283== by 0x46019B: kernel_dplane_process_func (zebra_dplane.c:3979)
==2534283== by 0x45EB7F: dplane_thread_loop (zebra_dplane.c:4368)
==2534283== by 0x493F5CC: thread_call (thread.c:1585)
==2534283== by 0x48D3450: fpt_run (frr_pthread.c:303)
==2534283== by 0x48D3D41: frr_pthread_inner (frr_pthread.c:156)
==2534283== by 0x4D56431: start_thread (in /usr/lib64/libpthread-2.31.so)
==2534283== by 0x4E709D2: clone (in /usr/lib64/libc-2.31.so)
==2534283== Address 0x85cd850 is on thread 3's stack
==2534283== in frame #2, created by nl_batch_read_resp (kernel_netlink.c:1051)
==2534283==
==2534283== Syscall param recvmsg(msg.msg_control) points to unaddressable byte(s)
==2534283== at 0x4D616DD: recvmsg (in /usr/lib64/libpthread-2.31.so)
==2534283== by 0x43107C: netlink_recv_msg (kernel_netlink.c:744)
==2534283== by 0x4330E4: nl_batch_read_resp (kernel_netlink.c:1070)
==2534283== by 0x431D12: nl_batch_send (kernel_netlink.c:1201)
==2534283== by 0x431E8B: kernel_update_multi (kernel_netlink.c:1369)
==2534283== by 0x46019B: kernel_dplane_process_func (zebra_dplane.c:3979)
==2534283== by 0x45EB7F: dplane_thread_loop (zebra_dplane.c:4368)
==2534283== by 0x493F5CC: thread_call (thread.c:1585)
==2534283== by 0x48D3450: fpt_run (frr_pthread.c:303)
==2534283== by 0x48D3D41: frr_pthread_inner (frr_pthread.c:156)
==2534283== by 0x4D56431: start_thread (in /usr/lib64/libpthread-2.31.so)
==2534283== by 0x4E709D2: clone (in /usr/lib64/libc-2.31.so)
==2534283== Address 0xa0 is not stack'd, malloc'd or (recently) free'd
==2534283==
```
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Donald Sharp [Thu, 22 Oct 2020 12:02:33 +0000 (08:02 -0400)]
zebra: Do not delete nhg's when retain_mode is engaged
When `-r` is specified to zebra, on shutdown we should
not remove any routes from the fib. This was a problem
with nhg's on shutdown due to their ref-count behavior.
Introduce a methodology where on shutdown we don't mess
with the nexthop groups in the kernel. That way on
next startup things will be ok.
When the ASBR stops announcing a prefix into the NSSA area, the LSA
type 7 is removed from the area. However the ABR is refreshing the
type 5 in its LSDB while removing the Type 7 LSA. Routers outside
the area do not get an update.
With the following topology: r1---r2---r3, with r3 being the ASBR
announcing type 7 LSA:
r3 configuration
router ospf
redistribute static
network 10.0.23.0/24 area 1
area 1 nssa
!
We stop announcing prefix 3.3.3.3 in the ASBR
r3# conf
r3(config)# router ospf
r3(config-router)# no redistribute static
r3(config-router)#
r2 (ABR)
r2# sh ip os database
NSSA-external Link States (Area 0.0.0.1 [NSSA])
Link ID ADV Router Age Seq# CkSum Route
3.3.3.3 33.33.33.33 3600 0x8000002f 0x13be E2 3.3.3.3/32 [0x0] <-- flushed
AS External Link States
Link ID ADV Router Age Seq# CkSum Route
3.3.3.3 10.0.25.2 7 0x8000002f 0x73c7 E2 3.3.3.3/32 [0x0] <-- refreshed(?)
With PR#7086 the LSA type 5 is flushed from the LSDB in r2 and the change is
announced to routers outside the area (r1)
r2# sh ip os da
NSSA-external Link States (Area 0.0.0.1 [NSSA])
Link ID ADV Router Age Seq# CkSum Route
3.3.3.3 33.33.33.33 3600 0x80000002 0x6d91 E2 3.3.3.3/32 [0x0] <-- flushed
AS External Link States
Link ID ADV Router Age Seq# CkSum Route
3.3.3.3 10.0.25.2 3600 0x80000002 0xcd9a E2 3.3.3.3/32 [0x0] <-- flushed
r1# sh ip os da
AS External Link States
Link ID ADV Router Age Seq# CkSum Route
3.3.3.3 10.0.25.2 3600 0x80000002 0xcd9a E2 3.3.3.3/32 [0x0] <-- flushed
Unfortunately I just realized that with PR#7086 I'm introducing a new bug, as Type-5 LSA
are not being refreshed when reaching MaxAge
r2# sh ip os da
NSSA-external Link States (Area 0.0.0.1 [NSSA])
Link ID ADV Router Age Seq# CkSum Route
3.3.3.3 33.33.33.33 35 0x80000002 0x6d91 E2 3.3.3.3/32 [0x0] <--- refreshed
AS External Link States
Link ID ADV Router Age Seq# CkSum Route
3.3.3.3 10.0.25.2 3600 0x80000002 0xcd9a E2 3.3.3.3/32 [0x0] <--- not refreshed!
So this PR should fix the original issue and the bug introduced later, so when stopping
redistribution in the ASBR, both type 5 and type 7 are flushed:
r2# sh ip os da
NSSA-external Link States (Area 0.0.0.1 [NSSA])
Link ID ADV Router Age Seq# CkSum Route
3.3.3.3 33.33.33.33 3600 0x80000002 0x6d91 E2 3.3.3.3/32 [0x0]
AS External Link States
Link ID ADV Router Age Seq# CkSum Route
3.3.3.3 10.0.25.2 3600 0x80000002 0xcd9a E2 3.3.3.3/32 [0x0]
Routers outside the area are also notified
r1# sh ip os da
Link ID ADV Router Age Seq# CkSum Route
3.3.3.3 10.0.25.2 3600 0x80000002 0xcd9a E2 3.3.3.3/32 [0x0]
Re-enabling redistribution, both LSA will be advertised again
Igor Ryzhov [Wed, 14 Oct 2020 20:01:49 +0000 (23:01 +0300)]
isisd: fix check for area-tag modification
Interface area-tag is not supposed to be modified once defined, but the
necessary check is currently broken, because the circuit is never in
init_circ_list if the area-tag is already configured for the interface.
Donald Sharp [Tue, 13 Oct 2020 12:16:15 +0000 (08:16 -0400)]
ospfd: Prevent crash if transferring config amongst instances
If we enter:
int eth0
ip ospf area 0
ip ospf 10 area 0
!
This will crash ospf. Prevent this from happening.
OSPF instances:
a) Cannot be mixed with non-instance
b) Are their own process.
Since in multi-instance world ospf instances are their own process,
when an ospf processes receives an instance command we must remove
our config( if present ) and allow the new config to be active
in the new process. The problem here is that if you have not
done a `router ospf` above the lookup of the ospf pointer will
fail and we will just crash. Put some code in to prevent a crash
in this case.
Igor Ryzhov [Tue, 13 Oct 2020 11:03:42 +0000 (14:03 +0300)]
ospfd: fix "no ip ospf area"
This commit fixes the following behavior:
```
nfware(config)# interface enp2s0
nfware(config-if)# ip ospf area 0
nfware(config-if)# no ip ospf area 0
% [ospfd]: command ignored as it targets an instance that is not running
```
We should be able to use the command without configuring the instance.
Igor Ryzhov [Tue, 20 Oct 2020 19:43:31 +0000 (22:43 +0300)]
ospf6d: fix crash on message receive
OSPF6 daemon starts listening on its socket and reading messages right
after the initialization before the ospf6 router is created. If any
message is received, ospf6d crashes because ospf6_receive doesn't
NULL-check ospf6 pointer.
Fix this by opening the socket and reading messages only after the
creation of ospf6 router.
Mark Stapp [Fri, 16 Oct 2020 21:37:09 +0000 (17:37 -0400)]
zebra: support multiple connected subnets on an interface
[7.5 version]
We support configuration of multiple addresses in the same
subnet on a single interface: make sure that zebra supports
multiple instances of the corresponding connected route.
Donald Sharp [Fri, 16 Oct 2020 17:51:52 +0000 (13:51 -0400)]
zebra: Fix use after free in debug path
When zebra is running with debugs turned on there
is a use after free reported by the address sanitizer:
2020/10/16 12:58:02 ZEBRA: rib_delnode: (0:254):4.5.6.16/32: rn 0x60b000026f20, re 0x6080000131a0, removing
2020/10/16 12:58:02 ZEBRA: rib_meta_queue_add: (0:254):4.5.6.16/32: queued rn 0x60b000026f20 into sub-queue 3
=================================================================
==3101430==ERROR: AddressSanitizer: heap-use-after-free on address 0x608000011d28 at pc 0x555555705ab6 bp 0x7fffffffdab0 sp 0x7fffffffdaa8
READ of size 8 at 0x608000011d28 thread T0
#0 0x555555705ab5 in re_list_const_first zebra/rib.h:222
#1 0x555555705b54 in re_list_first zebra/rib.h:222
#2 0x555555711a4f in process_subq_route zebra/zebra_rib.c:2248
#3 0x555555711d2e in process_subq zebra/zebra_rib.c:2286
#4 0x555555711ec7 in meta_queue_process zebra/zebra_rib.c:2320
#5 0x7ffff74701f7 in work_queue_run lib/workqueue.c:291
#6 0x7ffff7450e9c in thread_call lib/thread.c:1581
#7 0x7ffff738eaf7 in frr_run lib/libfrr.c:1099
#8 0x55555561a578 in main zebra/main.c:455
#9 0x7ffff7079cc9 in __libc_start_main ../csu/libc-start.c:308
#10 0x5555555e3429 in _start (/usr/lib/frr/zebra+0x8f429)
0x608000011d28 is located 8 bytes inside of 88-byte region [0x608000011d20,0x608000011d78)
freed by thread T0 here:
#0 0x7ffff768bb6f in __interceptor_free (/lib/x86_64-linux-gnu/libasan.so.6+0xa9b6f)
#1 0x7ffff739ccad in qfree lib/memory.c:129
#2 0x555555709ee4 in rib_gc_dest zebra/zebra_rib.c:746
#3 0x55555570ca76 in rib_process zebra/zebra_rib.c:1240
#4 0x555555711a05 in process_subq_route zebra/zebra_rib.c:2245
#5 0x555555711d2e in process_subq zebra/zebra_rib.c:2286
#6 0x555555711ec7 in meta_queue_process zebra/zebra_rib.c:2320
#7 0x7ffff74701f7 in work_queue_run lib/workqueue.c:291
#8 0x7ffff7450e9c in thread_call lib/thread.c:1581
#9 0x7ffff738eaf7 in frr_run lib/libfrr.c:1099
#10 0x55555561a578 in main zebra/main.c:455
#11 0x7ffff7079cc9 in __libc_start_main ../csu/libc-start.c:308
previously allocated by thread T0 here:
#0 0x7ffff768c037 in calloc (/lib/x86_64-linux-gnu/libasan.so.6+0xaa037)
#1 0x7ffff739cb98 in qcalloc lib/memory.c:110
#2 0x555555712ace in zebra_rib_create_dest zebra/zebra_rib.c:2515
#3 0x555555712c6c in rib_link zebra/zebra_rib.c:2576
#4 0x555555712faa in rib_addnode zebra/zebra_rib.c:2607
#5 0x555555715bf0 in rib_add_multipath_nhe zebra/zebra_rib.c:3012
#6 0x555555715f56 in rib_add_multipath zebra/zebra_rib.c:3049
#7 0x55555571788b in rib_add zebra/zebra_rib.c:3327
#8 0x5555555e584a in connected_up zebra/connected.c:254
#9 0x5555555e42ff in connected_announce zebra/connected.c:94
#10 0x5555555e4fd3 in connected_update zebra/connected.c:195
#11 0x5555555e61ad in connected_add_ipv4 zebra/connected.c:340
#12 0x5555555f26f5 in netlink_interface_addr zebra/if_netlink.c:1213
#13 0x55555560f756 in netlink_information_fetch zebra/kernel_netlink.c:350
#14 0x555555612e49 in netlink_parse_info zebra/kernel_netlink.c:941
#15 0x55555560f9f1 in kernel_read zebra/kernel_netlink.c:402
#16 0x7ffff7450e9c in thread_call lib/thread.c:1581
#17 0x7ffff738eaf7 in frr_run lib/libfrr.c:1099
#18 0x55555561a578 in main zebra/main.c:455
#19 0x7ffff7079cc9 in __libc_start_main ../csu/libc-start.c:308
SUMMARY: AddressSanitizer: heap-use-after-free zebra/rib.h:222 in re_list_const_first
This is happening because we are using the dest pointer after a call into
rib_gc_dest. In process_subq_route, we call rib_process() and if the
dest is deleted dest pointer is now garbage. We must reload the
dest pointer in this case.
Trey Aspelund [Mon, 12 Oct 2020 19:39:11 +0000 (15:39 -0400)]
bgpd: fix show bgp neighbor routes for labeled-unicast
bgp_show_neighbor_route() was rewriting safi from LU to uni
before checking if the peer was enabled for LU. This resulted
in the peer's address-family check looking for unicast, which
would always fail for LU peers since unicast + LU are
mutually-exclusive AFIs.
This moves this safi reassignment after the peer AFI check,
ensuring that the peer's address-family check looks for LU
while the call to bgp_show() still uses uni.
spine01# show bgp ipv4 unicast neighbors 1.1.1.1 routes
% No such neighbor or address family
spine01# show bgp ipv4 labeled-unicast neighbors 1.1.1.1 routes
% No such neighbor or address family
after:
spine01# show bgp ipv4 unicast neighbors 1.1.1.1 routes
% No such neighbor or address family
spine01# show bgp ipv4 label neighbors 1.1.1.1 routes
BGP table version is 1, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 11.11.11.11/32 1.1.1.1 0 0 1 i
Displayed 1 routes and 1 total paths
Donald Sharp [Mon, 12 Oct 2020 14:36:37 +0000 (10:36 -0400)]
bgpd: Correctly calculate threshold being reached
if (pcout > (pcount * peer->max_threshold[afi][safi] / 100 ))
is always true. So the very first route received will always
trigger the warning. We actually want the warning to happen
when we hit the threshold.
Igor Ryzhov [Fri, 9 Oct 2020 12:14:58 +0000 (15:14 +0300)]
rip(ng)d: fix interfaces cleaning
rip(ng)d_instance_disable unlinks the vrf from the instance which means
that rip(ng)_interfaces_clean never works, because rip(ng)->vrf is
always NULL there. This leads to the crash #6477.
Clean interfaces before disabling the instance to fix the issue.
vdhingra [Fri, 9 Oct 2020 16:23:14 +0000 (09:23 -0700)]
staticd: To set the default value of blackhole type correctly
When nexthop is allocated, default value of blockhole type
was not getting set, this leads to below problem. The default
value should be in-sync with the deafult value in yang model.
c t
ip route 131.1.1.0/24 Null0
do show running-config
...
!
ip route 131.1.1.0/24 blackhole
!
end
Igor Ryzhov [Thu, 8 Oct 2020 16:23:08 +0000 (19:23 +0300)]
isisd: fix incorrect vrf lookups
Lookup in C_STATE_NA must be made before the new circuit creation, or it
will be leaked if the isis instance is not found. All other lookups are
unnecessary - we just need to remember the previously used instance.
Martin Buck [Tue, 29 Sep 2020 21:07:40 +0000 (23:07 +0200)]
ospf6d: Fix flooding of old copies of self-originated LSAs
When receiving old copies (e.g. originated before the local ospf6d was
restarted) of supposedly self-originated LSAs which we previously tried to
flush from the network (by setting them to MaxAge), neither flood them nor
add them to our LSDB. Instead, keep the MaxAge version until we actually
(re-)originate them.
Possible fix for #7030. Testcase in #7168
(tests/topotests/ospf6-dr-no-netlsa-bug7030).
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
Donald Sharp [Thu, 1 Oct 2020 18:58:37 +0000 (14:58 -0400)]
zebra: Make connected routes their own entry on the meta_q
During quick ifdown / ifup events from the linux kernel there
exists a situation where a prefix that has both a kernel route
and a static route can queued up on the meta-q. If the static
route happens to point at a connected route for nexthop resolution
and we receive a series of quick up/down events *after* the
static route and kernel route are queued up for rib reprocessing.
Since the static route and kernel route are queued on meta-q 1
and the connected route is also on meta-q 1 there exists a situation
where the connected route will be resolved after the static route
fails to resolve, leaving the static route in a unresolved state.
Add a new queue level and put connected routes on their own level,
since they are the fundamental building blocks of pretty much
all the other routes.
Donald Sharp [Wed, 30 Sep 2020 21:55:44 +0000 (17:55 -0400)]
zebra: When processing route_entries ignore unusable routes
When zebra is processing routes to determine what to send
to the rib, suppose we have two routes (a) a route processed
earlier that none of it's nexthops were active and (b)
a route that has good nexthops but has a worse admin distance.
rib_process, would not relook at (a)'s nexthops because
the ROUTE_ENTRY_CHANGED flag was not true and it would
win when compared to (b) because it's admin distance
was better, leaving us with a state where we would
attempt and fail to install route (a) because it
was not valid.
Modify the code to consider the number of nexthops
we have as a determiner if we can use the route.
Donald Sharp [Wed, 30 Sep 2020 21:26:02 +0000 (17:26 -0400)]
zebra: Prevent uninstall attempts when new entry is not happy
In rib_process_update_fib, the function is sent two route entries
the old ( previously installed ) and new ( the one to install )
When the function detects that the new is unusable because
the number of nexthops that are usable for that route is 0,
then we uninstall the old route. The problem here is that
we should not attempt to uninstall any route that is
not owned by FRR. Modify the code to not attempt
this behavior
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
bfdd: Make new multihop peer if local-address is unique
Previously if there were two multihop peers created that had the same
peer address but different local addresses then the second peer to be
created would be merged with the first one and niether would be able to
be deleted. This was due to an issue in the function bfd_key_lookup().
When the second peer was created its key would be sent into the lookup
function and would reach the last section, even though it shouldn't
have. A check has been placed around the section so that it will not be
entered if a peer is multihop.
Stephen Worley [Wed, 23 Sep 2020 18:17:15 +0000 (14:17 -0400)]
pbrd: use bool for pbr_send_pbr_map() return val
Use a bool as the return val for pbr_send_pbr_map() to make
the code a bit more readable. Dont expect there to be need
for values other than true or false anyway.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Thu, 17 Sep 2020 19:34:36 +0000 (15:34 -0400)]
pbrd: cleanup pbr ifp info if not sent to zebra
Properly cleanup the pbr interface data if nothing actually
gets sent to zebra, since we will never get the callback
notification from zapi to issue final deletion.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Thu, 17 Sep 2020 19:32:01 +0000 (15:32 -0400)]
pbrd: add return val for pbr_send_pbr_map()
Add a return val so caller can know if something was actually sent to
zebra here. Some things need to be cleanued up by the caller
if we arent getting a callback from zapi.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Chirag Shah [Sun, 27 Sep 2020 21:09:43 +0000 (14:09 -0700)]
zebra: avoid duplication node in l3vni l2vni-list
With l2vni flap leading to duplicate entry creation
in l3vni's l2vni-list.
Use list sorted add with no duplicates.
root@TORC11:mgmt:~# show evpn vni 4001
VNI: 4001
Type: L3
Tenant VRF: vrf1
State: Up
...
L2 VNIs: 1000 1000 1000 0 0 1002
root@TORC11:mgmt:~# ip link set down vx-1002
root@TORC11:mgmt:~# ip link set up vx-1002
root@TORC11:mgmt:~# show evpn vni 4001
VNI: 4001
Type: L3
Tenant VRF: vrf1
State: Up
...
L2 VNIs: 1000 1000 1000 0 0 1002 1002
Ticket:CM-31545
Reviewed By:
Testing Done:
With Fix:
Multiple time flaps vni counts remained the same.
root@TORC11:mgmt:~# ip link set down vx-1002
root@TORC11:mgmt:~# ip link set up vx-1002
root@TORC11:mgmt:~# ip link set down vx-1002
root@TORC11:mgmt:~# ip link set up vx-1002
root@TORC11:mgmt:~# net show evpn vni 4001
VNI: 4001
Type: L3
Tenant VRF: vrf1
State: Up
...
L2 VNIs: 1000 1002
Donald Sharp [Tue, 29 Sep 2020 11:54:35 +0000 (07:54 -0400)]
zebra: Make nexthop_active check use the same debug
When debugging why a route was not successfully installed into the
rib, it would be preferable that the end user only have to turn
on `debug zebra rib detail` as that is what we have been telling
people to do for the last couple of years. Consolidate *back*
to this.
An adjacency should be removed when the holdtimer expires, but if the
system is overloaded we may end up doing it late. In the meanwhile vtysh
will display an incorrect value in the show isis neighbor output, due to
an overflow of the unsigned variable used to display the Holdtime, e.g.:
pe1# show isis neighbor
Area test:
System Id Interface L state Holdtime SNPA
Spirent-1 2.201 1 Down 26 2020.2020.2020
Spirent-1 2.203 1 Up 21 2020.2020.2020
Spirent-1 2.204 1 Up 18446744073709551615 2020.2020.2020
Spirent-1 2.207 1 Up 18446744073709551615 2020.2020.2020
Spirent-1 2.208 1 Up 18446744073709551615 2020.2020.2020
Spirent-1 2.209 1 Up 0 2020.2020.2020
Spirent-1 2.210 1 Up 18446744073709551615 2020.2020.2020
pe2 12.200 1 Up 30 2020.2020.2020
Guard against that by printing an "Expiring" message instead.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>