Don Slice [Thu, 2 Apr 2020 20:17:33 +0000 (20:17 +0000)]
ospf6d: stop looping through same Inter-Area Router LSAs
A processing loop was uncovered when multiple ABRs also
act as ASBRs into the same area in ospf6. The problem
was that when walking the list of Inter-Area Router
entries, if the current entry being processed matched, we
still merged next-hops and re-initiated the process. With
this fix, if the route/path matches and the next-hops also
match, there is no need to re-initiate the examine process.
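A minimal sketch of the guard (names modeled on ospf6d's route structures; `ospf6_route_cmp_nexthops()` is an assumed helper, not taken from the patch):
```c
/* Sketch: when re-examining Inter-Area Router entries, bail out if
 * nothing changed instead of merging next-hops and re-initiating
 * the examine process. */
if (old_route == route &&
    ospf6_route_cmp_nexthops(old_route, route) == 0) {
	/* Same route/path and same next-hops: nothing to do. */
	return;
}
```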
Ticket: CM-28900
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Stephen Worley [Mon, 6 Jan 2020 18:33:45 +0000 (13:33 -0500)]
zebra: reset nexthop pointer in zread of nexthops
We were not resetting the nexthop pointer to NULL for each
new read of a nexthop from the zapi route. If we receive a
nexthop that does not have a proper type, we do not create
a new nexthop or update the pointer, so it still holds the
last valid one and we end up creating a group with two
pointers to the same nexthop.
Then when it enters any code that iterates the group, it loops
endlessly.
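A minimal sketch of the corrected decode loop (`nexthop_from_ipv4()` and `nexthop_group_add_sorted()` are real lib helpers; the surrounding shape is illustrative):
```c
struct nexthop *nexthop;
uint16_t i;

for (i = 0; i < api.nexthop_num; i++) {
	struct zapi_nexthop *api_nh = &api.nexthops[i];

	nexthop = NULL; /* the fix: reset for every nexthop read */

	switch (api_nh->type) {
	case NEXTHOP_TYPE_IPV4:
		nexthop = nexthop_from_ipv4(&api_nh->gate.ipv4, NULL,
					    api_nh->vrf_id);
		break;
	/* ... other nexthop types ... */
	}

	if (!nexthop)
		continue; /* bad type: don't reuse the stale pointer */

	nexthop_group_add_sorted(&ng, nexthop);
}
```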
This was found with zapi fuzzing.
```
0x00007f728891f1c3 in jhash2 (k=<optimized out>, length=<optimized out>, initval=12183506) at lib/jhash.c:138
0x00007f728896d92c in nexthop_hash (nexthop=<optimized out>) at lib/nexthop.c:563
0x00007f7288979ece in nexthop_group_hash (nhg=<optimized out>) at lib/nexthop_group.c:394
0x0000000000621036 in zebra_nhg_hash_key (arg=<optimized out>) at zebra/zebra_nhg.c:356
0x00007f72888ec0e1 in hash_get (hash=<optimized out>, data=0x7ffffb94aef0, alloc_func=0x0) at lib/hash.c:138
0x00007f72888ee118 in hash_lookup (hash=0x7f7288de2f10, data=0x7f728908e7fc) at lib/hash.c:183
0x0000000000626613 in zebra_nhg_find (nhe=0x7ffffb94b080, id=0, nhg=0x6020000032d0, nhg_depends=0x0, vrf_id=<optimized out>,
afi=<optimized out>, type=<optimized out>) at zebra/zebra_nhg.c:541
0x0000000000625f39 in zebra_nhg_rib_find (id=0, nhg=<optimized out>, rt_afi=AFI_IP) at zebra/zebra_nhg.c:1126
0x000000000065f953 in rib_add_multipath (afi=AFI_IP, safi=<optimized out>, p=0x7ffffb94b370, src_p=0x0, re=0x6070000013d0,
ng=0x7f728908e7fc) at zebra/zebra_rib.c:2616
0x0000000000768f90 in zread_route_add (client=0x61f000000080, hdr=<optimized out>, msg=<optimized out>, zvrf=<optimized out>)
at zebra/zapi_msg.c:1596
0x000000000077c135 in zserv_handle_commands (client=<optimized out>, msg=0x61b000000780) at zebra/zapi_msg.c:2636
0x0000000000575e1f in main (argc=<optimized out>, argv=<optimized out>) at zebra/main.c:309
```
Stephen Worley [Mon, 6 Jan 2020 17:58:41 +0000 (12:58 -0500)]
zebra: don't create connected if duplicate depend
Since we are using a UNIQUE RB tree, we need to handle the
case of adding a duplicate entry into it.
The list API code returns NULL when a successful add
occurs, so let's pull that handling further up into
the connected handlers. Then, free the allocated
connected struct if it is a duplicate.
This is a pretty unlikely situation to happen.
Also, pull up the RB handling of the _del RB API as well.
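A minimal sketch of the duplicate handling pulled up into the connected handlers (wrapper and free names are assumptions):
```c
/* The UNIQUE RB tree _add returns NULL on a successful insert and
 * non-NULL when the entry already exists, so free our allocation in
 * the duplicate case rather than leaking it. */
void nhg_connected_tree_add_nhe(struct nhg_connected_tree_head *head,
				struct nhg_hash_entry *depend)
{
	struct nhg_connected *new = nhg_connected_new(depend);

	if (nhg_connected_tree_add(head, new) != NULL)
		nhg_connected_free(new); /* duplicate entry */
}
```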
This was found with the zapi fuzzing code.
```
==1052840==
==1052840== 200 bytes in 5 blocks are definitely lost in loss record 545 of 663
==1052840== at 0x483BB1A: calloc (vg_replace_malloc.c:762)
==1052840== by 0x48E1008: qcalloc (memory.c:110)
==1052840== by 0x44D357: nhg_connected_new (zebra_nhg.c:73)
==1052840== by 0x44D300: nhg_connected_tree_add_nhe (zebra_nhg.c:123)
==1052840== by 0x44FBDC: depends_add (zebra_nhg.c:1077)
==1052840== by 0x44FD62: depends_find_add (zebra_nhg.c:1090)
==1052840== by 0x44E46D: zebra_nhg_find (zebra_nhg.c:567)
==1052840== by 0x44E1FE: zebra_nhg_rib_find (zebra_nhg.c:1126)
==1052840== by 0x45AD3D: rib_add_multipath (zebra_rib.c:2616)
==1052840== by 0x4977DC: zread_route_add (zapi_msg.c:1596)
==1052840== by 0x49ABB9: zserv_handle_commands (zapi_msg.c:2636)
==1052840== by 0x428B11: main (main.c:309)
```
Stephen Worley [Tue, 28 Jan 2020 20:20:18 +0000 (15:20 -0500)]
zebra: add debug for duplicate NH in dataplane array conversion
When we find a nexthop ID that's a duplicate in the code that converts
NHG rb trees into a flat list of nexthop IDs for the dataplane,
output a debug message.
Stephen Worley [Tue, 28 Jan 2020 19:33:10 +0000 (14:33 -0500)]
zebra: don't add ID to kernel nh_grp if not installed/queued
When we transform the nexthop group rb trees into a flat
array of IDs to send into the dataplane code (zebra_nhg_nhe2grp),
don't put an ID in there that has not been installed or is
not currently queued to be installed into the dataplane.
Otherwise, if some of the nexthops fail to install, we will
still try to create a group with them and then the entire group
will fail.
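A minimal sketch of the check while building the dataplane ID array (the flag names follow zebra's NHE flags, but treat the snippet as illustrative):
```c
/* Skip depends that were never installed and aren't queued for
 * installation; including them would make the whole kernel group
 * fail. */
if (!CHECK_FLAG(depend->flags, NEXTHOP_GROUP_INSTALLED) &&
    !CHECK_FLAG(depend->flags, NEXTHOP_GROUP_QUEUED))
	continue;

grp[i++].id = depend->id;
```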
Stephen Worley [Tue, 28 Jan 2020 00:36:01 +0000 (19:36 -0500)]
zebra: handle NHG in NHG dataplane group conversion
We were not properly handling the case of an NHG inside of
another NHG when converting the rb tree of a multilevel NHG
into a flat list of IDs. When constructing the list, we call
zebra_nhg_nhe2grp_internal() recursively so that the rare
case of a group within a group is handled and its singleton
nexthops are appended to the grp array of IDs we send to the
dataplane code (a sketch of the recursion follows the example
below).
Ex)
```
1:
   -> 2:
      -> 3
      -> 4
   -> 5:
      -> 6
```
becomes this:
```
1:
   -> 3
   -> 4
   -> 6
```
when it's sent to the dataplane code for final kernel installation.
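A minimal sketch of the recursion (signature modeled on zebra_nhg_nhe2grp_internal(); the helper names are assumptions):
```c
static uint8_t nhe2grp_internal(struct nh_grp *grp,
				struct nhg_hash_entry *nhe,
				uint8_t curr_index, uint8_t max_num)
{
	struct nhg_connected *rb_node_dep;
	uint8_t i = curr_index;

	frr_each (nhg_connected_tree, &nhe->nhg_depends, rb_node_dep) {
		struct nhg_hash_entry *depend = rb_node_dep->nhe;

		if (zebra_nhg_depends_count(depend))
			/* A group within a group: recurse so only its
			 * singleton nexthop IDs land in the array. */
			i = nhe2grp_internal(grp, depend, i, max_num);
		else if (i < max_num)
			grp[i++].id = depend->id;
	}

	return i;
}
```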
David Lamparter [Thu, 2 Apr 2020 19:16:04 +0000 (21:16 +0200)]
bgpd, ospfd, ospf6d: long is not bool :(
... Oops ...
(for context, the defaults code originally didn't have a dedicated
"bool" variant and just used long for bools... I derp'd this when
adding bool as a separate case :( )
Reported-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David Lamparter <equinox@diac24.net>
(cherry picked from commit 4c1458b595282bff6a6e0b20767bb5cb655d0b4c)
Stephen Worley [Wed, 1 Apr 2020 19:31:40 +0000 (15:31 -0400)]
zebra: free unhashable (dup) NHEs via ID table cleanup
Free unhashable (duplicate NHEs from the kernel) via ID table
cleanup. Since the NHE ID hash table contains extra entries,
that's the one we need to be calling zebra_nhg_hash_free()
on, otherwise we will never free the unhashable NHEs.
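The likely shape of the cleanup, as a sketch (the hash names are assumed from zebra's router structure; `hash_clean()`/`hash_free()` are real lib calls):
```c
/* Run the free callback over the ID hash, which holds every NHE
 * including the unhashable kernel duplicates, instead of over the
 * regular NHE hash that excludes them. */
hash_clean(zrouter.nhgs_id, zebra_nhg_hash_free);
hash_free(zrouter.nhgs_id);
```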
This was found via a memleak:
```
==1478713== HEAP SUMMARY:
==1478713==     in use at exit: 10,267 bytes in 46 blocks
==1478713==   total heap usage: 76,810 allocs, 76,764 frees, 3,901,237 bytes allocated
==1478713==
==1478713== 208 (88 direct, 120 indirect) bytes in 1 blocks are definitely lost in loss record 35 of 41
==1478713==    at 0x483BB1A: calloc (vg_replace_malloc.c:762)
==1478713==    by 0x48E35E8: qcalloc (memory.c:110)
==1478713==    by 0x451CCB: zebra_nhg_alloc (zebra_nhg.c:369)
==1478713==    by 0x453DE3: zebra_nhg_copy (zebra_nhg.c:379)
==1478713==    by 0x452670: nhg_ctx_process_new (zebra_nhg.c:1143)
==1478713==    by 0x4523A8: nhg_ctx_process (zebra_nhg.c:1234)
==1478713==    by 0x452A2D: zebra_nhg_kernel_find (zebra_nhg.c:1294)
==1478713==    by 0x4326E0: netlink_nexthop_change (rt_netlink.c:2433)
==1478713==    by 0x427320: netlink_parse_info (kernel_netlink.c:945)
==1478713==    by 0x432DAD: netlink_nexthop_read (rt_netlink.c:2488)
==1478713==    by 0x41B600: interface_list (if_netlink.c:1486)
==1478713==    by 0x457275: zebra_ns_enable (zebra_ns.c:127)
```
Repro with:
```
ip next add id 1 blackhole
ip next add id 2 blackhole
```
Martin Winter [Fri, 20 Mar 2020 22:50:29 +0000 (23:50 +0100)]
tests: Make topotest work on different locales
"sort" as used in all-protocol-startup causes a different
sort order based on locale settings. Specify the correct
locale to make the output match our expected result.
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
Donald Sharp [Tue, 11 Feb 2020 00:25:52 +0000 (19:25 -0500)]
bgpd: Update failed reason to distinguish some NHT scenarios
The current failed reason for bgp when you have a peer that
is not online yet is `Waiting for NHT`, even if NHT has
succeeded. Add some code to differentiate this.
```
eva# show bgp ipv4 uni summ failed
BGP router identifier 192.168.201.135, local AS number 3923 vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 2, using 43 KiB of memory

Neighbor        EstdCnt DropCnt ResetTime Reason
192.168.44.1          0       0     never Waiting for NHT
192.168.201.139       0       0     never Waiting for Open to Succeed

Total number of neighbors 2
eva#
eva# show bgp nexthop
Current BGP nexthop cache:
 192.168.44.1 invalid, peer 192.168.44.1
  Must be Connected
  Last update: Mon Feb 10 19:05:19 2020
```
Martin Winter [Fri, 14 Feb 2020 14:03:09 +0000 (15:03 +0100)]
FRRouting Release 7.3
BGPd
  EVPN PIP Support
  Route Aggregation code speed ups
  BGP Vector I/O speed ups
  New CLI: `set distance XXX`
  New CLI: `aggregate-address A.B.C.D/M route-map WORD`
  New CLI: `bgp reject-as-sets`
  New CLI: `advertise pip ...`
  New CLI: `match evpn rd ASN:NN_OR_IP-ADDRESS:NN`
  New CLI: `show bgp l2vpn evpn community|large-community X`
  New CLI: `show bgp l2vpn evpn A.B.C.D`
  Auto-completion for clear bgp command
  Add ability to set tcp socket buffer size
OSPFd
  Partial MPLS TE support
PBRd
  New CLI: `set vrf unchanged|NAME`
BFDd
  VRF Support
  New CLI: 'show bfd peers brief'
  New CLI: 'clear bfd peer ...'
PIMd
  Significant Speedups in accessing Internal Data for higher scale
  Support for joining any-source Multicast
  Updated CLI: 'show ip pim upstream-join-desired'
  New CLI: 'show ip pim channel'
  Debug Cleanup
  MLAG experimental support
VRRPd
  VRF Support
NHRPd
  Northbound Conversion
vtysh
  New CLI: `banner motd line LINE...`
yang
  New CLI: `show yang operational-data XPATH`
  New CLI: `debug northbound`
Zebra
  Nexthop Group support
  New CLI: 'debug zebra nexthop [detail]'
  New CLI: 'show router-id'
  MLAG experimental support
watchfrr
  Additional status messages of system state to systemd
  New CLI: `watchfrr ignore DAEMON`
Others
  As always all daemons have received too many bug fixes to fully list
  There has been a significant focus on increasing test coverage
Change in Behavior:
ISISd
  All areas created default automatically to level-1-2
Zebra
  Nexthop Group Installation in Kernel is turned on by default
  if the kernel supports it
  New CLI: 'show nexthop-group rib [singleton]'
Man Pages
  Renamed to frr-* to remove collision with other packages
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
Donald Sharp [Mon, 13 Jan 2020 21:11:46 +0000 (16:11 -0500)]
zebra: nexthop groups vrf's are only a function of namespaces
It does not make sense for nexthop groups as a whole to have
a vrf'ness, as you can have an arbitrary number of nexthops
that point to separate vrf's.
Modify the code to make this distinction, by clearly delineating
the line between the nhg and the nexthop a bit better.
A nexthop group having a vrf_id only makes sense if you are
using network namespaces to represent vrf's.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Mon, 13 Jan 2020 21:00:33 +0000 (16:00 -0500)]
zebra: Modify 'show nexthop-group rib ip|ipv6'
The zebra implementation of nexthop groups currently has
two types of nexthop groups: singleton objects, which have
afi's, and combined nexthop groups, which do not. Specifically
call this out in the code to make this distinction.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Thu, 6 Feb 2020 02:02:25 +0000 (21:02 -0500)]
bgpd: Remove prefix pointer creation
The creation of a prefix pointer is unnecessary. Save the
prefix as part of the actual data structure. This will
reduce the data needed by 8 bytes per nexthop stored.
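An illustrative before/after of the change (the structure and field names are placeholders, not bgpd's actual definitions):
```c
/* before: the prefix lives in its own allocation, costing an
 * 8-byte pointer per stored nexthop plus a malloc/free pair */
struct nexthop_cache_before {
	struct prefix *prefix;
	/* ... */
};

/* after: the prefix is embedded directly in the structure */
struct nexthop_cache_after {
	struct prefix prefix;
	/* ... */
};
```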
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Chirag Shah [Wed, 22 Jan 2020 20:22:27 +0000 (12:22 -0800)]
bgpd: fix memory leak in evpn json outputs
Found memory leak in json output of evpn's route
commands.
After executing 'show bgp l2vpn evpn route type prefix json'
and 'show bgp l2vpn evpn route type macip json' a few times
(6 times) with more than 600 routes in total, the memory
footprint for bgpd continues to grow.
```
Memory statistics for bgpd:
System allocator statistics:
  Total heap allocated:  12 MiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  8390 KiB
  Free small blocks:     1760 bytes
  Free ordinary blocks:  3762 KiB
  Ordinary blocks:       1161
  Small blocks:          51
  Holding blocks:        0
```
Ticket: CM-27920
Testing Done:
After the fix, execute
'show bgp l2vpn evpn route type prefix json'
and 'show bgp l2vpn evpn route type macip json'
a few times; the used ordinary blocks (uordblks) value
stays in a steady state.
```
Memory statistics for bgpd:
System allocator statistics:
  Total heap allocated:  9968 KiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  6486 KiB
  Free small blocks:     1984 bytes
  Free ordinary blocks:  3482 KiB
  Ordinary blocks:       1110
  Small blocks:          54
  Holding blocks:        0

Memory statistics for bgpd:
System allocator statistics:
  Total heap allocated:  10100 KiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  6488 KiB
  Free small blocks:     1984 bytes
  Free ordinary blocks:  3612 KiB
  Ordinary blocks:       1113
  Small blocks:          54
  Holding blocks:        0
```
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Broke onlink behavior, and as a result ospf unnumbered failed
to work. This commit adds a small test to create 2 ospf routers,
connect them through ospf unnumbered behavior, and then ensure
that the routes are installed into the kernel as expected.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This code combined the different types of nexthop encoding
being done in the zapi protocol. What was missed is that
resolved nexthops of type NEXTHOP_TYPE_IPV4|6 have an ifindex
value that was not being reported. This commit ensures
that we always send this data (even if it is 0).
The following test commit will ensure that this stays working
as is expected by an upper level protocol.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Mitchell Skiba [Thu, 9 Jan 2020 19:46:13 +0000 (11:46 -0800)]
bgpd: add addpath ID to adj_out tree sort
When withdrawing addpaths, adj_lookup was called to find the path that
needed to be withdrawn. It would look up in the RB tree based on the
subgroup pointer alone, often find a path with the wrong addpath ID, and
return null. Only the path highest in the tree sent to the subgroup
could be found, and thus withdrawn.
Adding the addpath ID to the sort criteria for the RB tree allows us to
simplify the logic for adj_lookup, and address this problem. We are able
to remove the logic around non-addpath subgroups because the addpath ID
is consistently 0 for non-addpath adj_outs, so special logic to skip
matching the addpath ID isn't required. (As a side note, addpath will
also never use ID 0, so there won't be any ambiguity when looking at the
structure content.)
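A minimal sketch of the extended sort criteria (the field names are assumptions modeled on bgpd's adj_out structure):
```c
static int bgp_adj_out_compare(const struct bgp_adj_out *o1,
			       const struct bgp_adj_out *o2)
{
	/* Primary key: the subgroup pointer, as before. */
	if (o1->subgroup != o2->subgroup)
		return o1->subgroup < o2->subgroup ? -1 : 1;

	/* New tie-breaker: the addpath ID. Non-addpath entries are
	 * always 0 here, so they need no special casing, and addpath
	 * never assigns ID 0. */
	if (o1->addpath_tx_id != o2->addpath_tx_id)
		return o1->addpath_tx_id < o2->addpath_tx_id ? -1 : 1;

	return 0;
}
```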
```
!
nexthop-group red
 nexthop 1.1.1.1
 nexthop 1.1.1.2
!
sharp install routes 8.8.8.1 nexthop-group red 1

=========================================
==11898== Invalid write of size 8
==11898==    at 0x48E53B4: _nexthop_add_sorted (nexthop_group.c:254)
==11898==    by 0x48E5336: nexthop_group_add_sorted (nexthop_group.c:296)
==11898==    by 0x453593: handle_recursive_depend (zebra_nhg.c:481)
==11898==    by 0x451CA8: zebra_nhg_find (zebra_nhg.c:572)
==11898==    by 0x4530FB: zebra_nhg_find_nexthop (zebra_nhg.c:597)
==11898==    by 0x4536B4: depends_find (zebra_nhg.c:1065)
==11898==    by 0x453526: depends_find_add (zebra_nhg.c:1087)
==11898==    by 0x451C4D: zebra_nhg_find (zebra_nhg.c:567)
==11898==    by 0x4519DE: zebra_nhg_rib_find (zebra_nhg.c:1126)
==11898==    by 0x452268: nexthop_active_update (zebra_nhg.c:1729)
==11898==    by 0x461517: rib_process (zebra_rib.c:1049)
==11898==    by 0x4610C8: process_subq_route (zebra_rib.c:1967)
==11898== Address 0x0 is not stack'd, malloc'd or (recently) free'd
```
Zebra crashes because we weren't handling the case of the depend nexthop
being recursive.
For this case, we cannot make the function more efficient. A nexthop
could resolve to a group of any size, thus we need allocs/frees.
To solve this and retain the goal of the original patch, we separate out the
two cases so it will still be more efficient if the nexthop is not recursive.
Stephen Worley [Fri, 3 Jan 2020 17:35:15 +0000 (12:35 -0500)]
zebra: just set nexthop member in handle_recursive_depend()
With recent changes to the lib nexthop_group
APIs (e1f3a8eb193267da195088cc515b598ae5a92a12), the API now
assumes it is adding a single nexthop to a group, not a list
of nexthops.
This broke the case of a recursive nexthop resolving to a group:
```
D> 2.2.2.1/32 [150/0] via 1.1.1.1 (recursive), 00:00:09
* via 1.1.1.1, dummy1 onlink, 00:00:09
via 1.1.1.2 (recursive), 00:00:09
* via 1.1.1.2, dummy2 onlink, 00:00:09
D> 3.3.3.1/32 [150/0] via 2.2.2.1 (recursive), 00:00:04
* via 1.1.1.1, dummy1 onlink, 00:00:04
K * 10.0.0.0/8 [0/1] via 172.27.227.148, tun0, 00:00:21
```
This group can instead just directly point to the nexthop that
was passed in. It's only being used for a lookup (the memory gets
copied and used elsewhere if the nexthop is not found).
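A minimal sketch of the simplification (the function shape follows the backtraces above; treat it as illustrative, not the literal patch):
```c
static int handle_recursive_depend(struct nhg_connected_tree_head *nhg_depends,
				   struct nexthop *nh, afi_t afi)
{
	struct nhg_hash_entry *depend = NULL;
	struct nexthop_group resolved_ng = {};

	/* Point the lookup group directly at the single passed-in
	 * nexthop; zebra_nhg_rib_find() copies the memory if it has
	 * to create a new entry. */
	resolved_ng.nexthop = nh;

	depend = zebra_nhg_rib_find(0, &resolved_ng, afi);
	if (!depend)
		return -1;

	nhg_connected_tree_add_nhe(nhg_depends, depend);
	return 0;
}
```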
Donald Sharp [Fri, 10 Jan 2020 20:13:36 +0000 (15:13 -0500)]
zebra: Actually add the NLA_F_NESTED flag to our code
The existing usage of the rta_nest and addattr_nest
functions was not adding the NLA_F_NESTED flag
to the type. The new nexthop functionality was
actually looking for this flag, while apparently older
code did not.
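A minimal sketch of the fix in an iproute2-style nesting helper (NLA_F_NESTED is the real kernel flag; the helper shape is an assumption):
```c
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

#ifndef NLMSG_TAIL
#define NLMSG_TAIL(nmsg) \
	((struct rtattr *)(((char *)(nmsg)) + NLMSG_ALIGN((nmsg)->nlmsg_len)))
#endif

static struct rtattr *addattr_nest(struct nlmsghdr *n, int type)
{
	struct rtattr *nest = NLMSG_TAIL(n);

	/* Previously just `type`; the kernel's nexthop handling
	 * validates that nested attributes carry NLA_F_NESTED. */
	nest->rta_type = type | NLA_F_NESTED;
	nest->rta_len = RTA_LENGTH(0);
	n->nlmsg_len = NLMSG_ALIGN(n->nlmsg_len) + RTA_ALIGN(nest->rta_len);
	return nest;
}
```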
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Stephen Worley [Fri, 13 Dec 2019 01:14:51 +0000 (20:14 -0500)]
pimd: allow pimd to handle nexthop_lookup zapi error
Allow pimd to stop the lookup if zebra tells pimd that the
lookup failed due to a zapi error. Otherwise, it will keep
waiting for a nexthop message that will never come.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Tue, 17 Dec 2019 22:00:52 +0000 (17:00 -0500)]
lib,zebra: add zapi msg top level error handling
Add error handling for top level failures (not being able to
execute a command, unable to find the vrf for a command, etc.).
With this error handling, we add a new zapi message type
of ZEBRA_ERROR, used when we are unable to properly handle
a zapi command that was passed down into the lower level code.
In that event, we reply with a message of type
enum zebra_error_types containing the error type.
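A minimal sketch of the reply path (the stream/zclient helpers are real lib calls; the enum values and function name are assumptions based on this description):
```c
static void zsend_error_msg(struct zserv *client,
			    enum zebra_error_types error,
			    struct zmsghdr *bad_hdr)
{
	struct stream *s = stream_new(ZEBRA_MAX_PACKET_SIZ);

	zclient_create_header(s, ZEBRA_ERROR, bad_hdr->vrf_id);

	/* The payload is just the error type from
	 * enum zebra_error_types. */
	stream_put(s, &error, sizeof(error));
	stream_putw_at(s, 0, stream_get_endp(s));

	zserv_send_message(client, s);
}
```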
Mark Stapp [Tue, 17 Dec 2019 16:31:17 +0000 (11:31 -0500)]
zebra: make current show nexthop-group cli zebra-specific
There's confusion between the nexthop-group configuration and a
zebra-specific show command. For now, make the zebra show
command string RIB-specific until we're able to unify these
paths.
Donald Sharp [Tue, 7 Jan 2020 14:03:08 +0000 (09:03 -0500)]
doc: Clarify what is supported directly in PIM documentation
The FRR community keeps getting asked about what is supported or not.
Try to clarify what is and what is not supported in an
additional spot where people interested in using PIM might
have a chance at actually seeing the notification.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>