Chirag Shah [Wed, 6 Nov 2019 02:30:56 +0000 (18:30 -0800)]
bgpd: fix memory leak in vni table for evpn routes
There is a memory leak of the bgp node (route node)
in vni table while processing evpn remote route(s).
During the remote evpn route processing, a new route
is created in per vni route table, the refcount for
the route node is incremented twice. First refcount
is incremented during the node creation and the second
one when the bgp_info_add is added.
Post evpn route creation, the bgp node refcount needs
to be decremented.
Ticket:CM-26898,CM-26838,CM-27169
Reviewed By:CCR-9474
Testing Done:
In EVPN topology send 1K MAC routes then check the memory footprint
at the remote VTEP before sending 1K type-2 routes
and after flushing/withdrawal of the routes.
After cleaning up 1K MAC entries from source VTEP which triggers BGP withdraw
at the remote VTEP.
root@TOR1:~# vtysh -c "show memory" | grep "Hash Bucket \|BGP node \|BGP route"
Hash Bucket : 4008 32
BGP node : 2182 152 <-- Here 2K delta from initial count.
BGP route : 96 112
With fix:
---------
After 1K MAC entries cleaned up at the remote VTEP, the memory footprint
(BGP Node and Hash Bucket count) is equilibrium to start of the test.
root@TOR1:~# vtysh -c "show memory" | grep "Hash Bucket \|BGP node \|BGP route"
Hash Bucket : 2008 32
BGP node : 182 152
BGP route : 96 112
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Donald Sharp [Tue, 5 Nov 2019 12:35:36 +0000 (07:35 -0500)]
bgpd: use bgp->name_pretty in debugs and add vrf to some output
Recently had a case where I was attempting to debug a nexthop tracking
issue across multiple bgp vrf's and since the setup vrf's in it with
overlapping address ranges, it became real fun real fast to track
vrf data associated. Add a bit of code to allow us to figure out
what vrf we are in when we print out debug messages.
Look through the rest of the code and find debugs where we are
not using bgp->name_pretty and switch it over.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Configure route-map to set preferred global address on and apply
route-map-IN on R2 for R1-R2 session. Now check on R3's BGP and
RIB table has route nexthop as R1 link-local address, which is
not correct.
As of now we clear link-local address info from mp_nexthop_global,
only if mp_nexthop_global is populated with link-local address.
We should do it even if route-map is configured boz forwarding
link-local address from one link scope to another is violation of
the standards.
This commit make bgpd to skip and ignore unsupported
sub-type of PREFIX_SID. (especially new defined sub-type)
Current bgpd can't parase unsupported sub-type of PREFIX_SID.
PREFIX_SID is drafted on draft-ietf-idr-bgp-prefix-sid-27.
There are already new sub-type drafted on
draft-dawra-idr-srv6-vpn-05. (Type5,6 is new defined.)
This commit fix the problem reported as #5277 on GitBub.
Renato Westphal [Wed, 30 Oct 2019 22:28:56 +0000 (19:28 -0300)]
topotest: update the LDP VPLS test
The final test case of this topotest wasn't really testing
anything. Do the following changes to fix this:
* Remove "no link-detect" from the zebra configs so that link down
failures are detected immediately;
* After shutting down the interface between r1 and r2, wait five
seconds before testing if the pseudowires reconverged through
the r3 router.
Vincent Bernat [Sun, 3 Nov 2019 17:32:07 +0000 (18:32 +0100)]
debian: update debian/copyright
Some authors are added in the "GPL-2+" section, notably Alexandre
Cassen for the code in `vrrpd/`, and Cumulus Networks and Open Source
Routing which were uncredited despite many occurrence in the headers.
Stephen Worley [Fri, 1 Nov 2019 19:52:47 +0000 (15:52 -0400)]
zebra: separate zebra_vrf_lookup_table_with_id()
We were creating `other` tables in rib_del(), vty commands, and
dataplane return callback via the zebra_vrf_table_with_table_id()
API.
Seperate the API into only a lookup, never create
and added another with `get` in the name (following the standard
we use in other table APIs).
Then changed the rib_del(), rib_find_rn_from_ctx(), and show route
summary vty command to use the lookup API instead.
This was found via a crash where two different vrfs though they owned
the table. On delete, one free'd all the nodes, and then the other tried
to use them. It required specific timing of a VRF existing, going away,
and coming back again to cause the crash.
=23464== Invalid read of size 8
==23464== at 0x179EA4: rib_dest_from_rnode (rib.h:433)
==23464== by 0x17ACB1: zebra_vrf_delete (zebra_vrf.c:253)
==23464== by 0x48F3D45: vrf_delete (vrf.c:243)
==23464== by 0x48F4468: vrf_terminate (vrf.c:532)
==23464== by 0x13D8C5: sigint (main.c:172)
==23464== by 0x48DD25C: quagga_sigevent_process (sigevent.c:105)
==23464== by 0x48F0502: thread_fetch (thread.c:1417)
==23464== by 0x48AC82B: frr_run (libfrr.c:1023)
==23464== by 0x13DD02: main (main.c:483)
==23464== Address 0x5152788 is 104 bytes inside a block of size 112 free'd
==23464== at 0x48369AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==23464== by 0x48B25B8: qfree (memory.c:129)
==23464== by 0x48EA335: route_node_destroy (table.c:500)
==23464== by 0x48E967F: route_node_free (table.c:90)
==23464== by 0x48E9742: route_table_free (table.c:124)
==23464== by 0x48E9599: route_table_finish (table.c:60)
==23464== by 0x170CEA: zebra_router_free_table (zebra_router.c:165)
==23464== by 0x170DB4: zebra_router_release_table (zebra_router.c:188)
==23464== by 0x17AAD2: zebra_vrf_disable (zebra_vrf.c:222)
==23464== by 0x48F3F0C: vrf_disable (vrf.c:313)
==23464== by 0x48F3CCF: vrf_delete (vrf.c:223)
==23464== by 0x48F4468: vrf_terminate (vrf.c:532)
==23464== Block was alloc'd at
==23464== at 0x4837B65: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==23464== by 0x48B24A2: qcalloc (memory.c:110)
==23464== by 0x48EA2FE: route_node_create (table.c:488)
==23464== by 0x48E95C7: route_node_new (table.c:66)
==23464== by 0x48E95E5: route_node_set (table.c:75)
==23464== by 0x48E9EA9: route_node_get (table.c:326)
==23464== by 0x48E1EDB: srcdest_rnode_get (srcdest_table.c:244)
==23464== by 0x16EA4B: rib_add_multipath (zebra_rib.c:2730)
==23464== by 0x1A5310: zread_route_add (zapi_msg.c:1592)
==23464== by 0x1A7B8E: zserv_handle_commands (zapi_msg.c:2579)
==23464== by 0x19D689: zserv_process_messages (zserv.c:523)
==23464== by 0x48F09F8: thread_call (thread.c:1599)
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Donald Sharp [Fri, 1 Nov 2019 14:10:10 +0000 (10:10 -0400)]
pimd: Do not spew a million warnings
We have a zlog_warn that is unguarded ( and really is a debug message )
as that there is nothing the end user can do and nothing to note
here other than a debug message to track refcounts. Change
to an appropriate debug and zlog_debug it instead.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
ospf: BFD down not tearing down OSPF adjacency for point-to-point network
Root Cause:
Lookup for the point-to-point neighbor was failing because the neighbor
lookup was based on neighbor interface IP address. But, for point-to-point
neighbor the key is router-id for lookup. Lookup failure was causing the
BFD updates from PTM to get dropped.
Fix:
Added walk of the neighbor list if the network type is point-to-point to
find the appropriate neighbor. The match is based on source IP address of
the neighbor since that’s the address registered with BFD for monitoring.
Renato Westphal [Thu, 17 Oct 2019 20:26:02 +0000 (17:26 -0300)]
tools: update the northbound callbacks generator
Add a new '-s' option which controls whether the generated northbound
callbacks are declared with the 'static' specifier or not. If not
(the default), a prototype is generated for each callback before
their declarations.
It's suggested that daemons shouldn't use the '-s' option so that
their northbound callbacks can be implemented in different files
according to their class (config, state, rpc or notification).
libfrr commands, on the other hand, can use the '-s' option when
their associated YANG module is too small and putting all callbacks
in the same file is desirable.
Renato Westphal [Thu, 17 Oct 2019 20:11:17 +0000 (17:11 -0300)]
bfdd: split northbound callbacks into multiple files
Rearrange the bfdd northbound callbacks as following:
* bfd_nb.h: prototypes of all northbound callbacks.
* bfd_nb.c: definition of all northbound callbacks and their
associated YANG data paths.
* bfd_nb_config.c: implementation of YANG configuration nodes.
* bfd_nb_state.c: implementation of YANG state nodes.
This should help to keep to code more organized and easier to
maintain.
Renato Westphal [Thu, 17 Oct 2019 19:03:09 +0000 (16:03 -0300)]
ripngd: split northbound callbacks into multiple files
Rearrange the ripngd northbound callbacks as following:
* ripng_nb.h: prototypes of all northbound callbacks.
* ripng_nb.c: definition of all northbound callbacks and their
associated YANG data paths.
* ripng_nb_config.c: implementation of YANG configuration nodes.
* ripng_nb_state.c: implementation of YANG state nodes.
* ripng_nb_rpcs.c: implementation of YANG RPCs.
This should help to keep to code more organized and easier to
maintain.
Renato Westphal [Thu, 17 Oct 2019 18:46:54 +0000 (15:46 -0300)]
ripd: split northbound callbacks into multiple files
Rearrange the ripd northbound callbacks as following:
* rip_nb.h: prototypes of all northbound callbacks.
* rip_nb.c: definition of all northbound callbacks and their
associated YANG data paths.
* rip_nb_config.c: implementation of YANG configuration nodes.
* rip_nb_state.c: implementation of YANG state nodes.
* rip_nb_rpcs.c: implementation of YANG RPCs.
* rip_nb_notifications.c: implementation of YANG notifications.
This should help to keep to code more organized and easier to
maintain.
Renato Westphal [Thu, 17 Oct 2019 18:33:53 +0000 (15:33 -0300)]
isisd: split northbound callbacks into multiple files
Rearrange the isisd northbound callbacks as following:
* isis_nb.h: prototypes of all northbound callbacks.
* isis_nb.c: definition of all northbound callbacks and their
associated YANG data paths.
* isis_nb_config.c: implementation of YANG configuration nodes.
* isis_nb_state.c: implementation of YANG state nodes.
* isis_nb_notifications.c: implementation of YANG notifications.
This should help to keep to code more organized and easier to
maintain.
SumitAgarwal123 [Mon, 21 Oct 2019 05:53:01 +0000 (22:53 -0700)]
bfdd: Handling local and remote admin-down
Scenarios where this code change is required:
1. BFD is un-configured from BGP at remote end.
Neighbour BFD sends ADMIN_DOWN state, but BFD on local side will send
DOWN to BGP, resulting in BGP session DOWN.
Removing BFD session administratively shouldn't bring DOWN BGP session
at local or remote.
2. BFD is un-configured from BGP or shutdown locally.
BFD will send state DOWN to BGP resulting in BGP session DOWN.
(This is akin to saying do not use BFD for BGP)
Removing BFD session administratively shouldn't bring DOWN BGP session at
local or remote.
Signed-off-by: Sayed Mohd Saquib sayed.saquib@broadcom.com
Donald Sharp [Sat, 26 Oct 2019 01:10:31 +0000 (21:10 -0400)]
bgpd: bgp_path_info_mpath_next only returns values
Since we don't set a value from the return of bgp_path_info_mpath_next
it is impossible for this function to do anything as such the if statement
is dead code as well.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Stephen Worley [Wed, 23 Oct 2019 22:28:10 +0000 (18:28 -0400)]
zebra: Cleanup zebra_nhg APIs
Add a private header file for functions that are internal/special
case like how we do it for `lib/nexthop_group_private.h`.
Remove a bunch of functions from the header file only being used
statically and add some comments for those remaining to indicate
better what their use is.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Wed, 23 Oct 2019 20:49:07 +0000 (16:49 -0400)]
zebra: Re-work zebra_nhg_*_valid APIs
Re-work the validity setting and checking APIs
for nhg_hash_entry's to make them clearer.
Further, they were originally only beings set
on ifdown and install. Extended their use into
releasing entries and to account for setting
the validity of a recursive dependent.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Wed, 23 Oct 2019 19:07:22 +0000 (15:07 -0400)]
zebra: Improve commenting for group requeue case
The commenting for why we would need to requeue a
group from the kernel to be later processed was not
sufficient. Add a better explanation for the flow
and state of the system.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Wed, 23 Oct 2019 18:50:30 +0000 (14:50 -0400)]
zebra: Change wording of duplicate kernel nhg flag
Change the wording of the flag indicating we have received
a nexthop group from the kernel with a different ID but
is fundamentally identical to one we already have.
It was colliding with a flag of similar name in the nexthop struct.
Change it from NEXTHOP_GROUP_DUPLICATE -> NEXTHOP_GROUP_UNHASHABLE
since it is in fact unhashable.
Also change the wording of functions and comments referencing the same
problem.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Wed, 23 Oct 2019 18:08:12 +0000 (14:08 -0400)]
zebra: Check depends for validity, not dependents
When determining whether to set the nhg_hash_entry as
invalid, we should have been checking the depends, not
the dependents. If its a group and at least one of its
depends is valid, the group is still valid.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
zebra: during shutdown processing, drop dplane results
Don't process dataplane results in zebra during shutdown (after
sigint has been seen). The dplane continues to run in order to
clean up, but zebra main just drops results.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
```
Adjusted nhg uninstall handling to clear data and other
cleanup before sending to the dataplane.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Fri, 4 Oct 2019 17:54:39 +0000 (13:54 -0400)]
pbrd: nexthop_group delete cb don't free pbr->nhg
The pbr->nhg callback is used exclusively for individual nexthops
set through `set nexthop`. If an actuall "tracked" nexthop_group is
used, only the `pbrms->nhgrp_name` is set. Thus this delete does
nothing.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Tue, 24 Sep 2019 22:40:09 +0000 (18:40 -0400)]
zebra: Use ng pointer in mpls_ftn_uninstall
With the new nexthop group shared memory framework, pointers
are being used in route_entry for the nexthop_group. Update
the use of this in `mpls_ftn_uninstall()` to reflect the change.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Mon, 9 Sep 2019 17:44:17 +0000 (13:44 -0400)]
doc: Add docs for new nexthop group commands
Add some doc information for the new nexthop group commands. Also
had to add some for ones that were missing, which we are adding
additional commands to.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Tue, 3 Sep 2019 17:53:45 +0000 (13:53 -0400)]
topotests: Expect shared nexthop memory
A few topotests were failing since they were not aware
of shared nexthops and, therefore, matching on flags that
could be changed when another route sharing that nexthop is
installed.
Update routes that are not installed to not match their json output
on the nexthop flag information. The ones that are installed will
still retain their matches though since they can be sure the nexthop
should have those flags (they would be the route that set them).
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>