Donald Sharp [Wed, 29 Aug 2018 02:45:06 +0000 (22:45 -0400)]
staticd: Fix mixup in vrf translations
When we store the nexthop for ref-counting, keep
track of the nexthop vrf_id as well. This will allow
us to track the nexthop per vrf!
Additionally, when we get the callback from zebra about
a nexthop update, iterate over all static routes to
see if the nexthop we are getting a callback for is
one we are concerned about.
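As a minimal sketch of the idea, not the actual staticd code: a tracked nexthop can be keyed by both its address and its vrf_id, so that the same address in two VRFs is treated as two distinct nexthops. The names `struct nh_key`, `nh_key_same` and `nh_update_matches` are hypothetical, and the address is reduced to a plain 32-bit value for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical key: a tracked nexthop is identified by address AND VRF. */
struct nh_key {
	uint32_t vrf_id; /* VRF the nexthop is resolved in */
	uint32_t addr;   /* IPv4 address as a plain integer, for simplicity */
};

/* Two keys match only when both the VRF and the address agree. */
static bool nh_key_same(const struct nh_key *a, const struct nh_key *b)
{
	assert(a && b);
	return a->vrf_id == b->vrf_id && a->addr == b->addr;
}

/* Walk every stored nexthop and count those affected by an update. */
static int nh_update_matches(const struct nh_key *routes, int n,
			     const struct nh_key *updated)
{
	int hits = 0;

	for (int i = 0; i < n; i++)
		if (nh_key_same(&routes[i], updated))
			hits++;
	return hits;
}
```

With this keying, an update for an address in one VRF leaves a route that uses the same address in another VRF untouched.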
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Fri, 3 Aug 2018 17:18:59 +0000 (13:18 -0400)]
lib: Add Aggregate Table and Aggregate_node
Add an abstraction for `struct route_node` and `struct route_table`
such that we can have an aggregate route_node and table. This
is because only bgp/rfapi and ripng use the aggregate data pointer
in `struct route_node`. For full route tables, other routing
protocols and tables are paying an 8 byte overhead per node.
A full bgp table ends up being ~1.2 million routes in bgp
and zebra. This is not an insignificant amount of data.
So create the data structures for this replacement, but
do not replace the aggregate pointer yet. This is because
later commits will convert rfapi and ripng over to this
new data, and finally we'll move the aggregate pointer.
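The layout idea can be sketched as follows. This is an illustration under simplified assumptions, not the real `struct route_node`: the structs below carry a single `info` pointer, and `struct agg_node_sketch`, `agg_from_node` and `saved_bytes_per_node` are hypothetical names. At 8 bytes per node, 1.2 million nodes come to roughly 9.6 MB per table.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a node that always carries the aggregate pointer. */
struct node_with_aggregate {
	void *info;
	void *aggregate; /* used by few protocols, paid for by all */
};

/* The same node once the aggregate pointer has been moved out. */
struct plain_node {
	void *info;
};

/* Hypothetical wrapper: only users that need aggregate data pay for it. */
struct agg_node_sketch {
	struct plain_node node; /* kept first so the cast below is valid */
	void *aggregate;
};

/* Recover the wrapper from a pointer to its embedded plain node. */
static struct agg_node_sketch *agg_from_node(struct plain_node *n)
{
	assert(n);
	return (struct agg_node_sketch *)n;
}

/* Bytes saved per node for every table that does not need the pointer. */
static size_t saved_bytes_per_node(void)
{
	return sizeof(struct node_with_aggregate) - sizeof(struct plain_node);
}
```

The saving equals one pointer, which is 8 bytes on a typical 64-bit system.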
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Philippe Guibert [Fri, 22 Jun 2018 14:01:07 +0000 (16:01 +0200)]
doc: add information about dynamic update of default vrf name
The default VRF name can be changed dynamically if the VRF backend is
netns. When a link to the default netns is created in the
/var/run/netns folder, the file name of that link is used to name the
default VRF. The documentation also explains that, if the netns backend
is not chosen, the default VRF name can still be configured statically
with a new define.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Fri, 22 Jun 2018 14:03:11 +0000 (16:03 +0200)]
lib: protect newly created vrfs against default vrf naming.
Prevent the creation of a vrf whose name is the same as the default
vrf name.
Also, at startup, prevent the creation of the default vrf with a name
already used in the vrf list.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Thu, 31 May 2018 08:12:11 +0000 (10:12 +0200)]
bgpd: handle vrf aliases in vty API
Because a VRF name can be used for the default VRF, or an alias of an
already created VRF can be passed as a parameter, the default VRF name
must be resolved. This avoids, for example, creating duplicate BGP
instances.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Tue, 29 May 2018 09:17:10 +0000 (11:17 +0200)]
*: add a vrf update hook to be informed of the vrf name
VRF aliases can be learned through a specific hook. Zebra uses that hook
to propagate the information to the relevant zapi clients.
The registration hook function is the same for all daemons.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Donald Sharp [Tue, 28 Aug 2018 12:50:16 +0000 (08:50 -0400)]
pimd: Add some more useful data to debug output
An end user was seeing this debug message, but we were not giving
the user enough information to debug the problem on their own.
Add a tiny bit of extra information that could point
the user to solving the problem for themselves.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Christian Franke [Sat, 25 Aug 2018 15:50:03 +0000 (17:50 +0200)]
watchfrr: fix global restart
watchfrr needs to handle a SIGCHLD also when it calls a global restart
command. Before this patch, it would lead to the following behavior:
15:44:28: zebra state -> down : unexpected read error: Connection reset by peer
15:44:33: Forked background command [pid 6392]: /usr/sbin/frr.init watchrestart all
15:44:53: Warning: restart all child process 6392 still running after 20 seconds, sending signal 15
15:44:53: waitpid returned status for an unknown child process 6392
15:44:53: background (unknown) process 6392 terminated due to signal 15
15:45:13: Warning: restart all child process 6392 still running after 40 seconds, sending signal 9
15:45:33: Warning: restart all child process 6392 still running after 60 seconds, sending signal 9
15:45:53: Warning: restart all child process 6392 still running after 80 seconds, sending signal 9
15:46:13: Warning: restart all child process 6392 still running after 100 seconds, sending signal 9
15:46:33: Warning: restart all child process 6392 still running after 120 seconds, sending signal 9
15:46:53: Warning: restart all child process 6392 still running after 140 seconds, sending signal 9
This is obviously incorrect and can be fixed by comparing the pid to
the global restart object as well.
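A minimal sketch of that comparison, assuming a simplified bookkeeping structure rather than watchfrr's real one. `struct restart_info` and `find_restart` are hypothetical names; the point is only that a reaped child pid is checked against the global restart object after the per-daemon ones.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, simplified restart bookkeeping. */
struct restart_info {
	const char *name;
	pid_t pid; /* 0 when no restart command is running */
};

/* Find the restart object that owns a reaped child, or NULL if unknown. */
static struct restart_info *find_restart(struct restart_info *daemons,
					 size_t n,
					 struct restart_info *global,
					 pid_t child)
{
	assert(child > 0);
	for (size_t i = 0; i < n; i++)
		if (daemons[i].pid == child)
			return &daemons[i];
	/* The extra step: compare against the global restart as well. */
	if (global && global->pid == child)
		return global;
	return NULL;
}
```

Without the second check, the child of a global restart is reported as an unknown process and its restart object is never cleared, which matches the repeated warnings in the log above.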
Signed-off-by: Christian Franke <chris@opensourcerouting.org>
Donald Sharp [Sat, 25 Aug 2018 00:42:45 +0000 (20:42 -0400)]
staticd: refcount the nht add/removal
When we add / remove a nexthop that we need to track,
keep track of the number of times we have done this
for each nexthop. Consequently keep track of the
number of available nexthops, so that we can
just install new routes when we get one
that uses a pre-existing nexthop. Deletion of
nexthops is done on refcount going to 0.
Removal of routes is handled elsewhere.
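The refcounting can be sketched like this, assuming a hypothetical `struct nht_entry` per tracked nexthop. `nht_ref` and `nht_unref` are illustrative names, not staticd functions. Only the first reference would need to register with zebra, and only the last release would need to unregister.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical entry for one tracked nexthop. */
struct nht_entry {
	unsigned int refcount; /* number of routes using this nexthop */
};

/* Take a reference; true means this is the first user (register now). */
static bool nht_ref(struct nht_entry *e)
{
	assert(e);
	return e->refcount++ == 0;
}

/* Drop a reference; true means the last user is gone (unregister now). */
static bool nht_unref(struct nht_entry *e)
{
	assert(e && e->refcount > 0);
	return --e->refcount == 0;
}
```

A second route that uses an already tracked nexthop gets `false` from `nht_ref`, so it can be installed using the state that is already known.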
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Thu, 23 Aug 2018 20:05:02 +0000 (16:05 -0400)]
zebra: When registering a nexthop, we do not always need to re-eval
Prior to this change, the code allowed clients to register
for nexthop tracking. Zebra would then look up the rnh and
send any known data to that particular client. Additionally,
zebra was blindly re-evaluating the rnh for every registration.
This leads to interesting behavior in that all clients registered
for that nexthop will get callbacks even if nothing changes.
Modify the code to know whether we have evaluated the rnh or not
and, if so, limit the re-evaluation to when absolutely necessary.
This is particularly important because nht callbacks
cause protocols to do a not insignificant amount of
work, and as more protocols register for nht callbacks
we would cause more work than is necessary.
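One way to picture the change, as a sketch under assumptions rather than zebra's implementation: keep a flag on the entry that records whether it has been evaluated, and skip the costly step on later registrations. `struct rnh_sketch`, `rnh_evaluate` and `rnh_register_client` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a tracked nexthop (rnh) entry. */
struct rnh_sketch {
	bool evaluated;  /* has this entry been resolved at least once? */
	int evaluations; /* how often the costly resolve step has run */
};

/* Placeholder for the expensive resolution step. */
static void rnh_evaluate(struct rnh_sketch *rnh)
{
	rnh->evaluations++;
	rnh->evaluated = true;
}

/* Register one more client; true means an evaluation was needed. */
static bool rnh_register_client(struct rnh_sketch *rnh)
{
	assert(rnh);
	if (rnh->evaluated)
		return false; /* reuse known state, reply to this client only */
	rnh_evaluate(rnh);
	return true;
}
```

Three registrations for the same entry then cost one evaluation instead of three, and clients that registered earlier receive no redundant callback.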
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
David Lamparter [Sat, 25 Aug 2018 00:12:42 +0000 (02:12 +0200)]
doc/user: add protocols vs. platform table
A nicely-formatted colorful table of all our daemons and target OSes.
Based off & intended to replace / extend
https://github.com/FRRouting/frr/wiki/Features-and-Kernel-Support
Signed-off-by: David Lamparter <equinox@diac24.net>
Chirag Shah [Fri, 24 Aug 2018 22:15:36 +0000 (15:15 -0700)]
ospfd: interface speed change during intf add
The problem was seen where a speed mismatch caused an ECMP route
not to be reflected with the correct number of paths (NHs).
During cold boot, some interface speeds are updated by zebra as
part of a one-shot timer, which triggers an interface add to clients.
In this case, ospf has already created the interface (a bond interface),
but its speed was not updated. Trigger an interface speed change
as part of interface add, which causes all Router LSAs to
use the updated speed in the cost calculation.
Ticket:CM-22170
Testing Done:
Bring up a CLOS config with spine and leaves. The leaves form a CLAG pair
with the same VRR ip address.
At the spine, one of the bonds connecting to a leaf node had a
higher speed than the paired device. With this fix, at the spine (DUT)
the bond interface speed is equal from all peer nodes.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Donald Sharp [Fri, 24 Aug 2018 19:21:04 +0000 (15:21 -0400)]
bgpd: Fix CONFDATE to 2019 for a couple of items.
While perusing CONFDATE I noticed that we had a couple of
CONFDATE 201805 entries, which we were not picking up (for other
reasons, fixed in a different PR). Upon investigation
of these I noticed that the commits were made in 201805, so these
CONFDATEs should be in 2019.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Fri, 24 Aug 2018 14:56:15 +0000 (10:56 -0400)]
doc, lib, zebra: Remove deprecated encode and decode functionality
The ZEBRA_IPV4_ROUTE_[ADD|DELETE] and ZEBRA_IPV6_ROUTE_[ADD|DELETE] functionality
has been deprecated for a year now, let's remove this code from the system.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Fri, 24 Aug 2018 14:49:20 +0000 (10:49 -0400)]
zebra: Remove unmaintained and uncompilable code
The zebra/client_main.c code is not being maintained or used.
Remove from system. Especially since the encode/decode
zapi functionality it `purports` to be testing is deprecated
and now being removed.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Quentin Young [Wed, 22 Aug 2018 20:05:04 +0000 (20:05 +0000)]
bgpd: fix rpki exit command
If a command returns a nonzero exit status and VTYSH has a corresponding
command, VTYSH will skip executing its own version. If this happens in a
command that changes CLI nodes we get node desynchronization.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Donald Sharp [Thu, 23 Aug 2018 00:59:46 +0000 (20:59 -0400)]
lib: Limit depth of unused thread list
The master->unused list was unbounded during normal operation.
A full BGP feed on my machine left 11k threads on the unused
list, taking up over 2 MB of data. This seemed a bit excessive, so
reduce to a limit of 10.
Also fix a crash that this exposed where we assumed that a thread
structure was not deleted.
Future committers can make this configurable, or modify
the value to something better for their system. I am
dubious of the value of this.
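A bounded free list of this kind can be sketched as below. This is a generic illustration, not the lib/thread.c code: `struct item`, `struct pool`, `pool_put` and `pool_get` are hypothetical names, and only the limit of 10 comes from the text above.

```c
#include <assert.h>
#include <stdlib.h>

#define UNUSED_LIMIT 10 /* cap on cached items, as chosen in this commit */

struct item {
	struct item *next;
};

struct pool {
	struct item *unused; /* singly linked list of cached items */
	int unused_count;
};

/* Return an item to the pool; free it outright once the cache is full. */
static void pool_put(struct pool *p, struct item *it)
{
	assert(p && it);
	if (p->unused_count >= UNUSED_LIMIT) {
		free(it);
		return;
	}
	it->next = p->unused;
	p->unused = it;
	p->unused_count++;
}

/* Reuse a cached item if one exists, otherwise allocate a fresh one. */
static struct item *pool_get(struct pool *p)
{
	struct item *it;

	assert(p);
	it = p->unused;
	if (it) {
		p->unused = it->next;
		p->unused_count--;
		it->next = NULL;
		return it;
	}
	return calloc(1, sizeof(*it));
}
```

Because `pool_put` may free the item, callers must not touch it afterwards, which is the kind of assumption the crash fix mentioned above concerns.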
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Thu, 16 Aug 2018 17:59:27 +0000 (13:59 -0400)]
lib: Remove default case statement from an enum driven switch
We are using an enum to drive a switch statement and we have
a default case statement that can never be entered because
we know all the enum states have been covered. Remove it
from the code, since it cannot happen.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Thu, 16 Aug 2018 17:51:13 +0000 (13:51 -0400)]
lib: Remove smux option for snmp
The smux.c code has not been able to compile for 2+ years
and no-one has noticed. Additionally net-snmp has marked
smux integration as deprecated for quite some time as well.
Since no-one has noticed and it's been broken and smux integration
is deprecated let's just remove this from the code base.
From looking at the code, it sure looks like SNMP could use
a decent cleanup.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Thu, 16 Aug 2018 16:21:25 +0000 (12:21 -0400)]
bgpd: convert zlog_warns to debugs or errors
Several zlog_warns were being used to tell the end
user that bgp had detected a bug. These all look like information
added during development that can be noted as debugs or logged
as an error situation.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>