Renato Westphal [Fri, 4 Jan 2019 21:08:10 +0000 (19:08 -0200)]
ripdng: clear list of peers when RIPng is deconfigured
This is an old standing bug where the list of RIPng peers wasn't
cleared after deconfiguring RIPng, which caused the existing peers
to still be present on a newly configured RIPng instance (except
when the timed out after ~3 minutes). Fix this.
Renato Westphal [Fri, 4 Jan 2019 21:08:10 +0000 (19:08 -0200)]
ripngd: simplify cleaning up of routing instance
* Call ripng_clean() only when RIPng is configured, this way we can
remove one indentation level from this function.
* ripng_redistribute_clean() is only called on shutdown, so there's
no need to call ripng_redistribute_withdraw() there since the RIPng
table is already cleaned up elsewhere.
* Deallocate the ripng structure only at the end of the function. This
prepares the ground for the next commits where all global variables
will be moved to the ripng structure.
Renato Westphal [Fri, 4 Jan 2019 21:08:10 +0000 (19:08 -0200)]
ripd: remove the rip global variable
This is the last step to make ripd ready for multi-instance support.
Remove the rip global variable and add a "rip" parameter to all
functions that need to know the RIP instance they are working
on. On some functions, retrieve the RIP instance from the interface
variable when it exists (this assumes interfaces can pertain to
one RIP instance at most, which is ok for VRF support).
In preparation for the next commits (VRF support), add a "vrd_id"
member to the rip structure, and use rip->vrf_id instead of
VRF_DEFAULT wherever possible.
Renato Westphal [Fri, 4 Jan 2019 21:08:10 +0000 (19:08 -0200)]
ripd: clear list of peers when RIP is deconfigured
This is an old standing bug where the list of RIP peers wasn't
cleared after deconfiguring RIP, which caused the existing peers
to still be present on a newly configured RIP instance (except when
the timed out after ~3 minutes). Fix this.
Renato Westphal [Fri, 4 Jan 2019 21:08:10 +0000 (19:08 -0200)]
ripd: move global counters to the rip structure
The only sideeffect of this change is that these counters will be
reset when RIP is deconfigured and then configured again, but this
shouldn't be a problem as the RIP MIB isn't specific about this.
Renato Westphal [Fri, 4 Jan 2019 21:08:10 +0000 (19:08 -0200)]
ripd: simplify cleaning up of routing instance
* Call rip_clean() only when RIP is configured, this way we can
remove one indentation level from this function.
* rip_redistribute_clean() is only called on shutdown, so there's
no need to call rip_redistribute_withdraw() there since the RIP
table is already cleaned up elsewhere.
* There's no need to clean up the "rip->neighbor" nodes manually before
calling route_table_finish().
* Deallocate the rip structure only at the end of the function. This
prepares the ground for the next commits where all global variables
will be moved to the rip structure.
Renato Westphal [Fri, 4 Jan 2019 21:08:09 +0000 (19:08 -0200)]
ripngd: fix valgrind warning about uninitialized memory usage
Fixes the following warning when running ripngd with valgrind:
==38== Syscall param sendmsg(msg.msg_control) points to uninitialised
byte(s)
==38== at 0x5EA1E47: sendmsg (sendmsg.c:28)
==38== by 0x118C48: ripng_send_packet (ripngd.c:226)
==38== by 0x11D1D6: ripng_request (ripngd.c:1924)
==38== by 0x120BD8: ripng_interface_wakeup (ripng_interface.c:666)
==38== by 0x4ECB4B4: thread_call (thread.c:1601)
==38== by 0x4E8D9CE: frr_run (libfrr.c:1011)
==38== by 0x1121C8: main (ripng_main.c:180)
==38== Address 0xffefffc34 is on thread 1's stack
==38== in frame #1, created by ripng_send_packet (ripngd.c:172)
Renato Westphal [Mon, 14 Jan 2019 18:29:18 +0000 (16:29 -0200)]
doc: update build instructions for freebsd on how to obtain libyang
Unfortunately the first version of the FreeBSD libyang port contained
a bug in which the libyang pkginfo file wasn't being installed
correctly in the system, and this prevented the FRR build system from
detecting the library. This bug was already fixed months ago but some
FreeBSD package repositories still have the old bugged version of the
port. This means we can't suggest people to install libyang using
"pkg install" since this causes problems for most people. In this
case, suggest FreeBSD users to build and install libyang manually
as we suggest for other BSD platforms.
This commit should be reverted once all FreeBSD package repositories
are updated with the new version of the libyang port.
Renato Westphal [Mon, 14 Jan 2019 18:29:18 +0000 (16:29 -0200)]
lib: update suggestions related to some northbound errors
Since commit 3a11599c, the FRR YANG modules are embedded inside the
binaries and no longer need to be loaded from the file system. This
way, it's impossible for the FRR binaries and YANG modules to be out
of sync anymore. As such, update the suggestions of the northbound
error codes.
Renato Westphal [Mon, 14 Jan 2019 18:29:18 +0000 (16:29 -0200)]
lib: don't abort when incomplete xpath is given by the user
Instead of aborting when an incomplete xpath is given to the
nb_oper_data_iterate() function, just return an error so that the
callers have a chance to treat this error. Aborting based on invalid
user input is never the right thing to do.
Renato Westphal [Mon, 14 Jan 2019 18:29:18 +0000 (16:29 -0200)]
lib: fix "use of uninitialised value" valgrind warning
When FRR is built without the --enable-config-rollbacks option,
the nb_db_transaction_save() function does nothing and the
"transaction_id" output parameter is left uninitialized. For
this reason, all northbound clients should initialize the
"transaction_id" argument before calling nb_candidate_commit() or
nb_candidate_commit_apply() (except when a NULL pointer is given,
which is the case of the confd and sysrepo plugins).
Renato Westphal [Mon, 14 Jan 2019 18:29:18 +0000 (16:29 -0200)]
lib: fix "may be used uninitialized" build warning
We are already handling all possible four cases from the "nb_event"
enumeration, so this problem can't happen in practice. Initialize the
"ref" variable to zero to silence the warning.
Renato Westphal [Fri, 11 Jan 2019 21:20:13 +0000 (19:20 -0200)]
lib, zebra: add AFI parameter to the ZEBRA_REDISTRIBUTE_DEFAULT_* messages
Some daemons like ospfd and isisd have the ability to advertise a
default route to their peers only if one exists in the RIB. This
is what the "default-information originate" commands do when used
without the "always" parameter.
For that to work, these daemons use the ZEBRA_REDISTRIBUTE_DEFAULT_ADD
message to request default route information to zebra. The problem
is that this message didn't have an AFI parameter, so a default route
from any address-family would satisfy the requests from both daemons
(e.g. ::/0 would trigger ospfd to advertise a default route to its
peers, and 0.0.0.0/0 would trigger isisd to advertise a default route
to its IPv6 peers).
Fix this by adding an AFI parameter to the
ZEBRA_REDISTRIBUTE_DEFAULT_{ADD,DELETE} messages and making the
corresponding code changes.
Donald Sharp [Thu, 3 Jan 2019 18:50:23 +0000 (13:50 -0500)]
zebra: Add a switch statement for rib_process_after
Future commits are going to introduce more rigor in
state setting in the case of received results from
the data plane. So let us move the DPLANE_OP_ROUTE_DELETE
state check to the same spot as the rest of the code that
is handling a particular operation.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Thu, 3 Jan 2019 16:56:35 +0000 (11:56 -0500)]
zebra: Limit meta_queue insertion to one time.
Modify the meta_queue insertion such that we only enqueue
the route_node into one meta_queue instead of several.
Suppose we have multiple route_entries associated with
a particular node from rip, bgp, staticd. If we receive a
route update from rip, we would enqueue the route_node into
the 1, 2, 3 meta-nodes. Which means that we would run
the entire process of figuring out a route 3 times, while
nothing would change the second two times.
Modify the code to choose the lowest meta-queue and
install it into that one for processing.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Wed, 9 Jan 2019 23:28:10 +0000 (18:28 -0500)]
lib: Add another 32 bit accessor to the prefix data structure
It would be nice to have the ability to access the prefix data structure
address as a block of 4 uint32_t's. This will allow me to easily/quickly
update the v6 address by 1. This will be used in subsuquent commits.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Mark Stapp [Fri, 21 Dec 2018 19:12:33 +0000 (14:12 -0500)]
zebra: pass lists of results from dataplane to zebra
Pass lists of results back to zebra from the dataplane subsystem
(and pthread). This helps reduce the lock/unlock cycles when
zebra is busy. Also remove a couple of typedefs that made their
way into the dataplane header file - those violate the FRR style
guidelines.
Donald Sharp [Wed, 9 Jan 2019 19:59:22 +0000 (14:59 -0500)]
lib, bgpd: Convert frr_pthread_set_name to only cause it to set os name of the thread
The current invocation of frr_pthread_set_name was causing it reset the os_name.
There is no need for this, we now always create the pthread appropriately
to have both name and os_name. So convert this function to a simple
call through of the pthread call now.
Before(any of these changes):
sharpd@robot ~/frr1> ps -L -p 16895
PID LWP TTY TIME CMD
16895 16895 ? 00:01:39 bgpd
16895 16896 ? 00:00:54
16895 16897 ? 00:00:07 bgpd_ka
Donald Sharp [Wed, 9 Jan 2019 19:32:44 +0000 (14:32 -0500)]
lib: Cleanup thread name setting to happen at start
When we start a thread we always call fpt_run and since
the last commit we know os_name is filled with something,
therefore we can just set the name on startup.
I removed the abstraction to frr_pthread_set_name because
it was snprintf'ing into the same buffer which was the
real bug here( the first character of os_name became null).
In the next commit I'll remove that api because
it is unneeded and was a horrible hack to get
this to work for the one place it was wanted.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Wed, 9 Jan 2019 18:41:46 +0000 (13:41 -0500)]
lib: On frr_pthread_new save a os_name
On call of frr_pthread_new, save the os_name if given,
if not given use the name passed in( shortening to fit
in available space ) and finally if the name was not
passed in use the default value.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Philippe Guibert [Fri, 28 Dec 2018 13:27:45 +0000 (14:27 +0100)]
zebra: do not create vrf if name already set to default vrf at startup
if the default vrf name is manually set, by passing -o parameter to
zebra, then this should be detected when walking the list of netns
available in the system. If a netns called vrf0 is present, then it
should be ignored.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Fri, 21 Dec 2018 15:25:20 +0000 (16:25 +0100)]
zebra: start the netns notification mechanism after ns initialisation
when zebra is run, by using vrf netns backend mode, then the parser
detector of netns is run before forcing the default vrf to a possible
value. In that case, there is a possibility that the forced '-o' option
will create a second vrf with same name, whereas this option should be
there to uniquely have a default vrf with a value.
To make things consistent, the forced value will be priorised. Then, the
notifier will attempt to create vrf contexts. The expectation is that
the creation will fail, due to an already present vrf with same name.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Donald Sharp [Wed, 9 Jan 2019 17:18:21 +0000 (12:18 -0500)]
lib: Convert RUSAGE_SELF to RUSAGE_THREAD where we can
When using getrusage, we have multiple choices about what
to call for data gathering about this particular thread of execution.
RUSAGE_SELF -> This means gather all cpu run time for all pthreads associated
with this process.
RUSAGE_THREAD -> This means gather all cpu run time for this particular
pthread.
Clearly with data gathering for slow thread as well as `show thread cpu`
it would be preferable to gather only data about the current running
pthread. This probably was the original behavior of using RUSAGE_SELF
when we didn't have multiple pthreads. So it didn't matter so much.
Prior to this change, 10 iterations of 1 million routes install/remove
from zebra would give us this cpu time for the dataplane pthread:
Showing statistics for pthread Zebra dplane thread
--------------------------------------------------
CPU (user+system): Real (wall-clock):
Active Runtime(ms) Invoked Avg uSec Max uSecs Avg uSec Max uSecs Type Thread
0 280902.149 326541 860 2609982 550 2468910 E dplane_thread_loop
After this change we are seeing this:
Showing statistics for pthread Zebra dplane thread
--------------------------------------------------
CPU (user+system): Real (wall-clock):
Active Runtime(ms) Invoked Avg uSec Max uSecs Avg uSec Max uSecs Type Thread
0 58045.560 334944 173 277226 539 2502268 E dplane_thread_loop
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Wed, 9 Jan 2019 13:48:37 +0000 (08:48 -0500)]
bgpd: Do not send a label to zebra that it doesn't understand
When using an `import vrf` mechanism we are marking
the vrf label as BGP_PREVENT_VRF_2_VRF_LEAK, and then sending
this down to zebra. Since zebra knows nothing about this special
value, convert it to a value that it does know MPLS_LABEL_NONE.
And zebra kept the label as:
donna.cumulusnetworks.com# show mpls table
Inbound Outbound
Label Type Nexthop Label
-------- ------- --------------- --------
-2 BGP GREEN
-2 BGP BLUE
After this fix, neither the labels are stored in zebra nor do we see
the log error message.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Wed, 9 Jan 2019 01:23:11 +0000 (20:23 -0500)]
bgpd: Further refine hash lookup to store hash value
Further refine the previous commit to store the hash value in
both the `struct community_list` as well as the `struct rmap_community`
structures. This allows us to know a priori what our hash value
is. This change cuts another couple of seconds of convergence
off to ~55 seconds and further reduces cpu load of bgp:
16 40061.706 433732 92 330102 129 1242965 RWTEX TOTAL
Down from ~43 seconds previously.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Wed, 26 Dec 2018 18:21:10 +0000 (13:21 -0500)]
bgpd: Add a hash for quick lookup in community_list_lookup
The community_list_lookup function in a situation where you have
a large number of communities and route-maps that reference them
becomes a very expensive operation( effectively a linked list walk
per route per route-map you apply per peer that has a routemap that
refereces a community, ecommunity or lcommunity. This is a very
expensive operation.
In my testbed, I have a full bgp feed that feeds into 14 namespace
view based bgp processes and finally those 14 feed into a final
namespace FRR instance that has route-maps applied to each
incoming peer for in and out:
!
router bgp 65033
bgp bestpath as-path multipath-relax
neighbor 192.168.41.1 remote-as external
neighbor 192.168.42.2 remote-as external
neighbor 192.168.43.3 remote-as external
neighbor 192.168.44.4 remote-as external
neighbor 192.168.45.5 remote-as external
neighbor 192.168.46.6 remote-as external
neighbor 192.168.47.7 remote-as external
neighbor 192.168.48.8 remote-as external
neighbor 192.168.49.9 remote-as external
neighbor 192.168.50.10 remote-as external
neighbor 192.168.51.11 remote-as external
neighbor 192.168.52.12 remote-as external
neighbor 192.168.53.13 remote-as external
neighbor 192.168.54.14 remote-as external
!
address-family ipv4 unicast
neighbor 192.168.42.2 prefix-list two-in in
neighbor 192.168.42.2 route-map two-in in
neighbor 192.168.42.2 route-map two-out out
neighbor 192.168.43.3 prefix-list three-in in
neighbor 192.168.43.3 route-map three-in in
neighbor 192.168.43.3 route-map three-out out
neighbor 192.168.44.4 prefix-list four-in in
neighbor 192.168.44.4 route-map four-in in
neighbor 192.168.44.4 route-map four-out out
neighbor 192.168.45.5 prefix-list five-in in
neighbor 192.168.45.5 route-map five-in in
neighbor 192.168.45.5 route-map five-out out
neighbor 192.168.46.6 prefix-list six-in in
neighbor 192.168.46.6 route-map six-in in
neighbor 192.168.46.6 route-map six-out out
neighbor 192.168.47.7 prefix-list seven-in in
neighbor 192.168.47.7 route-map seven-in in
neighbor 192.168.47.7 route-map seven-out out
neighbor 192.168.48.8 prefix-list eight-in in
neighbor 192.168.48.8 route-map eight-in in
neighbor 192.168.48.8 route-map eight-out out
neighbor 192.168.49.9 prefix-list nine-in in
neighbor 192.168.49.9 route-map nine-in in
neighbor 192.168.49.9 route-map nine-out out
neighbor 192.168.50.10 prefix-list ten-in in
neighbor 192.168.50.10 route-map ten-in in
neighbor 192.168.50.10 route-map ten-out out
neighbor 192.168.51.11 prefix-list eleven-in in
neighbor 192.168.51.11 route-map eleven-in in
neighbor 192.168.51.11 route-map eleven-out out
neighbor 192.168.52.12 prefix-list twelve-in in
neighbor 192.168.52.12 route-map twelve-in in
neighbor 192.168.52.12 route-map twelve-out out
neighbor 192.168.53.13 prefix-list thirteen-in in
neighbor 192.168.53.13 route-map thirteen-in in
neighbor 192.168.53.13 route-map thirteen-out out
neighbor 192.168.54.14 prefix-list fourteen-in in
neighbor 192.168.54.14 route-map fourteen-in in
neighbor 192.168.54.14 route-map fourteen-out out
exit-address-family
!
This configuration on my machine before this change takes about 2:45 to converge
and bgp takes:
Total thread statistics
16 151715.050 493440 307 3464919 335 7376696 RWTEX TOTAL
CPU time as reported by 'show thread cpu'.
After this change BGP takes 58 seconds to converge and uses:
Total thread statistics
-------------------------
16 42954.284 350319 122 295743 157 1194820 RWTEX TOTAL
almost 43 seconds of CPU time.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Wed, 26 Dec 2018 18:01:06 +0000 (13:01 -0500)]
bgpd: Use `struct rmap_community` when we use community_list_lookup
The community_list_lookup function is being changed in a future
commit. As such we want to use the `struct rmap_community` data
structure for storing compiled information about communities,ecommunities
or lcommunities.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Rhonda D'Vine [Fri, 4 Jan 2019 10:34:13 +0000 (11:34 +0100)]
debianpkg: use getent instead of egrepping files
User data might not be stored in the files in etc. getent is the
dedicated tool to extract those information, regardless of where the
user data is stored
Rafael Zalamena [Tue, 8 Jan 2019 12:32:28 +0000 (10:32 -0200)]
zebra: fix FreeBSD warning on fresh OS boot
Handle corner case where a warning log message is issued on interface
address netmask handling with sockaddr type AF_LINK: it may come empty
or with match all (all 0xFF).
In the first case all lengths are zero and we only need to copy the
first bytes, second case it comes with a zero index and all 0xFF bytes.
In any case we only need to figure out a few of the first bytes instead
of all data.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Rafael Zalamena [Tue, 8 Jan 2019 10:14:28 +0000 (08:14 -0200)]
zebra: implement FreeBSD route attr handling
When porting routing socket macro data handling to functions, the
attribute function was forgotten. The only difference between the
attribute and address handler is the family type check.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Donald Sharp [Sat, 22 Dec 2018 00:16:52 +0000 (19:16 -0500)]
bgpd: Modify End of Rib notification to INFO
The End of Rib notification in BGP is useful to know no matter
the circumstances. So change this from a debug message to
an info and cleanup the message a bit and add vrf we are in.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>