Don Slice [Wed, 19 Jun 2019 11:22:21 +0000 (11:22 +0000)]
zebra: resolve issue with rnh not evaluating nexhops correctly
Problem discovered in testing that occasionally when an interface
address was flushed, the corresponding route would be removed from
the kernel and zebra but remain in the bgp table and be advertised
to peers. Discovered that when zebra_rib_evaluate_nexthops spun
thru the tree list of rns, if the timing and circumstances were
right, it would move elements and miss evaluating some. Changed
from frr_each to frr_each_safe and the problem is now gone.
Ticket: CM-25301 Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Donald Sharp [Tue, 18 Jun 2019 19:47:10 +0000 (15:47 -0400)]
zebra: Display a bit better debugging for rnh tracking
Add a expected count for the route node we will be processing
as part of nexthop resolution and modify the type to display
a useful string of what the type is instead of a number.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 18 Jun 2019 13:21:49 +0000 (09:21 -0400)]
bgpd: BGP_ERR_MULTIPLE_INSTANCE_NOT_SET is an impossible condition
This code is not returned anywhere in the system as that bgp
is by default multiple-instance 'only' now. So remove
the last remaining bits of it from the code base.
Remove BGP_ERR_MULTIPLE_INSTANCE_USED too.
Make bgp_get explicitly return BGP_SUCCESS
instead of 0.
Remove the multi-instance error code too.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Thu, 6 Jun 2019 00:59:02 +0000 (20:59 -0400)]
bgpd: Fix crash when rd has no data
There exists a state where we may have a rd node but no individual
evpn prefix nodes in the two level table:
(gdb) bt
at bgpd/bgp_evpn_vty.c:1190
filter=FILTER_RELAXED) at lib/command.c:1060
at lib/command.c:1119
vtysh=vtysh@entry=0) at lib/command.c:1273
(gdb) f 5
at bgpd/bgp_evpn_vty.c:1190
1190 bgpd/bgp_evpn_vty.c: No such file or directory.
(gdb) p buf
$1 = "[2]:[0]:[48]:[00:00:00:00:00:00]", '\000' <repeats 240 times>...
(gdb) p json_nroute
$2 = (json_object *) 0x0
(gdb) p rd_header
$3 = 1
(gdb) p buf
$4 = "[2]:[0]:[48]:[00:00:00:00:00:00]", '\000' <repeats 240 times>...
(gdb)
I'm not entirely sure that this is not a `different` problem in that the
rd node should have been removed. But I think preventing the crash
in a show command is probably the right thing to do here.
Fixes: #4501 Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Thu, 6 Jun 2019 00:53:01 +0000 (20:53 -0400)]
bgpd: Mac rescan on interface up/down efficency improvements
On interface up/down, bgp stores the mac address of the interface
in a bgp_mac_hash table entry and then initiates a rescan
of the evpn l2vpn table. The problem with this scan is that
it is looking at every item in the table when only 1 mac
has changed. So every up/down event causes some major trauma
in the bgp_update processing.
Modify the mac scanning such that we know the mac that is changed
and as such we should reprocess those entries only.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 18 Jun 2019 00:16:30 +0000 (20:16 -0400)]
bgpd: Fix memleak of Mac Hash String upon insertion
If we get a callback for a interface change but we do not
actually have to move the mac entry in the hash then
we were accidently leaking the Mac Hash String all over
ourselves. Messy Messy!
Ticket: CM-25351 Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Thu, 13 Jun 2019 02:27:29 +0000 (22:27 -0400)]
lib: Add check for non-preexisting thread
When adding a read/write poll event and we are using a developmental
build add a bit of code to ensure that we do not already have an read
or write event scheduled.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Thu, 13 Jun 2019 01:13:18 +0000 (21:13 -0400)]
lib: Prevent infinite loop in fd handling
If we have a case where have created a fd for i/o and we have
removed the handling thread but still have the fd in the poll
data structure, there existed a case where we would get
the handle this fd return from poll but we would immediately
do nothing with it because we didn't have a thread to hand
the event to.
This leads to an infinite loop. Prevent the infinite loop
from happening and log the problem.
We still need to find the cause of this happening. But
let's prevent the system from melting down in the mean time.
Fixes: #2796 Signed-off-by: David Lamparter <equinox@diac24.net> Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Chirag Shah [Tue, 4 Jun 2019 01:42:00 +0000 (18:42 -0700)]
bgpd: skip evpn remove marked routes from rescan
Skip evpn routes marked for removed from rescan list
when an interface is flapped.
Ticket:CM-24933
Testing Done:
Validated in a scenario where evpn route is marked
for remove as bgp evpn withdrawal is received. Due to
link flap (frr restart of downstream router), the route
was considered for readd via bgp_update. With this
fix, the remove marked routes are skipped from update.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
David Lamparter [Wed, 12 Jun 2019 17:13:30 +0000 (19:13 +0200)]
*: fix northbound initializer warning on OpenBSD
For some reason, the compiler on OpenBSD on our CI boxes doesn't like
struct initializers with ".a.b = x, .a.c = y", generating a warning
about overwritten initializers...
Signed-off-by: David Lamparter <equinox@diac24.net>
David Lamparter [Wed, 12 Jun 2019 15:11:21 +0000 (17:11 +0200)]
lib: use snprintfrr() in "hidden" printfs
We need to be calling snprintfrr() instead of snprintf() in places that
wrap snprintf in some user-exposed way; otherwise the extensions won't
be available for those functions.
Signed-off-by: David Lamparter <equinox@diac24.net>
David Lamparter [Tue, 4 Jun 2019 18:27:05 +0000 (20:27 +0200)]
lib/clippy: assert() for non-optional args
This is mostly to help static analysis; since we know from the command
string which args are optional and which aren't, we can add assert()
statements on them.
Fixes: #3270 Signed-off-by: David Lamparter <equinox@diac24.net>
David Lamparter [Tue, 4 Jun 2019 15:07:57 +0000 (17:07 +0200)]
lib/clippy: error out on unsupported bits
clippy can't process #ifdef or similar bits inside of an argument list
(e.g. within the braces of a DEFUN or DEFPY statement.) Improve error
reporting to catch these cases instead of generating broken C code.
Fixes: #3840 Signed-off-by: David Lamparter <equinox@diac24.net>
David Lamparter [Tue, 4 Jun 2019 14:20:20 +0000 (16:20 +0200)]
build: improve env var handling for cross build
We can use `$ac_precious_vars` to get at autoconf's conception of which
environment variables are relevant. This makes "HOST_..." setup more
consistent for cross-compilation setups.
Fixes: #4006 Signed-off-by: David Lamparter <equinox@diac24.net>
Renato Westphal [Mon, 27 May 2019 22:48:13 +0000 (19:48 -0300)]
lib: fix outdated candidate configuration issue
Even when using the classic CLI mode (i.e. when --tcli is not
used), the northbound code still uses vty->candidate_config
to perform configuration changes. From the perspective of the
user, the running configuration is being edited directly, but
under the hood the northbound layer does a full configuration
transaction for each command. When the running configuration is
edited by a northbound client other than the CLI (e.g. kernel,
gRPC), vty->candidate_config might become outdated, and this can
lead to lots of weird problems. To fix this, always regenerate
vty->candidate_config before each configuration command when
using the classic CLI mode. When using the transactional CLI,
the user needs to update the candidate manually using the "update"
command, otherwise the "commit" command will fail with this error:
"% Candidate configuration needs to be updated before commit".
Fixes some problems reported by Don after moving an interface from
one VRF to another one while zebra is running.
Reported-by: Don Slice <dslice@cumulusnetworks.com> Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Ameya Dharkar [Thu, 9 May 2019 22:34:52 +0000 (15:34 -0700)]
Lib: Debugs for route-map code in FRR
Added a CLI "debug route-map" to enble route-map debugs
Added debugs for following triggers
1. Add/delete a route-map
2. Add/delete a sequence in route-map
3. Add/delete a match statement(dependency)
4. Update a dependency
5. Apply a route-map
Soman K S [Tue, 11 Jun 2019 13:20:09 +0000 (06:20 -0700)]
bgpd: Process core when bgp instance is deleted
* When the bgp is being deleted and routes are in clear workqueue
and new aggregate address being allocated
* Added flag BGP_FLAG_DELETE_IN_PROGRESS in bgp structure to
bgp instance is being deleted
* When adding aggregate route check this flag and peer_self is valid