Mark Stapp [Tue, 30 Oct 2018 13:41:55 +0000 (09:41 -0400)]
zebra: only uninstall once, when closing rib table
When the rib code is informed that a table is closing/
going away, only try once to uninstall associated routes from
the fib/dataplane. The close path can be called multiple times
in some cases - zebra shutdown, e.g.
David Lamparter [Sat, 27 Oct 2018 17:06:22 +0000 (19:06 +0200)]
build: crop excessive net-snmp library list
This fixes the longstanding GPL vs. OpenSSL licensing issue in our SNMP
code (and cuts down on its other dependencies a wee bit.)
In a way, net-snmp is really buggy here in what it says that we should
link against, but I don't know their application scenarios well enough
to say it should be changed at their end.
Signed-off-by: David Lamparter <equinox@diac24.net>
Mark Stapp [Mon, 15 Oct 2018 15:14:07 +0000 (11:14 -0400)]
zebra: only perform shutdown signal processing once
Avoid running the shutdown/sigint handler code more than once. With
the async dataplane, once shutdown has been initiated, the completion
of all async updates triggers final shutdown of the zebra main
pthread. During that time, avoid taking and processing a second
signal, such as SIGINT or SIGTERM.
Mark Stapp [Mon, 27 Aug 2018 20:03:37 +0000 (16:03 -0400)]
zebra: support zebra shutdown and cleanup
Dplane support for zebra's route cleanup during shutdown (clean
shutdown via SIGINT, anyway.) The dplane has the opportunity to
process incoming updates, and then triggers final cleanup
in zebra's main thread.
Mark Stapp [Fri, 17 Aug 2018 19:50:09 +0000 (15:50 -0400)]
zebra: add dplane show commands
Add first pass at show commands for the zebra dplane. Add some stats
counters to show. Start prep for correct shutdown processing, and for
multiple providers.
Mark Stapp [Thu, 19 Jul 2018 18:55:02 +0000 (14:55 -0400)]
zebra: ensure redist of system routes
We need a bit of special handling for system routes, which need
to be offered for redistribution even though they won't be
passing through the dplane system.
Mark Stapp [Wed, 23 May 2018 16:20:43 +0000 (12:20 -0400)]
zebra: start dataplane layer work
Reduce or eliminate use of global zebra_ns structs in
a couple of netlink/kernel code paths, so that those paths
can potentially be made asynch eventually.
Slide netlink_talk_info into place to remove dependency on core
zebra structs; add accessors for dplane context block
Start init of route context from zebra core re and rn structs;
start queueing and event handling for incoming route updates.
Expose netlink apis that don't rely on zebra core structs;
add parallel route-update code path using the dplane ctx;
simplest possible event loop to process queued route'
updates.
David Lamparter [Wed, 24 Oct 2018 15:31:31 +0000 (17:31 +0200)]
build: add "redistclean" target
This puts a source tree back in the state it was in after unpacking a
dist tarball. Different from distclean in that it doesn't remove files
that are included in the tarball.
Signed-off-by: David Lamparter <equinox@diac24.net>
David Lamparter [Mon, 15 Oct 2018 04:51:30 +0000 (06:51 +0200)]
build: work around automake wtf
For some reason, automake was "randomizing" the order of these few lines
in the generated output Makefile.in.
I have absolutely no clue what's going on, but it's the only thing
preventing me from building reproducible source tarballs (i.e.
bit-exactly identical), so... just slightly "rephrase" this.
Should behave exactly the same as before.
Signed-off-by: David Lamparter <equinox@diac24.net>
Donald Sharp [Wed, 24 Oct 2018 15:34:50 +0000 (11:34 -0400)]
zebra: Notice when a route fails to install on *bsd
When we fail to install a route into bsd, note the case
where we have no viable nexthops installed for it, so
that we can know in zebra if the route is good or not.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
David Lamparter [Tue, 23 Oct 2018 12:06:25 +0000 (14:06 +0200)]
build: carry --with-pkg-extra-version into tarballs
If we use "./configure --with-pkg-extra-version=... && make dist", we
probably want the dist tarball to remember the extra version it was
configured with.
Use --without-pkg-extra-version to kill the tag.
Signed-off-by: David Lamparter <equinox@diac24.net>
Donald Sharp [Mon, 27 Aug 2018 18:36:46 +0000 (14:36 -0400)]
zebra: Move rules_hash to zrouter
Move the rules_hash to the zrouter data structure and provide
the additional bit of work needed to lookup the rule based upon
the namespace id as well. Make the callers of functions not
care about what namespace id we are in.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com> Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Mon, 27 Aug 2018 14:43:37 +0000 (10:43 -0400)]
zebra: Start breakup of zns into zrouter and zns
The `struct zebra_ns` data structure is being used
for both router information as well as support for
the vrf backend( as appropriate ). This is a confusing
state. Start the movement of `struct zebra_ns` into
2 things `struct zebra_router` and `struct zebra_ns`.
In this new regime `struct zebra_router` is purely
for handling data about the router. It has no knowledge
of the underlying representation of the Data Plane.
`struct zebra_ns` becomes a linux specific bit of code
that allows us to handle the vrf backend and is allowed
to have knowledge about underlying data plane constructs.
When someone implements a *bsd backend the zebra_vrf data
structure will need to be abstracted to take advantage of this
instead of relying on zebra_ns.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Christian Franke [Wed, 24 Oct 2018 05:19:22 +0000 (07:19 +0200)]
isisd: delay lsp regeneration while events are still coming in
When there is a stream of events coming in, where IS-IS learns
about a lot of updates, IS-IS would regenerate its LSPs before
the updates have been processed completely.
This causes suboptimal convergence because the intermediate state
will be flooded. Only after the configured `lsp_gen_interval`, a
new update with the correct and final state will be generated.
Resolve this by holding off LSP generation while there are still
events coming in.
Signed-off-by: Christian Franke <chris@opensourcerouting.org>
Christian Franke [Wed, 24 Oct 2018 04:27:17 +0000 (06:27 +0200)]
isisd: Combine lsp_l1/l2_refresh
lsp_l1_refresh and lsp_l2_refresh are identical apart from the
hardcoded IS-IS level they are referring to. So merge them and
pass the level as part of the argument.
Signed-off-by: Christian Franke <chris@opensourcerouting.org>
Christian Franke [Wed, 24 Oct 2018 03:38:53 +0000 (05:38 +0200)]
isisd: Log LSP-update trigger source
For debugging the timing of LSP generation, it is useful to know
which event caused a regeneration to be scheduled. Therefore, add
this information to the debug log.
Signed-off-by: Christian Franke <chris@opensourcerouting.org>
bgpd:Fixing the signature of community_free function
community_free, lcommunity_free and ecommunity_free are similar type of functions. Most of the places, these three are called together. The signature of community_free is different from other two functions. Modified the community_free API signature to align with other two functions to avoid any confusion. There is no functionality impact with this and this is just to avoid any confusion.
Testing: manual testing and show commands Signed-off-by: Sri Mohana Singamsetty msingamsetty@vmware.com
bgpd: fill in prefix for flowspec entry when json format is requested
as prefix is opaque for flowspec, and json needs to have a non empty
full of meaning value in prefix, the proposal is to encode the
displayable form of flowspec entry.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Renato Westphal [Fri, 19 Oct 2018 18:55:47 +0000 (15:55 -0300)]
ospfd: fix issue with the "no segment-routing prefix A.B.C.D/M" command
Add a missing check to bail out earlier when SR is not configured. The
same command without the "no" prefix has the same check as it prevents
unexpected things (i.e. crashes) from happening.
Fixes the following segfaults:
ospfd aborted: vtysh -c "configure terminal" -c "router ospf" -c "no segment-routing prefix 1.1.1.1/32"
ospfd aborted: vtysh -c "configure terminal" -c "router ospf" -c "no segment-routing prefix 1.1.1.1/32 index 65535 no-php-flag"
Renato Westphal [Fri, 19 Oct 2018 18:55:22 +0000 (15:55 -0300)]
bgpd: use the vrf_bitmap_*() helper functions when necessary
zclient->redist[afi][type] is a hash table and not an integer since a
while ago when VRF support was introduced. As such, zclient->redist[][]
should never be manipulated directly, the vrf_bitmap_*() helper functions
should be used instead. This fixes a few crashes found by the CLI fuzzer.
Renato Westphal [Fri, 19 Oct 2018 18:55:12 +0000 (15:55 -0300)]
bgpd: fix bug while iterating over VPN table
The routing table data structure can create intermediate route nodes
during its normal operation, so we always need to check if the 'info'
pointer of a route node is NULL or not before dereferencing it.
Renato Westphal [Fri, 19 Oct 2018 18:55:08 +0000 (15:55 -0300)]
bgpd: remove wrong assert
The vnc_direct_del_rn_group_rd() function can be called with the 'afi'
parameter set to AFI_L2VPN on some specific cases. Remove the assert to
fix the crash.
Renato Westphal [Fri, 19 Oct 2018 18:55:03 +0000 (15:55 -0300)]
bgpd: fix NULL pointer dereference bug
Other parts of the rfapi code check if the 'rfg->rfapi_import_table'
pointer is NULL or not before using it. Do the same here to fix a crash
detected by the CLI fuzzer.
Renato Westphal [Fri, 19 Oct 2018 18:54:57 +0000 (15:54 -0300)]
bgpd: add a NULL check to prevent a crash in the rfapi code
The rfapiDeleteRemotePrefixesIt() function checks on several places if
'p' is NULL or not. Introduce an additional NULL check to prevent a
crash from happening.
Renato Westphal [Fri, 19 Oct 2018 18:54:47 +0000 (15:54 -0300)]
bgpd: fix crashes caused by missing input validation
The rfapi code wasn't checking if strtoul() succeeded or not when parsing
the list of labels. Fix the affected commands by not allowing the user
to enter a non-numeric input.
Renato Westphal [Fri, 19 Oct 2018 18:53:55 +0000 (15:53 -0300)]
bgpd: handle NULL pointers in lcommunity_cmp()
Like community_cmp() and ecommunity_cmp(), the lcommunity_cmp() function
also needs to handle NULL pointers for correct operation.
Without this fix, bgpd can crash when entering the following commands:
vtysh -c "configure terminal" -c "ip large-community-list standard WORD deny"
vtysh -c "configure terminal" -c "no ip large-community-list expanded WORD"
Renato Westphal [Fri, 19 Oct 2018 18:53:46 +0000 (15:53 -0300)]
bgpd: fix cleanup of dampening configuration
The bgp_damp_config_clean() function was deallocating some arrays without
resetting the variables that represent their sizes. This was leading to
some crashes because other parts of the code iterate over these arrays
by looking at their corresponding sizes, which could be invalid.
Fixes the following segfaults (which only happen under certain
circumstances):
vtysh -c "configure terminal" -c "router bgp 1" -c "bgp dampening"
vtysh -c "configure terminal" -c "router bgp 1" -c "no bgp dampening"
vtysh -c "configure terminal" -c "router bgp 1" -c "no bgp dampening 45"
vtysh -c "" -c "clear ip bgp dampening"
Renato Westphal [Fri, 19 Oct 2018 18:53:33 +0000 (15:53 -0300)]
bfdd: do not allow multihop peers without a local-address
The BFD code assumes that multihop peers have a local address
configured. When that doesn't happen, the BFD client daemons fail to
decode some BFD ZAPI messages and abort. To fix this, do not accept the
configuration of multhop peers unless a local-address is configured.
Donald Sharp [Sat, 20 Oct 2018 12:59:30 +0000 (08:59 -0400)]
ospfd: Do not allow thread drop
When the ospf->oi_write_q is not empty that means that ospf could
already have a thread scheduled for running. Just dropping
the pointer before resheduling does not stop the one currently
scheduled for running from running. The calling of thread_add_write
checks to see if we are already running and does the right thing here
so it is sufficient to just call thread_add_write.
This issue was tracked down from this stack trace:
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [EC 134217739] interface eth2.1032:172.16.4.110: ospf_check_md5 bad sequence 5333618 (expect 5333649)
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: message repeated 3 times: [ [EC 134217739] interface eth2.1032:172.16.4.110: ospf_check_md5 bad sequence 5333618 (expect 5333649)]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: Assertion `node’ failed in file ospfd/ospf_packet.c, line 666, function ospf_write
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: Backtrace for 8 stack frames:
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [bt 0] /usr/lib/libfrr.so.0(zlog_backtrace+0x3a) [0x7fef3efe9f8a]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [bt 1] /usr/lib/libfrr.so.0(_zlog_assert_failed+0x61) [0x7fef3efea501]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [bt 2] /usr/lib/frr/ospfd(+0x2f15e) [0x562e0c91815e]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [bt 3] /usr/lib/libfrr.so.0(thread_call+0x60) [0x7fef3f00d430]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [bt 4] /usr/lib/libfrr.so.0(frr_run+0xd8) [0x7fef3efe7938]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [bt 5] /usr/lib/frr/ospfd(main+0x153) [0x562e0c901753]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [bt 6] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fef3d83db45]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [bt 7] /usr/lib/frr/ospfd(+0x190be) [0x562e0c9020be]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: Current thread function ospf_write, scheduled from file ospfd/ospf_packet.c, line 881
Oct 19 18:04:00 VYOS-R1 zebra[1771]: [EC 4043309116] Client ‘ospf’ encountered an error and is shutting down.
Oct 19 18:04:00 VYOS-R1 zebra[1771]: client 41 disconnected. 0 ospf routes removed from the rib
We had an assert(node) in ospf_write, which means that the list was empty. So I just
searched until I saw a code path that allowed multiple writes to the ospf_write function.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Wed, 17 Oct 2018 19:27:12 +0000 (15:27 -0400)]
*: Replace hash_cmp function return value to a bool
The ->hash_cmp and linked list ->cmp functions were sometimes
being used interchangeably and this really is not a good
thing. So let's modify the hash_cmp function pointer to return
a boolean and convert everything to use the new syntax.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>