Quentin Young [Thu, 26 Apr 2018 04:06:15 +0000 (00:06 -0400)]
zebra: fix write task collision
Only one I/O task can be scheduled per file descriptor. Having two
separate tasks for buffer filling and buffer flushing was breaking that
invariant and causing messages to never be written.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Quentin Young [Wed, 25 Apr 2018 22:45:21 +0000 (18:45 -0400)]
zebra: some more i/o optimizations
* Separate flush task from write task, so we can continue adding to the
write buffer while it's waiting to flush
* Handle write errors sooner rather than later
* Only schedule a process job if we have packets to process
* Tweak zserv_process_messages to not reschedule itself and rely on
zserv_read() to do so in all proper cases
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Quentin Young [Tue, 24 Apr 2018 21:03:19 +0000 (17:03 -0400)]
zebra: optimize zserv_process_messages
* Simplify zapi_msg <-> zserv interaction
* Remove header validity checks, as they're already performed before the
packet ever makes it here
* Perform the same kind of batch processing done in zserv_write by
copying multiple inbound packets under lock instead of doing serial
locking
* Perform self-scheduling under the same lock
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Quentin Young [Tue, 24 Apr 2018 18:51:26 +0000 (14:51 -0400)]
zebra: optimize zserv_write
Dequeue all pending messages when writing and push them all into the
write buffer. This removes the necessity to self-schedule, avoiding a
mutex lock, and should also maximize throughput by not writing 1 packet
per job.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Quentin Young [Tue, 24 Apr 2018 15:36:25 +0000 (11:36 -0400)]
zserv: optimize zserv_read
* Increase the maximum number of packets to read per read job
* Store read packets in a local cached buffer to avoid mutex overhead
* Only update last-read time / last-command if we actually read a packet
* Add missing log line for corrupt header case
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Quentin Young [Mon, 23 Apr 2018 20:58:14 +0000 (16:58 -0400)]
zebra: reorganize zserv.c by pthread affinity
Since it is already quite difficult to understand the various pieces
going on here, I reorganized the file to make it much cleaner and easier
to understand. The organization is now:
Quentin Young [Mon, 23 Apr 2018 20:18:46 +0000 (16:18 -0400)]
zebra: fix session stats data race, memory leak
* Time counters need to use atomic access between threads
* After a client disconnects, we properly kill the thread but need to
free its frr_pthread as well
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Quentin Young [Sat, 21 Apr 2018 07:55:44 +0000 (03:55 -0400)]
zebra: fix some memory errors, scheduling bugs
* Add doc comments explaining hairy bits of thread lifecycle
* Remove t_suicide as it no longer makes sense
* Remove client double-free
* Remove unnecessary THREAD_OFF being used in incorrect pthread context
* Eliminate unnecessary racey access to client's obuf_fifo
* Ensure zserv_process_messages() reschedules itself if it has not
finished its work
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Donald Sharp [Fri, 25 May 2018 18:45:16 +0000 (14:45 -0400)]
zebra: Add a breadcrumb for when we ignore a route
When we receive a route that we think we own and we
are not in startup conditions, then add a small debug
to help debug the issue when this happens, instead
of silently just ignoring the route.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Fri, 25 May 2018 18:36:12 +0000 (14:36 -0400)]
tools, zebra: Use different protocol value for our statics
The re-use of RTPROT_STATIC has caused too many collisions
where other legitimate route sources are causing us to
believe we are the originator of the route. Modify
the code so that if another protocol inserts RTPROT_STATIC
we will assume it's a Kernel Route.
Fixes: #2293 Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We added our own copy of if_link.h (among others). This
file unconditionally defines IFLA_WIRELESS, so we don't need
the conditional defines in the if_netlink.c code...
Issue: https://github.com/FRRouting/frr/issues/2299 Signed-off-by: Arthur Jones <arthur.jones@riverbed.com>
Quentin Young [Thu, 24 May 2018 18:43:57 +0000 (18:43 +0000)]
lib: add proper doc comments for hash & linklist
* Remove references to ospf source files from linklist.[ch]
* Remove documentation comments from hash.c and linklist.c
* Add comprehensive documentation comments to linklist.h and hash.h
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Philippe Guibert [Wed, 23 May 2018 12:14:53 +0000 (14:14 +0200)]
bgpd: do not install BGP FS entries, while table range not obtained
Sometimes at startup, BGP Flowspec may be allocated a routing table
identifier not in the range of the predefined table range.
This issue is due to the fact that BGP peering goes up, while the BGP
did not yet retrieve the Table Range allocator.
The fix is done so that BGP PBR entries are not installed while
routing table identifier range is not obtained. Once the routing table
identifier is obtained, parse the FS entries and check that all selected
entries are installed, and if not, install it.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Wed, 23 May 2018 10:10:00 +0000 (12:10 +0200)]
bgpd: increase buffer size to store ecomunity as a string
On the case where an ecom from FS redirect is received, the ecom may be
with the format A.B.C.D:E. On this case, the printable format of the
Flowspec redirect VRF ecom value may use more bytes in the buffer
dedicated for that. The buffer that stores the ecommunity is increased.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Fri, 18 May 2018 15:11:50 +0000 (17:11 +0200)]
pbrd: add ZAPI_RULE_FAIL_REMOVE flag in switch
The notification handler consecutive to an add/remove of a rule in zebra
is being added the FAIL_REMOVE flag. It is mapped on REMOVE flag
behaviour for now.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Mon, 21 May 2018 14:40:31 +0000 (16:40 +0200)]
bgpd: upon uninstalling pbr rule, update local structure
Currently, uninstall pbr rule is not handled by BGP notification
handler. So the uninstall update of the structure is done, immediately
after sending the request of uninstall to zebra.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Fri, 18 May 2018 14:22:23 +0000 (16:22 +0200)]
zebra: add pbr objects fail_remove value into notification
After PBR or BGP sends back a request for sending a rule/ipset/ipset
entry/iptable delete, there may be issue in deleting it. A notification
is sent back with a new value indicating that the removal failed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
zebra: PBR config and monitor IPSET/IPTABLE hooks declared
The following PBR handlers: ipset, and iptables will prioritary
call the hook from a possible plugin.
If a plugin is attached, then it will return a positive value.
That is why the return status is tested against 0 value, since that
means that there are no plugin module plugged
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
bgpd: traffic rate value is ignored for searching bpa
There are cases where a redirect IP or redirect VRF stops the ecom
parsing, then ignores a subsequent rate value, letting passed value to
0. Consequently, a new table identifier may be elected, despite the
routing procedure is the same. This fix ignores the rate value in bpa
list.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Wed, 25 Apr 2018 16:29:35 +0000 (18:29 +0200)]
bgpd: add vty command to restrict FS policy routing to a defined interface
policy routing is configurable via address-family ipv4 flowspec
subfamily node. This is then possible to restrict flowspec operation
through the BGP instance, to a single or some interfaces, but not all.
Two commands available:
[no] local-install [IFNAME]
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Fri, 30 Mar 2018 11:01:39 +0000 (13:01 +0200)]
bgpd: add 3 fields to ipset_entry : src,dst port, and proto
Those 3 fields are read and written between zebra and bgpd.
This permits extending the ipset_entry structure.
Combinatories will be possible:
- filtering with one of the src/dst port.
- filtering with one of the range src/ range dst port
usage of src or dst is exclusive in a FS entry.
- filtering a port or a port range based on either src or dst port.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Fri, 18 May 2018 14:14:46 +0000 (16:14 +0200)]
bgpd: do not account twice references to rule context
When rule add transaction is sent from bgpd to zebra, the reference
context must not be incremented while the confirmation message of
install has not been sent back; unless if the transaction failed to be
sent.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Thu, 17 May 2018 07:30:28 +0000 (09:30 +0200)]
bgpd: add missing ecommunity flowspec to display
On some cases, the ecommunity flowspec for redirect vrf is not displayed
in all cases. On top of that, display the values if ecom can no be
decoded.
Also, sub_type and type are changed from int to u_int8_t, because the
values contains match the type and sub type of extended communities.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Wed, 25 Apr 2018 16:34:27 +0000 (18:34 +0200)]
zebra: handle iptable list of interfaces
Upon reception of an iptable_add or iptable_del, a list of interface
indexes may be passed in the zapi interface. The list is converted in
interface name so that it is ready to be passed to be programmed to the
underlying system.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Fri, 30 Mar 2018 11:01:39 +0000 (13:01 +0200)]
zebra: add 3 fields to ipset_entry : src,dst port, and proto
Those 3 fields are read and written between zebra and bgpd.
This permits extending the ipset_entry structure.
Combinatories will be possible:
- filtering with one of the src/dst port.
- filtering with one of the range src/ range dst port
usage of src or dst is exclusive in a FS entry.
- filtering a port or a port range based on either src or dst port.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Mon, 23 Apr 2018 13:17:19 +0000 (15:17 +0200)]
zebra: pbr vty show command for ipset and iptables
Two new vty show functions available:
show pbr ipset <NAME>
show pbr iptables <NAME>
Those function dump the underlying "kernel" contexts. It relies on the
zebra pbr contexts. This helps then to know which zebra pbr
context has been configured since those contexts are mainly configured
by BGP Flowspec.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Don Slice [Thu, 24 May 2018 14:58:37 +0000 (10:58 -0400)]
bgpd: additional neighbor message improvement
Added improved error message text to other places that could also
encounter the same condition. In testing found that in certain
case, duplicate error messages were previously issued. This fix
also removes the duplicates.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Don Slice [Wed, 23 May 2018 12:09:59 +0000 (08:09 -0400)]
bgpd: improve error message for neighbor not found
Problem reported due to tab completion showing all possible peers
in every vrf, but when neighbor in wrong vrf entered "no such
neighbor" is the error message. Making it slightly more clear
with "no such neighbor in the view/vrf" to clue the user that they
may have specified the wrong vrf.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Donald Sharp [Thu, 24 May 2018 13:03:11 +0000 (09:03 -0400)]
zebra: Fix RULE notification netlink messages
Fix the code so that we would actually start receiving
RULE netlink notifications.
The Kernel expects the long long to be a bit field
value, while the newer netlink message types are
an enum. So we need to convert the message type
number to a bit position and set that value.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Wed, 23 May 2018 13:24:12 +0000 (09:24 -0400)]
zebra: Move where we check for non-kernel netlink messages
Move where we check for non-kernel netlink messages to
a slightly earlier spot. This will allow in subsuquent
commits the removal of an extra parameter that needs to
be passed around.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Wed, 23 May 2018 12:25:51 +0000 (08:25 -0400)]
zebra: Ignore most netlink notifications from ourselves
The BPF filter was an exclusion list of netlink messages
we did not want to receive from our self. The problem
with this is that the exclusion list was and will be
ever growing. So switch the test around to an inclusion
list since it is shorter and not growing. Right
now this is RTM_NEWADDR and RTM_DELADDR.
Change some of the debug messages to error messages
so that when something slips through and it is unexpected
during development we will see the problem.
Also try to improve the documentation about what
the filter is doing and leave some breadcrumbs for
future developers to know where to change code
when new functionality is added.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 22 May 2018 14:54:20 +0000 (10:54 -0400)]
bgpd: Ensure virt->vrfs is valid
Move the list_delete_and_null of the virt->vrfs code to
the actual deletion function to ensure proper lifecycle.
This assumption allows us to know that irt->vrfs is always
true so remove the NULL check on it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 22 May 2018 14:50:53 +0000 (10:50 -0400)]
bgpd: Free vni list on actual deletion
The irt->vnis list was being freed on going down,
but actually delete it from the deletion function. Then
we can know that the irt->vnis is a valid list anywhere
we have a irt pointer.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>