Donald Sharp [Fri, 11 Jan 2019 18:31:46 +0000 (13:31 -0500)]
zebra: Move the master thread handler to the zrouter structure
The master thread handler is really part of the zrouter structure.
So let's move it over to that. Eventually zserv.h will only be
used for zapi messages.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
David Lamparter [Wed, 30 Jan 2019 17:11:54 +0000 (18:11 +0100)]
build: fix a whole bunch of *FLAGS
- some target_CFLAGS that needed to include AM_CFLAGS didn't do so
- libyang/sysrepo/sqlite3/confd CFLAGS + LIBS weren't used at all
- consistently use $(FOO_CFLAGS) instead of @FOO_CFLAGS@
- 2 dependencies were missing for clippy
Signed-off-by: David Lamparter <equinox@diac24.net>
Donald Sharp [Wed, 30 Jan 2019 14:31:32 +0000 (09:31 -0500)]
zebra: On route update context is sometimes indeterminate in post-processing
When we get into rib_process_result and the operation we are handling
is DPLANE_OP_ROUTE_UPDATE *and* the route entry being looked at
is a route replace, we currently have no way to decode to the old_re
and the re due to how we have stored context. As such they are the
same pointer.
As such the route replace for the same route type is causing the re
to set the installed flag and then immediately unset the installed
flag, leaving us in a state where the kernel has the route but
the rib thinks we are not installed.
Since the true old_re( the one being replaced by the update operation )
is going away( as that it zebra deletes the old one for us already )
this fix is not optimal but will get us moving forward.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Wed, 30 Jan 2019 02:35:07 +0000 (21:35 -0500)]
zebra: Convert route entry id number to string in debugs
The route entry being displayed in debugs was displaying
the originating route type as a number. While numbers
are cool, I for one am not terribly interested in
memorizing them. Modify the (type %d) to a (%s) to
just list the string type of the route.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Philippe Guibert [Fri, 30 Nov 2018 13:13:37 +0000 (14:13 +0100)]
bgpd: change priority of fs pbr rules
two kind of rules are being set from bgp flowspec: ipset based rules,
and ip rule rules. default route rules may have a lower priority than
the other rules ( that do not support default rules). so, if an ipset
rule without fwmark is being requested, then priority is arbitrarily set
to 1. the other case, priority is set to 0.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Thu, 29 Nov 2018 13:35:41 +0000 (14:35 +0100)]
bgpd: notify callback when ip rule from/to rule has been configured
because ip rule creation is used to not only handle traffic marked by
fwmark; but also for conveying traffic with from/to rules, a check of
the creation must be done in the linked list of ip rules.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Thu, 29 Nov 2018 14:17:36 +0000 (15:17 +0100)]
bgpd: conversion from fs to pbr: support for ip rule from/to
adding/suppressing flowspec to pbr is supported. the add and the remove
code is being added. now,bgp supports the hash list of ip rule list.
The removal of bgp ip rule is done via search. The search uses the
action field. the reason is that when a pbr rule is added, to replace an
old one, the old one is kept until the new one is installed, so as to
avoid traffic to be cut. This is why at one moment, one can have two
same iprules with different actions. And this is why the algorithm
covers this case.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Thu, 29 Nov 2018 14:14:41 +0000 (15:14 +0100)]
bgpd: ip rule zebra layer adapted to handle both cases
now, ip rule can be created from two differnt ways; however a single
zebra API has been defined. so make it consistent by adding a parameter
to the bgp zebra layer. the function will handle the rest.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Thu, 29 Nov 2018 14:04:52 +0000 (15:04 +0100)]
bgpd: upon bgp fs study, determine if iprule can be used
instead of using ipset based mechanism to forward packets, there are
cases where it is possible to use ip rule based mechanisms (without
ipset). Here, this applies to simple fs rules with only 'from any' or
'to any'.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Mon, 28 Jan 2019 16:54:50 +0000 (17:54 +0100)]
bgpd: detach vrf labels allocated, when removing bgp instance
bgp instance is disabling the label allocated to reach vrf entity.
previously, only vrf disabling was removing the label. now, when bgp
leaves, bgp instance also frees the label used.
PR=62306 Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com> Acked-by: Julien Floret <julien.floret@6wind.com>
Donald Sharp [Mon, 28 Jan 2019 21:14:03 +0000 (16:14 -0500)]
zebra: Use the kernel flags from the IFA_FLAGS if it is available
The ifa_flags value in the netlink message was originally a uint8_t
value. The linux kernel quickly ran out of 8 bits of data to
pass and the IFA_FLAGS value was added to the netlink message to allow
more than 8 bits of data to be passed. So replace the ifa_flags
with the IFA_FLAGS value if it exists in the interface netlink
message.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Nitin Soni [Thu, 24 Jan 2019 08:44:42 +0000 (00:44 -0800)]
ospfd: ospfd core if hello packet exceeds link MTU
Ospfd cored because of an assert when we try to write more than the MTU
size to the ospf packet buffer stream. The problem is - we allocate only MTU
sized buffer. The expectation is that Hello packets are never large
enough to approach MTU. Instead of crashing, this fix discards hello and
logs an error. One should not have so many neighbors behind an
interface.
Ticket: CM-22380 Signed-off-by: Nitin Soni <nsoni@cumulusnetworks.com> Reviewed-by: CCR-8204
Donald Sharp [Sun, 27 Jan 2019 01:44:42 +0000 (20:44 -0500)]
*: The onlink attribute should be owned by the nexthop not the route.
The onlink attribute was being passed from upper level protocols
as an attribute of the route *not* the individual nexthop. When
we pass this data to the kernel, we treat the onlink as a attribute
of the nexthop. This commit modifies the code base to allow
us to pass the ONLINK attribute as an attribute of the nexthop.
This commit also fixes static routes that have multiple nexthops
some onlink and some not.
ip route 4.5.6.7/32 192.168.41.1 eveth1 onlink
ip route 4.5.6.7/32 192.168.42.2
S>* 4.5.6.7/32 [1/0] via 192.168.41.1, eveth1 onlink, 00:03:04
* via 192.168.42.2, eveth2, 00:03:04
sharpd@robot ~/frr2> sudo ip netns exec EVA ip route show
4.5.6.7 proto 196 metric 20
nexthop via 192.168.41.1 dev eveth1 weight 1 onlink
nexthop via 192.168.42.2 dev eveth2 weight 1
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Wed, 28 Nov 2018 23:46:36 +0000 (18:46 -0500)]
bgpd: interface based peers should automatically override it's peer group
When a interface based peer is setup and if it is part of a peer
group we should ignore this and just use the PEER_FLAG_CAPABILITY_ENHE
no matter what.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Wed, 28 Nov 2018 23:21:08 +0000 (18:21 -0500)]
bgpd: Fix crash in various 'show bgp neighbor json' commands
bgp would crash with various `show bgp neighbor json` commands
based upon whether or not it did a pretty print of the output
or not. This is because we were freeing the data 2 times.
Cleanup so that we free the json data 1 time.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Mon, 14 Jan 2019 21:37:53 +0000 (16:37 -0500)]
zebra: Use ROUTE_ENTRY_INSTALLED as decision for route is installed
zebra is using NEXTHOP_FLAG_FIB as the basis of whether or not
a route_entry is installed. This is problematic in that we plan
to separate out nexthop handling from route installation. So modify
the code to keep track of whether or not a route_entry is installed/failed.
This basically means that every place we set/unset NEXTHOP_FLAG_FIB, we
actually also set/unset ROUTE_ENTRY_INSTALLED on the route_entry.
Additionally where we check for route installed via NEXTHOP_FLAG_FIB
switch over to checking if the route think's it is installed.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 15 Jan 2019 12:21:22 +0000 (07:21 -0500)]
zebra: Fix use before initialized
When we discover that a command given to the route add/delete
function in rt_socket.c is bogus, print out a debug message
but don't attempt to actually use a nexthop that we have not
figured out yet as part of the data.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 15 Jan 2019 12:26:00 +0000 (07:26 -0500)]
zebra: Having one goto in a function to just return is silly
Just return right there, goto's are useful if you have common
code that needs to be cleaned up before exiting this function,
of which this function has none and there is only one goto.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Chirag Shah [Thu, 24 Jan 2019 20:19:53 +0000 (12:19 -0800)]
zebra: fix dup addr detect remote macip add case
A MACIP is detected as duplicate and after that
the host continue to move behind different VTEPs results
in local VTEP receiving remote mobility events.
In remote_macip_add, ensure to trigger dad if
MAC is marked as duplicate. In case of freeze
action enabled, is_dup_detect will be set to
avoids installing frozen MAC into kernel.
Ticket:CM-23649
Testing Done:
Configured detection action freeze with detection count
as 7 at DUT and >7 at remote VTEP,
trigger MAC-IP mobility between VTEPs.
once tdetection count reached, MAC detected as duplicate,
post detection move the host to remote. The local VTEP
receives remote macip add and entry is not installed into
kernel with fix.
root@VTEP1:~# net show evpn mac vni 1002 mac aa:aa:aa:aa:aa:aa
MAC: aa:aa:aa:aa:aa:aa
Remote VTEP: 27.0.0.16
Local Seq: 7 Remote Seq: 8
Duplicate, detected at Fri Jan 25 05:03:29 2019
Neighbors:
11.11.11.11 Inactive
Kernel entry still points to LOCAL
root@VTEP1:~# bridge fdb show | grep aa:aa:aa
aa:aa:aa:aa:aa:aa dev hostbond3 vlan 1002 master VxLanA-1
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
In a VRR/VRRP setup we can have connected routes with different costs.
So this change eliminates suppressing metric display for connected routes.
Sample output -
root@TORC11:~# vtysh -c "show ipv6 route vrf vrf1"
Codes: K - kernel route, C - connected, S - static, R - RIPng,
O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
v - VNC, V - VNC-Direct, A - Babel, D - SHARP, F - PBR,
> - selected route, * - FIB route
VRF vrf1:
K * ::/0 [255/8192] unreachable (ICMP unreachable), 00:00:36
C * 2001:aa:1::/64 [0/100] is directly connected, vlan1002-v0, 00:00:36
C>* 2001:aa:1::/64 [0/90] is directly connected, vlan1002, 00:00:36
zebra: set connected route metric based on the devaddr metric
MACVLAN devices are typically used for applications such as VRR/VRRP that
require a second MAC address (virtual). These devices have a corresponding
SVI/VLAN device -
root@TORC11:~# ip addr show vlan1002
39: vlan1002@bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9152 qdisc noqueue master vrf1 state UP group default
link/ether 00:02:00:00:00:2e brd ff:ff:ff:ff:ff:ff
inet6 2001:aa:1::2/64 scope global
valid_lft forever preferred_lft forever
root@TORC11:~# ip addr show vlan1002-v0
40: vlan1002-v0@vlan1002: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9152 qdisc noqueue master vrf1 state UP group default
link/ether 00:00:5e:00:01:01 brd ff:ff:ff:ff:ff:ff
inet6 2001:aa:1::a/64 metric 1024 scope global
valid_lft forever preferred_lft forever
root@TORC11:~#
The macvlan device is used primarily for RX (VR-IP/VR-MAC). And TX is via
the SVI. To acheive that functionality the macvlan network's metric
is set to a higher value.
Zebra currently ignores the devaddr metric sent by the kernel and hardcodes
it to 0. This commit eliminates that hardcoding. If the devaddr metric
is available (METRIC_MAX) it is used for setting up the connected route
otherwise we fallback to the dev/interface metric.
Setting the macvlan metric to a higher value ensures that zebra will always
select the connected route on the SVI (and subsequently use it for next hop
resolution etc.) -
root@TORC11:~# vtysh -c "show ip route vrf vrf1 2001:aa:1::/64"
Routing entry for 2001:aa:1::/64
Known via "connected", distance 0, metric 1024, vrf vrf1
Last update 11:30:56 ago
* directly connected, vlan1002-v0
Routing entry for 2001:aa:1::/64
Known via "connected", distance 0, metric 0, vrf vrf1, best
Last update 11:30:56 ago
* directly connected, vlan1002
bgpd: reinstate current bgp best route on an inactive neigh del
When an inactive-neigh delete is rxed bgp will not have a local path to
remove (and re-run path selection). Instead it simply re-installs the
current best remote path if any.
When a local neigh is added with a MAC that is remote or absent the
neigh is kept in zebra as local/in-active. But not propagated to bgpd.
Similarly when an inactive neigh is deleted the del-msg is not propagated
to bgpd.
Without this change bgp and zebra would fall out of sync as that
bgp would not know to rerun bestpath and for it to reinstall a
known remote path for the mac-ip in question. To fix this we
now propagate inactive neigh deletes to bgpd.
Ticket: CM-23018
Testing Done:
1. evpn-min
2. manually triggered the out-of-sync state and verified the fix
bgpd: fill the zebra mac-ip route via a common api
Move the info filling for zebra mac-ip install (sent by bgpd) to a
common place.
The commit also fixes missing ROUTER flag for one of the cases
added in a code branch that doesn't have the ROUTER changes -
[ 6d8c603a
bgpd: use IP address as tie breaker if the MM seq number is the same
]
Donald Sharp [Fri, 25 Jan 2019 16:57:09 +0000 (11:57 -0500)]
pimd: Prevent crash from using pim static mroutes
If you have an interface being added to a static mroute
and that interface has been configured w/ pim but does
not have a valid ip address yet, we do not create a
VIF for that device yet. As such when we attempt
to assign the vif array in the pim static data structure
we attempt to write into -1 of that array.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
David Lamparter [Fri, 30 Nov 2018 20:42:25 +0000 (21:42 +0100)]
build, lib/yang: bake in extensions if possible
Starting with libyang 0.16.74, we can load internally embedded yang
extensions instead of going through the file system/dlopen. Detect
support for this at build time and use if available.
NB: the fallback mechanism will go away in a short while.
Signed-off-by: David Lamparter <equinox@diac24.net>
Donald Sharp [Tue, 22 Jan 2019 12:39:14 +0000 (07:39 -0500)]
zebra: Add code to track sequence number from zebra_router
The sequence number used should be unique and increase by 1
for netlink commands. This will allow the code to match
up batched commands to actual requests, so that we can signal
the failure correctly back.
So start the movement and tracking of sequence numbers as
an atomic uint32_t in zebra_router. Modify the dataplane
code to start tracking contexts from this value.
In future commits we will move more of the sequencing
data into using this value.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Nitin Soni [Thu, 24 Jan 2019 09:43:48 +0000 (01:43 -0800)]
bgpd: route-map fails to filter type-5 routes
Route-map filtering is based on the value of
"bgp->adv_cmd_rmap[afi][safi].map". For example, we advertise routes in
bgp_evpn_advertise_type5_routes() based on the value of
"bgp->adv_cmd_rmap[afi][safi].map". This variable gets populated in vty
handler bgp_evpn_advertise_type5. This variable will not get populated
if we have not yet applied the route-map configuration. The fix is to
correctly populate "bgp->adv_cmd_rmap[afi][safi].map" in
bgp_route_map_process_update() if it has not been populated before.
Ticket: CM-23263 Signed-off-by: Nitin Soni <nsoni@cumulusnetworks.com> Reviewed-by: CCR-8163
Ruben Kerkhof [Tue, 22 Jan 2019 17:32:08 +0000 (18:32 +0100)]
Redirect output of ranlib to /dev/null
./configure on Mac OS logs:
checking whether ranlib supports D option... error: /Library/Developer/CommandLineTools/usr/bin/ranlib: unknown option character `D' in: -D
Usage: /Library/Developer/CommandLineTools/usr/bin/ranlib [-sactfqLT] [-] archive [...]
no
Ruben Kerkhof [Tue, 22 Jan 2019 17:29:29 +0000 (18:29 +0100)]
Redirect output of $AR check to /dev/null
./configure logs this on Mac OS:
checking whether ar supports D option... /Library/Developer/CommandLineTools/usr/bin/ar: illegal option -- D
usage: ar -d [-TLsv] archive file ...
ar -m [-TLsv] archive file ...
ar -m [-abiTLsv] position archive file ...
ar -p [-TLsv] archive [file ...]
ar -q [-cTLsv] archive file ...
ar -r [-cuTLsv] archive file ...
ar -r [-abciuTLsv] position archive file ...
ar -t [-TLsv] archive [file ...]
ar -x [-ouTLsv] archive [file ...]
no
This is quite noisy and we're only interested in the result of the
check, not the output.
Rafael Zalamena [Wed, 23 Jan 2019 12:25:30 +0000 (10:25 -0200)]
ospf6d: keep track of the socket set thread
When using the timer to set the socket multicast options, keep track
of the thread pointer. If we lose the thread reference we might have
situations where multicast is enabled when it should be disabled and
vice versa.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Don Slice [Thu, 27 Dec 2018 16:03:38 +0000 (11:03 -0500)]
bgpd: improve peer-group remote-as definitions
Problem reported that with certain sequences of defining the
remote-as on the peer-group and the members, the configuration would
become wrong, with configured remote-as settings not reflected in
the config but peers unable to come up. This fix resolves these
inconsistencies.
Ticket: CM-19560 Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Donald Sharp [Wed, 23 Jan 2019 01:00:55 +0000 (20:00 -0500)]
topotests: Modify bgp convergence to be more than 120 seconds
Waiting 10 seconds for bgp convergence makes no sense, especially
if the test system is under load and a node is started up before
the node it is connecting to is up. We should wait for the full
default of 120 seconds, plus a little time to ensure nothing is
screwed up too much.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>