This is not part of the make check tests and it has been broken
for a while, apparently. The way the label manager is coded makes
it very hard to code unit tests, and testing the relay of requests
to an external label manager is probably better done through
a topotest, so remove this.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
when requesting a specific label chunk (e.g. for the SRGB),
it might happen that we cannot get what we want. In this
event, we must be prepared to receive a response with no
label chunk. Without this fix, if the remote label manager
was not able to alloate the chunk we requested, we would
hang indefinitely trying to read data from the stream which
was not there.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
For SRGB, we need to support chunk requests starting at a
specific point in the label space, rather than just asking
for any sufficiently large chunk. To this purpose, we extend
the label manager api to request a chunk with a base value;
if the base is set to 0, the label manager will behave as it
currently does, i.e. fetching the first free chunk big enough
to satisfy the request.
update all the existing calls to get chunks from the label
manager so that they use MPLS_LABEL_BASE_ANY as the base
for the requested chunk
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
David Lamparter [Fri, 10 May 2019 13:31:04 +0000 (15:31 +0200)]
bgpd: add instance delete & config write hooks
Both of these hooks are necessary for proper operation of extensions
that need to latch on to a particular instance.
- without the delete hook, it's impossible to get rid of stale
references, leading to crashes with invalid instance pointers.
- the config-write hook is necessary because per-instance config needs
to be written inside the "router bgp" block to have the appropriate
context; adding a separate config node can't do that.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
David Lamparter [Wed, 24 Apr 2019 18:14:19 +0000 (20:14 +0200)]
bgpd: fix last_reset_cause setup
last_reset_cause_size is the length *used* in last_reset_cause[]. It's
straight up used wrong here; we're saving off a reset cause and need to
check against the *available* size in last_reset_cause[].
This could actually have led to (hopefully rare) crashes in the assert
there, since the assert condition might fail incorrectly.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Donald Sharp [Tue, 2 Jul 2019 23:06:22 +0000 (19:06 -0400)]
ospfd: Do not modify list when using _RO loop
The shutdown of ospf was causing crashes because the shutdown
was calling a ALL_LIST_ELEMENTS_RO macro and modifying the
underlying data structures. Switch to using ALL_LIST_ELEMENTS.
ospfd: Cleanup ospf->redist and ospf->external on shutdown
Effectively my original testing for this only had one external
route and as such we would not have a crash here. It only
showed up after multiple externals have been introduced.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Mark Stapp [Tue, 2 Jul 2019 18:05:18 +0000 (14:05 -0400)]
lib: add process pids to 'show modules'
Add the process pids to the output produced by 'show modules'.
At least in a development setting, where there may be multiple
instances of frr running, it can be handy to be able to id
the exact pids, for debugging e.g.
bgp update messages were not correctly calculating the size
for a labeled-unicast prefix, as they were not accounting
for the label. If the update message was large enough to
overflow the maximum packet size (4096 bytes) this could
cause bgpd to send a malformed update packet.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Chirag Shah [Fri, 28 Jun 2019 17:42:08 +0000 (10:42 -0700)]
zebra: evpn entries are not cleaned upon frr stop
As part of PR 4458, when a client (bgpd) goes down,
zebra cleans up any evpn state including remotely learned
neighs, macs and vteps are suppose to be cleaned up,
uninstall from kernel tables.
Neighs (arps), macs and vteps (HREP entries) were not
removed from kernel tables as the uninstall flag as not set.
Clean up l3vni associated remote neighs, macs and vteps.
Validated in evpn symmetric routing topology where
remotely learned l2/l3 vnis neigh, macs and remote
vtep (hrep) entries are installed in kernel table,
perform systemctl stop frr.service and validated
all remotely learned entries cleaned up from kernel
tables.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Biswajit Sadhu [Mon, 24 Jun 2019 07:50:33 +0000 (00:50 -0700)]
bgpd: 'show bgp ipv6 neighbors <X::Y> prefix-counts' prefix-count is
not getting displayed.
Neighbour prefix-count is not getting displayed with IPV6 neighbours
and displays the o/p “ % No such neighbor or address family ”.
However, I observed it is working fine for IPV4 neighbour.
ripd: fix problems with the "ip rip authentication" command
libyang-0.16-r3 doesn't allow the creation of data nodes if their
corresponding "when" statement (if any) resolves to false. This
change broke the "ip rip authentication" command.
This commit adapts this command so that it's not affected by the
new libyang stricter validation checks.
libyang-0.16-r3 contains a commit[1] that changed the autodelete
behavior of subtrees when validating data. A few FRR commands were
affected by this change since they relied on the old autodelete
behavior.
To fix these commands, use the LYD_OPT_WHENAUTODEL flag when
validating data to restore the old autodelete behavior (which adds
a lot of convenience for us).
Stephen Worley [Mon, 24 Jun 2019 18:04:13 +0000 (14:04 -0400)]
lib: Private api for nexthop_group manipulation
Add a file that exposes functions which modify nexthop groups.
Nexthop groups are techincally immutable but there are a
few special cases where we need direct access to add/remove
nexthops after the group has been made. This file provides a
way to expose those functions in a way that makes it clear
this is a private/hidden api.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Mon, 24 Jun 2019 15:37:49 +0000 (11:37 -0400)]
lib: Add a nexthop_dup() that allocates and copies
Add a nexthop_dup() api that both allocates and copies
a new nexthop from an old one. Still retain the old exposed
function nexthop_copy() so we can copy without allocation.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Donald Sharp [Thu, 20 Sep 2018 17:59:35 +0000 (13:59 -0400)]
lib: Add a couple functions to nexthop_group.c
Add nexthop_group_copy and nexthop_group_add_sorted functions.
nexthop_group_copy -> Copy src nexthop_group into dst nexthop_group
nexthop_group_add_sorted -> Adds a new nexthop to the nexthop group
in a sorted manner.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Wed, 26 Jun 2019 00:14:25 +0000 (20:14 -0400)]
vtysh: Try to be perscriptive about pam failures
When using pam for authentication, the code just silently
fails and gives no indication to the end user what has gone
wrong. Try to increase messaging about what has gone wrong
by outputting some more data.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 25 Jun 2019 21:10:12 +0000 (17:10 -0400)]
zebra: import table entries are not showing up in the right table
When we are importing/removing the table entry from table X into the
default routing table we are not properly setting the table_id
of the route entry. This is causing the route to be pushed
into the wrong internal table and to not be found for deletion.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 25 Jun 2019 21:07:30 +0000 (17:07 -0400)]
zebra: Push VRF_DEFAULT outside of import table code
The import table code assumes that they will only work
in the default vrf. This is ok, but we should push the
vrf_id and zvrf to be passed in instead of just using
VRF_DEFAULT.
This will allow us to fix a couple of things:
1) A bug in import where we are not creating the
route entry with the appropriate table so the imported
entry is showing up in the wrong spot.
2) In the future allow `ip import-table X` to become
vrf aware very easily.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 25 Jun 2019 20:35:40 +0000 (16:35 -0400)]
zebra: Use correct parameter order for table lookup up.
The import-table code when looking up the table to use
for route-import was reversing the order of the table_id
and vrf_id causing us to never ever lookup a table
and we would cause the `ip|ipv6 import-table X` commands
to be just ignored.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Stephen Worley [Tue, 25 Jun 2019 14:49:43 +0000 (10:49 -0400)]
doc: Add a performance note for log filtering
Add a not under the log-filter command to indicate
that, while filtering reduces load on the system by
large margins, there may still be a small performance
hit from filtering those.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Donald Sharp [Tue, 25 Jun 2019 04:30:11 +0000 (00:30 -0400)]
pimd: Dissallow query to be received from a non-connected source
When we receive an igmp query on a interface, ensure that the
source address of the packet is connected to the incoming
interface. This will prevent a meanie from crafting a igmp
packet with a source address less than ours and causing
us to suspend query activities.
Fixes: #1692 Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Mon, 24 Jun 2019 13:50:55 +0000 (09:50 -0400)]
zebra: Fix rnh old -vs- new comparison
1) If we are moving the nexthop we are tracking to
a new rn in the rib, then we know that the route
to get to that nexthop has changed. As such
we should notify the upper level.
This manifested itself because the code had a trigraph `?`
in the wrong order. Put the comparison in the right order.
2) If we are re-matching to the same rn and we call compare_state
then we need to see if our stored nexthops are the same or different.
If they are the same we should not notify. If they are different
we should notify. compare_state was only comparing the flags
on a route and since those are not necessarily the right flags
to look at( and we are well after the fact that the route has
already changed and been processed ) let's just compare
the nexthops to see if they are the same or different.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
rgirada [Thu, 13 Jun 2019 17:21:37 +0000 (10:21 -0700)]
pimd: Added cli to generate igmp query.
Fix details :
Added a utility cli to generate a igmp query on an interface.
This won't impact the existing query generation based on the
general query interval.
when receiving an EAGAIN while trying to read the header
of a ZAPI message, we were erroneously continuing as if
everything was fine, which could crash zebra. Fix this
by returning and letting the re-armed read task deal with
this
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
under some conditions, the callback to get a label for
a LU bgp path could be called after the path had already
been freed. In this case we would be reading garbage
and potentially crash. Lock the path info before
queueing the callback, and unlock as the first step
of the callback, exiting gracefully if the path info
is now NULL.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
pimd: remove pim and igmp OIFs when ifchannel_delete happens
In a pim-evpn setup (say TORC11<=>TORC12) an mroute can have a mix of
PIM and IGMP joins. The vxlan termination device ipmr-lo is IGMP
joined on termination mroutes and the peerlink-rif can be pim joined
on the same mroute if the MLAG peer (TORC11) loses all its uplinks to
underlay -
root@TORC12:~# net show pim state 239.1.1.101|grep pimreg
1 * 239.1.1.101 uplink-1
pimreg(I ), ipmr-lo( J ), peerlink-3.4094( J )
root@TORC12:~#
When the uplinks come back up on TORC11 it will prune the peerlink-rif
and join the RP (say spine) via the uplinks.
TORC12 is rxing the prune and removing the if_channel
(pim_ifchannel_delete). However it is not removing the OIF from
mfcc_ttl basically leaving behind a leaked OIF in the forwarding
entry. And this is because it is deriving the owner flag from the
parent upstream entry and incorrectly concluding that all OIFs are
IGMP joined.
Thix fix flushes out both PIM and IGMP ownership when the ifchannel is
deleted.
There is a second fix in the commit and that is to set the proto mask
correctly (to STAR) for inherited OIFs. (S,G) entries can inherit the
OIF from the (*, G) entry and this decision can change when the pim/igmp
ifchannel is removed. The earlier code was setting the proto-mask
incorrectly to PIM or IGMP.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com> Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
(cherry picked from commit d4d1d968dbbe61347393f7dace8b675496ff1024)
pimd: ensure that the oif is removed from all the mroutes pre-vifi deletion
When a link goes down the vifi was being deleted but the OIF stayed
in the OIL with a stale vifi -
oroot@act-7726-03:~# net show pim state
Codes: J -> Pim Join, I -> IGMP Report, S -> Source, * -> Inherited from (*,G), V -> VxLAN
Installed Source Group IIF OIL
1 * 239.1.1.111 swp1s1 pimreg(I ), ipmr-lo( J )
1 6.0.0.28 239.1.1.111 lo pimreg( J ), ipmr-lo( *), swp1s1( J )
root@act-7726-03:~# ip link set swp1s1 down
root@act-7726-03:~# net show pim state
Codes: J -> Pim Join, I -> IGMP Report, S -> Source, * -> Inherited from (*,G), V -> VxLAN
Installed Source Group IIF OIL
1 * 239.1.1.111 swp1s0 pimreg(I ), ipmr-lo( J )
1 6.0.0.28 239.1.1.111 lo ipmr-lo( *), swp1s0( J ), <oif?>( J ) >>>>>>>>
root@act-7726-03:~#
The problem was as a part ifchannel_delete the join state of the channel
was checked to avoid incorrect OIF deletion this was preventing the OIF
from being flushed. Fix is to flip the channel join-state to NOINFO before
deleting it.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com> Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Wed, 27 Feb 2019 20:08:29 +0000 (15:08 -0500)]
ospfd: Cleanup ospf->redist and ospf->external on shutdown
These two data types were written to handle redistribute
and external data types. On shutdown cleanup the memory
allocated to these if we are doing redistribution.
This was found using valgrind.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>