lib: fix handling of deleted nodes in the sysrepo plugin
Make the sysrepo plugin ignore the deletion of configuration
nodes that don't exist anymore instead of logging an error and
rejecting the changes. This is necessary because Sysrepo delivers
delete notifications for all nodes of a deleted data tree instead
of delivering a single delete notification of the top-level subtree
node (which would suffice for the northbound layer).
From Sysrepo's documentation:
"Note: do not use fork() after creating a connection. Sysrepo
internally stores PID of every created connection and this way a
mismatch of PID and connection is created".
Introduce a new "frr_very_late_init" hook in libfrr that is only
called after the daemon is forked (when the '-d' option is used)
and after the configuration is read. This way we can initialize
the sysrepo plugin correctly even when the daemon is daemonized,
and after the Sysrepo CLI commands are processed (only "debug
northbound client sysrepo" for now).
Currently, when the is-type of an area is changed and its circuits resign,
we are not resetting the DIS flag. Consequently, if the area type is reverted
we are not running the DR election and not regenerating the pseudonode LSP.
Also adding event debug logs for circuit commence/resign.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Don Slice [Thu, 10 Sep 2020 12:40:28 +0000 (12:40 +0000)]
bgpd: correct community-list replace logic
Problem rerported that if you enter an existing community list
sequence number with new community information, the entire community
list would be deleted. This commit fixes the replace logic to do
the right thing.
Ticket: CM-30555 Signed-off-by: Don Slice <dslice@nvidia.com>
Donald Sharp [Fri, 11 Sep 2020 17:05:55 +0000 (13:05 -0400)]
pbrd: Ensure rule is installed on interface up
If we are experiencing an interface that is bouncing
very fast and the last operation that we experienced
was a ifdown we will send rule deletions associated
with that interface. If we have not received notification
that hte rule was removed *but* we immiedately get another
ifup notification when we go to install the rule we
are deciding that it's not ready to send down again,
as that we still think it is installed.
Force the rule installation when we have a interface up
event.
Ticket: CM-31042 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Donald Sharp [Thu, 10 Sep 2020 15:31:39 +0000 (11:31 -0400)]
bgpd, lib, pbrd, zebra: Pass by ifname
When installing rules pass by the interface name across
zapi.
This is being changed because we have a situation where
if you quickly create/destroy ephermeal interfaces under
linux the upper level protocol may be trying to add
a rule for a interface that does not quite exist
at the moment. Since ip rules actually want the
interface name ( to handle just this sort of situation )
convert over to passing the interface name and storing
it and using it in zebra.
Ticket: CM-31042 Signed-off-by: Stephen Worley <sworley@nvidia.com> Signed-off-by: Donald Sharp <sharpd@nvidia.com>
lib: fix crashes with leafrefs that point to non-implemented modules
Whenever libyang loads a module that contains a leafref, it will
also implicitly load the module of the referring node if it's
not loaded already. That makes sense as otherwise it wouldn't be
possible to validate the leafref value correctly.
The problem is that loading a module implicitly violates the
assumption of the northbound layer that all loaded modules
are implemented (i.e. they have a northbound node associated
to each schema node). This means that loading a module that
isn't implemented can lead to crashes as the "priv" pointer
of schema nodes is no longer guaranteed to be valid. To fix this
problem, add a few null checks to ignore data nodes associated
to non-implemented modules.
The side effect of this change is harmless. If a daemon receives
configuration it doesn't support (e.g. BFD peers on staticd),
that configuration will be stored but otherwise ignored. This can
only happen when using a northbound client like gRPC, as the CLI
will never send to a daemon a command it doesn't support. This
minor problem should go away in the long run as FRR migrates to
a centralized management model, at which point the YANG-modeled
configuration of all daemons will be maintained in a single place.
Finally, update some daemons to stop implementing YANG modules
they don't need to (i.e. revert 1b741a01c and a74b47f5).
Donald Sharp [Fri, 11 Sep 2020 12:27:28 +0000 (08:27 -0400)]
pimd: Warn when we try to build MAXVIFS > 256
We use the pim mroute socket for kernel notifications of events.
Currently this is limited to 8 bits of data. There are patches
coming down the pike in kernel land to allow this to expand.
Rather than fix this and all the other places we assume MAXVIFS < 256
in the pim code right now. Leave a land mine for the developer
doing this work to point them in the right direction.
staticd: fix display of the "nexthop-vrf" parameter of static routes
When the static route VRF and its nexthop VRF are inactive in the
kernel, both VRFs will have the same ID (VRF_UNKNOWN) even though
they might not be the same. This can cause "sh run" to not display
the "nexthop-vrf" parameter correctly when necessary. Change the
code to compare VRFs by their names to fix this problem.
staticd: remove checks that are no longer necessary
All call sites of static_route_leak() are passing a non-null pointer
to the 'vty' parameter, hence remove the 'vty' null checks that
are no longer necessary.
Don Slice [Tue, 22 Sep 2020 13:14:52 +0000 (06:14 -0700)]
bgpd: allow derived router-id update if previously 0x0
Problem found that if a router-id was not defined or derived
initially, the bgp->router_id would be set to 0x0 and used
for determining auto-rd values. When bgp received a subsequent
router-id update from zebra, bgp would not completely process
the update since it was treated as updating an already derived
router-id with a new value, which is not desired. This also
could leave the auto rd/rt inforamation missing or invalid in
some cases. This fix allows updating the derived router-id if
the previous value was 0/0.
Ticket: CM-31441 Signed-off-by: Don Slice <dslice@nvidia.com>
bgpd: Use bgp instance's default keepalive interval if < (holdtime/3)
bgp->default_keepalive was not considered when setting
peer->v_keepalive, causing the effective keepalive interval to
always be (holdtime/3), even when default_keepalive < (holdtime/3).
This ensures that the default_keepalive is used when it's set and
is < (holdtime/3).
Donald Sharp [Fri, 25 Sep 2020 13:45:24 +0000 (09:45 -0400)]
bgpd: Allow bgp static routes to use /32's
If you are including a network statement of a /32
then the current bgp martian checks will match the /32
together.
Problem:
!
router bgp 3235
neighbor 192.168.161.2 remote-as external
neighbor 192.168.161.131 remote-as external
!
address-family ipv4 unicast
network 10.10.3.11/32
network 192.168.161.0/24
no neighbor 192.168.161.2 activate
neighbor 192.168.161.2 route-map BLUE in
exit-address-family
!
eva# show bgp ipv4 uni
BGP table version is 1, local router ID is 10.10.3.11, vrf id 0
Default local pref 100, local AS 3235
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
10.10.3.11/32 0.0.0.0(eva) 0 32768 i
*> 192.168.161.0/24 0.0.0.0(eva) 0 32768 i
Displayed 2 routes and 2 total paths
eva# show bgp import-check-table
Current BGP import check cache:
192.168.161.0 valid [IGP metric 0], #paths 1
if enp39s0
Last update: Fri Sep 25 08:00:42 2020
10.10.3.11 valid [IGP metric 0], #paths 1
if lo
Last update: Fri Sep 25 08:00:42 2020
eva# show bgp ipv4 uni summ
BGP router identifier 10.10.3.11, local AS number 3235 vrf-id 0
BGP table version 1
RIB entries 3, using 576 bytes of memory
Peers 1, using 21 KiB of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt
janelle(192.168.161.131) 4 60000 69 70 0 0 0 00:03:21 0 1
Total number of neighbors 1
When we are deciding that a nexthop is valid there is not much point in checking
that a static route has a martian nexthop or not, since we self derived it already.
would end up leaving the 4.4.4.4/32 address on the interface under
freebsd.
This all boils down to the fact that the interface is not
considered connected yet we have a destination. If the
destination is the same and we are not connected ignore
it on freebsd.
I am sure there are other fun scenarios that someone
will have to squirrel out.
Pat Ruddy [Wed, 10 Jun 2020 13:39:37 +0000 (14:39 +0100)]
bgp: remove duplicate command installs
[no_]neighbor_nexthop_self_cmd & [no_]neighbor_nexthop_self_force_cmd
have duplicate install_element actions on the EVPN_NODE. This causes
duplicate command log errors which are caught by topotests. Remove
these.
topotests: add bgp_evpn_rt5 test with vrf netns backend
this test checks connectivity between a vrf-lite device and a vrf-netns
device. this ensures that evpn serice is importing appropriate evpn rt5
entries in the correct vrf.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
zebra: dynamically detect vxlan link interfaces in other netns
this is used when parsing the newly network namespaces. actually, to
track the link of some interfaces like vxlan interfaces, both link index
and link nsid are necessary. if a vxlan interface is moved to a new
netns, the link information is in the default network namespace, then
LINK_NSID is the value of the netns by default in the new netns. That
value of the default netns in the new netns is not known, because the
system does not automatically assign an NSID of default network
namespace in the new netns. Now a new NSID of default netns, seen from
that new netns, is created. This permits to store at netns creation the
default netns relative value for further usage.
Because the default netns value is set from the new netns perspective,
it is not needed anymore to use the NETNSA_TARGET_NSID attribute only
available in recent kernels.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Fri, 20 Dec 2019 16:51:37 +0000 (17:51 +0100)]
lib, zebra: reuse and adapt ns_list walk functionality
the walk routine is used by vxlan service to identify some contexts in
each specific network namespace, when vrf netns backend is used. that
walk mechanism is extended with some additional paramters to the walk
routine.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Fri, 25 Oct 2019 12:25:00 +0000 (14:25 +0200)]
zebra: when parsing local entry against dad, retrieve config
when duplicate address detection is observed, some incrementation,
some timing mechanisms need to be done. For that the main evpn
configuration is retrieved. Until now, the VRF that was storing the dad
config parameters was the same VRF that hosted the VXLAN interface. With
netns backend, this is not true, as the VXLAN interface is in the
same VRF as the bridge interface. The modification takes same definition
as in BGP, that is to say that there is a single bgp evpn instance, and
this is that instance that will give the correct config settings.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Fri, 11 Oct 2019 12:11:13 +0000 (14:11 +0200)]
bgpd: evpn nexthop can be changed by default
There can be cases where evpn traffic is not meshed across various
endpoints, but sent to a central pe. For this situation, add the
configuration knobs to force nexthop attribute. Upon that change,
nexthop unchanged attribute is automatically disabled.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
zebra: zvni_map_to_vlan() adaptation for all namespaces
this change is needed when a MAC/IP entry is learned by zebra, and the
entry happens to be in a different namespace. So that the entry be
active, the correct vni match has to be found.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
this information is necessary for local information, because the
interface associated to the mac address is stored with its ifindex, and
the ifindex may not be enough to get to the right interface when it
comes with multiple network namespaces.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
zebra: bridge layer2 information records ns_id where bridge is
when working with vrf netns backend, two bridges interfaces may have the
same bridge interface index, but not the same namespace. because in vrf
netns backend mode, a bridge slave always belong to the same network
namespace, then a check with the namespace id and the ns id of the
bridge interface permits to resolve correctly the interface pointer.
The problem could occur if a same index of two bridge interfaces can be
found on two different namespaces.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
zebra, lib: new API to get absolute netns val from relative netns val
when receiving a netlink API for an interface in a namespace, this
interface may come with LINK_NSID value, which means that the interface
has its link in an other namespace. Unfortunately, the link_nsid value
is self to that namespace, and there is a need to know what is its
associated nsid value from the default namespace point of view.
The information collected previously on each namespace, can then be
compared with that value to check if the link belongs to the default
namespace or not.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
zebra, lib: store relative default ns id in each namespace
to be able to retrieve the network namespace identifier for each
namespace, the ns id is stored in each ns context. For default
namespace, the netns id is the same as that value.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
zebra, lib: add an internal API to get relative default nsid in other ns
as remind, the netns identifiers are local to a namespace. that is to
say that for instance, a vrf <vrfx> will have a netns id value in one
netns, and have an other netns id value in one other netns.
There is a need for zebra daemon to collect some cross information, like
the LINK_NETNSID information from interfaces having link layer in an
other network namespace. For that, it is needed to have a global
overview instead of a relative overview per namespace.
The first brick of this change is an API that sticks to netlink API,
that uses NETNSA_TARGET_NSID. from a given vrf vrfX, and a new vrf
created vrfY, the API returns the value of nsID from vrfX, inside the
new vrf vrfY.
The brick also gets the ns id value of default namespace in each other
namespace. An additional value in ns.h is offered, that permits to
retrieve the default namespace context.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
zebra: map vxlan interface to bridge interface with correct ns id
an incoming bridge index has been found, that is linked with vxlan
interface, and the search for that bridge interface is done. In
vrf-lite, the search is done across the same default namespace, because
bridge and vxlan may not be in the same vrf. But this behaviour is wrong
when using vrf netns backend, as the bridge and the vxlan have to be in
the same vrf ( hence in the same network namespace). To comply with
that, use the netnamespace of the vxlan interface. Like that, the
appropriate nsid is passed as parameter, and consequently, the search is
correct, and the mac address passed to BGP will be ok too.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Thu, 26 Sep 2019 16:49:59 +0000 (18:49 +0200)]
zebra: importation of bgp evpn rt5 from vni with other netns
With vrf-lite mechanisms, it is possible to create layer 3 vnis by
creating a bridge interface in default vr, by creating a vxlan interface
that is attached to that bridge interface, then by moving the vxlan
interface to the wished vrf.
With vrf-netns mechanism, it is slightly different since bridged
interfaces can not be separated in different network namespaces. To make
it work, the setup consists in :
- creating a vxlan interface on default vrf.
- move the vxlan interface to the wished vrf ( with an other netns)
- create a bridge interface in the wished vrf
- attach the vxlan interface to that bridged interface
from that point, if BGP is enabled to advertise vnis in default vrf,
then vxlan interfaces are discovered appropriately in other vrfs,
provided that the link interface still resides in the vrf where l2vpn is
advertised.
to import ipv4 entries from a separate vrf, into the l2vpn, the
configuration of vni in the dedicated vrf + the advertisement of ipv4
entries in bgp vrf will import the entries in the bgp l2vpn.
the modification consists in parsing the vxlan interfaces in all network
namespaces, where the link resides in the same network namespace as the
bgp core instance where bgp l2vpn is enabled.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Donald Sharp [Fri, 11 Sep 2020 12:51:05 +0000 (08:51 -0400)]
nhrpd: add frr-vrf to the list of implemented yang modules
PR #6376 introduced a VRF leafref in the frr-interface YANG module.
That change exposed a bug in the northbound layer that is causing
nhrpd to crash under certain circumstances. Even though nhrpd wasn't
converted to the new northbound model yet, make it implement the
frr-vrf module in order to work around this problem. This is a
temporary fix until a better solution is available.
pbrd: add frr-vrf to the list of implemented yang modules
PR #6376 introduced a VRF leafref in the frr-interface YANG module.
That change exposed a bug in the northbound layer that is causing
pbrd to crash under certain circumstances. Even though pbrd wasn't
converted to the new northbound model yet, make it implement the
frr-vrf module in order to work around this problem. This is a
temporary fix until a better solution is available.
Don slice [Wed, 5 Aug 2020 19:08:17 +0000 (19:08 +0000)]
bgpd: add global config for update-delay
Enhancement to update-delay configuration to allow setting globally
rather than per-instance. Setting the update-delay is allowed either
per-vrf or globally, but not both at the same time.
Ticket: CM-31096 Signed-off-by: Don Slice <dslice@nvidia.com>
This would be useful in cases with lots of peers and shutdown them
automatically if RTT goes above the specified limit.
A host with 512 or more IPv6 addresses has a higher latency due to
ipv6_addr_label(). This method tries to pick the best candidate address
fo outgoing connection and literally increases processing latency.
lynne [Wed, 12 Aug 2020 23:15:24 +0000 (19:15 -0400)]
ldpd: Fix issue when starting up LDP with no configuration.
LDP would mark all routes as learned on a non-ldp interface. Then
when LDP was configured the labels were not updated correctly. This
commit fixes issues 6841 and 6842.