Philippe Guibert [Thu, 28 Oct 2021 16:28:42 +0000 (18:28 +0200)]
zebra: update dataplane flowspec address family in ipset_info
It is needed for the ipset entry to know for which address family
this ipset entry applies to. Actually, the family is in the original
ipset structure and was not passed as attribute in the dataplane
ipset_info structure. Add it.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Thu, 28 Oct 2021 11:42:57 +0000 (13:42 +0200)]
zebra: fix flowspec ipset operations
When injecting an ipset entry into the zebra dataplane context, the
ipset name is stored in a separate structure. This will permit the
flowspec plugin to be able to know which ipset has to be appended with
relevant ipset entry.
The problem was that the zebra dataplane objects related to ipset entries
is made up of an union between the ipset structure and the ipset info
structure. This was implying that the two structures were on the same
memory zone, and when extracting the data stored, the data were incomplete.
Fix this by replacing the union structure by a defined struct.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Wed, 27 Oct 2021 14:45:05 +0000 (16:45 +0200)]
ospf6d: avoid writing dumb ospf6 info at startup
in show: 'show ipv6 ospf6' handler command, the reason of SPF
executation is looked up and displayed. At startup, SPF has been
started, but shows no specific reason. Instead of dumping non
initialised string context, reset the string context.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Igor Ryzhov [Tue, 2 Nov 2021 21:29:19 +0000 (00:29 +0300)]
lib: fix crash when terminating inactive VRFs
If the VRF is not enabled, if_terminate deletes the VRF after the last
interface is removed from it. Therefore daemons crash on the subsequent
call to vrf_delete. We should call vrf_delete only for enabled VRFs.
Igor Ryzhov [Tue, 2 Nov 2021 20:54:43 +0000 (23:54 +0300)]
zebra: fix stale pointer when netns is deleted
When the netns is deleted, we should always clear the vrf->ns_ctxt
pointer. Currently, it is not cleared when there are interfaces in the
netns at the time of deletion.
If the netns is re-created, zebra crashes because it tries to use the
stale pointer.
Donald Sharp [Mon, 1 Nov 2021 19:08:50 +0000 (15:08 -0400)]
tests: All_protocol_startup sporadic failure
the test_nexthop_groups function is failing occassionally
because the test executes 4 in succession sharp install
routes commands. When I dumped the rib on a failed test
run there were only 2 of the 4 routes in the rib and
the two that were in were the last 2 installed.
The sharp daemon setups a event process where it
installs routes `automatically`. If the previous
run is not finished entering a new command to install
the routes will mess up the last one from ever happening.
It is assumed that the user doesn't do stupid stuff here.
In this case I am just adding a small sleep between each
installation to just let the test proceed.
Donald Sharp [Fri, 29 Oct 2021 17:03:42 +0000 (13:03 -0400)]
lib: Return Null when we have an empty string for script name
The script entries were being stored in a hash lookup with
the script name a pre-defined array of characters. The hash
lookup is succeeding since it is auto-installed at script
start time irrelevant if there is a handler function.
Modify the code so that if the scriptname is an empty
string "\0" just return a NULL so that zebra does
not attempt to actually load up the script
Donald Sharp [Mon, 1 Nov 2021 00:08:29 +0000 (20:08 -0400)]
tests: isis_topo1 needs to wait for results under load
the isis_topo1 test has two functions where immediately
after the test ensures that the routes are in isis
tests to see if they are in the rib. Under system
load I am seeing this test failing because the
routes are still queued. Modify the zebra check
for the isis routes to look for the proper results
for 10 seconds.
Tested with GoBGP (Helper):
```
long-lived-graceful-restart: advertised and received
Local:
ipv4-unicast, restart time 100000 sec
Remote:
ipv4-unicast, restart time 10 sec, forward flag set
```
Donald Sharp [Fri, 29 Oct 2021 14:21:28 +0000 (10:21 -0400)]
tests: Fix zebra_seg6_route to not always reinstall the same route
This code has two issues:
a) The loop to test for successful installation re-installs
the route every time it loops. A system under load will
have issues ensuring the route is installed and repeated
attempts does not help
b) The nexthop group installation was always failing
but never noticed (because of the previous commit)
and the test was always passing, when it should
have never passed.
Donald Sharp [Fri, 29 Oct 2021 12:47:05 +0000 (08:47 -0400)]
tests: zebra_seg6local has a race condition
The test is checking installing of seg6 routes by this
loop:
for up to 5 times:
sharp install seg6 route
show ip route and is it installed
The problem is that if the system is under heavy
load the installation may not have happened yet
and by immediately reinstalling the same route
the same thing could happen again.
Modify the code to pull the route installation
outside of the loop and to increase to 10 attempts
in case there is very heavy system load.
Olivier Dugeon [Mon, 25 Oct 2021 09:52:19 +0000 (11:52 +0200)]
lib: Fix comparison function in link_state.c
ls_node_same, ls_attributes_same and ls_prefix_same are not producing expected
result due to a wrong usage of memcmp. In addition, if respective structures
are not initialized with 0, there is a risk that the comparison failed.
This patch correct usage of memcmp and expand comparison to each invidual
parameters of the respective structure for safer result.
Donald Sharp [Thu, 28 Oct 2021 19:51:46 +0000 (15:51 -0400)]
tests: Fix `check_ping` function in test_bgp_srv6l3vpn_to_bgp_vrf.py
The check_ping function `_check` function was asserting and being
passed to the topotests.run_and_expect() functionality causing
it to not run the full range of pings if one failed the test.
So effectively it was properly detecting pass / failure but
only allowing for 1 iteration if it was going to fail.
Modify the code to not assert and act like all the other
run_and_expect functionality.
Igor Ryzhov [Thu, 14 Oct 2021 16:15:14 +0000 (19:15 +0300)]
lib: make if_lookup_by_index_all_vrf internal
This function doesn't work correctly with netns VRF backend as the same
index may be used in multiple netns simultaneously. So let's hide it
from the public API to reduce temptation to use it instead of writing
the correct code.
Donald Sharp [Thu, 28 Oct 2021 12:10:28 +0000 (08:10 -0400)]
zebra: Fix netlink RTM_NEWNEXTHOP parsing for nested attributes
With the addition of resillient hashing for nexthops, the
parsing of nexthops requires telling the decoder functions
that there may be nested attributes. This was found by
code inspection of iproute2/ipnexthop.c when trying to
understand resillient hashing as well as statistics
gathering for nexthops that are / will be in upstream
kernels in the near future.
The default vrf name is obtained by zebra daemon. While isis is not
connected to zebra, i.e. at startup, when loading a startup configuration,
the macro VRF_DEFAULT_NAME is used and returns 'default'.
But because zebra connected and forces to a new default vrf name, the
configuration is not seen as the default one, and further attempts to
configure the isis instance via 'router isis 1' will trigger creation
of an other instance.
To handle this situation, at vrf_enable() event, which is called for
each default vrf name change, the associated isis instance is updated
with th new vrf name. The same is done for NB yang path.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Sai Gomathi [Tue, 26 Oct 2021 14:03:58 +0000 (07:03 -0700)]
pimd: modification in mroute
Problem:
-------
(*,G) created on transit node where same groups are defined as SSM
At present FRR has SSM checks for IGMP report, but SSM check is missing for PIM join.
Fix:
----
Whenever there is a modification in prefix list for SSM range, then we need to browse the ifchannels (PIM joins)
and groups coming in SSM range with (and *,G) should be removed from ifchannel database and also withdraw those routes
from upstream routers.
Signed-off-by: Sai Gomathi <nsaigomathi@vmware.com>
Sai Gomathi [Tue, 26 Oct 2021 12:55:23 +0000 (05:55 -0700)]
pimd: modifications in PIM joins
Problem :
=======
(*,G) created on transit node where same groups are defined as SSM
At present FRR has SSM checks for IGMP report, but SSM check is missing for PIM join.
Fix :
===
When an interface receives the PIM (*,G)join with G as SSM group, then PIMd supposed to discard that join.
There is no need to maintain PIM state for this group.
Signed-off-by: Sai Gomathi <nsaigomathi@vmware.com>
Donatas Abraitis [Tue, 26 Oct 2021 07:37:22 +0000 (10:37 +0300)]
bgpd: Unify multiprotocol capability for OPEN logging
Before:
```
192.168.10.17 OPEN has MultiProtocol Extensions capability (1), length 4
192.168.10.17 OPEN has MP_EXT CAP for afi/safi: IPv4/unicast
```
After:
```
192.168.10.17 OPEN has MultiProtocol Extensions capability (1), length 4
192.168.10.17 OPEN has MultiProtocol Extensions capability for afi/safi: IPv4/unicast
```
Hiroki Shirokura [Mon, 25 Oct 2021 23:36:14 +0000 (23:36 +0000)]
lib: fix srv6 route hardcode with BGP
zclient_send_localsid is called by various routing protocol daemons. To set the
srv6 endpoint function. Fix a hard-coded error in the initial implementation.
Before this PR, the srv6 function will be registered to zebra as a BGP route
even if isisd executes zclient_send_localsid.
when gre information could not be retrieved because GRE interface has
been deleted, a GRE_UPDATE message may be sent to NHRP. In that case,
the gre values are reset. There was a missing tunnel destination value,
which has been omitted.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
So the 2.2.2.2 routes on r1,3 and 4 are received via ospf, but are
modified by some other process to add labels ( probably ldp, since
it is running too ). The 2nd ping to 2.2.2.2 is failing because
the 2.2.2.2 route on r4 is being replaced. As an example here
is `ip monitor all` on r4 during boot up. Please note timestamps
are not necessarily representative of what we will see on the
loaded ci system.
[2021-10-15T15:46:52.261456] [NEXTHOP]id 27 via 10.0.2.2 dev r4-eth0 scope link proto zebra
[2021-10-15T15:46:52.261490] [ROUTE]2.2.2.2 nhid 27 via 10.0.2.2 dev r4-eth0 proto ospf metric 20
<snip>
[2021-10-15T15:46:53.556405] [NEXTHOP]Deleted id 27 via 10.0.2.2 dev r4-eth0 scope link proto zebra
<snip>
[2021-10-15T15:46:53.566575] [NEXTHOP]id 32 via 10.0.2.2 dev r4-eth0 scope link proto zebra
[2021-10-15T15:46:53.566585] [ROUTE]2.2.2.2 nhid 32 via 10.0.2.2 dev r4-eth0 proto ospf metric 20
For a small amount of time the route was *gone*. I believe the upstream
CI system hits that window sometimes, causing the test to fail.
This patch attempts to ensure that the 2.2.2.2 route should be learned
appropriately ( thus slowing it down ) before the test moves onto
the ping. I suspect the long term answer might be to add a test to
the scripts/adjancies.py script to ensure that the test does not
continue until the appropriate label is in place, but I want to
make the test run a bit more perscriptive in what it is looking
for here.
Donald Sharp [Mon, 25 Oct 2021 13:03:48 +0000 (09:03 -0400)]
doc: Remove reference to non-existent command
The `neighbor <peer> version <X>` command does not
exist. I am unable to find it going back to version
2.0 of FRR. So this command has been not in the system
for a very long time.
In any event bgp already supports version 4 of bgp and
it auto-negotiates this.
Igor Ryzhov [Wed, 13 Oct 2021 15:30:06 +0000 (18:30 +0300)]
*: fix interface config write in NB-converted daemons
When writing the config from the NB-converted daemon, we must not rely
on the operational data. This commit changes the output of the interface
configuration to use only config data. As the code is the same for all
daemons, move it to the lib and remove all the duplicated code.