nguggarigoud [Tue, 21 Jun 2022 07:42:45 +0000 (00:42 -0700)]
tests: Removing invalid step from ospf tests.
1. Removed the step from hello test case with hello
timer of 65535. This test works in some platforms
and does not work in others, affecting stability.
Donald Sharp [Fri, 17 Jun 2022 17:43:30 +0000 (13:43 -0400)]
bgpd: Display useful values when using json for missing neighbor state
When a peer has not established connection yet, these values:
`hostLocal`, `portLocal`, `hostForeign`, `portForeign` might
not have any values and json output will not display anything
for them. Modify the code to display some nominal values in
this situation so that parsers are not surprised.
Christian Hopps [Fri, 17 Jun 2022 06:04:51 +0000 (02:04 -0400)]
lib: cleanup red-herring memleaks in parent of daemonizing fork
- The parent of the daemonizing fork reports memleaks for the early
northbound allocations (libyang). If these were real memleaks these
would show up in the child as well; however, ignoring all memleaks in
the parent of the fork is too hard a sale. Instead, spend some CPU
cycles cleaning up the allocations in the parent after the fork and
immeidatley prior to exiting the parent after the daemonizing fork.
Quentin Young [Mon, 6 Dec 2021 04:06:16 +0000 (23:06 -0500)]
tools: print daemon start cmd, vtysh_b cmd
When starting a daemon, print the full command run by the init script to
start it. This gives more information and is especially helpful when
debugging wrap commands.
Also add some more logs to vtysh_b to print the command used there,
log when we exit early because frr.conf doesn't exist, and simplify the
code path for creating the command to use.
Quentin Young [Mon, 6 Dec 2021 04:03:52 +0000 (23:03 -0500)]
tools: improve explanation of watchfrr_options
The explanation block for watchfrr_options was split into two blocks,
one explaining the --netns option and one making a vague statement that
the init script provides the list of daemons to start. The former can be
merged with the latter and the latter is more useful when stated as a
caveat for what you should actually use watchfrr_options for.
Quentin Young [Mon, 6 Dec 2021 04:03:11 +0000 (23:03 -0500)]
tools: improve explanation of 'wrap' options
We support 'wrap' variables in /etc/frr/daemons, but the explanation
given there doesn't touch on some of the subtleties of using these
variables.
The variables were designed for use with Valgrind, which has special
behavior when run with programs that daemonize; Valgrind will intercept
the fork()'d child process and run itself instead of the child. This
behavior allows it to follow the same forking semantics as the target
program.
For virtually every other wrapper, the wrap variables do not work as
demonstrated because the wrapper programs do not daemonize. If the
wrappers do not daemonize, they will block the init script. The examples
given with "perf" for example simply do not work, because perf remains
in the foreground even as it tracks forked children.
This patch adds an explanation of the behavior expected by the init
script and offers a solution for getting that behavior.
Renato Westphal [Wed, 15 Jun 2022 14:39:48 +0000 (11:39 -0300)]
tests: fix ldp_vpls_topo1 to work as expected
In the last step of this test, r1's link to r2 is shut down but
both routers stay connected through a multi-hop LDP session. That
happens because r1 and r2 have a targeted adjacency created by
the pseudowire. The test then checks whether the pseudowire is
still up, using an alternate path for nexthop resolution.
Everything's fine except for the fact that LDP GTSM (aka
ttl-security) is enabled by default. This means that messages sent
over a multi-hop session are not delivered. In the case of this
test, it can prevent PW-Status notifications from being delivered,
which in turn can prevent the pseudowire from coming back up.
Fix the test by disabling GTSM so that LDP multi-hop sessions can
work normally. This is in accordance with RFC6720 which mentions
that GTSM should be disabled (statically or dynamically) for
multi-hop sessions.
Donald Sharp [Wed, 15 Jun 2022 12:27:32 +0000 (08:27 -0400)]
zebra: On linux let interface data come in through netlink messaging
Consolidate on linux to using the netlink api for gathering all data
about a interface. Leave this interface alone in the meantime for
other OS's.
This also has the side effect of reducing the amount of work
being done on linux in that FRR was handling shut/no shut
events 2 times. Once for the ioctl question asked and
once for the netlink message received.
Donald Sharp [Wed, 15 Jun 2022 11:31:53 +0000 (07:31 -0400)]
zebra: Attempt to make ioctl.c have a bit more useful log messges
While examining the code, it was noticed that there was a chance
to improve the log output in some cases to give a fuller understanding
of what went wrong where.
Donald Sharp [Wed, 15 Jun 2022 14:32:53 +0000 (10:32 -0400)]
bgpd, ospfd: Remove extra newline for `show debugging`
This extra newline was adding a weird output to `show debugging`
display where there would be extra newlines sometims and not
others. Make it consistent.
When pim_upstream_inherited_olist_decide calls the api
pim_channel_add_oif, it can pass PIM_OIF_FLAG_PROTO_GM,
PIM_OIF_FLAG_PROTO_PIM and/or PIM_OIF_FLAG_PROTO_STAR.
Now a consider a case where PIM flag was already set
but STAR flag was not set and this api tries to set
both STAR + PIM and passes the same. The api pim_channel_add_oif
returns since it sees that PIM is already set without
setting the STAR flag.
So basically this will lead to issues in scenarios where for the
same OIF multiple flags(IGMP, PIM, STAR) needs to be set.
Donald Sharp [Tue, 14 Jun 2022 16:21:19 +0000 (12:21 -0400)]
tests: Fix verify_rib such that it will look at the selected route
When you have a static route with multiple different admin
distances there exists a chance that route will have been
installed multiple times due to system load when inserted
at about the same time. If this is the case then the
verify_rib function can and will select the wrong route
that happens to have a nexthop group that is still installed.
Modify verify_rib to ensure that the route that is going to
be looked at for nexthop correctness is the actual installed
route, not a previous version of it.
Renato Westphal [Mon, 13 Jun 2022 18:49:45 +0000 (15:49 -0300)]
tests: fix sporadic failures in the ldp_topo1 topotest
The sporadic failures were happening because, under heavy load,
the r4 router could form an OSPF adjacency with r3 a few seconds
before doing the same with r2. In that interim, LDP could establish
a neighborship with r2 going through r3 (instead of connecting
directly). That would cause all label mappings received from r3
to be ignored since they can't be mapped to the routes' nexthops
received from zebra, causing all sorts of test failures. None of
this is erroneous behavior as LDP simply follows the IGP.
The fix consists of updating the test to ensure all expected OSPF
adjacencies fully converged before proceeding to the LDP checks.
Donald Sharp [Tue, 14 Jun 2022 13:50:54 +0000 (09:50 -0400)]
pimd: Cleanup rpf lookup debug to help us figure out what is going on
The rpf lookup debug was not taking into account the fact that a prefix-list
might be applied and also we might need to make a choice between the two.
So let's give ourselves a bit more data.
For show ip pim interface traffic cli, doing the below changes
1. Changing DEFUN to DEFPY
2. Move the whole code to a common api and modify it so that can
be reused for pimv6.
lib: Update sysrepo code with the latest API changes
* sr_event_notif_send -> sr_notif_send
* sr_process_events -> sr_subscription_process_events
* sr_oper_get_items_subscribe -> sr_oper_get_subscribe
* Removed SR_SUBSCR_CTX_REUSE flag from the code at all
Donald Sharp [Thu, 9 Jun 2022 14:29:04 +0000 (10:29 -0400)]
pimd: Show interface traffic even if interface is currently `down`
the `show ip pim interface [x] traffic` command was deciding
to skip display of interfaces if they happened to be down at
that moment. This of course does not make a bunch of sense
to limit the output for a interface that may have sent data
in the past.
This fixes this test crash:
rnode = <lib.topogen.TopoRouter object at 0x7fc755be3880>, dut = 'c1', input_dict = {'c1': {'c1-l1-eth2': ['helloTx', 'helloRx']}}, output_dict = {'c1': {}}