Igor Ryzhov [Fri, 23 Apr 2021 22:32:53 +0000 (01:32 +0300)]
tests: fix bfd-bgp-cbit-topo3 test
This test is completely incorrect on test_bfd_loss_intermediate step.
It shuts down the interface and then "waiting" for the BGP session to
fail. But instead of the actual wait it compares the output of "show bfd
peers" with the "up" state. As it does this comparison right after the
interface shutdown, the BFD session has not yet failed and the comparison
is always successful except very rare cases when the command takes a lot
of time to execute (due to the heavy load on CI system I suppose).
David Lamparter [Fri, 23 Apr 2021 13:17:07 +0000 (15:17 +0200)]
ldpd: set `frr_is_after_fork` in lde/ldpe
These subprocesses don't use frr_config_fork(), so frr_is_after_fork is
never set. While the frr_pthread stuff isn't currently used there, set
the flag anyway to avoid future headaches.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Donald Sharp [Thu, 22 Apr 2021 19:04:15 +0000 (15:04 -0400)]
isisd: Remove warnings and add some data to debugs for isis_csm.c
When running isis and not running isis on all interfaces results
in a bunch of warn messages to the log about circuit state
changes. These warn messages also didn't bother to inform
the end user what interface was causing the fun. Since
the end operator cannot do anything with these warn messages
and nor should they in the vast array of normal operations
modify the code to use event debugging and turn the warns
to debugs.
Additionally add some information to clue the operator
in on to what actual interface we are talking about.
the code processing an NHT update was only resetting the BGP_NEXTHOP_VALID
flag, so labeled nexthops were considered valid even if there was no
nexthop. Reset the flag in response to the update, and also make the
isvalid_nexthop functions a little more robust by checking the number
of nexthops.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Stephen Worley [Thu, 22 Apr 2021 21:21:12 +0000 (17:21 -0400)]
zebra: handle gracefulRS/retain with proto NHGs
Properly handle refcounting of Proto-owned NHGs when
zebra is operating under graceful restart and retain
conditions.
We have an extra refcnt of 1 we keep for proto-owned NHGs to
indicate the upper level proto has created and owns it.
When we are reading these in from the kernel, we need to set them
to 1 as appropriate. Without this, we fail in the assert() during
zebra_nhg_proto_add() after the owning daemons resends the NHG
and the refcnts are off by one.
Also add in the same logic we use for routes when sweeping with
respect to uptimes.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Donald Sharp [Thu, 22 Apr 2021 19:47:37 +0000 (15:47 -0400)]
tests: Remove kill_mininet_router_process
This function kills all processes that happen to have the same
name to frr processes and it was only ever used in the setup.
Setup should not be used to kill old runs. That should be a
separate process.
Igor Ryzhov [Thu, 22 Apr 2021 12:24:49 +0000 (15:24 +0300)]
lib: remove enabled flag for bfd sessions
Currently this flag is only helpful in an extremely rare situation when
the BFD session registration was unsuccessful and after that zebra is
restarted. Let's remove this flag to simplify the API. If we ever want
to solve the problem of unsuccessful registration/deregistration, this
can be done using internal flags, without API modification.
Also add the error log to help user understand why the BFD session is
not working.
David Lamparter [Thu, 22 Apr 2021 10:10:27 +0000 (12:10 +0200)]
lib: hard-fail creating threads before fork()
Creating any threads before we fork() into the background (if `-d` is
given) is an extremely dangerous footgun; the threads are created in
the parent and terminated when that exits.
This is extra dangerous because while testing, you'd often run the
daemon in foreground without `-d`, and everything works as expected.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
David Lamparter [Thu, 22 Apr 2021 11:18:19 +0000 (13:18 +0200)]
lib: add frr_config_pre hook
... for any initialization that needs to run after forking, but that
would be racy if it were just scheduled on the thread_master (since the
config load is also just a thread callback, ordering would be undefined
for another scheduled thread callback.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
David Lamparter [Wed, 21 Apr 2021 09:54:48 +0000 (11:54 +0200)]
build: properly split CFLAGS from AC_CFLAGS
`CFLAGS` is a "user variable", not intended to be controlled by
configure itself. Let's put all the "important" stuff in AC_CFLAGS and
only leave debug/optimization controls in CFLAGS.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Donald Sharp [Wed, 21 Apr 2021 12:57:29 +0000 (08:57 -0400)]
tools: Fix warning when running frr-reload.py
When I run frr-reload.py I am seeing this error:
Apr 21 06:23:51 eva frrinit.sh[3776992]: /usr/lib/frr/frr-reload.py:1094: SyntaxWarning: "is not" with a literal. Did you mean "!="?
Apr 21 06:23:51 eva frrinit.sh[3776992]: if line is not "exit-vrf":
ospf6d: Do not delete external table when configure max-path
Issue: When maximum-path is configured in ospf6 view, the
function ospf6_restart_spf deletes the external table as well
which is not required since that stores the redistribute routes.
Specifying watchfrr as CMD instead of ENTRYPOINT allows one to easily
override this command when starting a docker container. This allows
simple, manual testing via (e.g.) bash. With ENTRYPOINT only the
container will simply explode with an exit code if watchfrr exits.
For instance one could start a shell session in this container via:
```
docker run --name test --rm -i -t <frr-container> bash
```
The default behavior (`docker run <frr-container>` with no command
specified) is not changed.
Introduced the idea of the bgp unnumbered peers using interface up/down
events to track the bgp peers nexthop. This code was not properly
working when a connection was received from a peer in some circumstances.
Effectively the connection from a peer was immediately skipping state transitions
and FRR was never properly tracking the peers nexthop. When we receive the
connection attempt, let's track the nexthop now.
David Lamparter [Thu, 15 Apr 2021 04:26:45 +0000 (06:26 +0200)]
lib: disable ASAN redzone around xref_p/xref_array
The "xref_p" variables are placed in the "xref_array" section
specifically so they're next to each other and we get an array at the
end. The ASAN redzone that is inserted around global variables is
breaks that since it'd be inserted before and after each of the array
items. So disable the ASAN redzone for these variables (and only these
variables, nothing else should be affected.)
Signed-off-by: David Lamparter <equinox@diac24.net>