Donald Sharp [Mon, 8 Oct 2018 00:47:42 +0000 (20:47 -0400)]
bgpd: Add ability to dump the bgp peerhash
The bgp->peerhash is a secretive bit of data that we use
to quickly look up data about peers. Unfortunately,
since we had no way to look at it, we had no way
of knowing if it had gotten in or out of sync.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Wed, 3 Oct 2018 16:27:57 +0000 (12:27 -0400)]
lib: Include compiler.h as early as is possible in the build
The compiler.h header provides us with some useful macros
that we are using in the system. We do not know exactly
where the CPP_NOTICE and CPP_WARN macros are used but
they can move around. Place this header early in the
build then.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Quentin Young [Mon, 17 Sep 2018 19:18:47 +0000 (19:18 +0000)]
doc: clarify documentation on BGP multiple AS
Documentation on how to use multiple autonomous systems was inaccurate
and a bit scattered. Clarify usage of VRFs with multiple autonomous
systems, how to configure them, and their distinction from views. Also
moves a block on L3VPN VRFs out of the 'Basic Concepts' section.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The condition in the do/while is always false because 'return_nsid' cannot
reach the end of the loop with 'return_nsid' having a different value than
NS_UNKNOWN. Because of that, the condition can be replaced with 0 (false).
Also, the loop can be removed because the two assignments made at the end
of the loop before the condition check are not used (detected via Clang,
afterwards).
David Lamparter [Thu, 27 Sep 2018 02:18:48 +0000 (04:18 +0200)]
watchfrr, lib: cleanup & delay detaching
This cleans up watchfrr to be more "normal" like the other daemons in
terms of what it does in main(), i.e. using the full frr_*() call set.
Also, this changes the startup behaviour on watchfrr to stay attached on
the daemon's parent process until startup is really complete. This
should allow removing the "watchfrr.started" hack at some point.
Signed-off-by: David Lamparter <equinox@diac24.net>
Daniil Baturin [Mon, 1 Oct 2018 18:38:44 +0000 (20:38 +0200)]
tools: add a script for building a Debian package in one step.
The script simplifies the relatively lengthy procedure.
It should be invoked from the top level source directory, for example:
./tools/build-debian-package.sh
Donald Sharp [Tue, 11 Sep 2018 12:13:42 +0000 (08:13 -0400)]
bgpd: Try to notice when configuration changes during startup
During peer startup there exists the possibility that both
the local and remote peers try to start communication at the
same time. In addition it is possible for local configuration
to change at the same time this is going on. When this happens
try to notice that the remote peer may be in opensent or openconfirm
and if so we need to restart the connection from both sides.
Additionally try to write a bit of extra code in peer_xfer_conn
to notice when this happens and to emit an error message to
the end user about this happening so that it can be cleaned up.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Coverity points out a copy-paste error in the Red-Black tree implementation. The
RB tree code is based on the OpenBSD implementation, so at first glance that
is a strong reason to think twice before touching anything.
Details:
The code is an augmented RB tree implementation [1], which adds to RB trees
the possibility of using a callback on every node update for updating per-node
associated metainformation. The bug is clear once checking other places where
the callback is called.
Impact:
- FRR: no impact, because the "augmented" capability is not being used.
- OpenBSD [2]: it seems there is no impact, at least in the 'src' repository.
Additional observations:
- If the "augmented" capability is not used, the code could run faster (at
every operation on a node the callback is checked for not being NULL). Maybe
branch prediction is enough to make those extra operations
negligible on most processors in use.
Christian Franke [Fri, 28 Sep 2018 17:32:38 +0000 (19:32 +0200)]
doc: Use `mv -f` in Makefile
Sphinx always runs, even in the `make install` stage. When `make install`
is run as root and then another `make` is run by a nonprivileged user,
some versions of `mv` prompt like this:
Don Slice [Fri, 28 Sep 2018 15:55:39 +0000 (15:55 +0000)]
bgpd: solve issue entering aggregate twice
Problem reported that frr-reload.py was not installing an aggregate
properly. Problem was actually that frr-reload.py does the command
twice, and the second time the aggregate command was entered, it would
appear in the config but the aggregate was removed from the bgp table
and not advertised to peers. Solved by noticing when an aggregate
was marked for deletion (info_invalid) and allowing the re-entry if
the old one was being removed.
Ticket: CM-22509 Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
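
The re-entry rule described above can be sketched roughly as follows. This is
a toy model only: AggregateTable, its methods, and the pending_delete flag are
hypothetical names chosen for illustration and are not bgpd's real data
structures (the commit itself only mentions info_invalid).

```python
class AggregateTable:
    """Toy model of aggregate configuration; all names are hypothetical."""

    def __init__(self):
        # prefix -> {"pending_delete": bool}
        self.entries = {}

    def mark_for_deletion(self, prefix):
        # The old aggregate is being torn down but has not gone away yet.
        if prefix in self.entries:
            self.entries[prefix]["pending_delete"] = True

    def add(self, prefix):
        """Return True if the aggregate was (re)installed, False if a live one exists."""
        entry = self.entries.get(prefix)
        if entry is not None and not entry["pending_delete"]:
            return False  # live duplicate: nothing to do
        # Either brand new, or the old entry is on its way out: install fresh.
        self.entries[prefix] = {"pending_delete": False}
        return True

    def is_active(self, prefix):
        entry = self.entries.get(prefix)
        return entry is not None and not entry["pending_delete"]


table = AggregateTable()
table.add("10.0.0.0/8")
table.mark_for_deletion("10.0.0.0/8")
table.add("10.0.0.0/8")  # re-entry is allowed while the old one is being removed
print(table.is_active("10.0.0.0/8"))
```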
Don Slice [Thu, 27 Sep 2018 16:51:59 +0000 (16:51 +0000)]
bgpd: enable aggregation in evpn
Problem encountered where using the aggregate-address command in an
evpn environment did not work properly. Depending on the order of
actions, the aggregate may not be created or removed when either the
commands were issued or routes come and go.
Ticket: CM-20585 Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Donald Sharp [Tue, 7 Nov 2017 14:14:32 +0000 (09:14 -0500)]
bgpd: Add lua match command
Please note this is a Proof of Concept and not actually something
that is ready to commit at this point. The file tools/lua.scr
contains some documentation on how we expect it to work currently.
Additionally not all bgp values have been hooked up into the
ability to lua script yet.
There is still significant work to be done here:
1) Add the ability to pass in more data and to adjust the return values
as appropriate.
To set it up:
1) copy tools/lua.scr into /etc/frr (or wherever the config
directory is)
2) Create a route-map match command:
!
router bgp 55
neighbor 10.50.11.116 remote-as external
!
address-family ipv4 unicast
neighbor 10.50.11.116 route-map TEST in
exit-address-family
!
route-map TEST permit 10
match command mooey
!
3) In the lua.scr file make sure that you have a function
named 'mooey' ( as the above example does ):
In bgp, if we had not configured bgp we were ignoring
interface based callbacks. This led to states where
we may not be processing interface information, and
to states where we do not actually keep
ifp data. As an example:
Suppose vrf A and vrf B. A has interface swp1.
At the same time we only have a `router bgp 9 vrf B`.
When we received the callback for moving swp1
from vrf A to vrf B we were not processing the
move at all and BGP would not consider the interface
part of vrf B at all.
This commit makes bgp pay attention to interface
events irrespective of whether bgp is using that vrf. This
is now consistent with how the lib/if* code expects
to work and with the rest of the daemons in FRR.
Signed-off-by: Donald Sharp <sharpd@cumulsnetworks.com>
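
The swp1 example above can be modelled with a small sketch. It is illustrative
only: InterfaceTracker and its methods are hypothetical names, not bgpd or
lib/if* functions. The point it shows is that interface events are recorded
unconditionally, whether or not a BGP instance exists for the VRF involved.

```python
class InterfaceTracker:
    """Toy model: every name here is hypothetical, not bgpd's real API."""

    def __init__(self):
        self.vrf_of = {}       # interface name -> VRF name
        self.bgp_vrfs = set()  # VRFs that have a `router bgp ... vrf X`

    def configure_bgp(self, vrf):
        self.bgp_vrfs.add(vrf)

    def on_interface_vrf_update(self, ifname, new_vrf):
        # Record the event unconditionally; do not check self.bgp_vrfs here.
        self.vrf_of[ifname] = new_vrf

    def bgp_interfaces(self, vrf):
        # Interfaces a BGP instance in `vrf` would see (empty if none configured).
        if vrf not in self.bgp_vrfs:
            return []
        return sorted(n for n, v in self.vrf_of.items() if v == vrf)


tracker = InterfaceTracker()
tracker.configure_bgp("B")                    # only `router bgp 9 vrf B` exists
tracker.on_interface_vrf_update("swp1", "A")  # swp1 starts in vrf A
tracker.on_interface_vrf_update("swp1", "B")  # swp1 moves to vrf B
print(tracker.bgp_interfaces("B"))            # ['swp1']
```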
Donald Sharp [Thu, 20 Sep 2018 15:31:14 +0000 (11:31 -0400)]
watchfrr: Modify some stderr messages to zlog_warn
The stderr output is not being displayed as part of watchfrr invocation
in system startup, specifically if the user has not properly set
1 or more daemons to monitor. If the end-user is using tools/frr
this stderr is dropped (and systemd appears to drop stderr too?).
Modify the two stderr calls in this situation and use the zlog system.
Now I can clearly see an error message that tells me what has gone wrong.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
[DL: fixed typo]
Conditional code in netlink_macfdb_update() introduced in 2232a77c used
the 'dst_present' variable because not all cases were covered. Now it is
not necessary.
Issue 1: if the router ospf current configuration is "area 0.0.0.2
range 1.0.0.0/24 cost 23" and the user tries to configure "area 0.0.0.2
range 1.0.0.0/24 not-advertise", the existing output is "area 0.0.0.2
range 1.0.0.0/24 cost 23 not-advertise". The keywords "not-advertise"
& "cost" are mutually exclusive, so they should not appear together.
Configuring them in the reverse order works fine.
Fix: When ospf area range "not-advertise" is configured, the cost should
be initialized to OSPF_AREA_RANGE_COST_UNSPEC.
Issue 2: if the router ospf current configuration is "area 0.0.0.2 range
1.0.0.0/24 substitute 2.0.0.0/24" and the user tries to configure "area 0.0.0.2
range 1.0.0.0/24 not-advertise", the existing output is "area 0.0.0.2 range
1.0.0.0/24 not-advertise substitute 2.0.0.0/24". The keywords
"not-advertise" & "substitute" are mutually exclusive, so they should
not appear together. Configuring them in the reverse order works fine.
Fix: When ospf area range "not-advertise" is configured,
ospf_area_range_substitute_unset() should be called.
Issue 3: if the router ospf6 current configuration is "area 0.0.0.2
range 2001::/64 cost 23" and the user tries to configure "area 0.0.0.2 range
2001::/64 advertise", the existing output is "area 0.0.0.2 range 2001::/64".
The keyword "cost 23" disappears.
Fix: When ospf area range "advertise" is configured and the range is not
NULL, the cost should not be modified.
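
The three fixes above amount to a small set of state rules, sketched here as a
toy model. AreaRange, its fields, render(), and the COST_UNSPEC value of -1 are
assumptions made for illustration; only OSPF_AREA_RANGE_COST_UNSPEC and
ospf_area_range_substitute_unset() are named in the text, and their real
definitions are not shown there.

```python
from dataclasses import dataclass
from typing import Optional

# Stand-in for OSPF_AREA_RANGE_COST_UNSPEC; the real value is an assumption.
COST_UNSPEC = -1


@dataclass
class AreaRange:
    """Hypothetical model of one `area ... range ...` entry."""
    prefix: str
    advertise: bool = True
    cost: int = COST_UNSPEC
    substitute: Optional[str] = None

    def set_not_advertise(self):
        self.advertise = False
        self.cost = COST_UNSPEC  # Issue 1: cost cannot coexist with not-advertise
        self.substitute = None   # Issue 2: neither can substitute

    def set_advertise(self):
        self.advertise = True    # Issue 3: leave an existing cost untouched

    def render(self, area):
        parts = [f"area {area} range {self.prefix}"]
        if not self.advertise:
            parts.append("not-advertise")
        if self.cost != COST_UNSPEC:
            parts.append(f"cost {self.cost}")
        if self.substitute is not None:
            parts.append(f"substitute {self.substitute}")
        return " ".join(parts)


r = AreaRange("1.0.0.0/24", cost=23)
r.set_not_advertise()
print(r.render("0.0.0.2"))  # area 0.0.0.2 range 1.0.0.0/24 not-advertise
```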
ospfd: remove unnecessary housekeeping code when using linked lists
The head and tail pointers of linked lists should never be modified
manually; the linked list API guarantees that these pointers are always
valid and up-to-date.
Donald Sharp [Mon, 24 Sep 2018 19:12:36 +0000 (15:12 -0400)]
pimd: Fix several address sanitizer issues
This commit fixes two issues during pim shutdown.
1) The rp_info structure was being freed before the
outgoing notifications that depended on its information
were sent out as part of shutdown.
2) The pim->upstream_list shutdown involved iterating
over the list via ALL_LIST_ELEMENTS. This typically
is enough but pim will auto delete child nodes as well
as itself when it goes away and they depend on it. As such
the node and nnode could possibly already have been freed.
So change the way we look at all the data in the upstream_list.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
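
The hazard in item 2 is the general one of walking a list while each deletion
may remove further entries. One common way around it, sketched below, is to
keep taking the head of the list until it is empty instead of holding a saved
"next" pointer. This is an assumption about the general technique and not
necessarily what the commit does; Upstream, delete_upstream, and shutdown are
hypothetical names.

```python
class Upstream:
    """Hypothetical stand-in for an upstream entry with dependent children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        if parent is not None:
            parent.children.append(self)


def delete_upstream(upstream_list, up, freed):
    # Deleting an entry cascades to its children first, then removes itself.
    for child in list(up.children):
        delete_upstream(upstream_list, child, freed)
    up.children.clear()
    if up in upstream_list:
        upstream_list.remove(up)
        freed.append(up.name)


def shutdown(upstream_list):
    freed = []
    # Always restart from the current head: no saved "next" entry can go stale.
    while upstream_list:
        delete_upstream(upstream_list, upstream_list[0], freed)
    return freed


parent = Upstream("parent")
entries = [parent, Upstream("child1", parent), Upstream("child2", parent)]
print(shutdown(entries))  # ['child1', 'child2', 'parent']
```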
Donald Sharp [Mon, 24 Sep 2018 00:41:49 +0000 (20:41 -0400)]
eigrpd: Fix memory leaks and remove dead/unused functions
During shutdown we were not properly cleaning up some memory
as reported by valgrind. Additionally during cleanup operations
I noticed that there were some dead/unused functions to remove/reduce.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>