vivek [Tue, 24 Mar 2020 20:53:09 +0000 (13:53 -0700)]
bgpd: Ensure link bandwidth extcommunity is not repeated
The BGP link bandwidth extended community must not be repeated. If the
attribute already carries this and the route-map specifies a new value,
the implementation will honor the policy configuration and overwrite
the existing values.
vivek [Tue, 24 Mar 2020 20:50:20 +0000 (13:50 -0700)]
bgpd: Ability to add/update unique extended communities
Certain extended communities cannot be repeated. An example is the
BGP link bandwidth extended community. Enhance the extended community
add function to ensure uniqueness, if requested.
Note: This commit does not change the lack of uniqueness for any of
the already-supported extended communities. Many of them such as the
BGP route target can obviously be present multiple times. Others like
the Router's MAC should most probably be present only once. The portions
of the code which add these may already be structured such that duplicates
do not arise.
vivek [Tue, 24 Mar 2020 19:25:28 +0000 (12:25 -0700)]
bgpd: Install multipath routes with weights
Perform weighted ECMP if the multipaths have link bandwidth. This involves
assigning weights to each of the next hops associated with the prefix based
on the link bandwidth of the corresponding path as a factor of the total
(cumulative) link bandwidth for the prefix. The weight values used are
between 1 and 100. Weights are assigned only if all paths in the multipath
have link bandwidth, otherwise any bandwidths are ignored and regular
ECMP is performed. This is as recommended in
https://tools.ietf.org/html/draft-ietf-idr-link-bandwidth
A subsequent commit will implement additional (user-configurable) behaviors.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com> Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
vivek [Tue, 24 Mar 2020 19:19:37 +0000 (12:19 -0700)]
bgpd: Add link-bandwidth fields for multipath calc
Introduce fields in the multipath structure for link bandwidth handling.
In the process, the mp_count field is changed to a uint16 as that is the
value set anyway.
vivek [Tue, 24 Mar 2020 18:50:44 +0000 (11:50 -0700)]
bgpd: Add link bandwidth route-map commands
Implement route-map option to set the link-bandwidth extended
community. The command is of the form:
set extcommunity bandwidth <(1-26214400)|cumulative|num-multipaths>
[non-transitive]
The options available are to specify the actual bandwidth value in
Mbps, base it on the cumulative downstream bandwidth or base it on
the number of multipaths. The last option is based on
https://tools.ietf.org/html/draft-mohanty-bess-ebgp-dmz. Further,
in alignment with the use case described in this IETF draft, the
extended community is encoded as transitive by default. There is an
option available to specify that it should be non-transitive.
The link-bandwidth itself is carried in bytes per second as specifed in
https://tools.ietf.org/html/draft-ietf-idr-link-bandwidth
Note: This commit only handles the processing for bandwidth specifed
as a value; subsequent commits will handle the processing of the other
options.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com> Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Quentin Young [Mon, 30 Mar 2020 18:28:10 +0000 (14:28 -0400)]
lib: defer grpc plugin initialization to post fork
When using the GRPC northbound plugin, initialization occurs at the
frr_late_init hook. This is called before fork() when daemonizing (using
-d). Because the GRPC library internally creates threads, this means our
threads go away in the child process, so GRPC doesn't work when used
with -d. Rectify this situation by deferring plugin init to after fork
by scheduling a task on the threadmaster, since those are executed by
the child.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Quentin Young [Mon, 30 Mar 2020 18:25:37 +0000 (14:25 -0400)]
watchfrr: change some messages from errors to info
When watchfrr starts up, it first tries to connect to daemons. This is
expected to fail if we are just starting up FRR, but we log it as an
error, and it shows up red in journalctl. Similarly when we fork
background commands that is also logged as an error. This is scaring
users, let's change these to info.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
It was pointed out that we are not properly passing the nexthop
through and instead we were replacing the nexthop as a Route Server
with our own.
https://tools.ietf.org/html/rfc4456#section-4
10. Implementation Considerations
Care should be taken to make sure that none of the BGP path
attributes defined above can be modified through configuration when
exchanging internal routing information between RRs and Clients and
Non-Clients. Their modification could potentially result in routing
loops.
In addition, when a RR reflects a route, it SHOULD NOT modify the
following path attributes: NEXT_HOP, AS_PATH, LOCAL_PREF, and MED.
Their modification could potentially result in routing loops.
Modify the code such that when FRR is instructed to act as a
Route-Server to pass through the nexthop.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Mark Stapp [Fri, 13 Mar 2020 20:52:53 +0000 (16:52 -0400)]
zebra: don't include backup nhs in main nhe dependency tree
We don't want to install backup nexthops - yet - as part of the
nexthop-id-based kernel interactions on netlink platforms. Avoid
mixing backup and primary nexthops in the tree of dependencies
in the ecmp cases.
Mark Stapp [Tue, 24 Dec 2019 19:22:03 +0000 (14:22 -0500)]
zebra: add per-nexthop backup index
Use a backup index in a nexthop directly (if it has a backup
nexthop); revise the zebra nhe/nhg code; revise zapi route
decoding to match; revise the dataplane route datastructs.
Refactor some of the rib_add_multipath code to be prepared to
be called with an nhe, carrying nexthop and (possibly) backup
info together.
saravanank [Fri, 27 Mar 2020 00:36:37 +0000 (17:36 -0700)]
vtysh: Crash during show running-config
RCA:
when client is killed, show running-config command crashes vtysh.
vtysh_client_config function temporarily makes vty->of which is standard output file
pointer to null inorder to suppress output to user.
This call further tries to communicate with each client and when the client
is terminated, socket call fails and hits the exception path to print the
connection has failed using vty_out.
vty_out crashes because vtysh_client_config has temporarily made vty->of
pointer to NULL to supress o/p to user.
Fix:
vty_out function should check for the sanity of vty->of pointer.
If it doesn't exist, this must have hit exception path, so use the
vty->saved_of if exists.
Signed-off-by: Saravanan K <saravanank@vmware.com>
Stephen Worley [Tue, 28 Jan 2020 20:20:18 +0000 (15:20 -0500)]
zebra: add debug for duplicate NH in dataplane array conversion
When we find a nexthop ID thats a duplicate in the code that converts
NHG rb trees into a flat list of nexthop IDs for the dataplane,
output a debug message.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Tue, 28 Jan 2020 19:33:10 +0000 (14:33 -0500)]
zebra: don't add ID to kernel nh_grp if not installed/queued
When we transform the nexthop group rb trees into a flat
array of IDs to send into the dataplane code (zebra_nhg_nhe2grp),
don't put an ID in there that has not been in installed or is
not currently queued to be installed into the dataplane.
Otherwise, if some of the nexthops fail to install, we will
still try to create a group with them and then the entire group
will fail.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Tue, 28 Jan 2020 00:36:01 +0000 (19:36 -0500)]
zebra: handle NHG in NHG dataplane group conversion
We were not properly handling the case of a NHG inside of
another NHG when converting the rb tree of a multilevel NHG
into a flat list of IDs. When constructing, we call the function
zebra_nhg_nhe2grp_internal() recursively so that the rare
case of a group within a group is handled such that its
singleton nexthops are appended to the grp array of IDs
we send to the dataplane code.
Ex)
1:
-> 2:
-> 3
-> 4
->5:
->6
becomes this:
1:
->3
->4
->6
when its sent to the dataplane code for final kernel installation.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Thu, 26 Mar 2020 14:39:16 +0000 (10:39 -0400)]
zebra: remove unnecessary `cmd =` check
In the netlink code for determining whether to set
a src on the route, we check if the cmd=NEW_ROUTE
but its not possible for this to ever be anything
but a new route since we do a goto skip further up
if its a DEL_ROUTE cmd.
So remove this unnecessary check.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Donatas Abraitis [Thu, 26 Mar 2020 14:06:34 +0000 (16:06 +0200)]
bgpd: Show that prefix is malformed if aggregated by 0
Show if this malformed under `show [ip] bgp <prefix>`:
```
eva# sh ip bgp 103.79.124.0/22
BGP routing table entry for 103.79.124.0/22
Paths: (1 available, best #1, table default)
Advertised to non peer-group peers:
192.168.201.136
64539 15096 6939 7545 7545 136001, (aggregated by 0(malformed) 0.0.0.0)
192.168.201.136 from 192.168.201.136 (192.168.201.136)
Origin IGP, valid, external, best (First path received)
Last update: Thu Mar 26 10:02:07 2020
```
once again, for both hello-multiplier and hello-interval
the order in which the number and level were shown in the
cli_show methods was inverted compared to the vtysh command,
which created issues with frr-reload.py.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
David Lamparter [Tue, 24 Mar 2020 18:15:04 +0000 (19:15 +0100)]
*: remove line breaks from log messages
Line break at the end of the message is implicit for zlog_* and flog_*,
don't put it in the string. Mid-message line breaks are currently
unsupported. (LF is "end of message" in syslog.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
David Lamparter [Tue, 10 Dec 2019 16:23:25 +0000 (17:23 +0100)]
sharpd: add "logpump" to bulk test logging
This just generates log messages in bulk for testing logging backend
performance. It's in sharpd so the full "context" of being in a daemon
is available (e.g. different logging configs, parallel load in the main
thread.)
Signed-off-by: David Lamparter <equinox@diac24.net>
Don Slice [Mon, 9 Mar 2020 18:34:53 +0000 (18:34 +0000)]
bgpd: clean up import vrf route-map command
Problem seen that if "import vrf route-map RMAP" was entered
without any vrfs being imported, the configuration was displayed
as "route-map vpn import RMAP". Additionally, if "import vrf
route-map" was entered without specifying a route-map name,
the command was accepted and the word "route-map" would be
treated as a vrf name. This fix resolves both of those issues
and also allows deleting the "import vrf route-map" line without
providing the route-map name.
Ticket: CM-28821 Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Donald Sharp [Sun, 22 Mar 2020 03:37:24 +0000 (23:37 -0400)]
bgpd, lib, ripngd: Add agg_node_get_prefix
Modify code to use lookup function agg_node_get_prefix()
as the abstraction layer. When we rework bgp_node to
bgp_dest this will allow us to greatly limit the amount
of work needed to do that.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Sat, 21 Mar 2020 21:36:48 +0000 (17:36 -0400)]
bgpd: Make bgp_debug_bestpath take a `struct bgp_node`
Defer the grabbing of the prefix for as long as is possible.
This is a long term rework of how we access the `struct bgp_node`
to only use accessor functions.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
saravanank [Tue, 24 Mar 2020 02:57:17 +0000 (19:57 -0700)]
pimd: Reg Suppression expiry has to account for couldreg->false while in prune
Problem: This happened in once in a while during testing the scenario multiple
times. When regstop timer expire and at that point if rpf interface doesn't
exist, the register state for the upstream gets struck in reg-prune state indefinitely.
This will not recover even when rpf comes back and traffic resumed because
register state is struck on prune.
RCA: Reg suppression expiry is keeping reg state unchanged when iif is absent.
Fix: When iif is absent during reg suppression expiry, treat it as couldreg
becoming false and move it NO_INFO state.
Signed-off-by: Saravanan K <saravanank@vmware.com>