Stephen Worley [Thu, 5 Dec 2019 22:00:32 +0000 (17:00 -0500)]
pbrd: use spaces in show pbr map vty output
We were using a mix of spaces and tabsin show pbr map vty output.
Tabs can be inconsistent depending on the system settings.
Using spaces is a safer option for more consistent output.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Thu, 5 Dec 2019 21:17:42 +0000 (16:17 -0500)]
pbrd: make vty nexthop/nexthop-group output consistent
The vty output for pbr maps with a nexthop-group was not
consistent with those configured with an individual nexthop.
Fix that so its easier for users to read.
Stephen Worley [Thu, 5 Dec 2019 20:52:08 +0000 (15:52 -0500)]
pbrd: make show pbr map detail actually work
The `detail` keyword was doing literally nothing. Changed the
default show to be a bit more user friendly and detail
to give the information you might would need for
debugging.
This commit modified the interface up handling code in
ZAPI such that the zclient handled the decoding for you.
Prior to this commit ospf assumed that it could use the
old ifp pointer to know state before reading the stream.
This lead to a situation where ospf would `smartly` track
and do the right thing in this situation. This commit
changed this assumption and in certain scenarios, say
a interface was changed after it was already up would
lead to situations where ospf would not properly handle
the new interface up.
Modify ospf to track data that is important to it in
it's interface->info pointer.
This code pattern was followed in both eigrp and pim.
In eigrp's case it was just behaving weirdly in any event
so fixing this pattern is not a big deal. In pim's
case it was not properly using this so it's a no-op
to fix.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Mark Stapp [Fri, 22 Nov 2019 20:30:53 +0000 (15:30 -0500)]
lib,zebra: use nhg_hash_entry pointer in route_entry
Replace the existing list of nexthops (via a nexthop_group
struct) in the route_entry with a direct pointer to zebra's
new shared group (from zebra_nhg.h). This allows more
direct access to that shared group and the info it carries.
We would overwrite the first one but never actually install it.
In the `set nexthop commands` we explicitly fail if there is
already a `set nexthop` config present. This patch extends the
match src/dest-ip configs to do the same.
In the future we should make all these commands atomic but for
now its better to not fail silently at the very least.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Mon, 25 Nov 2019 05:45:24 +0000 (00:45 -0500)]
pbrd: don't set rule removed on fail
Don't treat a remove failure as a successful remove.
This can cause us to get out of sync with the kernel.
Pbrd makes decisions on rule handling based on its installed
state so this needs to be as close to accurate as possible.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Don Slice [Tue, 3 Dec 2019 14:02:20 +0000 (14:02 +0000)]
zebra: send RA lifetime of 0 before ceasing to advertise RAs
Problem reported by testing agency that RFC4861 section 6.2.5
states that a router should send an RA with a lifetime of 0
before ceasing to send RAs on the interface, or when the interace
is shutdown, or the router is shutdown. This fix adds that capability.
Ticket: CM-27061 Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Donald Sharp [Mon, 2 Dec 2019 14:37:47 +0000 (09:37 -0500)]
bgpd: Prevent crash in bgp_table_range_lookup
The function bgp_table_range_lookup attempts to walk down
the table node data structures to find a list of matching
nodes. We need to guard against the current node from
not matching and not having anything in the child nodes.
Add a bit of code to guard against this.
Traceback that lead me down this path:
Nov 24 12:22:38 frr bgpd[20257]: Received signal 11 at 1574616158 (si_addr 0x2, PC 0x46cdc3); aborting...
Nov 24 12:22:38 frr bgpd[20257]: Backtrace for 11 stack frames:
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(zlog_backtrace_sigsafe+0x67) [0x7fd1ad445957]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(zlog_signal+0x113) [0x7fd1ad445db3]1ad445957]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(+0x70e65) [0x7fd1ad465e65]ad445db3]1ad445957]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libpthread.so.0(+0xf5f0) [0x7fd1abd605f0]45db3]1ad445957]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib/frr/bgpd(bgp_table_range_lookup+0x63) [0x46cdc3]445957]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib64/frr/modules/bgpd_rpki.so(+0x4f0d) [0x7fd1a934ff0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(thread_call+0x60) [0x7fd1ad4736e0]934ff0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(frr_run+0x128) [0x7fd1ad443ab8]e0]934ff0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib/frr/bgpd(main+0x2e3) [0x41c043]1ad443ab8]e0]934ff0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libc.so.6(__libc_start_main+0xf5) [0x7fd1ab9a5505]f0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib/frr/bgpd() [0x41d9bb]main+0xf5) [0x7fd1ab9a5505]f0d]57]
Nov 24 12:22:38 frr bgpd[20257]: in thread bgpd_sync_callback scheduled from bgpd/bgp_rpki.c:351#012; aborting...
Nov 24 12:22:38 frr watchfrr[6779]: [EC 268435457] bgpd state -> down : read returned EOF
Nov 24 12:22:38 frr zebra[5952]: [EC 4043309116] Client 'bgp' encountered an error and is shutting down.
Nov 24 12:22:38 frr zebra[5952]: zebra/zebra_ptm.c:1345 failed to find process pid registration
Nov 24 12:22:38 frr zebra[5952]: client 15 disconnected. 0 bgp routes removed from the rib
I am not really 100% sure what we are really trying to do with this function, but we must
guard against child nodes not having any data.
Fixes: #5440 Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Sun, 1 Dec 2019 14:29:32 +0000 (09:29 -0500)]
bgpd: Fix memory leak in json output of show commands
When dumping a large bit of table data via bgp_show_table
and if there is no information to display for a particular
`struct bgp_node *` the data allocated via json_object_new_array()
is leaked. Not a big deal on small tables but if you have a full
bgp feed and issue a show command that does not match any of
the route nodes ( say `vtysh -c "show bgp ipv4 large-community-list FOO"`)
then we will leak memory.
Before code change and issuing the above show bgp large-community-list command 15-20 times:
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: > 2GB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: > 2GB
Free small blocks: 31 MiB
Free ordinary blocks: 616 KiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
After:
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: 924 MiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 558 MiB
Free small blocks: 26 MiB
Free ordinary blocks: 340 MiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
Please note the 340mb of free ordinary blocks is from the fact I issued a
`show bgp ipv4 uni json` command and generated a large amount of data.
Fixes: #5445 Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Renato Westphal [Thu, 7 Nov 2019 16:20:06 +0000 (13:20 -0300)]
lib: fix display of candidate configurations
Commit 5e6a9350c16 implemented an optimization where candidate
configurations are validated only before being displayed. The
validation is done only to create default child nodes (due to
how libyang works) and any possible error is ignored (candidate
configurations can be invalid/incomplete).
The problem is that we were calling lyd_validate() only when the
CLI "with-defaults" option was used. But some cli_show() callbacks
assume that default nodes exist and can crash when displaying a
candidate configuration that isn't validated. To fix this, call
lyd_validate() before displaying candidate configuration even when
"with-defaults" is not used (that was a micro-optimization that
shouldn't have been done).
David Lamparter [Fri, 29 Nov 2019 23:36:45 +0000 (00:36 +0100)]
lib: gcc 4.x workaround v2 for frr_interface_info
The previous workaround only works for -O0, at higher optimization
levels gcc reorders the statements in the file global scope which breaks
the asm statement :(.
Fixes: #4563 Fixes: #5074 Signed-off-by: David Lamparter <equinox@diac24.net>
Renato Westphal [Fri, 29 Nov 2019 14:50:07 +0000 (11:50 -0300)]
zebra: support LSPs with multiple outgoing labels
For SR-TE we'll need to create Binding-SIDs which are essentially
LSPs that can push multiple outgoing labels. This commit sets the
groundwork for that. Luckily the netlink code didn't need to be
changed since it already supports pushing label stacks.
David Lamparter [Tue, 26 Nov 2019 16:05:47 +0000 (17:05 +0100)]
lib: add gcc 4.x workaround for frr_interface_info
gcc 4.x does not properly support structs with variable length array
members. Specifically, for global variables, it completely ignores the
array, coming up with a size much smaller than what is correct. This is
broken for both sizeof() as well as ELF object size.
This breaks for frr_interface_info since this variable is in some cases
copy relocated by the linker. (The linker does this to make the address
of the variable a "constant" for the main program.) This copying uses
the ELF object size, thereby copying only the non-array part of the
struct.
Breakage ensues...
(This fix is a bit ugly, but it's limited to very old gcc, and it's
better than changing the array to "nodes[1000]" and wasting memory...)
Fixes: #4563 Fixes: #5074 Signed-off-by: David Lamparter <equinox@diac24.net>
Chirag Shah [Mon, 25 Nov 2019 22:34:29 +0000 (14:34 -0800)]
bgpd: Handle possible non-selection of local route
In rare situations, the local route in a VNI may not get selected as the
best route. One situation is during a race between bgp and zebra which
was addressed in a prior commit. This change addresses another situation
where due to a change of tunnel IP, it is possible that a received route
may be selected as the best route if the path selection needs to take
next hop IPs into consideration. This is a pretty convoluted scenario,
but the code should handle it and delete and withdraw the local route
as well as (re)install the received route.
Ticket: CM-24114
Reviewed By: CCR-9487
Testing Done:
1. Manual tests - note, problem is not readily reproducible
2. evpn-smoke - results documented in the ticket
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com> Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Ameya Dharkar [Fri, 22 Nov 2019 23:48:37 +0000 (15:48 -0800)]
bgpd: Do not perform "connected" check for EVPN nexthop
This changeset follows the PR
https://github.com/FRRouting/frr/pull/5334
Above PR adds nexthop tracking support for EVPN RT-5 nexthops.
This route is marked VALID only if the BGP route has a valid nexthop.
If the EVPN peer is an EBGP pee and "disable_connected_check" flag is not set,
"connected" check is performed for the EVPN nexthop.
But, usually EVPN nexthop is not the BGP peering address, but the VTEP address.
Also, NEXTHOP_UNCHANGED flag is enabled by default for EVPN.
As a result, in a common deployment for EVPN, EVPN nexthop is not connected.
Thus, adding a fix to remove the "connected" check for EVPN nexthops.
Don Slice [Fri, 22 Nov 2019 17:31:29 +0000 (17:31 +0000)]
zebra: knob to make ra retransmit interval rfc compliant
Problem reported by testing facility that our sending of Router
Advertisements more frequently than once very three seconds is not
compliant with rfc4861. Added a knob to turn off fast retransmits
in order to meet the requirement of the RFC.
Ticket: CM-27063 Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Chirag Shah [Mon, 28 Oct 2019 22:05:05 +0000 (15:05 -0700)]
bgpd: adv pip to throw warning under default vrf
Instead of CMD_WARNING, use CMD_WARNING_CONFIG_FAILED
for any mis-configuration scenario.
Testing Done:
TOR(config)# router bgp 5548
TOR(config-router)# address-family l2vpn evpn
TOR(config-router-af)# no advertise-pip
This command is supported under L3VNI BGP EVPN VRF
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>