anlan_cs [Sat, 17 Dec 2022 08:25:56 +0000 (16:25 +0800)]
zebra: fix wrong gateway for fpm debug
The wrong parameter is passed in `inet_ntop()` of `zfpm_log_route_info()` in
old fpm module, so the display of gateway is always wrong. Just remove
that extra ampersand.
Additionally, use "none" as gateway value for the case of no gateway.
Donald Sharp [Fri, 16 Dec 2022 12:38:58 +0000 (07:38 -0500)]
lib, staticd: return values even after an assert
When compiling with -fsanitize=thread. I started getting this error:
staticd/static_zebra.c: In function ‘static_zebra_nht_get_prefix’:
staticd/static_zebra.c:316:1: error: control reaches end of non-void function [-Werror=return-type]
316 | }
| ^
Just to make future efforts still work, let's just make the compiler happy.
Trey Aspelund [Thu, 15 Dec 2022 18:49:43 +0000 (18:49 +0000)]
bgpd: cleanup macip_path_list for remote macip
ASAN reported the following memleak:
```
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x4d4342 in calloc (/usr/lib/frr/bgpd+0x4d4342)
#1 0xbc3d68 in qcalloc /home/sharpd/frr8/lib/memory.c:116:27
#2 0xb869f7 in list_new /home/sharpd/frr8/lib/linklist.c:64:9
#3 0x5a38bc in bgp_evpn_remote_ip_hash_alloc /home/sharpd/frr8/bgpd/bgp_evpn.c:6789:24
#4 0xb358d3 in hash_get /home/sharpd/frr8/lib/hash.c:162:13
#5 0x593d39 in bgp_evpn_remote_ip_hash_add /home/sharpd/frr8/bgpd/bgp_evpn.c:6881:7
#6 0x59dbbd in install_evpn_route_entry_in_vni_common /home/sharpd/frr8/bgpd/bgp_evpn.c:3049:2
#7 0x59cfe0 in install_evpn_route_entry_in_vni_ip /home/sharpd/frr8/bgpd/bgp_evpn.c:3126:8
#8 0x59c6f0 in install_evpn_route_entry /home/sharpd/frr8/bgpd/bgp_evpn.c:3318:8
#9 0x59bb52 in install_uninstall_route_in_vnis /home/sharpd/frr8/bgpd/bgp_evpn.c:3888:10
#10 0x59b6d2 in bgp_evpn_install_uninstall_table /home/sharpd/frr8/bgpd/bgp_evpn.c:4019:5
#11 0x578857 in install_uninstall_evpn_route /home/sharpd/frr8/bgpd/bgp_evpn.c:4051:9
#12 0x58ada6 in bgp_evpn_import_route /home/sharpd/frr8/bgpd/bgp_evpn.c:6049:9
#13 0x713794 in bgp_update /home/sharpd/frr8/bgpd/bgp_route.c:4842:3
#14 0x583fa0 in process_type2_route /home/sharpd/frr8/bgpd/bgp_evpn.c:4518:9
#15 0x5824ba in bgp_nlri_parse_evpn /home/sharpd/frr8/bgpd/bgp_evpn.c:5732:8
#16 0x6ae6a2 in bgp_nlri_parse /home/sharpd/frr8/bgpd/bgp_packet.c:363:10
#17 0x6be6fa in bgp_update_receive /home/sharpd/frr8/bgpd/bgp_packet.c:2020:15
#18 0x6b7433 in bgp_process_packet /home/sharpd/frr8/bgpd/bgp_packet.c:2929:11
#19 0xd00146 in thread_call /home/sharpd/frr8/lib/thread.c:2006:2
```
The list itself was not being cleaned up when the final list entry was
removed, so make sure we do that instead of leaking memory.
Donald Sharp [Wed, 14 Dec 2022 20:53:06 +0000 (15:53 -0500)]
lib: Fix free function
The list delete function on creation was set to srv6_locator_chunk_free
Which takes a double pointer and dereferences it to free the data.
When list_delete is called it calls the delete function like this:
if (*list->del)
(*list->del)(node->data);
The data is not passed in by reference and as such we do not have
a double pointer. Fortunately this list_delete is only really
called on shutdown when the locator was deleted and we do not
have a fun situation where we were suddenly freeing 'something'.
Donald Sharp [Wed, 14 Dec 2022 19:20:26 +0000 (14:20 -0500)]
zebra: When freeing the early route queue, actually free it right
The early route queue has a series of `struct zebra_early_route *`
entries. Zebra is treating this memory as just a `struct route entry`.
This is wrong. Correct this to free the memory correctly.
Donald Sharp [Wed, 14 Dec 2022 19:05:11 +0000 (14:05 -0500)]
lib, tests, zebra: Remove unused workqueue error function
The wq->spec.errorfunc is never used in the code.
It's been in the code base since 2005 and I also
do not remember ever seeing it being called. No
workqueue process function ever returns error.
Since it's not used let's just remove it from the
code base.
Donald Sharp [Wed, 14 Dec 2022 17:53:53 +0000 (12:53 -0500)]
tests: Limit run of config_timing when building with --enable-address-sanitizer
Building FRR with --enable-address-sanitizer and then running the
config_timing test makes the test run for over an hour on my machine.
The goal of this test is to ensure that the test runs 10000 routes
in/out in a reasonable amount of time. We cannot test this with
address-sanitizer enabled. So just make the test meaningless
from a timing perspective but keep it `alive` from a it might
catch some address sanitizer issue with 50 -vs- 10000 routes
Direct leak of 128 byte(s) in 4 object(s) allocated from:
#0 0x4bd732 in calloc (/usr/lib/frr/zebra+0x4bd732)
#1 0x7feaeab8f798 in qcalloc /home/sharpd/frr8/lib/memory.c:116:27
#2 0x7feaeaba40f4 in nexthop_group_new /home/sharpd/frr8/lib/nexthop_group.c:270:9
#3 0x56859b in netlink_route_change_read_unicast /home/sharpd/frr8/zebra/rt_netlink.c:950:9
#4 0x5651c2 in netlink_route_change /home/sharpd/frr8/zebra/rt_netlink.c:1204:2
#5 0x54af15 in netlink_information_fetch /home/sharpd/frr8/zebra/kernel_netlink.c:407:10
#6 0x53e7a3 in netlink_parse_info /home/sharpd/frr8/zebra/kernel_netlink.c:1184:12
#7 0x548d46 in kernel_read /home/sharpd/frr8/zebra/kernel_netlink.c:501:2
#8 0x7feaeacc87f6 in thread_call /home/sharpd/frr8/lib/thread.c:2006:2
#9 0x7feaeab36503 in frr_run /home/sharpd/frr8/lib/libfrr.c:1198:3
#10 0x550d38 in main /home/sharpd/frr8/zebra/main.c:476:2
#11 0x7feaea492d09 in __libc_start_main csu/../csu/libc-start.c:308:16
Indirect leak of 576 byte(s) in 4 object(s) allocated from:
#0 0x4bd732 in calloc (/usr/lib/frr/zebra+0x4bd732)
#1 0x7feaeab8f798 in qcalloc /home/sharpd/frr8/lib/memory.c:116:27
#2 0x7feaeab9b3f8 in nexthop_new /home/sharpd/frr8/lib/nexthop.c:373:7
#3 0x56875e in netlink_route_change_read_unicast /home/sharpd/frr8/zebra/rt_netlink.c:960:15
#4 0x5651c2 in netlink_route_change /home/sharpd/frr8/zebra/rt_netlink.c:1204:2
#5 0x54af15 in netlink_information_fetch /home/sharpd/frr8/zebra/kernel_netlink.c:407:10
#6 0x53e7a3 in netlink_parse_info /home/sharpd/frr8/zebra/kernel_netlink.c:1184:12
#7 0x548d46 in kernel_read /home/sharpd/frr8/zebra/kernel_netlink.c:501:2
#8 0x7feaeacc87f6 in thread_call /home/sharpd/frr8/lib/thread.c:2006:2
#9 0x7feaeab36503 in frr_run /home/sharpd/frr8/lib/libfrr.c:1198:3
#10 0x550d38 in main /home/sharpd/frr8/zebra/main.c:476:2
#11 0x7feaea492d09 in __libc_start_main csu/../csu/libc-start.c:308:16
SUMMARY: AddressSanitizer: 704 byte(s) leaked in 8 allocation(s).
Louis Scalbert [Mon, 14 Feb 2022 13:18:10 +0000 (14:18 +0100)]
bgpd: fix the IGP metric for best path selection on VPN import
Since the commit da0c0ef70c ("bgpd: VRF-Lite fix best path selection"),
the best path selection is made from the comparison of the attributes
of the original route i.e. the ultimate path.
The IGP metric is currently set on the child path instead of the
ultimate path (i.e. the parent path). On eBGP, the ultimate path is the
child path. However, for imported routes, the ultimate path is always
set to 0, which results in skipping the IGP metric comparison when
selecting the best path.
Set the IGP metric on the ultimate path when a BGP nexthop is added or
updated.
Fixes: da0c0ef70c ("bgpd: VRF-Lite fix best path selection") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
bgpd: Add support for flowspec prefixes in bgp_packet_mpattr_prefix_size
Currently, bgp_packet_mpattr_prefix_size (bgpd/bgp_attr.c:3978) always returns zero for Flowspec prefixes.
This is because, for flowspec prefixes, the prefixlen attribute of the prefix struct is always set to 0, and the actual length is bytes is set inside the flowspec_prefix struct instead (see lib/prefix.h:293 and lib/prefix.h:178).
Because of this, with a large number of flowspec NLRIs, bgpd ends up building update messages that exceed the maximum size and cause the peer to drop the connection (bgpd/bgp_updgrp_packet.c:L719).
The proposed change allows the bgp_packet_mpattr_prefix_size to return correct result for flowspec prefixes.
Donald Sharp [Thu, 3 Nov 2022 23:01:36 +0000 (19:01 -0400)]
bgpd: use the enum instead of an int
The bgp_fsm_change_status function takes an int
for the new bgp state, which is an `enum bgp_fsm_status status`
let's convert over to being explicit.bgpd: use the enum instead of an int
Kuldeep Kashyap [Fri, 9 Dec 2022 15:21:29 +0000 (07:21 -0800)]
tests: Topotests daemon start as per feature test
Currently topotests starts all daemons by default,
made changes to f/w so only needed daemons can
be started, daemons which are needed to tests
particular test suite.
David Lamparter [Tue, 13 Dec 2022 17:34:41 +0000 (18:34 +0100)]
accords: CLI coloring
This is pretty much a writeup of the discussion we had on the FRR weekly
call about an hour ago. I added a bunch of detail but hopefully it
still represents the overall consensus.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Rafael Zalamena [Mon, 12 Dec 2022 18:11:27 +0000 (15:11 -0300)]
ospfd: fix memory leak on SPF calculation
Fix the following problems:
- Always free vertex next hops on `vertex_parent_free`
- Signalize failure on `ospf_spf_add_parent` when parent already exists
so the caller has the chance to `free()` any allocated resources.
- Don't reuse vertex next hops without the reference count logic in
`ospf_nexthop_calculation`. Instead allocate a new copy so it can be
`free()`d later without complications
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Donald Sharp [Tue, 4 Oct 2022 19:41:36 +0000 (15:41 -0400)]
zebra: Add ctx to netlink message parsing
Add the initial step of passing in a dplane context
to reading route netlink messages. This code
will be run in two contexts:
a) The normal pthread for reading netlink messages from
the kernel
b) The dplane_fpm_nl pthread.
The goal of this commit is too just allow a) to work
b) will be filled in in the future. Effectively
everything should still be working as it should
pre this change. We will just possibly allow
the passing of the context around( but not used )
Donald Sharp [Mon, 3 Oct 2022 17:22:22 +0000 (13:22 -0400)]
zebra: Rearrange dplane_ctx_route_init
In order for a future commit to abstract the dplane_ctx_route_init
so that the kernel can use it, let's move some stuff around
and add a dplane_ctx_route_init_basic that can be used by multiple
different paths
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
create a dplane_ctx_route_init_basic so it can be used
Volta submitted notification changes for the dplane that had a
special use case for their system. Volta is no more, the code
is not being actively developed and from talking with ex-Volta
employees there is no current plans to even maintain this code.
Wrap the special handling of nexthops that their asic-dataplane
did in a bit of code to isolate it and allow for future removal,
as that I do not actually believe anyone else is using this code.
Add a CPP_NOTICE several years into the future that will tell us
to remove the code. If someone starts using it then they will
have to notice this variable to set it and hopefully they will
see my CPP_NOTICE to come talk to us. If this is being used then
we can just remove this wrapper.
Donatas Abraitis [Sun, 11 Dec 2022 19:00:19 +0000 (21:00 +0200)]
bgpd: Fix graceful-restart JSON outputs and the crash
Without this patch:
```
donatas-pc# show bgp neighbors graceful-restart json
vtysh: error reading from bgpd: Resource temporarily unavailable (11)Warning: closing connection to bgpd because of an I/O error!
donatas-pc#
```
And, invalid JSON generated when multiple neighbors exist due to json_neighbor
being freed in a loop.
In case of BGP unnumbered, BGP fails to free the nexthop
node for peer if the interface is shutdown before
unconfiguring/deleting the BGP neighbor.
This is because, when the interface is shutdown,
peer's LL neighbor address will be cleared. Therefore,
during neighbor deletion, since the peer's neighbor
address is not available, BGP will skip freeing the
nexthop node of this peer. This results in a stale
nexthop node that points to a peer that's already
been freed.
bgpd: Free memory allocated by info_make() when hitting maximum-prefix
```
Direct leak of 112 byte(s) in 1 object(s) allocated from:
0 0x7feb66337a06 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:153
1 0x7feb660cbcc3 in qcalloc lib/memory.c:116
2 0x55cc3cba02d1 in info_make bgpd/bgp_route.c:3831
3 0x55cc3cbab4f1 in bgp_update bgpd/bgp_route.c:4733
4 0x55cc3cbb0620 in bgp_nlri_parse_ip bgpd/bgp_route.c:6111
5 0x55cc3cb79473 in bgp_update_receive bgpd/bgp_packet.c:2020
6 0x55cc3cb7c34a in bgp_process_packet bgpd/bgp_packet.c:2929
7 0x7feb6610ecc5 in thread_call lib/thread.c:2006
8 0x7feb660bfb77 in frr_run lib/libfrr.c:1198
9 0x55cc3cb17232 in main bgpd/bgp_main.c:520
10 0x7feb65ae5082 in __libc_start_main ../csu/libc-start.c:308
SUMMARY: AddressSanitizer: 112 byte(s) leaked in 1 allocation(s).
```
Ryoga Saito [Tue, 6 Dec 2022 17:27:16 +0000 (02:27 +0900)]
tests: Add topotest bgp_vrf_leaking_5549_routes
To verify previous changes, this PR introduces topotest to verify
whether imported routes learnt from BGP unnumbered peers will be active
on VPN RIB and other VRF RIB.
Trey Aspelund [Wed, 7 Dec 2022 19:57:09 +0000 (14:57 -0500)]
doc: add FRR/Linux configuration examples for EVPN
The existing EVPN documentation in bgp.rst does not provide a holistic
configuration, just examples of individual features, and doesn't give
an operator any idea of what a compatible Linux netdev configuration
might look like. This introduces evpn.rst which includes a sample
frr.conf and corresponding Linux interface config (via iproute2) that
an operator can use to setup a basic EVPN topology and model their
interface manager's config from.
This initial version of evpn.rst shows Linux netdev config for
traditional bridges (vlan_filtering=0) and traditional vxlan devices
(single VNI). Later changes to this file will cover the use of
VLAN-aware bridges (vlan_filtering=1), single VXLAN devices
(multi VNI), and eventually bonds (for EVPN-MH).
Eventually the plan is to move the existing EVPN content from bgp.rst
into evpn.rst, but for now let's get some user-facing documentation in
place for interface configs.