Donatas Abraitis [Wed, 27 Mar 2024 17:08:38 +0000 (19:08 +0200)]
bgpd: Prevent from one more CVE triggering this place
If we receive an attribute that is handled by bgp_attr_malformed(), use
treat-as-withdraw behavior for unknown (or missing to add - if new) attributes.
Donatas Abraitis [Wed, 27 Mar 2024 16:42:56 +0000 (18:42 +0200)]
bgpd: Fix error handling when receiving BGP Prefix SID attribute
Without this patch, we always set the BGP Prefix SID attribute flag without
checking if it's malformed or not. RFC8669 says that this attribute MUST be discarded.
Also, this fixes the bgpd crash when a malformed Prefix SID attribute is received,
with malformed transitive flags and/or TLVs.
Piotr Suchy [Thu, 28 Mar 2024 11:55:35 +0000 (12:55 +0100)]
vtysh, zebra: Fix malformed json output for multiple vrfs in command 'show ip route vrf all json'
Command 'show ip route vrf <vrf_name> json' returns a valid json object,
however if instead of <vrf_name> we specify 'all', we get an invalid json
object, like:
Philippe Guibert [Fri, 29 Mar 2024 07:35:34 +0000 (08:35 +0100)]
bgpd: fix srv6 memory leak detection
The asan memory leak has been detected:
> Direct leak of 16 byte(s) in 1 object(s) allocated from:
> #0 0x7f9066dadd28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
> #1 0x7f9066779b5d in qcalloc lib/memory.c:105
> #2 0x556d6ca527c2 in vpn_leak_zebra_vrf_sid_update_per_af bgpd/bgp_mplsvpn.c:389
> #3 0x556d6ca530e1 in vpn_leak_zebra_vrf_sid_update bgpd/bgp_mplsvpn.c:451
> #4 0x556d6ca64b3b in vpn_leak_postchange bgpd/bgp_mplsvpn.h:311
> #5 0x556d6ca64b3b in vpn_leak_postchange_all bgpd/bgp_mplsvpn.c:3751
> #6 0x556d6cb9f116 in bgp_zebra_process_srv6_locator_chunk bgpd/bgp_zebra.c:3337
> #7 0x7f906685a6b6 in zclient_read lib/zclient.c:4490
> #8 0x7f9066826a32 in event_call lib/event.c:2011
> #9 0x7f906675c444 in frr_run lib/libfrr.c:1217
> #10 0x556d6c980d52 in main bgpd/bgp_main.c:545
> #11 0x7f9065784c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
Fix this by freeing the previous memory chunk.
Fixes: b72c9e14756f ("bgpd: cli for SRv6 SID alloc to redirect to vrf (step4)") Fixes: 527588aa78b2 ("bgpd: add support for per-VRF SRv6 SID") Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit eea8a8ae248ed726449878c7a74705d779469fda)
Tomi Salminen [Wed, 13 Dec 2023 09:37:21 +0000 (11:37 +0200)]
zebra: Fix crash on macvlan link down/up
Whenever a link up change was detected on a macvlan device where
the linked device wasn't visible in the namespace zebra was
running in, the linked zebra interface was NULL. This was already
handled in the event of a link down, but was ommitted from the
upside. Added the same null check to the up-side.
Christian Hopps [Wed, 20 Mar 2024 19:20:18 +0000 (19:20 +0000)]
grpc: fix grpc for various failures
lib: don't define a `fallthrough` in c++ to avoid conflict with protobuf c++
check: add link libs required by some versions of grpc++ or it's dependent
linked libs
tests: don't fail the test due to known at exit memleaks Signed-off-by: Christian Hopps <chopps@labn.net>
(cherry picked from commit 043a4183c2f10e6117695dec7a0373c1b0a63808)
Donatas Abraitis [Fri, 15 Mar 2024 11:49:06 +0000 (13:49 +0200)]
bgpd: Update default-originate route-map actual map structure
If using with `bgp listen range ... peer-group x`, default_rmap[afi][safi] is not
updated, and after the hard-reset in other side, this is flushed and never updated
again without restarting the sender BGP daemon.
Split zebra's vrf_terminate() into disable() and delete() stages.
The former enqueues all events for the dplane thread.
Memory freeing is performed in the second stage.
Donald Sharp [Sat, 2 Mar 2024 14:50:38 +0000 (09:50 -0500)]
bgpd: Ensure community data is freed in some cases.
Customer has this valgrind trace:
Direct leak of 2829120 byte(s) in 70728 object(s) allocated from:
0 in community_new ../bgpd/bgp_community.c:39
1 in community_uniq_sort ../bgpd/bgp_community.c:170
2 in route_set_community ../bgpd/bgp_routemap.c:2342
3 in route_map_apply_ext ../lib/routemap.c:2673
4 in subgroup_announce_check ../bgpd/bgp_route.c:2367
5 in subgroup_process_announce_selected ../bgpd/bgp_route.c:2914
6 in group_announce_route_walkcb ../bgpd/bgp_updgrp_adv.c:199
7 in hash_walk ../lib/hash.c:285
8 in update_group_af_walk ../bgpd/bgp_updgrp.c:2061
9 in group_announce_route ../bgpd/bgp_updgrp_adv.c:1059
10 in bgp_process_main_one ../bgpd/bgp_route.c:3221
11 in bgp_process_wq ../bgpd/bgp_route.c:3221
12 in work_queue_run ../lib/workqueue.c:282
The above leak detected by valgrind was from a screenshot so I copied it
by hand. Any mistakes in line numbers are purely from my transcription.
Additionally this is against a slightly modified 8.5.1 version of FRR.
Code inspection of 8.5.1 -vs- latest master shows the same problem
exists. Code should be able to be followed from there to here.
What is happening:
There is a route-map being applied that modifes the outgoing community
to a peer. This is saved in the attr copy created in
subgroup_process_announce_selected. This community pointer is not
interned. So the community->refcount is still 0. Normally when
a prefix is announced, the attr and the prefix are placed on a
adjency out structure where the attribute is interned. This will
cause the community to be saved in the community hash list as well.
In a non-normal operation when the decision to send is aborted after
the route-map application, the attribute is just dropped and the
pointer to the community is just dropped too, leading to situations
where the memory is leaked. The usage of bgp suppress-fib would
would be a case where the community is caused to be leaked.
Additionally the previous commit where an unsuppress-map is used
to modify the outgoing attribute but since unsuppress-map was
not considered part of outgoing policy the attribute would be dropped as
well. This pointer drop also extends to any dynamically allocated
memory saved by the attribute pointer that was not interned yet as well.
So let's modify the return case where the decision is made to
not send the prefix to the peer to always just flush the attribute
to ensure memory is not leaked.
Donald Sharp [Wed, 13 Mar 2024 14:26:58 +0000 (10:26 -0400)]
bgpd: Ensure that the correct aspath is free'd
Currently in subgroup_default_originate the attr.aspath
is set in bgp_attr_default_set, which hashs the aspath
and creates a refcount for it. If this is a withdraw
the subgroup_announce_check and bgp_adj_out_set_subgroup
is called which will intern the attribute. This will
cause the the attr.aspath to be set to a new value
finally at the bottom of the function it intentionally
uninterns the aspath which is not the one that was
created for this function. This reduces the other
aspath's refcount by 1 and if a clear bgp * is issued
fast enough the aspath for that will be removed
and the system will crash.
Donatas Abraitis [Thu, 29 Feb 2024 12:37:40 +0000 (14:37 +0200)]
docker: Do not use pip Python package manager
Alpine Linux gets this with 3.19:
This is already installed with `pytest` via apk package manager.
```
15 78.20 error: externally-managed-environment
15 78.20
15 78.20 × This environment is externally managed
15 78.20 ╰─>
15 78.20 The system-wide python installation should be maintained using the system
15 78.20 package manager (apk) only.
15 78.20
15 78.20 If the package in question is not packaged already (and hence installable via
15 78.20 "apk add py3-somepackage"), please consider installing it inside a virtual
15 78.20 environment, e.g.:
15 78.20
15 78.20 python3 -m venv /path/to/venv
15 78.20 . /path/to/venv/bin/activate
15 78.20 pip install mypackage
15 78.20
15 78.20 To exit the virtual environment, run:
15 78.20
15 78.20 deactivate
15 78.20
15 78.20 The virtual environment is not deleted, and can be re-entered by re-sourcing
15 78.20 the activate file.
15 78.20
15 78.20 To automatically manage virtual environments, consider using pipx (from the
15 78.20 pipx package).
15 78.20
15 78.20 note: If you believe this is a mistake, please contact your Python installation or OS distribution provider. You can override this, at the risk of breaking your Python installation or OS, by passing --break-system-packages.
```
Donatas Abraitis [Thu, 29 Feb 2024 12:21:27 +0000 (14:21 +0200)]
vtysh: Include fnctl.h for vtysh_main
Fixing compilation for Alpine Linux:
```
25 91.59 vtysh/vtysh_main.c: In function 'vtysh_flock_config':
25 91.59 vtysh/vtysh_main.c:276:20: warning: implicit declaration of function 'open'; did you mean 'popen'? [-Wimplicit-function-declaration]
25 91.59 276 | flock_fd = open(flock_file, O_RDONLY, 0644);
25 91.59 | ^~~~
25 91.59 | popen
25 91.60 vtysh/vtysh_main.c:276:37: error: 'O_RDONLY' undeclared (first use in this function)
25 91.60 276 | flock_fd = open(flock_file, O_RDONLY, 0644);
25 91.60 | ^~~~~~~~
25 91.60 vtysh/vtysh_main.c:276:37: note: each undeclared identifier is reported only once for each function it appears in
25 91.60 CC zebra/if_netlink.o
25 91.61 vtysh/vtysh_main.c: In function 'main':
25 91.61 vtysh/vtysh_main.c:637:49: error: 'O_CREAT' undeclared (first use in this function)
25 91.61 637 | fp = open(history_file, O_CREAT | O_EXCL,
25 91.61 | ^~~~~~~
25 91.62 vtysh/vtysh_main.c:637:59: error: 'O_EXCL' undeclared (first use in this function)
25 91.62 637 | fp = open(history_file, O_CREAT | O_EXCL,
25 91.62 | ^~~~~~
```
Donald Sharp [Fri, 8 Mar 2024 18:04:34 +0000 (18:04 +0000)]
pimd: Cleanup inclusion of headers
FRR needs to properly include the FreeBSD headers for
compilation on FreeBSD. I have setup v6 as well
but I have not even tested it. Since I know
that the form is the same I think this is ok
at the moment. This is a step forward.
Because of this change *clearly* no-one is even
using pim on FreeBSD. <look at the MRT_XXX values
to prove to yourself>. In any event this is a step
in the direction of getting that working again.
Igor Ryzhov [Mon, 4 Mar 2024 18:41:41 +0000 (20:41 +0200)]
lib: fix infinite loop in __darr_in_vsprintf
`darr_avail` returns the available capacity excluding the already
existing terminating NULL byte. Take this into account when using
`darr_avail`. Otherwise, if the error length is a power of 2, the
capacity is never enough and the function stucks in an infinite loop.
Igor Ryzhov [Mon, 5 Feb 2024 17:04:39 +0000 (19:04 +0200)]
lib: fix __darr_in_vsprintf
If the initial darr capacity is not enough for the output, the `ap` is
reused multiple times, which is wrong, because it may be altered by
`vsnprintf`. Make a copy of `ap` each time instead of reusing.
Louis Scalbert [Thu, 15 Feb 2024 12:28:02 +0000 (13:28 +0100)]
bgpd: fix 6vpe nexthop
6vPE enables the announcement of IPv6 VPN prefixes through an IPv4 BGP
session. In this scenario, the next hop addresses for these prefixes are
represented in an IPv4-mapped IPv6 format, noted as ::ffff:[IPv4]. This
format indicates to the peer that it should route these IPv6 addresses
using information from the IPv4 nexthop. For example:
> Path Attribute - MP_REACH_NLRI
> [...]
> Address family identifier (AFI): IPv6 (2)
> Subsequent address family identifier (SAFI): Labeled VPN Unicast (128)
> Next hop: RD=0:0 IPv6=::ffff:192.0.2.5 RD=0:0 Link-local=fe80::501d:42ff:feef:b021
> Number of Subnetwork points of attachment (SNPA): 0
This rule is set out in RFC4798:
> The IPv4 address of the egress 6PE router MUST be encoded as an
> IPv4-mapped IPv6 address in the BGP Next Hop field.
However, in some situations, bgpd sends a standard nexthop IPv6 address
instead of an IPv4-mapped IPv6 address because the outgoing interface for
the BGP session has a valid IPv6 address. This is problematic because
the peer router may not be able to route the nexthop IPv6 address (ie.
if the outgoing interface has not IPv6).
Fix the issue by always sending a IPv4-mapped IPv6 address as nexthop
when the BGP session is on IPv4 and address family IPv6.
Philippe Guibert [Mon, 13 Mar 2023 09:47:16 +0000 (10:47 +0100)]
topotests: add an ebgp 6vpe test
This test uses the connected ipv4 mapped ipv6 prefix
to resolve the received BGP routes.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com> Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com> Signed-off-by: François Dumontet <francois.dumontet@6wind.com>
(cherry picked from commit 4d7df91752d7414d9719a361a2fd4cc30943dc96)
Louis Scalbert [Tue, 20 Feb 2024 16:49:01 +0000 (17:49 +0100)]
zebra: fix crash if macvlan link in another netns
A macvlan interface can have its underlying link-interface in another
namespace (aka. netns). However, by default, zebra does not know the
interface from the other namespaces. It results in a crash the pointer
to the link interface is NULL.
Fix the crash by returning when the macvlan link-interface is in another
namespace. No need to go further because any vxlan under the macvlan
interface would not be accessible by zebra.
Olivier Dugeon [Mon, 26 Feb 2024 09:40:34 +0000 (10:40 +0100)]
ospfd: Solved crash in OSPF TE parsing
Iggy Frankovic discovered an ospfd crash when perfomring fuzzing of OSPF LSA
packets. The crash occurs in ospf_te_parse_te() function when attemping to
create corresponding egde from TE Link parameters. If there is no local
address, an edge is created but without any attributes. During parsing, the
function try to access to this attribute fields which has not been created
causing an ospfd crash.
The patch simply check if the te parser has found a valid local address. If not
found, we stop the parser which avoid the crash.
Igor Ryzhov [Sun, 25 Feb 2024 23:00:17 +0000 (01:00 +0200)]
lib: fix prefix-list entry update
When a prefix-list entry is updated, current NB code calls the
replacement code multiple times, once per each updated field. It means
that when multiple fields of an entry are changed in a single commit,
the replacement is done with an interim state of a prefix-list instead
of a final one. To fix the issue, we should call the replacement code
once, after all fields of an entry are updated.
Igor Ryzhov [Sun, 25 Feb 2024 21:12:14 +0000 (23:12 +0200)]
lib: fix access-list entry update
When an access-list entry is updated, current NB code calls notification
hooks for each updated field. It means that when multiple fields of an
entry are changed in a single commit, the hooks are run with an interim
state of an access-list instead of a final one. To fix the issue, we
should call the hooks once, after all fields of an entry are updated.
adding a tests about:
"no bgp as-path access-list" command.
the folloxing "clear bgp *" command leads to the
crash exhibited above.
a sleep had been added to capture the crash befor the end of scenario.
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f5f05cbb9c0 (LWP 1371086))]
(gdb) bt
context=0x7ffcf2c216c0) at lib/sigevent.c:248
acl_list=0x55c976ec03c0) at bgpd/bgp_aspath.c:1688
dummy=0x7ffcf2c22340, object=0x7ffcf2c21e70) at bgpd/bgp_routemap.c:2401
match_object=0x7ffcf2c21e70, set_object=0x7ffcf2c21e70, pref=0x0)
at lib/routemap.c:2687
attr=0x7ffcf2c220b0, afi=AFI_IP, safi=SAFI_UNICAST, rmap_name=0x0, label=0x0,
num_labels=0, dest=0x55c976ebeaf0) at bgpd/bgp_route.c:1807
addpath_id=0, attr=0x7ffcf2c22450, afi=AFI_IP, safi=SAFI_UNICAST, type=10,
sub_type=0, prd=0x0, label=0x0, num_labels=0, soft_reconfig=0, evpn=0x0)
at bgpd/bgp_route.c:4424
packet=0x7ffcf2c22410) at bgpd/bgp_route.c:6266
packet=0x7ffcf2c22410, mp_withdraw=false) at bgpd/bgp_packet.c:341
peer=0x55c976e89ed0, size=43) at bgpd/bgp_packet.c:2414
at bgpd/bgp_packet.c:3899
Igor Ryzhov [Mon, 26 Feb 2024 09:30:57 +0000 (11:30 +0200)]
lib: fix setting temporary log options for libyang
By calling `ly_log_options` with `LY_LOSTORE`, the current code
effectively disables libyang logging and never enables it back. The call
is done to get the current logging options, but we don't really need
that. When looking for a schema node, we don't want neither to log nor
to store the error, so simply set the temporary options to 0.
Igor Ryzhov [Sun, 25 Feb 2024 19:50:41 +0000 (21:50 +0200)]
bgpd, yang: fix missing mandatory/default statements on some leafs
The code expects these leafs to always exist. If they are not set, the
daemon would crash. CLI always sets them, but we should properly mark
them as mandatory/default to prevent them from being missed when using
the API.
Igor Ryzhov [Fri, 23 Feb 2024 22:06:41 +0000 (00:06 +0200)]
lib: fix nb callbacks for containers inside choice case
Containers inside a choice's case must be treated as presence containers
as they can be explicitly created and deleted. They must have `create`
and `destroy` callbacks, otherwise the internal data they represent may
never be deleted.
The issue can be reproduced with the following steps:
- create an access-list with destination-network params
```
# access-list test seq 1 permit ip any 10.10.10.0 0.0.0.255
```
- delete the `destination-network` container
```
# mgmt delete-config /frr-filter:lib/access-list[name='test'][type='ipv4']/entry[sequence='1']/destination-network
# mgmt commit apply
MGMTD: No changes found to be committed!
```
As the `destination-network` container is non-presence, and all its
leafs are mandatory, mgmtd doesn't see any changes to be commited and
simply updates its YANG data tree without passing any updates to backend
daemons.
This commit fixes the issue by requiring `create` and `destroy`
callbacks for containers inside choice's cases.
Igor Ryzhov [Fri, 23 Feb 2024 19:14:26 +0000 (21:14 +0200)]
lib: fix order of northbound operations
When ordering operations, destroys must always come before other
operations, to correctly cover the change of a "case" in a "choice".
The problem can be reproduced with the following commands:
```
access-list test seq 1 permit 10.0.0.0/8
access-list test seq 1 permit host 10.0.0.1
access-list test seq 1 permit 10.0.0.0/8
```
Before this commit, the order of changes would be the following:
- `access-list test seq 1 permit 10.0.0.0/8`
- `modify` for `ipv4-prefix`
- `access-list test seq 1 permit host 10.0.0.1`
- `destroy` for `ipv4-prefix`
- `modify` for `host`
- `access-list test seq 1 permit 10.0.0.0/8`
- `modify` for `ipv4-prefix`
- `destroy` for `host`
As `destroy` for `host` is called last, it rewrites the fields that were
filled by `modify` callback of `ipv4-prefix`. This commit fixes this
problem by always calling `destroy` callbacks first.
Igor Ryzhov [Tue, 20 Feb 2024 20:32:52 +0000 (22:32 +0200)]
lib: fix order of northbound callbacks
When ordering the NB callbacks according to their priorities, if the
operation is "destroy" we should reverse the order, to destroy the
dependants before the dependencies.
This fixes the crash, that can be reproduced with the following steps:
```
frr# conf term file-lock
frr(config)# affinity-map map bit-position 10
frr(config)# interface test
frr(config-if)# link-params
frr(config-link-params)# affinity map
frr(config-link-params)# exit
frr(config-if)# exit
frr(config)# mgmt commit apply
frr(config)# no affinity-map map
frr(config)# interface test
frr(config-if)# link-params
frr(config-link-params)# no affinity map
frr(config-link-params)# exit
frr(config-if)# exit
frr(config)# mgmt commit apply
```