Chirag Shah [Sat, 16 Mar 2024 01:18:42 +0000 (18:18 -0700)]
bgpd: do not del peer upon pg remote as change
Currently, when peer-group remote-as is removed, it
deletes all associated neighbors.
Upon reconfiguring peer-group remote-as, all neighbors
need to be reconfigured.
Instead, when peer-group remote-as is removed,
cease the associated peers' connections and keep them in Idle state.
When the peer-group remote-as is (re)configured, trigger
BGP Peer FSM to form neighbor.
Note the connection will be initiated after start timer
expiry.
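A minimal sketch of the behavior described above, under stated assumptions: the `toy_*` types and functions are hypothetical and are not bgpd's real peer or FSM code. It only illustrates that unsetting the group remote-as keeps the peers (in Idle), and that setting it again arms a start timer whose expiry moves the peer out of Idle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified peer states; not bgpd's real FSM. */
enum toy_state { TOY_IDLE, TOY_CONNECT, TOY_ESTABLISHED };

struct toy_peer {
	unsigned int remote_as; /* 0 means "no remote-as configured" */
	enum toy_state state;
	bool start_timer_armed;
};

/* Removing the group's remote-as: keep every peer, only cease the session. */
static void toy_group_remote_as_unset(struct toy_peer *peers, size_t n)
{
	assert(peers != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		peers[i].remote_as = 0;
		peers[i].state = TOY_IDLE;
		peers[i].start_timer_armed = false;
	}
}

/* (Re)configuring remote-as: arm the start timer, do not connect yet. */
static void toy_group_remote_as_set(struct toy_peer *peers, size_t n,
				    unsigned int as)
{
	assert(as != 0);
	for (size_t i = 0; i < n; i++) {
		peers[i].remote_as = as;
		peers[i].start_timer_armed = true;
	}
}

/* Start timer expiry: an Idle peer with a remote-as begins connecting. */
static void toy_start_timer_expire(struct toy_peer *p)
{
	if (p->start_timer_armed && p->remote_as != 0 && p->state == TOY_IDLE)
		p->state = TOY_CONNECT;
	p->start_timer_armed = false;
}
```

In this sketch the peer array survives both operations, which is the point of the change: no neighbor has to be configured again.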
Acee Lindem [Thu, 14 Mar 2024 10:05:27 +0000 (10:05 +0000)]
ospfd: Send LS Updates in response to LS Request as unicast.
With this fix, OSPF LS Updates sent in response to OSPF LS Requests during the DB Exchange process will be sent as unicasts. Unless the timing of multiple database exchanges coincides, there is little chance that the LSAs in the LS Update are required by OSPF routers other than the one which elicited the LS Update.
This is somewhat ambiguous in RFC 2328 and two errata have been filed for clarification.
FRR OSPFv3 (ospf6d) already does it correctly - see ospf6_lsupdate_send_neighbor(struct event *thread). Also, if there is any doubt, one can refer to the C++ code at ospf.org (John Moy's seminal OSPF reference implementation).
Igor Ryzhov [Sun, 17 Mar 2024 20:44:28 +0000 (22:44 +0200)]
lib: remove nb/yang memory cleanup when daemonizing
We're not calling any other termination functions to free allocated
memory when daemonizing except these two. There's no reason for such an
exception, and because of these calls we have the following libyang
warnings every time FRR is started:
```
MGMTD: libyang: String "15" not freed from the dictionary, refcount 2
MGMTD: libyang: String "200" not freed from the dictionary, refcount 2
MGMTD: libyang: String "mrib-then-urib" not freed from the dictionary, refcount 2
MGMTD: libyang: String "1000" not freed from the dictionary, refcount 2
MGMTD: libyang: String "10" not freed from the dictionary, refcount 2
MGMTD: libyang: String "5" not freed from the dictionary, refcount 2
```
Remove these calls to get rid of the unnecessary warnings.
Donald Sharp [Fri, 15 Mar 2024 16:10:58 +0000 (12:10 -0400)]
lib: Prevent crash then another crash from happening
When a memory operation (malloc/free/...) causes a crash
and the call to core_handler causes another crash, then
instead of actually writing a core dump the alarm is
hit and the daemon in trouble does not produce a core dump.
Modify the shutdown code to just try to dump the buffers
and leave instead of cleaning up after itself.
Back Trace:
(gdb) bt
0 0x00007f17082ec056 in __lll_lock_wait_private () from /lib/x86_64-linux-gnu/libc.so.6
1 0x00007f17082fc8bd in ?? () from /lib/x86_64-linux-gnu/libc.so.6
2 0x00007f17082fee8f in free () from /lib/x86_64-linux-gnu/libc.so.6
3 0x00007f170866c2ea in qfree (mt=<optimized out>, ptr=<optimized out>) at lib/memory.c:141
4 0x00007f17086c156a in zlog_tls_free (arg=0x55584f816fb0) at lib/zlog.c:390
5 zlog_tls_buffer_fini () at lib/zlog.c:346
6 0x00007f1708695e5f in core_handler (signo=11, siginfo=0x7ffd173229f0, context=<optimized out>) at lib/sigevent.c:264
7 <signal handler called>
8 0x00007f17082fd7bc in ?? () from /lib/x86_64-linux-gnu/libc.so.6
9 0x00007f17082ff6e2 in calloc () from /lib/x86_64-linux-gnu/libc.so.6
10 0x00007f1708451e78 in lh_table_new () from /lib/x86_64-linux-gnu/libjson-c.so.5
11 0x00007f170844c979 in json_object_new_object () from /lib/x86_64-linux-gnu/libjson-c.so.5
12 0x000055584e002fd9 in evpn_show_all_routes (vty=vty@entry=0x55584fb5ea00, bgp=bgp@entry=0x55584f82c600, type=<optimized out>, json=json@entry=0x55584f998130, detail=<optimized out>,
self_orig=<optimized out>) at bgpd/bgp_evpn_vty.c:3192
13 0x000055584e009ed6 in show_bgp_l2vpn_evpn_route (self=<optimized out>, vty=0x55584fb5ea00, argc=6, argv=0x55584f998970) at bgpd/bgp_evpn_vty.c:5048
14 0x00007f170863af60 in cmd_execute_command_real (vline=vline@entry=0x55584fa87cb0, vty=vty@entry=0x55584fb5ea00, cmd=cmd@entry=0x0, up_level=up_level@entry=0, filter=FILTER_RELAXED)
at lib/command.c:1030
15 0x00007f170863b2be in cmd_execute_command (vline=vline@entry=0x55584fa87cb0, vty=vty@entry=0x55584fb5ea00, cmd=cmd@entry=0x0, vtysh=vtysh@entry=0) at lib/command.c:1089
16 0x00007f170863b550 in cmd_execute (vty=vty@entry=0x55584fb5ea00, cmd=cmd@entry=0x55584fb65160 "sh bgp l2vpn evpn route json", matched=matched@entry=0x0, vtysh=vtysh@entry=0)
at lib/command.c:1257
17 0x00007f17086acc77 in vty_command (vty=vty@entry=0x55584fb5ea00, buf=0x55584fb65160 "sh bgp l2vpn evpn route json") at lib/vty.c:503
18 0x00007f17086ad444 in vty_execute (vty=vty@entry=0x55584fb5ea00) at lib/vty.c:1266
19 0x00007f17086b06c8 in vtysh_read (thread=<optimized out>) at lib/vty.c:2165
20 0x00007f17086a798d in thread_call (thread=thread@entry=0x7ffd17325ce0) at lib/thread.c:2008
21 0x00007f1708660568 in frr_run (master=0x55584f22a120) at lib/libfrr.c:1223
22 0x000055584dfc8c96 in main (argc=<optimized out>, argv=<optimized out>) at bgpd/bgp_main.c:555
Donatas Abraitis [Fri, 15 Mar 2024 11:49:06 +0000 (13:49 +0200)]
bgpd: Update default-originate route-map actual map structure
If used with `bgp listen range ... peer-group x`, default_rmap[afi][safi] is not
updated, and after a hard reset on the other side, it is flushed and never updated
again without restarting the sender BGP daemon.
Split zebra's vrf_terminate() into disable() and delete() stages.
The former enqueues all events for the dplane thread.
Memory freeing is performed in the second stage.
Signed-off-by: Alexander Skorichenko <askorichenko@netgate.com>
Donald Sharp [Mon, 11 Mar 2024 14:40:22 +0000 (10:40 -0400)]
bgpd: When using dev build add pointer information to %pBD
When building FRR with `--enable-dev-build`, add a bit of
code to include the pointer value as part of the output.
This helps with tracking down issues and lets us see more data
when using the dev build option.
New output:
2024/03/08 19:48:56 BGP: [V0J1J-W5RHA] 11.0.20.1/32(0x5759ddf8d7c0) for 11.0.20.1/32
Donatas Abraitis [Thu, 14 Mar 2024 07:45:18 +0000 (09:45 +0200)]
bgpd: Avoid padding for bgp_paths_limit_capability struct
When sending the packets over the network (dynamic capability) it reports 6 bytes
instead of 5 bytes, and causes some issues between little/big endian machines.
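A small sketch of why padding changes the reported size. The field names and types below are assumptions chosen to add up to 5 bytes; they are not copied from bgpd. The `packed` attribute is the GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: 2 + 1 + 2 = 5 bytes of payload. */
struct toy_paths_limit_padded {
	uint16_t afi;
	uint8_t safi;
	uint16_t paths_limit; /* compiler may insert 1 pad byte before this */
};

/* Same fields, but the compiler is told not to insert padding. */
struct toy_paths_limit_packed {
	uint16_t afi;
	uint8_t safi;
	uint16_t paths_limit;
} __attribute__((packed));

/* The packed variant matches the 5 bytes expected on the wire. */
static_assert(sizeof(struct toy_paths_limit_packed) == 5,
	      "packed struct must be exactly 5 bytes");

static size_t toy_padded_size(void)
{
	return sizeof(struct toy_paths_limit_padded);
}

static size_t toy_packed_size(void)
{
	return sizeof(struct toy_paths_limit_packed);
}
```

Using `sizeof()` on the unpacked struct as a wire length is what produces the extra byte on common ABIs.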
Donald Sharp [Sat, 2 Mar 2024 14:50:38 +0000 (09:50 -0500)]
bgpd: Ensure community data is freed in some cases.
Customer has this valgrind trace:
Direct leak of 2829120 byte(s) in 70728 object(s) allocated from:
0 in community_new ../bgpd/bgp_community.c:39
1 in community_uniq_sort ../bgpd/bgp_community.c:170
2 in route_set_community ../bgpd/bgp_routemap.c:2342
3 in route_map_apply_ext ../lib/routemap.c:2673
4 in subgroup_announce_check ../bgpd/bgp_route.c:2367
5 in subgroup_process_announce_selected ../bgpd/bgp_route.c:2914
6 in group_announce_route_walkcb ../bgpd/bgp_updgrp_adv.c:199
7 in hash_walk ../lib/hash.c:285
8 in update_group_af_walk ../bgpd/bgp_updgrp.c:2061
9 in group_announce_route ../bgpd/bgp_updgrp_adv.c:1059
10 in bgp_process_main_one ../bgpd/bgp_route.c:3221
11 in bgp_process_wq ../bgpd/bgp_route.c:3221
12 in work_queue_run ../lib/workqueue.c:282
The above leak detected by valgrind was from a screenshot so I copied it
by hand. Any mistakes in line numbers are purely from my transcription.
Additionally this is against a slightly modified 8.5.1 version of FRR.
Code inspection of 8.5.1 -vs- latest master shows the same problem
exists. Code should be able to be followed from there to here.
What is happening:
There is a route-map being applied that modifies the outgoing community
to a peer. This is saved in the attr copy created in
subgroup_process_announce_selected. This community pointer is not
interned, so the community->refcount is still 0. Normally when
a prefix is announced, the attr and the prefix are placed on an
adjacency out structure where the attribute is interned. This will
cause the community to be saved in the community hash list as well.
In a non-normal operation when the decision to send is aborted after
the route-map application, the attribute is just dropped and the
pointer to the community is just dropped too, leading to situations
where the memory is leaked. The usage of bgp suppress-fib
would be a case where the community is caused to be leaked.
Additionally, as in the previous commit, when an unsuppress-map is used
to modify the outgoing attribute, the attribute would be dropped as
well, since unsuppress-map was not considered part of outgoing policy.
This pointer drop also extends to any dynamically allocated
memory saved by the attribute pointer that was not interned yet.
So let's modify the return case where the decision is made to
not send the prefix to the peer to always just flush the attribute
to ensure memory is not leaked.
Fixes: #15459 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
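The leak and the fix can be sketched with a toy refcounted attribute. Everything named `toy_*` is hypothetical and only mirrors the idea in the message: data created by a route-map has refcount 0 until it is interned, so the "do not send" return path has to flush it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_community { int refcnt; }; /* refcnt 0 == not interned yet */
struct toy_attr { struct toy_community *community; };

static int toy_live_communities; /* allocations not yet freed */

static struct toy_community *toy_community_new(void)
{
	struct toy_community *c = calloc(1, sizeof(*c));

	assert(c != NULL);
	toy_live_communities++;
	return c;
}

/* Free only data that was never interned, then drop the pointer. */
static void toy_attr_flush(struct toy_attr *attr)
{
	if (attr->community && attr->community->refcnt == 0) {
		free(attr->community);
		toy_live_communities--;
	}
	attr->community = NULL;
}

/* Returns true when the prefix would be sent to the peer. */
static bool toy_announce(bool decision_is_send)
{
	/* Stand-in for a route-map "set community" on the attr copy. */
	struct toy_attr attr = { .community = toy_community_new() };

	if (!decision_is_send) {
		toy_attr_flush(&attr); /* the fix: always flush on abort */
		return false;
	}
	attr.community->refcnt++; /* stand-in for interning on adj-out */
	return true;
}
```

Without the `toy_attr_flush()` call on the abort path, every aborted announcement in this sketch would leave one allocation behind.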
Donald Sharp [Wed, 13 Mar 2024 14:26:58 +0000 (10:26 -0400)]
bgpd: Ensure that the correct aspath is free'd
Currently in subgroup_default_originate the attr.aspath
is set in bgp_attr_default_set, which hashes the aspath
and creates a refcount for it. If this is a withdraw,
subgroup_announce_check and bgp_adj_out_set_subgroup
are called, which will intern the attribute. This will
cause the attr.aspath to be set to a new value.
Finally, at the bottom of the function, it intentionally
uninterns the aspath, which is not the one that was
created for this function. This reduces the other
aspath's refcount by 1 and if a clear bgp * is issued
fast enough the aspath for that will be removed
and the system will crash.
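A toy illustration of the pointer mix-up, assuming (this is not taken from the actual patch) that the remedy is to remember which aspath the function itself took a reference on. All `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct toy_aspath { int refcnt; };
struct toy_attr { struct toy_aspath *aspath; };

/* Drop one reference and clear the caller's pointer. */
static void toy_aspath_unintern(struct toy_aspath **asp)
{
	assert(asp != NULL && *asp != NULL && (*asp)->refcnt > 0);
	(*asp)->refcnt--;
	*asp = NULL;
}

/*
 * own:    aspath this function takes a reference on.
 * shared: an already-interned aspath that a later step swaps into attr.
 */
static void toy_default_originate(struct toy_aspath *own,
				  struct toy_aspath *shared)
{
	struct toy_attr attr = { .aspath = own };
	struct toy_aspath *mine = own; /* remember what we referenced */

	own->refcnt++;

	/* Stand-in for the intern step replacing attr.aspath. */
	attr.aspath = shared;
	(void)attr;

	/* Release our own reference, not whatever attr.aspath is now. */
	toy_aspath_unintern(&mine);
}
```

Calling `toy_aspath_unintern(&attr.aspath)` at the end instead would decrement `shared`, which is the refcount error the message describes.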
Donald Sharp [Tue, 12 Mar 2024 17:12:48 +0000 (13:12 -0400)]
eigrpd, mgmtd, ospf6d: frr_fini is last
I noticed that ospf6d always had a linked list memory leak.
Tracking it down shows that frr_fini() shuts down the memory
system and prints out memory not cleaned up. eigrpd, mgmtd
and ospf6d all called cleanup functions after frr_fini, so
memory was reported as leaked that was not really leaked.
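The ordering problem can be shown with a toy allocation counter. The `toy_*` functions are hypothetical stand-ins; `toy_fini_report()` plays the role of the final "memory not cleaned up" report.

```c
#include <assert.h>
#include <stdlib.h>

static int toy_outstanding; /* allocations not yet freed */

static void *toy_alloc(void)
{
	void *p = malloc(16);

	assert(p != NULL);
	toy_outstanding++;
	return p;
}

static void toy_free(void *p)
{
	assert(p != NULL && toy_outstanding > 0);
	free(p);
	toy_outstanding--;
}

/* Stand-in for the final report: how many allocations look leaked. */
static int toy_fini_report(void)
{
	return toy_outstanding;
}

/* Wrong order: report first, clean up afterwards. */
static int toy_shutdown_cleanup_after_fini(void)
{
	void *list = toy_alloc();
	int reported = toy_fini_report();

	toy_free(list);
	return reported;
}

/* Right order: clean up everything, report last. */
static int toy_shutdown_fini_last(void)
{
	void *list = toy_alloc();

	toy_free(list);
	return toy_fini_report();
}
```

Both variants free the same memory; only the order decides whether the report shows a leak.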
Igor Ryzhov [Sun, 10 Mar 2024 15:35:21 +0000 (17:35 +0200)]
lib: fix initialization of northbound nodes
When actions and notifications are defined as descendants of other nodes,
they are not getting initialized, because the iterator skips them. Fix
the iterator to include them when traversing the schema.
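A rough sketch of a traversal that does not skip nested actions and notifications. The node type and `toy_init_all()` are hypothetical and much simpler than the real libyang schema tree; they only show that every descendant kind gets visited.

```c
#include <assert.h>
#include <stddef.h>

enum toy_kind { TOY_CONTAINER, TOY_LEAF, TOY_ACTION, TOY_NOTIF };

struct toy_node {
	enum toy_kind kind;
	struct toy_node *child; /* first descendant, or NULL */
	struct toy_node *next;  /* next sibling, or NULL */
	int initialized;
};

/* Visit a sibling list and all descendants; returns nodes initialized. */
static int toy_init_all(struct toy_node *n)
{
	int count = 0;

	for (; n != NULL; n = n->next) {
		assert(n->initialized == 0); /* each node is visited once */
		n->initialized = 1;          /* no kind is filtered out */
		count++;
		count += toy_init_all(n->child);
	}
	return count;
}
```

An iterator that filtered on `kind` and ignored `TOY_ACTION` and `TOY_NOTIF` below the top level would leave those nodes with `initialized == 0`.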
This adds specific width length modifiers in the form of wN and wfN
(where N is 8, 16, 32, or 64) which allow printing intN_t and
int_fastN_t without resorting to casts or PRI macros.
FRR changes only include printf(); scanf/strtol are not locally
implemented in FRR. Also added "(void) 0" to empty "else ..." to
avoid a compiler warning.
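For context, this is roughly what callers have to write without such modifiers: either a PRI macro or a cast. The sketch uses only standard C `snprintf()`; it deliberately does not use the new wN form, since support for it depends on the printf implementation in use. The `toy_*` helpers are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Variant 1: PRI macro from <inttypes.h>. */
static int toy_format_i64(char *buf, size_t len, int64_t v)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%" PRId64, v);
}

/* Variant 2: cast to a type with a fixed printf length modifier. */
static int toy_format_i64_cast(char *buf, size_t len, int64_t v)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%lld", (long long)v);
}
```

Both variants produce the same text; the width modifiers described above aim to make either workaround unnecessary.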
Donatas Abraitis [Thu, 29 Feb 2024 12:37:40 +0000 (14:37 +0200)]
docker: Do not use pip Python package manager
This is already installed with `pytest` via the apk package manager.
Alpine Linux gets this error with 3.19:
```
15 78.20 error: externally-managed-environment
15 78.20
15 78.20 × This environment is externally managed
15 78.20 ╰─>
15 78.20 The system-wide python installation should be maintained using the system
15 78.20 package manager (apk) only.
15 78.20
15 78.20 If the package in question is not packaged already (and hence installable via
15 78.20 "apk add py3-somepackage"), please consider installing it inside a virtual
15 78.20 environment, e.g.:
15 78.20
15 78.20 python3 -m venv /path/to/venv
15 78.20 . /path/to/venv/bin/activate
15 78.20 pip install mypackage
15 78.20
15 78.20 To exit the virtual environment, run:
15 78.20
15 78.20 deactivate
15 78.20
15 78.20 The virtual environment is not deleted, and can be re-entered by re-sourcing
15 78.20 the activate file.
15 78.20
15 78.20 To automatically manage virtual environments, consider using pipx (from the
15 78.20 pipx package).
15 78.20
15 78.20 note: If you believe this is a mistake, please contact your Python installation or OS distribution provider. You can override this, at the risk of breaking your Python installation or OS, by passing --break-system-packages.
```
Donatas Abraitis [Thu, 29 Feb 2024 12:21:27 +0000 (14:21 +0200)]
vtysh: Include fcntl.h for vtysh_main
Fixing compilation for Alpine Linux:
```
25 91.59 vtysh/vtysh_main.c: In function 'vtysh_flock_config':
25 91.59 vtysh/vtysh_main.c:276:20: warning: implicit declaration of function 'open'; did you mean 'popen'? [-Wimplicit-function-declaration]
25 91.59 276 | flock_fd = open(flock_file, O_RDONLY, 0644);
25 91.59 | ^~~~
25 91.59 | popen
25 91.60 vtysh/vtysh_main.c:276:37: error: 'O_RDONLY' undeclared (first use in this function)
25 91.60 276 | flock_fd = open(flock_file, O_RDONLY, 0644);
25 91.60 | ^~~~~~~~
25 91.60 vtysh/vtysh_main.c:276:37: note: each undeclared identifier is reported only once for each function it appears in
25 91.60 CC zebra/if_netlink.o
25 91.61 vtysh/vtysh_main.c: In function 'main':
25 91.61 vtysh/vtysh_main.c:637:49: error: 'O_CREAT' undeclared (first use in this function)
25 91.61 637 | fp = open(history_file, O_CREAT | O_EXCL,
25 91.61 | ^~~~~~~
25 91.62 vtysh/vtysh_main.c:637:59: error: 'O_EXCL' undeclared (first use in this function)
25 91.62 637 | fp = open(history_file, O_CREAT | O_EXCL,
25 91.62 | ^~~~~~
```
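The errors above come from using `open()` and the `O_*` flags without their declaring header. A minimal sketch of the pattern with `<fcntl.h>` included; `toy_can_open_readonly()` is a hypothetical helper, not code from vtysh.

```c
#include <assert.h>
#include <fcntl.h>  /* open(), O_RDONLY, O_CREAT, O_EXCL */
#include <unistd.h> /* close() */

/* Returns 1 if path can be opened read-only, 0 otherwise. */
static int toy_can_open_readonly(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return 0;
	close(fd);
	return 1;
}
```

Some C libraries pull in `<fcntl.h>` indirectly through other headers, which can hide the missing include until the code is built against a libc that does not.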