pimd: allow resolving bsr via directly connected secondary address
This only matters to single hop nodes that are adjacent to the bsr. More common
with IPv6 where LL address is used in PIM as the primary address. If the BSR IP
happens to be an address on the same interface, the receiving pim router
rejects the BSR address because it expects the BSR IP to resolve via the LL address
even if we have a connected route for the same BSR IP subnet. Effectively, we want to
allow rpf to be resolved via secondary IPs with connected routes on the same interface,
and not limit them to primary addresses.
Donald Sharp [Thu, 24 Oct 2024 21:44:31 +0000 (17:44 -0400)]
bgpd: Fix wrong pthread event cancelling
0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:44
1 __pthread_kill_internal (signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:78
2 __GI___pthread_kill (threadid=130719886083648, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
3 0x000076e399e42476 in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26
4 0x000076e39a34f950 in core_handler (signo=6, siginfo=0x76e3985fca30, context=0x76e3985fc900) at lib/sigevent.c:258
5 <signal handler called>
6 __pthread_kill_implementation (no_tid=0, signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:44
7 __pthread_kill_internal (signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:78
8 __GI___pthread_kill (threadid=130719886083648, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
9 0x000076e399e42476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
10 0x000076e399e287f3 in __GI_abort () at ./stdlib/abort.c:79
11 0x000076e39a39874b in _zlog_assert_failed (xref=0x76e39a46cca0 <_xref.27>, extra=0x0) at lib/zlog.c:789
12 0x000076e39a369dde in cancel_event_helper (m=0x5eda32df5e40, arg=0x5eda33afeed0, flags=1) at lib/event.c:1428
13 0x000076e39a369ef6 in event_cancel_event_ready (m=0x5eda32df5e40, arg=0x5eda33afeed0) at lib/event.c:1470
14 0x00005eda0a94a5b3 in bgp_stop (connection=0x5eda33afeed0) at bgpd/bgp_fsm.c:1355
15 0x00005eda0a94b4ae in bgp_stop_with_notify (connection=0x5eda33afeed0, code=8 '\b', sub_code=0 '\000') at bgpd/bgp_fsm.c:1610
16 0x00005eda0a979498 in bgp_packet_add (connection=0x5eda33afeed0, peer=0x5eda33b11800, s=0x76e3880daf90) at bgpd/bgp_packet.c:152
17 0x00005eda0a97a80f in bgp_keepalive_send (peer=0x5eda33b11800) at bgpd/bgp_packet.c:639
18 0x00005eda0a9511fd in peer_process (hb=0x5eda33c9ab80, arg=0x76e3985ffaf0) at bgpd/bgp_keepalives.c:111
19 0x000076e39a2cd8e6 in hash_iterate (hash=0x76e388000be0, func=0x5eda0a95105e <peer_process>, arg=0x76e3985ffaf0) at lib/hash.c:252
20 0x00005eda0a951679 in bgp_keepalives_start (arg=0x5eda3306af80) at bgpd/bgp_keepalives.c:214
21 0x000076e39a2c9932 in frr_pthread_inner (arg=0x5eda3306af80) at lib/frr_pthread.c:180
22 0x000076e399e94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
23 0x000076e399f26850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) f 12
12 0x000076e39a369dde in cancel_event_helper (m=0x5eda32df5e40, arg=0x5eda33afeed0, flags=1) at lib/event.c:1428
1428 assert(m->owner == pthread_self());
In this decode the attempt to cancel the connection's events from
the wrong thread is causing the crash. Modify the code to create an
event on the bm->master to cancel the events for the connection.
Donald Sharp [Thu, 24 Oct 2024 18:17:51 +0000 (14:17 -0400)]
bgpd: Fix deadlock in bgp_keepalive and master pthreads
(gdb) bt
0 futex_wait (private=0, expected=2, futex_word=0x5c438e9a98d8) at ../sysdeps/nptl/futex-internal.h:146
1 __GI___lll_lock_wait (futex=futex@entry=0x5c438e9a98d8, private=0) at ./nptl/lowlevellock.c:49
2 0x00007af16d698002 in lll_mutex_lock_optimized (mutex=0x5c438e9a98d8) at ./nptl/pthread_mutex_lock.c:48
3 ___pthread_mutex_lock (mutex=0x5c438e9a98d8) at ./nptl/pthread_mutex_lock.c:93
4 0x00005c4369c17e70 in _frr_mtx_lock (mutex=0x5c438e9a98d8, func=0x5c4369dc2750 <__func__.265> "bgp_notify_send_internal") at ./lib/frr_pthread.h:258
5 0x00005c4369c1a07a in bgp_notify_send_internal (connection=0x5c438e9a98c0, code=8 '\b', sub_code=0 '\000', data=0x0, datalen=0, use_curr=true) at bgpd/bgp_packet.c:928
6 0x00005c4369c1a707 in bgp_notify_send (connection=0x5c438e9a98c0, code=8 '\b', sub_code=0 '\000') at bgpd/bgp_packet.c:1069
7 0x00005c4369bea422 in bgp_stop_with_notify (connection=0x5c438e9a98c0, code=8 '\b', sub_code=0 '\000') at bgpd/bgp_fsm.c:1597
8 0x00005c4369c18480 in bgp_packet_add (connection=0x5c438e9a98c0, peer=0x5c438e9b6010, s=0x7af15c06bf70) at bgpd/bgp_packet.c:151
9 0x00005c4369c19816 in bgp_keepalive_send (peer=0x5c438e9b6010) at bgpd/bgp_packet.c:639
10 0x00005c4369bf01fd in peer_process (hb=0x5c438ed05520, arg=0x7af16bdffaf0) at bgpd/bgp_keepalives.c:111
11 0x00007af16dacd8e6 in hash_iterate (hash=0x7af15c000be0, func=0x5c4369bf005e <peer_process>, arg=0x7af16bdffaf0) at lib/hash.c:252
12 0x00005c4369bf0679 in bgp_keepalives_start (arg=0x5c438e0db110) at bgpd/bgp_keepalives.c:214
13 0x00007af16dac9932 in frr_pthread_inner (arg=0x5c438e0db110) at lib/frr_pthread.c:180
14 0x00007af16d694ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
15 0x00007af16d726850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb)
The bgp keepalive pthread gets deadlocked with itself and consequently
the bgp master pthread gets locked when it attempts to lock
the peerhash_mtx, since it is also locked by the keepalive_pthread
The keepalive pthread is locking the peerhash_mtx in
bgp_keepalives_start. Next the connection->io_mtx mutex in
bgp_keepalives_send is locked and then when it notices a problem it invokes
bgp_stop_with_notify which relocks the same mutex ( and of course
the relock causes it to get stuck on itself ). This generates a
deadlock condition.
Modify the code to only hold the connection->io_mtx as short as
possible.
Donald Sharp [Thu, 24 Oct 2024 15:27:24 +0000 (11:27 -0400)]
bgpd: Store aspath count after aspath has changed
When running bestpath on a very large number of ecmp.
BGP ends up calling aspath_count a very very large number
of times, which results in ~15% cpu runtime in aspath_count_hops.
Modify the aspath to keep track of it's own count. This results
in the function now taking up ~1.5% of the cpu runtime. Enough
for the moment to be ignored.
Donald Sharp [Wed, 23 Oct 2024 17:16:29 +0000 (13:16 -0400)]
bgpd: Do not call evpn_overlay_free no matter what
bgp_update is a very expensive call. Calling evpn_overlay_free
even when we have no evpn data to free is not trivial. Let's
limit the call into this function until we actually have data to
free.
Louis Scalbert [Tue, 22 Oct 2024 16:08:13 +0000 (18:08 +0200)]
bgpd: fix uninitialized labels
Fix uninitialized labels that cause multiple valgrind issues.
> ==3729602== Use of uninitialised value of size 8
> ==3729602== at 0x492B493: hash_get (hash.c:140)
> ==3729602== by 0x2629D2: bgp_labels_intern (bgp_label.c:98)
> ==3729602== by 0x2E6C92: bgp_adj_out_set_subgroup (bgp_updgrp_adv.c:622)
> ==3729602== by 0x2A6810: subgroup_process_announce_selected (bgp_route.c:3340)
> ==3729602== by 0x2E5FF6: group_announce_route_walkcb (bgp_updgrp_adv.c:260)
> ==3729602== by 0x2E3E28: update_group_walkcb (bgp_updgrp.c:1759)
> ==3729602== by 0x492B9A0: hash_walk (hash.c:270)
> ==3729602== by 0x2E498C: update_group_af_walk (bgp_updgrp.c:2090)
> ==3729602== by 0x2E7C0D: group_announce_route (bgp_updgrp_adv.c:1119)
> ==3729602== by 0x2A796E: bgp_process_main_one (bgp_route.c:3865)
> ==3729602== by 0x2A808A: bgp_process_wq (bgp_route.c:3991)
> ==3729602== by 0x49CC7CF: work_queue_run (workqueue.c:282)
> ==3729602== by 0x49BBF25: event_call (event.c:2019)
> ==3729602== by 0x49413CA: frr_run (libfrr.c:1238)
> ==3729602== by 0x1FD1D3: main (bgp_main.c:555)
> ==2604268== Use of uninitialised value of size 8
> ==2604268== at 0x4943016: hash_get (hash.c:159)
> ==2604268== by 0x26EFC1: bgp_labels_intern (bgp_label.c:97)
> ==2604268== by 0x28077B: leak_update (bgp_mplsvpn.c:1298)
> ==2604268== by 0x2824A3: vpn_leak_from_vrf_update (bgp_mplsvpn.c:1932)
> ==2604268== by 0x2C281C: bgp_static_update (bgp_route.c:6974)
> ==2604268== by 0x2C366F: bgp_static_set (bgp_route.c:7263)
> ==2604268== by 0x2C435B: bgp_network_magic (bgp_route.c:7556)
> ==2604268== by 0x2ACF09: bgp_network (bgp_route_clippy.c:86)
> ==2604268== by 0x4914EE7: cmd_execute_command_real (command.c:1003)
> ==2604268== by 0x4915060: cmd_execute_command (command.c:1062)
> ==2604268== by 0x4915610: cmd_execute (command.c:1228)
> ==2604268== by 0x49E7C32: vty_command (vty.c:625)
> ==2604268== by 0x49E9B56: vty_execute (vty.c:1388)
> ==2604268== by 0x49EC331: vtysh_read (vty.c:2400)
> ==2604268== by 0x49E06F1: event_call (event.c:2001)
> ==2604268== by 0x495AB8B: frr_run (libfrr.c:1238)
> ==2604268== by 0x200C4B: main (bgp_main.c:555)
Fixes: ddb5b4880b ("bgpd: vpn-vrf route leaking") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Donatas Abraitis [Thu, 17 Oct 2024 08:11:50 +0000 (11:11 +0300)]
bgpd: Handle non-transitive extended communities for link-bandwidth
If we received a non-transitive extended community (in this case it was
extended link-bandwidth), we treated it as unknown because we didn't check for
the transitive flag correctly.
Shbinging [Tue, 22 Oct 2024 05:34:26 +0000 (05:34 +0000)]
ospfd:fix the bug that the empty area was not free after no area range command was executed
When we use the no area X.X.X.X range A.B.C.D/M command, if the area no longer has an interface to which it belongs, then the area should be deleted from the LSDB. This processing logic is consistent with instructions such as no network area and no area authentication.
baozhen-H3C [Mon, 21 Oct 2024 08:20:27 +0000 (16:20 +0800)]
isisd: fix 'show isis route' and 'show isis fast-reroute summary' errors with vrf
When the VRF does not exist, the command will display results for the 'default' VRF.
1.before the commit:
sonic# show vrf
vrf Vrf1 id 41 table 1001
sonic# show isis vrf abc route level-1
Area 10:
IS-IS L1 IPv4 routing table:
2.after the commit:
sonic# show vrf
vrf Vrf1 id 41 table 1001
sonic# show isis vrf abc route level-1
sonic# show isis vrf abc fast-reroute summary level-1
What I do:
Move 'ISIS_FIND_VRF_ARGS(argv, argc, idx, vrf_name, all_vrf);' to the front, otherwise changing 'idx' while searching for other parameters may result in failing to find the vrf parameter.
anlan_cs [Sun, 20 Oct 2024 12:07:25 +0000 (20:07 +0800)]
zebra: drop NEWLINK event handling in the main thread
NEWLINK is only registered by the dplane thread, the main thread
doesn't care about it. So remove the real process of `netlink_link_change()`
for NEWLINK event in main thread.
And move NEWLINK/DELLINK event to the block where the dplane messages
are kept together.
Enke Chen [Tue, 22 Oct 2024 01:03:08 +0000 (18:03 -0700)]
bgpd: fix AIGP calculation in route advertisement
Currently the AIGP is always incremented when a route with the
attribute is advertised. That is incorrect when the nexthop is
unchanged, as is commonly the case in route reflection.
Adjust the AIGP for propagation only when the nexthop is set
to ourselves.
Enke Chen [Tue, 22 Oct 2024 00:49:19 +0000 (17:49 -0700)]
tests: add a topotest bgp_aigp_rr
In this topotest, the route reflector advertises a route with the
aigp attribute to its client, some with the nexthop unchanged and
some with the nexthp changed. Different aigp values are sent to
the clients depending on the nexthop setting.
Enke Chen [Sun, 20 Oct 2024 19:25:46 +0000 (12:25 -0700)]
bgpd: allow value 0 in aigp-metric setting
The value of 0 is accepted from peers, and can also be set by the
route-map "set aigp-metric igp-metric". For coonsistency, it should
be allowed in "set aigp-metric <value>" as well.
Louis Scalbert [Fri, 18 Oct 2024 12:16:50 +0000 (14:16 +0200)]
zebra: fix showing nexthop vrf for ipv6 blackhole
For some reasons the Linux kernel associates the ipv6 blackhole of non
default table the lo interface.
> root@r1# ip -6 route show table 100
> root@r1# ip -6 route add unreachable default metric 4278198272 table 100
> root@r1# ip -6 route show table 100
> unreachable default dev lo metric 4278198272 pref medium
As a consequence, the VRF default that owns the lo interface is shown as
the nexthop VRF:
Donald Sharp [Tue, 8 Oct 2024 01:46:33 +0000 (21:46 -0400)]
lib: Correctly handle ppoll pfds.events == 0
The frrevent system is spitting out this message in bgpd:
20:40:15 mem1-roc-f2-b1-r5-t2-d4 bgpd[13166]: [XETTR-D5MR0][EC 100663316] Attempting to process an I/O event but for fd: 214(8) no thread to handle this!
This is because as each io event is processed, it is possible that a
.events is set to 0. This can leave a situation where we ask
ppoll to handle anything that happens on a fd with a .events of 0,
in this situation ppoll can return POLLERR, which indicates that
something bad has happened on the fd.
Let's set the ppoll fds.fd value to -1 when there are no more
events to be processed. ppoll specifically calls out that
it will just skip this particular one.
David Lamparter [Fri, 26 Jul 2024 23:44:59 +0000 (16:44 -0700)]
lib: refactor memstats logging, fix ACTIVEATEXIT
Move the various destinations handling into lib/memory.c, include
"normal" logging as target, and make `ACTIVEATEXIT` properly non-error
as it was intended to be.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
David Lamparter [Fri, 26 Jul 2024 23:50:20 +0000 (16:50 -0700)]
lib: do not log_memstats() in crash handler
`log_memstats()` is not AS-safe. It can hang the crash handler (or set
your PC on fire, or cause the sun to go supernova - according to POSIX
specs, anyway.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Shbinging [Wed, 16 Oct 2024 05:28:02 +0000 (05:28 +0000)]
ospfd: update ospf_asbr_status when using no_area_nssa command
In the processing of nssa, if the number of areas that need to be translated is greater than 0, then abr will be regarded as asbr, and it will be marked (0x3) in the flag of router lsa. When a certain area is set from nssa to a normal area, the areas that need to be translated may be reduced. The asbr should be re-interpreted as abr when the translated area is 0.
Enke Chen [Wed, 16 Oct 2024 18:15:28 +0000 (11:15 -0700)]
bgpd: fix several issues in sourcing AIGP attribute
Fix several issues in sourcing AIGP attribute:
1) AIGP should not be set as default for a redistributed route or a
static network. It should be set by config instead.
2) AIGP sourced by "set aigp-metric igp-metric" in a route-map does
not set the correct value for a redistributed route.
3) When redistribute a connected route like loopback, the AGIP (with
value 0) is sourced by "set aigp-metric igp-metric", but the
attribute is not propagated as the attribute flag is not set.
David Lamparter [Wed, 16 Oct 2024 10:50:50 +0000 (12:50 +0200)]
vtysh: make clang-SA happy about reusing stdin
While the logic here is perfectly fine, clang-SA doesn't understand that
the fopen() and fclose() match up with each other. Just use a separate
variable to make clang-SA happy.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
David Lamparter [Wed, 16 Oct 2024 10:46:35 +0000 (12:46 +0200)]
vtysh: remove duplicate nonblocking handling
non-blocking retries are already handled in `vtysh_client_receive()`.
And by the point we're back in `vtysh_client_run()`, errno may have been
overwritten by the close() call in vtysh_client_receive().
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
David Lamparter [Wed, 16 Oct 2024 10:40:01 +0000 (12:40 +0200)]
ldpd: free previous config if it wasn't applied
If a create-config command is received, but the config is never applied,
the config will be leaked on the next create-config command. This
should theoretically never happen, but let's fix it anyway.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>