git.puffer.fish Git - mirror/frr.git/log
8 weeks ago mgmtd: Prevent use after free 18281/head
Donald Sharp [Wed, 26 Feb 2025 17:34:05 +0000 (12:34 -0500)]
mgmtd: Prevent use after free

CI occasionally picks up this use-after-free:

    ERROR: AddressSanitizer: attempting to call malloc_usable_size() for pointer which is not owned: 0x6030001d94a0
        0 0x7fab994b7f04 in __interceptor_malloc_usable_size ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:119
        1 0x7fab994264f6 in __sanitizer::BufferedStackTrace::Unwind(unsigned long, unsigned long, void*, bool, unsigned int) ../../../../src/libsanitizer/sanitizer_common/sanitizer_stacktrace.h:131
        2 0x7fab994264f6 in __asan::asan_malloc_usable_size(void const*, unsigned long, unsigned long) ../../../../src/libsanitizer/asan/asan_allocator.cpp:1058
        3 0x7fab99039bcf in mt_count_free lib/memory.c:78
        4 0x7fab99039bcf in qfree lib/memory.c:130
        5 0x7fab98ff971a in hash_clean lib/hash.c:290
        6 0x56110cdb0e7f in mgmt_txn_hash_destroy mgmtd/mgmt_txn.c:1881
        7 0x56110cdb0e7f in mgmt_txn_destroy mgmtd/mgmt_txn.c:2013
        8 0x56110cd8e5de in mgmt_terminate mgmtd/mgmt.c:91
        9 0x56110cd8e003 in sigint mgmtd/mgmt_main.c:90
        10 0x7fab990bf4b0 in frr_sigevent_process lib/sigevent.c:117
        11 0x7fab990ea7a1 in event_fetch lib/event.c:1740
        12 0x7fab9901a24e in frr_run lib/libfrr.c:1245
        13 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
        14 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
        15 0x7fab98af9304 in __libc_start_main_impl ../csu/libc-start.c:360
        16 0x56110cd8dd30 in _start (/usr/lib/frr/mgmtd+0x3ad30)

    0x6030001d94a0 is located 0 bytes inside of 24-byte region [0x6030001d94a0,0x6030001d94b8)
    freed by thread T0 here:
        0 0x7fab994b76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
        1 0x7fab99039bf0 in qfree lib/memory.c:131
        2 0x7fab98ff93e1 in hash_release lib/hash.c:227
        3 0x56110cdaabdc in mgmt_txn_unlock mgmtd/mgmt_txn.c:1931
        4 0x56110cdab049 in mgmt_txn_delete mgmtd/mgmt_txn.c:1841
        5 0x56110cdab0ce in mgmt_txn_hash_free mgmtd/mgmt_txn.c:1864
        6 0x7fab98ff970b in hash_clean lib/hash.c:288
        7 0x56110cdb0e7f in mgmt_txn_hash_destroy mgmtd/mgmt_txn.c:1881
        8 0x56110cdb0e7f in mgmt_txn_destroy mgmtd/mgmt_txn.c:2013
        9 0x56110cd8e5de in mgmt_terminate mgmtd/mgmt.c:91
        10 0x56110cd8e003 in sigint mgmtd/mgmt_main.c:90
        11 0x7fab990bf4b0 in frr_sigevent_process lib/sigevent.c:117
        12 0x7fab990ea7a1 in event_fetch lib/event.c:1740
        13 0x7fab9901a24e in frr_run lib/libfrr.c:1245
        14 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
        15 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

    previously allocated by thread T0 here:
        0 0x7fab994b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
        1 0x7fab990392fd in qcalloc lib/memory.c:106
        2 0x7fab98ff8b4f in hash_get lib/hash.c:156
        3 0x56110cdb13ae in mgmt_txn_create_new mgmtd/mgmt_txn.c:1825
        4 0x56110cdb3b4d in mgmt_txn_notify_be_adapter_conn mgmtd/mgmt_txn.c:2212
        5 0x56110cd91178 in mgmt_be_adapter_conn_init mgmtd/mgmt_be_adapter.c:842
        6 0x7fab990ec6de in event_call lib/event.c:2019
        7 0x7fab9901a243 in frr_run lib/libfrr.c:1246
        8 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
        9 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

mgmt_txn_hash_free is called only from hash_clean.  mgmt_txn_unlock
and mgmt_txn_delete are also called from other places, and there
hash_release must still be called.  So notice when mgmtd is being
called from hash_clean and do not call hash_release in that case
(since we know the entry is being released already).

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 62f35c7bdb2a6364dd03ab120e7bb685dd317c24)

2 months ago Merge pull request #18250 from FRRouting/mergify/bp/stable/10.1/pr-18216
Donald Sharp [Tue, 25 Feb 2025 15:36:40 +0000 (10:36 -0500)]
Merge pull request #18250 from FRRouting/mergify/bp/stable/10.1/pr-18216

pimd: Fix PIM VRF support (send register/register stop in VRF) (backport #18216)

2 months ago pimd: Fix PIM VRF support (send register/register stop in VRF) 18250/head
Martin Buck [Fri, 21 Feb 2025 07:54:49 +0000 (08:54 +0100)]
pimd: Fix PIM VRF support (send register/register stop in VRF)

In 946195391406269003275850e1a4d550ea8db38b and
8ebcc02328c6b63ecf85e44fdfbf3365be27c127, transmission of PIM register and
register stop messages was changed to use a separate socket. However, that
socket is not bound to a possible VRF, so the messages were sent in the
default VRF instead. Call vrf_bind() once after socket creation and when the
VRF is ready to ensure transmission in the correct VRF. vrf_bind() handles
the non-VRF case (i.e. VRF_DEFAULT) automatically, so it may be called
unconditionally.

Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
(cherry picked from commit 5a01011e0d2db538a8ba523904bd4f08b786edfb)

2 months ago Merge pull request #18229 from FRRouting/mergify/bp/stable/10.1/pr-18210
Donald Sharp [Sat, 22 Feb 2025 23:19:07 +0000 (18:19 -0500)]
Merge pull request #18229 from FRRouting/mergify/bp/stable/10.1/pr-18210

bgpd: remove dmed check not required in bestpath selection (backport #18210)

2 months ago bgpd: remove dmed check not required in bestpath selection 18229/head
Donald Sharp [Thu, 20 Feb 2025 19:28:15 +0000 (14:28 -0500)]
bgpd: remove dmed check not required in bestpath selection

As part of the upstream master commit (f3575f61c7 bgpd: Sort the
bgp_path_inf) the snippet of the code for dmed check condition
left out, which leads to an issue of selecting incorrect bestpath.

As an example:

During bestpath selection, the local route loses to another path because
the dmed condition is hit.

The snippet of the logs:

2025/02/20 03:06:20.131441 BGP: [JW7VP-K1YVV] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): Comparing path 27.0.0.7 flags Valid  with path Static announcement flags Selected Valid Attr Changed Unsorted
2025/02/20 03:06:20.131445 BGP: [SYTDR-QV6X9] [2]:[0]:[48]:[00:92:00:00:00:10]: path 27.0.0.7 loses to path Static announcement as ES 03:44:38:39:ff:ff:02:00:00:01 is same and local
2025/02/20 03:06:20.131452 BGP: [JW7VP-K1YVV] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): Comparing path 27.0.0.8 flags Valid  with path Static announcement flags Selected Valid Attr Changed Unsorted
2025/02/20 03:06:20.131456 BGP: [SYTDR-QV6X9] [2]:[0]:[48]:[00:92:00:00:00:10]: path 27.0.0.8 loses to path Static announcement as ES 03:44:38:39:ff:ff:02:00:00:01 is same and local
2025/02/20 03:06:20.131458 BGP: [WEWEC-8SE72] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): path Static announcement is the bestpath from AS 0   <<<< static is best
2025/02/20 03:06:20.131463 BGP: [Z3A78-GM3G5] bgp_best_selection: [2]:[0]:[48]:[00:92:00:00:00:10](VRF default) pi 27.0.0.7 dmed
2025/02/20 03:06:20.131467 BGP: [Z3A78-GM3G5] bgp_best_selection: [2]:[0]:[48]:[00:92:00:00:00:10](VRF default) pi 27.0.0.8 dmed
2025/02/20 03:06:20.131471 BGP: [N6CTF-2RSKS] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): After path selection, newbest is path 27.0.0.7 oldbest was Static announce

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 83ad94694bc061e1ff5f43db42cba46320e0df73)

2 months ago Merge pull request #18209 from FRRouting/mergify/bp/stable/10.1/pr-17666
Donald Sharp [Thu, 20 Feb 2025 21:19:12 +0000 (16:19 -0500)]
Merge pull request #18209 from FRRouting/mergify/bp/stable/10.1/pr-17666

pimd: During prefix-list update, behave as PIM_UPSTREAM_NOTJOINED sta… (backport #17666)

2 months ago Merge pull request #18205 from FRRouting/mergify/bp/stable/10.1/pr-14227
Donald Sharp [Thu, 20 Feb 2025 19:19:53 +0000 (14:19 -0500)]
Merge pull request #18205 from FRRouting/mergify/bp/stable/10.1/pr-14227

pimd: Fix for data packet loss when FHR is LHR and RP (backport #14227)

2 months ago pimd: During prefix-list update, behave as PIM_UPSTREAM_NOTJOINED state (conformance... 18209/head
Rajesh Varatharaj [Wed, 21 Jun 2023 17:59:12 +0000 (10:59 -0700)]
pimd: During prefix-list update, behave as PIM_UPSTREAM_NOTJOINED state (conformance issue)

Issue:
If there are any changes to the prefix list, we perform a re-lookup to map the correct RP for the group,
even if the S,G entry is PIM_UPSTREAM_NOTJOINED and on the FHR. In the case of IGMPv3, an S,G entry can be
created with no joins. The re-lookup is not necessary in that state:
 https://www.rfc-editor.org/rfc/rfc4601#section-4.5.7 specifies no operation in the NOTJOINED case.

Solution:
To solve this issue, stop RP mapping when the state is NOTJOINED.

Ticket: #3496931

Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>
(cherry picked from commit 51f26d17da288af44a8a0e536dbe317a7e678514)

2 months ago pimd: Fix for data packet loss when FHR is LHR and RP 18205/head
Rajesh Varatharaj [Thu, 17 Aug 2023 20:11:42 +0000 (13:11 -0700)]
pimd: Fix for data packet loss when FHR is LHR and RP

Topology:
A single router is acting as the First Hop Router (FHR), Last Hop Router (LHR), and RP.

RC and Issue:
When an upstream S,G is in join state, it sends a register message to the RP.
If the RP has the receiver, it sends a register stop message and switches to the shortest path.
When the register stop message is processed, it removes pimreg, moves to prune,
and starts the reg stop timer.

When the reg stop timer expires, PIM changes S,G state to Join Pending and sends out a NULL
register message to RP. RP receives it and fails to send Reg stop because SPT is not set at that point.

The problem is when the register stop timer pops and state is in Join Pending.
According to https://www.rfc-editor.org/rfc/rfc4601#section-4.4.1,
we need to put back the pimreg reg tunnel into the S,G mroute.
This causes data to be sent to the control plane and subsequently interrupts the line rate.

Fix:
If the router is FHR and RP to the group,
ignore SPT status and send out a register stop message back to the DR (in this context, the same router).

Ticket: #3506780

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>
(cherry picked from commit 8280257cc99e071c205e469399f2fb41671b30eb)

2 months ago Merge pull request #18201 from FRRouting/revert-18156-mergify/bp/stable/10.1/pr-18121
Jafar Al-Gharaibeh [Wed, 19 Feb 2025 19:08:09 +0000 (13:08 -0600)]
Merge pull request #18201 from FRRouting/revert-18156-mergify/bp/stable/10.1/pr-18121

Revert "bgpd: release manual vpn label on instance deletion (backport #18121)"

2 months ago Revert "bgpd: release manual vpn label on instance deletion (backport #18121)" 18201/head
Donald Sharp [Wed, 19 Feb 2025 16:25:29 +0000 (11:25 -0500)]
Revert "bgpd: release manual vpn label on instance deletion (backport #18121)"

2 months ago Merge pull request #18145 from FRRouting/mergify/bp/stable/10.1/pr-18079
Russ White [Tue, 18 Feb 2025 15:28:25 +0000 (10:28 -0500)]
Merge pull request #18145 from FRRouting/mergify/bp/stable/10.1/pr-18079

bgpd: Fix crash in bgp_labelpool (backport #18079)

2 months ago Merge pull request #18156 from FRRouting/mergify/bp/stable/10.1/pr-18121
Russ White [Tue, 18 Feb 2025 15:27:56 +0000 (10:27 -0500)]
Merge pull request #18156 from FRRouting/mergify/bp/stable/10.1/pr-18121

bgpd: release manual vpn label on instance deletion (backport #18121)

2 months ago Merge pull request #18185 from FRRouting/mergify/bp/stable/10.1/pr-18109
Donald Sharp [Sun, 16 Feb 2025 13:09:45 +0000 (08:09 -0500)]
Merge pull request #18185 from FRRouting/mergify/bp/stable/10.1/pr-18109

bgpd: fix vty output of evpn route-target AS4 (backport #18109)

2 months ago bgpd: fix vty output of evpn route-target AS4 18185/head
Mark Stapp [Tue, 11 Feb 2025 19:35:28 +0000 (14:35 -0500)]
bgpd: fix vty output of evpn route-target AS4

evpn route-targets are decoded in  ... multiple places; at least
two have a bug where the AS4 form doesn't have its AS decoded.

Signed-off-by: Mark Stapp <mjs@cisco.com>
(cherry picked from commit 9943a08720ccbed87cd6938791066a0de94a92c6)

2 months ago Merge pull request #18175 from cscarpitta/fix/backport_srv6_route_dump_for_10.1
Donald Sharp [Sat, 15 Feb 2025 14:16:44 +0000 (09:16 -0500)]
Merge pull request #18175 from cscarpitta/fix/backport_srv6_route_dump_for_10.1

lib: fix false context information for SRv6 route (backport #18023 for 10.1)

2 months ago Merge pull request #18168 from FRRouting/mergify/bp/stable/10.1/pr-18160
Donald Sharp [Sat, 15 Feb 2025 14:14:46 +0000 (09:14 -0500)]
Merge pull request #18168 from FRRouting/mergify/bp/stable/10.1/pr-18160

bgpd: When removing the prefix list drop the pointer (backport #18160)

2 months ago lib: fix false context information for SRv6 route 18175/head
Philippe Guibert [Wed, 5 Feb 2025 08:52:59 +0000 (09:52 +0100)]
lib: fix false context information for SRv6 route

The seg6local route dumped by 'show ipv6 route' suggests that the USP
flavor is supported, which is not the case. This is context
information, and for End, the context information should be
empty.

> # show ipv6 route
> [..]
> I>* fc00:0:4::/128 [115/0] is directly connected, sr0, seg6local End USP, weight 1, 00:49:01

Fix this by suppressing the USP information from the output.

Fixes: e496b4203055 ("bgpd: prefix-sid srv6 l3vpn service tlv")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>

2 months ago bgpd: When removing the prefix list drop the pointer 18168/head
Donald Sharp [Fri, 14 Feb 2025 12:55:09 +0000 (07:55 -0500)]
bgpd: When removing the prefix list drop the pointer

We very rarely see this crash:

    0 0x7f36ba48e389 in prefix_list_apply_ext lib/plist.c:789
    1 0x55eff3fa4126 in subgroup_announce_check bgpd/bgp_route.c:2334
    2 0x55eff3fa858e in subgroup_process_announce_selected bgpd/bgp_route.c:3440
    3 0x55eff4016488 in subgroup_announce_table bgpd/bgp_updgrp_adv.c:808
    4 0x55eff401664e in subgroup_announce_route bgpd/bgp_updgrp_adv.c:861
    5 0x55eff40111df in peer_af_announce_route bgpd/bgp_updgrp.c:2223
    6 0x55eff3f884cb in bgp_announce_route_timer_expired bgpd/bgp_route.c:5892
    7 0x7f36ba4ec239 in event_call lib/event.c:2019
    8 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
    9 0x55eff3e668b7 in main bgpd/bgp_main.c:557
    10 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    11 0x7f36b9e2d304 in __libc_start_main_impl ../csu/libc-start.c:360
    12 0x55eff3e64a30 in _start (/home/ci/cibuild.1407/frr-source/bgpd/.libs/bgpd+0x2fda30)
0x608000037038 is located 24 bytes inside of 88-byte region [0x608000037020,0x608000037078)
freed by thread T0 here:
    0 0x7f36ba8b76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
    1 0x7f36ba439bd7 in qfree lib/memory.c:131
    2 0x7f36ba48d3a3 in prefix_list_free lib/plist.c:156
    3 0x7f36ba48d3a3 in prefix_list_delete lib/plist.c:247
    4 0x7f36ba48fbef in prefix_bgp_orf_remove_all lib/plist.c:1516
    5 0x55eff3f679c4 in bgp_route_refresh_receive bgpd/bgp_packet.c:2841
    6 0x55eff3f70bab in bgp_process_packet bgpd/bgp_packet.c:4069
    7 0x7f36ba4ec239 in event_call lib/event.c:2019
    8 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
    9 0x55eff3e668b7 in main bgpd/bgp_main.c:557
    10 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
previously allocated by thread T0 here:
    0 0x7f36ba8b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    1 0x7f36ba4392e4 in qcalloc lib/memory.c:106
    2 0x7f36ba48d0de in prefix_list_new lib/plist.c:150
    3 0x7f36ba48d0de in prefix_list_insert lib/plist.c:186
    4 0x7f36ba48d0de in prefix_list_get lib/plist.c:204
    5 0x7f36ba48f9df in prefix_bgp_orf_set lib/plist.c:1479
    6 0x55eff3f67ba6 in bgp_route_refresh_receive bgpd/bgp_packet.c:2920
    7 0x55eff3f70bab in bgp_process_packet bgpd/bgp_packet.c:4069
    8 0x7f36ba4ec239 in event_call lib/event.c:2019
    9 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
    10 0x55eff3e668b7 in main bgpd/bgp_main.c:557
    11 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Let's just stop saving the pointer in the peer->orf_plist
data structure.  There are other design problems, but at least let's
stop the crash from happening.

Fixes: #18138
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 3d43d7b78971520854903c11b6aec23754fdca34)

2 months ago bgpd: release manual vpn label on instance deletion 18156/head
Louis Scalbert [Wed, 12 Feb 2025 12:49:50 +0000 (13:49 +0100)]
bgpd: release manual vpn label on instance deletion

When a BGP instance with a manually assigned VPN label is deleted, the
label is not released from the Zebra label registry. As a result,
reapplying a configuration with the same manual label leads to VPN
prefix export failures.

For example, with the following configuration:

> router bgp 65000 vrf BLUE
>  address-family ipv4 unicast
>   label vpn export <int>

Release zebra label registry on unconfiguration.

Fixes: d162d5f6f5 ("bgpd: fix hardset l3vpn label available in mpls pool")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit d6363625c35a99933bf60c9cf0b79627b468c9f7)

# Conflicts:
# bgpd/bgpd.c

2 months ago bgpd: Fix crash in bgp_labelpool 18145/head
Donald Sharp [Mon, 10 Feb 2025 17:02:00 +0000 (12:02 -0500)]
bgpd: Fix crash in bgp_labelpool

The bgp labelpool code is grabbing the vpn policy data structure.
This vpn_policy has a pointer to the bgp data structure.  If
an item placed on the bgp label pool workqueue happens to sit
there for a microsecond or so and the operator issues a
`no router bgp...` command that corresponds to the vpn_policy
bgp pointer, then when the workqueue is run it will crash because
the bgp pointer is now freed and something else owns it.

Modify the labelpool code to store the vrf id associated
with the request on the workqueue.  When the request is
processed, if the vrf id still has a bgp pointer, allow the
request to continue; otherwise drop it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 14eac319e8ae9314f5270f871106a70c4986c60c)
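
The "store an id, look it up when the work runs" pattern described above can be sketched as follows. This is a hypothetical Python model, not the labelpool code: `bgp_by_vrf`, `enqueue_request` and `run_workqueue` are invented names, and Python has no dangling pointers, so the model only shows the lookup-or-drop decision.

```python
from collections import deque

bgp_by_vrf = {}       # vrf_id -> BGP instance (hypothetical registry)
workqueue = deque()   # pending label requests


class Bgp:
    def __init__(self, name):
        self.name = name


def enqueue_request(vrf_id, label):
    # Store the id on the queue, not a reference to the instance.
    workqueue.append({"vrf_id": vrf_id, "label": label})


def run_workqueue():
    """Process every queued request; return (handled, dropped) labels."""
    handled, dropped = [], []
    while workqueue:
        req = workqueue.popleft()
        bgp = bgp_by_vrf.get(req["vrf_id"])
        if bgp is None:
            # The instance was deleted while the request was queued.
            dropped.append(req["label"])
            continue
        handled.append((bgp.name, req["label"]))
    return handled, dropped
```

If an instance is removed from `bgp_by_vrf` between `enqueue_request` and `run_workqueue`, its request is dropped instead of touching stale state.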

2 months ago Merge pull request #18135 from FRRouting/mergify/bp/stable/10.1/pr-18120
Donald Sharp [Thu, 13 Feb 2025 16:19:11 +0000 (11:19 -0500)]
Merge pull request #18135 from FRRouting/mergify/bp/stable/10.1/pr-18120

bgpd: fix incorrect JSON in bgp_show_table_rd (backport #18120)

2 months ago Merge pull request #18137 from opensourcerouting/fix/backport_bgp_bfd_10.1
Donald Sharp [Thu, 13 Feb 2025 16:18:33 +0000 (11:18 -0500)]
Merge pull request #18137 from opensourcerouting/fix/backport_bgp_bfd_10.1

bgp/bfd backports for stable/10.1

2 months ago bgpd: fix bfd with update-source in peer-group 18137/head
Louis Scalbert [Wed, 22 Jan 2025 12:30:55 +0000 (13:30 +0100)]
bgpd: fix bfd with update-source in peer-group

Fix BFD session not created when the peer is in update-group with the
update-source option.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>

2 months ago bgpd: When bgp notices a change to shared_network inform bfd of it
Donald Sharp [Thu, 5 Dec 2024 15:16:03 +0000 (10:16 -0500)]
bgpd: When bgp notices a change to shared_network inform bfd of it

When bgp starts up and reads the config in *before* it has
received interface addresses from zebra, shared_network can
be set to false.  Later on, once bgp attempts to reconnect,
it will figure out shared_network again (because it has now
received the data from zebra).  In this case, tell bfd
about it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>

2 months ago bgpd: Allow bfd to work if peer known but interface address not yet
Donald Sharp [Wed, 20 Nov 2024 21:07:34 +0000 (16:07 -0500)]
bgpd: Allow bfd to work if peer known but interface address not yet

If bgp is coming up and bgp has not received the interface address yet
but bgp has knowledge about a bfd peering, allow it to set the peering
data appropriately.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>

2 months ago bgpd: Update source address for BFD session
Donatas Abraitis [Tue, 12 Nov 2024 11:09:09 +0000 (13:09 +0200)]
bgpd: Update source address for BFD session

If BFD is down, we should try to detect the source automatically from the given
interface.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>

2 months ago bgpd: Reset BGP session only if it was a real BFD DOWN event
Donatas Abraitis [Tue, 5 Nov 2024 13:51:58 +0000 (15:51 +0200)]
bgpd: Reset BGP session only if it was a real BFD DOWN event

Without this patch we always see a double-reset, e.g.:

```
2024/11/04 12:42:43.010 BGP: [VQY9X-CQZKG] bgp_peer_bfd_update_source: address [0.0.0.0->172.18.0.3] to [172.18.0.2->172.18.0.3]
2024/11/04 12:42:43.010 BGP: [X8BD9-8RKN4] bgp_peer_bfd_update_source: interface none to eth0
2024/11/04 12:42:43.010 BFD: [MSVDW-Y8Z5Q] ptm-del-dest: deregister peer [mhop:no peer:172.18.0.3 local:0.0.0.0 vrf:default cbit:0x00 minimum-ttl:255]
2024/11/04 12:42:43.010 BFD: [NYF5K-SE3NS] ptm-del-session: [mhop:no peer:172.18.0.3 local:0.0.0.0 vrf:default] refcount=0
2024/11/04 12:42:43.010 BFD: [NW21R-MRYNT] session-delete: mhop:no peer:172.18.0.3 local:0.0.0.0 vrf:default
2024/11/04 12:42:43.010 BGP: [P3D3N-3277A] 172.18.0.3 [FSM] Timer (routeadv timer expire)
2024/11/04 12:42:43.010 BFD: [YA0Q5-C0BPV] control-packet: no session found [mhop:no peer:172.18.0.3 local:172.18.0.2 port:11]
2024/11/04 12:42:43.010 BFD: [MSVDW-Y8Z5Q] ptm-add-dest: register peer [mhop:no peer:172.18.0.3 local:172.18.0.2 vrf:default cbit:0x00 minimum-ttl:255]
2024/11/04 12:42:43.011 BFD: [PSB4R-8T1TJ] session-new: mhop:no peer:172.18.0.3 local:172.18.0.2 vrf:default ifname:eth0
2024/11/04 12:42:43.011 BGP: [Q4BCV-6FHZ5] zclient_bfd_session_update: 172.18.0.2/32 -> 172.18.0.3/32 (interface eth0) VRF default(0) (CPI bit no): Down
2024/11/04 12:42:43.011 BGP: [MKVHZ-7MS3V] bfd_session_status_update: neighbor 172.18.0.3 vrf default(0) bfd state Up -> Down
2024/11/04 12:42:43.011 BGP: [HZN6M-XRM1G] %NOTIFICATION: sent to neighbor 172.18.0.3 6/10 (Cease/BFD Down) 0 bytes
2024/11/04 12:42:43.011 BGP: [QFMSE-NPSNN] zclient_bfd_session_update:   sessions updated: 1
2024/11/04 12:42:43.011 BGP: [ZWCSR-M7FG9] 172.18.0.3 [FSM] BGP_Stop (Established->Clearing), fd 22
```

Reset is due to the source address change.

With this patch, we reset the session only if it's a _REAL_ BFD down event, which
means we trigger session reset if BFD session is established earlier than BGP.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>

2 months ago bgpd: fix incorrect json in bgp_show_table_rd 18135/head
Louis Scalbert [Wed, 12 Feb 2025 11:50:42 +0000 (12:50 +0100)]
bgpd: fix incorrect json in bgp_show_table_rd

In bgp_show_table_rd(), the is_last argument is determined using the
expression "next == NULL" to check if the RD table is the last one. This
helps ensure proper JSON formatting.

However, if next is not NULL but is no longer associated with a BGP
table, the JSON output becomes malformed.

Update the condition to also verify the existence of the next bgp_dest
table.

Fixes: 1ae44dfcba ("bgpd: unify 'show bgp' with RD with normal unicast bgp show")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit cf0269649cdd09b8d3f2dd8815caf6ecf9cdeef9)
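
Why a wrong "is last" decision breaks the output can be shown with a small model. The sketch below is hypothetical Python, not the bgpd code: `render` and the `(name, routes)` tuples are invented, and an entry with `routes` set to `None` stands for an RD that no longer has a BGP table attached.

```python
import json


def render(rds, check_next_table):
    """rds: list of (rd_name, routes or None); None means no table is attached."""
    out = ["{"]
    for i, (name, routes) in enumerate(rds):
        if routes is None:
            continue  # nothing is printed for an RD without a table
        nxt = rds[i + 1] if i + 1 < len(rds) else None
        if check_next_table:
            # Also treat this entry as last when the next one has no table.
            is_last = nxt is None or nxt[1] is None
        else:
            is_last = nxt is None
        out.append(json.dumps(name) + ": " + json.dumps(routes))
        if not is_last:
            out.append(",")
    out.append("}")
    return "".join(out)


def is_valid_json(text):
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


rds = [("65000:1", ["10.0.0.0/24"]), ("65000:2", None)]
```

With `check_next_table=False` the model emits a trailing comma before the closing brace, which is not valid JSON; with `check_next_table=True` the output parses. The sketch covers only the case described above, where the entry that follows has no table.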

2 months ago Merge pull request #18055 from FRRouting/mergify/bp/stable/10.1/pr-14105
Donald Sharp [Wed, 12 Feb 2025 17:39:43 +0000 (12:39 -0500)]
Merge pull request #18055 from FRRouting/mergify/bp/stable/10.1/pr-14105

pimd: Fix for FHR mroute taking longer to age out (backport #14105)

2 months ago Merge pull request #18058 from FRRouting/mergify/bp/stable/10.1/pr-18048
Donald Sharp [Wed, 12 Feb 2025 17:38:48 +0000 (12:38 -0500)]
Merge pull request #18058 from FRRouting/mergify/bp/stable/10.1/pr-18048

pimd: fix DR election race on startup (backport #18048)

2 months ago Merge pull request #18090 from FRRouting/mergify/bp/stable/10.1/pr-17935
Donald Sharp [Wed, 12 Feb 2025 13:19:25 +0000 (08:19 -0500)]
Merge pull request #18090 from FRRouting/mergify/bp/stable/10.1/pr-17935

zebra: include resolving nexthops in nhg hash (backport #17935)

2 months ago Merge pull request #18114 from FRRouting/mergify/bp/stable/10.1/pr-18078
Donald Sharp [Wed, 12 Feb 2025 13:16:42 +0000 (08:16 -0500)]
Merge pull request #18114 from FRRouting/mergify/bp/stable/10.1/pr-18078

nhrpd: fix dont consider incomplete L2 entry (backport #18078)

2 months ago nhrpd: fix dont consider incomplete L2 entry 18114/head
Philippe Guibert [Mon, 10 Feb 2025 15:15:44 +0000 (16:15 +0100)]
nhrpd: fix dont consider incomplete L2 entry

Sometimes, NHRP receives L2 information on a cache entry with the
0.0.0.0 IP address. NHRP considers it as valid and updates the binding
with the new IP address.

> Feb 09 20:09:54 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x2 cache used 0 type 4
> Feb 09 20:10:35 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x4 cache used 1 type 4
> Feb 09 20:10:48 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: del-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x4 cache used 1 type 4
> Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: who-has 10.2.114.238 dev dmvpn1 lladdr (unspec) nud 0x1 cache used 1 type 4
> Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QVXNM-NVHEQ] Netlink: update binding for 10.2.114.238 dev dmvpn1 from c 162.251.180.10 peer.vc.nbma 162.251.180.10 to lladdr (unspec)
> Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 0.0.0.0 nud 0x2 cache used 1 type 4
> Feb 09 20:11:30 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 0.0.0.0 nud 0x4 cache used 1 type 4

Actually, the 0.0.0.0 IP address mentioned in the 'who-has' message is
wrong because the nud state value means that value is incomplete and
should not be handled as a valid entry. Instead of considering it, fix
this by invalidating the current binding. This step is necessary in
order to permit NHRP to trigger resolution requests again.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit 3202323052485d8138a3440e9c9907594ad99c57)
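
The nud values in the log above are Linux neighbour-state bits: 0x1 is NUD_INCOMPLETE, 0x2 is NUD_REACHABLE and 0x4 is NUD_STALE. A minimal sketch of the decision described above follows. It is a hypothetical Python model, not nhrpd code: `bindings` and `on_neigh_event` are invented names.

```python
NUD_INCOMPLETE = 0x01   # Linux neighbour states (linux/neighbour.h)
NUD_REACHABLE = 0x02
NUD_STALE = 0x04

bindings = {}   # protocol address -> NBMA (link-layer) address


def on_neigh_event(proto_addr, lladdr, nud):
    """Hypothetical handler: update or invalidate the binding for proto_addr."""
    if nud & NUD_INCOMPLETE:
        # Resolution is still in progress: forget the binding rather than
        # storing a placeholder address, so resolution can be triggered again.
        bindings.pop(proto_addr, None)
        return "invalidated"
    bindings[proto_addr] = lladdr
    return "updated"
```

In this model an event with nud 0x1 never writes an address into `bindings`; it only removes the existing entry.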

2 months ago Merge pull request #18103 from FRRouting/mergify/bp/stable/10.1/pr-18060
Jafar Al-Gharaibeh [Wed, 12 Feb 2025 02:53:18 +0000 (20:53 -0600)]
Merge pull request #18103 from FRRouting/mergify/bp/stable/10.1/pr-18060

lib: crash handlers must be allowed on threads (backport #18060)

2 months ago Merge pull request #18085 from FRRouting/mergify/bp/stable/10.1/pr-17901
Jafar Al-Gharaibeh [Tue, 11 Feb 2025 17:58:00 +0000 (11:58 -0600)]
Merge pull request #18085 from FRRouting/mergify/bp/stable/10.1/pr-17901

lib: actually hash all 16 bytes of IPv6 addresses, not just 4 (backport #17901)

2 months ago lib: crash handlers must be allowed on threads 18103/head
David Lamparter [Fri, 7 Feb 2025 12:22:25 +0000 (13:22 +0100)]
lib: crash handlers must be allowed on threads

Blocking all signals on non-main threads is not the way to go, at least
the handlers for SIGSEGV, SIGBUS, SIGILL, SIGABRT and SIGFPE need to run
so we get backtraces.  Otherwise the process just exits.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit 13a6ac5b4ca8fc08b348f64de64a787982f24250)
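
The mask described above (block everything on worker threads except the crash signals) can be sketched with the POSIX per-thread signal mask. This is a hypothetical Python illustration, not the FRR change: `worker_mask` and `blocked_in_worker` are invented names, and since Python runs its own signal handlers on the main thread only, the sketch shows just the mask arithmetic on a Unix system.

```python
import signal
import threading

# Signals whose handlers must stay runnable on every thread (list from the text above).
CRASH_SIGNALS = {signal.SIGSEGV, signal.SIGBUS, signal.SIGILL,
                 signal.SIGABRT, signal.SIGFPE}


def worker_mask():
    """All signals known to this platform except the crash signals."""
    return set(signal.valid_signals()) - CRASH_SIGNALS


def blocked_in_worker():
    """Start a thread, apply the mask there, and report what ended up blocked."""
    result = {}

    def run():
        signal.pthread_sigmask(signal.SIG_SETMASK, worker_mask())
        # SIG_BLOCK with an empty set changes nothing and returns the current mask.
        result["blocked"] = signal.pthread_sigmask(signal.SIG_BLOCK, [])

    thread = threading.Thread(target=run)
    thread.start()
    thread.join()
    return result["blocked"]
```

The mask is applied inside the worker only, so the calling thread keeps its own mask. The kernel ignores attempts to block SIGKILL and SIGSTOP, so including them in the set is harmless.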

2 months ago tests: Add a test that shows the v6 recursive nexthop problem 18090/head
Donald Sharp [Mon, 27 Jan 2025 15:34:31 +0000 (10:34 -0500)]
tests: Add a test that shows the v6 recursive nexthop problem

Currently FRR does not handle v6 recursive resolution properly
when the route being recursed through changes and the most
significant bits of the route are not changed.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 73ab6a46c51db91df297774221053ab8fc4d12ae)

2 months ago zebra: include resolving nexthops in nhg hash
Mark Stapp [Mon, 27 Jan 2025 19:17:24 +0000 (14:17 -0500)]
zebra: include resolving nexthops in nhg hash

Ensure that the nhg hash comparison function includes all
nexthops, including recursive-resolving nexthops.

Signed-off-by: Mark Stapp <mjs@cisco.com>
(cherry picked from commit cb7cf73992847cfd4af796085bf14f2fdc4fa8db)

2 months ago lib: clean up nexthop hashing mess 18085/head
David Lamparter [Wed, 22 Jan 2025 10:23:31 +0000 (11:23 +0100)]
lib: clean up nexthop hashing mess

We were hashing 4 bytes of the address.  Even for IPv6 addresses.

Oops.

The reason this was done was to try to make it faster, but made a
complex maze out of everything.  Time for a refactor.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit 001fcfa1dd9f7dc2639b4f5c7a52ab59cc425452)
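
The effect of hashing only 4 bytes of a 16-byte address is easy to demonstrate. The sketch below is hypothetical Python, not the FRR hashing code: `hash_first4` and `hash_all` are invented names, and `zlib.crc32` is only a stand-in for whatever hash function the real code uses.

```python
import ipaddress
import zlib


def hash_first4(addr):
    """Hash only the first 4 bytes of the packed address (the bug described above)."""
    return zlib.crc32(ipaddress.ip_address(addr).packed[:4])


def hash_all(addr):
    """Hash every byte of the packed address: 4 for IPv4, 16 for IPv6."""
    return zlib.crc32(ipaddress.ip_address(addr).packed)


a, b = "2001:db8::1", "2001:db8::2"
```

`hash_first4(a)` and `hash_first4(b)` are equal because both addresses start with the same 4 bytes, so every IPv6 address sharing those bytes lands in the same bucket. `hash_all` tells the two apart, and for an IPv4 address both functions agree.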

2 months ago lib: guard against padding garbage in ZAPI read
David Lamparter [Wed, 22 Jan 2025 10:19:04 +0000 (11:19 +0100)]
lib: guard against padding garbage in ZAPI read

When reading in a nexthop from ZAPI, only set the fields that actually
have meaning.  While it shouldn't happen to begin with, we can otherwise
carry padding garbage into the unused leftover union bytes.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit 4a0e1419a69d07496c7adfb744beecd00e1efef2)

2 months ago zebra: guard against junk in nexthop->rmap_src
David Lamparter [Wed, 22 Jan 2025 10:17:21 +0000 (11:17 +0100)]
zebra: guard against junk in nexthop->rmap_src

rmap_src wasn't initialized, so for IPv4 the unused 12 bytes would
contain whatever junk is on the stack on function entry.  Also move
the IPv4 parse before the IPv6 parse so if it's successful we can be
sure the other bytes haven't been touched.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit b666ee510eb480da50476b1bbc84bdf8365df95c)
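
The two ideas above (zero the whole buffer first, and try the IPv4 parse before the IPv6 parse) can be sketched as follows. This is a hypothetical Python model, not the zebra code: `parse_src` is an invented name and the 16-byte buffer stands in for a union large enough for either address family.

```python
import socket


def parse_src(text):
    """Return (family, 16-byte buffer); bytes not used by the family stay zero."""
    buf = bytearray(16)   # zero-initialised up front
    for family in (socket.AF_INET, socket.AF_INET6):   # IPv4 first, then IPv6
        try:
            packed = socket.inet_pton(family, text)
        except OSError:
            continue
        buf[:len(packed)] = packed
        return family, bytes(buf)
    raise ValueError("not an IP address: " + repr(text))
```

For an IPv4 string only the first 4 bytes are written, so the remaining 12 bytes are guaranteed to be zero and comparing or hashing the full buffer gives stable results.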

2 months ago pbrd: initialize structs used in hash_lookup
David Lamparter [Wed, 22 Jan 2025 10:16:10 +0000 (11:16 +0100)]
pbrd: initialize structs used in hash_lookup

Doesn't seem to break anything but really poor style to pass potentially
uninitialized data to hash_lookup.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit c88589f5e9351654c04322eb395003297656989d)

2 months ago fpm: guard against garbage in unused address bytes
David Lamparter [Wed, 22 Jan 2025 10:15:17 +0000 (11:15 +0100)]
fpm: guard against garbage in unused address bytes

Zero out the 12 unused bytes (for the IPv6 address) when reading in an
IPv4 address.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit 95cf0b227980999e2af22a2c171e5237e5ffca8e)

2 months ago bgpd: don't reuse nexthop variable in loop/switch
David Lamparter [Wed, 22 Jan 2025 10:13:21 +0000 (11:13 +0100)]
bgpd: don't reuse nexthop variable in loop/switch

While the loop is currently exited in all cases after using nexthop, it
is a footgun to have "nh" around to be reused in another iteration of
the loop.  This would leave nexthop with partial data from the previous
use.  Make it local where needed instead.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit ce7f5b21221f0b3557d1f4a40793230d8bc4cf02)

2 months ago pimd: fix DR election race on startup 18058/head
Rafael Zalamena [Thu, 6 Feb 2025 22:28:50 +0000 (19:28 -0300)]
pimd: fix DR election race on startup

In case interface address is learnt during configuration, make sure to
run DR election when configuring PIM/PIM passive on interface.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
(cherry picked from commit 86445246062583197d4a6dff7b8c74003cd8049d)

2 months ago pimd: Fix for FHR mroute taking longer to age out 18055/head
Rajesh Varatharaj [Thu, 27 Jul 2023 06:57:04 +0000 (23:57 -0700)]
pimd: Fix for FHR mroute taking longer to age out

Issue:
When there is no traffic for a group, the LHR and RP take the default KAT + Join timer expiry of
at most 480 seconds to clear the (S,G). On the FHR, however, we update the state from JOINED
to NOT JOINED and the downstream state from PP to NOINFO. This restarts the ET timer, causing the (S,G) on the FHR to
take more than 10 minutes to age out.

In other words,
Consider a case where (S,G) is in Join state. When the traffic stops and the KAT (210) expires,
 the Join expiry timer restarts. At this time, if we receive a prune, the expectation is to set
 PPT to 0 (RFC 4601 sec 4.5.2).
 When the PPT expires, we move to the noinfo state and restart the expiry timer one more time. We remove the
 (S,G) entry only after ~10 minutes when there is no active traffic.

Summary:
KAT Join ET 210 + PP ET 210 + NOINFO ET 210.

Solution:
Delete the ifchannel when in noinfo state, and KAT is not running.
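
As a rough sketch of the arithmetic and of the new condition (the enum,
constant and function names are hypothetical, not pimd identifiers; only the
210-second value and the rule itself come from the text above):

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_TIMER_SECS 210 /* value quoted in the summary above */

enum demo_ifjoin_state { DEMO_NOINFO, DEMO_JOIN, DEMO_PRUNE_PENDING };

/* Three back-to-back timer runs: KAT/Join ET, PP ET, NOINFO ET. */
static int demo_old_ageout_secs(void)
{
	return 3 * DEMO_TIMER_SECS;
}

/* The rule from the solution: drop the channel once it is in NOINFO
 * and the KAT is not running. */
static bool demo_should_delete(enum demo_ifjoin_state st, bool kat_running)
{
	return st == DEMO_NOINFO && !kat_running;
}
```

Three runs of 210 seconds add up to 630 seconds, which matches the "more
than 10 minutes" observed before the fix.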

Ticket: #13703

Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>
(cherry picked from commit afed39ea2be25bf30d50ac49b4edf424deadcb17)

2 months agoMerge pull request #18036 from opensourcerouting/fix/stabilize_10.1_again
Russ White [Fri, 7 Feb 2025 18:59:11 +0000 (13:59 -0500)]
Merge pull request #18036 from opensourcerouting/fix/stabilize_10.1_again

Stabilize 10.1 branch

2 months agoRevert "bgpd: Do not ignore auto generated VRF instances when deleting" 18036/head
Donatas Abraitis [Thu, 6 Feb 2025 09:10:13 +0000 (11:10 +0200)]
Revert "bgpd: Do not ignore auto generated VRF instances when deleting"

This reverts commit 0a923af56dbe43fdb4e9184c3525d0537740aef9.

2 months agoRevert "bgpd: fix duplicate BGP instance created with unified config"
Donatas Abraitis [Thu, 6 Feb 2025 09:09:49 +0000 (11:09 +0200)]
Revert "bgpd: fix duplicate BGP instance created with unified config"

This reverts commit aba588dd09aa098a88ba1355798c0e784e91ebc8.

2 months agoRevert "bgpd: fix import vrf creates multiple bgp instances"
Donatas Abraitis [Thu, 6 Feb 2025 09:09:43 +0000 (11:09 +0200)]
Revert "bgpd: fix import vrf creates multiple bgp instances"

This reverts commit 8c187fb4f838d8d8a21f8608c3a510136764b122.

2 months agoReapply "bgpd: fix duplicate BGP instance created with unified config"
Donatas Abraitis [Thu, 6 Feb 2025 09:09:30 +0000 (11:09 +0200)]
Reapply "bgpd: fix duplicate BGP instance created with unified config"

This reverts commit daa68852a2a78acf103e8ae1127953b2870c6772.

2 months agoRevert "bgpd: fix duplicate BGP instance created with unified config"
Donatas Abraitis [Thu, 6 Feb 2025 09:09:25 +0000 (11:09 +0200)]
Revert "bgpd: fix duplicate BGP instance created with unified config"

This reverts commit 3abd84ef5be1ef56b66f0e7617f8afab6da6c5cc.

2 months agoMerge pull request #18016 from opensourcerouting/fix/backport_bgpd_10.1
Russ White [Wed, 5 Feb 2025 13:33:21 +0000 (08:33 -0500)]
Merge pull request #18016 from opensourcerouting/fix/backport_bgpd_10.1

bgpd: Recent failed backports for 10.1

2 months agobgpd: fix duplicate BGP instance created with unified config 18016/head
Philippe Guibert [Wed, 5 Feb 2025 12:15:51 +0000 (14:15 +0200)]
bgpd: fix duplicate BGP instance created with unified config

When running the bgp_evpn_rt5 setup with unified config, a memory leak
of a BGP instance that was never deleted is reported.

> root@ubuntu2204hwe:~/frr/tests/topotests/bgp_evpn_rt5# cat /tmp/topotests/bgp_evpn_rt5.test_bgp_evpn/r1.asan.bgpd.1164105
>
> =================================================================
> ==1164105==ERROR: LeakSanitizer: detected memory leaks
>
> Indirect leak of 12496 byte(s) in 1 object(s) allocated from:
>     #0 0x7f358eeb4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7f358e877233 in qcalloc lib/memory.c:106
>     #2 0x55d06c95680a in bgp_create bgpd/bgpd.c:3405
>     #3 0x55d06c95a7b3 in bgp_get bgpd/bgpd.c:3805
>     #4 0x55d06c87a9b5 in bgp_get_vty bgpd/bgp_vty.c:603
>     #5 0x55d06c68dc71 in bgp_evpn_local_l3vni_add bgpd/bgp_evpn.c:7032
>     #6 0x55d06c92989b in bgp_zebra_process_local_l3vni bgpd/bgp_zebra.c:3204
>     #7 0x7f358e9e3feb in zclient_read lib/zclient.c:4626
>     #8 0x7f358e98082d in event_call lib/event.c:1996
>     #9 0x7f358e848931 in frr_run lib/libfrr.c:1232
>     #10 0x55d06c60eae1 in main bgpd/bgp_main.c:557
>     #11 0x7f358e229d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Actually, a BGP VRF instance is created in auto mode when creating the
global BGP instance for the L3 VNI, and then another BGP VRF instance
is created. Fix this by first checking whether a BGP instance already
exists. If it does, and it is in auto mode or in hidden mode, then
override the AS value.

Fixes: f153b9a9b636 ("bgpd: Ignore auto created VRF BGP instances")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2 months agoRevert "bgpd: fix duplicate BGP instance created with unified config"
Donatas Abraitis [Wed, 5 Feb 2025 12:13:01 +0000 (14:13 +0200)]
Revert "bgpd: fix duplicate BGP instance created with unified config"

This reverts commit aba588dd09aa098a88ba1355798c0e784e91ebc8.

2 months agobgpd: fix add label support to EVPN AD routes
Philippe Guibert [Mon, 3 Feb 2025 13:49:53 +0000 (14:49 +0100)]
bgpd: fix add label support to EVPN AD routes

When peering with an EVPN device from another vendor, FRR acting as a route
reflector can neither read nor transmit the label value.

EVPN AD routes completely ignore the label value in the code,
whereas some functionalities such as evpn-vpws are allowed to
carry and propagate a label value.

Fix this by handling the label value.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2 months agobgpd: Do not start BGP session if BGP identifier is not set
Donatas Abraitis [Wed, 29 Jan 2025 21:03:06 +0000 (23:03 +0200)]
bgpd: Do not start BGP session if BGP identifier is not set

If we have an IPv6-only network and no IPv4 addresses at all, the BGP identifier
defaults to 0.0.0.0, which is treated as malformed according to RFC 6286.
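
A minimal sketch of such a guard, assuming the identifier is held as a
32-bit integer (the function names are hypothetical, not bgpd's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* RFC 6286 requires the 4-octet BGP Identifier to be non-zero. */
static bool demo_bgp_id_valid(uint32_t router_id)
{
	return router_id != 0;
}

/* Refuse to start a session while no usable identifier is set. */
static bool demo_may_start_session(uint32_t router_id)
{
	return demo_bgp_id_valid(router_id);
}
```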

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2 months agoMerge pull request #17995 from FRRouting/mergify/bp/stable/10.1/pr-17991
Russ White [Tue, 4 Feb 2025 16:16:27 +0000 (11:16 -0500)]
Merge pull request #17995 from FRRouting/mergify/bp/stable/10.1/pr-17991

zebra: fix evpn svd hash avoid double free (backport #17991)

2 months agoMerge pull request #17998 from FRRouting/mergify/bp/stable/10.1/pr-17992
Jafar Al-Gharaibeh [Tue, 4 Feb 2025 16:15:06 +0000 (10:15 -0600)]
Merge pull request #17998 from FRRouting/mergify/bp/stable/10.1/pr-17992

bgpd: fix route-distinguisher in vrf leak json cmd (backport #17992)

2 months agoMerge pull request #17984 from opensourcerouting/fix/backports_auto_vrf_10.1
Russ White [Tue, 4 Feb 2025 16:03:46 +0000 (11:03 -0500)]
Merge pull request #17984 from opensourcerouting/fix/backports_auto_vrf_10.1

bgpd: Auto vrf instance (backports)

2 months agobgpd: fix route-distinguisher in vrf leak json cmd 17998/head
Chirag Shah [Mon, 3 Feb 2025 20:00:41 +0000 (12:00 -0800)]
bgpd: fix route-distinguisher in vrf leak json cmd

For an auto-configured RD, the RD value comes back as NULL.
Switching back to the original change ensures that both auto and
user-configured RD values are covered in JSON.

tor-11# show bgp vrf blue ipv4 unicast route-leak json
{
  "vrf":"blue",
  "afiSafi":"ipv4Unicast",
  "importFromVrfs":[
    "purple"
  ],
  "importRts":"10.10.3.11:6",
  "exportToVrfs":[
    "purple"
  ],
  "routeDistinguisher":"(null)", <<<<<
  "exportRts":"10.10.3.11:10"
}

Signed-off-by: Chirag Shah <chirag@nvidia.com>
(cherry picked from commit 892704d07f5286464728720648ad392b485a9966)

2 months agozebra: evpn svd hash avoid double free 17995/head
Chirag Shah [Fri, 31 Jan 2025 01:26:46 +0000 (17:26 -0800)]
zebra: evpn svd hash avoid double free

Upon zebra shutdown, hash_clean_and_free is called
with a user free function.
The free function should not call hash_release,
which leads to a double free of the hash bucket.

Fix:
Avoid calling hash_release from the
free function if it is called from the hash_clean_and_free
path.
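
The ownership rule can be sketched with a toy table that only counts bucket
releases instead of really freeing buckets twice. All names are
hypothetical; demo_release and demo_clean merely stand in for the roles of
hash_release and hash_clean:

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_SLOTS 4

struct demo_table {
	void *slot[DEMO_SLOTS]; /* each non-NULL slot stands for one bucket */
	int bucket_frees;       /* how many times a bucket was released */
};

static struct demo_table demo_tbl;

/* Release role: unlink one entry and free its bucket. */
static void demo_release(struct demo_table *t, void *data)
{
	for (int i = 0; i < DEMO_SLOTS; i++) {
		if (t->slot[i] == data) {
			t->slot[i] = NULL;
			t->bucket_frees++;
		}
	}
}

/* Normal delete path: unlink from the table, then free the payload. */
static void demo_del(void *data)
{
	demo_release(&demo_tbl, data);
	free(data);
}

/* Terminate path: the table frees the bucket itself, so only the
 * payload is freed here. */
static void demo_del_terminate(void *data)
{
	free(data);
}

/* Clean role: call free_func on each payload, then free the bucket. */
static void demo_clean(struct demo_table *t, void (*free_func)(void *))
{
	for (int i = 0; i < DEMO_SLOTS; i++) {
		if (!t->slot[i])
			continue;
		free_func(t->slot[i]);
		t->slot[i] = NULL;
		t->bucket_frees++;
	}
}
```

Passing demo_del to demo_clean releases every bucket twice in this model,
while demo_del_terminate releases each bucket exactly once.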

10 0x00007f0422b7df1f in free () from /lib/x86_64-linux-gnu/libc.so.6
11 0x00007f0422edd779 in qfree (mt=0x7f0423047ca0 <MTYPE_HASH_BUCKET>,
    ptr=0x55fc8bc81980) at ../lib/memory.c:130
12 0x00007f0422eb97e2 in hash_clean (hash=0x55fc8b979a60,
    free_func=0x55fc8a529478 <svd_nh_del_terminate>) at
    ../lib/hash.c:290
13 0x00007f0422eb98a1 in hash_clean_and_free (hash=0x55fc8a675920
    <svd_nh_table>, free_func=0x55fc8a529478 <svd_nh_del_terminate>) at
    ../lib/hash.c:305
14 0x000055fc8a5323a5 in zebra_vxlan_terminate () at
    ../zebra/zebra_vxlan.c:6099
15 0x000055fc8a4c9227 in zebra_router_terminate () at
    ../zebra/zebra_router.c:276
16 0x000055fc8a4413b3 in zebra_finalize (dummy=0x7fffb881c1d0) at
    ../zebra/main.c:269
17 0x00007f0422f44387 in event_call (thread=0x7fffb881c1d0) at
    ../lib/event.c:2011
18 0x00007f0422ecb6fa in frr_run (master=0x55fc8b733cb0) at
    ../lib/libfrr.c:1243
19 0x000055fc8a441987 in main (argc=14, argv=0x7fffb881c4a8) at
    ../zebra/main.c:584

Signed-off-by: Chirag Shah <chirag@nvidia.com>
(cherry picked from commit 1d4f5b9b19588d77d3eaf06440c26a8c974831a3)

2 months agobgpd: Do not ignore auto generated VRF instances when deleting 17984/head
Donatas Abraitis [Tue, 28 Jan 2025 15:11:58 +0000 (17:11 +0200)]
bgpd: Do not ignore auto generated VRF instances when deleting

When a VRF instance is going to be deleted inside bgp_vrf_disable(), it uses
a helper method that skips auto created VRF instances, and that leads to a STALE
issue.

When creating a VNI for a particular VRF vrfX with e.g. `advertise-all-vni`,
an auto VRF instance is created, and then we do `router bgp ASN vrf vrfX`.

But when we do a reload, bgp_vrf_disable() is called, and we miss the previously
created auto instance.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2 months agobgpd: fix import vrf creates multiple bgp instances
Philippe Guibert [Thu, 9 Jan 2025 09:26:02 +0000 (10:26 +0100)]
bgpd: fix import vrf creates multiple bgp instances

The more often vrf green is referenced in an `import vrf` command, the more
instances are created. The configuration below references vrf green twice,
and two BGP instances of vrf green are created.

The configuration and resulting output:
> router bgp 99
> [..]
>  import vrf green
> exit
> router bgp 99 vrf blue
> [..]
>  import vrf green
> exit
> router bgp 99 vrf green
> [..]
> exit
>
> r4# show bgp vrfs
> Type  Id     routerId          #PeersCfg  #PeersEstb  Name
>              L3-VNI            RouterMAC              Interface
> DFLT  0      10.0.3.4          0          0           default
>              0                 00:00:00:00:00:00      unknown
>  VRF  5      10.0.40.4         0          0           blue
>              0                 00:00:00:00:00:00      unknown
>  VRF  6      0.0.0.0           0          0           green
>              0                 00:00:00:00:00:00      unknown
>  VRF  6      10.0.94.4         0          0           green
>              0                 00:00:00:00:00:00      unknown

Fix this in the import command by looking up an already present BGP
instance.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2 months agobgpd: fix duplicate BGP instance created with unified config
Philippe Guibert [Tue, 31 Dec 2024 13:38:11 +0000 (14:38 +0100)]
bgpd: fix duplicate BGP instance created with unified config

When running the bgp_evpn_rt5 setup with unified config, a memory leak
of a BGP instance that was never deleted is reported.

> root@ubuntu2204hwe:~/frr/tests/topotests/bgp_evpn_rt5# cat /tmp/topotests/bgp_evpn_rt5.test_bgp_evpn/r1.asan.bgpd.1164105
>
> =================================================================
> ==1164105==ERROR: LeakSanitizer: detected memory leaks
>
> Indirect leak of 12496 byte(s) in 1 object(s) allocated from:
>     #0 0x7f358eeb4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7f358e877233 in qcalloc lib/memory.c:106
>     #2 0x55d06c95680a in bgp_create bgpd/bgpd.c:3405
>     #3 0x55d06c95a7b3 in bgp_get bgpd/bgpd.c:3805
>     #4 0x55d06c87a9b5 in bgp_get_vty bgpd/bgp_vty.c:603
>     #5 0x55d06c68dc71 in bgp_evpn_local_l3vni_add bgpd/bgp_evpn.c:7032
>     #6 0x55d06c92989b in bgp_zebra_process_local_l3vni bgpd/bgp_zebra.c:3204
>     #7 0x7f358e9e3feb in zclient_read lib/zclient.c:4626
>     #8 0x7f358e98082d in event_call lib/event.c:1996
>     #9 0x7f358e848931 in frr_run lib/libfrr.c:1232
>     #10 0x55d06c60eae1 in main bgpd/bgp_main.c:557
>     #11 0x7f358e229d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Actually, a BGP VRF instance is created in auto mode when creating the
global BGP instance for the L3 VNI, and then another BGP VRF instance
is created. Fix this by first checking whether a BGP instance already
exists. If it does, and it is in auto mode or in hidden mode, then
override the AS value.

Fixes: f153b9a9b636 ("bgpd: Ignore auto created VRF BGP instances")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2 months agoMerge pull request #17974 from FRRouting/mergify/bp/stable/10.1/pr-17971
Donatas Abraitis [Sun, 2 Feb 2025 07:51:48 +0000 (09:51 +0200)]
Merge pull request #17974 from FRRouting/mergify/bp/stable/10.1/pr-17971

bgpd: With suppress-fib-pending ensure withdrawal is sent (backport #17971)

2 months agobgpd: With suppress-fib-pending ensure withdrawal is sent 17974/head
Donald Sharp [Fri, 31 Jan 2025 23:53:30 +0000 (18:53 -0500)]
bgpd: With suppress-fib-pending ensure withdrawal is sent

When you have suppress-fib-pending turned on it is possible
to end up in a situation where the prefix is not withdrawn
from downstream peers.

Here is the timing that I believe is happening:

a) have 2 paths to a peer.
b) receive a withdrawal from 1 path, set BGP_NODE_FIB_INSTALL_PENDING
   and send the route install to zebra.
c) receive a withdrawal from the other path.
d) At this point dest->flags has BGP_NODE_FIB_INSTALL_PENDING set,
   old_select is the path_info going away, and new_select is NULL.
e) A bit further down we call group_announce_route() which calls
   the code to see if we should advertise the path.  It sees the
   BGP_NODE_FIB_INSTALL_PENDING flag and says, nope.
f) the route is sent to zebra to withdraw, which unsets the
   BGP_NODE_FIB_INSTALL_PENDING.
g) This function winds up and deletes the path_info.  Dest now
   has no path infos.
h) BGP receives the route install(from step b) and unsets the
   BGP_NODE_FIB_INSTALL_PENDING flag
i) BGP receives the route removed from zebra (from step f) and
   unsets the flag again.

When there is no new_select, go ahead and just
unset the PENDING flag to allow the withdrawal to go out
at the time when the second withdrawal is received.
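
A simplified sketch of that decision, with a stand-in flag and a
hypothetical destination struct (this is not the bgpd best-path code, only
the flag logic described in the steps above):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for BGP_NODE_FIB_INSTALL_PENDING. */
#define DEMO_FIB_INSTALL_PENDING (1u << 0)

struct demo_dest {
	uint32_t flags;
};

/* Announcements are held back while a FIB install is pending. */
static bool demo_may_announce(const struct demo_dest *d)
{
	return !(d->flags & DEMO_FIB_INSTALL_PENDING);
}

/* Best-path step: with no new best path, clear the pending flag so
 * the withdrawal is not suppressed.  Returns whether an update may
 * be sent to peers now. */
static bool demo_process(struct demo_dest *d, const void *new_select)
{
	if (new_select == NULL)
		d->flags &= ~DEMO_FIB_INSTALL_PENDING;
	return demo_may_announce(d);
}
```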

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 4e8eda74ec7d30ba84e7f53f077f4b896728505a)

2 months agoMerge pull request #17950 from FRRouting/mergify/bp/stable/10.1/pr-17946
Donatas Abraitis [Wed, 29 Jan 2025 20:20:46 +0000 (22:20 +0200)]
Merge pull request #17950 from FRRouting/mergify/bp/stable/10.1/pr-17946

tools: Fix frr-reload for ebgp-multihop TTL reconfiguration. (backport #17946)

2 months agotools: Fix frr-reload for ebgp-multihop TTL reconfiguration. 17950/head
Nobuhiro MIKI [Wed, 29 Jan 2025 04:31:53 +0000 (04:31 +0000)]
tools: Fix frr-reload for ebgp-multihop TTL reconfiguration.

In ebgp-multihop, there is a difference in reload behavior when TTL is
unspecified (meaning default 255) and when 255 is explicitly specified.
For example, when reloading with 'neighbor <neighbor> ebgp-multihop
255' in the config, the following difference is created. This commit
fixes that.

    Lines To Delete
    ===============
    router bgp 65001
     no neighbor 10.0.0.4 ebgp-multihop
    exit

    Lines To Add
    ============
    router bgp 65001
     neighbor 10.0.0.4 ebgp-multihop 255
    exit

The commit 767aaa3a8048 is not sufficient and frr-reload needs to be
fixed to handle both unspecified and specified cases.
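
The underlying idea is to normalize both spellings to one form before
comparing configurations. frr-reload itself is a Python script, so the
following C function is only an illustration of that normalization under
assumed names, not the actual change:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy `line` into `out`, dropping an explicit default TTL of 255 so
 * that "ebgp-multihop 255" and "ebgp-multihop" compare equal. */
static void demo_normalize(const char *line, char *out, size_t outlen)
{
	static const char suffix[] = " ebgp-multihop 255";
	size_t len = strlen(line);
	size_t slen = sizeof(suffix) - 1;

	snprintf(out, outlen, "%s", line);
	if (len >= slen && len < outlen &&
	    strcmp(line + len - slen, suffix) == 0)
		out[len - 4] = '\0'; /* cut the trailing " 255" */
}
```

After normalization the two neighbor lines from the diff above are
identical, so no delete/add pair would be generated for them.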

Signed-off-by: Nobuhiro MIKI <nob@bobuhiro11.net>
(cherry picked from commit 594e917656da5502b302309aed3cf596df24713f)

2 months agoMerge pull request #17939 from opensourcerouting/fix/revert_4338e21aa2feba57ea7004c36...
Donald Sharp [Tue, 28 Jan 2025 14:35:35 +0000 (09:35 -0500)]
Merge pull request #17939 from opensourcerouting/fix/revert_4338e21aa2feba57ea7004c36362e5d8186340b8_10.1

Revert "bgpd: Handle Addpath capability using dynamic capabilities" (backport)

3 months agoRevert "bgpd: Handle Addpath capability using dynamic capabilities" 17939/head
Donatas Abraitis [Sat, 25 Jan 2025 18:28:26 +0000 (20:28 +0200)]
Revert "bgpd: Handle Addpath capability using dynamic capabilities"

This reverts commit 05cf9d03b345393b8d63ffe9345c42debd8362b6.

TL;DR: Handling the BGP AddPath capability dynamically is not trivial (or even possible).

When the sender is AddPath-capable and sends NLRIs encoded with an AddPath ID,
and at the same time the receiver sends the AddPath capability "disable-addpath-rx"
(flag update) via dynamic capabilities, the two peers are out of sync about the
AddPath state. The receiver already considers itself no longer AddPath-capable,
hence it tries to parse NLRIs as non-AddPath, while they are actually encoded
as AddPath.

The AddPath capability itself does not provide (in the RFC) any backward
compatible mechanism to handle NLRIs if they come mixed (AddPath + non-AddPath).
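
To illustrate the mismatch: with AddPath (RFC 7911) each NLRI is preceded
by a 4-octet path identifier, so the same bytes decode differently
depending on which mode the parser believes is active. The helper below is
a hypothetical sketch, not bgpd's NLRI parser:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return the prefix length of the first NLRI in buf, or -1 if the
 * buffer is too short.  With AddPath, a 4-octet path identifier
 * precedes the length octet. */
static int demo_first_prefixlen(const uint8_t *buf, size_t len, bool addpath)
{
	size_t off = addpath ? 4 : 0;

	if (len <= off)
		return -1;
	return buf[off];
}
```

For path ID 1 and prefix 10.0.0.0/8 the wire bytes are 00 00 00 01 08 0a.
An AddPath-aware parser reads a /8, while a non-AddPath parser reads the
first zero octet as a /0 prefix and misinterprets everything after it.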

This explains why we have failures in our CI periodically.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
3 months agoMerge pull request #17923 from donaldsharp/backport_17229_some_to_10_1
Jafar Al-Gharaibeh [Sat, 25 Jan 2025 17:12:45 +0000 (11:12 -0600)]
Merge pull request #17923 from donaldsharp/backport_17229_some_to_10_1

Backport 17229 some to 10 1

3 months agobgpd: Fix wrong pthread event cancelling 17923/head
Donald Sharp [Thu, 24 Oct 2024 21:44:31 +0000 (17:44 -0400)]
bgpd: Fix wrong pthread event cancelling

0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:44
1  __pthread_kill_internal (signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:78
2  __GI___pthread_kill (threadid=130719886083648, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
3  0x000076e399e42476 in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26
4  0x000076e39a34f950 in core_handler (signo=6, siginfo=0x76e3985fca30, context=0x76e3985fc900) at lib/sigevent.c:258
5  <signal handler called>
6  __pthread_kill_implementation (no_tid=0, signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:44
7  __pthread_kill_internal (signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:78
8  __GI___pthread_kill (threadid=130719886083648, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
9  0x000076e399e42476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
10 0x000076e399e287f3 in __GI_abort () at ./stdlib/abort.c:79
11 0x000076e39a39874b in _zlog_assert_failed (xref=0x76e39a46cca0 <_xref.27>, extra=0x0) at lib/zlog.c:789
12 0x000076e39a369dde in cancel_event_helper (m=0x5eda32df5e40, arg=0x5eda33afeed0, flags=1) at lib/event.c:1428
13 0x000076e39a369ef6 in event_cancel_event_ready (m=0x5eda32df5e40, arg=0x5eda33afeed0) at lib/event.c:1470
14 0x00005eda0a94a5b3 in bgp_stop (connection=0x5eda33afeed0) at bgpd/bgp_fsm.c:1355
15 0x00005eda0a94b4ae in bgp_stop_with_notify (connection=0x5eda33afeed0, code=8 '\b', sub_code=0 '\000') at bgpd/bgp_fsm.c:1610
16 0x00005eda0a979498 in bgp_packet_add (connection=0x5eda33afeed0, peer=0x5eda33b11800, s=0x76e3880daf90) at bgpd/bgp_packet.c:152
17 0x00005eda0a97a80f in bgp_keepalive_send (peer=0x5eda33b11800) at bgpd/bgp_packet.c:639
18 0x00005eda0a9511fd in peer_process (hb=0x5eda33c9ab80, arg=0x76e3985ffaf0) at bgpd/bgp_keepalives.c:111
19 0x000076e39a2cd8e6 in hash_iterate (hash=0x76e388000be0, func=0x5eda0a95105e <peer_process>, arg=0x76e3985ffaf0) at lib/hash.c:252
20 0x00005eda0a951679 in bgp_keepalives_start (arg=0x5eda3306af80) at bgpd/bgp_keepalives.c:214
21 0x000076e39a2c9932 in frr_pthread_inner (arg=0x5eda3306af80) at lib/frr_pthread.c:180
22 0x000076e399e94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
23 0x000076e399f26850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) f 12
12 0x000076e39a369dde in cancel_event_helper (m=0x5eda32df5e40, arg=0x5eda33afeed0, flags=1) at lib/event.c:1428
1428 assert(m->owner == pthread_self());

In this decode the attempt to cancel the connection's events from
the wrong thread is causing the crash.  Modify the code to create an
event on the bm->master to cancel the events for the connection.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
3 months agobgpd: Fix deadlock in bgp_keepalive and master pthreads
Donald Sharp [Thu, 24 Oct 2024 18:17:51 +0000 (14:17 -0400)]
bgpd: Fix deadlock in bgp_keepalive and master pthreads

(gdb) bt
0  futex_wait (private=0, expected=2, futex_word=0x5c438e9a98d8) at ../sysdeps/nptl/futex-internal.h:146
1  __GI___lll_lock_wait (futex=futex@entry=0x5c438e9a98d8, private=0) at ./nptl/lowlevellock.c:49
2  0x00007af16d698002 in lll_mutex_lock_optimized (mutex=0x5c438e9a98d8) at ./nptl/pthread_mutex_lock.c:48
3  ___pthread_mutex_lock (mutex=0x5c438e9a98d8) at ./nptl/pthread_mutex_lock.c:93
4  0x00005c4369c17e70 in _frr_mtx_lock (mutex=0x5c438e9a98d8, func=0x5c4369dc2750 <__func__.265> "bgp_notify_send_internal") at ./lib/frr_pthread.h:258
5  0x00005c4369c1a07a in bgp_notify_send_internal (connection=0x5c438e9a98c0, code=8 '\b', sub_code=0 '\000', data=0x0, datalen=0, use_curr=true) at bgpd/bgp_packet.c:928
6  0x00005c4369c1a707 in bgp_notify_send (connection=0x5c438e9a98c0, code=8 '\b', sub_code=0 '\000') at bgpd/bgp_packet.c:1069
7  0x00005c4369bea422 in bgp_stop_with_notify (connection=0x5c438e9a98c0, code=8 '\b', sub_code=0 '\000') at bgpd/bgp_fsm.c:1597
8  0x00005c4369c18480 in bgp_packet_add (connection=0x5c438e9a98c0, peer=0x5c438e9b6010, s=0x7af15c06bf70) at bgpd/bgp_packet.c:151
9  0x00005c4369c19816 in bgp_keepalive_send (peer=0x5c438e9b6010) at bgpd/bgp_packet.c:639
10 0x00005c4369bf01fd in peer_process (hb=0x5c438ed05520, arg=0x7af16bdffaf0) at bgpd/bgp_keepalives.c:111
11 0x00007af16dacd8e6 in hash_iterate (hash=0x7af15c000be0, func=0x5c4369bf005e <peer_process>, arg=0x7af16bdffaf0) at lib/hash.c:252
12 0x00005c4369bf0679 in bgp_keepalives_start (arg=0x5c438e0db110) at bgpd/bgp_keepalives.c:214
13 0x00007af16dac9932 in frr_pthread_inner (arg=0x5c438e0db110) at lib/frr_pthread.c:180
14 0x00007af16d694ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
15 0x00007af16d726850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb)

The bgp keepalive pthread gets deadlocked with itself, and consequently
the bgp master pthread gets stuck when it attempts to lock
the peerhash_mtx, since it is still held by the keepalive pthread.

The keepalive pthread locks the peerhash_mtx in
bgp_keepalives_start.  Next, the connection->io_mtx mutex is locked in
bgp_keepalive_send, and when it notices a problem it invokes
bgp_stop_with_notify, which relocks the same mutex (and of course
the relock causes it to get stuck on itself).  This generates a
deadlock condition.

Modify the code to hold the connection->io_mtx for as short a time as
possible.
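
The shape of the fix can be sketched with a toy non-recursive lock that
reports a relock as failure instead of hanging. The real code uses pthread
mutexes; the struct, the queue limit and all function names here are
hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: a second acquire by the holder would block
 * forever on a real mutex; here it is reported as failure. */
struct demo_lock {
	bool held;
};

static bool demo_lock_acquire(struct demo_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void demo_lock_release(struct demo_lock *l)
{
	l->held = false;
}

struct demo_conn {
	struct demo_lock io_mtx;
	int queued;
	int limit; /* hypothetical queue limit */
};

/* Stop path: takes io_mtx itself, like the notify send path. */
static bool demo_stop(struct demo_conn *c)
{
	if (!demo_lock_acquire(&c->io_mtx))
		return false;
	c->queued = 0;
	demo_lock_release(&c->io_mtx);
	return true;
}

/* Queue a packet.  io_mtx is held only around the queue update; the
 * stop path is called after the lock has been released. */
static bool demo_packet_add(struct demo_conn *c)
{
	bool too_many;

	if (!demo_lock_acquire(&c->io_mtx))
		return false;
	c->queued++;
	too_many = c->queued > c->limit;
	demo_lock_release(&c->io_mtx);

	return too_many ? demo_stop(c) : true;
}
```

Calling demo_stop while io_mtx is still held would fail in this model,
which corresponds to the self-deadlock in the backtrace.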

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
3 months agoMerge pull request #17892 from FRRouting/mergify/bp/stable/10.1/pr-17888
Donatas Abraitis [Wed, 22 Jan 2025 05:04:18 +0000 (07:04 +0200)]
Merge pull request #17892 from FRRouting/mergify/bp/stable/10.1/pr-17888

bgpd: Fix for local interface MAC cache issue in 'bgp mac hash' table (backport #17888)

3 months agobgpd: Fix for local interface MAC cache issue in 'bgp mac hash' table 17892/head
Krishnasamy R [Tue, 21 Jan 2025 09:06:53 +0000 (01:06 -0800)]
bgpd: Fix for local interface MAC cache issue in 'bgp mac hash' table

Issue:
During FRR restart, we fail to add some of the local interfaces' MACs
to the 'bgp mac hash'. Not having a local MAC in the hash table can cause
lookup issues while receiving EVPN RT-2.

Currently, we have code to add the local MAC (bgp_mac_add_mac_entry) while handling
interface add/up events in BGP (bgp_ifp_up/bgp_ifp_create). But the call to
'bgp_mac_add_mac_entry' in bgp_ifp_create is not invoked because it
is placed under a specific check (vrf->bgp link check).

Fix:
We can skip this 'vrf->bgp link existence' check, as the tenant VRF might
not have a BGP instance but we still want to cache the tenant VRF's local
MACs. This brings bgp_ifp_create in line with bgp_ifp_up.

Ticket: #4204154

Signed-off-by: Krishnasamy R <krishnasamyr@nvidia.com>
(cherry picked from commit 016528364e686fb3b23a688707bd6ae6c5ea5f41)

3 months agoMerge pull request #17851 from FRRouting/mergify/bp/stable/10.1/pr-17832
Russ White [Tue, 14 Jan 2025 16:09:13 +0000 (11:09 -0500)]
Merge pull request #17851 from FRRouting/mergify/bp/stable/10.1/pr-17832

bgpd: Aggregate backports (backport #17832)

3 months agobgpd: fix memory leak in bgp_aggregate_install() 17851/head
Enke Chen [Thu, 9 Jan 2025 22:48:35 +0000 (14:48 -0800)]
bgpd: fix memory leak in bgp_aggregate_install()

Potential memory leak with as-set and matching-MED-only config.

Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 94ca6ddfae959a08e84a7a5a070f44ddba70f156)

3 months agobgpd: apply route-map for aggregate before attribute comparison
Enke Chen [Thu, 9 Jan 2025 01:34:29 +0000 (17:34 -0800)]
bgpd: apply route-map for aggregate before attribute comparison

Currently when re-evaluating an aggregate route, the full attribute of
the aggregate route is not compared with the existing one in the BGP
table. That can result in unnecessary churn (un-install and then
install) of the aggregate route when a more specific route is added or
deleted, or when the route-map for the aggregate changes. The churn
would impact route installation and route advertisement.

The fix is to apply the route-map for the aggregate first and then
compare the attribute.
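
A reduced sketch of that ordering, using a two-field attribute and a
function pointer as the route-map (all names are hypothetical; the real
attribute comparison in bgpd covers far more fields):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_attr {
	uint32_t community;
	uint32_t med;
};

/* Route-map stand-in: set community 65004:200. */
static void demo_set_comm(struct demo_attr *attr)
{
	attr->community = (65004u << 16) | 200u;
}

/* Apply the route-map first, then compare with what is installed.
 * Returns true only when a reinstall is really needed. */
static bool demo_aggregate_update(struct demo_attr *installed,
				  struct demo_attr computed,
				  void (*rmap)(struct demo_attr *))
{
	if (rmap)
		rmap(&computed);
	if (installed->community == computed.community &&
	    installed->med == computed.med)
		return false;
	*installed = computed;
	return true;
}
```

Comparing before applying the route-map would always see a difference in
the community and trigger a remove/install cycle on every re-evaluation.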

Here is an example of the churn:

debug bgp aggregate prefix 5.5.5.0/24
!
route-map set-comm permit 10
 set community 65004:200
!
router bgp 65001
 address-family ipv4 unicast
  redistribute static
  aggregate-address 5.5.5.0/24 route-map set-comm
!

Step 1:
  ip route 5.5.5.1/32 Null0

Jan  8 10:28:49 enke-vm1 bgpd[285786]: [J7PXJ-A7YA2] bgp_aggregate_install: aggregate 5.5.5.0/24, count 1
Jan  8 10:28:49 enke-vm1 bgpd[285786]: [Y444T-HEVNG]   aggregate 5.5.5.0/24: installed

Step 2:
  ip route 5.5.5.2/32 Null0

Jan  8 10:29:03 enke-vm1 bgpd[285786]: [J7PXJ-A7YA2] bgp_aggregate_install: aggregate 5.5.5.0/24, count 2
Jan  8 10:29:03 enke-vm1 bgpd[285786]: [S2EH5-EQSX6]   aggregate 5.5.5.0/24: existing, removed
Jan  8 10:29:03 enke-vm1 bgpd[285786]: [Y444T-HEVNG]   aggregate 5.5.5.0/24: installed
---

Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
(cherry picked from commit 22d95f4ba8444171944eab29e99dfa6087813d6f)

3 months agoRevert "bgpd: Reinstall aggregated routes if using route-maps and it was changed"
Enke Chen [Wed, 8 Jan 2025 17:12:56 +0000 (09:12 -0800)]
Revert "bgpd: Reinstall aggregated routes if using route-maps and it was changed"

This reverts commit ee1986f1b5ae6b94b446b12e1b77cc30d8f5f46d.

The fix is incomplete, and is no longer needed with the fix that applies
the route-map for an aggregate and then compares the attribute.

Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
(cherry picked from commit 74c9d89aaf3df1b583de341169c4cb77eaa1b3b4)

3 months agoMerge pull request #17834 from FRRouting/mergify/bp/stable/10.1/pr-17813
Donald Sharp [Fri, 10 Jan 2025 18:35:34 +0000 (13:35 -0500)]
Merge pull request #17834 from FRRouting/mergify/bp/stable/10.1/pr-17813

bgpd: use igpmetric in bgp_aigp_metric_total() (backport #17813)

3 months agobgpd: use igpmetric in bgp_aigp_metric_total() 17834/head
Enke Chen [Thu, 9 Jan 2025 20:02:02 +0000 (12:02 -0800)]
bgpd: use igpmetric in bgp_aigp_metric_total()

Use igpmetric from bgp_path_info in bgp_aigp_metric_total() to be
consistent with all other cases, e.g., as in bgp_path_info_cmp().

Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
(cherry picked from commit b89e66a3bcd5644278f34ec5899b071066e102a1)

3 months agoMerge pull request #17816 from FRRouting/mergify/bp/stable/10.1/pr-17807
Donatas Abraitis [Fri, 10 Jan 2025 07:42:59 +0000 (09:42 +0200)]
Merge pull request #17816 from FRRouting/mergify/bp/stable/10.1/pr-17807

bgpd: fix crash in displaying json orf prefix-list (backport #17807)

3 months agobgpd: fix crash in displaying json orf prefix-list 17816/head
Louis Scalbert [Thu, 9 Jan 2025 17:28:53 +0000 (18:28 +0100)]
bgpd: fix crash in displaying json orf prefix-list

bgpd crashes when there are several entries in the prefix-list. No
backtrace is provided because the issue was caught during a code review.

Fixes: 856ca177c4 ("Added json formating support to show-...-neighbors-... bgp commands.")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 8ccf60921b85893d301186a0f8156fb702da379f)

3 months agobgpd: fix bgp orf prefix-list json prefix
Louis Scalbert [Thu, 9 Jan 2025 17:24:39 +0000 (18:24 +0100)]
bgpd: fix bgp orf prefix-list json prefix

0x<address>FX was displayed instead of the prefix.
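
This symptom is what a formatter that does not know the %pFX extension
produces: it handles "%p" and prints "FX" as literal text. The sketch below
shows this with the C library's snprintf; that the affected code path went
through such a formatter is an assumption here, and demo_format is a
hypothetical helper:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a pointer with "%pFX" using the C library's snprintf.
 * Standard printf knows only "%p", so "FX" is emitted literally
 * after the pointer value. */
static int demo_format(char *out, size_t outlen, const void *prefix)
{
	return snprintf(out, outlen, "%pFX", (void *)prefix);
}
```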

Fixes: b219dda129 ("lib: Convert usage of strings to %pFX and %pRN")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit b7e843d7e8afe57d3815dbb44e30307654e73711)

3 months agoMerge pull request #17785 from FRRouting/mergify/bp/stable/10.1/pr-17780
Donald Sharp [Tue, 7 Jan 2025 18:11:58 +0000 (13:11 -0500)]
Merge pull request #17785 from FRRouting/mergify/bp/stable/10.1/pr-17780

bgpd: fix a bug in peer_allowas_in_set() (backport #17780)

3 months agoMerge pull request #17789 from FRRouting/mergify/bp/stable/10.1/pr-17725
Donald Sharp [Tue, 7 Jan 2025 18:08:43 +0000 (13:08 -0500)]
Merge pull request #17789 from FRRouting/mergify/bp/stable/10.1/pr-17725

isisd: Allow full `no` form for `domain-password` and `area-password` (backport #17725)

3 months agoisisd: Allow full `no` form for `domain-password` and `area-password` 17789/head
Donatas Abraitis [Thu, 26 Dec 2024 15:33:03 +0000 (17:33 +0200)]
isisd: Allow full `no` form for `domain-password` and `area-password`

Before:

```
LR1.wue3(config)# router isis VyOS
LR1.wue3(config-router)# no  area-password clear
% Unknown command: no  area-password clear
LR1.wue3(config-router)# no  area-password clear foo
% Unknown command: no  area-password clear foo
LR1.wue3(config-router)#
```

Closes https://github.com/FRRouting/frr/issues/17722

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit a696547d6c78d4140649f96d6bef9a335fe5dfa5)

3 months agobgpd: fix a bug in peer_allowas_in_set() 17785/head
Enke Chen [Tue, 7 Jan 2025 05:01:14 +0000 (21:01 -0800)]
bgpd: fix a bug in peer_allowas_in_set()

Fix a bug in peer_allowas_in_set() so that the config takes effect
for peer-group members.

Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
(cherry picked from commit bcd10177940223d86cbcfbe1818b2a0b29e0552b)

3 months agoMerge pull request #17764 from FRRouting/mergify/bp/stable/10.1/pr-17750
Jafar Al-Gharaibeh [Sun, 5 Jan 2025 22:06:09 +0000 (16:06 -0600)]
Merge pull request #17764 from FRRouting/mergify/bp/stable/10.1/pr-17750

tools: Add missing rpki keyword to vrf in frr-reload (backport #17750)

3 months agotools: Add missing rpki keyword to vrf in frr-reload 17764/head
Jonathan Voss [Fri, 3 Jan 2025 03:19:30 +0000 (03:19 +0000)]
tools: Add missing rpki keyword to vrf in frr-reload

When reloading the following configuration:
```
vrf red
 rpki
  rpki cache tcp 172.65.0.2 8282 preference 1
 exit
exit-vrf
```
frr-reload.py does not properly enter the `rpki` context
within a `vrf`. Because of this, it fails to apply RPKI
configurations.
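
A minimal sketch of the context tracking involved, assuming a simplified
parser. The names `TOP_LEVEL_OPENERS`, `NESTED_OPENERS` and
`parse_contexts` are hypothetical and are not taken from frr-reload.py;
the point is only that `rpki` has to be registered as a sub-context of
`vrf` for the `rpki cache` line to be attributed to the right context.

```python
# Hypothetical sketch: track nested config contexts while walking lines.
# These keyword tables are illustrative assumptions, not frr-reload.py's own.
TOP_LEVEL_OPENERS = ("vrf ",)
NESTED_OPENERS = {"vrf": ("rpki",)}  # sub-contexts allowed inside a parent


def parse_contexts(lines):
    """Return a list of (context_path, command) tuples."""
    stack = []
    result = []
    for raw in lines:
        line = raw.strip()
        if not line or line == "!":
            continue
        if line in ("exit", "exit-vrf"):
            # Leave the innermost context.
            if stack:
                stack.pop()
            continue
        if not stack and line.startswith(TOP_LEVEL_OPENERS):
            stack.append(line)
            continue
        if len(stack) == 1:
            parent_kind = stack[0].split()[0]
            if line in NESTED_OPENERS.get(parent_kind, ()):
                # Without this entry, "rpki" would be treated as a plain
                # command of the vrf instead of opening a sub-context.
                stack.append(line)
                continue
        result.append((tuple(stack), line))
    return result


config = """\
vrf red
 rpki
  rpki cache tcp 172.65.0.2 8282 preference 1
 exit
exit-vrf
""".splitlines()

parsed = parse_contexts(config)
for path, command in parsed:
    print(" / ".join(path), "->", command)
```

If `rpki` is dropped from `NESTED_OPENERS`, the sketch attributes both
`rpki` and the `rpki cache` line directly to `vrf red`, and the following
`exit` closes the vrf context one level too early.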

Signed-off-by: Jonathan Voss <jvoss@onvox.net>
(cherry picked from commit 975ee8ed6eb22f68538f3446b29ca34d65bec72f)

3 months ago Merge pull request #17741 from FRRouting/mergify/bp/stable/10.1/pr-17731
Donatas Abraitis [Sat, 4 Jan 2025 11:52:05 +0000 (13:52 +0200)]
Merge pull request #17741 from FRRouting/mergify/bp/stable/10.1/pr-17731

zebra: Fix resetting valid flags for NHG dependents (backport #17731)

3 months ago Merge pull request #17756 from FRRouting/mergify/bp/stable/10.1/pr-17732
Donatas Abraitis [Sat, 4 Jan 2025 11:49:38 +0000 (13:49 +0200)]
Merge pull request #17756 from FRRouting/mergify/bp/stable/10.1/pr-17732

isisd: Show correct level information for `show isis interface detail json` (backport #17732)

3 months ago isisd: Show correct level information for `show isis interface detail json` 17756/head
Donatas Abraitis [Mon, 30 Dec 2024 08:31:44 +0000 (10:31 +0200)]
isisd: Show correct level information for `show isis interface detail json`

Having this configuration:

```
!
interface r1-eth0
 ip address 10.0.0.1/30
 ip router isis 1
 isis priority 44 level-1
 isis priority 88 level-2
 isis csnp-interval 90 level-1
 isis csnp-interval 99 level-2
 isis psnp-interval 70 level-1
 isis psnp-interval 50 level-2
 isis hello-interval level-1 120
 isis hello-interval level-2 150

!
interface r1-eth1
 ip address 10.0.0.10/30
 ip router isis 1
!
interface lo
 ip address 192.0.2.1/32
 ip router isis 1
 isis passive
!
router isis 1
 net 49.0000.0000.0000.0001.00
 metric-style wide
```

Produces:

```
{
 "areas":[
   {
     "area":"1",
     "circuits":[
       {
         "circuit":2,
         "interface":{
           "name":"r1-eth0",
           "state":"Up",
           "is-passive":"active",
           "circuit-id":"0x2",
           "type":"lan",
           "level":"L1L2",
           "snpa":"6e28.9c92.da5e",
           "levels":[
             {
               "level":"L1",
               "metric":10,
               "active-neighbors":1,
               "hello-interval":120,
               "holddown":{
                 "count":10,
                 "pad":"yes"
               },
               "cnsp-interval":90,
               "psnp-interval":70,
               "lan":{
                 "priority":44,
                 "is-dis":"no"
               }
             },
             {
               "level":"L2",
               "metric":10,
               "active-neighbors":1,
               "hello-interval":120, <<<<<<<<<<<<<<<<<<
               "holddown":{
                 "count":10,
                 "pad":"yes"
               },
               "cnsp-interval":90, <<<<<<<<<<<<<<<<<<
               "psnp-interval":70, <<<<<<<<<<<<<<<<<<
               "lan":{
                 "priority":44, <<<<<<<<<<<<<<<<<<
                 "is-dis":"no"
               }
             }
           ],
...
```
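
A small check along these lines can flag the problem. This is only a
sketch: `EXPECTED` is derived by hand from the r1-eth0 configuration
above, `level_mismatches` is a hypothetical helper, and the JSON is a
trimmed copy of the `levels` array shown above (key names kept exactly
as printed).

```python
import json

# Expected per-level values, taken by hand from the r1-eth0 configuration.
EXPECTED = {
    "L1": {"hello-interval": 120, "cnsp-interval": 90,
           "psnp-interval": 70, "priority": 44},
    "L2": {"hello-interval": 150, "cnsp-interval": 99,
           "psnp-interval": 50, "priority": 88},
}


def level_mismatches(levels, expected):
    """Return (level, key, got, want) for every value that differs."""
    mismatches = []
    for entry in levels:
        want = expected.get(entry["level"], {})
        # Flatten the nested "lan" object so "priority" is compared too.
        flat = {**entry, **entry.get("lan", {})}
        for key, value in want.items():
            if flat.get(key) != value:
                mismatches.append((entry["level"], key, flat.get(key), value))
    return mismatches


# Trimmed copy of the "levels" array from the output above.
buggy_levels = json.loads("""
[
  {"level": "L1", "hello-interval": 120, "cnsp-interval": 90,
   "psnp-interval": 70, "lan": {"priority": 44, "is-dis": "no"}},
  {"level": "L2", "hello-interval": 120, "cnsp-interval": 90,
   "psnp-interval": 70, "lan": {"priority": 44, "is-dis": "no"}}
]
""")

mismatches = level_mismatches(buggy_levels, EXPECTED)
for level, key, got, want in mismatches:
    print(f"{level}: {key} is {got}, configured {want}")
```

With the output above, all four reported mismatches are on L2, matching
the lines marked with `<<<`.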

Fixes: 9fee4d4c6038ef6b14e9f509d6b04d189660c4cd ("isisd: Add json to show isis interface command.")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 360a0d6f4ca68fda0eb5d64a8633018a3b5a4a1d)

3 months ago zebra: Fix resetting valid flags for NHG dependents 17741/head
Donald Sharp [Sun, 29 Dec 2024 06:40:37 +0000 (22:40 -0800)]
zebra: Fix resetting valid flags for NHG dependents

Upon if_down, we don't reset the valid flag for dependents
and unset the INSTALLED flag.

So when it's time for the NHG to be deleted (routes dereferenced),
zebra deletes it since refcnt goes to 0, but stale NHG remains in kernel.

Ticket: #4200788

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
(cherry picked from commit 54ec9f38884fb63e045732537c4c1f4a94608987)

4 months ago FRR Release 10.1.2 rc/10.1.2 docker/10.1.2 frr-10.1.2
Donatas Abraitis [Mon, 23 Dec 2024 21:07:38 +0000 (23:07 +0200)]
FRR Release 10.1.2

- babeld
-   Do not remove route when replacing
-   Send the route's metric down to zebra
- bfdd
-   Add no variants to interval configurations
-   Retain remote dplane client socket
- bgpd
-   Actually make `--v6-with-v4-nexthops` work
-   Add `bgp ipv6-auto-ra` command
-   Allow value 0 in aigp-metric setting
-   Avoid use-after-free when doing `no router bgp` with auto created instances
-   Fix to pop items off zebra_announce FIFO for few EVPN triggers
-   Clear all paths including addpath once GR expires
-   Compare aigp after local route check in bgp_path_info_cmp()
-   Do not filter no-export community for BGP OAD (one administration domain)
-   Do not reset peers on suppress-fib toggling
-   EVPN fix per rd specific type-2 json output
-   Fix bgp core with a possible Intf delete
-   Fix blank line in running-config with bmp listener cmd
-   Fix crash when polling bgp4v2PathAttrTable
-   Fix display of local label in show bgp
-   Fix `enforce-first-as` per peer-group removal
-   Fix evpn bestpath calculation when path is not established
-   Fix evpn mh esi flap remove local routes
-   Fix for match source-protocol in route-map for redistribute cmd
-   Fix memory leak when creating BMP connection with a source interface
-   Fix memory leak when reconfiguring a route distinguisher
-   Fix printfrr_bp for non initialized peers
-   Fix resolvedPrefix in show nexthop json output
-   Fix route selection with AIGP
-   Fix several issues in sourcing AIGP attribute
-   Fix unconfigure asdot neighbor
-   Fix use single whitespace when displaying flowspec entries
-   Fix version attribute is an int, not a string
-   Include structure when installing End.DT4/6 SID
-   Include structure when installing End.DT46 SID
-   Include structure when removing End.DT4/6 SID
-   Include structure when removing End.DT46 SID
-   Move some non BGP-specific route-map functions to lib
-   Set LLGR stale routes for all the paths including addpath
-   Treat numbered community-list only if it's in a range 1-500
-   Validate both nexthop information (NEXTHOP and NLRI)
-   Validate only affected RPKI prefixes instead of a full RIB
- isisd
-   Fix change flex-algorithm number from uint32 to uint8
-   Fix memory leaks when the neighbor state transitions from non-UP to DOWN
-   Fix rcap tlv double-free crash
-   Fix wrong check for MT commands
- lib
-   Attach stdout to child only if --log=stdout and stdout FD is a tty
-   Include SID structure in seg6local nexthop
-   Take ge/le into consideration when checking the prefix with the prefix-list
-   Keep `zebra on-rib-process script` in frr.conf
- nhrpd
-   Fix duplicate auth extension
- ospfd
-   Add a hidden command for old `no router-id`
-   Fix heap corruption vulnerability when parsing SR-Algorithm TLV
-   Fix missing '[no] ip ospf graceful-restart hello-delay <N>' commands
-   Fix interface 'ip ospf neighbor-filter' startup config not being applied
-   Use the router_id that Zebra has if we remove a static router_id
- pimd
-   Allow resolving bsr via directly connected secondary address
-   Fix access-list memory leak in pimd
- vrrpd
-   Iterate over all ancillary messages
- zebra
-   Add missing new line for help string
-   Add missing proto translations
-   Correctly report metrics
-   Fix crash during reconnect
-   Fix heap-use-after-free on ns shutdown
-   Fix snmp walk of zebra rib
-   Let's use memset instead of walking bytes and setting to 0
-   Separate zebra ZAPI server open and accept
-   Unlock node only after operation in zebra_free_rnh()

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>

4 months ago Merge pull request #17697 from FRRouting/mergify/bp/stable/10.1/pr-17586
Donatas Abraitis [Mon, 23 Dec 2024 20:10:07 +0000 (22:10 +0200)]
Merge pull request #17697 from FRRouting/mergify/bp/stable/10.1/pr-17586

bgpd: Validate only affected RPKI prefixes instead of a full RIB (backport #17586)

4 months ago Merge pull request #17679 from FRRouting/mergify/bp/stable/10.1/pr-17675
Jafar Al-Gharaibeh [Mon, 23 Dec 2024 04:50:40 +0000 (22:50 -0600)]
Merge pull request #17679 from FRRouting/mergify/bp/stable/10.1/pr-17675

bgpd: Fix memory leak when creating BMP connection with a source interface (backport #17675)

4 months ago Merge pull request #17713 from opensourcerouting/fix/backport_b6dcf618777bb7a11176617...
Jafar Al-Gharaibeh [Mon, 23 Dec 2024 04:48:37 +0000 (22:48 -0600)]
Merge pull request #17713 from opensourcerouting/fix/backport_b6dcf618777bb7a11176617d647e16ab64f49b7b_10.1

bgpd: Fix `enforce-first-as` per peer-group removal (backport)