mirror/frr.git
5 weeks ago zebra: Do not flush an existing vni configuration trying to remove wrong vni mergify/bp/stable/10.2/pr-18108 18456/head
Donatas Abraitis [Tue, 11 Feb 2025 19:22:12 +0000 (21:22 +0200)]
zebra: Do not flush an existing vni configuration trying to remove wrong vni

Before:

```
pc.donatas.net(config)# do sh run | include vni
vni 1
pc.donatas.net(config)# no vni 2
pc.donatas.net(config)# do sh run | include vni
pc.donatas.net(config)#
```
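
The intended behaviour can be sketched in C as a guard that compares the requested VNI with the configured one before removing anything. This is only an illustration under stated assumptions: `struct vni_cfg` and `vni_unconfigure()` are hypothetical names, not zebra symbols, and a VNI of 0 is assumed to mean "not configured".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical configuration state: 0 is assumed to mean "no VNI configured". */
struct vni_cfg {
	uint32_t vni;
};

/*
 * Remove the VNI only when the requested value matches the configured one.
 * Returns 0 on success, -1 when the request names a different VNI.
 */
int vni_unconfigure(struct vni_cfg *cfg, uint32_t requested)
{
	if (cfg->vni != requested)
		return -1; /* wrong VNI: leave the existing configuration alone */

	cfg->vni = 0;
	return 0;
}
```

With `vni 1` configured, a call with 2 is rejected and the configuration is kept; only a call with 1 clears it.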

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 44fe3981ee388f7c60ab2635309bce34774116e1)

5 weeks ago Merge pull request #18434 from FRRouting/mergify/bp/stable/10.2/pr-18430
Donald Sharp [Thu, 20 Mar 2025 00:42:58 +0000 (20:42 -0400)]
Merge pull request #18434 from FRRouting/mergify/bp/stable/10.2/pr-18430

lib: Create VRF if needed (backport #18430)

5 weeks ago lib: Create VRF if needed mergify/bp/stable/10.2/pr-18430 18434/head
Nathan Bahr [Wed, 19 Mar 2025 16:07:37 +0000 (16:07 +0000)]
lib: Create VRF if needed

When creating a control plane protocol through NB, create the vrf
if needed instead of only looking up and asserting if it doesn't
exist yet.
Fixes 18429.
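
The difference between "look up and assert" and "create if needed" can be sketched as below. This is a toy model, not the FRR VRF library: `toy_vrf_table`, `toy_vrf_lookup()` and `toy_vrf_get()` are names invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_VRFS 8

/* Hypothetical, fixed-size VRF table used only for this sketch. */
struct toy_vrf {
	char name[16];
};

struct toy_vrf_table {
	struct toy_vrf vrfs[TOY_MAX_VRFS];
	size_t count;
};

/* Pure lookup: returns NULL when the VRF does not exist yet. */
struct toy_vrf *toy_vrf_lookup(struct toy_vrf_table *t, const char *name)
{
	for (size_t i = 0; i < t->count; i++)
		if (strcmp(t->vrfs[i].name, name) == 0)
			return &t->vrfs[i];
	return NULL;
}

/* Get-or-create: returns the existing entry or adds a new one. */
struct toy_vrf *toy_vrf_get(struct toy_vrf_table *t, const char *name)
{
	struct toy_vrf *vrf = toy_vrf_lookup(t, name);

	if (vrf)
		return vrf;
	if (t->count == TOY_MAX_VRFS || strlen(name) >= sizeof(vrf->name))
		return NULL; /* table full or name too long */

	vrf = &t->vrfs[t->count++];
	strcpy(vrf->name, name);
	return vrf;
}
```

A caller that asserts on the result of the pure lookup aborts when the VRF is missing, whereas the get-or-create variant hands back a usable entry in both cases.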

Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
(cherry picked from commit b6ae01f907c071be6cd197df0f3ca6fe9baa631a)

5 weeks ago Merge pull request #18423 from FRRouting/mergify/bp/stable/10.2/pr-18393
Jafar Al-Gharaibeh [Wed, 19 Mar 2025 14:55:58 +0000 (09:55 -0500)]
Merge pull request #18423 from FRRouting/mergify/bp/stable/10.2/pr-18393

ospf6d: Disable and delete OSPFv3 areas that no longer have interfaces or configuration. (backport #18393)

5 weeks ago ospf6d: Disable and delete OSPFv3 areas that no longer have interfaces or configur... 18423/head
Acee Lindem [Fri, 14 Mar 2025 16:02:28 +0000 (16:02 +0000)]
ospf6d: Disable and delete OSPFv3 areas that no longer have interfaces or configuration.

This fix deletes an OSPFv3 area when all of its interfaces and
configuration (ranges, NSSA ranges, stub area, NSSA area, filter-list,
import-list and export-list) have been removed. The change provides
a general solution to https://github.com/FRRouting/frr/issues/18324.

Signed-off-by: Acee Lindem <acee@lindem.com>
(cherry picked from commit 04994891fe164b4d5a2819d3bc90e5346f94dc53)

5 weeks ago Merge pull request #18411 from opensourcerouting/fix/cherry-picks_bgp_10.2
Russ White [Tue, 18 Mar 2025 12:47:12 +0000 (08:47 -0400)]
Merge pull request #18411 from opensourcerouting/fix/cherry-picks_bgp_10.2

bgpd: Backport recent changes for 10.2 regarding EVPN pointer changes

6 weeks ago bgpd: Do not call evpn_overlay_free no matter what 18411/head
Donald Sharp [Wed, 23 Oct 2024 17:16:29 +0000 (13:16 -0400)]
bgpd: Do not call evpn_overlay_free no matter what

bgp_update is a very expensive call.  Calling evpn_overlay_free
even when we have no evpn data to free is not trivial.  Let's
limit the call into this function until we actually have data to
free.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>

6 weeks ago bgpd: In bgp_update() for mac addrs ensure we are dealing with evpn
Donald Sharp [Wed, 30 Oct 2024 17:11:35 +0000 (13:11 -0400)]
bgpd: In bgp_update() for mac addrs ensure we are dealing with evpn

The code is just arbitrarily checking to see if there are any
mac addresses associated with a prefix.  This makes no
sense from the perspective that it can only happen as
an evpn route.  Let's not make non-evpn people pay
the price to check this data.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>

6 weeks ago Merge pull request #18404 from FRRouting/mergify/bp/stable/10.2/pr-18387
Donald Sharp [Mon, 17 Mar 2025 12:15:24 +0000 (08:15 -0400)]
Merge pull request #18404 from FRRouting/mergify/bp/stable/10.2/pr-18387

bgpd: Fixed crash upon bgp network import-check command (backport #18387)

6 weeks ago bgpd: Fixed crash upon bgp network import-check command 18404/head
Manpreet Kaur [Thu, 13 Mar 2025 11:14:24 +0000 (04:14 -0700)]
bgpd: Fixed crash upon bgp network import-check command

BT:
```
3  <signal handler called>
4  0x00005616837546fc in bgp_static_update (bgp=bgp@entry=0x5616865eac50, p=0x561686639e40,
    bgp_static=0x561686639f50, afi=afi@entry=AFI_IP6, safi=safi@entry=SAFI_UNICAST) at ../bgpd/bgp_route.c:7232
5  0x0000561683754ad0 in bgp_static_add (bgp=0x5616865eac50) at ../bgpd/bgp_table.h:413
6  0x0000561683785e2e in no_bgp_network_import_check (self=<optimized out>, vty=0x5616865e04c0,
    argc=<optimized out>, argv=<optimized out>) at ../bgpd/bgp_vty.c:4609
7  0x00007fdbcc294820 in cmd_execute_command_real (vline=vline@entry=0x561686663000,
```

The program hit a segmentation fault when attempting to access pi->extra->vrfleak->bgp_orig because
pi->extra->vrfleak was NULL.
```
(gdb) p pi->extra->vrfleak
$1 = (struct bgp_path_info_extra_vrfleak *) 0x0
(gdb) p pi->extra->vrfleak->bgp_orig
Cannot access memory at address 0x8
```
Added a NULL check on pi->extra->vrfleak before accessing pi->extra->vrfleak->bgp_orig
to prevent the segmentation fault.
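
The shape of such a guard can be sketched as follows. The structures are simplified stand-ins invented for this illustration (only the field names `extra`, `vrfleak` and `bgp_orig` come from the text above), and `path_bgp_orig()` is a hypothetical helper, not a bgpd function.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the bgpd structures named above. */
struct bgp;

struct vrfleak_info {
	void *parent;
	struct bgp *bgp_orig;
};

struct path_extra {
	struct vrfleak_info *vrfleak;
};

struct path_info {
	struct path_extra *extra;
};

/*
 * Return the originating instance, or NULL when any link in the
 * pi->extra->vrfleak chain is missing.
 */
struct bgp *path_bgp_orig(const struct path_info *pi)
{
	if (!pi || !pi->extra || !pi->extra->vrfleak)
		return NULL;

	return pi->extra->vrfleak->bgp_orig;
}
```

Checking every pointer in the chain before the final dereference means a path without vrfleak data yields NULL instead of a crash.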

Signed-off-by: Manpreet Kaur <manpreetk@nvidia.com>
(cherry picked from commit bc1008b970541c090e36fc1d50c720df822fcb99)

6 weeks ago Merge pull request #18392 from FRRouting/mergify/bp/stable/10.2/pr-18360
Donatas Abraitis [Sat, 15 Mar 2025 17:33:07 +0000 (18:33 +0100)]
Merge pull request #18392 from FRRouting/mergify/bp/stable/10.2/pr-18360

zebra: ensure proper return for failure for Sid allocation (backport #18360)

6 weeks ago zebra: ensure proper return for failure for Sid allocation 18392/head
Rajasekar Raja [Mon, 10 Mar 2025 22:26:38 +0000 (15:26 -0700)]
zebra: ensure proper return for failure for Sid allocation

The functions alloc_srv6_sid_func_explicit/dynamic are declared to return bool,
but some code paths return -1 or NULL. The caller can interpret such a value
as true/valid and end up allocating the SID.
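
The pitfall with -1 can be reproduced in a few lines of C. The sketch below is not the zebra code: `func_in_range()`, `alloc_func_buggy()` and `alloc_func_fixed()` are hypothetical names, and the range bounds are plain parameters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical range check; the bounds are parameters, not FRR defaults. */
bool func_in_range(uint32_t func, uint32_t start, uint32_t end)
{
	return func >= start && func <= end;
}

/* Buggy shape: an int error code returned from a bool function. */
bool alloc_func_buggy(uint32_t func, uint32_t start, uint32_t end)
{
	int err = -1;

	if (!func_in_range(func, start, end))
		return err; /* any non-zero value converts to true */
	return true;
}

/* Fixed shape: failure is reported as false. */
bool alloc_func_fixed(uint32_t func, uint32_t start, uint32_t end)
{
	if (!func_in_range(func, start, end))
		return false;
	return true;
}
```

For a function value outside the range, the buggy variant reports success because -1 converts to true, while the fixed variant reports failure.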

Without Fix:
2025/03/10 21:44:04.295350 ZEBRA: [XWV20-TGK70] alloc_srv6_sid_func_explicit: trying to allocate explicit SID function 65088 from block fcbb:bbbb::/32
2025/03/10 21:44:04.295351 ZEBRA: [MM61M-TQZNP] alloc_srv6_sid_func_explicit: elib s 10000 e 20000 wlib s 1000 ewlib s 30000 e 1000 SID_FUNC 65088
2025/03/10 21:44:04.295352 ZEBRA: [QGHMB-SWNFW] alloc_srv6_sid_func_explicit: function 65088 is outside ELIB [10000/20000] and EWLIB alloc ranges [30000/1000]
2025/03/10 21:44:04.295367 ZEBRA: [H0GZA-NNSWJ] get_srv6_sid_explicit: allocated explicit SRv6 SID fcbb:bbbb:1:fe40:: for context End.X nh6 2001::2
2025/03/10 21:44:04.295368 ZEBRA: [XBBYD-T1Q7P] srv6_manager_get_sid_internal: got new SRv6 SID for ctx End.X nh6 2001::2: sid_value=fcbb:bbbb:1:fe40:: (func=65088) (proto=4, instance=0, sessionId=0), notifying all clients

With Fix:
2025/03/10 22:04:25.052235 ZEBRA: [MM61M-TQZNP] alloc_srv6_sid_func_explicit: elib s 30000 e 31000 wlib s 31000 ewlib s 30000 e 31000 SID_FUNC 65056
2025/03/10 22:04:25.052236 ZEBRA: [YHMRC-EMYNX] alloc_srv6_sid_func_explicit: function 65056 is outside ELIB [30000/31000] and EWLIB alloc ranges [30000/31000]
2025/03/10 22:04:25.052254 ZEBRA: [XSG8X-Q2XJX] get_srv6_sid_explicit: invalid SM request arguments: failed to allocate SID function 65056 from block fcbb:bbbb::/32
2025/03/10 22:04:25.052257 ZEBRA: [YC52T-427SJ] srv6_manager_get_sid_internal: not got SRv6 SID for ctx End.DT6 vrf_id 4, sid_value=fcbb:bbbb:1:fe20::, locator_name=MAIN
root@rajasekarr:/tmp/topotests/static_srv6_sids.test_static_srv6_sids/r1#

Ticket :#
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
(cherry picked from commit 5a63cf4c0d1e7b84f59003877599c6575ba08a25)

7 weeks ago FRR Release 10.2.2 rc/10.2 docker/10.2.2 frr-10.2.2
Jafar Al-Gharaibeh [Mon, 10 Mar 2025 04:58:21 +0000 (23:58 -0500)]
FRR Release 10.2.2

Changelog:

bgpd
    Allow bfd to work if peer known but interface address not yet
    Apply route-map for aggregate before attribute comparison
    Do not ignore auto generated vrf instances when deleting
    Do not start bgp session if bgp identifier is not set
    Do not try to uninstall bfd session if the peer is not established
    Don't reuse nexthop variable in loop/switch
    Fix a bug in peer_allowas_in_set()
    Fix add label support to evpn ad routes
    Fix bfd with update-source in peer-group
    Fix bgp label evpn cid 1636504
    Fix bgp orf prefix-list json prefix
    Fix bgp peer solo option
    Fix bgp vrf instance creation from implicit
    Fix crash in bgp_labelpool
    Fix crash in displaying json orf prefix-list
    Fix deadlock in bgp_keepalive and master pthreads
    Fix duplicate bgp instance created with unified config
    Fix for local interface mac cache issue in 'bgp mac hash' table
    Fix import vrf creates multiple bgp instances
    Fix incorrect json in bgp_show_table_rd
    Fix memory leak in bgp_aggregate_install()
    Fix route-distinguisher in vrf leak json cmd
    Fix static analyzer issues around bgp pointer
    Fix table-map option
    Fix vty output of evpn route-target as4
    Fix wrong pthread event cancelling
    Remove dmed check not required in bestpath selection
    Request srv6 locator after zebra connection
    Reset bgp session only if it was a real bfd down event
    Respect allowas-in value from the source vrf's peer
    Simplify bgp_evpn_process_rt1 with label
    Update source address for bfd session
    Use igpmetric in bgp_aigp_metric_total()
    When bgp notices a change to shared_network inform bfd of it
    When removing the prefix list drop the pointer
    With suppress-fib-pending ensure withdrawal is sent
    Revert "Handle addpath capability using dynamic capabilities"
    Revert "Reinstall aggregated routes if using route-maps and it was changed"

isisd
    Add helper function to request srv6 locator information
    Allow full `no` form for `domain-password` and `area-password`
    Correct edge insertion into ted
    Request srv6 locator after zebra connection
    Show correct level information for `show isis interface detail json`

lib
    Clean up nexthop hashing mess
    Crash handlers must be allowed on threads
    Fix false context information for srv6 route
    Guard against padding garbage in zapi read
    Nb: call child destroy cbs when yang container is deleted

mgmtd
    Prevent use after free

nhrpd
    Fix don't consider incomplete l2 entry

ospf6d
    Fix use after free of router in ospfv3 abr route calculation.

pbrd
    Initialize structs used in hash_lookup

pimd
    Always write cand-rp group config even when rp is inactive
    Close autorp socket when not needed
    During prefix-list update, behave as pim_upstream_notjoined state (conformance issue)
    Explicitly ensure the rp src is bsr
    Fix autorp group joins
    Fix bsr rps timing out
    Fix dr election race on startup
    Fix for data packet loss when fhr is lhr and rp
    Fix for fhr mroute taking longer to age out
    Fix memory leak and assign allocation type
    Fix pim vrf support (send register/register stop in vrf)
    Fix pim6 mld vrf support (use recvmsg() pktinfo)
    Fix vrf binding of autorp and mroute socket

tests
    Add a test that shows the v6 recursive nexthop problem
    Bgp_srv6_sid_reachability should give more time
    Bgp_srv6l3vpn_to_bgp_vrf3 needs more time
    Check if allow as-in works when importing between local vrfs

tools
    Add missing formats keyword to segment-routing in frr-reload
    Add missing rpki keyword to vrf in frr-reload
    Fix frr-reload for ebgp-multihop ttl reconfiguration.

zebra
    Ensure dplane does not send work back to master at wrong time
    Evpn svd hash avoid double free
    Fix leaked nhe
    Fix resetting valid flags for nhg dependents
    Guard against junk in nexthop->rmap_src
    Include resolving nexthops in nhg hash

Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>

7 weeks ago Merge pull request #18333 from FRRouting/mergify/bp/stable/10.2/pr-18315
Donald Sharp [Fri, 7 Mar 2025 01:32:39 +0000 (20:32 -0500)]
Merge pull request #18333 from FRRouting/mergify/bp/stable/10.2/pr-18315

pimd: Fix PIM6 MLD VRF support (use recvmsg() pktinfo) (backport #18315)

7 weeks ago pimd: Fix PIM6 MLD VRF support (use recvmsg() pktinfo) 18333/head
Martin Buck [Tue, 4 Mar 2025 13:24:33 +0000 (14:24 +0100)]
pimd: Fix PIM6 MLD VRF support (use recvmsg() pktinfo)

When receiving MLD messages, prefer pktinfo over msghdr.msg_name for
determining the source interface. The latter is just the VRF master
interface in case of VRF and we need the true interface the packet was
received on instead.
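
Reading the receiving interface from the packet-info control message can be sketched as below. This is a minimal illustration, not the pimd code: `ifindex_from_pktinfo()` is a hypothetical helper, the socket is assumed to have been set up to deliver `IPV6_PKTINFO` ancillary data, and `_GNU_SOURCE` is assumed to be needed to expose `struct in6_pktinfo` on glibc.

```c
#define _GNU_SOURCE /* assumed necessary for struct in6_pktinfo on glibc */
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Walk the ancillary data filled in by recvmsg() and return the ifindex
 * carried in an IPV6_PKTINFO control message, or -1 when there is none.
 */
int ifindex_from_pktinfo(struct msghdr *mh)
{
	struct cmsghdr *cmsg;

	for (cmsg = CMSG_FIRSTHDR(mh); cmsg; cmsg = CMSG_NXTHDR(mh, cmsg)) {
		if (cmsg->cmsg_level == IPPROTO_IPV6 &&
		    cmsg->cmsg_type == IPV6_PKTINFO) {
			struct in6_pktinfo pi;

			/* Copy out: CMSG_DATA() is not guaranteed to be aligned. */
			memcpy(&pi, CMSG_DATA(cmsg), sizeof(pi));
			return (int)pi.ipi6_ifindex;
		}
	}
	return -1;
}
```

A receiver would call this after recvmsg() and fall back to the address in msg_name only when it returns -1.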

Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
(cherry picked from commit 374c8dc4dbc8a560036fecdfb3213f690099b869)

8 weeks ago Merge pull request #18299 from FRRouting/mergify/bp/stable/10.2/pr-18294
Donald Sharp [Mon, 3 Mar 2025 15:35:07 +0000 (10:35 -0500)]
Merge pull request #18299 from FRRouting/mergify/bp/stable/10.2/pr-18294

isisd: Correct edge insertion into TED (backport #18294)

8 weeks ago isisd: Correct edge insertion into TED 18299/head
Olivier Dugeon [Mon, 3 Mar 2025 09:08:17 +0000 (10:08 +0100)]
isisd: Correct edge insertion into TED

Edges are not correctly linked to vertices during LSP processing. In function
lsp_to_edge_cb(), once the edge is created or updated from the LSP TLVs, the code
tries to link the edge to its destination vertex. If the reverse edge is not found,
the code tries to find a destination vertex to link to. However, the sys_id used
for this operation corresponds to the source vertex. As a result, the edge is
attached to the same vertex as both source and destination. When Traffic Engineering
is stopped, the TED is deleted, which results in a double free of the edge attributes.
This causes a crash when attempting to free the extended admin group the second time.

This patch removes the wrong code, which linked the edge to the source vertex twice.

Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
(cherry picked from commit 605fc1dd6404b6ed51691c647568939adde4962a)

8 weeks ago Merge pull request #18280 from FRRouting/mergify/bp/stable/10.2/pr-18264
Jafar Al-Gharaibeh [Sat, 1 Mar 2025 21:00:40 +0000 (15:00 -0600)]
Merge pull request #18280 from FRRouting/mergify/bp/stable/10.2/pr-18264

mgmtd: Prevent use after free (backport #18264)

8 weeks ago mgmtd: Prevent use after free 18280/head
Donald Sharp [Wed, 26 Feb 2025 17:34:05 +0000 (12:34 -0500)]
mgmtd: Prevent use after free

CI is picking up this use after free on occasion:

    ERROR: AddressSanitizer: attempting to call malloc_usable_size() for pointer which is not owned: 0x6030001d94a0
        0 0x7fab994b7f04 in __interceptor_malloc_usable_size ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:119
        1 0x7fab994264f6 in __sanitizer::BufferedStackTrace::Unwind(unsigned long, unsigned long, void*, bool, unsigned int) ../../../../src/libsanitizer/sanitizer_common/sanitizer_stacktrace.h:131
        2 0x7fab994264f6 in __asan::asan_malloc_usable_size(void const*, unsigned long, unsigned long) ../../../../src/libsanitizer/asan/asan_allocator.cpp:1058
        3 0x7fab99039bcf in mt_count_free lib/memory.c:78
        4 0x7fab99039bcf in qfree lib/memory.c:130
        5 0x7fab98ff971a in hash_clean lib/hash.c:290
        6 0x56110cdb0e7f in mgmt_txn_hash_destroy mgmtd/mgmt_txn.c:1881
        7 0x56110cdb0e7f in mgmt_txn_destroy mgmtd/mgmt_txn.c:2013
        8 0x56110cd8e5de in mgmt_terminate mgmtd/mgmt.c:91
        9 0x56110cd8e003 in sigint mgmtd/mgmt_main.c:90
        10 0x7fab990bf4b0 in frr_sigevent_process lib/sigevent.c:117
        11 0x7fab990ea7a1 in event_fetch lib/event.c:1740
        12 0x7fab9901a24e in frr_run lib/libfrr.c:1245
        13 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
        14 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
        15 0x7fab98af9304 in __libc_start_main_impl ../csu/libc-start.c:360
        16 0x56110cd8dd30 in _start (/usr/lib/frr/mgmtd+0x3ad30)

    0x6030001d94a0 is located 0 bytes inside of 24-byte region [0x6030001d94a0,0x6030001d94b8)
    freed by thread T0 here:
        0 0x7fab994b76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
        1 0x7fab99039bf0 in qfree lib/memory.c:131
        2 0x7fab98ff93e1 in hash_release lib/hash.c:227
        3 0x56110cdaabdc in mgmt_txn_unlock mgmtd/mgmt_txn.c:1931
        4 0x56110cdab049 in mgmt_txn_delete mgmtd/mgmt_txn.c:1841
        5 0x56110cdab0ce in mgmt_txn_hash_free mgmtd/mgmt_txn.c:1864
        6 0x7fab98ff970b in hash_clean lib/hash.c:288
        7 0x56110cdb0e7f in mgmt_txn_hash_destroy mgmtd/mgmt_txn.c:1881
        8 0x56110cdb0e7f in mgmt_txn_destroy mgmtd/mgmt_txn.c:2013
        9 0x56110cd8e5de in mgmt_terminate mgmtd/mgmt.c:91
        10 0x56110cd8e003 in sigint mgmtd/mgmt_main.c:90
        11 0x7fab990bf4b0 in frr_sigevent_process lib/sigevent.c:117
        12 0x7fab990ea7a1 in event_fetch lib/event.c:1740
        13 0x7fab9901a24e in frr_run lib/libfrr.c:1245
        14 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
        15 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

    previously allocated by thread T0 here:
        0 0x7fab994b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
        1 0x7fab990392fd in qcalloc lib/memory.c:106
        2 0x7fab98ff8b4f in hash_get lib/hash.c:156
        3 0x56110cdb13ae in mgmt_txn_create_new mgmtd/mgmt_txn.c:1825
        4 0x56110cdb3b4d in mgmt_txn_notify_be_adapter_conn mgmtd/mgmt_txn.c:2212
        5 0x56110cd91178 in mgmt_be_adapter_conn_init mgmtd/mgmt_be_adapter.c:842
        6 0x7fab990ec6de in event_call lib/event.c:2019
        7 0x7fab9901a243 in frr_run lib/libfrr.c:1246
        8 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
        9 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

The only time that mgmt_txn_hash_free is called is in hash_clean.
There are other places where mgmt_txn_unlock/delete are called and
hash_release should be called.  Let's just notice when the transaction
is being deleted from within hash_clean and not call hash_release (since
we know it is being released already).
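
The idea can be sketched with a toy one-slot table. Everything here is invented for the illustration (`toy_hash`, `toy_txn`, the `in_hash_free` flag name); it is not the mgmtd or lib/hash code, only the control flow described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy one-slot "hash" that counts how often its bucket is released. */
struct toy_hash {
	void *slot;
	int releases;
};

struct toy_txn {
	struct toy_hash *owner;
};

/* Drop the bucket that refers to data. */
void toy_hash_release(struct toy_hash *h, void *data)
{
	if (h->slot == data)
		h->slot = NULL;
	h->releases++;
}

/* in_hash_free is true when the caller is the table's own clean-up walk. */
void toy_txn_delete(struct toy_txn *txn, bool in_hash_free)
{
	if (!in_hash_free)
		toy_hash_release(txn->owner, txn);
	free(txn);
}

/* Clean-up walk: frees the element, then drops its bucket exactly once. */
void toy_hash_clean(struct toy_hash *h)
{
	if (h->slot) {
		toy_txn_delete(h->slot, true);
		h->slot = NULL;
		h->releases++;
	}
}
```

Without the flag, the clean-up walk would release the bucket once through the delete path and once itself, which is the double release the sanitizer report points at.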

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 62f35c7bdb2a6364dd03ab120e7bb685dd317c24)

8 weeks ago Merge pull request #18266 from FRRouting/mergify/bp/stable/10.2/pr-18254
Jafar Al-Gharaibeh [Thu, 27 Feb 2025 16:12:52 +0000 (10:12 -0600)]
Merge pull request #18266 from FRRouting/mergify/bp/stable/10.2/pr-18254

ospf6d: Fix use after free of router in OSPFv3 ABR route calculation. (backport #18254)

2 months ago ospf6d: Fix use after free of router in OSPFv3 ABR route calculation. 18266/head
Acee Lindem [Mon, 24 Feb 2025 21:44:32 +0000 (21:44 +0000)]
ospf6d: Fix use after free of router in OSPFv3 ABR route calculation.

This PR fixes FRR issue https://github.com/FRRouting/frr/issues/18040. The
OSPFv3 route is locked during the ABR calculation since there are
scenarios under which it is freed. The OSPFv3 ABR computation is
sub-optimal and this PR doesn't attempt to rework it.

Signed-off-by: Acee Lindem <acee@lindem.com>
(cherry picked from commit 06af50eacec8660fada0d4fd5cd11f0ade4e3c6c)

2 months ago Merge pull request #18251 from nabahr/pr-18225-10.2-backport-fixed
Donald Sharp [Tue, 25 Feb 2025 15:21:16 +0000 (10:21 -0500)]
Merge pull request #18251 from nabahr/pr-18225-10.2-backport-fixed

pim: Fix autorp group joins (backport #18225)

2 months ago Merge pull request #18252 from nabahr/pr-18226-10.2-backport-fixed
Donald Sharp [Tue, 25 Feb 2025 15:20:46 +0000 (10:20 -0500)]
Merge pull request #18252 from nabahr/pr-18226-10.2-backport-fixed

pim: Fix vrf binding of autorp and mroute socket (backport #18226)

2 months ago Merge pull request #18249 from FRRouting/mergify/bp/stable/10.2/pr-18216
Jafar Al-Gharaibeh [Mon, 24 Feb 2025 21:33:54 +0000 (15:33 -0600)]
Merge pull request #18249 from FRRouting/mergify/bp/stable/10.2/pr-18216

pimd: Fix PIM VRF support (send register/register stop in VRF) (backport #18216)

2 months ago pim: Fix vrf binding of autorp and mroute socket 18252/head
Nathan Bahr [Mon, 24 Feb 2025 20:23:52 +0000 (20:23 +0000)]
pim: Fix vrf binding of autorp and mroute socket

Bind the autorp socket to the vrf device.
Also fixed mroute socket to use vrf_bind instead of directly
setting the socket option.

Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
(cherry picked from commit 7e181a771c2e525aeda6e8f6c2d58e9ee2503949)

Fixed merge conflicts

2 months ago pim: Fix autorp group joins 18251/head
Nathan Bahr [Mon, 24 Feb 2025 20:02:54 +0000 (20:02 +0000)]
pim: Fix autorp group joins

Group joining got broken when moving the autorp socket to open/close
as needed. This fixes it so autorp group joining is properly handled
as part of opening the socket.

Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
(cherry picked from commit d840560b74e3a6117aa1e4b1203dcdd8fb254ef6)

Fixed merge conflicts for backport

2 months ago pimd: Fix PIM VRF support (send register/register stop in VRF) 18249/head
Martin Buck [Fri, 21 Feb 2025 07:54:49 +0000 (08:54 +0100)]
pimd: Fix PIM VRF support (send register/register stop in VRF)

In 946195391406269003275850e1a4d550ea8db38b and
8ebcc02328c6b63ecf85e44fdfbf3365be27c127, transmission of PIM register and
register stop messages was changed to use a separate socket. However, that
socket is not bound to a possible VRF, so the messages were sent in the
default VRF instead. Call vrf_bind() once after socket creation and when the
VRF is ready to ensure transmission in the correct VRF. vrf_bind() handles
the non-VRF case (i.e. VRF_DEFAULT) automatically, so it may be called
unconditionally.

Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
(cherry picked from commit 5a01011e0d2db538a8ba523904bd4f08b786edfb)

2 months ago Merge pull request #18228 from FRRouting/mergify/bp/stable/10.2/pr-18210
Jafar Al-Gharaibeh [Sat, 22 Feb 2025 20:13:08 +0000 (14:13 -0600)]
Merge pull request #18228 from FRRouting/mergify/bp/stable/10.2/pr-18210

bgpd: remove dmed check not required in bestpath selection (backport #18210)

2 months ago bgpd: remove dmed check not required in bestpath selection 18228/head
Donald Sharp [Thu, 20 Feb 2025 19:28:15 +0000 (14:28 -0500)]
bgpd: remove dmed check not required in bestpath selection

As part of the upstream master commit (f3575f61c7 bgpd: Sort the
bgp_path_inf) the code snippet for the dmed check condition was
left behind, which leads to selecting an incorrect bestpath.

As an example:

During bestpath selection, the local route loses to another path because
the dmed condition is hit.

The snippet of the logs:

2025/02/20 03:06:20.131441 BGP: [JW7VP-K1YVV] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): Comparing path 27.0.0.7 flags Valid  with path Static announcement flags Selected Valid Attr Changed Unsorted
2025/02/20 03:06:20.131445 BGP: [SYTDR-QV6X9] [2]:[0]:[48]:[00:92:00:00:00:10]: path 27.0.0.7 loses to path Static announcement as ES 03:44:38:39:ff:ff:02:00:00:01 is same and local
2025/02/20 03:06:20.131452 BGP: [JW7VP-K1YVV] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): Comparing path 27.0.0.8 flags Valid  with path Static announcement flags Selected Valid Attr Changed Unsorted
2025/02/20 03:06:20.131456 BGP: [SYTDR-QV6X9] [2]:[0]:[48]:[00:92:00:00:00:10]: path 27.0.0.8 loses to path Static announcement as ES 03:44:38:39:ff:ff:02:00:00:01 is same and local
2025/02/20 03:06:20.131458 BGP: [WEWEC-8SE72] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): path Static announcement is the bestpath from AS 0   <<<< static is best
2025/02/20 03:06:20.131463 BGP: [Z3A78-GM3G5] bgp_best_selection: [2]:[0]:[48]:[00:92:00:00:00:10](VRF default) pi 27.0.0.7 dmed
2025/02/20 03:06:20.131467 BGP: [Z3A78-GM3G5] bgp_best_selection: [2]:[0]:[48]:[00:92:00:00:00:10](VRF default) pi 27.0.0.8 dmed
2025/02/20 03:06:20.131471 BGP: [N6CTF-2RSKS] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): After path selection, newbest is path 27.0.0.7 oldbest was Static announce

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 83ad94694bc061e1ff5f43db42cba46320e0df73)

2 months ago Merge pull request #18208 from FRRouting/mergify/bp/stable/10.2/pr-17666
Donald Sharp [Thu, 20 Feb 2025 21:19:23 +0000 (16:19 -0500)]
Merge pull request #18208 from FRRouting/mergify/bp/stable/10.2/pr-17666

pimd: During prefix-list update, behave as PIM_UPSTREAM_NOTJOINED sta… (backport #17666)

2 months ago Merge pull request #18204 from FRRouting/mergify/bp/stable/10.2/pr-14227
Donald Sharp [Thu, 20 Feb 2025 19:20:10 +0000 (14:20 -0500)]
Merge pull request #18204 from FRRouting/mergify/bp/stable/10.2/pr-14227

pimd: Fix for data packet loss when FHR is LHR and RP (backport #14227)

2 months ago pimd: During prefix-list update, behave as PIM_UPSTREAM_NOTJOINED state (conformance... 18208/head
Rajesh Varatharaj [Wed, 21 Jun 2023 17:59:12 +0000 (10:59 -0700)]
pimd: During prefix-list update, behave as PIM_UPSTREAM_NOTJOINED state (conformance issue)

Issue:
If there are any changes to the prefix list, we perform a re-lookup to map the correct RP for the group,
even if the S,G entry is in PIM_UPSTREAM_NOTJOINED state on the FHR. (In the case of IGMPv3, an S,G entry
can be created with no joins.) This re-lookup is not necessary:
https://www.rfc-editor.org/rfc/rfc4601#section-4.5.7 specifies no action in the NOTJOINED state.

Solution:
Stop RP mapping when the state is NOTJOINED.

Ticket: #3496931

Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>
(cherry picked from commit 51f26d17da288af44a8a0e536dbe317a7e678514)

2 months ago pimd: Fix for data packet loss when FHR is LHR and RP 18204/head
Rajesh Varatharaj [Thu, 17 Aug 2023 20:11:42 +0000 (13:11 -0700)]
pimd: Fix for data packet loss when FHR is LHR and RP

Topology:
A single router is acting as the First Hop Router (FHR), Last Hop Router (LHR), and RP.

RC and Issue:
When an upstream S,G is in join state, it sends a register message to the RP.
If the RP has the receiver, it sends a register stop message and switches to the shortest path.
When the register stop message is processed, it removes pimreg, moves to prune,
and starts the reg stop timer.

When the reg stop timer expires, PIM changes S,G state to Join Pending and sends out a NULL
register message to RP. RP receives it and fails to send Reg stop because SPT is not set at that point.

The problem is when the register stop timer pops and state is in Join Pending.
According to https://www.rfc-editor.org/rfc/rfc4601#section-4.4.1,
we need to put back the pimreg reg tunnel into the S,G mroute.
This causes data to be sent to the control plane and subsequently interrupts the line rate.

Fix:
If the router is FHR and RP to the group,
ignore SPT status and send out a register stop message back to the DR (in this context, the same router).

Ticket: #3506780

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>
(cherry picked from commit 8280257cc99e071c205e469399f2fb41671b30eb)

2 months ago Merge pull request #18200 from FRRouting/revert-18155-mergify/bp/stable/10.2/pr-18121
Jafar Al-Gharaibeh [Wed, 19 Feb 2025 19:00:45 +0000 (13:00 -0600)]
Merge pull request #18200 from FRRouting/revert-18155-mergify/bp/stable/10.2/pr-18121

Revert "bgpd: release manual vpn label on instance deletion (backport #18121)"

2 months ago Revert "bgpd: release manual vpn label on instance deletion (backport #18121)" 18200/head
Donald Sharp [Wed, 19 Feb 2025 16:22:03 +0000 (11:22 -0500)]
Revert "bgpd: release manual vpn label on instance deletion (backport #18121)"

2 months ago Merge pull request #18155 from FRRouting/mergify/bp/stable/10.2/pr-18121
Russ White [Tue, 18 Feb 2025 15:28:10 +0000 (10:28 -0500)]
Merge pull request #18155 from FRRouting/mergify/bp/stable/10.2/pr-18121

bgpd: release manual vpn label on instance deletion (backport #18121)

2 months ago Merge pull request #18192 from FRRouting/mergify/bp/stable/10.2/pr-18082
Jafar Al-Gharaibeh [Tue, 18 Feb 2025 05:14:27 +0000 (23:14 -0600)]
Merge pull request #18192 from FRRouting/mergify/bp/stable/10.2/pr-18082

lib: nb: call child destroy CBs when YANG container is deleted (backport #18082)

2 months ago lib: nb: call child destroy CBs when YANG container is deleted 18192/head
Christian Hopps [Tue, 11 Feb 2025 07:12:06 +0000 (07:12 +0000)]
lib: nb: call child destroy CBs when YANG container is deleted

Previously the code was only calling the child destroy callbacks if the target
deleted node was a non-presence container. We now add a flag to the callback
structure to instruct northbound to perform the recursive delete for code that
wishes for this to happen.

- Fix wrong relative path lookup in keychain destroy callback

Signed-off-by: Christian Hopps <chopps@labn.net>
(cherry picked from commit d03ecf4562ef3ade6b7b83bf6c683c4741f395ba)

2 months ago Merge pull request #18180 from FRRouting/mergify/bp/stable/10.2/pr-18178
Donatas Abraitis [Sun, 16 Feb 2025 16:21:59 +0000 (18:21 +0200)]
Merge pull request #18180 from FRRouting/mergify/bp/stable/10.2/pr-18178

isisd: Request SRv6 locator after zebra connection (backport #18178)

2 months ago Merge pull request #18184 from FRRouting/mergify/bp/stable/10.2/pr-18109
Donald Sharp [Sun, 16 Feb 2025 13:09:57 +0000 (08:09 -0500)]
Merge pull request #18184 from FRRouting/mergify/bp/stable/10.2/pr-18109

bgpd: fix vty output of evpn route-target AS4 (backport #18109)

2 months ago bgpd: fix vty output of evpn route-target AS4 18184/head
Mark Stapp [Tue, 11 Feb 2025 19:35:28 +0000 (14:35 -0500)]
bgpd: fix vty output of evpn route-target AS4

evpn route-targets are decoded in  ... multiple places; at least
two have a bug where the AS4 form doesn't have its AS decoded.
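
For reference, decoding both route-target forms can be sketched as below. This is not the bgpd code: `rt_to_str()` and the `TOY_` constants are hypothetical names. The layout assumed is the standard one, with type 0x00 carrying a 2-octet AS plus a 4-octet value and type 0x02 (RFC 5668) carrying a 4-octet AS plus a 2-octet value, both with sub-type 0x02 for route target.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define TOY_ECOMM_TYPE_AS2 0x00   /* two-octet AS specific */
#define TOY_ECOMM_TYPE_AS4 0x02   /* four-octet AS specific (RFC 5668) */
#define TOY_ECOMM_SUBTYPE_RT 0x02 /* route target */

/*
 * Format an 8-octet route-target extended community as "AS:value".
 * Returns 0 on success, -1 for a type or sub-type this sketch does not handle.
 */
int rt_to_str(const uint8_t ec[8], char *buf, size_t len)
{
	uint32_t as, val;

	if (ec[1] != TOY_ECOMM_SUBTYPE_RT)
		return -1;

	switch (ec[0]) {
	case TOY_ECOMM_TYPE_AS2:
		as = ((uint32_t)ec[2] << 8) | ec[3];
		val = ((uint32_t)ec[4] << 24) | ((uint32_t)ec[5] << 16) |
		      ((uint32_t)ec[6] << 8) | ec[7];
		break;
	case TOY_ECOMM_TYPE_AS4:
		/* The AS occupies four octets here, the value only two. */
		as = ((uint32_t)ec[2] << 24) | ((uint32_t)ec[3] << 16) |
		     ((uint32_t)ec[4] << 8) | ec[5];
		val = ((uint32_t)ec[6] << 8) | ec[7];
		break;
	default:
		return -1;
	}

	snprintf(buf, len, "%lu:%lu", (unsigned long)as, (unsigned long)val);
	return 0;
}
```

Treating an AS4 community with the 2-octet layout splits the AS across the wrong octets, which is the kind of mis-decoding the commit describes.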

Signed-off-by: Mark Stapp <mjs@cisco.com>
(cherry picked from commit 9943a08720ccbed87cd6938791066a0de94a92c6)

2 months ago isisd: Request SRv6 locator after zebra connection 18180/head
Carmine Scarpitta [Sat, 15 Feb 2025 09:39:40 +0000 (10:39 +0100)]
isisd: Request SRv6 locator after zebra connection

When SRv6 is enabled and an SRv6 locator is specified in the IS-IS
configuration, IS-IS may attempt to request SRv6 locator information from
zebra before the connection is fully established. If this occurs, the
request fails with the following error:

```
2025/02/14 21:41:20 ISIS: [HR66R-TWQYD][EC 100663302] srv6_manager_get_locator: invalid zclient socket
```

As a result, IS-IS is unable to obtain the locator information,
preventing SRv6 from working.

This commit fixes the issue by ensuring IS-IS requests SRv6 locator
information once the connection with zebra is successfully established.
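
The ordering can be sketched with a toy client that fires a callback once its socket is up. All names here (`toy_client`, `toy_request_locator()`, `toy_on_connected()`, `toy_connect()`) are invented for the illustration and do not correspond to the zclient API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy client: sock < 0 means "not connected yet". */
struct toy_client {
	int sock;
	void (*connected_cb)(struct toy_client *c);
	int requests_sent;
};

/* Fails when called before the connection exists, as in the log above. */
int toy_request_locator(struct toy_client *c)
{
	if (c->sock < 0)
		return -1; /* invalid socket */
	c->requests_sent++;
	return 0;
}

/* Connection-established hook: the request is issued from here. */
void toy_on_connected(struct toy_client *c)
{
	(void)toy_request_locator(c);
}

/* Record the new socket, then notify the registered hook. */
void toy_connect(struct toy_client *c, int fd)
{
	c->sock = fd;
	if (c->connected_cb)
		c->connected_cb(c);
}
```

Issuing the request from the connection hook means it is sent once the socket is valid, instead of failing at configuration time.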

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
(cherry picked from commit f02dba19d20b0a53645a439924e736155c8de63f)

2 months ago isisd: Add helper function to request SRv6 locator information
Carmine Scarpitta [Sat, 15 Feb 2025 09:39:30 +0000 (10:39 +0100)]
isisd: Add helper function to request SRv6 locator information

This commit adds a function that iterates over all IS-IS areas and asks
the SRv6 Manager for information about the configured locators.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
(cherry picked from commit 0b76fb3c133951c8d1203dbe7c2e5a4e1b67dffe)

2 months ago Merge pull request #18167 from FRRouting/mergify/bp/stable/10.2/pr-18160
Donald Sharp [Sat, 15 Feb 2025 14:14:18 +0000 (09:14 -0500)]
Merge pull request #18167 from FRRouting/mergify/bp/stable/10.2/pr-18160

bgpd: When removing the prefix list drop the pointer (backport #18160)

2 months ago bgpd: When removing the prefix list drop the pointer 18167/head
Donald Sharp [Fri, 14 Feb 2025 12:55:09 +0000 (07:55 -0500)]
bgpd: When removing the prefix list drop the pointer

We are very very rarely seeing this crash:

    0 0x7f36ba48e389 in prefix_list_apply_ext lib/plist.c:789
    1 0x55eff3fa4126 in subgroup_announce_check bgpd/bgp_route.c:2334
    2 0x55eff3fa858e in subgroup_process_announce_selected bgpd/bgp_route.c:3440
    3 0x55eff4016488 in subgroup_announce_table bgpd/bgp_updgrp_adv.c:808
    4 0x55eff401664e in subgroup_announce_route bgpd/bgp_updgrp_adv.c:861
    5 0x55eff40111df in peer_af_announce_route bgpd/bgp_updgrp.c:2223
    6 0x55eff3f884cb in bgp_announce_route_timer_expired bgpd/bgp_route.c:5892
    7 0x7f36ba4ec239 in event_call lib/event.c:2019
    8 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
    9 0x55eff3e668b7 in main bgpd/bgp_main.c:557
    10 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    11 0x7f36b9e2d304 in __libc_start_main_impl ../csu/libc-start.c:360
    12 0x55eff3e64a30 in _start (/home/ci/cibuild.1407/frr-source/bgpd/.libs/bgpd+0x2fda30)
0x608000037038 is located 24 bytes inside of 88-byte region [0x608000037020,0x608000037078)
freed by thread T0 here:
    0 0x7f36ba8b76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
    1 0x7f36ba439bd7 in qfree lib/memory.c:131
    2 0x7f36ba48d3a3 in prefix_list_free lib/plist.c:156
    3 0x7f36ba48d3a3 in prefix_list_delete lib/plist.c:247
    4 0x7f36ba48fbef in prefix_bgp_orf_remove_all lib/plist.c:1516
    5 0x55eff3f679c4 in bgp_route_refresh_receive bgpd/bgp_packet.c:2841
    6 0x55eff3f70bab in bgp_process_packet bgpd/bgp_packet.c:4069
    7 0x7f36ba4ec239 in event_call lib/event.c:2019
    8 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
    9 0x55eff3e668b7 in main bgpd/bgp_main.c:557
    10 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
previously allocated by thread T0 here:
    0 0x7f36ba8b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    1 0x7f36ba4392e4 in qcalloc lib/memory.c:106
    2 0x7f36ba48d0de in prefix_list_new lib/plist.c:150
    3 0x7f36ba48d0de in prefix_list_insert lib/plist.c:186
    4 0x7f36ba48d0de in prefix_list_get lib/plist.c:204
    5 0x7f36ba48f9df in prefix_bgp_orf_set lib/plist.c:1479
    6 0x55eff3f67ba6 in bgp_route_refresh_receive bgpd/bgp_packet.c:2920
    7 0x55eff3f70bab in bgp_process_packet bgpd/bgp_packet.c:4069
    8 0x7f36ba4ec239 in event_call lib/event.c:2019
    9 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
    10 0x55eff3e668b7 in main bgpd/bgp_main.c:557
    11 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Let's just stop saving the pointer in the peer->orf_plist
data structure.  There are other design problems, but at least let's
stop the crash from possibly happening.
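
One way to avoid holding a pointer that can be freed underneath the holder is to keep only the list name and look it up at use time. The sketch below illustrates that idea with a hypothetical fixed-size registry; `plist_sketch`, `peer_sketch` and the helper names are assumptions, not the lib/plist.c API, and the actual commit may differ.

```c
#include <stddef.h>
#include <string.h>

#define MAX_LISTS 4

/* Hypothetical registry standing in for the prefix-list table. */
struct plist_sketch {
	char name[32];
	int in_use;
};

static struct plist_sketch registry[MAX_LISTS];

static struct plist_sketch *plist_lookup(const char *name)
{
	for (size_t i = 0; i < MAX_LISTS; i++)
		if (registry[i].in_use && strcmp(registry[i].name, name) == 0)
			return &registry[i];
	return NULL;
}

static struct plist_sketch *plist_create(const char *name)
{
	for (size_t i = 0; i < MAX_LISTS; i++) {
		if (!registry[i].in_use) {
			strncpy(registry[i].name, name, sizeof(registry[i].name) - 1);
			registry[i].name[sizeof(registry[i].name) - 1] = '\0';
			registry[i].in_use = 1;
			return &registry[i];
		}
	}
	return NULL;
}

static void plist_delete(const char *name)
{
	struct plist_sketch *pl = plist_lookup(name);

	if (pl)
		pl->in_use = 0;
}

/* The peer keeps only the name, so a deleted list is seen as absent. */
struct peer_sketch {
	char orf_name[32];
};

static int peer_has_orf_filter(const struct peer_sketch *peer)
{
	return plist_lookup(peer->orf_name) != NULL;
}
```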

Fixes: #18138
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 3d43d7b78971520854903c11b6aec23754fdca34)

2 months agoMerge pull request #18147 from FRRouting/mergify/bp/stable/10.2/pr-18023
Jafar Al-Gharaibeh [Fri, 14 Feb 2025 01:04:59 +0000 (19:04 -0600)]
Merge pull request #18147 from FRRouting/mergify/bp/stable/10.2/pr-18023

lib: fix false context information for SRv6 route (backport #18023)

2 months agoMerge pull request #18144 from FRRouting/mergify/bp/stable/10.2/pr-18079
Jafar Al-Gharaibeh [Fri, 14 Feb 2025 01:03:33 +0000 (19:03 -0600)]
Merge pull request #18144 from FRRouting/mergify/bp/stable/10.2/pr-18079

bgpd: Fix crash in bgp_labelpool (backport #18079)

2 months agobgpd: release manual vpn label on instance deletion 18155/head
Louis Scalbert [Wed, 12 Feb 2025 12:49:50 +0000 (13:49 +0100)]
bgpd: release manual vpn label on instance deletion

When a BGP instance with a manually assigned VPN label is deleted, the
label is not released from the Zebra label registry. As a result,
reapplying a configuration with the same manual label leads to VPN
prefix export failures.

For example, with the following configuration:

> router bgp 65000 vrf BLUE
>  address-family ipv4 unicast
>   label vpn export <int>

Release zebra label registry on unconfiguration.

Fixes: d162d5f6f5 ("bgpd: fix hardset l3vpn label available in mpls pool")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit d6363625c35a99933bf60c9cf0b79627b468c9f7)

# Conflicts:
# bgpd/bgpd.c

2 months agolib: fix false context information for SRv6 route 18147/head
Philippe Guibert [Wed, 5 Feb 2025 08:52:59 +0000 (09:52 +0100)]
lib: fix false context information for SRv6 route

The seg6local route dumped by 'show ipv6 route' suggests that the USP
flavor is supported, which is not the case. The flavor is context
information, and for End, the context information should be empty.

> # show ipv6 route
> [..]
> I>* fc00:0:4::/128 [115/0] is directly connected, sr0, seg6local End USP, weight 1, 00:49:01

Fix this by suppressing the USP information from the output.

Fixes: e496b4203055 ("bgpd: prefix-sid srv6 l3vpn service tlv")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit 658bf0281d99461849453628ddc792ec424d0bd4)

2 months agobgpd: Fix crash in bgp_labelpool 18144/head
Donald Sharp [Mon, 10 Feb 2025 17:02:00 +0000 (12:02 -0500)]
bgpd: Fix crash in bgp_labelpool

The bgp labelpool code is grabbing the vpn policy data structure.
This vpn_policy has a pointer to the bgp data structure.  If
an item placed on the bgp label pool workqueue happens to sit
there for the microsecond or so and the operator issues a
`no router bgp...` command that corresponds to the vpn_policy
bgp pointer, when the workqueue is run it will crash because
the bgp pointer is now freed and something else owns it.

Modify the labelpool code to store the vrf id associated
with the request on the workqueue.  When you wake up
if the vrf id still has a bgp pointer allow the request
to continue, else drop it.
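
A minimal sketch of the pattern, assuming a small instance table keyed by VRF id. `instance_sketch`, `label_request` and `process_label_request` are hypothetical names, not the bgp_labelpool code.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_INSTANCES 4

/* Hypothetical instance table keyed by VRF id; not FRR's data structures. */
struct instance_sketch {
	uint32_t vrf_id;
	int present;
	int labels_assigned;
};

static struct instance_sketch instances[MAX_INSTANCES];

static struct instance_sketch *instance_lookup(uint32_t vrf_id)
{
	for (size_t i = 0; i < MAX_INSTANCES; i++)
		if (instances[i].present && instances[i].vrf_id == vrf_id)
			return &instances[i];
	return NULL;
}

/* The queued request carries an id, never a pointer to the instance. */
struct label_request {
	uint32_t vrf_id;
};

/* Returns 1 if the request was processed, 0 if it was dropped. */
static int process_label_request(const struct label_request *req)
{
	struct instance_sketch *inst = instance_lookup(req->vrf_id);

	if (inst == NULL)
		return 0; /* instance was removed while the request was queued */
	inst->labels_assigned++;
	return 1;
}
```

Because the lookup happens when the work item runs, a deleted instance results in a dropped request instead of a dereference of freed memory.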

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 14eac319e8ae9314f5270f871106a70c4986c60c)

2 months agoMerge pull request #18134 from FRRouting/mergify/bp/stable/10.2/pr-18120
Jafar Al-Gharaibeh [Thu, 13 Feb 2025 16:04:06 +0000 (10:04 -0600)]
Merge pull request #18134 from FRRouting/mergify/bp/stable/10.2/pr-18120

bgpd: fix incorrect JSON in bgp_show_table_rd (backport #18120)

2 months agoMerge pull request #18057 from FRRouting/mergify/bp/stable/10.2/pr-18048
Jafar Al-Gharaibeh [Thu, 13 Feb 2025 04:28:24 +0000 (22:28 -0600)]
Merge pull request #18057 from FRRouting/mergify/bp/stable/10.2/pr-18048

pimd: fix DR election race on startup (backport #18048)

2 months agobgpd: fix incorrect json in bgp_show_table_rd 18134/head
Louis Scalbert [Wed, 12 Feb 2025 11:50:42 +0000 (12:50 +0100)]
bgpd: fix incorrect json in bgp_show_table_rd

In bgp_show_table_rd(), the is_last argument is determined using the
expression "next == NULL" to check if the RD table is the last one. This
helps ensure proper JSON formatting.

However, if next is not NULL but is no longer associated with a BGP
table, the JSON output becomes malformed.

Update the condition to also verify the existence of the next bgp_dest
table.
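
The effect on the JSON separators can be sketched as below. This is a generalized illustration, not the bgpd code: `rd_node_sketch`, `is_last_with_table` and `dump_rds` are hypothetical, and the sketch scans all later nodes rather than only the next one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical RD node: has_table models whether a BGP table is attached. */
struct rd_node_sketch {
	const char *rd;
	int has_table;
};

/* True when no later node would produce output. */
static int is_last_with_table(const struct rd_node_sketch *nodes, size_t n,
			      size_t i)
{
	for (size_t j = i + 1; j < n; j++)
		if (nodes[j].has_table)
			return 0;
	return 1;
}

/* Appends s at offset off if it fits; returns the new offset. */
static size_t append(char *buf, size_t len, size_t off, const char *s)
{
	size_t n = strlen(s);

	if (off + n + 1 > len)
		return off; /* not enough room: skip rather than overflow */
	memcpy(buf + off, s, n + 1);
	return off + n;
}

/* Emits {"rd":{},...}, adding a comma only between emitted entries. */
static void dump_rds(const struct rd_node_sketch *nodes, size_t n, char *buf,
		     size_t len)
{
	size_t off = 0;

	if (len > 0)
		buf[0] = '\0';
	off = append(buf, len, off, "{");
	for (size_t i = 0; i < n; i++) {
		if (!nodes[i].has_table)
			continue;
		off = append(buf, len, off, "\"");
		off = append(buf, len, off, nodes[i].rd);
		off = append(buf, len, off, "\":{}");
		if (!is_last_with_table(nodes, n, i))
			off = append(buf, len, off, ",");
	}
	append(buf, len, off, "}");
}
```

With a plain "next == NULL" test, a trailing node without a table would leave a dangling comma after the last emitted entry.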

Fixes: 1ae44dfcba ("bgpd: unify 'show bgp' with RD with normal unicast bgp show")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit cf0269649cdd09b8d3f2dd8815caf6ecf9cdeef9)

2 months agoMerge pull request #18076 from opensourcerouting/fix/bfd_backports_10.2
Donald Sharp [Wed, 12 Feb 2025 17:58:15 +0000 (12:58 -0500)]
Merge pull request #18076 from opensourcerouting/fix/bfd_backports_10.2

bgp/bfd backports for stable/10.2

2 months agoMerge pull request #18124 from FRRouting/mergify/bp/stable/10.2/pr-18062
Donald Sharp [Wed, 12 Feb 2025 17:28:34 +0000 (12:28 -0500)]
Merge pull request #18124 from FRRouting/mergify/bp/stable/10.2/pr-18062

Cid 1636504 (backport #18062)

2 months agobgpd: fix bgp label evpn CID 1636504 18124/head
Philippe Guibert [Fri, 7 Feb 2025 14:49:10 +0000 (15:49 +0100)]
bgpd: fix bgp label evpn CID 1636504

The following static analysis warning can be seen:

> *** CID 1636504:    (ARRAY_VS_SINGLETON)
> /bgpd/bgp_evpn_mh.c: 1241 in bgp_evpn_type1_route_process()
> 1235            build_evpn_type1_prefix(&p, eth_tag, &esi, vtep_ip);
> 1236            /* Process the route. */
> 1237            if (attr) {
> 1238                    bgp_update(peer, (struct prefix *)&p, addpath_id, attr, afi, safi, ZEBRA_ROUTE_BGP,
> 1239                               BGP_ROUTE_NORMAL, &prd, &label, num_labels, 0, NULL);
> 1240            } else {
> >>>     CID 1636504:    (ARRAY_VS_SINGLETON)
> >>>     Passing "&label" to function "bgp_withdraw" which uses it as an array. This might corrupt or misinterpret adjacent memory locations.
> 1241                    bgp_withdraw(peer, (struct prefix *)&p, addpath_id, afi, safi, ZEBRA_ROUTE_BGP,
> 1242                                 BGP_ROUTE_NORMAL, &prd, &label, num_labels);
> 1243            }
> 1244            return 0;
> 1245     }
> 1246
> /bgpd/bgp_evpn_mh.c: 1238 in bgp_evpn_type1_route_process()
> 1232             * table
> 1233             */
> 1234            vtep_ip.s_addr = INADDR_ANY;
> 1235            build_evpn_type1_prefix(&p, eth_tag, &esi, vtep_ip);
> 1236            /* Process the route. */
> 1237            if (attr) {
> >>>     CID 1636504:    (ARRAY_VS_SINGLETON)
> >>>     Passing "&label" to function "bgp_update" which uses it as an array. This might corrupt or misinterpret adjacent memory locations.
> 1238                    bgp_update(peer, (struct prefix *)&p, addpath_id, attr, afi, safi, ZEBRA_ROUTE_BGP,
> 1239                               BGP_ROUTE_NORMAL, &prd, &label, num_labels, 0, NULL);
> 1240            } else {
> 1241                    bgp_withdraw(peer, (struct prefix *)&p, addpath_id, afi, safi, ZEBRA_ROUTE_BGP,
> 1242                                 BGP_ROUTE_NORMAL, &prd, &label, num_labels);
> 1243            }

Fix this by declaring a label array instead of a single label.
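
A minimal sketch of the idea, assuming a callee that walks its argument as an array. `label_sketch_t`, `sum_labels` and `process_one_label` are hypothetical and stand in for the real bgpd types and functions.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t label_sketch_t; /* hypothetical stand-in for a label type */

/* A callee that treats labels as an array of num_labels entries. */
static uint32_t sum_labels(const label_sketch_t *labels, size_t num_labels)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < num_labels; i++)
		sum += labels[i];
	return sum;
}

static uint32_t process_one_label(uint32_t value)
{
	/* Declared as a one-element array, so array use by the callee is exact. */
	label_sketch_t label[1] = { value };

	return sum_labels(label, sizeof(label) / sizeof(label[0]));
}
```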

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit ba462af2e3a4a242b2ffcde6074750f851632777)

2 months agobgpd: simplify bgp_evpn_process_rt1 with label
Philippe Guibert [Fri, 7 Feb 2025 14:40:29 +0000 (15:40 +0100)]
bgpd: simplify bgp_evpn_process_rt1 with label

Remove the num_labels variable; the called bgp_update() and
bgp_withdraw() functions will read the message as including one
label or VNI value.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit 82d28f137aed2e60380807a302e2b312408eff6e)

2 months agoMerge pull request #18113 from FRRouting/mergify/bp/stable/10.2/pr-18078
Donald Sharp [Wed, 12 Feb 2025 13:16:56 +0000 (08:16 -0500)]
Merge pull request #18113 from FRRouting/mergify/bp/stable/10.2/pr-18078

nhrpd: fix dont consider incomplete L2 entry (backport #18078)

2 months agoMerge pull request #18116 from FRRouting/mergify/bp/stable/10.2/pr-18069
Donald Sharp [Wed, 12 Feb 2025 13:14:35 +0000 (08:14 -0500)]
Merge pull request #18116 from FRRouting/mergify/bp/stable/10.2/pr-18069

bgpd: Request SRv6 locator after zebra connection (backport #18069)

2 months agobgpd: Request SRv6 locator after zebra connection 18116/head
Carmine Scarpitta [Sat, 8 Feb 2025 23:44:01 +0000 (00:44 +0100)]
bgpd: Request SRv6 locator after zebra connection

When SRv6 is enabled and an SRv6 locator is specified in the BGP
configuration, BGP may attempt to request SRv6 locator information from
zebra before the connection is fully established. If this occurs, the
request fails with the following error:

```
2025/02/06 16:37:32 BGP: [HR66R-TWQYD][EC 100663302] srv6_manager_get_locator: invalid zclient socket
```

As a result, BGP is unable to obtain the locator information,
preventing SRv6 VPN from working.

This commit fixes the issue by ensuring BGP requests SRv6 locator
information once the connection with zebra is successfully established.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
(cherry picked from commit 16640b615dfabfd8e18dd091b1d4a63dfa7bf9fe)

2 months agonhrpd: fix dont consider incomplete L2 entry 18113/head
Philippe Guibert [Mon, 10 Feb 2025 15:15:44 +0000 (16:15 +0100)]
nhrpd: fix dont consider incomplete L2 entry

Sometimes, NHRP receives L2 information on a cache entry with the
0.0.0.0 IP address. NHRP considers it as valid and updates the binding
with the new IP address.

> Feb 09 20:09:54 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x2 cache used 0 type 4
> Feb 09 20:10:35 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x4 cache used 1 type 4
> Feb 09 20:10:48 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: del-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x4 cache used 1 type 4
> Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: who-has 10.2.114.238 dev dmvpn1 lladdr (unspec) nud 0x1 cache used 1 type 4
> Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QVXNM-NVHEQ] Netlink: update binding for 10.2.114.238 dev dmvpn1 from c 162.251.180.10 peer.vc.nbma 162.251.180.10 to lladdr (unspec)
> Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 0.0.0.0 nud 0x2 cache used 1 type 4
> Feb 09 20:11:30 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 0.0.0.0 nud 0x4 cache used 1 type 4

Actually, the 0.0.0.0 IP address mentioned in the 'who-has' message is
wrong because the nud state value means that the entry is incomplete and
should not be handled as a valid entry. Instead of considering it, fix
this by invalidating the current binding. This step is necessary in
order to permit NHRP to trigger resolution requests again.
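
The decision can be sketched as follows. The NUD values match Linux <linux/neighbour.h> (0x1 incomplete, 0x2 reachable, 0x4 stale, as seen in the log above); the `SK_` constants, `binding_action` and `on_neighbor_event` are hypothetical names, not nhrpd code.

```c
#include <stdint.h>

/* Values as defined in Linux <linux/neighbour.h>, copied for portability. */
#define SK_NUD_INCOMPLETE 0x01
#define SK_NUD_REACHABLE  0x02
#define SK_NUD_STALE      0x04

enum binding_action { BINDING_UPDATE, BINDING_INVALIDATE };

/* An incomplete entry carries no usable L2 address: drop the binding. */
static enum binding_action on_neighbor_event(uint16_t nud_state)
{
	if (nud_state & SK_NUD_INCOMPLETE)
		return BINDING_INVALIDATE;
	return BINDING_UPDATE;
}
```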

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit 3202323052485d8138a3440e9c9907594ad99c57)

2 months agoMerge pull request #18102 from FRRouting/mergify/bp/stable/10.2/pr-18060
Jafar Al-Gharaibeh [Wed, 12 Feb 2025 02:52:43 +0000 (20:52 -0600)]
Merge pull request #18102 from FRRouting/mergify/bp/stable/10.2/pr-18060

lib: crash handlers must be allowed on threads (backport #18060)

2 months agoMerge pull request #18084 from FRRouting/mergify/bp/stable/10.2/pr-17901
Jafar Al-Gharaibeh [Tue, 11 Feb 2025 17:57:41 +0000 (11:57 -0600)]
Merge pull request #18084 from FRRouting/mergify/bp/stable/10.2/pr-17901

lib: actually hash all 16 bytes of IPv6 addresses, not just 4 (backport #17901)

2 months agoMerge pull request #18089 from FRRouting/mergify/bp/stable/10.2/pr-17935
Jafar Al-Gharaibeh [Tue, 11 Feb 2025 17:57:23 +0000 (11:57 -0600)]
Merge pull request #18089 from FRRouting/mergify/bp/stable/10.2/pr-17935

zebra: include resolving nexthops in nhg hash (backport #17935)

2 months agoMerge pull request #18100 from FRRouting/mergify/bp/stable/10.2/pr-18081
Russ White [Tue, 11 Feb 2025 17:28:52 +0000 (12:28 -0500)]
Merge pull request #18100 from FRRouting/mergify/bp/stable/10.2/pr-18081

bgpd: fix bgp vrf instance creation from implicit (backport #18081)

2 months agolib: crash handlers must be allowed on threads 18102/head
David Lamparter [Fri, 7 Feb 2025 12:22:25 +0000 (13:22 +0100)]
lib: crash handlers must be allowed on threads

Blocking all signals on non-main threads is not the way to go, at least
the handlers for SIGSEGV, SIGBUS, SIGILL, SIGABRT and SIGFPE need to run
so we get backtraces.  Otherwise the process just exits.
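
A minimal sketch of such a mask using the POSIX signal-set calls. `crash_safe_sigmask` and `block_non_crash_signals` are hypothetical helper names; the actual libfrr change may be structured differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Build a mask that blocks everything except the crash signals. */
static int crash_safe_sigmask(sigset_t *mask)
{
	static const int crash_signals[] = { SIGSEGV, SIGBUS, SIGILL,
					     SIGABRT, SIGFPE };

	if (sigfillset(mask) != 0)
		return -1;
	for (size_t i = 0; i < sizeof(crash_signals) / sizeof(crash_signals[0]); i++)
		if (sigdelset(mask, crash_signals[i]) != 0)
			return -1;
	return 0;
}

/* Apply the mask to the calling thread; a new thread would call this first. */
static int block_non_crash_signals(void)
{
	sigset_t mask;

	if (crash_safe_sigmask(&mask) != 0)
		return -1;
	return pthread_sigmask(SIG_BLOCK, &mask, NULL);
}
```

Ordinary signals stay on the main thread, while a fault on a worker thread still reaches the crash handler.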

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit 13a6ac5b4ca8fc08b348f64de64a787982f24250)

2 months agobgpd: fix bgp vrf instance creation from implicit 18100/head
Chirag Shah [Tue, 11 Feb 2025 02:56:15 +0000 (18:56 -0800)]
bgpd: fix bgp vrf instance creation from implicit

In bgp route leak, when 'import vrf x' is executed,
it creates a hidden bgp instance with an unspecified asn value.

When 'router bgp x' is configured, ensure the correct as and
asnotation are applied; otherwise the running config shows the asn value as 0.

This can lead to frr-reload failure on any FRR config change.

Fix:
Move the asn, asnotation and as_pretty values into the common done section,
so that when bgp_create gets an existing instance, the asn and required
fields are updated in the common section before returning.

In bgp_create(): when returning for a hidden instance, at least update the
asn and required fields when the bgp instance was created implicitly due to vrf leak.

if (hidden) {
    bgp = bgp_old;
    goto peer_init; <<<
}

Before fix:
show running:

router bgp 0 vrf purple
 bgp router-id 10.10.3.11
 !
 address-family ipv4 unicast
  redistribute static
  import vrf blue
 exit-address-family
 !
 address-family ipv6 unicast
  import vrf blue
 exit-address-family
 !
 address-family l2vpn evpn
  advertise ipv4 unicast
  advertise ipv6 unicast
 exit-address-family
exit

Testing:

1) following snippet config:
router bgp 63420 vrf blue
 import vrf purple
router bgp 63420 vrf purple
 import vrf blue
2) restart frr leads to the running config with 0 asn value.

Signed-off-by: Chirag Shah <chirag@nvidia.com>
(cherry picked from commit 2ff08af78e315c69795417d150cd23649f68c655)

2 months agotests: Add a test that shows the v6 recursive nexthop problem 18089/head
Donald Sharp [Mon, 27 Jan 2025 15:34:31 +0000 (10:34 -0500)]
tests: Add a test that shows the v6 recursive nexthop problem

Currently FRR does not handle v6 recursive resolution properly
when the route being recursed through changes and the most
significant bits of the route are not changed.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 73ab6a46c51db91df297774221053ab8fc4d12ae)

2 months agozebra: include resolving nexthops in nhg hash
Mark Stapp [Mon, 27 Jan 2025 19:17:24 +0000 (14:17 -0500)]
zebra: include resolving nexthops in nhg hash

Ensure that the nhg hash comparison function includes all
nexthops, including recursive-resolving nexthops.

Signed-off-by: Mark Stapp <mjs@cisco.com>
(cherry picked from commit cb7cf73992847cfd4af796085bf14f2fdc4fa8db)

2 months agolib: clean up nexthop hashing mess 18084/head
David Lamparter [Wed, 22 Jan 2025 10:23:31 +0000 (11:23 +0100)]
lib: clean up nexthop hashing mess

We were hashing 4 bytes of the address.  Even for IPv6 addresses.

Oops.

The reason this was done was to try to make it faster, but made a
complex maze out of everything.  Time for a refactor.
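
The bug class can be illustrated with a small sketch. FNV-1a is used only as a simple stand-in hash, and `addr_sketch` and `addr_hash` are hypothetical; this is not the lib/nexthop.c code.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, used here only as a simple stand-in hash; not FRR's hash. */
static uint32_t fnv1a(const uint8_t *data, size_t len)
{
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < len; i++) {
		h ^= data[i];
		h *= 16777619u;
	}
	return h;
}

enum addr_family_sketch { FAMILY_V4, FAMILY_V6 };

struct addr_sketch {
	enum addr_family_sketch family;
	uint8_t bytes[16]; /* IPv4 uses the first 4 bytes only */
};

/* Hash as many bytes as the address family actually uses. */
static uint32_t addr_hash(const struct addr_sketch *a)
{
	size_t len = (a->family == FAMILY_V6) ? 16 : 4;

	return fnv1a(a->bytes, len);
}
```

Hashing only 4 bytes would make two IPv6 addresses that share their leading bytes collide on every lookup.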

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit 001fcfa1dd9f7dc2639b4f5c7a52ab59cc425452)

2 months agolib: guard against padding garbage in ZAPI read
David Lamparter [Wed, 22 Jan 2025 10:19:04 +0000 (11:19 +0100)]
lib: guard against padding garbage in ZAPI read

When reading in a nexthop from ZAPI, only set the fields that actually
have meaning.  While it shouldn't happen to begin with, we can otherwise
carry padding garbage into the unused leftover union bytes.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit 4a0e1419a69d07496c7adfb744beecd00e1efef2)

2 months agozebra: guard against junk in nexthop->rmap_src
David Lamparter [Wed, 22 Jan 2025 10:17:21 +0000 (11:17 +0100)]
zebra: guard against junk in nexthop->rmap_src

rmap_src wasn't initialized, so for IPv4 the unused 12 bytes would
contain whatever junk is on the stack on function entry.  Also move
the IPv4 parse before the IPv6 parse so if it's successful we can be
sure the other bytes haven't been touched.
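
A sketch of the same idea using the standard inet_pton() call. `addr_union_sketch` and `parse_src` are hypothetical names, not the zebra route-map code.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical union: an IPv4 value uses 4 of the 16 bytes. */
union addr_union_sketch {
	struct in_addr v4;
	struct in6_addr v6;
};

/* Returns AF_INET, AF_INET6, or 0 on parse failure. */
static int parse_src(const char *str, union addr_union_sketch *out)
{
	/* Zero first so no stale bytes remain behind a 4-byte IPv4 value. */
	memset(out, 0, sizeof(*out));

	/* Try IPv4 before IPv6, as described above. */
	if (inet_pton(AF_INET, str, &out->v4) == 1)
		return AF_INET;
	if (inet_pton(AF_INET6, str, &out->v6) == 1)
		return AF_INET6;
	return 0;
}
```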

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit b666ee510eb480da50476b1bbc84bdf8365df95c)

2 months agopbrd: initialize structs used in hash_lookup
David Lamparter [Wed, 22 Jan 2025 10:16:10 +0000 (11:16 +0100)]
pbrd: initialize structs used in hash_lookup

Doesn't seem to break anything but really poor style to pass potentially
uninitialized data to hash_lookup.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit c88589f5e9351654c04322eb395003297656989d)

2 months agofpm: guard against garbage in unused address bytes
David Lamparter [Wed, 22 Jan 2025 10:15:17 +0000 (11:15 +0100)]
fpm: guard against garbage in unused address bytes

Zero out the 12 unused bytes (for the IPv6 address) when reading in an
IPv4 address.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit 95cf0b227980999e2af22a2c171e5237e5ffca8e)

2 months agobgpd: don't reuse nexthop variable in loop/switch
David Lamparter [Wed, 22 Jan 2025 10:13:21 +0000 (11:13 +0100)]
bgpd: don't reuse nexthop variable in loop/switch

While the loop is currently exited in all cases after using nexthop, it
is a footgun to have "nh" around to be reused in another iteration of
the loop.  This would leave nexthop with partial data from the previous
use.  Make it local where needed instead.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit ce7f5b21221f0b3557d1f4a40793230d8bc4cf02)

2 months agobgpd: Reset BGP session only if it was a real BFD DOWN event 18076/head
Donatas Abraitis [Tue, 5 Nov 2024 13:51:58 +0000 (15:51 +0200)]
bgpd: Reset BGP session only if it was a real BFD DOWN event

Without this patch we always see a double-reset, e.g.:

```
2024/11/04 12:42:43.010 BGP: [VQY9X-CQZKG] bgp_peer_bfd_update_source: address [0.0.0.0->172.18.0.3] to [172.18.0.2->172.18.0.3]
2024/11/04 12:42:43.010 BGP: [X8BD9-8RKN4] bgp_peer_bfd_update_source: interface none to eth0
2024/11/04 12:42:43.010 BFD: [MSVDW-Y8Z5Q] ptm-del-dest: deregister peer [mhop:no peer:172.18.0.3 local:0.0.0.0 vrf:default cbit:0x00 minimum-ttl:255]
2024/11/04 12:42:43.010 BFD: [NYF5K-SE3NS] ptm-del-session: [mhop:no peer:172.18.0.3 local:0.0.0.0 vrf:default] refcount=0
2024/11/04 12:42:43.010 BFD: [NW21R-MRYNT] session-delete: mhop:no peer:172.18.0.3 local:0.0.0.0 vrf:default
2024/11/04 12:42:43.010 BGP: [P3D3N-3277A] 172.18.0.3 [FSM] Timer (routeadv timer expire)
2024/11/04 12:42:43.010 BFD: [YA0Q5-C0BPV] control-packet: no session found [mhop:no peer:172.18.0.3 local:172.18.0.2 port:11]
2024/11/04 12:42:43.010 BFD: [MSVDW-Y8Z5Q] ptm-add-dest: register peer [mhop:no peer:172.18.0.3 local:172.18.0.2 vrf:default cbit:0x00 minimum-ttl:255]
2024/11/04 12:42:43.011 BFD: [PSB4R-8T1TJ] session-new: mhop:no peer:172.18.0.3 local:172.18.0.2 vrf:default ifname:eth0
2024/11/04 12:42:43.011 BGP: [Q4BCV-6FHZ5] zclient_bfd_session_update: 172.18.0.2/32 -> 172.18.0.3/32 (interface eth0) VRF default(0) (CPI bit no): Down
2024/11/04 12:42:43.011 BGP: [MKVHZ-7MS3V] bfd_session_status_update: neighbor 172.18.0.3 vrf default(0) bfd state Up -> Down
2024/11/04 12:42:43.011 BGP: [HZN6M-XRM1G] %NOTIFICATION: sent to neighbor 172.18.0.3 6/10 (Cease/BFD Down) 0 bytes
2024/11/04 12:42:43.011 BGP: [QFMSE-NPSNN] zclient_bfd_session_update:   sessions updated: 1
2024/11/04 12:42:43.011 BGP: [ZWCSR-M7FG9] 172.18.0.3 [FSM] BGP_Stop (Established->Clearing), fd 22
```

Reset is due to the source address change.

With this patch, we reset the session only if it's a _REAL_ BFD down event, which
means we trigger session reset if BFD session is established earlier than BGP.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>

2 months agobgpd: Update source address for BFD session
Donatas Abraitis [Tue, 12 Nov 2024 11:09:09 +0000 (13:09 +0200)]
bgpd: Update source address for BFD session

If BFD is down, we should try to detect the source automatically from the given
interface.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>

2 months agobgpd: Allow bfd to work if peer known but interface address not yet
Donald Sharp [Wed, 20 Nov 2024 21:07:34 +0000 (16:07 -0500)]
bgpd: Allow bfd to work if peer known but interface address not yet

If bgp is coming up and bgp has not received the interface address yet
but bgp has knowledge about a bfd peering, allow it to set the peering
data appropriately.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>

2 months agobgpd: When bgp notices a change to shared_network inform bfd of it
Donald Sharp [Thu, 5 Dec 2024 15:16:03 +0000 (10:16 -0500)]
bgpd: When bgp notices a change to shared_network inform bfd of it

When bgp is started up and reads the config in *before* it has
received interface addresses from zebra, shared_network can
be set to false.  Later on, once bgp attempts to reconnect,
it will figure out shared_network again (because it has
received the data from zebra by now).  In this case
tell bfd about it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>

2 months agobgpd: fix bfd with update-source in peer-group
Louis Scalbert [Wed, 22 Jan 2025 12:30:55 +0000 (13:30 +0100)]
bgpd: fix bfd with update-source in peer-group

Fix the BFD session not being created when the peer is in a peer-group
with the update-source option.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>

2 months agoMerge pull request #18054 from FRRouting/mergify/bp/stable/10.2/pr-14105
Donatas Abraitis [Fri, 7 Feb 2025 14:10:59 +0000 (16:10 +0200)]
Merge pull request #18054 from FRRouting/mergify/bp/stable/10.2/pr-14105

pimd: Fix for FHR mroute taking longer to age out (backport #14105)

2 months agopimd: fix DR election race on startup 18057/head
Rafael Zalamena [Thu, 6 Feb 2025 22:28:50 +0000 (19:28 -0300)]
pimd: fix DR election race on startup

In case interface address is learnt during configuration, make sure to
run DR election when configuring PIM/PIM passive on interface.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
(cherry picked from commit 86445246062583197d4a6dff7b8c74003cd8049d)

2 months agopimd: Fix for FHR mroute taking longer to age out 18054/head
Rajesh Varatharaj [Thu, 27 Jul 2023 06:57:04 +0000 (23:57 -0700)]
pimd: Fix for FHR mroute taking longer to age out

Issue:
When there is no traffic for a group, the LHR and RP take the default KAT+Join timer expiry of
a maximum of 480 seconds to clear the S,G. However, in the FHR, we update the state from JOINED
to NOT Joined, and the downstream state from PP to NOINFO.  This restarts the ET timer, causing the S,G on the FHR to
take more than 10 minutes to age out.

In other words,
Consider a case where (S,G) is in Join state. When the traffic stops and the KAT (210) expires,
 the Join expiry timer restarts. At this time, if we receive a prune, the expectation is to set
 PPT to 0 (RFC 4601 sec 4.5.2).
 When the PPT expires, we move to the noinfo state and restart the expiry timer one more time. We remove the
 (S,G) entry only after ~10 minutes when there is no active traffic.

Summary:
KAT Join ET 210 + PP ET 210 + NOINFO ET 210.

Solution:
Delete the ifchannel when in noinfo state, and KAT is not running.

Ticket: #13703

Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>
(cherry picked from commit afed39ea2be25bf30d50ac49b4edf424deadcb17)

2 months agoMerge pull request #18044 from FRRouting/mergify/bp/stable/10.2/pr-18038
Jafar Al-Gharaibeh [Thu, 6 Feb 2025 22:55:32 +0000 (16:55 -0600)]
Merge pull request #18044 from FRRouting/mergify/bp/stable/10.2/pr-18038

pimd: fix memory leak and assign allocation type (backport #18038)

2 months agopimd: fix memory leak and assign allocation type 18044/head
Rafael Zalamena [Thu, 6 Feb 2025 13:14:55 +0000 (10:14 -0300)]
pimd: fix memory leak and assign allocation type

Use a memory allocation specific type for filter names (to help detect memory
leaks) and fix a memory leak when releasing peer memory.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
(cherry picked from commit d1440dadffe90dc743c5b83126b021d7a4a08766)

2 months agoMerge pull request #18015 from opensourcerouting/fix/backport_5330a41c1e41fc2bd042447...
Jafar Al-Gharaibeh [Wed, 5 Feb 2025 05:19:15 +0000 (23:19 -0600)]
Merge pull request #18015 from opensourcerouting/fix/backport_5330a41c1e41fc2bd0424474d210232ffbed8b5f_10.2

bgpd: Do not start BGP session if BGP identifier is not set (backport)

2 months agobgpd: Do not start BGP session if BGP identifier is not set 18015/head
Donatas Abraitis [Wed, 29 Jan 2025 21:03:06 +0000 (23:03 +0200)]
bgpd: Do not start BGP session if BGP identifier is not set

If we have an IPv6-only network and no IPv4 addresses at all, then by default
a BGP identifier of 0.0.0.0 is created, which is treated as malformed according to RFC 6286.
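
A minimal sketch of the guard, assuming the identifier is held as a 32-bit integer. RFC 6286 requires the BGP Identifier to be non-zero; `bgp_id_is_set` and `may_start_session` are hypothetical names, not bgpd functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* An all-zero identifier (0.0.0.0) means "not set yet". */
static bool bgp_id_is_set(uint32_t router_id)
{
	return router_id != 0;
}

/* Hold the session back until a usable identifier exists. */
static bool may_start_session(uint32_t router_id)
{
	return bgp_id_is_set(router_id);
}
```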

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 739f2b566a8217acce84d4c21aaf033314f535bb)

2 months agoMerge pull request #18002 from FRRouting/mergify/bp/stable/10.2/pr-17969
Donald Sharp [Tue, 4 Feb 2025 20:18:41 +0000 (15:18 -0500)]
Merge pull request #18002 from FRRouting/mergify/bp/stable/10.2/pr-17969

zebra: Ensure dplane does not send work back to master at wrong time (backport #17969)

2 months agoMerge pull request #18007 from FRRouting/mergify/bp/stable/10.2/pr-17985
Russ White [Tue, 4 Feb 2025 16:46:04 +0000 (11:46 -0500)]
Merge pull request #18007 from FRRouting/mergify/bp/stable/10.2/pr-17985

bgpd: fix add label support to EVPN AD routes (backport #17985)

2 months agobgpd: fix add label support to EVPN AD routes 18007/head
Philippe Guibert [Mon, 3 Feb 2025 13:49:53 +0000 (14:49 +0100)]
bgpd: fix add label support to EVPN AD routes

When peering with an EVPN device from another vendor, FRR acting as route
reflector is not able to read or transmit the label value.

Actually, EVPN AD routes completely ignore the label value in the code,
whereas in some functionalities like evpn-vpws, it is authorised to
carry and propagate label value.

Fix this by handling the label value.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit 1cefe94dd3991e7e606df8da6b59707295218e55)

2 months agozebra: Ensure dplane does not send work back to master at wrong time 18002/head
Donald Sharp [Fri, 31 Jan 2025 17:38:20 +0000 (12:38 -0500)]
zebra: Ensure dplane does not send work back to master at wrong time

When looping through the dplane providers, the worklist was
being populated with items from the last provider and then
the event system was checked to see if we should stop processing.
If the event system says `yes` then the dplane code would stop
and send the worklist to the master zebra pthread for collection.
This obviously skipped the next dplane provider on the list
which is double plus not good.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit c41155221e7fb7890fecc37f1685063dce6caaca)

2 months agoMerge pull request #17994 from FRRouting/mergify/bp/stable/10.2/pr-17991
Jafar Al-Gharaibeh [Tue, 4 Feb 2025 16:15:34 +0000 (10:15 -0600)]
Merge pull request #17994 from FRRouting/mergify/bp/stable/10.2/pr-17991

zebra: fix evpn svd hash avoid double free (backport #17991)

2 months agoMerge pull request #17997 from FRRouting/mergify/bp/stable/10.2/pr-17992
Jafar Al-Gharaibeh [Tue, 4 Feb 2025 16:14:57 +0000 (10:14 -0600)]
Merge pull request #17997 from FRRouting/mergify/bp/stable/10.2/pr-17992

bgpd: fix route-distinguisher in vrf leak json cmd (backport #17992)

2 months agobgpd: fix route-distinguisher in vrf leak json cmd 17997/head
Chirag Shah [Mon, 3 Feb 2025 20:00:41 +0000 (12:00 -0800)]
bgpd: fix route-distinguisher in vrf leak json cmd

For an auto-configured RD, the RD value comes as NULL.
Switching back to the original change ensures that both
auto-configured and user-configured RD values are covered in JSON.

tor-11# show bgp vrf blue ipv4 unicast route-leak json
{
  "vrf":"blue",
  "afiSafi":"ipv4Unicast",
  "importFromVrfs":[
    "purple"
  ],
  "importRts":"10.10.3.11:6",
  "exportToVrfs":[
    "purple"
  ],
  "routeDistinguisher":"(null)", <<<<<
  "exportRts":"10.10.3.11:10"
}

Signed-off-by: Chirag Shah <chirag@nvidia.com>
(cherry picked from commit 892704d07f5286464728720648ad392b485a9966)

2 months agozebra: evpn svd hash avoid double free 17994/head
Chirag Shah [Fri, 31 Jan 2025 01:26:46 +0000 (17:26 -0800)]
zebra: evpn svd hash avoid double free

Upon zebra shutdown, hash_clean_and_free is called
with a user free function.
The free function should not call hash_release,
which leads to a double free of the hash bucket.

Fix:
Avoid calling hash_release from the free function
if it is called from the hash_clean_and_free
path.
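
The ownership rule can be sketched with a toy table. This is not lib/hash.c: `table_sketch`, `table_release`, `table_clean` and `entry_free_data_only` are hypothetical names used to show which side frees the bucket on each path.

```c
#include <stddef.h>
#include <stdlib.h>

#define SLOTS 4

struct bucket_sketch {
	void *data;
};

struct table_sketch {
	struct bucket_sketch *slots[SLOTS];
	int bucket_frees; /* counts bucket deallocations, for illustration */
};

static int table_add(struct table_sketch *t, size_t slot, void *data)
{
	struct bucket_sketch *b;

	if (slot >= SLOTS || t->slots[slot] != NULL)
		return -1;
	b = malloc(sizeof(*b));
	if (b == NULL)
		return -1;
	b->data = data;
	t->slots[slot] = b;
	return 0;
}

/* Normal removal path: frees the bucket and hands the data back. */
static void *table_release(struct table_sketch *t, size_t slot)
{
	struct bucket_sketch *b = slot < SLOTS ? t->slots[slot] : NULL;
	void *data;

	if (b == NULL)
		return NULL;
	data = b->data;
	free(b);
	t->bucket_frees++;
	t->slots[slot] = NULL;
	return data;
}

/* Teardown path: the table frees each bucket itself, so free_func
 * must free only the data and must not call table_release(). */
static void table_clean(struct table_sketch *t, void (*free_func)(void *))
{
	for (size_t i = 0; i < SLOTS; i++) {
		struct bucket_sketch *b = t->slots[i];

		if (b == NULL)
			continue;
		if (free_func)
			free_func(b->data);
		free(b);
		t->bucket_frees++;
		t->slots[i] = NULL;
	}
}

static void entry_free_data_only(void *data)
{
	free(data);
}
```

If the callback passed to table_clean() also released the bucket, each bucket would be freed twice, which is the pattern in the backtrace below.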

10 0x00007f0422b7df1f in free () from /lib/x86_64-linux-gnu/libc.so.6
11 0x00007f0422edd779 in qfree (mt=0x7f0423047ca0 <MTYPE_HASH_BUCKET>,
    ptr=0x55fc8bc81980) at ../lib/memory.c:130
12 0x00007f0422eb97e2 in hash_clean (hash=0x55fc8b979a60,
    free_func=0x55fc8a529478 <svd_nh_del_terminate>) at
    ../lib/hash.c:290
13 0x00007f0422eb98a1 in hash_clean_and_free (hash=0x55fc8a675920
    <svd_nh_table>, free_func=0x55fc8a529478 <svd_nh_del_terminate>) at
    ../lib/hash.c:305
14 0x000055fc8a5323a5 in zebra_vxlan_terminate () at
    ../zebra/zebra_vxlan.c:6099
15 0x000055fc8a4c9227 in zebra_router_terminate () at
    ../zebra/zebra_router.c:276
16 0x000055fc8a4413b3 in zebra_finalize (dummy=0x7fffb881c1d0) at
    ../zebra/main.c:269
17 0x00007f0422f44387 in event_call (thread=0x7fffb881c1d0) at
    ../lib/event.c:2011
18 0x00007f0422ecb6fa in frr_run (master=0x55fc8b733cb0) at
    ../lib/libfrr.c:1243
19 0x000055fc8a441987 in main (argc=14, argv=0x7fffb881c4a8) at
    ../zebra/main.c:584

Signed-off-by: Chirag Shah <chirag@nvidia.com>
(cherry picked from commit 1d4f5b9b19588d77d3eaf06440c26a8c974831a3)
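
A minimal sketch of the ownership rule described above, using a toy
linked-list table rather than FRR's lib/hash.c. All names here
(table_clean, table_release, entry_free, ...) are hypothetical. The
cleanup walk frees each bucket itself, so the callback it invokes must
free only the entry and must not release the bucket again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a hash table: a singly linked list of buckets. */
struct bucket {
	struct bucket *next;
	void *data;
};

struct table {
	struct bucket *head;
	int bucket_frees; /* counts bucket frees, for demonstration only */
};

struct entry {
	struct table *owner;
	int key;
};

static struct entry *entry_add(struct table *t, int key)
{
	struct entry *e = calloc(1, sizeof(*e));
	struct bucket *b = calloc(1, sizeof(*b));

	assert(e && b);
	e->owner = t;
	e->key = key;
	b->data = e;
	b->next = t->head;
	t->head = b;
	return e;
}

/* Runtime delete path: unlink and free the bucket holding 'data'. */
static void table_release(struct table *t, void *data)
{
	for (struct bucket **pp = &t->head; *pp; pp = &(*pp)->next) {
		if ((*pp)->data == data) {
			struct bucket *b = *pp;

			*pp = b->next;
			free(b);
			t->bucket_frees++;
			return;
		}
	}
}

/* Free one entry; release its bucket only when the table is not
 * already freeing that bucket itself. */
static void entry_free(struct entry *e, bool from_table_clean)
{
	if (!from_table_clean)
		table_release(e->owner, e);
	free(e);
}

/* Callback for table_clean(): must not touch buckets. */
static void entry_free_on_clean(void *data)
{
	entry_free(data, true);
}

/* Shutdown path: the table frees each bucket after the callback
 * has freed the data it points to. */
static void table_clean(struct table *t, void (*free_func)(void *))
{
	struct bucket *b = t->head;

	t->head = NULL;
	while (b) {
		struct bucket *next = b->next;

		free_func(b->data);
		free(b);
		t->bucket_frees++;
		b = next;
	}
}
```

With three entries, one runtime delete followed by a cleanup frees
exactly three buckets. A callback that also released the bucket would
free each remaining bucket twice.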

2 months agoMerge pull request #17983 from opensourcerouting/fix/backports_auto_vrf
Russ White [Tue, 4 Feb 2025 12:06:58 +0000 (07:06 -0500)]
Merge pull request #17983 from opensourcerouting/fix/backports_auto_vrf

bgpd: Auto vrf instance (backports)

2 months agoMerge pull request #17968 from nabahr/merge-pr-17934
Donatas Abraitis [Mon, 3 Feb 2025 08:58:31 +0000 (10:58 +0200)]
Merge pull request #17968 from nabahr/merge-pr-17934

pimd: Close AutoRP socket when not needed (backport #17934)

2 months agobgpd: fix static analyzer issues around bgp pointer 17983/head
Philippe Guibert [Thu, 9 Jan 2025 20:31:01 +0000 (21:31 +0100)]
bgpd: fix static analyzer issues around bgp pointer

Some static analyzer issues can be observed in BGP code:

> In file included from ./lib/zebra.h:13,
>                  from lib/event.c:8:
> ./lib/compiler.h:222:26: note: '#pragma message: Remove `clear thread cpu` command'
>   222 | #define CPP_NOTICE(text) _Pragma(CPP_STR(message text))
>       |                          ^~~~~~~
> lib/event.c:433:1: note: in expansion of macro 'CPP_NOTICE'
>   433 | CPP_NOTICE("Remove `clear thread cpu` command")
>       | ^~~~~~~~~~
> bgpd/bgp_vty.c:1592:5: warning: Access to field 'as_pretty' results in a dereference of a null pointer (loaded from variable 'bgp') [core.NullDereference]
> 1592 |                                 bgp->as_pretty);
>       |                                 ^~~~~~~~~~~~~~
> bgpd/bgp_vty.c:1599:5: warning: Access to field 'as_pretty' results in a dereference of a null pointer (loaded from variable 'bgp') [core.NullDereference]
> 1599 |                                 bgp->as_pretty);
>       |                                 ^~~~~~~~~~~~~~
> bgpd/bgp_vty.c:1612:7: warning: Access to field 'flags' results in a dereference of a null pointer (loaded from variable 'bgp') [core.NullDereference]
> 1612 |                     IS_BGP_INSTANCE_HIDDEN(bgp)) {
>       |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~
> ./bgpd/bgpd.h:2906:3: note: expanded from macro 'IS_BGP_INSTANCE_HIDDEN'
> 2906 |         (CHECK_FLAG(_bgp->flags, BGP_FLAG_INSTANCE_HIDDEN) &&                  \
>       |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ./lib/zebra.h:274:31: note: expanded from macro 'CHECK_FLAG'
>   274 | #define CHECK_FLAG(V,F)      ((V) & (F))
>       |                               ^~~
> bgpd/bgp_vty.c:1614:4: warning: Access to field 'flags' results in a dereference of a null pointer (loaded from variable 'bgp') [core.NullDereference]
> 1614 |                         UNSET_FLAG(bgp->flags, BGP_FLAG_INSTANCE_HIDDEN);
>       |                         ^          ~~~
> ./lib/zebra.h:276:34: note: expanded from macro 'UNSET_FLAG'
>   276 | #define UNSET_FLAG(V,F)      (V) &= ~(F)
>       |                               ~  ^
> 4 warnings generated.
> Static Analysis warning summary compared to base:

Fix those issues by protecting the bgp pointer.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>

2 months agobgpd: Do not ignore auto generated VRF instances when deleting
Donatas Abraitis [Tue, 28 Jan 2025 15:11:58 +0000 (17:11 +0200)]
bgpd: Do not ignore auto generated VRF instances when deleting

When a VRF instance is about to be deleted inside bgp_vrf_disable(), it uses
a helper method that skips auto-created VRF instances, and that leads to a
STALE issue.

When creating a VNI for a particular VRF vrfX with e.g. `advertise-all-vni`,
an auto VRF instance is created, and then we do `router bgp ASN vrf vrfX`.

But when we do a reload, bgp_vrf_disable() is called, and we miss the
previously created auto instance.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
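
A minimal sketch of the lookup behavior described above, not bgpd's
real helper. The struct, the `is_auto` flag and `instance_lookup()` are
hypothetical names. The point is that a lookup which filters out
auto-created instances cannot be used on the delete path, because the
instance to delete may be exactly one of those.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical instance record; 'is_auto' marks implicit creation. */
struct instance {
	const char *vrf_name;
	bool is_auto;
};

/* Find an instance by VRF name. 'include_auto' decides whether
 * implicitly created instances are eligible matches. */
static struct instance *instance_lookup(struct instance *list, size_t n,
					const char *vrf_name,
					bool include_auto)
{
	assert(vrf_name);
	for (size_t i = 0; i < n; i++) {
		if (list[i].is_auto && !include_auto)
			continue;
		if (strcmp(list[i].vrf_name, vrf_name) == 0)
			return &list[i];
	}
	return NULL;
}
```

A delete path would call the lookup with `include_auto` set to true so
that the auto-created instance for vrfX is found and torn down.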

2 months agobgpd: fix import vrf creates multiple bgp instances
Philippe Guibert [Thu, 9 Jan 2025 09:26:02 +0000 (10:26 +0100)]
bgpd: fix import vrf creates multiple bgp instances

Each additional reference to the vrf green in the import command
creates another BGP instance. In the configuration below, the vrf green
is referenced twice, and two BGP instances of vrf green are created:
> router bgp 99
> [..]
>  import vrf green
> exit
> router bgp 99 vrf blue
> [..]
>  import vrf green
> exit
> router bgp 99 vrf green
> [..]
> exit
>
> r4# show bgp vrfs
> Type  Id     routerId          #PeersCfg  #PeersEstb  Name
>              L3-VNI            RouterMAC              Interface
> DFLT  0      10.0.3.4          0          0           default
>              0                 00:00:00:00:00:00      unknown
>  VRF  5      10.0.40.4         0          0           blue
>              0                 00:00:00:00:00:00      unknown
>  VRF  6      0.0.0.0           0          0           green
>              0                 00:00:00:00:00:00      unknown
>  VRF  6      10.0.94.4         0          0           green
>              0                 00:00:00:00:00:00      unknown

Fix this in the import command by first looking for an already present
bgp instance.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
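
A minimal sketch of the look-up-before-create idea named in the fix,
using a toy fixed-size registry instead of bgpd's instance list. The
names `registry_find()` and `registry_get()` and the limit of 8 entries
are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_INSTANCES 8

/* Toy registry of instances, keyed by VRF name. */
struct registry {
	char *names[MAX_INSTANCES];
	size_t count;
};

/* Return the index of an existing instance, or -1 when absent. */
static int registry_find(const struct registry *r, const char *name)
{
	for (size_t i = 0; i < r->count; i++) {
		if (strcmp(r->names[i], name) == 0)
			return (int)i;
	}
	return -1;
}

/* Return the index of the instance for 'name', creating it only
 * when no instance with that name exists yet; -1 on failure. */
static int registry_get(struct registry *r, const char *name)
{
	int idx = registry_find(r, name);
	char *copy;

	if (idx >= 0)
		return idx;
	if (r->count == MAX_INSTANCES)
		return -1;
	copy = malloc(strlen(name) + 1);
	if (!copy)
		return -1;
	strcpy(copy, name);
	r->names[r->count] = copy;
	return (int)r->count++;
}
```

Referencing "green" twice through `registry_get()` yields the same
index both times, so the registry ends up with one entry per name.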