| Age | Commit message (Collapse) | Author |
|
Before:
```
pc.donatas.net(config)# do sh run | include vni
vni 1
pc.donatas.net(config)# no vni 2
pc.donatas.net(config)# do sh run | include vni
pc.donatas.net(config)#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 44fe3981ee388f7c60ab2635309bce34774116e1)
|
|
lib: Create VRF if needed (backport #18430)
|
|
When creating a control plane protocol through NB, create the vrf
if needed instead of only looking up and asserting if it doesn't
exist yet.
Fixes 18429.
Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
(cherry picked from commit b6ae01f907c071be6cd197df0f3ca6fe9baa631a)
|
|
ospf6d: Disable and delete OSPFv3 areas that no longer have interfaces or configuration. (backport #18393)
|
|
configuration.
This fix will delete an OSPFv3 area when all the interfaces and
configuration (ranges, NSSA ranges, stub area, NSSA area, filter-list,
import-list and export-list) have been removed. The changes provides
a general solution to https://github.com/FRRouting/frr/issues/18324.
Signed-off-by: Acee Lindem <acee@lindem.com>
(cherry picked from commit 04994891fe164b4d5a2819d3bc90e5346f94dc53)
|
|
bgpd: Backport recent changes for 10.2 regarding EVPN pointer changes
|
|
bgp_update is a very expensive call. Calling evpn_overlay_free
even when we have no evpn data to free is not trivial. Let's
limit the call into this function until we actually have data to
free.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
The code is just arbitrarily checking to see if there are any
mac addresses associated with a prefix. This makes no
sense from the perspective that it can only happen as
an evpn route. Let's not make non-evpn people pay
the price to check this data.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
bgpd: Fixed crash upon bgp network import-check command (backport #18387)
|
|
BT:
```
3 <signal handler called>
4 0x00005616837546fc in bgp_static_update (bgp=bgp@entry=0x5616865eac50, p=0x561686639e40,
bgp_static=0x561686639f50, afi=afi@entry=AFI_IP6, safi=safi@entry=SAFI_UNICAST) at ../bgpd/bgp_route.c:7232
5 0x0000561683754ad0 in bgp_static_add (bgp=0x5616865eac50) at ../bgpd/bgp_table.h:413
6 0x0000561683785e2e in no_bgp_network_import_check (self=<optimized out>, vty=0x5616865e04c0,
argc=<optimized out>, argv=<optimized out>) at ../bgpd/bgp_vty.c:4609
7 0x00007fdbcc294820 in cmd_execute_command_real (vline=vline@entry=0x561686663000,
```
The program encountered a SEG FAULT when attempting to access pi->extra->vrfleak->bgp_orig because
pi->extra->vrfleak was NULL.
```
(gdb) p pi->extra->vrfleak
$1 = (struct bgp_path_info_extra_vrfleak *) 0x0
(gdb) p pi->extra->vrfleak->bgp_orig
Cannot access memory at address 0x8
```
Added NOT NULL check on pi->extra->vrfleak before accessing pi->extra->vrfleak->bgp_orig
to prevent the segmentation fault.
Signed-off-by: Manpreet Kaur <manpreetk@nvidia.com>
(cherry picked from commit bc1008b970541c090e36fc1d50c720df822fcb99)
|
|
zebra: ensure proper return for failure for Sid allocation (backport #18360)
|
|
The functions alloc_srv6_sid_func_explicit/dynamic expect to return bool
but we have places where we return a -1 or NULL which the caller is
assuming as a True/Valid and ending up allocating Sid
Without Fix:
2025/03/10 21:44:04.295350 ZEBRA: [XWV20-TGK70] alloc_srv6_sid_func_explicit: trying to allocate explicit SID function 65088 from block fcbb:bbbb::/32
2025/03/10 21:44:04.295351 ZEBRA: [MM61M-TQZNP] alloc_srv6_sid_func_explicit: elib s 10000 e 20000 wlib s 1000 ewlib s 30000 e 1000 SID_FUNC 65088
2025/03/10 21:44:04.295352 ZEBRA: [QGHMB-SWNFW] alloc_srv6_sid_func_explicit: function 65088 is outside ELIB [10000/20000] and EWLIB alloc ranges [30000/1000]
2025/03/10 21:44:04.295367 ZEBRA: [H0GZA-NNSWJ] get_srv6_sid_explicit: allocated explicit SRv6 SID fcbb:bbbb:1:fe40:: for context End.X nh6 2001::2
2025/03/10 21:44:04.295368 ZEBRA: [XBBYD-T1Q7P] srv6_manager_get_sid_internal: got new SRv6 SID for ctx End.X nh6 2001::2: sid_value=fcbb:bbbb:1:fe40:: (func=65088) (proto=4, instance=0, sessionId=0), notifying all clients
With Fix:
2025/03/10 22:04:25.052235 ZEBRA: [MM61M-TQZNP] alloc_srv6_sid_func_explicit: elib s 30000 e 31000 wlib s 31000 ewlib s 30000 e 31000 SID_FUNC 65056
2025/03/10 22:04:25.052236 ZEBRA: [YHMRC-EMYNX] alloc_srv6_sid_func_explicit: function 65056 is outside ELIB [30000/31000] and EWLIB alloc ranges [30000/31000]
2025/03/10 22:04:25.052254 ZEBRA: [XSG8X-Q2XJX] get_srv6_sid_explicit: invalid SM request arguments: failed to allocate SID function 65056 from block fcbb:bbbb::/32
2025/03/10 22:04:25.052257 ZEBRA: [YC52T-427SJ] srv6_manager_get_sid_internal: not got SRv6 SID for ctx End.DT6 vrf_id 4, sid_value=fcbb:bbbb:1:fe20::, locator_name=MAIN
root@rajasekarr:/tmp/topotests/static_srv6_sids.test_static_srv6_sids/r1#
Ticket :#
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
(cherry picked from commit 5a63cf4c0d1e7b84f59003877599c6575ba08a25)
|
|
Changelog:
bgpd
Allow bfd to work if peer known but interface address not yet
Apply route-map for aggregate before attribute comparison
Do not ignore auto generated vrf instances when deleting
Do not start bgp session if bgp identifier is not set
Do not try to uninstall bfd session if the peer is not established
Don't reuse nexthop variable in loop/switch
Fix a bug in peer_allowas_in_set()
Fix add label support to evpn ad routes
Fix bfd with update-source in peer-group
Fix bgp label evpn cid 1636504
Fix bgp orf prefix-list json prefix
Fix bgp peer solo option
Fix bgp vrf instance creation from implicit
Fix crash in bgp_labelpool
Fix crash in displaying json orf prefix-list
Fix deadlock in bgp_keepalive and master pthreads
Fix duplicate bgp instance created with unified config
Fix for local interface mac cache issue in 'bgp mac hash' table
Fix import vrf creates multiple bgp instances
Fix incorrect json in bgp_show_table_rd
Fix memory leak in bgp_aggregate_install()
Fix route-distinguisher in vrf leak json cmd
Fix static analyzer issues around bgp pointer
Fix table-map option
Fix vty output of evpn route-target as4
Fix wrong pthread event cancelling
Remove dmed check not required in bestpath selection
Request srv6 locator after zebra connection
Reset bgp session only if it was a real bfd down event
Respect allowas-in value from the source vrf's peer
Simplify bgp_evpn_process_rt1 with label
Update source address for bfd session
Use igpmetric in bgp_aigp_metric_total()
When bgp notices a change to shared_network inform bfd of it
When removing the prefix list drop the pointer
With suppress-fib-pending ensure withdrawal is sent
Revert: Handle addpath capability using dynamic capabilities"
Revert: Reinstall aggregated routes if using route-maps and it was changed"
isisd
Add helper function to request srv6 locator information
Allow full `no` form for `domain-password` and `area-password`
Correct edge insertion into ted
Request srv6 locator after zebra connection
Show correct level information for `show isis interface detail json`
lib
Clean up nexthop hashing mess
Crash handlers must be allowed on threads
Fix false context information for srv6 route
Guard against padding garbage in zapi read
Nb: call child destroy cbs when yang container is deleted
mgmtd
Prevent use after free
nhrpd
Fix dont consider incomplete l2 entry
ospf6d
Fix use after free of router in ospfv3 abr route calculation.
pbrd
Initialize structs used in hash_lookup
pimd
Always write cand-rp group config even when rp is inactive
Close autorp socket when not needed
During prefix-list update, behave as pim_upstream_notjoined state (conformance issue)
Explicitly ensure the rp src is bsr
Fix autorp group joins
Fix bsr rps timing out
Fix dr election race on startup
Fix for data packet loss when fhr is lhr and rp
Fix for fhr mroute taking longer to age out
Fix memory leak and assign allocation type
Fix pim vrf support (send register/register stop in vrf)
Fix pim6 mld vrf support (use recvmsg() pktinfo)
Fix vrf binding of autorp and mroute socket
tests
Add a test that shows the v6 recursive nexthop problem
Bgp_srv6_sid_reachability should give more time
Bgp_srv6l3vpn_to_bgp_vrf3 needs more time
Check if allow as-in works when importing between local vrfs
tools
Add missing formats keyword to segment-routing in frr-reload
Add missing rpki keyword to vrf in frr-reload
Fix frr-reload for ebgp-multihop ttl reconfiguration.
zebra
Ensure dplane does not send work back to master at wrong time
Evpn svd hash avoid double free
Fix leaked nhe
Fix resetting valid flags for nhg dependents
Guard against junk in nexthop->rmap_src
Include resolving nexthops in nhg hash
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
|
|
pimd: Fix PIM6 MLD VRF support (use recvmsg() pktinfo) (backport #18315)
|
|
When receiving MLD messages, prefer pktinfo over msghdr.msg_name for
determining the source interface. The latter is just the VRF master
interface in case of VRF and we need the true interface the packet was
received on instead.
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
(cherry picked from commit 374c8dc4dbc8a560036fecdfb3213f690099b869)
|
|
isisd: Correct edge insertion into TED (backport #18294)
|
|
Edges are not correctly linked to Vertices during LSP processing. In function
lsp_to_edge_cb(), once edge created or updated from the LSP TLVs, the code try
to link the edge to destination vertices. In case the revert edge is not found,
the code try to found a destination vertex to link to. But, the sys_id used
for this operation corresponds to the source vertex. As a result, the edge is
attached as source and destination of the vertex. When Traffic Engineering is
stopped, TED is deleted which result into a double free of the edge attributes.
This cause a crash when attempt to free extended admin groupi the second time.
This patch removed wrong code which link twice the edge to the source vertex.
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
(cherry picked from commit 605fc1dd6404b6ed51691c647568939adde4962a)
|
|
mgmtd: Prevent use after free (backport #18264)
|
|
ci is picking up this use after free on occasion:
ERROR: AddressSanitizer: attempting to call malloc_usable_size() for pointer which is not owned: 0x6030001d94a0
0 0x7fab994b7f04 in __interceptor_malloc_usable_size ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:119
1 0x7fab994264f6 in __sanitizer::BufferedStackTrace::Unwind(unsigned long, unsigned long, void*, bool, unsigned int) ../../../../src/libsanitizer/sanitizer_common/sanitizer_stacktrace.h:131
2 0x7fab994264f6 in __asan::asan_malloc_usable_size(void const*, unsigned long, unsigned long) ../../../../src/libsanitizer/asan/asan_allocator.cpp:1058
3 0x7fab99039bcf in mt_count_free lib/memory.c:78
4 0x7fab99039bcf in qfree lib/memory.c:130
5 0x7fab98ff971a in hash_clean lib/hash.c:290
6 0x56110cdb0e7f in mgmt_txn_hash_destroy mgmtd/mgmt_txn.c:1881
7 0x56110cdb0e7f in mgmt_txn_destroy mgmtd/mgmt_txn.c:2013
8 0x56110cd8e5de in mgmt_terminate mgmtd/mgmt.c:91
9 0x56110cd8e003 in sigint mgmtd/mgmt_main.c:90
10 0x7fab990bf4b0 in frr_sigevent_process lib/sigevent.c:117
11 0x7fab990ea7a1 in event_fetch lib/event.c:1740
12 0x7fab9901a24e in frr_run lib/libfrr.c:1245
13 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
14 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
15 0x7fab98af9304 in __libc_start_main_impl ../csu/libc-start.c:360
16 0x56110cd8dd30 in _start (/usr/lib/frr/mgmtd+0x3ad30)
0x6030001d94a0 is located 0 bytes inside of 24-byte region [0x6030001d94a0,0x6030001d94b8)
freed by thread T0 here:
0 0x7fab994b76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
1 0x7fab99039bf0 in qfree lib/memory.c:131
2 0x7fab98ff93e1 in hash_release lib/hash.c:227
3 0x56110cdaabdc in mgmt_txn_unlock mgmtd/mgmt_txn.c:1931
4 0x56110cdab049 in mgmt_txn_delete mgmtd/mgmt_txn.c:1841
5 0x56110cdab0ce in mgmt_txn_hash_free mgmtd/mgmt_txn.c:1864
6 0x7fab98ff970b in hash_clean lib/hash.c:288
7 0x56110cdb0e7f in mgmt_txn_hash_destroy mgmtd/mgmt_txn.c:1881
8 0x56110cdb0e7f in mgmt_txn_destroy mgmtd/mgmt_txn.c:2013
9 0x56110cd8e5de in mgmt_terminate mgmtd/mgmt.c:91
10 0x56110cd8e003 in sigint mgmtd/mgmt_main.c:90
11 0x7fab990bf4b0 in frr_sigevent_process lib/sigevent.c:117
12 0x7fab990ea7a1 in event_fetch lib/event.c:1740
13 0x7fab9901a24e in frr_run lib/libfrr.c:1245
14 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
15 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
previously allocated by thread T0 here:
0 0x7fab994b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
1 0x7fab990392fd in qcalloc lib/memory.c:106
2 0x7fab98ff8b4f in hash_get lib/hash.c:156
3 0x56110cdb13ae in mgmt_txn_create_new mgmtd/mgmt_txn.c:1825
4 0x56110cdb3b4d in mgmt_txn_notify_be_adapter_conn mgmtd/mgmt_txn.c:2212
5 0x56110cd91178 in mgmt_be_adapter_conn_init mgmtd/mgmt_be_adapter.c:842
6 0x7fab990ec6de in event_call lib/event.c:2019
7 0x7fab9901a243 in frr_run lib/libfrr.c:1246
8 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
9 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
The only time that mgmt_txn_hash_free is called is in hash_clean.
There are other places that mgmt_txn_unlock/delete are called and
hash_release should be called. Let's just notice when mgmtd is
being called from the hash_clean and not call hash_release (since
we know it is being released already)
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 62f35c7bdb2a6364dd03ab120e7bb685dd317c24)
|
|
ospf6d: Fix use after free of router in OSPFv3 ABR route calculation. (backport #18254)
|
|
This PR fixes FRR issue https://github.com/FRRouting/frr/issues/18040. The
OSPFv3 route is locked during the ABR calculation since there are
scenarios under which it is freed. The OSPFv3 ABR computation is
sub-optimal and this PR doesn't attempt to rework it.
Signed-off-by: Acee Lindem <acee@lindem.com>
(cherry picked from commit 06af50eacec8660fada0d4fd5cd11f0ade4e3c6c)
|
|
pim: Fix autorp group joins (backport #18225)
|
|
pim: Fix vrf binding of autorp and mroute socket (backport #18226)
|
|
pimd: Fix PIM VRF support (send register/register stop in VRF) (backport #18216)
|
|
Bind the autorp socket to the vrf device.
Also fixed mroute socket to use vrf_bind instead of directly
setting the socket option.
Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
(cherry picked from commit 7e181a771c2e525aeda6e8f6c2d58e9ee2503949)
Fixed merge conflicts
|
|
Group joining got broken when moving the autorp socket to open/close
as needed. This fixes it so autorp group joining is properly handled
as part of opening the socket.
Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
(cherry picked from commit d840560b74e3a6117aa1e4b1203dcdd8fb254ef6)
Fixed merge conflicts for backport
|
|
In 946195391406269003275850e1a4d550ea8db38b and
8ebcc02328c6b63ecf85e44fdfbf3365be27c127, transmission of PIM register and
register stop messages was changed to use a separate socket. However, that
socket is not bound to a possible VRF, so the messages were sent in the
default VRF instead. Call vrf_bind() once after socket creation and when the
VRF is ready to ensure transmission in the correct VRF. vrf_bind() handles
the non-VRF case (i.e. VRF_DEFAULT) automatically, so it may be called
unconditionally.
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
(cherry picked from commit 5a01011e0d2db538a8ba523904bd4f08b786edfb)
|
|
bgpd: remove dmed check not required in bestpath selection (backport #18210)
|
|
As part of the upstream master commit (f3575f61c7 bgpd: Sort the
bgp_path_inf) the snippet of the code for dmed check condition
left out, which leads to an issue of selecting incorrect bestpath.
As an example:
During the bestpath selection local route looses to another path due
to dmed condition being hit.
The snippet of the logs:
2025/02/20 03:06:20.131441 BGP: [JW7VP-K1YVV]
[2]:[0]:[48]:[00:92:00:00:00:10](VRF default): Comparing path
27.0.0.7 flags Valid with path Static announcement flags Selected Valid Attr Changed Unsorted
2025/02/20 03:06:20.131445 BGP: [SYTDR-QV6X9] [2]:[0]:[48]:[00:92:00:00:00:10]: path 27.0.0.7 loses to path Static announcement as ES 03:44:38:39:ff:ff:02:00:00:01 is same and local
2025/02/20 03:06:20.131452 BGP: [JW7VP-K1YVV] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): Comparing path 27.0.0.8 flags Valid with path Static announcement flags Selected Valid Attr Changed Unsorted
2025/02/20 03:06:20.131456 BGP: [SYTDR-QV6X9] [2]:[0]:[48]:[00:92:00:00:00:10]: path 27.0.0.8 loses to path Static announcement as ES 03:44:38:39:ff:ff:02:00:00:01 is same and local
2025/02/20 03:06:20.131458 BGP: [WEWEC-8SE72] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): path Static announcement is the bestpath from AS 0 <<<< static is best
2025/02/20 03:06:20.131463 BGP: [Z3A78-GM3G5] bgp_best_selection: [2]:[0]:[48]:[00:92:00:00:00:10](VRF default) pi 27.0.0.7 dmed
2025/02/20 03:06:20.131467 BGP: [Z3A78-GM3G5] bgp_best_selection: [2]:[0]:[48]:[00:92:00:00:00:10](VRF default) pi 27.0.0.8 dmed
2025/02/20 03:06:20.131471 BGP: [N6CTF-2RSKS] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): After path selection, newbest is path 27.0.0.7 oldbest was Static announce
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 83ad94694bc061e1ff5f43db42cba46320e0df73)
|
|
pimd: During prefix-list update, behave as PIM_UPSTREAM_NOTJOINED sta… (backport #17666)
|
|
pimd: Fix for data packet loss when FHR is LHR and RP (backport #14227)
|
|
(conformance issue)
Issue:
If there are any changes to the prefix list, we perform a re-lookup to map the correct RP for the group.
Even if the S,G entry is PIM_UPSTREAM_NOTJOINED and in FHR, In the case of IGMPv3, an S,G entry can be
created with no joins. this is not necessary.
https://www.rfc-editor.org/rfc/rfc4601#section-4.5.7 says no op in case of NOTJOINED
Solution:
To solve this issue, Stop RP mapping when the state is NOTJOINED
Ticket: #3496931
Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>
(cherry picked from commit 51f26d17da288af44a8a0e536dbe317a7e678514)
|
|
Topology:
A single router is acting as the First Hop Router (FHR), Last Hop Router (LHR), and RP.
RC and Issue:
When an upstream S,G is in join state, it sends a register message to the RP.
If the RP has the receiver, it sends a register stop message and switches to the shortest path.
When the register stop message is processed, it removes pimreg, moves to prune,
and starts the reg stop timer.
When the reg stop timer expires, PIM changes S,G state to Join Pending and sends out a NULL
register message to RP. RP receives it and fails to send Reg stop because SPT is not set at that point.
The problem is when the register stop timer pops and state is in Join Pending.
According to https://www.rfc-editor.org/rfc/rfc4601#section-4.4.1,
we need to put back the pimreg reg tunnel into the S,G mroute.
This causes data to be sent to the control plane and subsequently interrupts the line rate.
Fix:
If the router is FHR and RP to the group,
ignore SPT status and send out a register stop message back to the DR (in this context, the same router).
Ticket: #3506780
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>
(cherry picked from commit 8280257cc99e071c205e469399f2fb41671b30eb)
|
|
FRRouting/revert-18155-mergify/bp/stable/10.2/pr-18121
Revert "bgpd: release manual vpn label on instance deletion (backport #18121)"
|
|
|
|
bgpd: release manual vpn label on instance deletion (backport #18121)
|
|
lib: nb: call child destroy CBs when YANG container is deleted (backport #18082)
|
|
Previously the code was only calling the child destroy callbacks if the target
deleted node was a non-presence container. We now add a flag to the callback
structure to instruct northbound to perform the rescursive delete for code that
wishes for this to happen.
- Fix wrong relative path lookup in keychain destroy callback
Signed-off-by: Christian Hopps <chopps@labn.net>
(cherry picked from commit d03ecf4562ef3ade6b7b83bf6c683c4741f395ba)
|
|
isisd: Request SRv6 locator after zebra connection (backport #18178)
|
|
bgpd: fix vty output of evpn route-target AS4 (backport #18109)
|
|
evpn route-targets are decoded in ... multiple places; at least
two have a bug where the AS4 form doesn't have its AS decoded.
Signed-off-by: Mark Stapp <mjs@cisco.com>
(cherry picked from commit 9943a08720ccbed87cd6938791066a0de94a92c6)
|
|
When SRv6 is enabled and an SRv6 locator is specified in the IS-IS
configuration, IS-IS may attempt to request SRv6 locator information from
zebra before the connection is fully established. If this occurs, the
request fails with the following error:
```
2025/02/14 21:41:20 ISIS: [HR66R-TWQYD][EC 100663302] srv6_manager_get_locator: invalid zclient socket
````
As a result, IS-IS is unable to obtain the locator information,
preventing SRv6 from working.
This commit fixes the issue by ensuring IS-IS requests SRv6 locator
information once the connection with zebra is successfully established.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
(cherry picked from commit f02dba19d20b0a53645a439924e736155c8de63f)
|
|
This commit adds a function that iterates over all IS-IS areas and asks
the SRv6 Manager for information about the configured locators.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
(cherry picked from commit 0b76fb3c133951c8d1203dbe7c2e5a4e1b67dffe)
|
|
bgpd: When removing the prefix list drop the pointer (backport #18160)
|
|
We are very very rarely seeing this crash:
0 0x7f36ba48e389 in prefix_list_apply_ext lib/plist.c:789
1 0x55eff3fa4126 in subgroup_announce_check bgpd/bgp_route.c:2334
2 0x55eff3fa858e in subgroup_process_announce_selected bgpd/bgp_route.c:3440
3 0x55eff4016488 in subgroup_announce_table bgpd/bgp_updgrp_adv.c:808
4 0x55eff401664e in subgroup_announce_route bgpd/bgp_updgrp_adv.c:861
5 0x55eff40111df in peer_af_announce_route bgpd/bgp_updgrp.c:2223
6 0x55eff3f884cb in bgp_announce_route_timer_expired bgpd/bgp_route.c:5892
7 0x7f36ba4ec239 in event_call lib/event.c:2019
8 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
9 0x55eff3e668b7 in main bgpd/bgp_main.c:557
10 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
11 0x7f36b9e2d304 in __libc_start_main_impl ../csu/libc-start.c:360
12 0x55eff3e64a30 in _start (/home/ci/cibuild.1407/frr-source/bgpd/.libs/bgpd+0x2fda30)
0x608000037038 is located 24 bytes inside of 88-byte region [0x608000037020,0x608000037078)
freed by thread T0 here:
0 0x7f36ba8b76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
1 0x7f36ba439bd7 in qfree lib/memory.c:131
2 0x7f36ba48d3a3 in prefix_list_free lib/plist.c:156
3 0x7f36ba48d3a3 in prefix_list_delete lib/plist.c:247
4 0x7f36ba48fbef in prefix_bgp_orf_remove_all lib/plist.c:1516
5 0x55eff3f679c4 in bgp_route_refresh_receive bgpd/bgp_packet.c:2841
6 0x55eff3f70bab in bgp_process_packet bgpd/bgp_packet.c:4069
7 0x7f36ba4ec239 in event_call lib/event.c:2019
8 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
9 0x55eff3e668b7 in main bgpd/bgp_main.c:557
10 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
previously allocated by thread T0 here:
0 0x7f36ba8b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
1 0x7f36ba4392e4 in qcalloc lib/memory.c:106
2 0x7f36ba48d0de in prefix_list_new lib/plist.c:150
3 0x7f36ba48d0de in prefix_list_insert lib/plist.c:186
4 0x7f36ba48d0de in prefix_list_get lib/plist.c:204
5 0x7f36ba48f9df in prefix_bgp_orf_set lib/plist.c:1479
6 0x55eff3f67ba6 in bgp_route_refresh_receive bgpd/bgp_packet.c:2920
7 0x55eff3f70bab in bgp_process_packet bgpd/bgp_packet.c:4069
8 0x7f36ba4ec239 in event_call lib/event.c:2019
9 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
10 0x55eff3e668b7 in main bgpd/bgp_main.c:557
11 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
Let's just stop trying to save the pointer around in the peer->orf_plist
data structure. There are other design problems but at least lets
stop the crash from possibly happening.
Fixes: #18138
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 3d43d7b78971520854903c11b6aec23754fdca34)
|
|
lib: fix false context information for SRv6 route (backport #18023)
|
|
bgpd: Fix crash in bgp_labelpool (backport #18079)
|
|
When a BGP instance with a manually assigned VPN label is deleted, the
label is not released from the Zebra label registry. As a result,
reapplying a configuration with the same manual label leads to VPN
prefix export failures.
For example, with the following configuration:
> router bgp 65000 vrf BLUE
> address-family ipv4 unicast
> label vpn export <int>
Release zebra label registry on unconfiguration.
Fixes: d162d5f6f5 ("bgpd: fix hardset l3vpn label available in mpls pool")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit d6363625c35a99933bf60c9cf0b79627b468c9f7)
# Conflicts:
# bgpd/bgpd.c
|
|
The seg6local route dumped by 'show ipv6 route' makes think that the USP
flavor is supported, whereas it is not the case. This information is a
context information, and for End, the context information should be
empty.
> # show ipv6 route
> [..]
> I>* fc00:0:4::/128 [115/0] is directly connected, sr0, seg6local End USP, weight 1, 00:49:01
Fix this by suppressing the USP information from the output.
Fixes: e496b4203055 ("bgpd: prefix-sid srv6 l3vpn service tlv")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit 658bf0281d99461849453628ddc792ec424d0bd4)
|
|
The bgp labelpool code is grabbing the vpn policy data structure.
This vpn_policy has a pointer to the bgp data structure. If
a item placed on the bgp label pool workqueue happens to sit
there for the microsecond or so and the operator issues a
`no router bgp...` command that corresponds to the vpn_policy
bgp pointer, when the workqueue is run it will crash because
the bgp pointer is now freed and something else owns it.
Modify the labelpool code to store the vrf id associated
with the request on the workqueue. When you wake up
if the vrf id still has a bgp pointer allow the request
to continue, else drop it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 14eac319e8ae9314f5270f871106a70c4986c60c)
|