Donatas Abraitis [Fri, 15 Mar 2024 11:49:06 +0000 (13:49 +0200)]
bgpd: Update default-originate route-map actual map structure
If using with `bgp listen range ... peer-group x`, default_rmap[afi][safi] is not
updated, and after the hard-reset in other side, this is flushed and never updated
again without restarting the sender BGP daemon.
Split zebra's vrf_terminate() into disable() and delete() stages.
The former enqueues all events for the dplane thread.
Memory freeing is performed in the second stage.
Donald Sharp [Sat, 2 Mar 2024 14:50:38 +0000 (09:50 -0500)]
bgpd: Ensure community data is freed in some cases.
Customer has this valgrind trace:
Direct leak of 2829120 byte(s) in 70728 object(s) allocated from:
0 in community_new ../bgpd/bgp_community.c:39
1 in community_uniq_sort ../bgpd/bgp_community.c:170
2 in route_set_community ../bgpd/bgp_routemap.c:2342
3 in route_map_apply_ext ../lib/routemap.c:2673
4 in subgroup_announce_check ../bgpd/bgp_route.c:2367
5 in subgroup_process_announce_selected ../bgpd/bgp_route.c:2914
6 in group_announce_route_walkcb ../bgpd/bgp_updgrp_adv.c:199
7 in hash_walk ../lib/hash.c:285
8 in update_group_af_walk ../bgpd/bgp_updgrp.c:2061
9 in group_announce_route ../bgpd/bgp_updgrp_adv.c:1059
10 in bgp_process_main_one ../bgpd/bgp_route.c:3221
11 in bgp_process_wq ../bgpd/bgp_route.c:3221
12 in work_queue_run ../lib/workqueue.c:282
The above leak detected by valgrind was from a screenshot so I copied it
by hand. Any mistakes in line numbers are purely from my transcription.
Additionally this is against a slightly modified 8.5.1 version of FRR.
Code inspection of 8.5.1 -vs- latest master shows the same problem
exists. Code should be able to be followed from there to here.
What is happening:
There is a route-map being applied that modifes the outgoing community
to a peer. This is saved in the attr copy created in
subgroup_process_announce_selected. This community pointer is not
interned. So the community->refcount is still 0. Normally when
a prefix is announced, the attr and the prefix are placed on a
adjency out structure where the attribute is interned. This will
cause the community to be saved in the community hash list as well.
In a non-normal operation when the decision to send is aborted after
the route-map application, the attribute is just dropped and the
pointer to the community is just dropped too, leading to situations
where the memory is leaked. The usage of bgp suppress-fib would
would be a case where the community is caused to be leaked.
Additionally the previous commit where an unsuppress-map is used
to modify the outgoing attribute but since unsuppress-map was
not considered part of outgoing policy the attribute would be dropped as
well. This pointer drop also extends to any dynamically allocated
memory saved by the attribute pointer that was not interned yet as well.
So let's modify the return case where the decision is made to
not send the prefix to the peer to always just flush the attribute
to ensure memory is not leaked.
Donald Sharp [Wed, 13 Mar 2024 14:26:58 +0000 (10:26 -0400)]
bgpd: Ensure that the correct aspath is free'd
Currently in subgroup_default_originate the attr.aspath
is set in bgp_attr_default_set, which hashs the aspath
and creates a refcount for it. If this is a withdraw
the subgroup_announce_check and bgp_adj_out_set_subgroup
is called which will intern the attribute. This will
cause the the attr.aspath to be set to a new value
finally at the bottom of the function it intentionally
uninterns the aspath which is not the one that was
created for this function. This reduces the other
aspath's refcount by 1 and if a clear bgp * is issued
fast enough the aspath for that will be removed
and the system will crash.
Louis Scalbert [Thu, 15 Feb 2024 12:28:02 +0000 (13:28 +0100)]
bgpd: fix 6vpe nexthop
6vPE enables the announcement of IPv6 VPN prefixes through an IPv4 BGP
session. In this scenario, the next hop addresses for these prefixes are
represented in an IPv4-mapped IPv6 format, noted as ::ffff:[IPv4]. This
format indicates to the peer that it should route these IPv6 addresses
using information from the IPv4 nexthop. For example:
> Path Attribute - MP_REACH_NLRI
> [...]
> Address family identifier (AFI): IPv6 (2)
> Subsequent address family identifier (SAFI): Labeled VPN Unicast (128)
> Next hop: RD=0:0 IPv6=::ffff:192.0.2.5 RD=0:0 Link-local=fe80::501d:42ff:feef:b021
> Number of Subnetwork points of attachment (SNPA): 0
This rule is set out in RFC4798:
> The IPv4 address of the egress 6PE router MUST be encoded as an
> IPv4-mapped IPv6 address in the BGP Next Hop field.
However, in some situations, bgpd sends a standard nexthop IPv6 address
instead of an IPv4-mapped IPv6 address because the outgoing interface for
the BGP session has a valid IPv6 address. This is problematic because
the peer router may not be able to route the nexthop IPv6 address (ie.
if the outgoing interface has not IPv6).
Fix the issue by always sending a IPv4-mapped IPv6 address as nexthop
when the BGP session is on IPv4 and address family IPv6.
Philippe Guibert [Mon, 13 Mar 2023 09:47:16 +0000 (10:47 +0100)]
topotests: add an ebgp 6vpe test
This test uses the connected ipv4 mapped ipv6 prefix
to resolve the received BGP routes.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com> Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com> Signed-off-by: François Dumontet <francois.dumontet@6wind.com>
(cherry picked from commit 4d7df91752d7414d9719a361a2fd4cc30943dc96)
Louis Scalbert [Tue, 20 Feb 2024 16:49:01 +0000 (17:49 +0100)]
zebra: fix crash if macvlan link in another netns
A macvlan interface can have its underlying link-interface in another
namespace (aka. netns). However, by default, zebra does not know the
interface from the other namespaces. It results in a crash the pointer
to the link interface is NULL.
Fix the crash by returning when the macvlan link-interface is in another
namespace. No need to go further because any vxlan under the macvlan
interface would not be accessible by zebra.
Olivier Dugeon [Mon, 26 Feb 2024 09:40:34 +0000 (10:40 +0100)]
ospfd: Solved crash in OSPF TE parsing
Iggy Frankovic discovered an ospfd crash when perfomring fuzzing of OSPF LSA
packets. The crash occurs in ospf_te_parse_te() function when attemping to
create corresponding egde from TE Link parameters. If there is no local
address, an edge is created but without any attributes. During parsing, the
function try to access to this attribute fields which has not been created
causing an ospfd crash.
The patch simply check if the te parser has found a valid local address. If not
found, we stop the parser which avoid the crash.
adding a tests about:
"no bgp as-path access-list" command.
the folloxing "clear bgp *" command leads to the
crash exhibited above.
a sleep had been added to capture the crash befor the end of scenario.
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f5f05cbb9c0 (LWP 1371086))]
(gdb) bt
context=0x7ffcf2c216c0) at lib/sigevent.c:248
acl_list=0x55c976ec03c0) at bgpd/bgp_aspath.c:1688
dummy=0x7ffcf2c22340, object=0x7ffcf2c21e70) at bgpd/bgp_routemap.c:2401
match_object=0x7ffcf2c21e70, set_object=0x7ffcf2c21e70, pref=0x0)
at lib/routemap.c:2687
attr=0x7ffcf2c220b0, afi=AFI_IP, safi=SAFI_UNICAST, rmap_name=0x0, label=0x0,
num_labels=0, dest=0x55c976ebeaf0) at bgpd/bgp_route.c:1807
addpath_id=0, attr=0x7ffcf2c22450, afi=AFI_IP, safi=SAFI_UNICAST, type=10,
sub_type=0, prd=0x0, label=0x0, num_labels=0, soft_reconfig=0, evpn=0x0)
at bgpd/bgp_route.c:4424
packet=0x7ffcf2c22410) at bgpd/bgp_route.c:6266
packet=0x7ffcf2c22410, mp_withdraw=false) at bgpd/bgp_packet.c:341
peer=0x55c976e89ed0, size=43) at bgpd/bgp_packet.c:2414
at bgpd/bgp_packet.c:3899
pimd: re-evaluated S,G OILs upon RP changes and for empty SG upstream oils
Topology:
TOR11 (FHR) --- LEAF-11---SPINE1 (RP)MSDP SPINE-2(RP)MSDP --- LEAF-12 -- TOR12 (LHR)
| | | | |
| -----------------------------------------------------(ECMP) |
| | | |
-----------------------------------------------------------------------(ECMP)
Issue:
In some triggers, S,G upstream is preserved even with the PP timer expiry, resulting
in S,G with NULL OILS. This could be because we create a dummy S,G upstream and
dummy channel_oif for *,G, where RPF is UNKNOWN. As a result, PIM+VXLAN traffic is never
forwarded downstream to LHR.
Fix:
when the S,G stream is running, Determine if a reevaluation of the outgoing interface
list (OIL) is required. S,G upstream should then inherit the OIL from *,G.
ospfd: can not delete "segment-routing node-msd" when SR if off
This fixes the initial implementation of commit 7743f2f8c00 ("OSPFd: Update
Segment Routing PR following review") where it wsa not possible to remove
the "segment-routing node-msd" CLI nodes via vtysh once segment-routing got
disabled.
When "no bgp network import-check" is set, it is impossible to
successfully import the static routes into the BGP VPN table. The prefix
is present in the table but is not marked as valid. This issue applies
regardless of whether or not routes are present in the router's RIB.
Always mark as valid the nexthops of BGP static routes when "no bgp
network import-check" is set.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com> Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit 45bf46441a2e0b02650a1d162367c357b220c7b1)
Issue:
Previously, the PBR common was updated for every rule update or deletion
example:
let say we have three rule 11, 12, 13 and if we are removing rule 12. in the current code
we are making the entire map "valid" to false.
pbr-map MAP1 seq 11
match src-ip 90.1.1.2/32
set nexthop 20.1.1.2 swp1
pbr-map MAP1 seq 12
match src-ip 90.1.1.3/32
set nexthop 20.1.1.2 swp1
pbr-map MAP1 seq 13
match src-ip 90.1.1.4/32
set nexthop 20.1.1.2 swp1
no pbr-map MAP1 seq 12 ==> turns whole map valid to false.
r1(config)# end
r1# show pbr map
pbr-map MAP1 valid: no
Seq: 11 rule: 310
Installed: yes Reason: Valid
SRC IP Match: 90.1.1.2/32
nexthop 20.1.1.2 swp1
Installed: yes Tableid: 10002
Seq: 13 rule: 312
Installed: yes Reason: Valid
SRC IP Match: 90.1.1.4/32
nexthop 20.1.1.2 swp1
Installed: yes Tableid: 10004
Fix:
Now, the PBR common will only be updated when the last rule is being deleted.
This change ensures that we only send a delete request to Zebra once, and only
set the valid and installed flags to false when the last rule is deleted.
This optimizes the handling of PBR rules and reduces unnecessary interactions with Zebra
Igor Ryzhov [Tue, 23 Jan 2024 00:32:22 +0000 (02:32 +0200)]
pimd: fix crash when configuring ssmpingd
Command: `ip ssmpingd 1.1.1.1`
Backtrace:
```
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
0x00007fd1d3b02859 in __GI_abort () at abort.c:79
0x00007fd1d3e323e1 in yang_dnode_xpath_get_canon (dnode=<optimized out>, xpath_fmt=<optimized out>, ap=<optimized out>) at lib/yang_wrappers.c:61
0x00007fd1d3e34f41 in yang_dnode_get_ipv4 (addr=addr@entry=0x7ffc368554d4, dnode=<optimized out>, xpath_fmt=xpath_fmt@entry=0x5556af8680d4 "./source-addr") at lib/yang_wrappers.c:826
0x00005556af8216d3 in routing_control_plane_protocols_control_plane_protocol_pim_address_family_ssm_pingd_source_ip_create (args=0x7ffc36855530) at pimd/pim_nb_config.c:925
0x00007fd1d3dec13f in nb_callback_create (nb_node=0x5556b197ea40, nb_node=0x5556b197ea40, errmsg_len=8192, errmsg=0x7ffc36855a90 "", resource=0x5556b18fa6f8, dnode=0x5556b1ad7a10, event=NB_EV_APPLY, context=0x5556b1ad75c0) at lib/northbound.c:1260
nb_callback_configuration (context=0x5556b1ad75c0, event=NB_EV_APPLY, change=<optimized out>, errmsg=0x7ffc36855a90 "", errmsg_len=8192) at lib/northbound.c:1648
0x00007fd1d3deca6c in nb_transaction_process (event=event@entry=NB_EV_APPLY, transaction=transaction@entry=0x5556b1ad75c0, errmsg=errmsg@entry=0x7ffc36855a90 "", errmsg_len=errmsg_len@entry=8192) at lib/northbound.c:1779
0x00007fd1d3decdd6 in nb_candidate_commit_apply (transaction=0x5556b1ad75c0, save_transaction=save_transaction@entry=true, transaction_id=transaction_id@entry=0x0, errmsg=errmsg@entry=0x7ffc36855a90 "", errmsg_len=errmsg_len@entry=8192) at lib/northbound.c:1129
0x00007fd1d3decf15 in nb_candidate_commit (context=..., candidate=<optimized out>, save_transaction=save_transaction@entry=true, comment=comment@entry=0x0, transaction_id=transaction_id@entry=0x0, errmsg=0x7ffc36855a90 "", errmsg_len=8192) at lib/northbound.c:1162
0x00007fd1d3ded4af in nb_cli_classic_commit (vty=vty@entry=0x5556b1ada2a0) at lib/northbound_cli.c:50
0x00007fd1d3df025f in nb_cli_apply_changes_internal (vty=vty@entry=0x5556b1ada2a0, xpath_base=xpath_base@entry=0x7ffc36859b50 ".", clear_pending=clear_pending@entry=false) at lib/northbound_cli.c:177
0x00007fd1d3df06ad in nb_cli_apply_changes (vty=vty@entry=0x5556b1ada2a0, xpath_base_fmt=xpath_base_fmt@entry=0x0) at lib/northbound_cli.c:233
0x00005556af80fdd5 in pim_process_ssmpingd_cmd (vty=0x5556b1ada2a0, operation=NB_OP_CREATE, src_str=0x5556b1ad9630 "1.1.1.1") at pimd/pim_cmd_common.c:3423
0x00007fd1d3da7b0e in cmd_execute_command_real (vline=vline@entry=0x5556b1ac9520, vty=vty@entry=0x5556b1ada2a0, cmd=cmd@entry=0x0, up_level=up_level@entry=0) at lib/command.c:982
0x00007fd1d3da7cb1 in cmd_execute_command (vline=vline@entry=0x5556b1ac9520, vty=vty@entry=0x5556b1ada2a0, cmd=0x0, vtysh=vtysh@entry=0) at lib/command.c:1040
0x00007fd1d3da7e50 in cmd_execute (vty=vty@entry=0x5556b1ada2a0, cmd=cmd@entry=0x5556b1ae0a30 "ip ssmpingd 1.1.1.1", matched=matched@entry=0x0, vtysh=vtysh@entry=0) at lib/command.c:1207
0x00007fd1d3e278be in vty_command (vty=vty@entry=0x5556b1ada2a0, buf=<optimized out>) at lib/vty.c:591
0x00007fd1d3e27afd in vty_execute (vty=0x5556b1ada2a0) at lib/vty.c:1354
0x00007fd1d3e2bb23 in vtysh_read (thread=<optimized out>) at lib/vty.c:2362
0x00007fd1d3e22254 in event_call (thread=thread@entry=0x7ffc3685cd80) at lib/event.c:2003
0x00007fd1d3dce9e8 in frr_run (master=0x5556b183c830) at lib/libfrr.c:1218
0x00005556af803653 in main (argc=6, argv=<optimized out>, envp=<optimized out>) at pimd/pim_main.c:162
```
bgpd: Set correct TTL for the dynamic neighbor peers
In an EBGP multihop configuration with dynamic neighbors, the TTL configured is not being updated for the socket.
Issue:
Assume the following topology:
Host (Dynamic peer to spine - 192.168.1.100) - Leaf - Spine (192.168.1.1)
When the host establishes a BGP multihop session to the spine,
the connection uses the MAXTTL value instead of the configured TTL (in this case, 2).
This issue is only observed with dynamic peers.
Logs: look at the TTL is still MAXTTL, instead of “2” configured.
> ==3901635==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020003a5940 at pc 0x56260067bb48 bp 0x7ffe8a4f3840 sp 0x7ffe8a4f3838
> READ of size 4 at 0x6020003a5940 thread T0
> #0 0x56260067bb47 in ecommunity_fill_pbr_action bgpd/bgp_ecommunity.c:1587
> #1 0x5626007a246e in bgp_pbr_build_and_validate_entry bgpd/bgp_pbr.c:939
> #2 0x5626007b25e6 in bgp_pbr_update_entry bgpd/bgp_pbr.c:2933
> #3 0x562600909d18 in bgp_zebra_announce bgpd/bgp_zebra.c:1351
> #4 0x5626007d5efd in bgp_process_main_one bgpd/bgp_route.c:3528
> #5 0x5626007d6b43 in bgp_process_wq bgpd/bgp_route.c:3641
> #6 0x7f450f34c2cc in work_queue_run lib/workqueue.c:266
> #7 0x7f450f327a27 in event_call lib/event.c:1970
> #8 0x7f450f21a637 in frr_run lib/libfrr.c:1213
> #9 0x56260062fc04 in main bgpd/bgp_main.c:540
> #10 0x7f450ee2dd09 in __libc_start_main ../csu/libc-start.c:308
> #11 0x56260062ca29 in _start (/usr/lib/frr/bgpd+0x2e3a29)
>
> 0x6020003a5940 is located 0 bytes to the right of 16-byte region [0x6020003a5930,0x6020003a5940)
> allocated by thread T0 here:
> #0 0x7f450f6aa1f8 in __interceptor_realloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:164
> #1 0x7f450f244f8a in qrealloc lib/memory.c:112
> #2 0x562600673313 in ecommunity_add_val_internal bgpd/bgp_ecommunity.c:143
> #3 0x5626006735bc in ecommunity_uniq_sort_internal bgpd/bgp_ecommunity.c:193
> #4 0x5626006737e3 in ecommunity_parse_internal bgpd/bgp_ecommunity.c:228
> #5 0x562600673890 in ecommunity_parse bgpd/bgp_ecommunity.c:236
> #6 0x562600640469 in bgp_attr_ext_communities bgpd/bgp_attr.c:2674
> #7 0x562600646eb3 in bgp_attr_parse bgpd/bgp_attr.c:3893
> #8 0x562600791b7e in bgp_update_receive bgpd/bgp_packet.c:2141
> #9 0x56260079ba6b in bgp_process_packet bgpd/bgp_packet.c:3406
> #10 0x7f450f327a27 in event_call lib/event.c:1970
> #11 0x7f450f21a637 in frr_run lib/libfrr.c:1213
> #12 0x56260062fc04 in main bgpd/bgp_main.c:540
> #13 0x7f450ee2dd09 in __libc_start_main ../csu/libc-start.c:308
Fixes: dacf6ec120 ("bgpd: utility routine to convert flowspec actions into pbr actions") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 6001c765e2a2ede2c98137b82ec8b4550a51dda9)
Louis Scalbert [Wed, 3 Jan 2024 15:13:02 +0000 (16:13 +0100)]
isisd: fix _isis_spftree_del heap-use-after-free
Fix the following heap-use-after-free
> ==82961==ERROR: AddressSanitizer: heap-use-after-free on address 0x6020001e4750 at pc 0x55a8cc7f63ac bp 0x7ffd6948e340 sp 0x7ffd6948e330
> READ of size 8 at 0x6020001e4750 thread T0
> #0 0x55a8cc7f63ab in isis_route_node_cleanup isisd/isis_route.c:335
> #1 0x7ff25ec617c1 in route_node_free lib/table.c:75
> #2 0x7ff25ec619fc in route_table_free lib/table.c:111
> #3 0x7ff25ec61661 in route_table_finish lib/table.c:46
> #4 0x55a8cc800d83 in _isis_spftree_del isisd/isis_spf.c:397
> #5 0x55a8cc800e45 in isis_spftree_clear isisd/isis_spf.c:414
> #6 0x55a8cc80bd9a in isis_run_spf isisd/isis_spf.c:2020
> #7 0x55a8cc80c370 in isis_run_spf_with_protection isisd/isis_spf.c:2076
> #8 0x55a8cc80cf52 in isis_run_spf_cb isisd/isis_spf.c:2165
> #9 0x7ff25ec7c4dc in event_call lib/event.c:1970
> #10 0x7ff25eb64423 in frr_run lib/libfrr.c:1213
> #11 0x55a8cc7799da in main isisd/isis_main.c:318
> #12 0x7ff25e623d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> #13 0x7ff25e623e3f in __libc_start_main_impl ../csu/libc-start.c:392
> #14 0x55a8cc778e44 in _start (/usr/lib/frr/isisd+0x109e44)
>
> 0x6020001e4750 is located 0 bytes inside of 16-byte region [0x6020001e4750,0x6020001e4760)
> freed by thread T0 here:
> #0 0x7ff25f000537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
> #1 0x7ff25eb9012e in qfree lib/memory.c:130
> #2 0x55a8cc7f6485 in isis_route_table_info_free isisd/isis_route.c:351
> #3 0x55a8cc800cf4 in _isis_spftree_del isisd/isis_spf.c:395
> #4 0x55a8cc800e45 in isis_spftree_clear isisd/isis_spf.c:414
> #5 0x55a8cc80bd9a in isis_run_spf isisd/isis_spf.c:2020
> #6 0x55a8cc80c370 in isis_run_spf_with_protection isisd/isis_spf.c:2076
> #7 0x55a8cc80cf52 in isis_run_spf_cb isisd/isis_spf.c:2165
> #8 0x7ff25ec7c4dc in event_call lib/event.c:1970
> #9 0x7ff25eb64423 in frr_run lib/libfrr.c:1213
> #10 0x55a8cc7799da in main isisd/isis_main.c:318
> #11 0x7ff25e623d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> previously allocated by thread T0 here:
> #0 0x7ff25f000a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
> #1 0x7ff25eb8ffdc in qcalloc lib/memory.c:105
> #2 0x55a8cc7f63eb in isis_route_table_info_alloc isisd/isis_route.c:343
> #3 0x55a8cc80052a in _isis_spftree_init isisd/isis_spf.c:334
> #4 0x55a8cc800e51 in isis_spftree_clear isisd/isis_spf.c:415
> #5 0x55a8cc80bd9a in isis_run_spf isisd/isis_spf.c:2020
> #6 0x55a8cc80c370 in isis_run_spf_with_protection isisd/isis_spf.c:2076
> #7 0x55a8cc80cf52 in isis_run_spf_cb isisd/isis_spf.c:2165
> #8 0x7ff25ec7c4dc in event_call lib/event.c:1970
> #9 0x7ff25eb64423 in frr_run lib/libfrr.c:1213
> #10 0x55a8cc7799da in main isisd/isis_main.c:318
> #11 0x7ff25e623d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
Fixes: 7153c3cabf ("isisd: update struct isis_route_info has multiple sr info by algorithm") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 9fa9a9d865c6e0ed2454abf3961410edabb17533)
Olivier Dugeon [Thu, 14 Dec 2023 17:34:38 +0000 (18:34 +0100)]
tests: Update OSPF TE topotests
The OSPF TE topotest is using switches to interconnect router. During the test,
interfaces are shutdown on some routers to simulate link failure and check that
the TED is correctly updated. However, the switche between router avoid the
detection by the neighbor router that the interface is down i.e. the interface
line remains up as it is conneted to the switch and not to the router.
This patch update the tested topology by removing the switch and connect
directly the router excepted the inter AS link on R3. Interface are also
renamed accordingly.
Olivier Dugeon [Thu, 14 Dec 2023 17:22:32 +0000 (18:22 +0100)]
ospfd: Correct LSA parser which fulfill the TED
Traffic Engineering Database (TED) is fulfill from the various LSA advertised
and received by the router. To remove information on the TED, 2 mechanisms are
used: i) parse TE Opaque LSA when there are flushed and ii) compare the list of
prefixes advertised in the Router LSA with the list of corresponding edges and
subnets contained in the TED. However, this second mechanism assumes that the
Router LSA is unique and contains all prefixes of the advertised router.
But, this is wrong. Prefixes could be advertised with several Router LSA.
This conduct to remove edge and subnet in the TED while it should be maintained.
The result is a faulty test with ospf_sr_te_topo1 topotest when server is heavy
loaded.
This simple patch removed deletion of edges and subnets when parsing the Router
LSA and only removed them when the corresponding TE Opaque LSA is flushed. In
addition, TE Opaque LSA are not flushed when OSPF ajacency goes down. This
patch also correct this second problem.
Donald Sharp [Mon, 11 Dec 2023 15:46:53 +0000 (10:46 -0500)]
bgpd: Make `suppress-fib-pending` clear peering
When a peer has come up and already started installing
routes into the rib and `suppress-fib-pending` is either
turned on or off. BGP is left with some routes that
may need to be withdrawn from peers and routes that
it does not know the status of. Clear the BGP peers
for the interesting parties and let's let us come
up to speed as needed.
Olivier Dugeon [Thu, 7 Dec 2023 13:53:16 +0000 (14:53 +0100)]
ospfd: Correct SID check size
Segment Router Identifier (SID) could be an index (4 bytes) within a range
(SRGB or SRLB) or an MPLS label (3 bytes). Thus, before calling check_size
macro to verify SID TLVs size, it is mandatory to determine the SID type to
avoid wrong assert.
Donald Sharp [Mon, 6 Nov 2023 18:02:01 +0000 (13:02 -0500)]
bgpd: Ensure BGP does not stop monitoring nexthops
In some cases BGP can be monitoring the same prefix
in both the nexthop and import check tables. If this
is the case, when unregistering one bnc from one table
make sure we are not still registered in the other
Example of the problem:
r1(config-router)# address-family ipv4 uni
r1(config-router-af)# no network 192.168.100.41/32
r1(config-router-af)# exit
r1# show bgp import-check-table
Current BGP import check cache:
r1# show bgp nexthop
Current BGP nexthop cache:
192.168.100.41 valid [IGP metric 0], #paths 1, peer 192.168.100.41
if r1-eth0
Last update: Wed Dec 6 11:01:40 2023
BGP now believes it is only watching 192.168.100.41 in the nexthop
cache, but zebra doesn't have anything:
r1# show ip import-check
VRF default:
Resolve via default: on
r1# show ip nht
VRF default:
Resolve via default: on
So if anything happens to the route that is being matched for
192.168.100.41 bgp is no longer going to be notified about this.
The source of this problem is that zebra has dropped the two different
tables into 1 table, while bgp has 2 tables to track this. The solution
to this problem (other than the rewrite that is being done ) is to have
BGP have a bit of smarts about looking in both tables for the bnc and
if found in both don't send the delete of the prefix tracking to zebra.
Donald Sharp [Wed, 6 Dec 2023 13:33:31 +0000 (08:33 -0500)]
zebra: Add connected with noprefixroute
Add ability for the connected routes to know
if they are a prefix route or not.
sharpd@eva:/work/home/sharpd/frr1$ ip addr show dev dummy1
13: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether aa:93:ce:ce:3f:62 brd ff:ff:ff:ff:ff:ff
inet 192.168.55.1/24 scope global noprefixroute dummy1
valid_lft forever preferred_lft forever
inet 192.168.56.1/24 scope global dummy1
valid_lft forever preferred_lft forever
inet6 fe80::a893:ceff:fece:3f62/64 scope link
valid_lft forever preferred_lft forever
sharpd@eva:/work/home/sharpd/frr1$ sudo vtysh -c "show int dummy1"
Interface dummy1 is up, line protocol is up
Link ups: 0 last: (never)
Link downs: 0 last: (never)
vrf: default
index 13 metric 0 mtu 1500 speed 0 txqlen 1000
flags: <UP,BROADCAST,RUNNING,NOARP>
Type: Ethernet
HWaddr: aa:93:ce:ce:3f:62
inet 192.168.55.1/24 noprefixroute
inet 192.168.56.1/24
inet6 fe80::a893:ceff:fece:3f62/64
Interface Type Other
Interface Slave Type None
protodown: off
sharpd@eva:/work/home/sharpd/frr1$ sudo vtysh -c "show ip route"
Codes: K - kernel route, C - connected, L - local, S - static,
R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric, t - Table-Direct,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp13s0, 00:00:08
K>* 169.254.0.0/16 [0/1000] is directly connected, virbr2 linkdown, 00:00:08
L>* 192.168.44.1/32 is directly connected, dummy2, 00:00:08
L>* 192.168.55.1/32 is directly connected, dummy1, 00:00:08
C>* 192.168.56.0/24 is directly connected, dummy1, 00:00:08
L>* 192.168.56.1/32 is directly connected, dummy1, 00:00:08
L>* 192.168.119.205/32 is directly connected, enp13s0, 00:00:08
sharpd@eva:/work/home/sharpd/frr1$ ip route show
default via 192.168.119.1 dev enp13s0 proto dhcp metric 100
169.254.0.0/16 dev virbr2 scope link metric 1000 linkdown
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.45.0/24 dev virbr2 proto kernel scope link src 192.168.45.1 linkdown
192.168.56.0/24 dev dummy1 proto kernel scope link src 192.168.56.1
192.168.119.0/24 dev enp13s0 proto kernel scope link src 192.168.119.205 metric 100
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown
sharpd@eva:/work/home/sharpd/frr1$ ip route show table 255
local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
local 127.0.0.1 dev lo proto kernel scope host src 127.0.0.1
broadcast 127.255.255.255 dev lo proto kernel scope link src 127.0.0.1
local 172.17.0.1 dev docker0 proto kernel scope host src 172.17.0.1
broadcast 172.17.255.255 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
local 192.168.44.1 dev dummy2 proto kernel scope host src 192.168.44.1
broadcast 192.168.44.255 dev dummy2 proto kernel scope link src 192.168.44.1
local 192.168.45.1 dev virbr2 proto kernel scope host src 192.168.45.1
broadcast 192.168.45.255 dev virbr2 proto kernel scope link src 192.168.45.1 linkdown
local 192.168.55.1 dev dummy1 proto kernel scope host src 192.168.55.1
broadcast 192.168.55.255 dev dummy1 proto kernel scope link src 192.168.55.1
local 192.168.56.1 dev dummy1 proto kernel scope host src 192.168.56.1
broadcast 192.168.56.255 dev dummy1 proto kernel scope link src 192.168.56.1
local 192.168.119.205 dev enp13s0 proto kernel scope host src 192.168.119.205
broadcast 192.168.119.255 dev enp13s0 proto kernel scope link src 192.168.119.205
local 192.168.122.1 dev virbr0 proto kernel scope host src 192.168.122.1
broadcast 192.168.122.255 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown
Donald Sharp [Wed, 6 Dec 2023 13:24:01 +0000 (08:24 -0500)]
zebra: Add ability to note that a address is NOPREFIXROUTE
The linux kernel can send up a flag that tells us that the
connected address is not a PREFIXROUTE. Add the ability
to note this and pass it up from the data plane.
Louis Scalbert [Thu, 30 Nov 2023 16:29:20 +0000 (17:29 +0100)]
staticd: fix changing to source auto in bfd monitor
When monitoring a static route with BFD multi-hop, the source IP can be
either configured or retrieved from NextHop-Tracking (NHT). After
removing a configured source, the source is supposed to be retrieved
from NHT but it remains to the previous value. This is problematic if
the user desires to fix the configuration of a incorrect source IP.
For example, theses two commands results in the incorrect state:
> ip route 10.0.0.0/24 10.1.0.1 bfd multi-hop source 10.2.2.2
> ip route 10.0.0.0/24 10.1.0.1 bfd multi-hop
When removing the source, BFD is unable to find the source from NHT via
bfd_nht_update() were called.
Force zebra to resend the information to BFD by unregistering and
registering again NHT. The (...)/frr-nexthops/nexthop northbound
apply_finish function will trigger a call to static_install_nexthop()
that does a call to static_zebra_nht_register(nh, true);
Fixes: b7ca809d1c ("lib: BFD automatic source selection") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 580c605194b3893a1d61a997a7b9d62e2d877427)
Chirag Shah [Tue, 5 Dec 2023 03:23:32 +0000 (19:23 -0800)]
bgpd: check bgp evpn instance presence in soo
(pi=pi@entry=0x55e86ec1a5a0, evp=evp@entry=0x7fff4edc2160)
at bgpd/bgp_evpn.c:3623
3623 bgpd/bgp_evpn.c: No such file or directory.
(gdb) info locals
bgp_evpn = 0x0
macvrf_soo = <optimized out>
ret = false
__func__ = <optimized out>
(pi=pi@entry=0x55e86ec1a5a0, evp=evp@entry=0x7fff4edc2160)
at bgpd/bgp_evpn.c:3623
(bgp=bgp@entry=0x55e86e9cd010, afi=afi@entry=AFI_L2VPN,
safi=safi@entry=SAFI_EVPN, p=p@entry=0x0,
pi=pi@entry=0x55e86ec1a5a0, import=import@entry=1,
in_vrf_rt=true,
in_vni_rt=true) at bgpd/bgp_evpn.c:4200
(import=1, pi=pi@entry=0x55e86ec1a5a0, p=p@entry=0x0,
safi=safi@entry=SAFI_EVPN, afi=afi@entry=AFI_L2VPN,
bgp=bgp@entry=0x55e86e9cd010) at bgpd/bgp_evpn.c:6266
afi=afi@entry=AFI_L2VPN, safi=safi@entry=SAFI_EVPN,
p=p@entry=0x7fff4edc2160, pi=pi@entry=0x55e86ec1a5a0)
at bgpd/bgp_evpn.c:6266
(peer=peer@entry=0x55e86ea35400, p=p@entry=0x7fff4edc2160,
addpath_id=addpath_id@entry=0, attr=attr@entry=0x7fff4edc4400,
afi=afi@entry=AFI_L2VPN, safi=<optimized out>,
safi@entry=SAFI_EVPN, type=9, sub_type=0, prd=0x7fff4edc2120,
label=0x7fff4edc211c, num_labels=1,
soft_reconfig=0, evpn=0x7fff4edc2130) at bgpd/bgp_route.c:4805
(peer=peer@entry=0x55e86ea35400, afi=afi@entry=AFI_L2VPN,
safi=safi@entry=SAFI_EVPN, attr=attr@entry=0x7fff4edc4400,
pfx=<optimized out>, psize=psize@entry=34,
addpath_id=0) at bgpd/bgp_evpn.c:4922
(peer=0x55e86ea35400, attr=0x7fff4edc4400, packet=<optimized out>,
withdraw=0) at bgpd/bgp_evpn.c:5997
(peer=peer@entry=0x55e86ea35400, attr=attr@entry=0x7fff4edc4400,
packet=packet@entry=0x7fff4edc43d0,
mp_withdraw=mp_withdraw@entry=0) at bgpd/bgp_packet.c:363
(peer=peer@entry=0x55e86ea35400, size=size@entry=161)
at bgpd/bgp_packet.c:2076
(thread=<optimized out>) at bgpd/bgp_packet.c:2931
Keelan10 [Wed, 29 Nov 2023 11:49:58 +0000 (15:49 +0400)]
bgpd: Free Memory for SRv6 Functions and Locator Chunks
Implement proper memory cleanup for SRv6 functions and locator chunks to prevent potential memory leaks.
The list callback deletion functions have been set.
The ASan leak log for reference:
```
***********************************************************************************
Address Sanitizer Error detected in bgp_srv6l3vpn_to_bgp_vrf.test_bgp_srv6l3vpn_to_bgp_vrf/r2.asan.bgpd.4180
Direct leak of 544 byte(s) in 2 object(s) allocated from:
#0 0x7f8d176a0d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x7f8d1709f238 in qcalloc lib/memory.c:105
#2 0x55d5dba6ee75 in sid_register bgpd/bgp_mplsvpn.c:591
#3 0x55d5dba6ee75 in alloc_new_sid bgpd/bgp_mplsvpn.c:712
#4 0x55d5dba6f3ce in ensure_vrf_tovpn_sid_per_af bgpd/bgp_mplsvpn.c:758
#5 0x55d5dba6fb94 in ensure_vrf_tovpn_sid bgpd/bgp_mplsvpn.c:849
#6 0x55d5dba7f975 in vpn_leak_postchange bgpd/bgp_mplsvpn.h:299
#7 0x55d5dba7f975 in vpn_leak_postchange_all bgpd/bgp_mplsvpn.c:3704
#8 0x55d5dbbb6c66 in bgp_zebra_process_srv6_locator_chunk bgpd/bgp_zebra.c:3164
#9 0x7f8d1716f08a in zclient_read lib/zclient.c:4459
#10 0x7f8d1713f034 in event_call lib/event.c:1974
#11 0x7f8d1708242b in frr_run lib/libfrr.c:1214
#12 0x55d5db99d19d in main bgpd/bgp_main.c:510
#13 0x7f8d160c5c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
Direct leak of 296 byte(s) in 1 object(s) allocated from:
#0 0x7f8d176a0d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x7f8d1709f238 in qcalloc lib/memory.c:105
#2 0x7f8d170b1d5f in srv6_locator_chunk_alloc lib/srv6.c:135
#3 0x55d5dbbb6a19 in bgp_zebra_process_srv6_locator_chunk bgpd/bgp_zebra.c:3144
#4 0x7f8d1716f08a in zclient_read lib/zclient.c:4459
#5 0x7f8d1713f034 in event_call lib/event.c:1974
#6 0x7f8d1708242b in frr_run lib/libfrr.c:1214
#7 0x55d5db99d19d in main bgpd/bgp_main.c:510
#8 0x7f8d160c5c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
***********************************************************************************