path: root/zebra
Age | Commit message | Author
2025-04-19 | Merge pull request #18692 from donaldsharp/event_cleanups (HEAD, master) | Donatas Abraitis
zebra: Save event pointer for rib sweeping
2025-04-18 | zebra: Save event pointer for rib sweeping | Donald Sharp
When not doing graceful restart, the rib_sweep_route function does not save its event on the t_rib_sweep pointer for shutdown. Prevent any odd behavior by saving it, so that shutdown can clean up the rib_sweep_route event.
Signed-off-by: Donald Sharp <donaldsharp72@gmail.com>
2025-04-16 | Merge pull request #18497 from krishna-samy/show-metaq-counters | Mark Stapp
zebra: show command to display metaq info
2025-04-16 | Merge pull request #18579 from krishna-samy/krishna/dplane_fpm_read | Mark Stapp
zebra: change fpm_read to batch the messages
2025-04-16 | zebra: change fpm_read to batch the messages | Krishnasamy
Change fpm_read to build a list of ctx and send it to zebra for processing, rather than sending each ctx individually.
Signed-off-by: Krishnasamy <krishnasamyr@nvidia.com>
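The batching idea described above can be sketched roughly as follows. This is a minimal illustration, not the fpm_read code: `struct ctx`, `struct ctx_batch`, `batch_add`, `batch_flush` and `count_handoff` are all hypothetical names, and the real dataplane context type is different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a dataplane context; not the FRR type. */
struct ctx {
	int id;
	struct ctx *next;
};

/* A simple FIFO batch of contexts. */
struct ctx_batch {
	struct ctx *head;
	struct ctx *tail;
	size_t count;
};

static void batch_init(struct ctx_batch *b)
{
	b->head = b->tail = NULL;
	b->count = 0;
}

/* Append one decoded message to the batch instead of sending it right away. */
static void batch_add(struct ctx_batch *b, struct ctx *c)
{
	assert(c != NULL);
	c->next = NULL;
	if (b->tail)
		b->tail->next = c;
	else
		b->head = c;
	b->tail = c;
	b->count++;
}

/* Hand the whole batch over in a single call; returns the items delivered. */
static size_t batch_flush(struct ctx_batch *b,
			  void (*consume)(struct ctx *list, size_t n))
{
	size_t n = b->count;

	if (n == 0)
		return 0;
	consume(b->head, n);
	batch_init(b);
	return n;
}

/* Example consumer that only counts how many hand-offs happened. */
static size_t handoffs;

static void count_handoff(struct ctx *list, size_t n)
{
	(void)list;
	(void)n;
	handoffs++;
}
```

With this shape, three messages read in one pass cost one hand-off to the consumer instead of three.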
2025-04-14 | Merge pull request #18641 from donaldsharp/fpm_listener_storage | Jafar Al-Gharaibeh
zebra: Add ability to dump routes received from fpm_listener
2025-04-11 | Merge pull request #18645 from louis-6wind/fix-zebra-pbr-leak | Donald Sharp
zebra: fix pbr_iptable memory leak
2025-04-11 | zebra: clean pbr_iptable interface_name_list free | Louis Scalbert
Clean up the code that frees pbr_iptable->interface_name_list. This is a cosmetic change.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-04-11 | zebra: fix pbr_iptable memory leak | Louis Scalbert
We are obviously deleting the wrong object.
> Direct leak of 40 byte(s) in 1 object(s) allocated from:
> #0 0x7fcf718b4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
> #1 0x7fcf7126f8dd in qcalloc lib/memory.c:105
> #2 0x7fcf7124401a in list_new lib/linklist.c:49
> #3 0x55771621d86d in pbr_iptable_alloc_intern zebra/zebra_pbr.c:1015
> #4 0x7fcf71217d79 in hash_get lib/hash.c:147
> #5 0x55771621dad3 in zebra_pbr_add_iptable zebra/zebra_pbr.c:1030
> #6 0x55771614d00c in zread_iptable zebra/zapi_msg.c:4131
> #7 0x55771614e586 in zserv_handle_commands zebra/zapi_msg.c:4424
> #8 0x5577162dae2c in zserv_process_messages zebra/zserv.c:521
> #9 0x7fcf7137798e in event_call lib/event.c:2011
> #10 0x7fcf71242ff1 in frr_run lib/libfrr.c:1216
> #11 0x5577160e4d6d in main zebra/main.c:540
> #12 0x7fcf70c29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> Indirect leak of 24 byte(s) in 1 object(s) allocated from:
> #0 0x7fcf718b4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
> #1 0x7fcf7126f8dd in qcalloc lib/memory.c:105
> #2 0x7fcf71244129 in listnode_new lib/linklist.c:71
> #3 0x7fcf71244238 in listnode_add lib/linklist.c:92
> #4 0x55771621d938 in pbr_iptable_alloc_intern zebra/zebra_pbr.c:1019
> #5 0x7fcf71217d79 in hash_get lib/hash.c:147
> #6 0x55771621dad3 in zebra_pbr_add_iptable zebra/zebra_pbr.c:1030
> #7 0x55771614d00c in zread_iptable zebra/zapi_msg.c:4131
> #8 0x55771614e586 in zserv_handle_commands zebra/zapi_msg.c:4424
> #9 0x5577162dae2c in zserv_process_messages zebra/zserv.c:521
> #10 0x7fcf7137798e in event_call lib/event.c:2011
> #11 0x7fcf71242ff1 in frr_run lib/libfrr.c:1216
> #12 0x5577160e4d6d in main zebra/main.c:540
> #13 0x7fcf70c29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
Fixes: f80ec7e3d6 ("zebra: handle iptable list of interfaces")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-04-11 | zebra: split up MTYPE_PBR_OBJ | Louis Scalbert
Split up MTYPE_PBR_OBJ into dedicated MTYPEs to clarify memory allocation and freeing.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-04-10 | zebra: Add ability to dump routes received from fpm_listener | Donald Sharp
The fpm_listener currently has no ability to store the list of prefixes that it has received. Modify the code to store the prefixes in a typesafe RB tree. Additionally, dump the routes when a SIGUSR1 is received. If the operator specifies -z <filename>, write the routes to that file, overwriting the previously written version.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
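One common way to implement "dump on SIGUSR1" is to have the handler set a flag that the main loop checks later. The sketch below assumes that pattern and uses plain POSIX `sigaction`; it is not taken from fpm_listener, and `install_dump_handler` and `take_dump_request` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler, consumed by the main loop. */
static volatile sig_atomic_t dump_requested;

/* Async-signal-safe handler: it only records that SIGUSR1 arrived. */
static void on_sigusr1(int signo)
{
	(void)signo;
	dump_requested = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_dump_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigusr1;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGUSR1, &sa, NULL);
}

/* Called from the main loop: returns 1 once per pending request. */
static int take_dump_request(void)
{
	if (!dump_requested)
		return 0;
	dump_requested = 0;
	return 1;
}
```

The main loop would walk the stored prefixes and write them out whenever `take_dump_request()` returns 1, which keeps file I/O out of the signal handler.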
2025-04-10 | zebra: modify fpm_listener to display data about nhgs | Donald Sharp
Currently the fpm_listener completely ignores NHG's. Let's start dumping some data about the nexthop groups:
[2025-04-10 16:55:12.939235306] FPM message - Type: 1, Length 52
[2025-04-10 16:55:12.939254252] Nexthop Group ID: 9, Protocol: Zebra(11), Contains 1 nexthops, Family: 2, Scope: 0
[2025-04-10 16:55:12.939260564] FPM message - Type: 1, Length 52
[2025-04-10 16:55:12.939263990] Nexthop Group ID: 10, Protocol: Zebra(11), Contains 1 nexthops, Family: 2, Scope: 0
[2025-04-10 16:55:12.939268659] FPM message - Type: 1, Length 56
[2025-04-10 16:55:12.939271635] Nexthop Group ID: 8, Protocol: Zebra(11), Contains 2 nexthops, Family: 0, Scope: 0
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-04-09 | zebra: Fix shadow warning in irdp_packet.c | Donald Sharp
My compiler is complaining about irdp_sock being a shadow variable.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-04-08 | zebra: clean up -Wshadow compiler warnings | Mark Stapp
Clean up variable-shadowing compiler warnings.
Signed-off-by: Mark Stapp <mjs@cisco.com>
2025-04-01 | Merge pull request #18450 from donaldsharp/bgp_packet_reads | Russ White
Bgp packet reads conversion to a FIFO
2025-04-01 | zebra: show command to display metaq info | Krishnasamy
Display the following info from the metaq and its sub queues:
1. Current queue size
2. Max/Highwater size
3. Total number of events received so far
r1# sh zebra metaq
MetaQ Summary
Current Size : 0
Max Size     : 9
Total        : 20
|------------------------------------------------------------------|
| SubQ                             | Current  | Max Size  | Total  |
|----------------------------------+----------+-----------+--------|
| NHG Objects                      | 0        | 0         | 0      |
|----------------------------------+----------+-----------+--------|
| EVPN/VxLan Objects               | 0        | 0         | 0      |
|----------------------------------+----------+-----------+--------|
| Early Route Processing           | 0        | 8         | 11     |
|----------------------------------+----------+-----------+--------|
| Early Label Handling             | 0        | 0         | 0      |
|----------------------------------+----------+-----------+--------|
| Connected Routes                 | 0        | 6         | 9      |
|----------------------------------+----------+-----------+--------|
| Kernel Routes                    | 0        | 0         | 0      |
|----------------------------------+----------+-----------+--------|
| Static Routes                    | 0        | 0         | 0      |
|----------------------------------+----------+-----------+--------|
| RIP/OSPF/ISIS/EIGRP/NHRP Routes  | 0        | 0         | 0      |
|----------------------------------+----------+-----------+--------|
| BGP Routes                       | 0        | 0         | 0      |
|----------------------------------+----------+-----------+--------|
| Other Routes                     | 0        | 0         | 0      |
|----------------------------------+----------+-----------+--------|
| Graceful Restart                 | 0        | 0         | 0      |
|------------------------------------------------------------------|
Signed-off-by: Krishnasamy <krishnasamyr@nvidia.com>
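The three counters listed above (current, max/high-water, total) can be maintained with a few lines per queue. This is a minimal sketch under the assumption that they are updated on every enqueue and dequeue; `struct queue_stats` and its field names are illustrative, not the zebra structures.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-queue counters; field names are illustrative. */
struct queue_stats {
	uint32_t current;  /* items queued right now */
	uint32_t max_size; /* high-water mark of 'current' */
	uint64_t total;    /* items ever enqueued */
};

/* Account for one item entering the queue. */
static void stats_on_enqueue(struct queue_stats *s)
{
	s->current++;
	s->total++;
	if (s->current > s->max_size)
		s->max_size = s->current;
}

/* Account for one item leaving the queue. */
static void stats_on_dequeue(struct queue_stats *s)
{
	assert(s->current > 0);
	s->current--;
}
```

Once a queue drains, `current` returns to 0 while `max_size` and `total` keep their values, which matches the shape of rows such as "Early Route Processing | 0 | 8 | 11".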
2025-03-30 | zebra: Clean up memory associated with affinity maps | Donald Sharp
Zebra is using affinity maps but not cleaning up memory on shutdown. BAD!
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-25 | zebra: Limit reading packets when MetaQ is full | Donald Sharp
Currently Zebra just reads packets off the zapi wire and stacks them up for later processing in zebra. When there is significant churn in the network, the size of zebra can grow without bounds due to the MetaQ sizing constraints. This shows up in the number of nexthops in the system. Reducing the number of packets serviced, so that the MetaQ size is limited to the packets to process, alleviates this problem.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
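A read budget of the kind described above could look like the following. This is only a sketch of one possible policy: the function name, its parameters and the exact rule (never read more than the room left in the queue) are assumptions, not the zebra implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * How many packets may be read in this pass.
 * per_pass_limit: the usual packets-to-process budget (assumed name)
 * queue_limit:    the size at which the queue counts as full (assumed name)
 * queue_size:     the current queue size
 */
static uint32_t packets_to_read(uint32_t per_pass_limit, uint32_t queue_limit,
				uint32_t queue_size)
{
	uint32_t room;

	assert(per_pass_limit > 0);
	if (queue_size >= queue_limit)
		return 0; /* queue is full: read nothing this pass */
	room = queue_limit - queue_size;
	return room < per_pass_limit ? room : per_pass_limit;
}
```

The effect is back-pressure: unread packets stay in the socket buffer instead of accumulating inside the process.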
2025-03-21 | Merge pull request #18359 from soumyar-roy/soumya/streamsize | Mark Stapp
zebra: zebra crash for zapi stream
2025-03-20 | Merge pull request #18409 from donaldsharp/typesafe_zclient | Russ White
Typesafe zclient
2025-03-20 | zebra: reduce memory usage by streams when redistributing routes | Soumya Roy
This commit undoes 8c9b007a0c7efb2e9afc2eac936ba9dd971c6707. The stream lib has been modified to expand the stream if needed, so zapi route encode now uses an expandable stream.
Signed-off-by: Soumya Roy <souroy@nvidia.com>
2025-03-20 | zebra: zebra crash for zapi stream | Soumya Roy
Issue: If a static route is created with a BGP route as nexthop, which recursively resolves over 512 ECMP v6 nexthops, zapi nexthop encode fails because not enough memory is allocated for the stream. This causes an assert/core dump in zebra. Right now we allocate a fixed amount of memory of ZEBRA_MAX_PACKET_SIZ size.
Fix:
1) Dynamically calculate the required memory size for the stream
2) Try to optimize memory usage
Testing: No crash happens anymore with the fix
r1#
r1# sharp install routes 2100:cafe:: nexthop 2001:db8::1 1000
r1#
r2# conf
r2(config)# ipv6 route 2503:feca::100/128 2100:cafe::1
r2(config)# exit
r2#
Signed-off-by: Soumya Roy <souroy@nvidia.com>
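Sizing the buffer from the nexthop count, as the fix describes, amounts to a small calculation. The sketch below shows the arithmetic only; the function name and the byte counts used in the example are placeholders, not the real zapi encoding sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bytes needed to encode one route: a fixed part plus one record per nexthop.
 * All three inputs are assumptions supplied by the caller.
 */
static size_t encoded_route_size(size_t fixed_len, size_t per_nexthop_len,
				 size_t nexthop_count)
{
	/* Guard the multiplication and addition against wrap-around. */
	assert(per_nexthop_len == 0 ||
	       nexthop_count <= (SIZE_MAX - fixed_len) / per_nexthop_len);
	return fixed_len + per_nexthop_len * nexthop_count;
}
```

For example, with a placeholder 100-byte fixed part and 64 bytes per nexthop, 512 nexthops need 100 + 64 * 512 = 32868 bytes, which shows how a route with many nexthops can outgrow a fixed-size buffer.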
2025-03-19 | zebra: Add timestamp to output | Donald Sharp
It's interesting to know the time we received the route.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-19 | zebra: Allow fpm_listener to reject all routes | Donald Sharp
Using `-r -f` with fpm_listener now causes all routes to be rejected.
r1# sharp install routes 10.0.0.0 nexthop 192.168.44.5 5
r1# show ip route
Codes: K - kernel route, C - connected, L - local, S - static,
R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric, t - Table-Direct,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
IPv4 unicast VRF default:
D>o 10.0.0.0/32 [150/0] via 192.168.44.5, r1-eth0, weight 1, 00:00:02
D>o 10.0.0.1/32 [150/0] via 192.168.44.5, r1-eth0, weight 1, 00:00:02
D>o 10.0.0.2/32 [150/0] via 192.168.44.5, r1-eth0, weight 1, 00:00:02
D>o 10.0.0.3/32 [150/0] via 192.168.44.5, r1-eth0, weight 1, 00:00:02
D>o 10.0.0.4/32 [150/0] via 192.168.44.5, r1-eth0, weight 1, 00:00:02
C>* 192.168.44.0/24 is directly connected, r1-eth0, weight 1, 00:00:37
L>* 192.168.44.1/32 is directly connected, r1-eth0, weight 1, 00:00:37
r1#
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-19 | zebra: Rework the stale client list to a typesafe list | Donald Sharp
The stale client list was just a linked list, let's use the typesafe list.
Signed-off-by: Donald Sharp <donaldsharp72@gmail.com>
2025-03-19 | zebra: Convert the zrouter.client_list to a typesafe list | Donald Sharp
This list should just be a typesafe list.
Signed-off-by: Donald Sharp <donaldsharp72@gmail.com>
2025-03-19 | Merge pull request #18374 from raja-rajasekar/rajasekarr/nhg_intf_flap_issue | Russ White
zebra: Fix reinstalling nexthops in NHGs upon interface flaps
2025-03-18 | zebra: Fix reinstalling nexthops in NHGs upon interface flaps | Rajasekar Raja
Trigger: Imagine a route utilizing an NHG with six nexthops (Intf swp1-swp6). If interfaces swp1-swp4 flap, the NHG remains the same but now only references two nexthops (swp5-6) instead of all six.
This behavior occurs due to how NHGs with recursive nexthops are managed within Zebra. In the scenario below, NHG 370 has all six nexthops installed in the kernel. However, Zebra maintains a list of recursive NHGs that NHG 370 references, i.e., Depends: (371), (372), (373), which are not directly installed in the kernel.
- When an interface comes up, its nexthop and corresponding dependents are installed.
- These dependents (counterparts to 371-373) are non-recursive and are installed as well.
- However, when attempting to install the recursive ones in zebra_nhg_install_kernel(), they resolve to the already installed counterparts, resulting in a NO-OP.
Fix this by iterating over all dependents of the recursively resolved NHGs and reinstalling them.
Trigger: Flap swp1 to swp4
Before Fix:
root@leaf-11:mgmt:/var/home/cumulus# ip route show | grep 6.0.0.5
6.0.0.5 nhid 370 proto bgp metric 20
ip -d next show
id 337 via 2000:1:0:1:0:f:0:9 dev swp6 scope link proto zebra
id 339 via 2000:1:0:1:0:e:0:9 dev swp5 scope link proto zebra
id 341 via 2000:1:0:1:0:8:0:8 dev swp4 scope link proto zebra
id 343 via 2000:1:0:1:0:7:0:8 dev swp3 scope link proto zebra
id 346 via 2000:1:0:1:0:1:0:7 dev swp2 scope link proto zebra
id 348 via 2000:1:0:1::7 dev swp1 scope link proto zebra
id 370 group 346/348/341/343/337/339 scope global proto zebra
After Trigger:
root@leaf-11:mgmt:/var/home/cumulus# ip route show | grep 6.0.0.5
6.0.0.5 nhid 370 proto bgp metric 20
root@leaf-11:mgmt:/var/home/cumulus# ip -d next show
id 337 via 2000:1:0:1:0:f:0:9 dev swp6 scope link proto zebra
id 339 via 2000:1:0:1:0:e:0:9 dev swp5 scope link proto zebra
id 370 group 337/339 scope global proto zebra
After Fix:
root@leaf-11:mgmt:/var/home/cumulus# ip route show | grep 6.0.0.5
6.0.0.5 nhid 432 proto bgp metric 20
ip -d next show
id 432 group 395/397/400/402/405/407 scope global proto zebra
After Trigger
root@leaf-11:mgmt:/var/home/cumulus# ip route show | grep 6.0.0.5
6.0.0.5 nhid 432 proto bgp metric 20
root@leaf-11:mgmt:/var/home/cumulus# ip -d next show
id 432 group 395/397/400/402/405/407 scope global proto zebra
Ticket :#
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-18 | Merge pull request #18349 from donaldsharp/more_yang_state | Russ White
More yang state
2025-03-17 | zebra: add rtadv information output in vtysh json | Dmytro Shytyi
Add multiple rtadv parameters to the "show interface json" output.
if_dump_vty() calls hook_call(zebra_if_extra_info, vty, ifp);
if_dump_vty_json() now makes the same call with an additional parameter: hook_call(zebra_if_extra_info, vty, json_if, ifp);
Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
2025-03-15 | Merge pull request #18394 from donaldsharp/fpm_listener_output | Donatas Abraitis
zebra: add ability to specify output file with fpm_listener
2025-03-14 | zebra: add ability to specify output file with fpm_listener | Donald Sharp
The fpm_listener didn't have the ability to specify the output file location at all. Modify the code to accept this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-14 | Merge pull request #18360 from raja-rajasekar/rajasekarr/fix_explicit_sid_allocation | Jafar Al-Gharaibeh
zebra: ensure proper return for failure for Sid allocation
2025-03-12 | Merge pull request #18336 from routingrocks/rvaratharaj/bugfixmar | Mark Stapp
zebra: Fix neigh delete causing heap-use-after-free error
2025-03-11 | zebra: Fix neigh delete causing heap-use-after-free error | Rajesh Varatharaj
Issue: Not freeing the neighbor n within the same function can lead to a memory leak. zebra_neigh_del_all() -> zebra_neigh_del() looks the entry up again and frees it.
Fix: Do not access n after it is freed. Directly free the neighbor entry (n) when its interface index matches ifp->ifindex.
This fixes:
ERROR: AddressSanitizer: heap-use-after-free on address 0x6070001052e8 at pc 0x7f6bf7d09ddb bp 0x7ffd3366a000 sp 0x7ffd33669ff0
READ of size 8 at 0x6070001052e8 thread T0
#0 0x7f6bf7d09dda in _rb_next lib/openbsd-tree.c:455
#1 0x55f95a307261 in zebra_neigh_rb_head_RB_NEXT zebra/zebra_neigh.h:34
#2 0x55f95a3082e9 in zebra_neigh_del_all zebra/zebra_neigh.c:162
#3 0x55f95a121ee7 in zebra_interface_down_update zebra/redistribute.c:571
#4 0x55f95a0f819d in if_down zebra/interface.c:1017
#5 0x55f95a0fe168 in zebra_if_dplane_ifp_handling zebra/interface.c:2102
#6 0x55f95a0ff10c in zebra_if_dplane_result zebra/interface.c:2241
#7 0x55f95a27ce9c in rib_process_dplane_results zebra/zebra_rib.c:5015
#8 0x7f6bf7da3ad9 in event_call lib/event.c:1984
#9 0x7f6bf7c62141 in frr_run lib/libfrr.c:1246
#10 0x55f95a11ca7f in main zebra/main.c:543
#11 0x7f6bf7029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#12 0x7f6bf7029e3f in __libc_start_main_impl ../csu/libc-start.c:392
#13 0x55f95a0dd0b4 in _start (/usr/lib/frr/zebra+0x1a80b4)
Ticket: #18047
Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>
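The general rule behind this fix, never touching an entry after it has been freed while walking a container, can be illustrated with a simplified example. The sketch uses a singly linked list rather than the RB tree zebra actually uses, and `struct neigh`, `neigh_add` and `neigh_del_all` here are hypothetical stand-ins, not the zebra functions of similar name.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified neighbor entry; the real code keeps these in an RB tree. */
struct neigh {
	int ifindex;
	struct neigh *next;
};

/* Push a new entry on the front of the list and return the new head. */
static struct neigh *neigh_add(struct neigh *head, int ifindex)
{
	struct neigh *n = malloc(sizeof(*n));

	assert(n != NULL);
	n->ifindex = ifindex;
	n->next = head;
	return n;
}

/* Remove and free every entry matching ifindex; returns the number freed. */
static int neigh_del_all(struct neigh **head, int ifindex)
{
	struct neigh **link = head;
	int freed = 0;

	while (*link) {
		struct neigh *n = *link;

		if (n->ifindex == ifindex) {
			/* Unlink first, reading n->next while n is still valid. */
			*link = n->next;
			free(n);
			freed++;
		} else {
			link = &n->next;
		}
	}
	return freed;
}
```

The key point is the ordering inside the loop: the successor is captured before `free(n)`, so the iteration never dereferences freed memory.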
2025-03-11 | Merge pull request #16614 from louis-6wind/fix-otable-heap-after-free | Mark Stapp
zebra: fix table heap-after-free crash
2025-03-10 | zebra: ensure proper return for failure for Sid allocation | Rajasekar Raja
The functions alloc_srv6_sid_func_explicit/dynamic are expected to return bool, but there are places where we return -1 or NULL, which the caller treats as true/valid and so ends up allocating the SID.
Without Fix:
2025/03/10 21:44:04.295350 ZEBRA: [XWV20-TGK70] alloc_srv6_sid_func_explicit: trying to allocate explicit SID function 65088 from block fcbb:bbbb::/32
2025/03/10 21:44:04.295351 ZEBRA: [MM61M-TQZNP] alloc_srv6_sid_func_explicit: elib s 10000 e 20000 wlib s 1000 ewlib s 30000 e 1000 SID_FUNC 65088
2025/03/10 21:44:04.295352 ZEBRA: [QGHMB-SWNFW] alloc_srv6_sid_func_explicit: function 65088 is outside ELIB [10000/20000] and EWLIB alloc ranges [30000/1000]
2025/03/10 21:44:04.295367 ZEBRA: [H0GZA-NNSWJ] get_srv6_sid_explicit: allocated explicit SRv6 SID fcbb:bbbb:1:fe40:: for context End.X nh6 2001::2
2025/03/10 21:44:04.295368 ZEBRA: [XBBYD-T1Q7P] srv6_manager_get_sid_internal: got new SRv6 SID for ctx End.X nh6 2001::2: sid_value=fcbb:bbbb:1:fe40:: (func=65088) (proto=4, instance=0, sessionId=0), notifying all clients
With Fix:
2025/03/10 22:04:25.052235 ZEBRA: [MM61M-TQZNP] alloc_srv6_sid_func_explicit: elib s 30000 e 31000 wlib s 31000 ewlib s 30000 e 31000 SID_FUNC 65056
2025/03/10 22:04:25.052236 ZEBRA: [YHMRC-EMYNX] alloc_srv6_sid_func_explicit: function 65056 is outside ELIB [30000/31000] and EWLIB alloc ranges [30000/31000]
2025/03/10 22:04:25.052254 ZEBRA: [XSG8X-Q2XJX] get_srv6_sid_explicit: invalid SM request arguments: failed to allocate SID function 65056 from block fcbb:bbbb::/32
2025/03/10 22:04:25.052257 ZEBRA: [YC52T-427SJ] srv6_manager_get_sid_internal: not got SRv6 SID for ctx End.DT6 vrf_id 4, sid_value=fcbb:bbbb:1:fe20::, locator_name=MAIN
root@rajasekarr:/tmp/topotests/static_srv6_sids.test_static_srv6_sids/r1#
Ticket :#
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
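The reason a -1 return reads as success is a C language rule: any non-zero value converts to `true` when returned from a function declared `bool`. The sketch below demonstrates just that rule; `alloc_func_buggy` and `alloc_func_fixed` are hypothetical functions with a made-up range check, not the SRv6 code.

```c
#include <assert.h>
#include <stdbool.h>

/* Buggy shape: the error code -1 is non-zero, so it converts to true. */
static bool alloc_func_buggy(int func, int lo, int hi)
{
	int rc = -1; /* intended as an error code */

	assert(lo <= hi);
	if (func < lo || func > hi)
		return rc; /* converted to bool: becomes true */
	return true;
}

/* Fixed shape: failure is reported as false. */
static bool alloc_func_fixed(int func, int lo, int hi)
{
	assert(lo <= hi);
	if (func < lo || func > hi)
		return false;
	return true;
}
```

A caller that writes `if (alloc_func_buggy(...))` therefore proceeds as if the allocation had succeeded, which is the symptom visible in the "Without Fix" log.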
2025-03-10 | lib, tests, zebra: keep table routes at vrf disabling | Louis Scalbert
At VRF disabling, keep the route entries that are associated with its table ID but not with the VRF itself. The kernel flushes these entries, so we need to reinstall them.
To do so, add a flag meaning that a route entry is owned by a table ID and not by a VRF. If the VRF associated with the table ID is deleted, the route entry must not be deleted.
Update the tests for the new flag. 2057 is 0x809 in hexadecimal, meaning that the new flag has been added to some prefixes.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
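To read a flags value such as 2057, it helps to split it into its individual bits. The helper below is a generic sketch with a hypothetical name; it makes no claim about which bit corresponds to the new flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write each set bit of 'flags' into out[] (lowest first); returns how many. */
static size_t split_flag_bits(uint32_t flags, uint32_t out[32])
{
	size_t n = 0;

	assert(out != NULL);
	/* 'bit' wraps to 0 after the top bit, which ends the loop. */
	for (uint32_t bit = 1; bit != 0; bit <<= 1) {
		if (flags & bit)
			out[n++] = bit;
	}
	return n;
}
```

For 2057 this yields 0x1, 0x8 and 0x800, and 0x800 + 0x8 + 0x1 = 0x809 = 2057.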
2025-03-10 | zebra: fix vanished blackhole route | Louis Scalbert
Fix vanished blackhole route when kernel routes are updated. > root@router# echo "100 my_table" | tee -a /etc/iproute2/rt_tables > root@router# ip l add du0 type dummy > root@router# ifconfig du0 192.168.0.1/24 up > root@router# ip route add blackhole default table 100 > root@router# ip route show table 100 > blackhole default > root@router# vtysh -c 'show ip route table 100' > [...] > Table 100: > K>* 0.0.0.0/0 [0/0] unreachable (blackhole), weight 1, 00:00:05 > root@router# ip l add red type vrf table 100 > root@router# vtysh -c 'show ip route table 100' > [...] > Table 100: > K>* 0.0.0.0/0 [0/0] unreachable (blackhole), weight 1, 00:00:16 > root@router# ip l set du0 master red > root@router# vtysh -c 'show ip route table 100' > [...] > Table 100: > C>* 192.168.0.0/24 is directly connected, du0, weight 1, 00:00:02 > L>* 192.168.0.1/32 is directly connected, du0, weight 1, 00:00:02 > root@router# ip route show table 100 > blackhole default > 192.168.0.0/24 dev du0 proto kernel scope link src 192.168.0.1 > local 192.168.0.1 dev du0 proto kernel scope host src 192.168.0.1 > broadcast 192.168.0.255 dev du0 proto kernel scope link src 192.168.0.1 Fixes: d528c02a20 ("zebra: Handle kernel routes appropriately") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-10 | zebra: fix removed default route at vrf enabling | Louis Scalbert
When a routing table (RT) already has a default route before being assigned to a VRF, the default route vanishes in zebra after the VRF assignment. > root@router:~# ip route add blackhole default table 100 > root@router:~# ip route show table 100 > blackhole default > root@router:~# vtysh -c 'show ip route table 100' > [...] > VRF default table 100: > K>* 0.0.0.0/0 [0/0] unreachable (blackhole), 00:00:05 > root@router:~# ip l add red type vrf table 100 > root@router:~# vtysh -c 'show ip route table 100' > root@router:~# Do not override the default route if it exists. Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-10 | zebra: remove vrf route entries at vrf disabling | Louis Scalbert
This is the continuation of the previous commit. When a VRF is deleted, the kernel retains only its own routing entries in the former VRF table and removes all others. This change ensures that routing entries created by FRR daemons are also removed from the former zebra VRF table when the VRF is disabled. To test: > echo "100 my_table" | tee -a /etc/iproute2/rt_tables > ip l add du0 type dummy > ifconfig du0 192.168.0.1/24 up > ip route add blackhole default table 100 > ip route show table 100 > ip l add red type vrf table 100 > ip l set du0 master red > vtysh -c 'configure' -c 'vrf red' -c 'ip route 10.0.0.0/24 192.168.0.254' > vtysh -c 'show ip route table 100' > sleep 0.1 > ip l del red > sleep 0.1 > vtysh -c 'show ip route table 100' > ip l add red type vrf table 100 > ip l set du0 master red > vtysh -c 'configure' -c 'vrf red' -c 'ip route 10.0.0.0/24 192.168.0.254' > vtysh -c 'show ip route table 100' > sleep 0.1 > ip l del red > sleep 0.1 > vtysh -c 'show ip route table 100' Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-10 | zebra: fix table heap-after-free crash | Louis Scalbert
Fix a heap-after-free that causes zebra to crash even without address-sanitizer. To reproduce: > echo "100 my_table" | tee -a /etc/iproute2/rt_tables > ip route add blackhole default table 100 > ip route show table 100 > ip l add red type vrf table 100 > ip l del red > ip route del blackhole default table 100 Zebra manages routing tables for all existing Linux RT tables, regardless of whether they are assigned to a VRF interface. When a table is not assigned to any VRF, zebra arbitrarily assigns it to the default VRF, even though this is not strictly accurate (the code expects this behavior). When an RT table is created after a VRF, zebra correctly assigns the table to the VRF. However, if a VRF interface is assigned to an existing RT table, zebra does not update the table owner, which remains as the default VRF. As a result, existing routing entries remain under the default VRF, while new entries are correctly assigned to the VRF. The VRF mismatch is unexpected in the code and creates crashes and memory related issues. Furthermore, Linux does not automatically delete RT tables when they are unassigned from a VRF. It is incorrect to delete these tables from zebra. Instead, at VRF disabling, do not release the table but reassign it to the default VRF. At VRF enabling, change the table owner back to the appropriate VRF. 
> ==2866266==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000154f54 at pc 0x7fa32474b83f bp 0x7ffe94f67d90 sp 0x7ffe94f67d88
> READ of size 1 at 0x606000154f54 thread T0
> #0 0x7fa32474b83e in rn_hash_node_const_find lib/table.c:28
> #1 0x7fa32474bab1 in rn_hash_node_find lib/table.c:28
> #2 0x7fa32474d783 in route_node_get lib/table.c:283
> #3 0x7fa3247328dd in srcdest_rnode_get lib/srcdest_table.c:231
> #4 0x55b0e4fa8da4 in rib_find_rn_from_ctx zebra/zebra_rib.c:1957
> #5 0x55b0e4fa8e31 in rib_process_result zebra/zebra_rib.c:1988
> #6 0x55b0e4fb9d64 in rib_process_dplane_results zebra/zebra_rib.c:4894
> #7 0x7fa32476689c in event_call lib/event.c:1996
> #8 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
> #9 0x55b0e4e6c32a in main zebra/main.c:526
> #10 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
> #11 0x55b0e4e2d649 in _start (/usr/lib/frr/zebra+0x1a1649)
>
> 0x606000154f54 is located 20 bytes inside of 56-byte region [0x606000154f40,0x606000154f78)
> freed by thread T0 here:
> #0 0x7fa324ca9b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
> #1 0x7fa324668d8f in qfree lib/memory.c:130
> #2 0x7fa32474c421 in route_table_free lib/table.c:126
> #3 0x7fa32474bf96 in route_table_finish lib/table.c:46
> #4 0x55b0e4fbca3a in zebra_router_free_table zebra/zebra_router.c:191
> #5 0x55b0e4fbccea in zebra_router_release_table zebra/zebra_router.c:214
> #6 0x55b0e4fd428e in zebra_vrf_disable zebra/zebra_vrf.c:219
> #7 0x7fa32476fabf in vrf_disable lib/vrf.c:326
> #8 0x7fa32476f5d4 in vrf_delete lib/vrf.c:231
> #9 0x55b0e4e4ad36 in interface_vrf_change zebra/interface.c:1478
> #10 0x55b0e4e4d5d2 in zebra_if_dplane_ifp_handling zebra/interface.c:1949
> #11 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
> #12 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
> #13 0x7fa32476689c in event_call lib/event.c:1996
> #14 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
> #15 0x55b0e4e6c32a in main zebra/main.c:526
> #16 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>
> previously allocated by thread T0 here:
> #0 0x7fa324caa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
> #1 0x7fa324668c4d in qcalloc lib/memory.c:105
> #2 0x7fa32474bf33 in route_table_init_with_delegate lib/table.c:38
> #3 0x7fa32474e73c in route_table_init lib/table.c:512
> #4 0x55b0e4fbc353 in zebra_router_get_table zebra/zebra_router.c:137
> #5 0x55b0e4fd4da0 in zebra_vrf_table_create zebra/zebra_vrf.c:358
> #6 0x55b0e4fd3d30 in zebra_vrf_enable zebra/zebra_vrf.c:140
> #7 0x7fa32476f9b2 in vrf_enable lib/vrf.c:286
> #8 0x55b0e4e4af76 in interface_vrf_change zebra/interface.c:1533
> #9 0x55b0e4e4d612 in zebra_if_dplane_ifp_handling zebra/interface.c:1968
> #10 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
> #11 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
> #12 0x7fa32476689c in event_call lib/event.c:1996
> #13 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
> #14 0x55b0e4e6c32a in main zebra/main.c:526
> #15 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-07 | zebra: Add mpls-forwarding to yang state model | Donald Sharp
The mpls-forwarding state was missing from the model; add it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-07 | zebra: Don't use MTYPE_TMP for l2 vni data | Donald Sharp
Convert over from MTYPE_TMP to MTYPE_L2_VNI as the data type.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-07 | zebra: Declutter zebra_vxlan_if_add_update_vni | Donald Sharp
This function has equivalent code on both sides of an if statement. Let's consolidate this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-07 | zebra: malloc functions cannot fail | Donald Sharp
Let's try to remember that when using a malloc function it can never fail and as such testing for NULL does nothing.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
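The "cannot fail" convention usually means the allocation wrapper aborts the process on out-of-memory instead of returning NULL, so callers never need a NULL check. The wrapper below is a hypothetical illustration of that convention, not the FRR allocator.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: returns zeroed memory or aborts, never NULL. */
static void *alloc_or_abort(size_t size)
{
	void *p;

	assert(size > 0);
	p = calloc(1, size);
	if (p == NULL) {
		fprintf(stderr, "out of memory allocating %zu bytes\n", size);
		abort();
	}
	return p;
}
```

A caller can then write `int *counter = alloc_or_abort(sizeof(*counter));` and use the pointer directly; an `if (counter == NULL)` branch after such a call would be dead code.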
2025-03-06 | Merge pull request #18214 from soumyar-roy/soumya/ra514nei | Donatas Abraitis
zebra: Bring up 514 BGP neighbor sessions
2025-03-05 | zebra: Bring up 514 BGP neighbor sessions | Soumya Roy
Issue: When 514 interfaces/neighbors are configured, a socket error, "Cannot allocate memory", occurs when back-to-back V6 RA messages are sent over the socket. This prevents an interface from learning its peer's link-local address. The socket error occurs when we:
1) try to join the ICMPv6 all-routers multicast group back to back for all interfaces
2) send back-to-back RAs for all interfaces
Fix:
1) For the ICMPv6 join case, check whether the interface has already joined the all-routers group and, if not, try to join. On failure, retry joining after a random amount of time between 1 ms and ICMPV6_JOIN_TIMER_EXP_MS (100 ms)
2) For the RA case, batch the sending of RA messages using a wheel timer
Testing: Monitor the BGP sessions by running the sh bgp summary command
Before fix:
r1# sh bgp summary
IPv4 Unicast Summary:
BGP router identifier 192.168.1.1, local AS number 1001 VRF default vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 515, using 12 MiB of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
r1-eth0 4 1002 89 90 0 0 0 00:07:10 0 0 N/A
r1-eth1 4 1002 89 90 0 0 0 00:07:10 0 0 N/A
r1-eth2 4 1002 89 90 0 0 0 00:07:10 0 0 N/A
r1-eth3 4 1002 89 90 0 0 0 00:07:10 0 0 N/A
r1-eth4 4 1002 89 90 0 0 0 00:07:10 0 0 N/A
r1-eth5 4 1002 89 90 0 0 0 00:07:10 0 0 N/A
…..<snip>...
r1-eth252 4 1002 31 29 0 0 0 00:02:08 0 0 N/A
r1-eth253 4 1002 31 29 0 0 0 00:02:08 0 0 N/A
r1-eth254 4 1002 31 29 0 0 0 00:02:08 0 0 N/A
r1-eth255 4 1002 31 29 0 0 0 00:02:08 0 0 N/A
r1-eth256 4 0 0 0 0 0 0 never Idle 0 N/A
r1-eth257 4 0 0 0 0 0 0 never Idle 0 N/A
r1-eth258 4 0 0 0 0 0 0 never Idle 0 N/A
r1-eth259 4 0 0 0 0 0 0 never Idle 0 N/A
r1-eth260 4 0 0 0 0 0 0 never Idle 0 N/A
……..<snip>…..
r1-eth511 4 0 0 0 0 0 0 never Idle 0 N/A
r1-eth512 4 0 0 0 0 0 0 never Idle 0 N/A
r1-eth513 4 0 0 0 0 0 0 never Idle 0 N/A
r1-eth514 4 0 0 0 0 0 0 never Idle 0 N/A
After fix:
r1# show bgp summary
IPv4 Unicast Summary:
BGP router identifier 192.168.1.1, local AS number 1001 VRF default vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 515, using 12 MiB of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
r1-eth0 4 1002 87 87 0 0 0 00:07:04 0 0 N/A
r1-eth1 4 1002 87 87 0 0 0 00:07:04 0 0 N/A
r1-eth2 4 1002 87 87 0 0 0 00:07:04 0 0 N/A
r1-eth3 4 1002 64 67 0 0 0 00:05:09 0 0 N/A
r1-eth4 4 1002 87 87 0 0 0 00:07:04 0 0 N/A
r1-eth5 4 1002 87 87 0 0 0 00:07:04 0 0 N/A
r1-eth6 4 1002 67 70 0 0 0 00:05:22 0 0 N/A
r1-eth7 4 1002 87 87 0 0 0 00:07:04 0 0 N/A
r1-eth8 4 1002 87 87 0 0 0 00:07:04 0 0 N/A
....
r1-eth499 4 1002 43 43 0 0 0 00:03:22 0 0 N/A
r1-eth500 4 1002 43 43 0 0 0 00:03:22 0 0 N/A
r1-eth501 4 1002 19 22 0 0 0 00:01:21 0 0 N/A
r1-eth502 4 1002 43 43 0 0 0 00:03:22 0 0 N/A
r1-eth503 4 1002 43 43 0 0 0 00:03:22 0 0 N/A
r1-eth504 4 1002 20 23 0 0 0 00:01:30 0 0 N/A
r1-eth505 4 1002 43 43 0 0 0 00:03:22 0 0 N/A
r1-eth506 4 1002 43 43 0 0 0 00:03:22 0 0 N/A
r1-eth507 4 1002 22 25 0 0 0 00:01:39 0 0 N/A
r1-eth508 4 1002 43 43 0 0 0 00:03:22 0 0 N/A
r1-eth509 4 1002 17 20 0 0 0 00:01:13 0 0 N/A
r1-eth510 4 1002 43 43 0 0 0 00:03:22 0 0 N/A
r1-eth511 4 1002 43 43 0 0 0 00:03:22 0 0 N/A
r1-eth512 4 1002 19 22 0 0 0 00:01:22 0 0 N/A
r1-eth513 4 1002 43 43 0 0 0 00:03:22 0 0 N/A
r1-eth514 4 1002 43 43 0 0 0 00:03:22 0 0 N/A
Signed-off-by: Soumya Roy <souroy@nvidia.com>
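The two parts of the fix, a randomized retry delay and spreading interfaces across timer-wheel slots, can be sketched as follows. Both helpers are hypothetical: the macro mirrors only the 100 ms bound quoted in the message, and mapping an interface to a slot by `ifindex % nslots` is an assumption about how a wheel might distribute work, not the FRR wheel API.

```c
#include <assert.h>
#include <stdlib.h>

/* Placeholder mirroring the 100 ms upper bound mentioned in the message. */
#define JOIN_RETRY_MAX_MS 100u

/* Random retry delay in the range [1, max_ms] milliseconds. */
static unsigned int join_retry_delay_ms(unsigned int max_ms)
{
	assert(max_ms >= 1);
	return 1u + (unsigned int)rand() % max_ms;
}

/* Pick the wheel slot for an interface so that sends are spread over time. */
static unsigned int wheel_slot(unsigned int ifindex, unsigned int nslots)
{
	assert(nslots > 0);
	return ifindex % nslots;
}
```

Randomizing the delay keeps hundreds of failed joins from retrying at the same instant, and slotting the interfaces means each wheel tick sends RAs for only a fraction of them.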
2025-03-04 | Merge pull request #18253 from dksharp5/yang_zebra | Russ White
Allow retrieval of v4/v6 forwarding state via NB
2025-03-03 | Merge pull request #18030 from fdumontet6WIND/mem_alloc_stream | Mark Stapp
zebra: reduce memory usage by streams when redistributing routes