path: root/zebra/zebra_dplane.c
Age  Commit message  Author
2024-06-19  zebra: Prevent starvation in dplane_thread_loop  Donald Sharp
When removing a large number of routes, the Linux kernel can take the CPU for an extended amount of time, leaving a situation where FRR detects a starvation event.

r1# sharp install routes 10.0.0.0 nexthop 192.168.44.33 1000000 repeat 10
2024-06-14 12:55:49.365 [NTFY] sharpd: [M7Q4P-46WDR] vty[5]@# sharp install routes 10.0.0.0 nexthop 192.168.44.33 1000000 repeat 10
2024-06-14 12:55:49.365 [DEBG] sharpd: [YP4TQ-01TYK] Inserting 1000000 routes
2024-06-14 12:55:57.256 [DEBG] sharpd: [TPHKD-3NYSB] Installed All Items 7.890085
2024-06-14 12:55:57.256 [DEBG] sharpd: [YJ486-NX5R1] Removing 1000000 routes
2024-06-14 12:56:07.802 [WARN] zebra: [QH9AB-Y4XMZ][EC 100663314] STARVATION: task dplane_thread_loop (634377bc8f9e) ran for 7078ms (cpu time 220ms)
2024-06-14 12:56:25.039 [DEBG] sharpd: [WTN53-GK9Y5] Removed all Items 27.783668
2024-06-14 12:56:25.039 [DEBG] sharpd: [YP4TQ-01TYK] Inserting 1000000 routes
2024-06-14 12:56:32.783 [DEBG] sharpd: [TPHKD-3NYSB] Installed All Items 7.743524
2024-06-14 12:56:32.783 [DEBG] sharpd: [YJ486-NX5R1] Removing 1000000 routes
2024-06-14 12:56:41.447 [WARN] zebra: [QH9AB-Y4XMZ][EC 100663314] STARVATION: task dplane_thread_loop (634377bc8f9e) ran for 5175ms (cpu time 179ms)

Modify the loop in dplane_thread_loop so that, after a provider has been run, it checks whether the event should yield; if so, it stops and reschedules itself for the future.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 6faad863f30d29157e4c675ad956e3ccd38991a7)
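The yield-and-reschedule idea described above can be sketched in standalone C. This is an illustration only, not the FRR code: `struct provider`, `run_providers()`, `count_call()`, and the millisecond budget are hypothetical names chosen for the sketch, and FRR's event library has its own yield check.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical provider: a callback that processes one batch of work. */
struct provider {
	void (*work)(void *arg);
	void *arg;
};

/* Example callback for the sketch: counts how often it was called. */
static void count_call(void *arg)
{
	(*(int *)arg)++;
}

/* Monotonic time elapsed since 'start', in milliseconds. */
static long elapsed_ms(const struct timespec *start)
{
	struct timespec now;

	clock_gettime(CLOCK_MONOTONIC, &now);
	return (long)(now.tv_sec - start->tv_sec) * 1000L +
	       (now.tv_nsec - start->tv_nsec) / 1000000L;
}

/*
 * Run providers starting at index 'first'. After each provider, check the
 * time budget; if it is spent and work remains, stop and return the index
 * to resume from so the caller can reschedule. Returns 'count' when every
 * provider has run.
 */
static size_t run_providers(struct provider *provs, size_t count,
			    size_t first, long budget_ms)
{
	struct timespec start;
	size_t i;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = first; i < count; i++) {
		provs[i].work(provs[i].arg);
		if (i + 1 < count && elapsed_ms(&start) >= budget_ms)
			return i + 1; /* yield: resume here next time */
	}
	return count;
}
```

A caller would keep invoking `run_providers()` from a rescheduled event, passing the returned index as `first`, until the return value equals `count`.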
2024-04-09  zebra: add dataplane API version value  Mark Stapp
Add a version value and accessor API for the zebra dataplane; plugins can test this to detect API changes.
Signed-off-by: Mark Stapp <mjs@cisco.com>
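A plugin-side check of such a version value might look like the sketch below. The commit message does not name the value or the accessor, so `DPLANE_API_VERSION`, `dplane_api_version()`, and `plugin_is_compatible()` are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical version value the dataplane was built with. */
#define DPLANE_API_VERSION 2u

/* Hypothetical accessor: stands in for whatever the dataplane exports. */
static uint32_t dplane_api_version(void)
{
	return DPLANE_API_VERSION;
}

/* A plugin built against 'built_for' proceeds only if the versions match. */
static bool plugin_is_compatible(uint32_t built_for)
{
	return dplane_api_version() == built_for;
}
```

Exact-match comparison is the strictest policy; a plugin could instead accept any version at or above the one it was built for, if the API is kept backward compatible.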
2024-03-15  zebra: changes for code maintainability  sri-mohan1
These changes improve code maintainability and readability.
Signed-off-by: sri-mohan1 <sri.mohan@samsung.com>
2024-01-04  zebra: `ctx` has to be non NULL at this point  Carmine Scarpitta
Fix the following coverity issue:

*** CID 1575079: Null pointer dereferences (REVERSE_INULL)
/zebra/zebra_dplane.c: 5950 in dplane_srv6_encap_srcaddr_set()
5944     if (ret == AOK)
5945         result = ZEBRA_DPLANE_REQUEST_QUEUED;
5946     else {
5947         atomic_fetch_add_explicit(&zdplane_info
5948             .dg_srv6_encap_srcaddr_set_errors,
5949             1, memory_order_relaxed);
CID 1575079: Null pointer dereferences (REVERSE_INULL)
Null-checking "ctx" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
5950         if (ctx)
5951             dplane_ctx_free(&ctx);
5952     }
5953     return result;
5954 }
5955

Remove the pointer check for `ctx`. At this point in the function it has to be non null since we deref'ed it. Additionally the alloc function that creates it cannot fail.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
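The REVERSE_INULL pattern and its fix can be shown in a small standalone sketch. All names here (`struct ctx`, `ctx_alloc()`, `ctx_free()`, `request_set()`) are hypothetical; the sketch only assumes, as the commit message states, an allocator that cannot return NULL.

```c
#include <assert.h>
#include <stdlib.h>

struct ctx {
	int op;
};

/* Allocation that never returns NULL: abort on failure instead. */
static struct ctx *ctx_alloc(void)
{
	struct ctx *c = calloc(1, sizeof(*c));

	if (c == NULL)
		abort();
	return c;
}

/* Free the context and clear the caller's pointer. */
static void ctx_free(struct ctx **pc)
{
	free(*pc);
	*pc = NULL;
}

/* Returns 0 when "queued", -1 on error; 'fail' simulates an enqueue error. */
static int request_set(int op, int fail)
{
	struct ctx *c = ctx_alloc();

	c->op = op; /* c is dereferenced here ... */
	if (fail) {
		/* ... so an "if (c)" guard before this free would be dead code. */
		ctx_free(&c);
		return -1;
	}
	/* A real queue consumer would own c; freed here to avoid a leak. */
	ctx_free(&c);
	return 0;
}
```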
2023-12-14  zebra: Add code to set SRv6 encap source addr in dplane  Carmine Scarpitta
Add a bunch of set functions and associated data structure in zebra_dplane to allow the configuration of the source address for SRv6 encap in the data plane.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
2023-12-06  zebra: Add ability to note that an address is NOPREFIXROUTE  Donald Sharp
The Linux kernel can send up a flag that tells us that the connected address is not a PREFIXROUTE. Add the ability to note this and pass it up from the data plane.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-11-28  Merge pull request #14811 from donaldsharp/zebra_final_shutdown_finally  Christian Hopps
Zebra final shutdown finally
2023-11-22  zebra: fix dplane_ctx_iptable use-after-free  Louis Scalbert
Fix a crash caused by a use-after-free.

> =================================================================
> ==1249835==ERROR: AddressSanitizer: heap-use-after-free on address 0x604000074210 at pc 0x7fa1b42a652c bp 0x7ffc477a2aa0 sp 0x7ffc477a2a98
> READ of size 8 at 0x604000074210 thread T0
>     #0 0x7fa1b42a652b in list_delete_all_node git/frr/lib/linklist.c:299:20
>     #1 0x7fa1b42a683f in list_delete git/frr/lib/linklist.c:312:2
>     #2 0x5ee515 in dplane_ctx_free_internal git/frr/zebra/zebra_dplane.c:858:4
>     #3 0x5ee59c in dplane_ctx_free git/frr/zebra/zebra_dplane.c:884:2
>     #4 0x5ee544 in dplane_ctx_fini git/frr/zebra/zebra_dplane.c:905:2
>     #5 0x7045c0 in rib_process_dplane_results git/frr/zebra/zebra_rib.c:4928:4
>     #6 0x7fa1b4434fb2 in event_call git/frr/lib/event.c:1970:2
>     #7 0x7fa1b42a0ccf in frr_run git/frr/lib/libfrr.c:1213:3
>     #8 0x556808 in main git/frr/zebra/main.c:488:2
>     #9 0x7fa1b3d0bd09 in __libc_start_main csu/../csu/libc-start.c:308:16
>     #10 0x4453e9 in _start (/usr/lib/frr/zebra+0x4453e9)
>
> 0x604000074210 is located 0 bytes inside of 40-byte region [0x604000074210,0x604000074238)
> freed by thread T0 here:
>     #0 0x4bf1dd in free (/usr/lib/frr/zebra+0x4bf1dd)
>     #1 0x7fa1b42df0c0 in qfree git/frr/lib/memory.c:130:2
>     #2 0x7fa1b42a68ce in list_free_internal git/frr/lib/linklist.c:24:2
>     #3 0x7fa1b42a6870 in list_delete git/frr/lib/linklist.c:313:2
>     #4 0x5ee515 in dplane_ctx_free_internal git/frr/zebra/zebra_dplane.c:858:4
>     #5 0x5ee59c in dplane_ctx_free git/frr/zebra/zebra_dplane.c:884:2
>     #6 0x5ee544 in dplane_ctx_fini git/frr/zebra/zebra_dplane.c:905:2
>     #7 0x7045c0 in rib_process_dplane_results git/frr/zebra/zebra_rib.c:4928:4
>     #8 0x7fa1b4434fb2 in event_call git/frr/lib/event.c:1970:2
>     #9 0x7fa1b42a0ccf in frr_run git/frr/lib/libfrr.c:1213:3
>     #10 0x556808 in main git/frr/zebra/main.c:488:2
>     #11 0x7fa1b3d0bd09 in __libc_start_main csu/../csu/libc-start.c:308:16
>
> previously allocated by thread T0 here:
>     #0 0x4bf5d2 in calloc (/usr/lib/frr/zebra+0x4bf5d2)
>     #1 0x7fa1b42dee18 in qcalloc git/frr/lib/memory.c:105:27
>     #2 0x7fa1b42a3784 in list_new git/frr/lib/linklist.c:18:9
>     #3 0x6d165f in pbr_iptable_alloc_intern git/frr/zebra/zebra_pbr.c:1015:29
>     #4 0x7fa1b426ad1f in hash_get git/frr/lib/hash.c:147:13
>     #5 0x6d15f2 in zebra_pbr_add_iptable git/frr/zebra/zebra_pbr.c:1030:13
>     #6 0x5db2a3 in zread_iptable git/frr/zebra/zapi_msg.c:3759:3
>     #7 0x5e365d in zserv_handle_commands git/frr/zebra/zapi_msg.c:4039:3
>     #8 0x7e09fc in zserv_process_messages git/frr/zebra/zserv.c:520:3
>     #9 0x7fa1b4434fb2 in event_call git/frr/lib/event.c:1970:2
>     #10 0x7fa1b42a0ccf in frr_run git/frr/lib/libfrr.c:1213:3
>     #11 0x556808 in main git/frr/zebra/main.c:488:2
>     #12 0x7fa1b3d0bd09 in __libc_start_main csu/../csu/libc-start.c:308:16

Fixes: 1cc380679e ("zebra: Actually free all memory associated ctx->u.iptable.interface_name_list")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
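The trace shows a list allocated for one owner being freed through a dplane context. One general way to avoid that class of bug is to give each context its own deep copy, so that freeing the context never touches the original. The sketch below illustrates that idea only; it is an assumption about the shape of a fix, not the change this commit made, and `struct name_list`, `name_list_dup()`, and `name_list_free()` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list of interface names. */
struct name_list {
	char **names;
	size_t count;
};

/* Deep copy: the result shares no memory with 'src'. */
static struct name_list *name_list_dup(const struct name_list *src)
{
	struct name_list *dst = calloc(1, sizeof(*dst));
	size_t i;

	if (dst == NULL)
		abort();
	if (src->count > 0) {
		dst->names = calloc(src->count, sizeof(*dst->names));
		if (dst->names == NULL)
			abort();
	}
	for (i = 0; i < src->count; i++) {
		size_t n = strlen(src->names[i]) + 1;

		dst->names[i] = malloc(n);
		if (dst->names[i] == NULL)
			abort();
		memcpy(dst->names[i], src->names[i], n);
	}
	dst->count = src->count;
	return dst;
}

/* Free a copy made by name_list_dup() and clear the caller's pointer. */
static void name_list_free(struct name_list **pl)
{
	size_t i;

	if (*pl == NULL)
		return;
	for (i = 0; i < (*pl)->count; i++)
		free((*pl)->names[i]);
	free((*pl)->names);
	free(*pl);
	*pl = NULL;
}
```

With this ownership rule, the context frees only its copy, and the original list remains valid for its owner.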
2023-11-21  zebra: On shutdown, ensure dg_update_list is emptied  Donald Sharp
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-11-21  zebra: Cleanup dplane provider owned ctx's on shutdown  Donald Sharp
On shutdown go through and ensure that any contexts the dplane provider holds are freed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-11-21  zebra: On shutdown, cleanup dplane providers  Donald Sharp
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-11-21  *: Let's use the native IFNAMSIZ instead of INTERFACE_NAMSIZ  Donald Sharp
INTERFACE_NAMSIZ is just a redefine of IFNAMSIZ and IFNAMSIZ is the standard for interface name length on all platforms that FRR currently compiles on.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-10-05  lib,*: add vrf id to pbr rule results zapi message  Mark Stapp
The iprule/pbr rule object has a vrf id, and zebra uses that internally, but the vrf id isn't returned to clients who install rules and are waiting for results. Include the vrf_id sent by the client in the zapi result notification message; update the existing clients so they decode the id.
Signed-off-by: Mark Stapp <mjs@labn.net>
2023-09-01  lib,zebra: add tx queuelen to interface struct  Mark Stapp
Add the txqlen attribute to the common interface struct. Capture the value in zebra, and distribute it through the interface lib module's zapi messaging.
Signed-off-by: Mark Stapp <mjs@labn.net>
2023-08-17  zebra: Fix crashes in interface change  Donald Sharp
Some crashes were found during internal testing. This fixes several crashes and normalizes the code so that its execution is closer before and after the changes to use the data plane.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-08-07  zebra: zebra_dplane.[ch]: use pbr common struct in ctx  G. Paul Ziemba
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
2023-07-19  pbrd: add vlan filters pcp/vlan-id/vlan-flags; ip-protocol any (zebra dplane)  G. Paul Ziemba
Subset: zebra dataplane

Add new vlan filter fields. No kernel dataplane implementation yet (linux does not support).

Changes by:
    Josh Werner <joshuawerner@mitre.org>
    Eli Baum <ebaum@mitre.org>
    G. Paul Ziemba <paulz@labn.net>

Signed-off-by: G. Paul Ziemba <paulz@labn.net>
2023-07-07  zebra: Abstract `dplane_ctx_route_init` to init route without copying  Carmine Scarpitta
The function `dplane_ctx_route_init` initializes a dplane route context from the route object passed as an argument. Let's abstract this function to allow initializing the dplane route context without actually copying a route object. This allows us to use this function for initializing a dplane route context when we don't have any route to copy in it.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
2023-07-05  zebra: Add code to get/set interface to pass up from dplane  Donald Sharp
1) Add a bunch of get/set functions and associated data structure in zebra_dplane to allow the setting and retrieval of interface netlink data up into the master pthread.
2) Add a bit of code to break up startup into stages. This is because FRR currently has a mix of dplane and non dplane interactions and the code needs to be paused before continuing on.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-07-05  zebra: Remove unused dplane_intf_delete  Donald Sharp
There is no need for this functionality and it is not used.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-06-09  zebra: bugfix dplane priority sorting  G. Paul Ziemba
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
2023-05-22  zebra: Fix paths that have already de-refed ctx  Donald Sharp
In some functions there is no path where the ctx has not already been de-refed. As such there is no need to test for its existence.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-05-12  zebra: Fix dp_out_queued counter to actually reflect real life  Donald Sharp
The prov->dp_out_queued counter was never being decremented when a ctx was pulled off of the list. Let's change it to accurately reflect real life.

Broken:

janelle.pinkbelly.org# show zebra dplane providers detailed
Zebra dataplane providers:
  Kernel (1): in: 330872, q: 0, q_max: 100, out: 330872, q: 330872, q_max: 330872
janelle.pinkbelly.org#

Fixed:

sharpd@janelle:/tmp/topotests$ vtysh -c "show zebra dplane providers detailed"
Zebra dataplane providers:
  Kernel (1): in: 221495, q: 0, q_max: 100, out: 221495, q: 0, q_max: 100
sharpd@janelle:/tmp/topotests$

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
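The bookkeeping at issue can be sketched with a small standalone FIFO. Everything here is hypothetical (`struct out_queue`, `enqueue()`, `dequeue()`, the fixed capacity), and treating `q_max` as a high-water mark is an assumption made for the sketch; the point is only that the "currently queued" counter must be decremented on dequeue, otherwise it tracks the running total as in the broken output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QCAP 8

/* Hypothetical fixed-size FIFO with out/q/q_max style counters. */
struct out_queue {
	int items[QCAP];
	size_t head, len;
	uint64_t out;   /* total items ever enqueued */
	uint64_t q;     /* items currently queued */
	uint64_t q_max; /* largest value q has reached (assumed meaning) */
};

/* Add an item; returns -1 when the queue is full. */
static int enqueue(struct out_queue *oq, int item)
{
	if (oq->len == QCAP)
		return -1;
	oq->items[(oq->head + oq->len) % QCAP] = item;
	oq->len++;
	oq->out++;
	oq->q++;
	if (oq->q > oq->q_max)
		oq->q_max = oq->q;
	return 0;
}

/* Remove the oldest item; returns -1 when the queue is empty. */
static int dequeue(struct out_queue *oq, int *item)
{
	if (oq->len == 0)
		return -1;
	*item = oq->items[oq->head];
	oq->head = (oq->head + 1) % QCAP;
	oq->len--;
	oq->q--; /* without this decrement, q would always equal out */
	return 0;
}
```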
2023-05-05  zebra: dplane_gre_set could return while leaking ctx  Donald Sharp
Prevent this function from leaking the ctx memory. Also properly record that something has gone wrong.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-05-05  zebra: Dplane ctx allocation cannot fail  Donald Sharp
Having tests for memory allocation success makes no sense given what happens when FRR fails to allocate memory.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-04-21  zebra: ctx has to be non NULL at this point  Donald Sharp
Remove the pointer check for ctx. At this point in the function it has to be non null since we deref'ed it. Additionally the alloc function that creates it cannot fail.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-04-12  Merge pull request #13249 from Pdoijode/connected-route-install-fix  Mark Stapp
zebra: Mark connected route as installed after interface flap event
2023-04-10  zebra: Install directly connected route after interface flap  Pooja Jagadeesh Doijode
Issue:
After a vlan flap, zebra was not marking the selected/best route as installed. As a result, when a static route was configured with its nexthop set to a directly connected interface's (vlan) IP, the static route was not being installed in the kernel since its nexthop was unresolved. The nexthop was marked unresolved because zebra failed to mark the best route as installed after the interface flap.

This was happening because, in dplane_route_update_internal(), if the old and new context type and nexthop group id are the same, then zebra doesn't send down a route replace request to the kernel. But the installed (ROUTE_ENTRY_INSTALLED) flag is set when zebra receives a response from the kernel. Since the request to the kernel was being skipped for the route entry, the installed flag was not being set.

Fix:
In dplane_route_update_internal(), if the old and new context type and nexthop group id are the same, then before returning, the installed flag will be set on the route entry if it's not set already.

Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
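The control flow of the fix can be sketched in standalone C. The struct, the flag bit, and `route_update()` are hypothetical simplifications of the logic the message describes, not the zebra code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ENTRY_INSTALLED 0x1u /* hypothetical flag bit */

/* Hypothetical, reduced route entry. */
struct route_entry {
	int type;
	uint32_t nhg_id;
	uint32_t flags;
};

/*
 * Returns true if a replace request would be sent to the kernel.
 * When old and new match, the request is skipped, so no kernel response
 * will arrive to set the installed flag; set it here before returning.
 */
static bool route_update(const struct route_entry *old_re,
			 struct route_entry *new_re)
{
	if (old_re->type == new_re->type &&
	    old_re->nhg_id == new_re->nhg_id) {
		new_re->flags |= ENTRY_INSTALLED;
		return false;
	}
	/* Otherwise the flag is set later, when the kernel responds. */
	return true;
}
```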
2023-04-04  zebra: fix race during shutdown  Mark Stapp
During shutdown, the main pthread stops the dplane pthread before exiting. Don't try to clean up any events scheduled to the dplane pthread at that point - just let the thread exit and clean up.
Signed-off-by: Mark Stapp <mjs@labn.net>
2023-03-24  *: Convert `struct event_master` to `struct event_loop`  Donald Sharp
Let's find a better name for it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24  *: Convert THREAD_XXX macros to EVENT_XXX macros  Donald Sharp
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24  *: Convert struct thread_master to struct event_master and its ilk  Donald Sharp
Convert the `struct thread_master` to `struct event_master` across the code base.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24  *: Convert thread_cancelXXX to event_cancelXXX  Donald Sharp
Modify the code base so that thread_cancel becomes event_cancel.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24  *: Convert thread_add_XXX functions to event_add_XXX  Donald Sharp
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24  *: Rename `struct thread` to `struct event`  Donald Sharp
Effectively a massive search and replace of `struct thread` to `struct event`. Using the term `thread` leads people to think that this event system is a pthread when it is not.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-02-17  Merge pull request #12780 from opensourcerouting/spdx-license-id  Donald Sharp
*: convert to SPDX License identifiers
2023-02-13  zebra: add VNI info to flood entry  Stephen Worley
When we are installing the flood entry for a vtep in SVD, ensure VNI is set on the ctx object so that it gets sent to the kernel and set appropriately with src_vni.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
2023-02-13  zebra: single vxlan device dataplane vni update changes  Sharath Ramamurthy
dplane_mac_info and dplane_neigh_info are modified to be vni aware. dplane_rem_mac_add/del and dplane_mac_init are modified to be vni aware. During dplane context update (mac and neigh), we use the vni information and, if it is set, the corresponding netlink attribute NDA_SRC_VNI is set and passed to the dplane.
Signed-off-by: Sharath Ramamurthy <sramamurthy@nvidia.com>
2023-02-09  *: auto-convert to SPDX License IDs  David Lamparter
Done with a combination of regex'ing and banging my head against a wall.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2023-01-25  zebra: fix SA warning, don't lock plugin list  Mark Stapp
Locking around the list of providers/plugins is not helpful - these only change at init time. Clear some SA warnings by removing the locking.
Signed-off-by: Mark Stapp <mjs@labn.net>
2023-01-23  zebra: use typesafe lib lists in zebra dplane  Mark Stapp
Replace some of the old queue/DLIST macros with typesafe dlists.
Signed-off-by: Mark Stapp <mjs@labn.net>
2023-01-11  zebra: cosmetic changes for debug  anlan_cs
Just remove redundant white spaces in debug information.

Before:
```
2023/01/11 05:04:48 ZEBRA: [W8V7C-6W4DS] init neigh ctx NEIGH_INSTALL: ifp vlan100, mac 9a:68:e9:73:74:88, ip 88.88.88.88
2023/01/11 05:04:48 ZEBRA: [NH6N7-54CD1] Tx RTM_NEWNEIGH family ipv4 IF vlan100(8) Neigh 88.88.88.88 MAC 9a:68:e9:73:74:88 flags 0x10 state 0x40 ext_flags 0x0
```

After:
```
2023/01/11 05:17:26 ZEBRA: [W8V7C-6W4DS] init neigh ctx NEIGH_INSTALL: ifp vlan100, mac 9a:68:e9:73:74:88, ip 88.88.88.88
2023/01/11 05:17:26 ZEBRA: [NH6N7-54CD1] Tx RTM_NEWNEIGH family ipv4 IF vlan100(8) Neigh 88.88.88.88 MAC 9a:68:e9:73:74:88 flags 0x10 state 0x40 ext_flags 0x0
```

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-12-13  zebra: Read from the dplane_fpm_nl a route update  Donald Sharp
Read from the fpm dplane a route update that will include status about whether or not the asic was successful in offloading the route. Have this data passed up to zebra for processing and disseminate this data as appropriate.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-12-12  zebra: Add ctx to netlink message parsing  Donald Sharp
Add the initial step of passing in a dplane context to reading route netlink messages. This code will be run in two contexts:

a) The normal pthread for reading netlink messages from the kernel
b) The dplane_fpm_nl pthread.

The goal of this commit is just to allow a) to work; b) will be filled in in the future. Effectively everything should still be working as it did before this change. We will just possibly allow the context to be passed around (but not used).

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-12-12  zebra: Rearrange dplane_ctx_route_init  Donald Sharp
In order for a future commit to abstract the dplane_ctx_route_init so that the kernel can use it, let's move some stuff around and add a dplane_ctx_route_init_basic that can be used by multiple different paths

Signed-off-by: Donald Sharp <sharpd@nvidia.com>

create a dplane_ctx_route_init_basic so it can be used

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-12-12  zebra: Add dplane_ctx_get|set_flags  Donald Sharp
Zebra needs the ability to pass this data around. Add it to the dplanes ability to pass.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>

zebra: Add a dplane_ctx_set_flags

The dplane_ctx_set_flags call is missing, we will need it. Add it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-12-12  zebra: Remove goto's that do not do anything special  Donald Sharp
If we have these semantics:

int ret = FAILURE;

if (foo)
        goto done;
....
done:
        return ret;

This pattern does us no favors and makes it harder to figure out what is going on. Let's remove it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
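A minimal before/after sketch of this cleanup, with hypothetical function names and return codes, shows that the two forms behave identically:

```c
#include <assert.h>

enum { FAILURE = -1, SUCCESS = 0 };

/* Before: the label does nothing except return. */
static int check_with_goto(int foo)
{
	int ret = FAILURE;

	if (foo)
		goto done;
	ret = SUCCESS;
done:
	return ret;
}

/* After: return directly; the result for every input is unchanged. */
static int check_direct(int foo)
{
	if (foo)
		return FAILURE;
	return SUCCESS;
}
```

A `goto` to a shared label still earns its place when the label releases resources; the pattern removed here is the one where the label only returns.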
2022-12-09  zebra: Actually free all memory associated ctx->u.iptable.interface_name_list  Donald Sharp
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-22  zebra: traffic control state management  Siger Yang
This allows Zebra to manage QDISC, TCLASS, TFILTER in kernel and do cleaning jobs when it starts up.
Signed-off-by: Siger Yang <siger.yang@outlook.com>
2022-08-19  Merge pull request #11832 from sigeryang/master  Quentin Young
zebra: trim unused tc dplane result values