| Age | Commit message | Author |
|
When removing a large number of routes, the Linux kernel can hold the
CPU for an extended amount of time, which leads FRR to detect a
starvation event.
r1# sharp install routes 10.0.0.0 nexthop 192.168.44.33 1000000 repeat 10
2024-06-14 12:55:49.365 [NTFY] sharpd: [M7Q4P-46WDR] vty[5]@# sharp install routes 10.0.0.0 nexthop 192.168.44.33 1000000 repeat 10
2024-06-14 12:55:49.365 [DEBG] sharpd: [YP4TQ-01TYK] Inserting 1000000 routes
2024-06-14 12:55:57.256 [DEBG] sharpd: [TPHKD-3NYSB] Installed All Items 7.890085
2024-06-14 12:55:57.256 [DEBG] sharpd: [YJ486-NX5R1] Removing 1000000 routes
2024-06-14 12:56:07.802 [WARN] zebra: [QH9AB-Y4XMZ][EC 100663314] STARVATION: task dplane_thread_loop (634377bc8f9e) ran for 7078ms (cpu time 220ms)
2024-06-14 12:56:25.039 [DEBG] sharpd: [WTN53-GK9Y5] Removed all Items 27.783668
2024-06-14 12:56:25.039 [DEBG] sharpd: [YP4TQ-01TYK] Inserting 1000000 routes
2024-06-14 12:56:32.783 [DEBG] sharpd: [TPHKD-3NYSB] Installed All Items 7.743524
2024-06-14 12:56:32.783 [DEBG] sharpd: [YJ486-NX5R1] Removing 1000000 routes
2024-06-14 12:56:41.447 [WARN] zebra: [QH9AB-Y4XMZ][EC 100663314] STARVATION: task dplane_thread_loop (634377bc8f9e) ran for 5175ms (cpu time 179ms)
Let's modify the loop in dplane_thread_loop so that, after each provider
has been run, it checks whether the event should yield. If so, stop
and reschedule the event for the future.
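The idea can be illustrated with a minimal standalone sketch. Everything named `demo_*` is hypothetical and is not FRR's code; the yield predicate is passed in by the caller as a stand-in for the event library's check. The sketch assumes the loop resumes at the next provider, which the message above does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dataplane provider; not FRR's type. */
struct demo_provider {
    int pending; /* work items still queued for this provider */
    int per_run; /* maximum items processed per invocation */
};

/* Caller-supplied predicate standing in for the "should this task
 * yield?" check of an event library. */
typedef bool (*demo_should_yield_fn)(void *arg);

/*
 * Run providers in order, starting at 'start'. After each provider has
 * run, ask whether to yield. Returns the index to resume from on the
 * next scheduled run, or 'count' when every provider has run.
 */
static size_t demo_run_providers(struct demo_provider *provs, size_t count,
                                 size_t start,
                                 demo_should_yield_fn should_yield, void *arg)
{
    size_t i;

    for (i = start; i < count; i++) {
        int n = provs[i].pending < provs[i].per_run ? provs[i].pending
                                                    : provs[i].per_run;

        provs[i].pending -= n;

        /* Check only after a provider has completed its run. */
        if (should_yield(arg))
            return i + 1;
    }
    return count;
}

/* Example predicate: yield once a simple call budget is used up. */
static bool demo_budget_exhausted(void *arg)
{
    int *budget = arg;

    return --(*budget) <= 0;
}
```

A caller would store the returned index and schedule another run whenever it is smaller than the provider count.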
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 6faad863f30d29157e4c675ad956e3ccd38991a7)
|
|
Add a version value and accessor API for the zebra dataplane;
plugins can test this to detect API changes.
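A plugin-side check might look roughly like the sketch below. The macro and function names are hypothetical, since the message does not give the real ones, and the policy of requiring an exact match is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the real macro and accessor are not given above. */
#define DEMO_DPLANE_VERSION 2

/* Accessor exposed by the (pretend) dataplane core. */
static uint32_t demo_dplane_get_version(void)
{
    return DEMO_DPLANE_VERSION;
}

/* API version the (pretend) plugin was written against. */
#define DEMO_PLUGIN_BUILT_FOR 2

/* Plugin start-up check: accept only the version the plugin was built
 * for. Requiring an exact match is an assumption of this sketch. */
static bool demo_plugin_version_ok(uint32_t runtime_version)
{
    return runtime_version == DEMO_PLUGIN_BUILT_FOR;
}
```

A plugin would call `demo_plugin_version_ok(demo_dplane_get_version())` before registering and bail out on a mismatch.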
Signed-off-by: Mark Stapp <mjs@cisco.com>
|
|
These changes improve code maintainability and readability.
Signed-off-by: sri-mohan1 <sri.mohan@samsung.com>
|
|
Fix the following coverity issue:
*** CID 1575079: Null pointer dereferences (REVERSE_INULL)
/zebra/zebra_dplane.c: 5950 in dplane_srv6_encap_srcaddr_set()
5944 if (ret == AOK)
5945 result = ZEBRA_DPLANE_REQUEST_QUEUED;
5946 else {
5947 atomic_fetch_add_explicit(&zdplane_info
5948 .dg_srv6_encap_srcaddr_set_errors,
5949 1, memory_order_relaxed);
CID 1575079: Null pointer dereferences (REVERSE_INULL)
Null-checking "ctx" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
5950 if (ctx)
5951 dplane_ctx_free(&ctx);
5952 }
5953 return result;
5954 }
5955
Remove the pointer check for `ctx`. At this point in the
function it has to be non null since we deref'ed it.
Additionally the alloc function that creates it cannot
fail.
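As a rough illustration of the shape of the fix, here is a standalone sketch with hypothetical `demo_*` names (not the FRR functions). It assumes an allocator that aborts instead of returning NULL, so the error path can free the context unconditionally.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_ctx {
    int op;
};

/* Stand-in for an allocator that aborts rather than returning NULL. */
static struct demo_ctx *demo_ctx_alloc(void)
{
    struct demo_ctx *ctx = calloc(1, sizeof(*ctx));

    if (!ctx)
        abort();
    return ctx;
}

static void demo_ctx_free(struct demo_ctx **pctx)
{
    free(*pctx);
    *pctx = NULL;
}

/* Returns 0 when the request is "queued", -1 on failure. */
static int demo_request(int enqueue_ok)
{
    struct demo_ctx *ctx = demo_ctx_alloc();

    ctx->op = 1; /* dereferenced here, so ctx is known to be non-NULL */

    if (enqueue_ok) {
        /* A real queue would take ownership; freed here for the demo. */
        demo_ctx_free(&ctx);
        return 0;
    }

    /* Error path: no "if (ctx)" test, the pointer cannot be NULL. */
    demo_ctx_free(&ctx);
    return -1;
}
```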
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
|
|
Add a bunch of set functions and associated data structure in
zebra_dplane to allow the configuration of the source address for SRv6
encap in the data plane.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
|
|
The linux kernel can send up a flag that tells us that the
connected address is not a PREFIXROUTE. Add the ability
to note this and pass it up from the data plane.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
Zebra final shutdown finally
|
|
Fix a crash caused by a use-after-free.
> =================================================================
> ==1249835==ERROR: AddressSanitizer: heap-use-after-free on address 0x604000074210 at pc 0x7fa1b42a652c bp 0x7ffc477a2aa0 sp 0x7ffc477a2a98
> READ of size 8 at 0x604000074210 thread T0
> #0 0x7fa1b42a652b in list_delete_all_node git/frr/lib/linklist.c:299:20
> #1 0x7fa1b42a683f in list_delete git/frr/lib/linklist.c:312:2
> #2 0x5ee515 in dplane_ctx_free_internal git/frr/zebra/zebra_dplane.c:858:4
> #3 0x5ee59c in dplane_ctx_free git/frr/zebra/zebra_dplane.c:884:2
> #4 0x5ee544 in dplane_ctx_fini git/frr/zebra/zebra_dplane.c:905:2
> #5 0x7045c0 in rib_process_dplane_results git/frr/zebra/zebra_rib.c:4928:4
> #6 0x7fa1b4434fb2 in event_call git/frr/lib/event.c:1970:2
> #7 0x7fa1b42a0ccf in frr_run git/frr/lib/libfrr.c:1213:3
> #8 0x556808 in main git/frr/zebra/main.c:488:2
> #9 0x7fa1b3d0bd09 in __libc_start_main csu/../csu/libc-start.c:308:16
> #10 0x4453e9 in _start (/usr/lib/frr/zebra+0x4453e9)
>
> 0x604000074210 is located 0 bytes inside of 40-byte region [0x604000074210,0x604000074238)
> freed by thread T0 here:
> #0 0x4bf1dd in free (/usr/lib/frr/zebra+0x4bf1dd)
> #1 0x7fa1b42df0c0 in qfree git/frr/lib/memory.c:130:2
> #2 0x7fa1b42a68ce in list_free_internal git/frr/lib/linklist.c:24:2
> #3 0x7fa1b42a6870 in list_delete git/frr/lib/linklist.c:313:2
> #4 0x5ee515 in dplane_ctx_free_internal git/frr/zebra/zebra_dplane.c:858:4
> #5 0x5ee59c in dplane_ctx_free git/frr/zebra/zebra_dplane.c:884:2
> #6 0x5ee544 in dplane_ctx_fini git/frr/zebra/zebra_dplane.c:905:2
> #7 0x7045c0 in rib_process_dplane_results git/frr/zebra/zebra_rib.c:4928:4
> #8 0x7fa1b4434fb2 in event_call git/frr/lib/event.c:1970:2
> #9 0x7fa1b42a0ccf in frr_run git/frr/lib/libfrr.c:1213:3
> #10 0x556808 in main git/frr/zebra/main.c:488:2
> #11 0x7fa1b3d0bd09 in __libc_start_main csu/../csu/libc-start.c:308:16
>
> previously allocated by thread T0 here:
> #0 0x4bf5d2 in calloc (/usr/lib/frr/zebra+0x4bf5d2)
> #1 0x7fa1b42dee18 in qcalloc git/frr/lib/memory.c:105:27
> #2 0x7fa1b42a3784 in list_new git/frr/lib/linklist.c:18:9
> #3 0x6d165f in pbr_iptable_alloc_intern git/frr/zebra/zebra_pbr.c:1015:29
> #4 0x7fa1b426ad1f in hash_get git/frr/lib/hash.c:147:13
> #5 0x6d15f2 in zebra_pbr_add_iptable git/frr/zebra/zebra_pbr.c:1030:13
> #6 0x5db2a3 in zread_iptable git/frr/zebra/zapi_msg.c:3759:3
> #7 0x5e365d in zserv_handle_commands git/frr/zebra/zapi_msg.c:4039:3
> #8 0x7e09fc in zserv_process_messages git/frr/zebra/zserv.c:520:3
> #9 0x7fa1b4434fb2 in event_call git/frr/lib/event.c:1970:2
> #10 0x7fa1b42a0ccf in frr_run git/frr/lib/libfrr.c:1213:3
> #11 0x556808 in main git/frr/zebra/main.c:488:2
> #12 0x7fa1b3d0bd09 in __libc_start_main csu/../csu/libc-start.c:308:16
Fixes: 1cc380679e ("zebra: Actually free all memory associated ctx->u.iptable.interface_name_list")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
|
|
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
On shutdown go through and ensure that any contexts the
dplane provider holds are freed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
INTERFACE_NAMSIZ is just a redefine of IFNAMSIZ and IFNAMSIZ
is the standard for interface name length on all platforms
that FRR currently compiles on.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
The iprule/pbr rule object has a vrf id, and zebra uses
that internally, but the vrf id isn't returned to clients
who install rules and are waiting for results. Include the
vrf_id sent by the client in the zapi result notification
message; update the existing clients so they decode the id.
Signed-off-by: Mark Stapp <mjs@labn.net>
|
|
Add the txqlen attribute to the common interface struct. Capture
the value in zebra, and distribute it through the interface lib
module's zapi messaging.
Signed-off-by: Mark Stapp <mjs@labn.net>
|
|
Internal testing turned up several crashes. This fixes
them and normalizes the code so that its execution is closer
before and after the changes to use the data plane.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
|
|
Subset: zebra dataplane
Add new vlan filter fields. No kernel dataplane
implementation yet (linux does not support).
Changes by:
Josh Werner <joshuawerner@mitre.org>
Eli Baum <ebaum@mitre.org>
G. Paul Ziemba <paulz@labn.net>
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
|
|
The function `dplane_ctx_route_init` initializes a dplane route context
from the route object passed as an argument. Let's abstract this
function to allow initializing the dplane route context without actually
copying a route object.
This allows us to use this function for initializing a dplane route
context when we don't have any route to copy in it.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
|
|
1) Add a bunch of get/set functions and associated data
structure in zebra_dplane to allow the setting and retrieval
of interface netlink data up into the master pthread.
2) Add a bit of code to break up startup into stages. This is
because FRR currently has a mix of dplane and non dplane interactions
and the code needs to be paused before continuing on.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
There is no need for this functionality and it is
not used.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
|
|
In some functions there is no path where the ctx
has not already been dereferenced. As such there is no need
to test for its existence.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
The prov->dp_out_queued counter was never being decremented
when a ctx was pulled off of the list. Let's change it to
accurately reflect real life.
Broken:
janelle.pinkbelly.org# show zebra dplane providers detailed
Zebra dataplane providers:
Kernel (1): in: 330872, q: 0, q_max: 100, out: 330872, q: 330872, q_max: 330872
janelle.pinkbelly.org#
Fixed:
sharpd@janelle:/tmp/topotests$ vtysh -c "show zebra dplane providers detailed"
Zebra dataplane providers:
Kernel (1): in: 221495, q: 0, q_max: 100, out: 221495, q: 0, q_max: 100
sharpd@janelle:/tmp/topotests$
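The accounting can be sketched with a small standalone queue. The struct and function names are hypothetical; the fields loosely mirror the `out`, `q`, and `q_max` columns shown above, and a fixed-size ring buffer stands in for the real list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_QUEUE_CAP 8

/* Hypothetical provider out-queue; not FRR's data structure. */
struct demo_out_queue {
    int items[DEMO_QUEUE_CAP];
    size_t head, len;
    uint32_t out_total;      /* contexts ever enqueued ("out") */
    uint32_t out_queued;     /* contexts currently queued ("q") */
    uint32_t out_queued_max; /* high-water mark ("q_max") */
};

static int demo_enqueue(struct demo_out_queue *q, int item)
{
    if (q->len == DEMO_QUEUE_CAP)
        return -1;
    q->items[(q->head + q->len) % DEMO_QUEUE_CAP] = item;
    q->len++;
    q->out_total++;
    q->out_queued++;
    if (q->out_queued > q->out_queued_max)
        q->out_queued_max = q->out_queued;
    return 0;
}

static int demo_dequeue(struct demo_out_queue *q, int *item)
{
    if (q->len == 0)
        return -1;
    *item = q->items[q->head];
    q->head = (q->head + 1) % DEMO_QUEUE_CAP;
    q->len--;
    q->out_queued--; /* the decrement a broken version would omit */
    return 0;
}
```

Without the decrement in `demo_dequeue`, `out_queued` and `out_queued_max` would simply track `out_total`, which matches the symptom in the broken output.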
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
Prevent this function from leaking the ctx memory.
Also properly record that something has gone wrong.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
Having tests for memory allocation success makes no sense
given what happens when frr fails to allocate memory.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
Remove the pointer check for ctx. At this point in the
function it has to be non null since we deref'ed it.
Additionally the alloc function that creates it cannot
fail.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
zebra: Mark connected route as installed after interface flap event
|
|
Issue:
After vlan flap, zebra was not marking the selected/best route as installed.
As a result, when a static route was configured with nexthop as directly
connected interface's(vlan) IP, the static route was not being installed
in the kernel since its nexthop was unresolved. The nexthop was marked
unresolved because zebra failed to mark the best route as installed after
interface flap.
This was happening because, in dplane_route_update_internal() if the old and
new context type, and nexthop group id are the same, then zebra doesn't send
down a route replace request to kernel. But, the installed (ROUTE_ENTRY_INSTALLED)
flag is set when zebra receives a response from kernel. Since the
request to kernel was being skipped for the route entry, the installed flag
was not being set.
Fix:
In dplane_route_update_internal() if the old and new context type, and
nexthop group id are the same, then before returning, installed flag will
be set on the route-entry if it's not set already.
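The decision described in the fix can be sketched as follows. The types and names are hypothetical simplifications and not the FRR structures; `DEMO_ENTRY_INSTALLED` stands in for ROUTE_ENTRY_INSTALLED.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ENTRY_INSTALLED 0x1u /* stand-in for ROUTE_ENTRY_INSTALLED */

/* Hypothetical, heavily simplified route entry. */
struct demo_route_entry {
    int type;
    uint32_t nhg_id;
    uint32_t status;
};

/*
 * Returns true when a replace request would be sent to the kernel.
 * When old and new entries match on type and nexthop group id, the
 * request is skipped, so no kernel response will ever set the flag;
 * set it here instead.
 */
static bool demo_route_update(const struct demo_route_entry *old_re,
                              struct demo_route_entry *new_re)
{
    if (old_re->type == new_re->type && old_re->nhg_id == new_re->nhg_id) {
        new_re->status |= DEMO_ENTRY_INSTALLED;
        return false;
    }

    /* Otherwise the flag is set later, when the kernel responds. */
    return true;
}
```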
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
|
|
During shutdown, the main pthread stops the dplane pthread
before exiting. Don't try to clean up any events scheduled
to the dplane pthread at that point - just let the thread
exit and clean up.
Signed-off-by: Mark Stapp <mjs@labn.net>
|
|
Let's find a better name for it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
Convert the `struct thread_master` to `struct event_master`
across the code base.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
Modify the code base so that thread_cancel becomes event_cancel
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` suggests to people that
this event system is a pthread when it is not.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
*: convert to SPDX License identifiers
|
|
When we are installing the flood entry for a vtep in SVD,
ensure VNI is set on the ctx object so that it gets
sent to the kernel and set appropriately with src_vni.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
|
|
dplane_mac_info and dplane_neigh_info are modified to be vni aware.
dplane_rem_mac_add/del and dplane_mac_init are modified to be vni aware.
During a dplane context update (mac and neigh), we use the vni information
and, if set, the corresponding netlink attribute NDA_SRC_VNI is set and passed to the
dplane.
Signed-off-by: Sharath Ramamurthy <sramamurthy@nvidia.com>
|
|
Done with a combination of regex'ing and banging my head against a wall.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
|
|
Locking around the list of providers/plugins is not
helpful - these only change at init time. Clear some SA
warnings by removing the locking.
Signed-off-by: Mark Stapp <mjs@labn.net>
|
|
Replace some of the old queue/DLIST macros with typesafe
dlists.
Signed-off-by: Mark Stapp <mjs@labn.net>
|
|
Just remove redundant white spaces in debug information.
Before:
```
2023/01/11 05:04:48 ZEBRA: [W8V7C-6W4DS] init neigh ctx NEIGH_INSTALL: ifp vlan100, mac 9a:68:e9:73:74:88, ip 88.88.88.88
2023/01/11 05:04:48 ZEBRA: [NH6N7-54CD1] Tx RTM_NEWNEIGH family ipv4 IF vlan100(8) Neigh 88.88.88.88 MAC 9a:68:e9:73:74:88 flags 0x10 state 0x40 ext_flags 0x0
```
After:
```
2023/01/11 05:17:26 ZEBRA: [W8V7C-6W4DS] init neigh ctx NEIGH_INSTALL: ifp vlan100, mac 9a:68:e9:73:74:88, ip 88.88.88.88
2023/01/11 05:17:26 ZEBRA: [NH6N7-54CD1] Tx RTM_NEWNEIGH family ipv4 IF vlan100(8) Neigh 88.88.88.88 MAC 9a:68:e9:73:74:88 flags 0x10 state 0x40 ext_flags 0x0
```
Signed-off-by: anlan_cs <vic.lan@pica8.com>
|
|
Read from the fpm dplane a route update that will
include status about whether or not the asic was
successful in offloading the route.
Have this data passed up to zebra for processing and disseminate
this data as appropriate.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
Add the initial step of passing in a dplane context
to reading route netlink messages. This code
will be run in two contexts:
a) The normal pthread for reading netlink messages from
the kernel
b) The dplane_fpm_nl pthread.
The goal of this commit is just to allow a) to work;
b) will be filled in in the future. Effectively
everything should still work as it did
before this change. We just allow
the context to be passed around (but not used).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
In order for a future commit to abstract the dplane_ctx_route_init
so that the kernel can use it, let's move some stuff around
and add a dplane_ctx_route_init_basic that can be used by multiple
different paths
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
create a dplane_ctx_route_init_basic so it can be used
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
Zebra needs the ability to pass this data around.
Add it to the dplane's ability to pass.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
zebra: Add a dplane_ctx_set_flags
The dplane_ctx_set_flags call is missing, we will need it. Add it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
If we have these semantics:
    int ret = FAILURE;
    if (foo)
        goto done;
    ....
    done:
        return ret;
This pattern does us no favors and makes it harder to figure out what is going
on. Let's remove it.
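For comparison, a version of the same shape without the label might read as below. The names are hypothetical and the sketch only covers the case where the `done:` label does no cleanup.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_result { DEMO_FAILURE = -1, DEMO_SUCCESS = 0 };

/* Return directly at the point of failure instead of jumping to a
 * label whose only job is to return a preset value. */
static enum demo_result demo_do_work(bool precondition_failed)
{
    if (precondition_failed)
        return DEMO_FAILURE;

    /* ... the actual work would go here ... */

    return DEMO_SUCCESS;
}
```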
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
This allows Zebra to manage QDISC, TCLASS, TFILTER in kernel and do cleaning
jobs when it starts up.
Signed-off-by: Siger Yang <siger.yang@outlook.com>
|
|
zebra: trim unused tc dplane result values
|