path: root/zebra/zebra_rib.c
Age | Commit message | Author
2023-07-05 | Merge pull request #13875 from donaldsharp/static_dplane_issues | Mark Stapp
zebra: Static routes async notification do not need this test
2023-06-29 | zebra: Dump route details when deleting a route | Donatas Abraitis
Log more details about what is going on when a route is deleted.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-06-29 | zebra: Static routes async notification do not need this test | Donald Sharp
When using asic_offload with an asynchronous notification, the rib_route_match_ctx function tests that distance and tag match the re. Normal route notification for static routes (well, really all routes) is this:
a) zebra dplane generates a ctx to send to the dplane for route install
b) dplane installs it in the kernel
c) if the dplane_fpm_nl.c module is being used, it installs it.
d) The context's success code is set to indicate that it worked, and the context is passed back up to zebra for processing.
e) Zebra master receives this, checks that the distance and tag are correct for static routes, accepts the route and marks it installed.
If the operator is using a wait-for-install mechanism where the dplane asynchronously sends the result back up at a future time *and* it is using the dplane_fpm_nl.c code, which uses the rt_netlink.c route parsing code, then there is no way to set distance, since we do not pass distance to the kernel. As such, static routes were never being properly handled: the re and context would not match and the route would still be marked as queued. Modify the code such that the asynchronous path notification for static routes ignores the distance and tag, as there is no way to test for this data from that path at this point in time.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
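The relaxed match described in the commit above can be illustrated with a minimal sketch. It is not the real rib_route_match_ctx: the types, the `is_async` parameter and every `sketch_` name are hypothetical stand-ins chosen for illustration, and the actual zebra structures carry far more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real zebra structures. */
enum sketch_route_type { SKETCH_ROUTE_KERNEL, SKETCH_ROUTE_STATIC };

struct sketch_route {
	enum sketch_route_type type;
	uint8_t distance;
	uint32_t tag;
};

/*
 * Return true when the dataplane result `ctx` refers to route entry `re`.
 * `is_async` marks a result delivered later by an asynchronous notification.
 */
bool sketch_route_match(const struct sketch_route *re,
			const struct sketch_route *ctx, bool is_async)
{
	assert(re != NULL && ctx != NULL);

	if (re->type != ctx->type)
		return false;

	/* Assumption: the async path cannot carry distance or tag for
	 * static routes, so those fields are not compared there. */
	if (is_async && re->type == SKETCH_ROUTE_STATIC)
		return true;

	return re->distance == ctx->distance && re->tag == ctx->tag;
}
```

With a static `re` of distance 1 and tag 7 and a `ctx` of distance 0 and tag 0, the sketch matches on the asynchronous path and rejects on the synchronous one, which is the difference the commit message describes.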
2023-05-05 | zebra: Reduce creation and fix memory leak of frrscripting pointers | Donald Sharp
There are two issues being addressed:
a) The ZEBRA_ON_RIB_PROCESS_HOOK_CALL script point was creating a fs pointer per dplane ctx in rib_process_dplane_results().
b) The fs pointer was not being deleted and directly leaked.
For (a), move the creation of the fs to outside the do while loop. For (b), at function end ensure that the pointer is actually deleted.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
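The shape of that fix (allocate once before the loop, release once at function end) might look like the sketch below. The `sketch_script` type and its helpers are hypothetical and do not use the frrscript API; a plain `for` loop stands in for the do while loop that the commit mentions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a scripting handle; not the frrscript API. */
struct sketch_script {
	unsigned int calls;
};

/* Number of handles allocated and not yet deleted (leak indicator). */
int sketch_live_handles;

struct sketch_script *sketch_script_new(void)
{
	struct sketch_script *fs = calloc(1, sizeof(*fs));

	if (fs != NULL)
		sketch_live_handles++;
	return fs;
}

void sketch_script_delete(struct sketch_script *fs)
{
	if (fs == NULL)
		return;
	sketch_live_handles--;
	free(fs);
}

/* Handle `nresults` results with ONE handle; return how many were handled,
 * or -1 if the handle could not be allocated. */
int sketch_process_results(int nresults)
{
	struct sketch_script *fs;
	int handled;

	assert(nresults >= 0);

	/* (a) create the handle once, outside the loop */
	fs = sketch_script_new();
	if (fs == NULL)
		return -1;

	for (int i = 0; i < nresults; i++)
		fs->calls++; /* stands in for calling the script hook */

	handled = (int)fs->calls;

	/* (b) delete the handle at function end so nothing leaks */
	sketch_script_delete(fs);
	return handled;
}
```

After `sketch_process_results(3)` returns, `sketch_live_handles` is back to zero, which is the property the leak fix is after.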
2023-04-20 | zebra: Fix connected route deletion when multiple entry exists | Xiao Liang
When multiple interfaces have addresses in the same network, deleting one of them may cause the wrong connected route to be deleted. For example:
    ip link add veth1 type veth peer veth2
    ip link set veth1 up
    ip link set veth2 up
    ip addr add dev veth1 192.168.0.1/24
    ip addr add dev veth2 192.168.0.2/24
    ip addr flush dev veth1
Zebra deletes the route of interface veth2 rather than veth1. Should match nexthop against ere->re_nhe instead of ere->re->nhe.
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
2023-04-12 | zebra: Actually free up memory associated with the mq list | Donald Sharp
Free up the linked list data structures as well as properly account for data sizes.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-04-04 | Merge pull request #13145 from donaldsharp/do_delete | Jafar Al-Gharaibeh
Improve and fix zebra GR
2023-03-31 | zebra: Cleanup ctx leak on shutdown and turn off event | Donald Sharp
Two things: on shutdown, clean up any events associated with the update walker. Also do not allow new events to be created. Fixes this mem-leak:
    ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790:Direct leak of 8 byte(s) in 1 object(s) allocated from:
    ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #0 0x7f0dd0b08037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #1 0x7f0dd06c19f9 in qcalloc lib/memory.c:105
    ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #2 0x55b42fb605bc in rib_update_ctx_init zebra/zebra_rib.c:4383
    ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #3 0x55b42fb6088f in rib_update zebra/zebra_rib.c:4421
    ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #4 0x55b42fa00344 in netlink_link_change zebra/if_netlink.c:2221
    ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #5 0x55b42fa24622 in netlink_information_fetch zebra/kernel_netlink.c:399
    ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #6 0x55b42fa28c02 in netlink_parse_info zebra/kernel_netlink.c:1183
    ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #7 0x55b42fa24951 in kernel_read zebra/kernel_netlink.c:493
    ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #8 0x7f0dd0797f0c in event_call lib/event.c:1995
    ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #9 0x7f0dd0684fd9 in frr_run lib/libfrr.c:1185
    ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #10 0x55b42fa30caa in main zebra/main.c:465
    ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #11 0x7f0dd01b5d09 in __libc_start_main ../csu/libc-start.c:308
    ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-
    ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-SUMMARY: AddressSanitizer: 8 byte(s) leaked in 1 allocation(s).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-29 | zebra: Ensure gr events run after Meta Queue has run | Donald Sharp
BGP signals to zebra that an afi has converged immediately after it has finished processing all routes for a given afi/safi. This generates events in zebra in this order:
a) Routes received from BGP, placed on early-rib Meta-Q
b) Signal GR for the afi.
Now imagine that zebra reads the GR code and immediately processes routes that are in the actual rib and removes some routes. This generates:
c) route deletion to the kernel for some number of routes that may be in the early-rib Meta-Q
d) Process the Meta-Q, and re-install the routes
This is undesirable behavior in zebra: while we may end up in a correct state, there will be a blip for some number of routes that happen to be in the early-rib Meta-Q. Modify the GR code to have its own processing entry at the end of the Meta-Q. This will allow all routes to be processed and ready for handling by the Graceful Restart code.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
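The ordering guarantee described above can be sketched with a toy meta queue in which sub-queues are drained strictly by index and the GR entry sits last. The enum values and function names here are hypothetical; the real zebra meta queue has more sub-queues and holds actual route nodes rather than counters.

```c
#include <assert.h>

/* Hypothetical sub-queue indexes; the real zebra enum has more entries. */
enum sketch_mq {
	SKETCH_MQ_EARLY_ROUTE,
	SKETCH_MQ_BGP,
	SKETCH_MQ_GR_RUN, /* deliberately last, so it runs after all routes */
	SKETCH_MQ_MAX,
};

struct sketch_meta_queue {
	unsigned int pending[SKETCH_MQ_MAX]; /* items waiting per sub-queue */
};

/* Queue one item on sub-queue `idx`. */
void sketch_mq_add(struct sketch_meta_queue *mq, enum sketch_mq idx)
{
	assert(idx < SKETCH_MQ_MAX);
	mq->pending[idx]++;
}

/*
 * Process one item from the lowest-index non-empty sub-queue.
 * Returns the index that was serviced, or SKETCH_MQ_MAX when idle.
 */
enum sketch_mq sketch_mq_process_one(struct sketch_meta_queue *mq)
{
	for (int i = 0; i < SKETCH_MQ_MAX; i++) {
		if (mq->pending[i] > 0) {
			mq->pending[i]--;
			return (enum sketch_mq)i;
		}
	}
	return SKETCH_MQ_MAX;
}
```

Even if the GR signal is queued before a route, the route is serviced first, so the GR handling only ever sees a rib that already reflects everything the protocol sent.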
2023-03-28 | zebra: Use zebra_vrf_lookup_by_id when we can | Donald Sharp
Let's make this as consistent as is possible.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 | *: Convert event.h to frrevent.h | Donald Sharp
We should probably prevent any type of namespace collision with something else.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 | *: Convert THREAD_XXX macros to EVENT_XXX macros | Donald Sharp
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 | *: Convert a bunch of thread_XX to event_XX | Donald Sharp
Convert these functions:
    thread_getrusage
    thread_cmd_init
    thread_consumed_time
    thread_timer_to_hhmmss
    thread_is_scheduled
    thread_ignore_late_timer
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 | *: Convert thread_add_XXX functions to event_add_XXX | Donald Sharp
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 | *: Rename `struct thread` to `struct event` | Donald Sharp
Effectively a massive search and replace of `struct thread` to `struct event`. Using the term `thread` gives people the thought that this event system is a pthread when it is not.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 | *: Rename thread.[ch] to event.[ch] | Donald Sharp
This is the first in a series of commits whose goal is to rename the thread system in FRR to an event system. There is a continual problem where people are confusing `struct thread` with a true pthread. In reality, our entire thread.c is an event system. In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-02-21 | Merge pull request #12798 from donaldsharp/rib_match_multicast | Russ White
Rib match multicast
2023-02-21 | Merge pull request #12818 from imzyxwvu/fix/other-table-inactive | Donald Sharp
zebra: Fix other table inactive when ip import-table is on
2023-02-17 | Merge pull request #12780 from opensourcerouting/spdx-license-id | Donald Sharp
*: convert to SPDX License identifiers
2023-02-16 | zebra: Remove code duplication for v4 and v6 versions of rib_match_multicast | Donald Sharp
a) Consolidate v4 and v6 versions of rib_match_multicast
b) Improve debug to show what we matched against as well.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-02-15 | zebra: Fix other table inactive when ip import-table is on | zyxwvu Shi
In `rib_link`, if is_zebra_import_table_enabled returns true, `rib_queue_add` will not be called, so route nodes in other tables are never processed. This should not depend on whether the route is imported. In `rib_delnode`, if is_zebra_import_table_enabled returns true, it will use `rib_unlink` instead of enqueuing the route node for processing. There is no reason that imported route nodes should not be reprocessed. Long ago, the behaviour depended on whether the route_entry came from a table other than main.
Signed-off-by: zyxwvu Shi <i@shiyc.cn>
2023-02-14 | Merge pull request #12789 from donaldsharp/version_cleanup | David Lamparter
2023-02-13 | lib,zebra,bgpd,staticd: use label code to store VNI info | Stephen Worley
Use the already existing mpls label code to store VNI info for vxlan. VNIs are defined as labels just like mpls, so we should be using the same code for both. This patch is the first part of that. Next we will need to abstract the label code to not be so mpls specific. Currently in this, we are just treating VXLAN as a label type and storing it that way.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
2023-02-10 | lib, zebra: Consolidate ZEBRA_TABLE_MAX_DISTANCE values | Donald Sharp
Currently `ip import-table 33` imports routes with a distance of 15, as defined by zebra.h. zebra_rib.c on the other hand believes the default value for the table is 150. Let's make them agree with each other.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-02-10 | lib, zebra: Use defines for distance | Donald Sharp
Use the defines for distance that are in zebra.h. We could easily have a cluster where we don't agree with ourselves. So let's convert zebra to use the defines in zebra.h.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-02-09 | *: auto-convert to SPDX License IDs | David Lamparter
Done with a combination of regex'ing and banging my head against a wall.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2023-01-31 | zebra: Add missing enums to switch statements | Donald Sharp
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-01-31 | zebra: Send nht resolved entry up to concerned protocols in all cases | Donald Sharp
There existed the idea, from Volta, that a nexthop group would not have the same nexthops installed -vs- what FRR actually sent down. The dplane would notify you. With the addition of 06525c4f99d4dcafdf448565f7e11bd70993697d the code was put behind a bit of a wall that controlled its usage. The ROUTE_ENTRY_USE_FIB_NHG flag was being used to control which set was being sent up to concerned parties in nexthop tracking. Put this flag behind the wall and do not necessarily set it when we receive a data plane notification about a route being installed or not.
Fixes: #12706
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-01-25 | zebra: Remove impossible to use function | Donald Sharp
The rib_update_handle_vrf function is no longer being used. Clean up its usage from zebra.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-01-23 | zebra: use typesafe lib lists in zebra dplane | Mark Stapp
Replace some of the old queue/DLIST macros with typesafe dlists.
Signed-off-by: Mark Stapp <mjs@labn.net>
2023-01-18 | Merge pull request #12604 from donaldsharp/distance_metric_offload_fixes | Russ White
Distance/metric offload fixes
2023-01-16 | zebra: fix use after free on RIB processing | Rafael Zalamena
After calling `rib_unlink` the variable `re` will point to `free()`d memory, so don't attempt to use it after this point. Found by Coverity Scan (Coverity ID 1519784).
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
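A common way to avoid this class of bug is to copy whatever is still needed out of the object before the call that may free it. The sketch below shows that pattern only; `sketch_re` and `sketch_unlink` are hypothetical stand-ins and say nothing about how the actual fix in zebra_rib.c was written.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, minimal route entry. */
struct sketch_re {
	int type;
};

struct sketch_re *sketch_re_new(int type)
{
	struct sketch_re *re = calloc(1, sizeof(*re));

	if (re != NULL)
		re->type = type;
	return re;
}

/* Stand-in for an unlink operation that frees the entry. */
void sketch_unlink(struct sketch_re *re)
{
	free(re);
}

/* Unlink `re` and report its type without touching freed memory. */
int sketch_unlink_and_report(struct sketch_re *re)
{
	int type;

	assert(re != NULL);

	type = re->type;   /* copy what we need BEFORE the free */
	sketch_unlink(re); /* `re` is dangling from here on */
	return type;       /* uses the saved copy, not re->type */
}
```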
2023-01-05 | zebra: Set metric appropriately on route offload to asic | Donald Sharp
When FRR receives a route from the kernel reporting route offload success/failure, the metric being reported is not going to be correct, since we may not know it appropriately at this point in time. If we can, set the metric to something appropriate.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-01-05 | zebra: Fix distance being set incorrectly on kernel offload update | Donald Sharp
When we are notified by the kernel about whether a route was offloaded, set the distance correctly.
Ticket: CM-33097
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-12-15 | zebra: When freeing the early route queue, actually free it right | Donald Sharp
The early route queue has a series of `struct zebra_early_route *` entries. Zebra is treating this memory as just a `struct route_entry`. This is wrong. Correct this to free the memory correctly.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-12-15 | lib, tests, zebra: Remove unused workqueue error function | Donald Sharp
The wq->spec.errorfunc is never used in the code. It's been in the code base since 2005 and I also do not remember ever seeing it being called. No workqueue process function ever returns error. Since it's not used let's just remove it from the code base.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-12-13 | zebra: Read from the dplane_fpm_nl a route update | Donald Sharp
Read from the fpm dplane a route update that will include status about whether or not the asic was successful in offloading the route. Have this data passed up to zebra for processing and disseminate this data as appropriate.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-12-12 | zebra: Add `zrouter.asic_notification_nexthop_control` | Donald Sharp
Volta submitted notification changes for the dplane that had a special use case for their system. Volta is no more, the code is not being actively developed, and from talking with ex-Volta employees there are no current plans to even maintain this code. Wrap the special handling of nexthops that their asic-dataplane did in a bit of code to isolate it and allow for future removal, as I do not actually believe anyone else is using this code. Add a CPP_NOTICE several years into the future that will tell us to remove the code. If someone starts using it then they will have to notice this variable to set it and hopefully they will see my CPP_NOTICE to come talk to us. If this is being used then we can just remove this wrapper.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-12-12 | zebra: Return statements do not use paranthesis | Donald Sharp
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-22 | zebra: traffic control state management | Siger Yang
This allows Zebra to manage QDISC, TCLASS, TFILTER in the kernel and clean them up when it starts up.
Signed-off-by: Siger Yang <siger.yang@outlook.com>
2022-10-26 | zebra: Fix handling of recursive routes when processing closely in time | Donald Sharp
When zebra receives routes from upper level protocols it decodes the zapi message and places the routes on the metaQ for processing. Suppose we have a route A that is already installed by some routing protocol, and there is a route B that has a nexthop that will be recursively resolved through A. Imagine that a route replace operation for A is going to happen from an upper level protocol at about the same time that route B is going to be installed into zebra. If these routes are received, and decoded, at about the same time there exists a chance that the metaQ will contain both of them at the same time.
If the order of installation is [ B, A ]: B will be resolved correctly through A and installed, A will be processed and re-installed into the FIB. If the nexthops have changed for A then the owner of B should be notified about the change (and B can do the correct action here and decide to withdraw or re-install).
Now imagine that the order of routes received for processing on the metaQ is [ A, B ]: A will be received, processed and sent to the dataplane for reinstall. B will then be pulled off the metaQ and fail the install since A is in a `not Installed` state.
Let's loosen the restriction in nexthop resolution for B such that if the route we are dependent on is a route replace operation, the resolution is allowed to succeed. This requires zebra to track a new route state (ROUTE_ENTRY_ROUTE_REPLACING) that can be looked at during nexthop resolution. I believe this is ok because A is a route replace operation, which could result in this:
- route install failed, in which case B should be nht'ing and will receive the nht failure and the upper level protocol should remove B.
- route install succeeded, no nexthop changes. In this case allowing the resolution for B is ok, NHT will not notify the upper level protocol so no action is needed.
- route install succeeded, nexthops changed. In this case allowing the resolution for B is ok, NHT will notify the upper level protocol and it can decide to reinstall B or not based upon its own algorithm.
This set of events was found by the bgp_distance_change topotest(s). Effectively the tests were looking for the bug (A, B order in the metaQ) as the `correct` state. When under very heavy load, the A, B ordering caused A to just be installed and fully resolved in the dataplane before B is gotten to (which is entirely possible).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
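The loosened resolution rule from the commit above can be sketched as a status check that accepts either an installed route or one that is in the middle of a replace. The flag names and bit values below are hypothetical (only ROUTE_ENTRY_ROUTE_REPLACING is named in the commit message), and the assumption that starting a replace clears the installed bit is taken from the "not Installed" wording, not from the code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bits; names and values are illustrative only. */
#define SKETCH_RE_INSTALLED (1u << 0)
#define SKETCH_RE_ROUTE_REPLACING (1u << 1)

/* Assumption: a replace sent to the dataplane leaves the route
 * not-installed until the result comes back. */
void sketch_begin_replace(uint32_t *status)
{
	assert(status != NULL);
	*status &= ~SKETCH_RE_INSTALLED;
	*status |= SKETCH_RE_ROUTE_REPLACING;
}

/* May another route's nexthop resolve through a route in this state? */
bool sketch_can_resolve_through(uint32_t status)
{
	return (status & (SKETCH_RE_INSTALLED | SKETCH_RE_ROUTE_REPLACING))
	       != 0;
}
```

In the [ A, B ] ordering, A has gone through `sketch_begin_replace` by the time B is processed, and `sketch_can_resolve_through` still returns true, so B is not rejected while A's replace is in flight.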
2022-09-24 | zebra: fix fpm crash | anlan_cs
Fix issue #11996. When removing a VRF (and all routes of this VRF), zebra did not check whether its routes were in the update queue of FPM. The FPM module would then crash while dealing with these routes, which were already freed. Add a new hook, `rib_shutdown()`; `zebra_rtable_node_cleanup()` will use it to remove these routes from the update queue of the FPM module before freeing them.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-08-17 | zebra: Create a zebra_rib_route_entry_new function and use it | Donald Sharp
Abstract the creation of the route_entry and use it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-17 | zebra: Introduce early route processing on the MetaQ | Donald Sharp
Currently if an operator does this operation:
    sharpd@eva ~/frr8> sudo ip nexthop add id 5000 via 192.168.119.44 dev enp39s0 ; sudo ip route add 10.0.0.1 nhid 5000
    2022/06/30 08:52:40 ZEBRA: [ZHQK5-J9M1R] proto2zebra: Please add this protocol(0) to proper rt_netlink.c handling
    2022/06/30 08:52:40 ZEBRA: [PS16P-365FK][EC 4043309076] Zebra failed to find the nexthop hash entry for id=5000 in a route entry
    sharpd@eva ~/frr8> vtysh -c "show ip route 10.0.0.1"
    Routing entry for 0.0.0.0/0
      Known via "kernel", distance 0, metric 100, best
      Last update 00:01:58 ago
      * 192.168.119.1, via enp39s0
The route is dropped by zebra with no warnings. This is not good, but unlikely to happen at this point in time. In order to fix this issue, route processing from inputs needs to happen after nexthop group processing from inputs. This was not possible because nexthop groups are placed on the metaQ. As such the above nexthop group creation is placed on the metaQ for processing in META_QUEUE_NHG. Then the route is read in and processed immediately. The nexthop group is not found (not processed yet!) and the route is dropped in zebra. Modify the code to have early route processing of validity on the MetaQ. This preserves the order of operations.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
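The ordering problem in the commit above can be reproduced in a toy model: a nexthop group that is still queued is invisible to an immediate lookup, but a route that is queued behind it validates fine once both are drained in order. Everything below (`sketch_state`, the fixed-size arrays, the function names) is a hypothetical simplification and not the zebra data model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX 8

/* Hypothetical model: a table of known NHG ids plus two pending queues. */
struct sketch_state {
	uint32_t known_nhg[SKETCH_MAX];
	size_t known_count;
	uint32_t queued_nhg[SKETCH_MAX];
	size_t queued_nhg_count;
	uint32_t queued_route_nhid[SKETCH_MAX];
	size_t queued_route_count;
};

/* Lookup a route would do if it were validated immediately on receipt. */
bool sketch_nhg_known(const struct sketch_state *s, uint32_t id)
{
	for (size_t i = 0; i < s->known_count; i++)
		if (s->known_nhg[i] == id)
			return true;
	return false;
}

void sketch_nhg_enqueue(struct sketch_state *s, uint32_t id)
{
	assert(s->queued_nhg_count < SKETCH_MAX);
	s->queued_nhg[s->queued_nhg_count++] = id;
}

void sketch_route_enqueue(struct sketch_state *s, uint32_t nhid)
{
	assert(s->queued_route_count < SKETCH_MAX);
	s->queued_route_nhid[s->queued_route_count++] = nhid;
}

/* Drain the NHG queue first, then validate queued routes.
 * Returns the number of routes whose nexthop group was found. */
size_t sketch_process(struct sketch_state *s)
{
	size_t accepted = 0;

	for (size_t i = 0; i < s->queued_nhg_count; i++) {
		assert(s->known_count < SKETCH_MAX);
		s->known_nhg[s->known_count++] = s->queued_nhg[i];
	}
	s->queued_nhg_count = 0;

	for (size_t i = 0; i < s->queued_route_count; i++)
		if (sketch_nhg_known(s, s->queued_route_nhid[i]))
			accepted++;
	s->queued_route_count = 0;

	return accepted;
}
```

Queueing nexthop group 5000 and then checking `sketch_nhg_known` right away fails, which mirrors the dropped route; queueing the route as well and calling `sketch_process` accepts it.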
2022-08-17 | zebra: Convert label processing to Meta-Q | Donald Sharp
Convert label processing that comes from zapi messages into being handled by the meta-Q. This is because early route processing is going to be moved to the meta-Q as well and we will have a chicken and egg problem without moving this code to be processed by the meta-Q. Ordering of messages from ospf as an example:
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:48] comes from socket [36]
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:48] comes from socket [36]
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:48] comes from socket [36]
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:48] comes from socket [36]
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:62] comes from socket [36]
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:43] comes from socket [36]
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36]
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36]
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36]
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36]
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:61] comes from socket [36]
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36]
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36]
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_MPLS_LABELS_REPLACE:0:47] comes from socket [36]
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_MPLS_LABELS_REPLACE:0:66] comes from socket [36]
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_MPLS_LABELS_REPLACE:0:47] comes from socket [36]
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_MPLS_LABELS_REPLACE:0:47] comes from socket [36]
    2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_MPLS_LABELS_REPLACE:0:47] comes from socket [36]
The ZEBRA_MPLS_LABELS_REPLACE messages immediately turn around and attempt to replace nexthop labels on routes that were added. If the route add is placed on the metaQ, it will not exist yet and as such the label replace will fail. Modify the zebra code to take the label operations and place them on the metaQ as well.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-11 | zebra: add tc netlink and dplane ops | Siger Yang
This commit implements necessary netlink encoders for traffic control including QDISC, TCLASS and TFILTER, and adds basic dplane operations.
Co-authored-by: Stephen Worley <sworley@nvidia.com>
Signed-off-by: Siger Yang <siger.yang@outlook.com>
2022-08-10 | zebra: Combine meta_queue_free and meta_queue_vrf_free functions | Donald Sharp
These functions essentially do the same thing. Combine them for the goodness of mankind.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-10 | zebra: System routes should be processed the same time as kernel | Donald Sharp
For whatever reason, ZEBRA_ROUTE_SYSTEM routes were being processed last. Since a system route is just another kernel route type, let's just switch it to be processed at the same time as kernel routes.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-10 | zebra: Let's use enum for META Queue indexes | Donald Sharp
Convert the meta queue values to an enum and use them.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-10 | zebra: Explicitly call out the correct queue name | Donald Sharp
There were more than a few places where the NHG meta queue was not being explicitly called out. Let's be consistent and use the same nomenclature as much as possible when talking about metaQ's.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>