path: root/zebra/zebra_rib.c
Age | Commit message | Author
2021-01-13 | Merge pull request #7819 from donaldsharp/more_data_for_debug_dumps | Mark Stapp
zebra: Add ability to display human readable format re->flags and status
2021-01-13 | Merge pull request #7818 from donaldsharp/ip_proto_denied | Mark Stapp
zebra: notify installing protocol when nexthops cannot be resolved
2021-01-13 | zebra: Add ability to display in human readable format re->flags and status | Donald Sharp
The re->flags and re->status in debugs were being dumped as hex values. I can never quickly decode this. Here is an idea. Let's let FRR do it for me. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
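The idea of letting the code decode a flag word can be sketched as below. This is only an illustration: the `EX_FLAG_*` bits, the name table, and `ex_flags_dump()` are hypothetical and are not the definitions or helpers used in zebra.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag bits; NOT the real definitions from zebra. */
#define EX_FLAG_CHANGED   (1u << 0)
#define EX_FLAG_REMOVED   (1u << 1)
#define EX_FLAG_INSTALLED (1u << 2)

struct ex_flag_name {
	uint32_t bit;
	const char *name;
};

static const struct ex_flag_name ex_flag_names[] = {
	{EX_FLAG_CHANGED, "Changed"},
	{EX_FLAG_REMOVED, "Removed"},
	{EX_FLAG_INSTALLED, "Installed"},
};

/*
 * Write the names of all set bits into buf, separated by spaces.
 * Bits without a known name are appended as one hex value.
 * Returns buf so the call can be used directly as a printf argument.
 */
const char *ex_flags_dump(uint32_t flags, char *buf, size_t len)
{
	size_t used = 0;
	uint32_t known = 0;
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (i = 0; i < sizeof(ex_flag_names) / sizeof(ex_flag_names[0]); i++) {
		int n;

		known |= ex_flag_names[i].bit;
		if (!(flags & ex_flag_names[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? " " : "", ex_flag_names[i].name);
		if (n < 0 || (size_t)n >= len - used)
			return buf; /* output truncated, stop here */
		used += (size_t)n;
	}

	if (flags & ~known)
		snprintf(buf + used, len - used, "%s0x%x", used ? " " : "",
			 (unsigned int)(flags & ~known));

	return buf;
}
```

A debug statement would then print the result of `ex_flags_dump(flags, buf, sizeof(buf))` with `%s` instead of printing the raw hex value.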
2021-01-13 | Merge pull request #6853 from mjstapp/fix_rib_dups | Donald Sharp
zebra: reduce impact of route-update overload
2021-01-11 | zebra: notify installing protocol when nexthops cannot be resolved | Donald Sharp
In the case where a route's nexthops cannot be resolved as part of route processing, immediately notify the upper level protocol that its routes failed to install, if it is interested in being informed about this issue. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-12-29 | Merge pull request #7777 from volta-networks/fix_zebra_rib_c++ | Quentin Young
zebra: avoid c++ reserved keyword
2020-12-21 | zebra: avoid c++ reserved keyword | Emanuele Di Pascale
In rib_handle_nhg_replace, do not use 'new' as a parameter name, to allow compilation of C++ code that includes zebra headers. Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
2020-12-11 | Revert "zebra: When shutting down an interface immediately notify about rnh" | Donald Sharp
This reverts commit 0aaa722883245c2109d9856ca0656749860fc579.
2020-12-08 | zebra: Gather opaque data into the route entry for storage | Donald Sharp
Just gather the opaque data into the route entry. Later commits will display this data for end users as well as to send it down. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-12-08 | lib, zebra: Fix overlapping message types | Donald Sharp
We had duplicate message IDs, so message types overlapped. Fix them. It is unclear how this ever worked properly. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-12-07 | zebra: remove useless deleted route_entries promptly | Mark Stapp
Zebra accumulates route-entry objects and then processes them as a group. If that rib processing is delayed, for example because the dataplane/fib programming has built up a queue, zebra can hold multiple deleted route objects in memory. At scale, this can be a problem. Delete unneeded route entries promptly, if they can't contribute to rib processing. Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-11-30 | zebra: free dplane ctx after pw update | Mark Stapp
Free the dplane contexts used for pseudowire updates; we were leaking these. Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-11-15 | zebra: Add `--asic-offload` command | Donald Sharp
Add a command that allows FRR to know it's being used with an underlying ASIC offload, from the Linux kernel perspective. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-11-06 | bgpd: Advertise FIB installed routes to bgp peers (Part 1) | Soman K S
Issue: BGP routes learnt from peers that are not installed in the kernel are still advertised to peers. This can cause routers to send traffic to these destinations only to have it dropped. The fix is to provide a configurable option "bgp suppress-fib-pending". When the option is enabled, bgp will advertise routes only if they are successfully installed in the kernel.
Fix (Part 1):
* Added message ZEBRA_ROUTE_NOTIFY_REQUEST, used by a client to request FIB install status for routes
* Added AFI/SAFI to ZAPI messages
* Modified the functions zapi_route_notify_decode(), zsend_route_notify_owner() and route_notify_internal() to include AFI, SAFI as parameters
Signed-off-by: kssoman <somanks@gmail.com>
2020-10-29 | Merge pull request #7414 from donaldsharp/32bitflags | Jafar Al-Gharaibeh
zebra: Consolidate on 32 bits as the flag size for route flags
2020-10-29 | zebra: Consolidate on 32 bits as the flag size for route flags | Donald Sharp
When we get a route for installation via any method we should consolidate on 32 bits as the flag size, since we actually have more than 8 bits of data to pass around. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-10-29 | zebra: Don't do expensive string manip if not in debug | Donald Sharp
Modify the code to not load up a string that is only used in debugging unless we are debugging. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
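The pattern described here, formatting a debug-only string only when debugging is on, might look like the sketch below. The names `ex_debug_enabled`, `ex_format_route()` and `ex_route_event()` are hypothetical and exist only for this illustration; they are not zebra's debug macros or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical global debug switch, standing in for a debug-flag check. */
bool ex_debug_enabled;

/* Counts how often the (comparatively expensive) formatting ran. */
unsigned int ex_format_calls;

/* Build the string that is only ever consumed by debug output. */
static void ex_format_route(char *buf, size_t len, const char *prefix,
			    int table)
{
	ex_format_calls++;
	snprintf(buf, len, "(%d):%s", table, prefix);
}

void ex_route_event(const char *prefix, int table)
{
	assert(prefix != NULL);

	/* Only pay for the string formatting when it will be printed. */
	if (ex_debug_enabled) {
		char buf[64];

		ex_format_route(buf, sizeof(buf), prefix, table);
		fprintf(stderr, "%s: processing route\n", buf);
	}

	/* ... the real route handling would run here in either case ... */
}
```

With the switch off, `ex_route_event()` never touches the buffer, so the formatting cost disappears from the normal path.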
2020-10-26 | zebra: dplane APIs for programming evpn-mh access port attributes | Anuradha Karuppiah
This includes:
1. non-DF block filter
2. List of es-peers that need to be blocked per-access port (for split horizon filtering)
3. Backup nexthop group to failover local-es via the VxLAN overlay
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-10-26 | zebra: Replace some prefix2str with %pFX | Donald Sharp
We are loading a buffer with the prefix2str results then using it in the debugs throughout functions. Replace with just using %pFX and remove the buffer. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-10-22 | :* Convert prefix2str to %pFX | Donatas Abraitis
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-10-20 | Merge pull request #7311 from donaldsharp/table_lock_count | Donatas Abraitis
Abstract rn->lock accessing and cleanup usage to %pFX and %pRN
2020-10-18 | Merge pull request #7333 from mjstapp/fix_multi_connected | Donald Sharp
zebra: support multiple connected subnets on an interface
2020-10-17 | *: Create/Use accessor functions for lock count | Donald Sharp
Create appropriate accessor functions for the rn->lock data. We should be accessing this data through accessor functions since it is private data to the data structure. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
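As a rough sketch of what accessor functions for a private lock count look like, consider the following. `struct ex_node` and the `ex_node_*` functions are simplified, hypothetical stand-ins; they are not the route table API.

```c
#include <assert.h>

/* Simplified stand-in for a table node; only the lock count is shown. */
struct ex_node {
	unsigned int lock; /* reference count, private to the table code */
};

/* Read-only accessor: callers never touch n->lock directly. */
unsigned int ex_node_get_lock_count(const struct ex_node *n)
{
	return n->lock;
}

/* Take a reference and return the node for convenient chaining. */
struct ex_node *ex_node_lock(struct ex_node *n)
{
	n->lock++;
	return n;
}

/* Drop a reference; dropping below zero would be a caller bug. */
void ex_node_unlock(struct ex_node *n)
{
	assert(n->lock > 0);
	n->lock--;
}
```

Callers that only want to log or test the count use `ex_node_get_lock_count()`, which leaves the structure free to change how the count is stored.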
2020-10-17 | zebra: Fix use after free in debug path | Donald Sharp
When zebra is running with debugs turned on there is a use after free reported by the address sanitizer:

    2020/10/16 12:58:02 ZEBRA: rib_delnode: (0:254):4.5.6.16/32: rn 0x60b000026f20, re 0x6080000131a0, removing
    2020/10/16 12:58:02 ZEBRA: rib_meta_queue_add: (0:254):4.5.6.16/32: queued rn 0x60b000026f20 into sub-queue 3
    =================================================================
    ==3101430==ERROR: AddressSanitizer: heap-use-after-free on address 0x608000011d28 at pc 0x555555705ab6 bp 0x7fffffffdab0 sp 0x7fffffffdaa8
    READ of size 8 at 0x608000011d28 thread T0
        #0 0x555555705ab5 in re_list_const_first zebra/rib.h:222
        #1 0x555555705b54 in re_list_first zebra/rib.h:222
        #2 0x555555711a4f in process_subq_route zebra/zebra_rib.c:2248
        #3 0x555555711d2e in process_subq zebra/zebra_rib.c:2286
        #4 0x555555711ec7 in meta_queue_process zebra/zebra_rib.c:2320
        #5 0x7ffff74701f7 in work_queue_run lib/workqueue.c:291
        #6 0x7ffff7450e9c in thread_call lib/thread.c:1581
        #7 0x7ffff738eaf7 in frr_run lib/libfrr.c:1099
        #8 0x55555561a578 in main zebra/main.c:455
        #9 0x7ffff7079cc9 in __libc_start_main ../csu/libc-start.c:308
        #10 0x5555555e3429 in _start (/usr/lib/frr/zebra+0x8f429)

    0x608000011d28 is located 8 bytes inside of 88-byte region [0x608000011d20,0x608000011d78)
    freed by thread T0 here:
        #0 0x7ffff768bb6f in __interceptor_free (/lib/x86_64-linux-gnu/libasan.so.6+0xa9b6f)
        #1 0x7ffff739ccad in qfree lib/memory.c:129
        #2 0x555555709ee4 in rib_gc_dest zebra/zebra_rib.c:746
        #3 0x55555570ca76 in rib_process zebra/zebra_rib.c:1240
        #4 0x555555711a05 in process_subq_route zebra/zebra_rib.c:2245
        #5 0x555555711d2e in process_subq zebra/zebra_rib.c:2286
        #6 0x555555711ec7 in meta_queue_process zebra/zebra_rib.c:2320
        #7 0x7ffff74701f7 in work_queue_run lib/workqueue.c:291
        #8 0x7ffff7450e9c in thread_call lib/thread.c:1581
        #9 0x7ffff738eaf7 in frr_run lib/libfrr.c:1099
        #10 0x55555561a578 in main zebra/main.c:455
        #11 0x7ffff7079cc9 in __libc_start_main ../csu/libc-start.c:308

    previously allocated by thread T0 here:
        #0 0x7ffff768c037 in calloc (/lib/x86_64-linux-gnu/libasan.so.6+0xaa037)
        #1 0x7ffff739cb98 in qcalloc lib/memory.c:110
        #2 0x555555712ace in zebra_rib_create_dest zebra/zebra_rib.c:2515
        #3 0x555555712c6c in rib_link zebra/zebra_rib.c:2576
        #4 0x555555712faa in rib_addnode zebra/zebra_rib.c:2607
        #5 0x555555715bf0 in rib_add_multipath_nhe zebra/zebra_rib.c:3012
        #6 0x555555715f56 in rib_add_multipath zebra/zebra_rib.c:3049
        #7 0x55555571788b in rib_add zebra/zebra_rib.c:3327
        #8 0x5555555e584a in connected_up zebra/connected.c:254
        #9 0x5555555e42ff in connected_announce zebra/connected.c:94
        #10 0x5555555e4fd3 in connected_update zebra/connected.c:195
        #11 0x5555555e61ad in connected_add_ipv4 zebra/connected.c:340
        #12 0x5555555f26f5 in netlink_interface_addr zebra/if_netlink.c:1213
        #13 0x55555560f756 in netlink_information_fetch zebra/kernel_netlink.c:350
        #14 0x555555612e49 in netlink_parse_info zebra/kernel_netlink.c:941
        #15 0x55555560f9f1 in kernel_read zebra/kernel_netlink.c:402
        #16 0x7ffff7450e9c in thread_call lib/thread.c:1581
        #17 0x7ffff738eaf7 in frr_run lib/libfrr.c:1099
        #18 0x55555561a578 in main zebra/main.c:455
        #19 0x7ffff7079cc9 in __libc_start_main ../csu/libc-start.c:308

    SUMMARY: AddressSanitizer: heap-use-after-free zebra/rib.h:222 in re_list_const_first

This is happening because we are using the dest pointer after a call into rib_gc_dest. In process_subq_route, we call rib_process(), and if the dest is deleted the dest pointer is now garbage. We must reload the dest pointer in this case.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
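The shape of the fix, reloading a cached pointer after a call that may free it, can be shown with a small standalone sketch. The types and the functions `ex_process()` and `ex_process_and_report()` are hypothetical simplifications, not the code in zebra_rib.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the per-prefix dest and its owning node. */
struct ex_dest {
	int nroutes;
};

struct ex_node {
	struct ex_dest *info; /* owned by the node, may be freed and cleared */
};

/*
 * Process a node. When no routes remain, garbage-collect the dest:
 * free it and clear the node's pointer (this plays the role that the
 * gc step plays in the commit message above).
 */
void ex_process(struct ex_node *rn)
{
	struct ex_dest *dest = rn->info;

	if (dest != NULL && dest->nroutes == 0) {
		free(dest);
		rn->info = NULL;
	}
}

/*
 * Returns the number of routes left after processing, or -1 when the
 * dest was freed during processing.
 */
int ex_process_and_report(struct ex_node *rn)
{
	struct ex_dest *dest;

	assert(rn != NULL);

	ex_process(rn);

	/*
	 * Reload from the node AFTER processing. A copy of rn->info taken
	 * before ex_process() could now point at freed memory.
	 */
	dest = rn->info;
	if (dest == NULL)
		return -1;

	return dest->nroutes;
}
```

The key point is that the only safe source for the pointer after the call is the owning node, which the freeing code keeps up to date.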
2020-10-16 | zebra: support multiple connected subnets on an interface | Mark Stapp
We support configuration of multiple addresses in the same subnet on a single interface: make sure that zebra supports multiple instances of the corresponding connected route. Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-10-01 | zebra: Make connected routes their own entry on the meta_q | Donald Sharp
During quick ifdown / ifup events from the Linux kernel there exists a situation where a prefix that has both a kernel route and a static route can be queued up on the meta-q. Suppose the static route happens to point at a connected route for nexthop resolution and we receive a series of quick up/down events *after* the static route and kernel route are queued up for rib reprocessing. Since the static route and kernel route are queued on meta-q 1 and the connected route is also on meta-q 1, the connected route can be resolved after the static route fails to resolve, leaving the static route in an unresolved state. Add a new queue level and put connected routes on their own level, since they are the fundamental building blocks of pretty much all the other routes. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
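A minimal sketch of the queueing idea follows, assuming a meta queue made of prioritized sub-queues where lower indices are serviced first. The route types, the index assignments in `ex_subq[]`, and the `ex_*` functions are hypothetical and do not reproduce zebra's actual mapping.

```c
#include <assert.h>

/* Hypothetical route types for this sketch. */
enum ex_route_type {
	EX_CONNECTED,
	EX_KERNEL,
	EX_STATIC,
	EX_BGP,
	EX_TYPE_MAX
};

#define EX_NUM_SUBQ 3

/*
 * Assumed mapping of route type to sub-queue. Connected routes get a
 * level of their own ahead of kernel and static routes, so they are
 * always resolved before routes that depend on them.
 */
static const unsigned int ex_subq[EX_TYPE_MAX] = {
	[EX_CONNECTED] = 0,
	[EX_KERNEL] = 1,
	[EX_STATIC] = 1,
	[EX_BGP] = 2,
};

/* Only the number of queued items per level is tracked here. */
struct ex_meta_queue {
	unsigned int count[EX_NUM_SUBQ];
};

void ex_enqueue(struct ex_meta_queue *mq, enum ex_route_type type)
{
	assert((unsigned int)type < EX_TYPE_MAX);
	mq->count[ex_subq[type]]++;
}

/*
 * Service one item from the lowest-numbered non-empty sub-queue.
 * Returns the sub-queue index that was serviced, or -1 if all are empty.
 */
int ex_process_one(struct ex_meta_queue *mq)
{
	int i;

	for (i = 0; i < EX_NUM_SUBQ; i++) {
		if (mq->count[i] > 0) {
			mq->count[i]--;
			return i;
		}
	}
	return -1;
}
```

Even if a static and a kernel route are enqueued before the connected route, the connected route's level is drained first, which is the ordering property the commit relies on.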
2020-10-01 | zebra: When processing route_entries ignore unusable routes | Donald Sharp
When zebra is processing routes to determine what to send to the rib, suppose we have two routes: (a) a route processed earlier, none of whose nexthops were active, and (b) a route that has good nexthops but a worse admin distance. rib_process would not re-examine (a)'s nexthops because the ROUTE_ENTRY_CHANGED flag was not set, and (a) would win when compared to (b) because its admin distance was better, leaving us in a state where we would attempt and fail to install route (a) because it was not valid. Modify the code to consider the number of nexthops we have when determining whether we can use the route. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
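The selection rule can be illustrated with a small sketch in which a route with no active nexthops never wins, regardless of admin distance. `struct ex_route` and `ex_select()` are hypothetical simplifications of the real comparison logic.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified route entry: only the fields the comparison needs. */
struct ex_route {
	unsigned char distance;       /* admin distance, lower is better */
	unsigned int active_nexthops; /* nexthops currently usable */
};

/*
 * Pick the better of two candidates. A route with zero active nexthops
 * is treated as absent, so it cannot beat a usable route that merely
 * has a worse admin distance. Returns NULL if neither is usable.
 */
const struct ex_route *ex_select(const struct ex_route *current,
				 const struct ex_route *candidate)
{
	const struct ex_route *best;

	if (current != NULL && current->active_nexthops == 0)
		current = NULL;
	if (candidate != NULL && candidate->active_nexthops == 0)
		candidate = NULL;

	if (candidate == NULL)
		best = current;
	else if (current == NULL)
		best = candidate;
	else
		best = candidate->distance < current->distance ? candidate
							       : current;

	/* Whatever is selected must be installable. */
	assert(best == NULL || best->active_nexthops > 0);
	return best;
}
```

In the scenario from the commit message, route (a) has the better distance but no active nexthops, so `ex_select()` returns route (b).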
2020-09-30 | zebra: Prevent uninstall attempts when new entry is not happy | Donald Sharp
In rib_process_update_fib, the function is sent two route entries: the old (previously installed) and the new (the one to install). When the function detects that the new entry is unusable because the number of nexthops that are usable for that route is 0, we uninstall the old route. The problem here is that we should not attempt to uninstall any route that is not owned by FRR. Modify the code to not attempt this behavior. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-09-28 | zebra: fix refcnt/rib issues in NHG replace/delete | Stephen Worley
Fix some reference counting issues seen when replacing a NHG and deleting one. For replacement, we should end with the same refcnt on the new one. For delete, it's the caller's job to decrement its ref after it's done with it. Further, update routes in the rib with the new pointer after replace. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-09-28 | zebra: handle zapi routes with NHG ID set | Stephen Worley
Add code to properly handle routes sent with NHG ID rather than a nexthop_group. For now, we separate this from backup nexthop handling since that should probably be added to the nhg_proto_add calls. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-08-28 | zebra: When we get a rib deletion event be smarter | Donald Sharp
When we get a rib deletion event and we already have that particular route node in the queue to be reprocessed, just note that someone from kernel land has done us dirty and allow it to be cleaned up by normal processing. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-08-28 | zebra: When shutting down an interface immediately notify about rnh | Donald Sharp
Imagine a situation where an interface is bouncing up/down. The interface comes up and daemons like pbr will get a nht tracking callback for a connected interface up and will install the routes down to zebra. At this same time the interface can go down. But since zebra is busy handling route changes (from pbr) it has not read the netlink message, and can get into a situation where the route resolves properly and then we attempt to install it into the kernel (which is rejected). If the interface bounces back up fast at this point, the down then up netlink messages will be read and create two route entries off the connected route node. Zebra will then enqueue both route entries for future processing. After this processing happens the down/up is collapsed into an up, and nexthop tracking sees no changes and does not inform any upper level protocol (in this case pbr) that nexthop tracking has changed. So pbr still believes the nexthops are good, but the routes are not installed since pbr has taken no action. Fix this by immediately running rnh when we signal that a connected route entry is scheduled for removal. This should cause upper level protocols to get a rnh notification for the small amount of time that the connected route was bouncing around like a madman. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-08-26 | zebra: When we fail, actually note the failure | Donald Sharp
During testing it was noticed that routes were considered installed by zebra, but the kernel did not have the route. Upon close debugging of the rib it was noticed that FRR was turning a dplane_ctx_route_init failure into a success, and FRR was now in a bad state.

    2020/08/26 17:55:53.897436 PBR: route_notify_owner: [0.0.0.0/0] Route Removed succeeded for table: 10012
    2020/08/26 17:55:53.897572 ZEBRA: 0.0.0.0/0: uptime == 432033, type == 24, instance == 0, table == 10012
    2020/08/26 17:55:53.897622 ZEBRA: rib_meta_queue_add: (0:10012):0.0.0.0/0: queued rn 0x5566b0ea7680 into sub-queue 5
    2020/08/26 17:55:53.907637 ZEBRA: default(0:10012):0.0.0.0/0: Processing rn 0x5566b0ea7680
    2020/08/26 17:55:53.907665 ZEBRA: default(0:10012):0.0.0.0/0: Examine re 0x5566b0d01200 (pbr) status 2 flags 1 dist 200 metric 0
    2020/08/26 17:55:53.907702 ZEBRA: default(0:10012):0.0.0.0/0: After processing: old_selected 0x0 new_selected 0x5566b0d01200 old_fib 0x0 new_fib 0x5566b0d01200
    2020/08/26 17:55:53.907713 ZEBRA: default(0:10012):0.0.0.0/0: Adding route rn 0x5566b0ea7680, re 0x5566b0d01200 (pbr)
    2020/08/26 17:55:53.907879 ZEBRA: default(0:10012):0.0.0.0/0: rn 0x5566b0ea7680 dequeued from sub-queue 5
    2020/08/26 17:55:53.907943 ZEBRA: netlink_route_multipath: RTM_NEWROUTE 0.0.0.0/0 vrf 0(10012)
    2020/08/26 17:55:53.910756 ZEBRA: default(0:10012):0.0.0.0/0 Processing dplane result ctx 0x5566b0ea82f0, op ROUTE_INSTALL result SUCCESS
    2020/08/26 17:55:53.910769 ZEBRA: update_from_ctx: default(0:10012):0.0.0.0/0: SELECTED, re 0x5566b0d01200
    2020/08/26 17:55:53.910785 ZEBRA: default(0:10012):0.0.0.0/0 update_from_ctx(): no fib nhg
    2020/08/26 17:55:53.910793 ZEBRA: default(0:10012):0.0.0.0/0 update_from_ctx(): rib nhg matched, changed 'true'
    2020/08/26 17:55:53.910802 ZEBRA: (0:10012):0.0.0.0/0: Redist update re 0x5566b0d01200 (pbr), old 0x0 (None)
    2020/08/26 17:55:53.910812 ZEBRA: Notifying Owner: 24 about prefix 0.0.0.0/0(10012) 2 vrf: 0
    2020/08/26 17:55:53.910912 PBR: route_notify_owner: [0.0.0.0/0] Route installed succeeded for table: 10012
    2020/08/26 17:55:55.400516 ZEBRA: RTM_DELROUTE 0.0.0.0/0 vrf default(0) table_id: 10012 metric: 20 Admin Distance: 0
    2020/08/26 17:55:55.400527 ZEBRA: rib_delete: (0:10012):0.0.0.0/0: rn 0x5566b0ea7680, re 0x5566b0d01200 (pbr) was deleted from kernel, adding

We were receiving a notification from the kernel that the route was deleted and deciding that we needed to reinstall it. At that point, when the request got into the dplane handlers to be handed over to the dplane pthread, the dplane decided to drop the request, convert it to a success, and not do anything. This code change removes the conversion of this failure to success and notifies the upper level about it.

After this change the default route in table 10012 is now properly marked as rejected:

    root@mlx-2700-07:mgmt:/var/log/frr# vtysh -c "show ip route table 10012"
    Codes: K - kernel route, C - connected, S - static, R - RIP,
           O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
           T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
           F - PBR, f - OpenFabric,
           > - selected route, * - FIB route, q - queued route, r - rejected route

    VRF default table 10012:
    F>r 0.0.0.0/0 [200/0] via 172.168.1.164, isp2-uplink (vrf PUBLIC), weight 1, 00:24:48

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-08-21 | zebra: fix SA warning in rib_process() | Mark Stapp
Fix an SA warning about a possible NULL pointer deref in rib_process(). Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-08-19 | zebra: Add table id to debug output | Donald Sharp
There are a bunch of places where the table id is not being output in debug messages for routing changes. Add in the table id we are operating on. This is especially useful for the case where pbr is working. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-08-12 | lib, zebra: add support for sending ARP requests | Jakub Urbańczyk
We can make the Linux kernel send an ARP/NDP request by adding a neighbour with the 'NUD_INCOMPLETE' state and the 'NTF_USE' flag. This commit adds a new dataplane operation as well as a new zapi message to allow other daemons to send ARP/NDP requests. Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
2020-08-10 | zebra: remove "PENDING" dplane request state | Jakub Urbańczyk
This request state is redundant with the new message batching interface. Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
2020-08-07 | lib, zebra: Add SR-TE policy infrastructure to zebra | Sebastien Merle
For the sake of Segment Routing (SR) and Traffic Engineering (TE) Policies there's a need for additional infrastructure within zebra. The infrastructure in this PR is supposed to manage such policies in terms of installing binding SIDs and LSPs. It is also capable of managing MPLS labels using the label manager, keeping track of nexthops (for resolving labels) and notifying interested parties about changes of a policy/LSP state. Further, it enables a route map mechanism for BGP and SR-TE colors such that learned BGP routes can be mapped onto SR-TE Policies. This PR does not introduce any usable features yet; it is just infrastructure for other upcoming PRs which will introduce 'pathd', a new SR-TE daemon. Co-authored-by: Renato Westphal <renato@opensourcerouting.org> Co-authored-by: GalaxyGorilla <sascha@netdef.org> Signed-off-by: Sebastien Merle <sebastien@netdef.org>
2020-07-27 | Merge pull request #6765 from mjstapp/backup_nhg_netlink | Renato Westphal
lib,zebra: support multiple backup nexthops
2020-07-17 | Merge pull request #6753 from mjstapp/fix_zebra_backup_sa | Stephen Worley
zebra: fix SA warnings in backup nexthop code
2020-07-17 | zebra: add a route_entry flag for FIB-specific nexthops | Mark Stapp
Add a route_entry flag to indicate the presence of a fib (installed) list of nexthops - more explicit and clearer. Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-17 | lib,sharpd,zebra: initial support for multiple backup nexthops | Mark Stapp
Initial changes to support a nexthop with multiple backups. Lib changes to hold a small array in each primary, zapi message changes to support sending multiple backups, and daemon changes to show commands to support multiple backups. The config input for multiple backup indices is not present here. Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-16 | zebra: fix SA warnings in backup nexthop code | Mark Stapp
Fix a couple of recent SA warnings that came from backup nexthop/nhlfe changes. Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-14 | *: un-split strings across lines | David Lamparter
Remove mid-string line breaks, cf. workflow doc:

    .. [#tool_style_conflicts] For example, lines over 80 characters are allowed for text strings to make it possible to search the code for them: please see `Linux kernel style (breaking long lines and strings) <https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_ and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.

Scripted commit, idempotent to running:
```
python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
```
Signed-off-by: David Lamparter <equinox@diac24.net>
2020-07-07 | zebra: improve logic handling backup nexthop installation | Mark Stapp
When handling a fib notification event that involves a route with backup nexthops, be clearer about representing the installed state of the backups: any installed backup will be on a dedicated route_entry list. Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-07 | zebra: add fib nhg for backups, revise api | Mark Stapp
Add an nhg for the fib-installed backup nexthops; rename an api to access the fib-installed nexthop nhg. Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-06-26 | zebra: prepare data plane for batching | Jakub Urbańczyk
* Add new zebra_dplane_result to allow kernel updates not to return a result immediately. Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
2020-06-25 | zebra: improve route_entry comparison logic | Mark Stapp
Improve and centralize some logic used to a) compare two route_entries, and b) to locate a route_entry that matches a dplane context object that contains the results of a fib update. We were not rigorous enough in checking routes' properties, especially when examining connected routes where we allow multiple route_entries. Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-06-10 | zebra: convert ip rule installation to use dplane thread | Jakub Urbańczyk
* Implement new dataplane operations
* Convert existing code to use dataplane context object
* Modify function preparing netlink message to use dataplane context object
Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
2020-06-01 | Merge pull request #6480 from volta-networks/feat_pwstatus | Renato Westphal
ldpd: Relay data plane pseudowire status in LDP notification