summaryrefslogtreecommitdiff
path: root/zebra/rib.h
AgeCommit message (Collapse)Author
2024-10-06zebra: remove unused function rib_lookup_ipv4Donna Sharp
Signed-off-by: Donna Sharp <dksharp5@gmail.com>
2024-09-16zebra: Expose _route_entry_dump_nh so it can be used.Donald Sharp
Expose this helper function so it can be used in zebra_nhg.c Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-10Merge pull request #15259 from dmytroshytyi-6WIND/nexthop_resolutionRuss White
zebra: add LSP entry to nexthop via recursive (part 2)
2024-09-05zebra: Modify show `zebra dplane providers` to give more dataDonald Sharp
The show zebra dplane provider command was ommitting the input and output queues to the dplane itself. It would be nice to have this insight as well. New output: r1# show zebra dplane providers dataplane Incoming Queue from Zebra: 100 Zebra dataplane providers: Kernel (1): in: 6, q: 0, q_max: 3, out: 6, q: 14, q_max: 3 dplane_fpm_nl (2): in: 6, q: 10, q_max: 3, out: 6, q: 0, q_max: 3 dataplane Outgoing Queue to Zebra: 43 r1# Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-27zebra: Handle kernel routes appropriatelyDonald Sharp
Current code intentionally ignores kernel routes. Modify zebra to allow these routes to be read in on linux. Also modify zebra to look to see if a route should be treated as a connected and mark it as such. Additionally this should properly handle some of the issues being seen with NOPREFIXROUTE. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-27zebra: Expose rib_update_handle_vrf_allDonald Sharp
This function will be used on interface down events to allow for kernel routes to be cleaned up. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-27zebra: Make p and src_p const for rib_deleteDonald Sharp
The prefix'es p and src_p are not const. Let's make them so. Useful to signal that we will not change this data. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-07-01*: Add and use option for graceful (re)startvivek
Add a new start option "-K" to libfrr to denote a graceful start, and use it in zebra and bgpd. zebra will use this option to denote a planned FRR graceful restart (supporting only bgpd currently) to wait for a route sync completion from bgpd before cleaning up old stale routes from the FIB. An optional timer provides an upper-bounds for this cleanup. bgpd will use this option to denote either a planned FRR graceful restart or a bgpd-only graceful restart, and this will drive the BGP GR restarting router procedures. Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>
2024-06-07zebra: the header containing the re->flags is in zclient.hPhilippe Guibert
Change the comment in the code that refers to ZEBRA_FLAG_XXX defines. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com> Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
2024-02-05zebra: Reorg `struct route_entry` to have important bits firstDonald Sharp
The `struct route_entry` had items that were almost never used at the front of the data structure resulting in items that would be loaded first into memory that were never used. Let's reorg a tiny bit and put all the frequently used items in the first cache line. I'm sure people will notice .000000001 speedup new layout: sharpd@eva /w/h/s/frr1 (reorg_route_entry)> /home/sharpd/pahole/build/pahole --reorganize --show_reorg_steps -C route_entry zebra/.libs/zebra struct route_entry { struct re_list_item next; /* 0 8 */ struct nhg_hash_entry * nhe; /* 8 8 */ uint32_t nhe_id; /* 16 4 */ uint32_t nhe_installed_id; /* 20 4 */ int type; /* 24 4 */ vrf_id_t vrf_id; /* 28 4 */ uint32_t table; /* 32 4 */ uint32_t metric; /* 36 4 */ uint32_t mtu; /* 40 4 */ uint32_t nexthop_mtu; /* 44 4 */ uint32_t flags; /* 48 4 */ uint32_t status; /* 52 4 */ uint32_t dplane_sequence; /* 56 4 */ uint16_t instance; /* 60 2 */ uint8_t distance; /* 62 1 */ /* XXX 1 byte hole, try to pack */ /* --- cacheline 1 boundary (64 bytes) --- */ route_tag_t tag; /* 64 4 */ /* XXX 4 bytes hole, try to pack */ time_t uptime; /* 72 8 */ struct re_opaque * opaque; /* 80 8 */ struct nexthop_group fib_ng; /* 88 32 */ struct nexthop_group fib_backup_ng; /* 120 32 */ /* size: 152, cachelines: 3, members: 20 */ /* sum members: 147, holes: 2, sum holes: 5 */ /* last cacheline: 24 bytes */ }; Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-12-11zebra: The dplane_fpm_nl return path leaks memoryDonald Sharp
The route entry created when using a ctx to pass route entry data backup to the master pthread in zebra is being leaked. Prevent this from happening. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-12-07zebra: enqueue NHG_DEL in rib_nhg meta queuePhilippe Guibert
The NHG_DEL operation is done directly from ZAPI call, whereas the NHG_ADD operation is done in the rib_nhg meta queue. This may be problematic when ADD is followed by DEL. Imagine a scenarion with two protocol NHIDs. <NH1> depends of <NH2> and <NH3>. The deletion of <NH3> at the protocol level will trigger 2 messages to ZEBRA: NHG_ADD(<NH1>) and NHG_DEL(<NH3>). Those operations are properly enqueued in ZAPI, but in the end, the NHG_DEL is executed first. This causes NHG_ADD to unlink an already freed NHG. Fix this by consistently enqueuing NHG_DEL and NHG_ADD operations. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-12-05Merge pull request #12600 from donaldsharp/local_routesRuss White
*: Introduce Local Host Routes to FRR
2023-11-21zebra: On shutdown, ensure ctx's in rib_dplane_q are freedDonald Sharp
a) Rename rib_init to zebra_rib_init() to better follow how things are named b) on shutdown cycle through the rib_dplane_q and free up any contexts sitting in it. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-11-06zebra: Move v6_rr_semantics to be part of zrouter structureDonald Sharp
Move global variable v6_rr_semantics from a global data structure into the zrouter data structure. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-11-01*: Introduce Local Host Routes to FRRDonald Sharp
Create Local routes in FRR: S 0.0.0.0/0 [1/0] via 192.168.119.1, enp39s0, weight 1, 00:03:46 K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp39s0, 00:03:51 O 192.168.119.0/24 [110/100] is directly connected, enp39s0, weight 1, 00:03:46 C>* 192.168.119.0/24 is directly connected, enp39s0, 00:03:51 L>* 192.168.119.224/32 is directly connected, enp39s0, 00:03:51 O 192.168.119.229/32 [110/100] via 0.0.0.0, enp39s0 inactive, weight 1, 00:03:46 C>* 192.168.119.229/32 is directly connected, enp39s0, 00:03:46 Create ability to redistribute local routes. Modify tests to support this change. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-08-31zebra: Fix zebra crash when replacing NHE during shutdownRajasekar Raja
During replace of a NHE from upper proto in zebra_nhg_proto_add(), - rib_handle_nhg_replace() is invoked with old NHE where we walk all RNs/REs & replace the re->nhe whose address points to old NHE. - In this walk, if prev re->nhe refcnt is decremented to 0, we free up the memory which the old NHE is pointing to. Later in zebra_nhg_proto_add(), we end up accessing this freed memory and crash. Logs: 1380766 2023/08/16 22:34:11.994671 ZEBRA: [WDEB1-93HCZ] zebra_nhg_decrement_ref: nhe 0x56091d890840 (70312519[2756/2762/2810]) 2 => 1 1380773 2023/08/16 22:34:11.994678 ZEBRA: [WDEB1-93HCZ] zebra_nhg_decrement_ref: nhe 0x56091d890840 (70312519[2756/2762/2810]) 1 => 0 1380777 2023/08/16 22:34:11.994844 ZEBRA: [JE46R-G2NEE] zebra_nhg_release: nhe 0x56091d890840 (70312519[2756/2762/2810]) 1380778 2023/08/16 22:34:11.994849 ZEBRA: [SCDBM-4H062] zebra_nhg_free: nhe 0x56091d890840 (70312519[2756/2762/2810]), refcnt 0 1380782 2023/08/16 22:34:11.995000 ZEBRA: [SCDBM-4H062] zebra_nhg_free: nhe 0x56091d890840 (0[]), refcnt 0 1380783 2023/08/16 22:34:11.995011 ZEBRA: lib/memory.c:84: mt_count_free(): assertion (mt->n_alloc) failed Backtrace: 0  0x00007f833f5f48eb in raise () from /lib/x86_64-linux-gnu/libc.so.6 1  0x00007f833f5df535 in abort () from /lib/x86_64-linux-gnu/libc.so.6 2  0x00007f833f636648 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 3  0x00007f833f63cd6a in ?? () from /lib/x86_64-linux-gnu/libc.so.6 4  0x00007f833f63cfb4 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 5  0x00007f833f63fbc8 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 6  0x00007f833f64172a in malloc () from /lib/x86_64-linux-gnu/libc.so.6 7  0x00007f833f6c3fd2 in backtrace_symbols () from /lib/x86_64-linux-gnu/libc.so.6 8  0x00007f833f9013fc in zlog_backtrace_sigsafe (priority=priority@entry=2, program_counter=program_counter@entry=0x7f833f5f48eb <raise+267>) at lib/log.c:222 9  0x00007f833f901593 in zlog_signal (signo=signo@entry=6, action=action@entry=0x7f833f988ee8 "aborting...", siginfo_v=siginfo_v@entry=0x7ffee1ce4a30,     program_counter=program_counter@entry=0x7f833f5f48eb <raise+267>) at lib/log.c:154 10 0x00007f833f92dbd1 in core_handler (signo=6, siginfo=0x7ffee1ce4a30, context=<optimized out>) at lib/sigevent.c:254 11 <signal handler called> 12 0x00007f833f5f48eb in raise () from /lib/x86_64-linux-gnu/libc.so.6 13 0x00007f833f5df535 in abort () from /lib/x86_64-linux-gnu/libc.so.6 14 0x00007f833f958f96 in _zlog_assert_failed (xref=xref@entry=0x7f833f9e4080 <_xref.10705>, extra=extra@entry=0x0) at lib/zlog.c:680 15 0x00007f833f905400 in mt_count_free (mt=0x7f833fa02800 <MTYPE_NH_LABEL>, ptr=0x51) at lib/memory.c:84 16 mt_count_free (ptr=0x51, mt=0x7f833fa02800 <MTYPE_NH_LABEL>) at lib/memory.c:80 17 qfree (mt=0x7f833fa02800 <MTYPE_NH_LABEL>, ptr=0x51) at lib/memory.c:140 18 0x00007f833f90799c in nexthop_del_labels (nexthop=nexthop@entry=0x56091d776640) at lib/nexthop.c:563 19 0x00007f833f907b91 in nexthop_free (nexthop=0x56091d776640) at lib/nexthop.c:393 20 0x00007f833f907be8 in nexthops_free (nexthop=<optimized out>) at lib/nexthop.c:408 21 0x000056091c21aa76 in zebra_nhg_free_members (nhe=0x56091d890840) at zebra/zebra_nhg.c:1628 22 zebra_nhg_free (nhe=0x56091d890840) at zebra/zebra_nhg.c:1628 23 0x000056091c21bab2 in zebra_nhg_proto_add (id=<optimized out>, type=9, instance=<optimized out>, session=0, nhg=nhg@entry=0x56091d7da028, afi=afi@entry=AFI_UNSPEC)     at zebra/zebra_nhg.c:3532 24 0x000056091c22bc4e in process_subq_nhg (lnode=0x56091d88c540) at zebra/zebra_rib.c:2689 25 process_subq (qindex=META_QUEUE_NHG, subq=0x56091d24cea0) at zebra/zebra_rib.c:3290 26 meta_queue_process (dummy=<optimized out>, data=0x56091d24d4c0) at zebra/zebra_rib.c:3343 27 0x00007f833f9492c8 in work_queue_run (thread=0x7ffee1ce55a0) at lib/workqueue.c:285 28 0x00007f833f93f60d in thread_call (thread=thread@entry=0x7ffee1ce55a0) at lib/thread.c:2008 29 0x00007f833f8f9888 in frr_run (master=0x56091d068660) at lib/libfrr.c:1223 30 0x000056091c1b8366 in main (argc=12, argv=0x7ffee1ce5988) at zebra/main.c:551 Issue: 3492162 Ticket# 3492162 Signed-off-by: Chirag Shah <chirag@nvidia.com> Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
2023-08-25Merge pull request #14256 from rodecker/rt-table-idDonald Sharp
zebra: Make main routing table (RT_TABLE_MAIN) configurable
2023-08-22zebra: Make main routing table (RT_TABLE_MAIN) configurableMartin Pels
Signed-off-by: Martin Pels <mpels@ripe.net>
2023-08-09zebra: fix assert in process_subq_routePavel Ivashchenko
Bug is reporoduced in case of switching interfaces betwean VRFs. ospf6d is enabled and configured in each VRF. 'dest' can be removed from the route node in the time when the same route node waiting processing in another sub-queue. A route node must only be in one sub-queue at a time. Details: 1. Config: interface if0 ipv6 address 2001:db8:cafe:2::2/64 ipv6 nat inside ipv6 ospf6 area 0.0.0.51 ipv6 ospf6 cost 10 vrf test2 exit ! interface if1 ipv6 address 2001:db8:cafe:4::1/64 ipv6 nat outside ipv6 ospf6 area 0.0.0.0 ipv6 ospf6 cost 10 vrf test2 exit ! router ospf6 ospf6 router-id 2.2.2.2 exit ! router ospf6 vrf test1 ospf6 router-id 2.2.2.2 exit ! router ospf6 vrf test2 ospf6 router-id 2.2.2.2 exit I just quickly switched interfaces between different VRFs (default/test1/test2). 2. Log messages: Aug 02 16:51:56 ubuntu zebra[386985]: [MFYWV-KH3MC] process_subq_early_route_add: (0:?):2001:db8:cafe:2::/64: Inserting route rn 0x56267593de90, re 0x56267595ae40 (connected) existing 0x0, same_count 0 Aug 02 16:51:56 ubuntu zebra[386985]: [Q4T2G-E2SQF] process_subq_early_route_add: dumping RE entry 0x56267595ae40 for 2001:db8:cafe:2::/64 vrf default(0) Aug 02 16:51:56 ubuntu zebra[386985]: [GCGMT-SQR82] rib_link: (0:?):2001:db8:cafe:2::/64: rn 0x56267593de90 adding dest Aug 02 16:51:56 ubuntu zebra[386985]: [JF0K0-DVHWH] rib_meta_queue_add: (0:254):2001:db8:cafe:2::/64: queued rn 0x56267593de90 into sub-queue Connected Routes Aug 02 16:51:56 ubuntu zebra[386985]: [QE6V0-J8BG5] rib_delnode: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, re 0x56267595ae40, removing Aug 02 16:51:56 ubuntu zebra[386985]: [KMPGN-JBRKW] rib_meta_queue_add: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90 is already queued in sub-queue Connected Routes Aug 02 16:51:56 ubuntu zebra[386985]: [MFYWV-KH3MC] process_subq_early_route_add: (0:254):2001:db8:cafe:2::/64: Inserting route rn 0x56267593de90, re 0x56267595abf0 (ospf6) existing 0x0, same_count 1 Aug 02 16:51:56 ubuntu zebra[386985]: [Q4T2G-E2SQF] process_subq_early_route_add: dumping RE entry 0x56267595abf0 for 2001:db8:cafe:2::/64 vrf default(0) Aug 02 16:51:56 ubuntu zebra[386985]: [KMPGN-JBRKW] rib_meta_queue_add: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90 is already queued in sub-queue Connected Routes Aug 02 16:51:56 ubuntu zebra[386985]: [YEYFX-TDSC2] process_subq_early_route_add: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, removing unneeded re 0x56267595ae40 Aug 02 16:51:56 ubuntu zebra[386985]: [Y53JX-CBC5H] rib_unlink: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, re 0x56267595ae40 Aug 02 16:51:56 ubuntu zebra[386985]: [QE6V0-J8BG5] rib_delnode: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, re 0x56267595abf0, removing Aug 02 16:51:56 ubuntu zebra[386985]: [JF0K0-DVHWH] rib_meta_queue_add: (0:254):2001:db8:cafe:2::/64: queued rn 0x56267593de90 into sub-queue RIP/OSPF/ISIS/EIGRP/NHRP Routes Aug 02 16:51:56 ubuntu zebra[386985]: [NZNZ4-7P54Y] default(0:254):2001:db8:cafe:2::/64: Processing rn 0x56267593de90 Aug 02 16:51:56 ubuntu zebra[386985]: [ZJVZ4-XEGPF] default(0:254):2001:db8:cafe:2::/64: Examine re 0x56267595abf0 (ospf6) status: Removed Changed flags: None dist 110 metric 10 Aug 02 16:51:56 ubuntu zebra[386985]: [NM15X-X83N9] rib_process: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, removing re 0x56267595abf0 Aug 02 16:51:56 ubuntu zebra[386985]: [Y53JX-CBC5H] rib_unlink: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, re 0x56267595abf0 Aug 02 16:51:56 ubuntu zebra[386985]: [KT8QQ-45WQ0] rib_gc_dest: (0:?):2001:db8:cafe:2::/64: removing dest from table Aug 02 16:51:56 ubuntu zebra[386985]: [HH6N2-PDCJS] default(0:0):2001:db8:cafe:2::/64 rn 0x56267593de90 dequeued from sub-queue Connected Routes 3. ...and then assert: (gdb) bt #0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140662163115136) at ./nptl/pthread_kill.c:44 #1 __pthread_kill_internal (signo=6, threadid=140662163115136) at ./nptl/pthread_kill.c:78 #2 __GI___pthread_kill (threadid=140662163115136, signo=signo@entry=6) at ./nptl/pthread_kill.c:89 #3 0x00007fee76753476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 #4 0x00007fee767397f3 in __GI_abort () at ./stdlib/abort.c:79 #5 0x00007fee76a420fd in _zlog_assert_failed () from target:/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 #6 0x0000562674efe0f0 in process_subq_route (qindex=7 '\a', lnode=0x562675940c60) at zebra/zebra_rib.c:2540 #7 process_subq (qindex=META_QUEUE_NOTBGP, subq=0x562675574580) at zebra/zebra_rib.c:3055 #8 meta_queue_process (dummy=<optimized out>, data=0x56267556d430) at zebra/zebra_rib.c:3091 #9 0x00007fee76a386e8 in work_queue_run () from target:/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 #10 0x00007fee76a31c91 in thread_call () from target:/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 #11 0x00007fee769ee528 in frr_run () from target:/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 #12 0x0000562674e97ec5 in main (argc=5, argv=0x7ffd1e275958) at zebra/main.c:478 (gdb) print lnode->data $10 = (void *) 0x56267593de90 (gdb) p/x *(struct route_node *)0x56267593de90 $11 = { p = { family = 0xa, prefixlen = 0x40, u = { prefix = 0x20, prefix4 = { s_addr = 0xb80d0120 }, prefix6 = { __in6_u = { __u6_addr8 = {0x20, 0x1, 0xd, 0xb8, 0xca, 0xfe, 0x0, 0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, __u6_addr16 = {0x120, 0xb80d, 0xfeca, 0x200, 0x0, 0x0, 0x0, 0x0}, __u6_addr32 = {0xb80d0120, 0x200feca, 0x0, 0x0} } }, ... table = 0x5626755ae010, parent = 0x5626755ae070, link = {0x0, 0x0}, lock = 0x4, nodehash = { hi = { next = 0x5626755ae0d0, hashval = 0xebe8bdbf } }, info = 0x0 3. What's happen: We removed unneeded re 0x56267595ae40 while adding re 0x56267595abf0. It was the last connected re, but rn 0x56267593de90 is still in the connected sub-queue. Then rib_delnode was called for 0x56267595abf0. (rn 0x56267593de90 is still in the connected sub-queue). rib_delnode have called rib_meta_queue_add which have checked, that rn is absent in sub-queue RIP/OSPF/ISIS/EIGRP/NHRP and have added rn in the second sub-queue. Fixes: d7ac4c4d88 ("zebra: Introduce early route processing on the MetaQ") Signed-off-by: Pavel Ivashchenko <pivashchenko@nfware.com>
2023-07-10zebra: use NHRP routes as valid in nexthop checkMark Stapp
Treat NHRP-installed routes as valid, as if they were CONNECTED routes, when checking candidate routes' nexthops for validity. This allows use of NHRP by an IGP, for example, that doesn't normally want recursive nexthop resolution. Signed-off-by: Mark Stapp <mjs@labn.net>
2023-06-01zebra: Unlock the route node when sending route notificationsDonald Sharp
When using a context to send route notifications to upper level protocols, the code was using a locking function to get the route node. There is no need for this to be locked as such FRR should free it up. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-04-04Merge pull request #13145 from donaldsharp/do_deleteJafar Al-Gharaibeh
Improve and fix zebra GR
2023-03-31zebra: Cleanup ctx leak on shutdown and turn off eventDonald Sharp
two things: On shutdown cleanup any events associated with the update walker. Also do not allow new events to be created. Fixes this mem-leak: ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790:Direct leak of 8 byte(s) in 1 object(s) allocated from: ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #0 0x7f0dd0b08037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154 ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #1 0x7f0dd06c19f9 in qcalloc lib/memory.c:105 ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #2 0x55b42fb605bc in rib_update_ctx_init zebra/zebra_rib.c:4383 ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #3 0x55b42fb6088f in rib_update zebra/zebra_rib.c:4421 ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #4 0x55b42fa00344 in netlink_link_change zebra/if_netlink.c:2221 ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #5 0x55b42fa24622 in netlink_information_fetch zebra/kernel_netlink.c:399 ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #6 0x55b42fa28c02 in netlink_parse_info zebra/kernel_netlink.c:1183 ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #7 0x55b42fa24951 in kernel_read zebra/kernel_netlink.c:493 ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #8 0x7f0dd0797f0c in event_call lib/event.c:1995 ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #9 0x7f0dd0684fd9 in frr_run lib/libfrr.c:1185 ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #10 0x55b42fa30caa in main zebra/main.c:465 ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #11 0x7f0dd01b5d09 in __libc_start_main ../csu/libc-start.c:308 ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- ./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-SUMMARY: AddressSanitizer: 8 byte(s) leaked in 1 allocation(s). Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-29zebra: Ensure gr events run after Meta Queue has runDonald Sharp
BGP signals to zebra that a afi has converged immediately after it has finished processing all routes for a given afi/safi. This generates events in zebra in this order a) Routes received from BGP, placed on early-rib Meta-Q b) Signal GR for the afi. Now imagine that zebra reads GR code and immediately processes routes that are in the actual rib and removes some routes. This generates a c) route deletion to the kernel for some number of routes that may be in the the early-rib Meta-Q d) Process the Meta-Q, and re-install the routes This is undesirable behavior in zebra. In that while we may end up in a correct state, there will be a blip for some number of routes that happen to be in the early rib Meta-Q. Modify the GR code to have it's own processing entry at the end of the Meta-Q. This will allow all routes to be processed and ready for handling by the Graceful Restart code. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24*: Rename `struct thread` to `struct event`Donald Sharp
Effectively a massive search and replace of `struct thread` to `struct event`. Using the term `thread` gives people the thought that this event system is a pthread when it is not Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-02-21Merge pull request #12798 from donaldsharp/rib_match_multicastRuss White
Rib match multicast
2023-02-17Merge pull request #12780 from opensourcerouting/spdx-license-idDonald Sharp
*: convert to SPDX License identifiers
2023-02-16zebra: Remove code duplication for v4 and v6 versions of rib_match_multicastDonald Sharp
a) Consolidate v4 and v6 versions of rib_match_multicast b) Improve debug to show what we matched against as well. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-02-10lib, zebra: Move ZEBRA_ON_RIB_PROCESS_HOOK_CALLDonald Sharp
The define of ZEBRA_ON_RIB_PROCESS_HOOK_CALL was in zebra.h which exposes it to everyone, except zebra is the only daemon to use this define. This does not beling in zebra.h Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-02-09*: auto-convert to SPDX License IDsDavid Lamparter
Done with a combination of regex'ing and banging my head against a wall. Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-10-26zebra: Fix handling of recursive routes when processing closely in timeDonald Sharp
When zebra receives routes from upper level protocols it decodes the zapi message and places the routes on the metaQ for processing. Suppose we have a route A that is already installed by some routing protocol. And there is a route B that has a nexthop that will be recursively resolved through A. Imagine if a route replace operation for A is going to happen from an upper level protocol at about the same time the route B is going to be installed into zebra. If these routes are received, and decoded, at about the same time there exists a chance that the metaQ will contain both of them at the same time. If the order of installation is [ B, A ]. B will be resolved correctly through A and installed, A will be processed and re-installed into the FIB. If the nexthops have changed for A then the owner of B should be notified about the change( and B can do the correct action here and decide to withdraw or re-install ). Now imagine if the order of routes received for processing on the metaQ is [ A, B ]. A will be received, processed and sent to the dataplane for reinstall. B will then be pulled off the metaQ and fail the install since A is in a `not Installed` state. Let's loosen the restriction in nexthop resolution for B such that if the route we are dependent on is a route replace operation allow the resolution to suceed. This requires zebra to track a new route state( ROUTE_ENTRY_ROUTE_REPLACING ) that can be looked at during nexthop resolution. I believe this is ok because A is a route replace operation, which could result in this: -route install failed, in which case B should be nht'ing and will receive the nht failure and the upper level protocol should remove B. -route install succeeded, no nexthop changes. In this case allowing the resolution for B is ok, NHT will not notify the upper level protocol so no action is needed. -route install succeeded, nexthops changes. In this case allowing the resolution for B is ok, NHT will notify the upper level protocol and it can decide to reinstall B or not based upon it's own algorithm. This set of events was found by the bgp_distance_change topotest(s). Effectively the tests were looking for the bug ( A, B order in the metaQ ) as the `correct` state. When under very heavy load, the A, B ordering caused A to just be installed and fully resolved in the dataplane before B is gotten to( which is entirely possible ). Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-09-24zebra: fix fpm crashanlan_cs
Fix issue#11996. When removing VRF ( all routes of this VRF), zebra mistakenly forgot to check whether its routes are in update queue of FPM. So FPM module will crash during its dealing with these routes, which are already freed. Add a new HOOK `rib_shutdown()`, `zebra_rtable_node_cleanup()` will use it to remove these routes from update queue of FPM module before freeing them. Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-08-17zebra: Create a zebra_rib_route_entry_new function and use itDonald Sharp
Abstract the creation of the route_entry and use it. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-17zebra: Introduce early route processing on the MetaQDonald Sharp
Currently if an operator does this operation: sharpd@eva ~/frr8> sudo ip nexthop add id 5000 via 192.168.119.44 dev enp39s0 ; sudo ip route add 10.0.0.1 nhid 5000 2022/06/30 08:52:40 ZEBRA: [ZHQK5-J9M1R] proto2zebra: Please add this protocol(0) to proper rt_netlink.c handling 2022/06/30 08:52:40 ZEBRA: [PS16P-365FK][EC 4043309076] Zebra failed to find the nexthop hash entry for id=5000 in a route entry sharpd@eva ~/frr8> vtysh -c "show ip route 10.0.0.1" Routing entry for 0.0.0.0/0 Known via "kernel", distance 0, metric 100, best Last update 00:01:58 ago * 192.168.119.1, via enp39s0 The route is dropped by zebra with no warnings. This is not good, but unlikely to happen at this point in time. In order to fix this issue route processing from inputs needs to happen after nexthop group processing from inputs. This was not possible because nexthop groups are placed on the metaQ. As such the above nexthop group creation is placed on the metaQ for processing in META_QUEUE_NHG. Then the route is read in and processed immediately. The nexthop group is not found ( not processed yet!) and the route is dropped in zebra. Modify the code to have early route processing of validity on the MetaQ. This preserves the order of operations. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-17zebra: Convert label processing to Meta-QDonald Sharp
Convert label processing that comes from zapi messages into being handled by the meta-Q. This is because early route processing is going to be moved to the meta-Q as well and we will have a chicken and egg problem without moving this code to be processed by the meta-Q. Ordering of messages from ospf as an example: 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:48] comes from socket [36] 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:48] comes from socket [36] 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:48] comes from socket [36] 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:48] comes from socket [36] 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:62] comes from socket [36] 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:43] comes from socket [36] 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36] 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36] 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36] 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36] 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:61] comes from socket [36] 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36] 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36] 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_MPLS_LABELS_REPLACE:0:47] comes from socket [36] 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_MPLS_LABELS_REPLACE:0:66] comes from socket [36] 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_MPLS_LABELS_REPLACE:0:47] comes from socket [36] 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_MPLS_LABELS_REPLACE:0:47] comes from socket [36] 2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_MPLS_LABELS_REPLACE:0:47] comes from socket [36] The ZEBRA_MPLS_LABELS_REPLACE immediately turn around and attempt to replace nexthop labels on routes that were added. If the route add is placed on the metaQ, it will not exist yet and as such the label replace will fail. Modify the zebra code to take the label operations and place them on the metaQ as well. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-10zebra: Combine meta_queue_free and meta_queue_vrf_free functionsDonald Sharp
These functions essentially do the same thing. Combine them for the goodness of mankind. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-05-21zebra: clean up rtadv integrationDavid Lamparter
Move a few things into places they actually belong, and reduce the number of places we have `#ifdev HAVE_RTADV`. Just overall code prettification. ... I had actually done this quite a while ago while doing some other random hacking and thought it more useful to not be sitting on it on my disk... Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-05-13zebra: Remove unused function `route_entry_copy_nexthops`Donald Sharp
This function is no longer used. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-04-26zebra: add rib_match_ipv6_multicast variantDavid Lamparter
... for IPv6, analogous to v4. Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-12zebra: Remove unused ZEBRA_NHT_EXACT_MATCHDonald Sharp
This usage was removed in an earlier bit of code do some final cleanup Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-23*: Change thread->func to return void instead of intDonald Sharp
The int return value is never used. Modify the code base to just return a void instead. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-07zebra: Fix ships in the night issueDonald Sharp
When using wait for install there exists situations where zebra will issue several route change operations to the kernel but end up in a state where we shouldn't be at the end due to extra data being received. Example: a) zebra receives from bgp a route change, installs sends the route to the kernel. b) zebra receives a route deletion from bgp, removes the struct route entry and then sends to the kernel a deletion. c) zebra receives an asynchronous notification that (a) succeeded but we treat this as a new route. This is the ships in the night problem. In this case if we receive notification from the kernel about a route that we know nothing about and we are not in startup and we are doing asic offload then we can ignore this update. Ticket: #2563300 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-02Merge pull request #10409 from idryzhov/zebra-mq-clean-crashMark Stapp
zebra: fix cleanup of meta queues on vrf disable
2022-02-01zebra: fix cleanup of meta queues on vrf disableIgor Ryzhov
Current code treats all metaqueues as lists of route_node structures. However, some queues contain other structures that need to be cleaned up differently. Casting the elements of those queues to struct route_node and dereferencing them leads to a crash. The crash may be seen when executing bgp_multi_vrf_topo2. Fix the code by using the proper list element types. Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2022-01-31zebra: name the route_entry opaque struct more specificallyMark Stapp
The name 'opaque' is a little general - call the route_entry struct 're_opaque' to make it more specific. Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-01-18zebra: Modify route_notify_internal to use a route_nodeDonald Sharp
Pass in the route_node that is under consideration into route_notify_internal to allow calling functions to reduce stack size as well as looking up data. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-01-13zebra: Fix for route node having no tracking NHTSarita Patra
Topology: IXIA-----(ens192)FRR(ens224)------iXIA Configuration: 1. Create 8 sub-interfaces on ens192 under Default VRF and configure 8 EBGP session between FRR and IXIA. 2. Create 1000 sub-interfaces on ens224 under Default VRF and configure 1000 EBGP session between FRR and IXIA. 3. 2M prefixes distributed from Left side Ixia each with 8 ECMP path. 4. So in total, there are 2M prefixes * 8 ECMP = 16M prefixes entries in RIB and FIB. Issue: Shut ens192 and ens224, this is taking 1hr 15 mins to clean up the routes. Root Cause: In the case of route deletion, if the particular route node is having nht count = 0, we are going to the parent and doing nht evaluation, which is not needed. Fix: If the deleted the route node is having nht count > 0, then do a nht evaluation on the parent node. Shut ens192 and ens224, it is taking 1 min to clean up the routes with the fix. Signed-off-by: Sarita Patra <saritap@vmware.com>
2021-11-23zebra: add installed nexthop-group id valueMark Stapp
In some cases, zebra may install a nexthop-group id that is different from the id of the nhe struct attached to a route-entry. This happens for a singleton recursive nexthop, for example, where a route is installed with the resolving nexthop's id. The installed value is the most useful value - that corresponds to information in the kernel on linux/netlink platforms that support nhgs. Display both values if they differ in ascii output, and include both values in the json form. Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2021-09-27zebra: Start carrying safi for rnh processingDonald Sharp
PIM is going to need to be able to send down the address it is trying to resolve in the multicast rib. We need a way to signal this to the end developer. Start the conversion by adding the ability to have a safi. But only allow SAFI_UNICAST at the moment. Signed-off-by: Donald Sharp <sharpd@nvidia.com>