Add a callback function `isis_zebra_process_srv6_locator_delete()` that
is called when an SRv6 locator is deleted in zebra.
When an existing SRv6 locator is deleted in zebra, zebra sends a
ZEBRA_SRV6_LOCATOR_DELETE notification to all daemons informing them of
the deleted locator.
In IS-IS, we register the new `isis_zebra_process_srv6_locator_delete()`
callback as the handler for ZEBRA_SRV6_LOCATOR_DELETE.
This callback iterates over all areas of the current IS-IS instance and
looks for an area for which the deleted locator was configured.
If a match is found, we remove
the locator's chunks from the area's chunks list and call
`lsp_regenerate_schedule` to remove the locator from the SRv6 Locator
TLV advertised in the LSPs and regenerate the LSPs.
If no match is found, we do nothing.
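The delete handling described above can be modelled in a few lines. This is only a sketch, not the isisd implementation: the `Area` and `Chunk` classes, their field names and the `lsp_regenerations` counter (a stand-in for `lsp_regenerate_schedule()`) are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    locator_name: str
    prefix: str


@dataclass
class Area:
    tag: str
    srv6_locator_name: str = ""  # empty string: no locator configured
    chunks: list = field(default_factory=list)
    lsp_regenerations: int = 0   # stand-in for lsp_regenerate_schedule()


def process_locator_delete(areas, locator_name):
    """Model of the ZEBRA_SRV6_LOCATOR_DELETE handler (hypothetical names)."""
    for area in areas:
        if area.srv6_locator_name != locator_name:
            continue  # locator not configured for this area: do nothing
        # Drop the deleted locator's chunks, then regenerate the LSPs.
        area.chunks = [c for c in area.chunks if c.locator_name != locator_name]
        area.lsp_regenerations += 1


areas = [
    Area("FOO", "loc1", [Chunk("loc1", "fc00:0:1::/48")]),
    Area("BAR", "loc2", [Chunk("loc2", "fc00:0:2::/48")]),
]
process_locator_delete(areas, "loc1")
```

Only area FOO is touched here; area BAR keeps its chunk and does not regenerate its LSPs.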
isisd: Add func to process a received SRv6 locator
Add a callback function `isis_zebra_process_srv6_locator_add()` that is
called upon receiving an SRv6 locator from zebra.
When a new SRv6 locator is created in zebra, zebra sends a
ZEBRA_SRV6_LOCATOR_ADD notification to all daemons informing them of the
new locator.
In IS-IS, we register the new `isis_zebra_process_srv6_locator_add()`
callback as the handler for ZEBRA_SRV6_LOCATOR_ADD.
This callback iterates over all areas of the current IS-IS instance and
looks for an area for which the new locator was configured.
If a match is found, we call
`isis_zebra_srv6_manager_get_locator_chunk()` to ask zebra for a chunk from
the locator.
If no match is found, we do nothing.
isisd: Add function to process received SRv6 chunk
Add a callback function that is called upon receiving an SRv6 locator
chunk from zebra.
This function iterates over all areas of the current IS-IS instance and
looks for an area for which the received chunk was requested.
If a match is found, the new chunk is added to the area's chunk list and
`lsp_regenerate_schedule()` is called to regenerate the LSPs to
advertise the new SRv6 locator.
If no match is found, we free the allocated resources and do nothing.
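The locator-add and chunk-received handlers form a request/response pair, which can be sketched as follows. This is a simplified model under stated assumptions: `FakeZebra`, the dictionary-based areas and the `lsp_regen` counter are hypothetical, and whether the real code stops at the first matching area is not stated in the commit messages.

```python
class FakeZebra:
    """Hypothetical stand-in for zebra that only records chunk requests."""

    def __init__(self):
        self.chunk_requests = []

    def get_locator_chunk(self, locator_name):
        # Models isis_zebra_srv6_manager_get_locator_chunk().
        self.chunk_requests.append(locator_name)


def on_locator_add(areas, zebra, locator_name):
    """Model of the ZEBRA_SRV6_LOCATOR_ADD handler."""
    for area in areas:
        if area["locator"] == locator_name:
            zebra.get_locator_chunk(locator_name)


def on_chunk_received(areas, chunk):
    """Model of the chunk handler; returns False when the chunk is unused."""
    matched = False
    for area in areas:
        if area["locator"] == chunk["locator"]:
            area["chunks"].append(chunk)
            area["lsp_regen"] += 1  # stand-in for lsp_regenerate_schedule()
            matched = True
    return matched  # on False the caller frees the chunk and does nothing


areas = [
    {"tag": "FOO", "locator": "loc1", "chunks": [], "lsp_regen": 0},
    {"tag": "BAR", "locator": "", "chunks": [], "lsp_regen": 0},
]
zebra = FakeZebra()
on_locator_add(areas, zebra, "loc1")
used = on_chunk_received(areas, {"locator": "loc1", "prefix": "fc00:0:1::/48"})
unused = on_chunk_received(areas, {"locator": "loc9", "prefix": "fc00:0:9::/48"})
```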
Add a list of SRv6 locator chunks allocated to a specific IS-IS area.
The list is initialized when the IS-IS area is created and freed when
the IS-IS area is destroyed. Subsequent commits will introduce the
possibility to allocate and release locator chunks.
isisd: Add SRv6 locator name to SRv6 configuration
Add the name of the SRv6 locator to use with IS-IS to the per-area SRv6
configuration. If an SRv6 locator is not configured for an IS-IS
instance, the locator name is an empty string. When an IS-IS instance is
configured to use an SRv6 locator, the locator name stores the name of
the selected locator.
Subsequent commits will add the possibility to set and unset an SRv6
locator for a specific IS-IS instance.
Add a CLI command to print SRv6 capabilities, algorithms and MSDs
supported by the IS-IS nodes.
Example:
r1# show isis segment-routing srv6 node
Area FOO:
IS-IS L1 SRv6-Nodes:
IS-IS L2 SRv6-Nodes:
System ID       Algorithm  SRH Max SL  SRH Max End Pop  SRH Max H.encaps  SRH Max End D
-----------------------------------------------------------------------------------------
1111.1111.1111  SPF        16          0                1                 2
2222.2222.2222  SPF        16          0                1                 2
Update the `isis_router_cap_tlv_size` function to take into account the
SRv6 Capabilities Sub-TLV and SRv6-related MSDs when calculating the
size needed to pack the Router Capabilities TLV.
`isis_srv6_area_term()` cleans up SRv6 information for a specific
IS-IS area. This commit adds a new function `isis_srv6_term()` that will
be used to perform global SRv6 cleanup.
`isis_srv6_area_init()` initializes SRv6 information for a specific
IS-IS area. This commit adds a new function `isis_srv6_init()` that will
be used to perform global SRv6 initialization.
bgpd: Fix session reset issue caused by malformed core attributes
RCA:
On encountering an error in any core attribute of an UPDATE message,
the error handling is set to 'treat as withdraw' and
further parsing of the remaining attributes is skipped.
But the stream pointer is not adjusted to point to the NLRI field,
so the rest of the attributes are not skipped.
This leads to incorrect parsing of the NLRI field,
which causes the BGP session to reset.
Fix:
When a malformed attribute is encountered and parsing of the remaining
attributes is skipped, adjust the stream pointer offset so that it points to the NLRI field.
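The pointer adjustment can be illustrated with a small parser for the body of an UPDATE message, using the layout from RFC 4271. This is a sketch only, not bgpd's code: `parse_update_body` and the `is_malformed` callback are hypothetical names, and the byte string is a hand-built example.

```python
import struct

EXT_LEN = 0x10  # attribute flag bit: the length field is two bytes


def parse_update_body(body, is_malformed):
    """Return (treat_as_withdraw, nlri_bytes) for an UPDATE body."""
    pos = 0
    (withdrawn_len,) = struct.unpack_from("!H", body, pos)
    pos += 2 + withdrawn_len
    (attr_len,) = struct.unpack_from("!H", body, pos)
    pos += 2
    attrs_end = pos + attr_len  # the NLRI field starts here
    withdraw = False
    while pos < attrs_end:
        flags, type_code = body[pos], body[pos + 1]
        pos += 2
        if flags & EXT_LEN:
            (length,) = struct.unpack_from("!H", body, pos)
            pos += 2
        else:
            length = body[pos]
            pos += 1
        value = body[pos:pos + length]
        pos += length
        if is_malformed(type_code, value):
            withdraw = True
            break  # stop parsing the remaining attributes
    # Key step: jump to the end of the attribute block, whether or not
    # the loop ended early, so NLRI parsing starts at the right offset.
    pos = attrs_end
    return withdraw, body[pos:]


attrs = (
    bytes([0x40, 1, 1, 5])                # ORIGIN with invalid value 5
    + bytes([0x40, 2, 0])                 # empty AS_PATH
    + bytes([0x40, 3, 4, 10, 0, 0, 1])    # NEXT_HOP 10.0.0.1
)
nlri_in = bytes([24, 192, 0, 2])          # 192.0.2.0/24
body = struct.pack("!H", 0) + struct.pack("!H", len(attrs)) + attrs + nlri_in
withdraw, nlri = parse_update_body(body, lambda t, v: t == 1 and v[0] > 2)
```

Without the `pos = attrs_end` line, the bytes of AS_PATH and NEXT_HOP would be misread as NLRI, which is the failure mode described in the RCA.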
Signed-off-by: Samanvitha B Bhargav <bsamanvitha@vmware.com>
Donald Sharp [Mon, 31 Jul 2023 12:45:50 +0000 (08:45 -0400)]
tests: Convert d1 and d2 to output and expected in gen_json_diff_report
The output of gen_json_diff_report is used all over the place and
it outputs d1 and d2. Let's change this to output and expected
as that is how it is used. Should help with debugging.
Problem:
-------
- Send IGMP/MLD join and traffic.
LHR: (S,G) mroute is created with reference count = 2
and set the flag SRC_STREAM.
(Code flow: pim_mroute_msg_wholepkt -> pim_upstream_add,
pim_upstream_sg_running_proc -> pim_upstream_ref)
- Send IGMP/MLD prune.
LHR: removes the (*,G) entry and tries to remove its children (S,G) entries.
But (S,G) has reference count = 2. So after the prune,
the (S,G) entry reference count becomes 1 and the entry will be present
until KAT expires.
Fix:
---
Don't set SRC_STREAM flag for LHR.
In LHR, (S,G) should be maintained as long as (*,G) is present.
When a prune is received, delete (*,G) and its children (S,G).
When traffic stops, delete (S,G) after KAT expires.
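The reference counting behind the problem and the fix can be sketched as below. This is a toy model, assuming that the second reference is the one tied to the SRC_STREAM flag; the class and function names are hypothetical and do not mirror pimd's data structures.

```python
class SGEntry:
    """Toy (S,G) upstream entry on the last-hop router."""

    def __init__(self):
        self.refcount = 0
        self.src_stream = False


def join_and_traffic(set_src_stream_on_lhr):
    sg = SGEntry()
    sg.refcount += 1  # reference taken when the entry is added
    if set_src_stream_on_lhr:
        sg.src_stream = True
        sg.refcount += 1  # assumed extra reference tied to SRC_STREAM
    return sg


def prune(sg):
    """Removing (*,G) drops one reference from the child (S,G)."""
    sg.refcount -= 1
    return sg.refcount == 0  # True when the (S,G) entry is deleted


deleted_before_fix = prune(join_and_traffic(set_src_stream_on_lhr=True))
deleted_after_fix = prune(join_and_traffic(set_src_stream_on_lhr=False))
```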
Donald Sharp [Thu, 27 Jul 2023 19:36:33 +0000 (15:36 -0400)]
tests: Convert isis to use 1 and 10 for hello/multiplier
Current isis tests use a variety of hello timers as well
as hello-multiplier, let's modify all of the isis test
cases to use 1 and 10. This cleans up some spurious test
failures I was seeing locally. As an example, without
these changes, running isis_tilfa_topo1 2r6 times I would
see 5-10 test failures; now I am seeing ~2 test failures.
In any event part of the problem was that some tests were
not fully converged when looking at them under heavy
system load. Changing this to 1/10 gives us 10 chances
to see the incoming packet.
Donald Sharp [Sat, 29 Jul 2023 17:26:26 +0000 (13:26 -0400)]
tests: bfd_bgp_cbit_topo3 allow bgp to converge before testing
This test was failing upstream a bunch of times. Upon examining
the log files as well as the test script it was noticed that
the bfd peers were checked to see that they had come up. But
both the timers used for bgp as well as not checking that bgp
has actually come up would cause the test to fail in subsequent
steps if bgp has not come up. Test that bgp peering is actually
established before testing link down events. It's possible
this test might need to be revisited to ensure that the routes
are actually installed and ready to go before as well, but I am
not seeing that right now.
Donald Sharp [Sat, 29 Jul 2023 17:24:55 +0000 (13:24 -0400)]
tests: Fix zebra_seg6_route to give more time for routes to be installed
This test is failing upstream regularly, when inspecting the log
files we see that the route being looked for is in a queued state
when the test fails. Give this test more time for when the
system is under severe load.
Donald Sharp [Sat, 29 Jul 2023 17:22:39 +0000 (13:22 -0400)]
tests: isis_te_topo1 can fail occasionally
Upstream (and locally) this test fails. The adj-sid value
being looked for in the testing is a dynamic value that is
assigned based upon how the network comes up. The reality
is that there is no enforced order of what the adj-sid
can be. As such this test looking for this value makes
no sense. Let's remove that from the test.
Additionally bring the isis hello-interval to 1 down
from 3 to make things converge faster.
Donald Sharp [Fri, 28 Jul 2023 14:31:30 +0000 (10:31 -0400)]
tests: zebra_netlink ensure the address is installed
Ran test under high load and system rejected the sharp
install of routes. Only reason that that would happen
would be if the address had not been set by the kernel
yet. The test log files had timestamp precision and the
addition of the sharp routes was under 1/10 of a second
after the address was attempted to be installed.
Starting from step 11, this topotest focuses on validating the TI-LFA
switchover functionality, where the backup nexthops are activated
after an adjacency expires, either with or without BFD.
Currently, the test checks the RIB shortly after the switchover using
a tight 5 seconds interval to ensure that the RIB update is due to the
switchover and not an SPF update (which is configured with an initial
delay of 15 seconds). However, it was observed that the kernel might
take longer than 5 seconds to install routes when the system is under
heavy load. To account for that, double the wait interval so that
this topotest will succeed even in those conditions.
tests: ensure BFD session is up before proceeding to the next step
In this topotest, BFD is configured at the end of step 13. However,
in certain cases where the testing machine is exceptionally fast (e.g.,
Donald's quantum computer), there is a possibility that the interface
shutdown event from step 14 may occur before BFD has had sufficient
time to establish the session, which leads to a test failure. To fix
this problem, ensure the BFD session is up before proceeding to the
next step.
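The general pattern of polling until a condition holds before moving to the next step can be sketched as follows. `wait_until` and the simulated `bfd_session_is_up` check are hypothetical helpers written for this illustration; they are not the topotest library's API.

```python
import time


def wait_until(check, attempts=10, interval=0.01):
    """Poll check() until it returns True or the attempts run out."""
    for _ in range(attempts):
        if check():
            return True
        time.sleep(interval)
    return False


# Simulated BFD session that only reports "up" on the third poll.
polls = {"count": 0}


def bfd_session_is_up():
    polls["count"] += 1
    return polls["count"] >= 3


bfd_up = wait_until(bfd_session_is_up)
# Only proceed to the interface shutdown step once bfd_up is True.
```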
Node-SIDs refer to Prefix-SIDs associated with host prefixes of
loopback addresses. As such, whenever an interface address is added
or deleted, all configured Prefix-SIDs must be reevaluated to check
if the N-flag needs to be set or unset.
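A minimal sketch of that reevaluation, assuming the N-flag applies when a Prefix-SID's prefix is a host prefix matching a loopback address. The function names and the example addresses are hypothetical.

```python
import ipaddress


def node_flag(prefix, loopback_addresses):
    """True if prefix is a host prefix of one of the loopback addresses."""
    net = ipaddress.ip_network(prefix)
    if net.prefixlen != net.max_prefixlen:
        return False  # not a host prefix (/32 or /128)
    loopbacks = {ipaddress.ip_address(a) for a in loopback_addresses}
    return net.network_address in loopbacks


def reevaluate(prefix_sids, loopback_addresses):
    """Recompute the N-flag of every configured Prefix-SID."""
    return {p: node_flag(p, loopback_addresses) for p in prefix_sids}


prefix_sids = ["10.0.0.1/32", "10.1.0.0/24"]
before = reevaluate(prefix_sids, [])            # loopback address not yet added
after = reevaluate(prefix_sids, ["10.0.0.1"])   # rerun on the address-add event
```

Running the reevaluation on every address add or delete is what removes the dependency on the order in which Prefix-SIDs and addresses are configured.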
This change fixes some race conditions in the TI-LFA topotest where
specific sequence of events could cause Prefix-SIDs to not have the
N-flag set when they should, resulting in various failures.
tests: increase hello multiplier in TI-LFA topotest
In this topotest, the IS-IS hello interval is set to 1 for fast
convergence. However, the current hello multiplier of 3 results in a
tight IS-IS adjacency holdtime of 3 seconds. This tight timeframe can
cause failures when the testing machine is running multiple tests at
full capacity. To improve stability under such conditions, this commit
raises the hello multiplier to 10, providing a more forgiving holdtime
and reducing the likelihood of failures.
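The arithmetic behind the change, as a small sketch (the function name is hypothetical):

```python
def adjacency_holdtime(hello_interval, hello_multiplier):
    """IS-IS adjacency holdtime in seconds."""
    return hello_interval * hello_multiplier


old_holdtime = adjacency_holdtime(1, 3)    # 3 seconds
new_holdtime = adjacency_holdtime(1, 10)   # 10 seconds
```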
When 'no neighbor .. update-source' is issued for a regular peer, that
peer is always reset. This is unnecessary if the peer is a member of a
peer-group and it inherits an identical update-source, so let's skip
the reset/Notification for that condition.
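The decision can be sketched as a comparison between the peer's current update-source and the one it would inherit once its own setting is removed. This is a simplified model; `needs_reset_on_unset` is a hypothetical name and `None` stands for "no peer-group, or the group sets no update-source".

```python
def needs_reset_on_unset(peer_update_source, group_update_source):
    """Should 'no neighbor X update-source' reset the session?"""
    # After the unset, the peer falls back to the group's value (if any).
    effective_after = group_update_source
    return effective_after != peer_update_source


same_as_group = needs_reset_on_unset("lo", "lo")      # inherited value is identical
no_group = needs_reset_on_unset("lo", None)           # regular peer without a group
different_group = needs_reset_on_unset("lo", "eth0")  # group uses another source
```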
Before:
------------
ub20-2(config-router)# do show ip bgp sum | include .99
192.168.122.99 4 1 36 34 0 0 0 00:00:17 0 0 N/A
ub20-2(config-router)# do show ip bgp neighbors 192.168.122.99 | include Local host
Local host: 100.64.0.3, Local port: 46083
ub20-2(config-router)# no neighbor 192.168.122.99 update-source
ub20-2(config-router)# do show ip bgp sum | include .99
192.168.122.99 4 1 36 35 0 0 0 00:00:01 Idle 0 N/A
ub20-2(config-router)# do show ip bgp neighbors 192.168.122.99 | include Local host
Local host: 100.64.0.3, Local port: 39847
After:
------------
ub20-2(config-router)# do show ip bgp sum | include .99
192.168.122.99 4 1 3 3 0 0 0 00:00:20 0 0 N/A
ub20-2(config-router)# do show ip bgp neighbors 192.168.122.99 | include Local host
Local host: 100.64.0.3, Local port: 39415
ub20-2(config-router)# no neighbor 192.168.122.99 update-source
ub20-2(config-router)# do show ip bgp sum | include .99
192.168.122.99 4 1 3 3 0 0 0 00:00:28 0 0 N/A
ub20-2(config-router)# do show ip bgp neighbors 192.168.122.99 | include Local host
Local host: 100.64.0.3, Local port: 39415
Rajasekar Raja [Sun, 23 Jul 2023 05:43:12 +0000 (05:43 +0000)]
zebra: fix nhe refcnt when frr service goes down
When frr.service is going down (restart or stop),
a zebra core can be seen.
Sequence of events leading to crash:
Increments of nhe refcnt:
- Upper level creates a new nhe (say NHE1) -> nhe->refcnt = 1
- Two REs (say RE1 & RE2) associate with NHE1 -> nhe->refcnt = 3
Decrements of nhe refcnt:
- BGP sends a zapi msg to zebra to delete NHG. -> nhe->refcnt = 2
- RE1 is queued for delete in META-Q
- As zebra is dissociating with its clients, zebra_nhg_score_proto() is
invoked -> nhe->refcnt = 1
- RE2 is no longer associated with NHE1 -> nhe->refcnt = 0 &
hence NHE IS FREED
- Now RE1 is dequeued from META-Q for processing the re delete. -> At
this point re->nhe is pointing to freed pointer. CRASH CRASH!!!!
Fix:
- When we iterate zebra_nhg_score_proto_entry() to delete the upper
proto specific nhes, we need to skip the additional nhe->refcnt
decrement in case nhe->flags has NEXTHOP_GROUP_PROTO_RELEASED set.
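The sequence of events can be replayed with a toy reference counter. This is a sketch under assumptions: the `Nhe` class is hypothetical, and it assumes the released flag is set when BGP asks zebra to delete the NHG.

```python
class Nhe:
    """Toy nexthop group entry with a reference count."""

    def __init__(self):
        self.refcnt = 1              # reference held by the upper-level proto
        self.proto_released = False  # models NEXTHOP_GROUP_PROTO_RELEASED
        self.freed = False

    def ref(self):
        self.refcnt += 1

    def unref(self):
        self.refcnt -= 1
        if self.refcnt == 0:
            self.freed = True


def shutdown_sequence(skip_if_released):
    nhe = Nhe()
    nhe.ref()  # RE1 associates with NHE1
    nhe.ref()  # RE2 associates with NHE1, refcnt = 3
    nhe.unref()  # BGP deletes the NHG, refcnt = 2
    nhe.proto_released = True  # assumption: flag set at this point
    # RE1 is queued for delete in META-Q and still holds its reference.
    # Client disconnect: the proto sweep runs over the nhe.
    if not (skip_if_released and nhe.proto_released):
        nhe.unref()  # extra decrement: the proto reference is already gone
    nhe.unref()  # RE2 is dissociated from NHE1
    freed_while_re1_queued = nhe.freed
    if not nhe.freed:
        nhe.unref()  # RE1 is dequeued and releases its reference
    return freed_while_re1_queued, nhe.freed


buggy = shutdown_sequence(skip_if_released=False)
fixed = shutdown_sequence(skip_if_released=True)
```

In the buggy run the entry is freed while RE1 is still queued; with the skip in place it is freed only after RE1 has been processed.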
Backtrace-1
0x00007fa8449ce8eb in raise () from /lib/x86_64-linux-gnu/libc.so.6
0x00007fa8449b9535 in abort () from /lib/x86_64-linux-gnu/libc.so.6
0x00007fa844d32f86 in _zlog_assert_failed (xref=xref@entry=0x55fa37871040 <_xref.28142>, extra=extra@entry=0x0) at lib/zlog.c:680
0x000055fa3778f770 in rib_re_nhg_free (re=0x55fa39e33770) at zebra/zebra_rib.c:2578
rib_unlink (rn=0x55fa39e27a60, re=0x55fa39e33770) at zebra/zebra_rib.c:3930
0x000055fa3778ff18 in rib_process (rn=0x55fa39e27a60) at zebra/zebra_rib.c:1439
0x000055fa37790b1c in process_subq_route (qindex=8 '\b', lnode=0x55fa39e1c1b0) at zebra/zebra_rib.c:2549
process_subq (qindex=META_QUEUE_BGP, subq=0x55fa3999c580) at zebra/zebra_rib.c:3107
meta_queue_process (dummy=<optimized out>, data=0x55fa3999c480) at zebra/zebra_rib.c:3146
0x00007fa844d232b8 in work_queue_run (thread=0x7ffffbdf6cb0) at lib/workqueue.c:285
0x00007fa844d195fd in thread_call (thread=thread@entry=0x7ffffbdf6cb0) at lib/thread.c:2008
0x00007fa844cd3888 in frr_run (master=0x55fa397b7630) at lib/libfrr.c:1223
0x000055fa3771e294 in main (argc=12, argv=0x7ffffbdf7098) at zebra/main.c:526
Backtrace-2
0x00007f125af3f535 in abort () from /lib/x86_64-linux-gnu/libc.so.6
0x00007f125b2b8f96 in _zlog_assert_failed (xref=xref@entry=0x7f125b344260 <_xref.18768>, extra=extra@entry=0x0) at lib/zlog.c:680
0x00007f125b268190 in nexthop_copy_no_recurse (copy=copy@entry=0x5606dd726f10, nexthop=nexthop@entry=0x7f125b0d7f90, rparent=<optimized out>) at lib/nexthop.c:806
0x00007f125b2681b2 in nexthop_copy (copy=0x5606dd726f10, nexthop=0x7f125b0d7f90, rparent=<optimized out>) at lib/nexthop.c:836
0x00007f125b268249 in nexthop_dup (nexthop=nexthop@entry=0x7f125b0d7f90, rparent=rparent@entry=0x0) at lib/nexthop.c:860
0x00007f125b26b67b in copy_nexthops (tnh=tnh@entry=0x5606dd9ec748, nh=<optimized out>, rparent=rparent@entry=0x0) at lib/nexthop_group.c:457
0x00007f125b26b6ba in nexthop_group_copy (to=to@entry=0x5606dd9ec748, from=from@entry=0x5606dd9ee9f8) at lib/nexthop_group.c:291
0x00005606db6ec678 in zebra_nhe_copy (orig=0x5606dd9ee9d0, id=id@entry=0) at zebra/zebra_nhg.c:431
0x00005606db6ddc63 in mpls_ftn_uninstall_all (zvrf=zvrf@entry=0x5606dd6e7cd0, afi=afi@entry=2, lsp_type=ZEBRA_LSP_NONE) at zebra/zebra_mpls.c:3410
0x00005606db6de108 in zebra_mpls_cleanup_zclient_labels (client=0x5606dd8e03b0) at ./zebra/zebra_mpls.h:471
0x00005606db73e575 in hook_call_zserv_client_close (client=0x5606dd8e03b0) at zebra/zserv.c:566
zserv_client_free (client=0x5606dd8e03b0) at zebra/zserv.c:585
zserv_close_client (client=0x5606dd8e03b0) at zebra/zserv.c:706
0x00007f125b29f60d in thread_call (thread=thread@entry=0x7ffc2a740290) at lib/thread.c:2008
0x00007f125b259888 in frr_run (master=0x5606dd3b7630) at lib/libfrr.c:1223
0x00005606db68d298 in main (argc=12, argv=0x7ffc2a740678) at zebra/main.c:534
Currently, when changing ABR type on a working router, SPF recalculation
will only be initiated if the OSPF flags have changed after this.
Otherwise, SPF recalculation will be omitted and OSPF RIB update will
not occur. In other words, changing ABR type might not result in
inter-area routes addition/deletion.
With this fix, when ABR type is changed, the command handler initiates
SPF recalculation.
Signed-off-by: Alexander Chernavin <achernavin@netgate.com>
Two changes for debug:
1. Add a field to indicate the vrf of the nexthop. When the interface changes
vrf, we can't easily tell the vrf of this nexthop from the current log.
2. Add a field to indicate the operation type. We can't tell whether a route is
being added or removed from the current log.
Before:
```
zebra_nhg_increment_ref: nhe 0x555623eb82c0 (76[if 6]) 0 => 1
zebra_interface_nhg_reinstall install nhe 75[77.75.1.75 if 6] nh type 3 flags 0x1
Route 77.75.1.0/24(8) queued for processing into sub-queue Early Route Processing
Route 77.75.1.0/24(8) queued for processing into sub-queue Early Route Processing
```
After:
```
zebra_nhg_increment_ref: nhe 0x555623eb82c0 (76[if 6 vrfid 9]) 0 => 1
zebra_interface_nhg_reinstall install nhe 75[77.75.1.75 if 6 vrfid 8] nh type 3 flags 0x1
Route 77.75.1.0/24(8) (add) queued for processing into sub-queue Early Route Processing
Route 77.75.1.0/24(8) (delete) queued for processing into sub-queue Early Route Processing
```
Donald Sharp [Mon, 24 Jul 2023 00:30:47 +0000 (20:30 -0400)]
bgpd: The last_reset_cause in the peer structure is too large
The last_reset_cause is a plain old BGP_MAX_PACKET_SIZE buffer
that is really enlarging the peer data structure. Let's just
copy the stream that failed and only allocate however much
the packet size actually was. While it's likely that we have
a reset reason, the packet typically is not going to be 65k
in size. Let's save space.
Quentin Young [Mon, 24 Jul 2023 23:01:51 +0000 (19:01 -0400)]
tests: fix strncpy warning
GCC/clang warns about using strncpy in such a way that it does not copy
the null byte of a string; as implemented it was fine, but to fix the
warning, just use strlcat which was purpose made for the task being
accomplished here.
Signed-off-by: Quentin Young <qlyoung@qlyoung.net>
Donald Sharp [Tue, 2 May 2023 13:25:04 +0000 (09:25 -0400)]
lib: Fix elf_py.c for coverity
David rightly pointed out that having a test for fd > 0 would
technically not be right, but not wrong for this portion of the
code since we know that we would never get a fd = 0 in this section.
In any event let's make coverity happy and move on with our life.
Donald Sharp [Mon, 24 Jul 2023 14:33:21 +0000 (10:33 -0400)]
bgpd: Reduce size of ibuf_work ringbuf
The ringbuf is 650k in size. This is obscenely large and
in practical experimentation FRR never even approaches
that size at all. Let's reduce this to 1.5 max packet sizes.
If a BGP_MAX_PACKET_SIZE packet is ever received having a bit
of extra space ensures that we can read at least 1 packet.
This also will significantly reduce memory usage when the
operator has a lot of peers.
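A back-of-the-envelope sketch of the saving follows. The constants are assumptions for illustration: 65535 bytes for BGP_MAX_PACKET_SIZE (consistent with the "65k" figure mentioned for packets) and an old ringbuf of ten maximum-size packets to approximate the stated 650k.

```python
BGP_MAX_PACKET_SIZE = 65535           # assumption: largest BGP message size
OLD_RINGBUF_PACKETS = 10              # assumption: approximates the stated 650k

old_size = OLD_RINGBUF_PACKETS * BGP_MAX_PACKET_SIZE
new_size = BGP_MAX_PACKET_SIZE * 3 // 2   # 1.5 max packet sizes

saved_per_peer = old_size - new_size
saved_for_1000_peers = saved_per_peer * 1000

# The new buffer still holds one maximum-size packet with room to spare.
fits_one_max_packet = new_size >= BGP_MAX_PACKET_SIZE
```

Under these assumptions the buffer shrinks from about 655 kB to about 98 kB per peer, so the total saving grows linearly with the number of peers.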