Trey Aspelund [Tue, 28 Jun 2022 14:08:55 +0000 (14:08 +0000)]
bgpd: include 0 in configured hold/keepalive
The default keepalive/hold timers are always exposed via this commit:
```
commit 9b1b96233d7204263d409ea6c504b316af9e533f (origin/bgp_timer_always_on)
Author: Trey Aspelund <taspelund@nvidia.com>
Date: Mon Jun 27 23:20:33 2022 +0000
bgpd: always display keepalive/hold intervals
`show bgp neighbors <peer> [json]` was only displaying the configured
keepalive and holdtime intervals when they differed from the default
values. Since default config is still config, let's make sure these
values are always displayed.
However it mistakenly changed the logic to only display the peer's
timers if the configured value was non-zero. This updates the logic to
check PEER_FLAG_TIMER to determine if the values were configured,
given 0 is a valid value (to disable keepalives).
Trey Aspelund [Mon, 27 Jun 2022 23:20:33 +0000 (23:20 +0000)]
bgpd: always display keepalive/hold intervals
`show bgp neighbors <peer> [json]` was only displaying the configured
keepalive and holdtime intervals when they differed from the default
values. Since default config is still config, let's make sure these
values are always displayed.
Donald Sharp [Mon, 27 Jun 2022 19:30:55 +0000 (15:30 -0400)]
zebra: Add ability for netconf dplane to handle global values
Add the ability for the netconf dplane code to handle
the global NETCONFA_IFINDEX_DEFAULT and NETCONF_IFINDEX_ALL
values. Then store our interested values when we get
them from the kernel as well as being able to display
them to the end operator.
configure, zebra: include DPDK headers and shared libs in the dp-dpdk build
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
-> Moved new capabilities needed to under HAVE_DPDK Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
zebra: expand pbr rule action for dataplane programming
PBR rules are installed as match, action rules in most dataplanes. This
requires the action to be resolved via a GW. And the GW to be subsequently
resolved to {SMAC, DMAC}.
zebra: add support for maintaining local neigh entries
Currently specific local neighbors (attached to SVIs) are maintatined
in an EVPN specific database. There is a need to maintain L3 neighbors
for other purposes including MAC resolution for PBR nexthops.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Cleanup compile and fix crash Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
G. Paul Ziemba [Fri, 20 May 2022 16:26:56 +0000 (09:26 -0700)]
toptests/isis_sr_te_topo1: test out-of-order route/route-map changes
A SR policy matches a BGP nexthop based on the IP address of
the nexthop and the color of the route (color may be assigned
to routes using a route-map).
The order of events (BGP route arrival, route-map definition,
policy and candidate-path definition) should not affect the
matching/mapping.
These changes add tests for:
- removing/adding BGP route after policy and routemap are
defined and held constant
- changing route map color to be different from policy color,
and then changing back to match
after each change, the policy should be observed to be in effect
unchanged from before, i.e., the route's nexthops should reflect
the matching SR policy.
Sarita Patra [Fri, 24 Jun 2022 14:48:03 +0000 (07:48 -0700)]
pimd: fix pim interface deletion flow
Deletion of pim interface(pim_if_delete) should
do the below things before cleanup.
1. Send a hello message with zero hold time.
2. Delete all the neighbors.
3. Close the pim socket.
Sarita Patra [Fri, 24 Jun 2022 10:04:37 +0000 (03:04 -0700)]
pimd: fix invalid memory access join_timer_stop
Issue:
==16837== Invalid read of size 8
==16837== at 0x17971C: pim_neighbor_find (pim_neighbor.c:431)
==16837== by 0x186439: join_timer_stop (pim_upstream.c:348)
==16837== by 0x186794: pim_upstream_del (pim_upstream.c:231)
==16837== by 0x189A66: pim_upstream_terminate (pim_upstream.c:1951)
==16837== by 0x17111B: pim_instance_terminate (pim_instance.c:54)
==16837== by 0x17111B: pim_vrf_delete (pim_instance.c:172)
==16837== by 0x4F1D6C8: vrf_delete (vrf.c:264)
==16837== by 0x19006F: pim_terminate (pimd.c:160)
==16837== by 0x1B2E4D: pim_sigterm (pim_signals.c:51)
==16837== by 0x4F08FA2: frr_sigevent_process (sigevent.c:130)
==16837== by 0x4F1A2CC: thread_fetch (thread.c:1771)
==16837== by 0x4ED4F92: frr_run (libfrr.c:1197)
==16837== by 0x15D81A: main (pim_main.c:176)
Root Cause:
In the pim_terminate flow, the interface is deleted
before the pim_interface clean up. Because of this,
the pim_interface is having garbage value.
Fix:
Release the pim interface memory and then delete the
interface.
Donald Sharp [Fri, 17 Jun 2022 15:23:31 +0000 (11:23 -0400)]
zebra: Fix rtadv startup when config read in is before interface up
When a interface is configured with this:
int eva
ipv6 nd ra-interval 5
no ipv6 nd suppress-ra
!
And then subsuquently the interface is created and brought up, FRR
would both error on joining the RA multicast address and never
properly work in this state.
Delay the startup of the join and start of the Router Advertisements
until after the ifindex has actually been found.
Eugene Bogomazov [Fri, 24 Jun 2022 09:28:13 +0000 (12:28 +0300)]
bgpd: update topotests for role mismatch
In topotests, we also want to check for role mismatch cases. However, if
we are testing the sender of a role mismatch notification, sometimes it
can have non-deterministic behavior (probably due to a configuration
change). Thus, there is an assumption that the recipient of
notifications will more consistently display the reason why the session
was terminated in the first place.
rgirada [Thu, 23 Jun 2022 14:37:28 +0000 (07:37 -0700)]
vtysh: Account validity should be verified when authenticating users with PAM.
Description:
SonarQube detects the following behaviour as a vulanarability.
When authenticating users using PAM, it is strongly recommended to
check the validity of the account (not locked, not expired ...),
otherwise it leads to unauthorized access to resources.
pam_acct_mgmt() should be called for account validity after
calling pam_authenticate().
Donald Sharp [Sat, 18 Jun 2022 18:37:14 +0000 (14:37 -0400)]
isisd: Fix crash with xfrm interface type
When creating a xfrm interface FRR is crashing when configured
with isis. This is because the weird pattern of not allocating
list's until needed and then allowing the crash when we have
a usage pattern that was not expected. Just always allocate
the different lists that a circuit needs.
(gdb) bt
(gdb)
Fixes #11432 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Donald Sharp [Fri, 17 Jun 2022 19:40:36 +0000 (15:40 -0400)]
tests: Increase time for zebra_seg6local to look for sharp routes
I have a test failure:
r1.vtysh_cmd(
"sharp install seg6local-routes {} nexthop-seg6local dum0 {} 1".format(
dest, context
)
)
test_func = partial(
check,
r1,
dest,
manifest["out"],
)
success, result = topotest.run_and_expect(test_func, None, count=5, wait=1)
> assert result is None, "Failed"
E AssertionError: Failed
E assert Generated JSON diff error report:
E
E > $: d2 has the following element at index 0 which is not present in d1:
E
E {
E "prefix": "1::1/128",
E "protocol": "sharp",
E "selected": true,...
E
The test output for 1::1/128:
{
"1::1/128":[
{
"prefix":"1::1/128",
"prefixLen":128,
"protocol":"sharp",
"vrfId":0,
"vrfName":"default",
"selected":true,
"destSelected":true,
"distance":150,
"metric":0,
"queued":true,
"table":254,
"internalStatus":8,
Notice that it is still queued after 5 seconds. Under extremely heavy system load
this is not long enough for convergence. Also the zebra.log shows thread starvation
as well as long running tasks
2022/06/17 15:30:02 ZEBRA: [PHJDC-499N2][EC 100663314] STARVATION: task dplane_incoming_request (55b3ce0fea8b) ran for 6369ms (cpu time 0ms)
2022/06/17 15:30:02 ZEBRA: [T83RR-8SM5G] zebra 8.4-dev starting: vty@2601
2022/06/17 15:30:02 ZEBRA: [YZRX4-ZXG0C][EC 100663315] Thread Starvation: {(thread *)0x55b3ce6c15b0 arg=0x0 timer r=-6.375 rib_sweep_route() &zrouter.sweeper from zebra/main.c:447} was scheduled to pop greater than 4s ago
Increasing the time to 25 seconds to give it a chance.
Donald Sharp [Wed, 22 Jun 2022 23:40:58 +0000 (19:40 -0400)]
pimd: Checks imply that pim is not properly configured
The call to gm_update_ll checks for null pointers and
implies to SA that things could not be configured correctly
This is not true with the code flow. Remove the confusing code.
Donald Sharp [Wed, 22 Jun 2022 12:12:04 +0000 (08:12 -0400)]
pimd: Limit pim's ecmp to what zebra tells us is the multipath
Zebra can be setup to use a value that is less than MULTIPATH_NUM.
When pimd connects to zebra, zebra will inform pim about the MULTIPATH_NUM
used. Let's use that value for figuring out our multipath value.
Donald Sharp [Wed, 22 Jun 2022 11:48:51 +0000 (07:48 -0400)]
bgpd: Cleanup pointer assignment so compiler doesn't get confused
Coverity SA thinks that the `struct prefix`.u.prefix4 is limited
to actually 4 bytes of memory at that spot, but it's in a union
and it can be treated as a prefix6 as well. Just change the
pointer assignment to something that covers both easily.
Donald Sharp [Thu, 23 Jun 2022 16:22:30 +0000 (12:22 -0400)]
zebra: Allow kernel routes to stick around better on interface state changes
Currently kernel routes on system bring up would be `auto-accepted`,
then if an interface went down all kernel and system routes would
be re-evaluated. There exists situations where a kernel route can
exist but the interface itself is not exactly in a state that is
ready to create a connected route yet. As such when any interface
goes down in the system all kernel/system routes would be re-evaluated
and then since that interfaces connected route is not in the table yet
the route is matching against a default route( or not at all ) and
is being dropped.
Modify the code such that kernel or system routes just look for interface
being in a good state (up or operative) and accept it.
Broken code:
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp39s0, 00:05:08
K>* 1.2.3.5/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.6/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.7/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.8/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.9/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.10/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.12/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.13/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.14/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.16/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.17/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
C>* 4.5.6.99/32 is directly connected, dummy9, 00:05:08
K>* 4.9.10.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 10.11.12.13/32 [0/0] via 192.168.119.1, enp39s0, 00:05:08
C>* 192.168.10.0/24 is directly connected, dummy99, 00:05:08
C>* 192.168.119.0/24 is directly connected, enp39s0, 00:05:08
<shutdown a non-related interface>
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp39s0, 00:05:28
C>* 4.5.6.99/32 is directly connected, dummy9, 00:05:28
K>* 10.11.12.13/32 [0/0] via 192.168.119.1, enp39s0, 00:05:28
C>* 192.168.10.0/24 is directly connected, dummy99, 00:05:28
C>* 192.168.119.0/24 is directly connected, enp39s0, 00:05:28
Working code:
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp39s0, 00:00:04
K>* 1.2.3.5/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.6/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.7/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.8/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.9/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.10/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.12/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.13/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.14/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.16/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.17/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
C>* 4.5.6.99/32 is directly connected, dummy9, 00:00:04
K>* 4.9.10.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 10.11.12.13/32 [0/0] via 192.168.119.1, enp39s0, 00:00:04
C>* 192.168.10.0/24 is directly connected, dummy99, 00:00:04
C>* 192.168.119.0/24 is directly connected, enp39s0, 00:00:04
<shutdown a non-related interface>
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp39s0, 00:00:15
K>* 1.2.3.5/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.6/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.7/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.8/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.9/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.10/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.12/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.13/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.14/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.16/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.17/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
C>* 4.5.6.99/32 is directly connected, dummy9, 00:00:15
K>* 4.9.10.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 10.11.12.13/32 [0/0] via 192.168.119.1, enp39s0, 00:00:15
C>* 192.168.10.0/24 is directly connected, dummy99, 00:00:15
C>* 192.168.119.0/24 is directly connected, enp39s0, 00:00:15
eva#
Donald Sharp [Thu, 23 Jun 2022 14:27:56 +0000 (10:27 -0400)]
lib, zebra: Notice when a nexthop is set linkdown
When a nexthop is set RTNH_F_LINKDOWN, start noticing
that this flag is set. Allow FRR to know about this
flag but at this point do not do anything with it.
Donald Sharp [Thu, 23 Jun 2022 14:22:45 +0000 (10:22 -0400)]
lib: Increase nexthop flags size to 16 bits
commit: 5609e70fb87a3b23b55629a33e5afb298974c142
Added a new flag to the `struct nexthop` and
this addition of a flag caused the flags size to
be too small. Increase the size of flags to
allow more flags to be had.
Donald Sharp [Thu, 23 Jun 2022 15:14:46 +0000 (11:14 -0400)]
zebra: Fix bug in netconf handling where dplane would drop the change
When reading a on the fly change of an interested netconf netlink
message. The ifindex and ns_id for the context was being set for the sub structure
but not for the main context data structure and zebra_if_dplane_result
was dropping the result on the floor because it was expecting the ns_id and
the interface id to be in a different spot.
rgirada [Thu, 23 Jun 2022 13:40:19 +0000 (06:40 -0700)]
ospfd: fixing few coverity issue in 'show_ip_ospf_neighbour_brief'
Description:
timerval data structure is being used without initialization.
Using these uninitialized parameters can lead unexpected results
so initializing before using it.
anlan_cs [Thu, 19 May 2022 01:55:33 +0000 (21:55 -0400)]
zebra: move the check for l3vni
The two checks for l3vni have been already done in
`lib_vrf_zebra_l3vni_id_modify()` as it should be. And it is improper that
the two checks are put after `zebra_vxlan_handle_vni_transition()`, which
will do real things.
My original fix is to remove them. But NB module can't guarantee many changes,
so we'd better keep them in `zebra_vxlan_process_vrf_vni_cmd()` in APPLY stage
for safe.
Just move them in front of `zebra_vxlan_handle_vni_transition()`.
rgirada [Sun, 19 Jun 2022 18:04:39 +0000 (11:04 -0700)]
ospfd: Fixing "show ip ospf neighbour <nbrid>" command
Description:
"show ip ospf neighbour [nbrid] [json]" is expected to give brief output
of the specific neighbour. But it gives the detailed output without
the detail keyword.
"show ip ospf neighbour [nbrid] [deatil] [json]" command is failed to
fetch the ecpected o/p. Corrected it.
Ex o/p:
frr(config-if)# do show ip ospf neighbor
Neighbor ID Pri State Up Time Dead Time Address Interface RXmtL RqstL DBsmL
8.8.8.8 1 Full/DR 17m03s 31.192s 20.1.1.194 ens192:20.1.1.220 0 0 0
30.1.1.100 1 Full/DR 56.229s 32.000s 30.1.1.100 ens224:30.1.1.220 0 0 0
frr(config-if)#
frr(config-if)#
frr(config-if)# do show ip ospf neighbor 8.8.8.8
Neighbor 8.8.8.8, interface address 20.1.1.194
In the area 0.0.0.0 via interface ens192
Neighbor priority is 1, State is Full/DR, 6 state changes
Most recent state change statistics:
Progressive change 17m18s ago
DR is 20.1.1.194, BDR is 20.1.1.220
Options 2 *|-|-|-|-|-|E|-
Dead timer due in 35.833s
Database Summary List 0
Link State Request List 0
Link State Retransmission List 0
Thread Inactivity Timer on
Thread Database Description Retransmision off
Thread Link State Request Retransmission on
Thread Link State Update Retransmission on
Graceful restart Helper info:
Graceful Restart HELPER Status : None
frr(config-if)# do show ip ospf neighbor 8.8.8.8 detail
No such interface.
frr(config-if)# do show ip ospf neighbor 8.8.8.8 detail json
{}
frr(config-if)#
Eugene Bogomazov [Wed, 22 Jun 2022 13:57:08 +0000 (16:57 +0300)]
bgpd: move to switch clause in get name function
bgp_rpki_validation2str implements a switch statement to determine the
correct string response from the validation state. So, switch to a
switch statement when getting a name by role for code consistency.
Eugene Bogomazov [Wed, 22 Jun 2022 13:12:28 +0000 (16:12 +0300)]
bgpd: simplify code fragment for RFC 9234
Roles cannot be applied to iBGP sessions, so we can move this check to
the top of the role configuration method. Thus, we simplify the internal
logic of branching.
Eugene Bogomazov [Wed, 22 Jun 2022 12:09:06 +0000 (15:09 +0300)]
bgpd: simplify ebgp role check for RFC 9234
BGP Role is currently defined only for eBGP session. So, we don't
need to consider which roles can be applied on iBGP session and
thus simplify code fragment.
Eugene Bogomazov [Wed, 22 Jun 2022 09:47:22 +0000 (12:47 +0300)]
bgpd: add AFI/SAFI check for RFC 9234
RFC 9234 mandates that role rules apply only to IPv4/IPv6 unicast bgp
sessions. If the OTC attribute appears in other sessions, it will remain
untouched.
When disabling MLAG leaf configuration with EVPN, logs are
getting flooded for each VNI, This is the result of each Type-2
packets. Ideally, this should be under log debugging, not a warning.
Testing: UT Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>