Xiaodong Xu [Sat, 20 Aug 2022 00:25:41 +0000 (17:25 -0700)]
ospf6d: Don't remove summary route if it is a range
Fix issue #11839.
When the user defines a range in an area other than the backbone area, the
summary route will be announced to the backbone area as an inter-area LSA.
However, if the prefix defined in the range is the same prefix as a connected
route in that area, the LSA won't be announced to the backbone area.
This is because when ospf6d is originating the summary route for the
intra-area route, it finds the range configured by the user and tries to
suppress the route by deleting the existing summary route, which happens to be
the one created by the range.
Although the range definition is not necessary in this case, it should not
fail this use case. So let's just keep the summary route there if it is
created from the user defined range.
Donald Sharp [Wed, 10 Aug 2022 00:09:03 +0000 (20:09 -0400)]
zebra: No need for a rib_delete before a rib_add
In kernel_socket.c, the code is deleting and then adding
the route back in on a change operation. This just translates
too two re's, one for deletion and one for addition. The deletion
will just be ignored. Let's not do the extra deletion.
Donald Sharp [Tue, 2 Aug 2022 17:57:18 +0000 (13:57 -0400)]
zebra: Introduce early route processing on the MetaQ
Currently if an operator does this operation:
sharpd@eva ~/frr8> sudo ip nexthop add id 5000 via 192.168.119.44 dev enp39s0 ; sudo ip route add 10.0.0.1 nhid 5000
2022/06/30 08:52:40 ZEBRA: [ZHQK5-J9M1R] proto2zebra: Please add this protocol(0) to proper rt_netlink.c handling
2022/06/30 08:52:40 ZEBRA: [PS16P-365FK][EC 4043309076] Zebra failed to find the nexthop hash entry for id=5000 in a route entry
sharpd@eva ~/frr8> vtysh -c "show ip route 10.0.0.1"
Routing entry for 0.0.0.0/0
Known via "kernel", distance 0, metric 100, best
Last update 00:01:58 ago
* 192.168.119.1, via enp39s0
The route is dropped by zebra with no warnings. This is not good,
but unlikely to happen at this point in time. In order to fix
this issue route processing from inputs needs to happen after nexthop
group processing from inputs. This was not possible because
nexthop groups are placed on the metaQ. As such the above
nexthop group creation is placed on the metaQ for processing
in META_QUEUE_NHG. Then the route is read in and processed
immediately. The nexthop group is not found ( not processed yet!)
and the route is dropped in zebra.
Modify the code to have early route processing of validity
on the MetaQ. This preserves the order of operations.
Donald Sharp [Tue, 9 Aug 2022 20:00:24 +0000 (16:00 -0400)]
zebra: Convert label processing to Meta-Q
Convert label processing that comes from zapi messages
into being handled by the meta-Q. This is because early
route processing is going to be moved to the meta-Q as
well and we will have a chicken and egg problem without
moving this code to be processed by the meta-Q.
Ordering of messages from ospf as an example:
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:48] comes from socket [36]
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:48] comes from socket [36]
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:48] comes from socket [36]
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:48] comes from socket [36]
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:62] comes from socket [36]
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:43] comes from socket [36]
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36]
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36]
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36]
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36]
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:61] comes from socket [36]
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36]
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_ROUTE_ADD:0:47] comes from socket [36]
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_MPLS_LABELS_REPLACE:0:47] comes from socket [36]
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_MPLS_LABELS_REPLACE:0:66] comes from socket [36]
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_MPLS_LABELS_REPLACE:0:47] comes from socket [36]
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_MPLS_LABELS_REPLACE:0:47] comes from socket [36]
2022/08/09 08:55:52.740 ZEBRA: [YXG8K-BCYMV] zebra message[ZEBRA_MPLS_LABELS_REPLACE:0:47] comes from socket [36]
The ZEBRA_MPLS_LABELS_REPLACE immediately turn around and attempt to replace nexthop labels on routes that
were added. If the route add is placed on the metaQ, it will not exist yet and as such the label replace
will fail.
Modify the zebra code to take the label operations and place them on the metaQ as well.
pim6d: Register message getting dropped in source node, mroute stuck in RegJ
The socket created for pimv6 was created using AF_INET for PIMV6
too.
Since the api pim_reg_sock is common to both PIMv4 and PIMv6,
need to use PIM_AF instead of AF_INET.
Donatas Abraitis [Tue, 16 Aug 2022 06:28:21 +0000 (09:28 +0300)]
bgpd: Change warning message when BGP community-list is not found
Before:
```
donatas-laptop# show bgp ipv4 unicast community-list testas
% testas is not a valid community-list name
donatas-laptop# con
donatas-laptop(config)# bgp community-list standard testas permit internet
donatas-laptop(config)# do show bgp ipv4 unicast community-list testas
donatas-laptop(config)#
```
`is not a valid community-list name` is a misleading warning message.
Doing the same for filter-list, access-list, prefix-list, route-map.
Donald Sharp [Mon, 15 Aug 2022 16:01:02 +0000 (12:01 -0400)]
pbrd: VTY_GET_CONTEXT can fail
Although VTY_GET_CONTEXT can return a failed value, it will
never happen in pbrd because of how context work. In
any event add some code to make coverity happy
Donald Sharp [Mon, 15 Aug 2022 15:43:27 +0000 (11:43 -0400)]
pimd: vrf may be NULL from pim_cmd_lookup_vrf
The call into pim_cmd_lookup_vrf may be NULL
and dereferencing it before ensuring that the
vrf pointer is non-NULL is a good way to crash.
A crash can be initiated in pim:
eva# show ip msdp vrf NOEXIST mesh-group
vtysh: error reading from pimd: Permission denied (13)Warning: closing connection to pimd because of an I/O error!
eva# 2022/08/15 11:47:38 [PHJDC-499N2][EC 100663314] STARVATION: task vtysh_rl_read (560b77f76de6) ran for 16777ms (cpu time 0ms)
Donald Sharp [Thu, 11 Aug 2022 01:49:37 +0000 (21:49 -0400)]
ospfd: When a neighbor goes down clear the oi->obuf if we can
When a neighbor goes down on an interface and that interface
has no more neighbors in a viable state where packets should
be being sent, then let's clear up the oi->obuf associated
with the interface the neighbor is on.
Donald Sharp [Wed, 10 Aug 2022 20:15:29 +0000 (16:15 -0400)]
ospfd: Increase packets sent at one time in ospf_write
ospf_write pulls an interface off the ospf->oi_write_q
then writes one packet and places it back on the queue,
keeping track of the first one sent. Then it will
stop sending packets even if we get back to the first
interface written too but before we have sent the full
pkt_count. I do not believe this makes a whole bunch
of sense and is very restrictive of how much data can
be sent especially if you have a limited number of peers
but large amounts of data. Why be so restrictive?
Quentin Young [Fri, 11 Jun 2021 23:01:42 +0000 (19:01 -0400)]
bgpd: don't adv conditionally withdrawn routes
If we have conditional advertisement enabled, and conditionally withdrew
some prefixes, and then we do a 'clear bgp', those routes were getting
advertised again, and then withdrawn the next time the conditional
advertisement scanner executed.
When we go to advertise check the prefix against the conditional
advertisement status so we don't do that.
Quentin Young [Tue, 15 Jun 2021 23:49:19 +0000 (19:49 -0400)]
bgpd: apply cond-adv policy to update group
The new outbound filter to apply conditional advertisement policy was
not working properly due to complications with update groups. The two
routemaps were properly copied into the update group peer filter but not
the conditional advertisement state.
Signed-off-by: Quentin Young <qlyoung@nvidia.com> Signed-off-by: Mark Stapp <mstapp@nvidia.com>
anlan_cs [Tue, 9 Aug 2022 00:41:22 +0000 (20:41 -0400)]
ospf6d: fix missing cost change
After all needed interfaces ( for example: interface "a1", vrf "vrf1", and
"a1" is binded to "vrf1" ) are ready/created, then restart/start frr. zebra
at startup will call `netlink_interface()` to process all interfaces and notify
all clients, but its calling `get_iflink_speed()` maybe fails for unexpected
order of the coming interfaces: when processing "a1", "vrf1" maybe is unknown
at that time. `if_zebra_speed_update()` timer is introduced to deal with this
order problem.
Currently only ospfd and ospf6d deal with this speed change to recalculated
route cost. ospfd can deal with this change, but ospf6d will wrongly missed it.
Since both `ipv6 ospf6 cost COST` and `auto-cost reference-bandwidth COST` are
not set, cost of this ospf6 interface should be calculated with interface
speed, but it is wrongly kept to `10`, which is based on interface speed being
`0` for it missed speed change. Further, ECMP function becomes invalid after
restart frr, beacuse some ospf6 interfaces of one ECMP are wrongly with cost
`10`.
To avoid missing, recalculate cost for ospf6 interfaces based on potentially
changed speed.
Donald Sharp [Tue, 9 Aug 2022 14:23:35 +0000 (10:23 -0400)]
zebra: System routes should be processed the same time as kernel
For whatever reason. ZEBRA_ROUTE_SYSTEM routes were being processed
last. Since a system route is just another kernel route type. Let's
just switch it to be processed the same time as kernel routes.
Donald Sharp [Tue, 9 Aug 2022 13:22:43 +0000 (09:22 -0400)]
zebra: Explicitly call out the correct queue name
There were more than a few places where the NHG meta
queue was not being explicitly called out. Let's
be consistent and use the same nomenclature as much
as possible when talking about metaQ's.