Donald Sharp [Wed, 10 Nov 2021 21:58:58 +0000 (16:58 -0500)]
zebra: Fix v6 route replace failure turned into success
Currently when we have a route replace operation for v6 routes
with a new nexthop group the order of kernel installation is this:
a) New nexthop group insertion seq 1
b) Route delete operation seq 3
c) Route insertion operation seq 2
Currently the code in nl_batch_read_resp is attempting
to handle this situation by skipping the delete operation.
*BUT* it is enqueuing the context into the zebra dplane
queue before we read the response. Since we create the ctx
with an implied success, success is being reported to the
upper level dplane and the zebra rib thinks the route has
been properly handled.
This is showing up in the zebra_seg6_route test code because
the test code is installing a seg6 route w/ sharpd and it
is failing to install because the route's nexthop is rejected:
a) nexthop installation seq 11
b) route delete seq 13
c) route add seq 12
Note the last line, we report the install as a success but it clearly failed from the seq=12 decode.
When we look at the v6 rib it thinks it is installed:
unet> r1 show ipv6 route
Codes: K - kernel route, C - connected, S - static, R - RIPng,
O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
v - VNC, V - VNC-Direct, A - Babel, D - SHARP, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
So let's modify nl_batch_read_resp to not dequeue/enqueue the context until we are sure we have
the right one. This fixes the test code to do the right thing on the second installation.
Donald Sharp [Wed, 10 Nov 2021 20:09:37 +0000 (15:09 -0500)]
zebra: set zd_is_update in 1 spot
The ctx->zd_is_update is being set in various
spots based upon the same value that we are
passing into dplane_ctx_ns_init. Let's just
consolidate all this into the dplane_ctx_ns_init
so that the zd_is_udpate value is set at the
same time that we increment the sequence numbers
to use.
As a note for future me's reading this. The sequence
number choosen for the seq number passed to the
kernel is that each context gets a copy of the
appropriate nlsock to use. Since it's a copy
at a point in time, we know we have a unique sequence
number value.
Donald Sharp [Wed, 10 Nov 2021 19:12:39 +0000 (14:12 -0500)]
zebra: When we get an implicit or ack or full failure mark status
When nl_batch_read_resp gets a full on failure -1 or an implicit
ack 0 from the kernel for a batch of code. Let's immediately
mark all of those in the batch pass/fail as needed. Instead
of having them marked else where.
Donald Sharp [Wed, 1 Dec 2021 22:03:38 +0000 (17:03 -0500)]
lib: Update hash.h documentation to warn of a possible crash
Multiple deletions from the hash_walk or hash_iteration calls
during a single invocation of the passed in function can and
will cause the program to crash. Warn against doing such a
thing.
Donald Sharp [Wed, 1 Dec 2021 21:28:42 +0000 (16:28 -0500)]
zebra: Ensure zebra_nhg_sweep_table accounts for double deletes
I'm seeing this crash in various forms:
Program terminated with signal SIGSEGV, Segmentation fault.
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f418efbc7c0 (LWP 3580253))]
(gdb) bt
(gdb) f 4
267 (*func)(hb, arg);
(gdb) p hb
$1 = (struct hash_bucket *) 0x558cdaafb250
(gdb) p *hb
$2 = {len = 0, next = 0x0, key = 0, data = 0x0}
(gdb)
I've also seen a crash where data is 0x03.
My suspicion is that hash_iterate is calling zebra_nhg_sweep_entry which
does delete the particular entry we are looking at as well as possibly other
entries when the ref count for those entries gets set to 0 as well.
Then we have this loop in hash_iterate.c:
for (i = 0; i < hash->size; i++)
for (hb = hash->index[i]; hb; hb = hbnext) {
/* get pointer to next hash bucket here, in case (*func)
* decides to delete hb by calling hash_release
*/
hbnext = hb->next;
(*func)(hb, arg);
}
Suppose in the previous loop hbnext is set to hb->next and we call
zebra_nhg_sweep_entry. This deletes the previous entry and also
happens to cause the hbnext entry to be deleted as well, because of nhg
refcounts. At this point in time the memory pointed to by hbnext is
not owned by the pthread anymore and we can end up on a state where
it's overwritten by another pthread in zebra with data for other incoming events.
What to do? Let's change the sweep function to a hash_walk and have
it stop iterating and to start over if there is a possible double
delete operation.
Donald Sharp [Mon, 24 Jan 2022 12:02:37 +0000 (07:02 -0500)]
bgpd: Prevent use after variable goes out of scope
`struct prefix p` was declared inside an if statement
where we assign the address of to a pointer that is
then passed to a sub function. This will eventually
leave us in a bad state.
Tomi Salminen [Wed, 2 Feb 2022 09:19:09 +0000 (11:19 +0200)]
ospfd: Core in ospf_if_down during shutdown.
Skip marking routes as changed in ospf_if_down if there's now
new_table present, which might be the case when the instance is
being finished
The backtrace for the core was:
raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:50
core_handler (signo=11, siginfo=0x7fffffffe170, context=<optimized out>) at lib/sigevent.c:262
<signal handler called>
route_top (table=0x0) at lib/table.c:401
ospf_if_down (oi=oi@entry=0x555555999090) at ospfd/ospf_interface.c:849
ospf_if_free (oi=0x555555999090) at ospfd/ospf_interface.c:339
ospf_finish_final (ospf=0x55555599c830) at ospfd/ospfd.c:749
ospf_deferred_shutdown_finish (ospf=0x55555599c830) at ospfd/ospfd.c:578
ospf_deferred_shutdown_check (ospf=<optimized out>) at ospfd/ospfd.c:627
ospf_finish (ospf=<optimized out>) at ospfd/ospfd.c:683
ospf_terminate () at ospfd/ospfd.c:653
sigint () at ospfd/ospf_main.c:109
quagga_sigevent_process () at lib/sigevent.c:130
thread_fetch (m=m@entry=0x5555556e45e0, fetch=fetch@entry=0x7fffffffe9b0) at lib/thread.c:1709
frr_run (master=0x5555556e45e0) at lib/libfrr.c:1174
main (argc=9, argv=0x7fffffffecb8) at ospfd/ospf_main.c:254
Signed-off-by: Tomi Salminen <tsalminen@forcepoint.com>
anlan_cs [Wed, 19 Jan 2022 07:10:18 +0000 (02:10 -0500)]
bfdd: fix broken FSM in passive mode
Problem:
One is with active mode, the other is with passive mode. Sometimes
the one with active mode is in `Down` stauts, but the other one
with passive mode is unluckily stuck in `Init` status:
It doesn't answer its peer with any packets, even receiving continuous
`Down` packets.
Root Cause:
bfdd with passive mode answers its peer only *one* packet in `Down` status,
then it enters into `Init` status and ignores subsequent `Down` packets.
Unluckily that *one* answered packet is lost, at that moment its peer
with active mode can only have to send `Down` packets.
Fix:
1) With passive mode, bfdd should start xmittimer after received `Down` packet.
Refer to RFC5880:
"A system taking the Passive role MUST NOT begin sending BFD packets for
a particular session until it has received a BFD packet for that session, and
thus has learned the remote system's discriminator value."
2) Currently this added xmittimer for passive mode can be safely removed
except receiving `AdminDown` packet:
- `bfd_session_enable/bfd_set_passive_mode` doesn't start xmittimer
- `ptm_bfd_sess_dn/bfd_set_shutdown` can remove xmittimer
Per RFC5880, receiving `AdminDown` packet should be also regarded as `Down`,
so just call `ptm_bfd_sess_dn`, which will safely remove the added xmittimer
for passive mode. In summary, call `ptm_bfd_sess_dn` for two status changes
on receiving `AdminDown`: `Init`->`Down` and `Up`->`Down`.
Mark Stapp [Tue, 1 Feb 2022 18:41:17 +0000 (13:41 -0500)]
zebra: reduce incoming netlink messages for dplane thread
The dataplane pthread only processes a limited set of incoming
netlink notifications: only register for that set of events,
reducing duplicate incoming netlink messages.
Igor Ryzhov [Sun, 23 Jan 2022 17:22:42 +0000 (20:22 +0300)]
zebra: fix cleanup of meta queues on vrf disable
Current code treats all metaqueues as lists of route_node structures.
However, some queues contain other structures that need to be cleaned up
differently. Casting the elements of those queues to struct route_node
and dereferencing them leads to a crash. The crash may be seen when
executing bgp_multi_vrf_topo2.
Fix the code by using the proper list element types.
ckishimo [Mon, 31 Jan 2022 22:58:14 +0000 (23:58 +0100)]
ospfd: restart spf when distance is updated
if r1 has a route received from a neighbor with the default administrative
distance configured
r1# sh ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
O>* 1.1.1.1/32 [110/20] via 10.0.12.2, r1-r2-eth0, weight 1, 00:00:41
this is not applied as there are no changes in the routing table
r1# sh ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
O>* 1.1.1.1/32 [110/20] via 10.0.12.2, r1-r2-eth0, weight 1, 00:00:13
This commit will force the update of the routing table with the new configured distance
r1# sh ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
O>* 1.1.1.1/32 [50/20] via 10.0.12.2, r1-r2-eth0, weight 1, 00:00:14
Donald Sharp [Mon, 31 Jan 2022 17:49:55 +0000 (12:49 -0500)]
ospfd: Convert output to host order from network order for route_tag
FRR stores the route_tag in network byte order. Bug filed indicates
that the `show ip ospf route` command shows the correct value.
Every place route_tag is dumped in ospf_vty.c the ntohl function
is used first.
Fixes: #10450 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Donald Sharp [Fri, 21 Jan 2022 17:03:04 +0000 (12:03 -0500)]
zebra: Make Router Advertisement warnings show up once every 6 hours
RA packets are pretty chatty and when there is a warning from
a missconfiguration on the network, the log file gets filed
up with warnings. Modify the code in rtadv.c to only spit
out the warning in these cases at most every 6 hours.
ckishimo [Wed, 26 Jan 2022 21:41:36 +0000 (22:41 +0100)]
ospf6d: restart spf when distance is updated
if r1 has a route received from a neighbor and the same route
configured as static, the administrative distance will determine
which route to use
r1(config)# ipv6 route 1:1::1/128 Null0 70
r1# sh ipv6 route
Codes: K - kernel route, C - connected, S - static, R - RIPng,
O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
v - VNC, V - VNC-Direct, A - Babel, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
S>* 1:1::1/128 [70/0] unreachable (blackhole), weight 1, 00:00:12
O 1:1::1/128 [110/20] via fe80::1833:c9ff:fe7b:3e43, r1-r2-eth0, weight 1, 00:00:49
The static route is selected. If we now change the administrative distance
in ospf6, the OSPF route should be selected
r1# sh ipv6 route
Codes: K - kernel route, C - connected, S - static, R - RIPng,
O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
v - VNC, V - VNC-Direct, A - Babel, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
S>* 1:1::1/128 [70/0] unreachable (blackhole), weight 1, 00:00:39
O 1:1::1/128 [110/20] via fe80::1833:c9ff:fe7b:3e43, r1-r2-eth0, weight 1, 00:01:16
However the distance is not applied as there are no changes in the routing table
This commit will force the update of the routing table with the new configured distance
r1# sh ipv6 route
Codes: K - kernel route, C - connected, S - static, R - RIPng,
O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
v - VNC, V - VNC-Direct, A - Babel, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
O>* 1:1::1/128 [50/20] via fe80::8cb7:e6ff:fef5:2344, r1-r2-eth0, weight 1, 00:00:03
S 1:1::1/128 [70/0] unreachable (blackhole), weight 1, 00:00:19
Igor Ryzhov [Thu, 27 Jan 2022 18:05:40 +0000 (21:05 +0300)]
vrrpd: use ipaddr_is_zero when needed
Replace custom implementation or call to ipaddr_isset with a call to
ipaddr_is_zero.
ipaddr_isset is not fully correct, because it's fine to have some
non-zero bytes at the end of the struct in case of IPv4 and the function
doesn't allow that.