From: Donald Sharp
Date: Fri, 30 Jun 2023 19:21:43 +0000 (-0400)
Subject: ospf6d: Stop crash in ospf6_write
X-Git-Tag: docker/8.5.3~27^2~1
X-Git-Url: https://git.puffer.fish/?a=commitdiff_plain;h=48278a95be6b826613c7431a0f8119d72109cb7c;p=matthieu%2Ffrr.git

ospf6d: Stop crash in ospf6_write

I'm seeing crashes in ospf6_write on the `assert(node)`. The only
sequence of events that I see that could possibly cause this to
happen is this:

a) Someone has scheduled an outgoing write to the ospf6->t_write and
   placed item(s) on the ospf6->oi_write_q.
b) A decision is made in ospf6_send_lsupdate() to send an immediate
   packet via an event_execute(..., ospf6_write,....).
c) ospf6_write is called and the oi_write_q is cleaned out.
d) The t_write event is now popped, the oi_write_q is empty, and
   FRR asserts on the `assert(node)`.

When event_execute is called for ospf6_write, just cancel the t_write
event. If ospf6_write has more data to send at the end of the function
it will reschedule itself.

I've only seen this crash one time and am unable to reliably reproduce
it. But this is the only mechanism that I can see that could make this
happen, given how little the oi_write_q is actually touched in code.

Signed-off-by: Donald Sharp
---

diff --git a/ospf6d/ospf6_message.c b/ospf6d/ospf6_message.c
index 9e86a1e3a6..c17d387132 100644
--- a/ospf6d/ospf6_message.c
+++ b/ospf6d/ospf6_message.c
@@ -2576,9 +2576,7 @@ static void ospf6_send_lsupdate(struct ospf6_neighbor *on,
 				struct ospf6_interface *oi,
 				struct ospf6_packet *op)
 {
-
 	if (on) {
-
 		if ((on->ospf6_if->state == OSPF6_INTERFACE_POINTTOPOINT)
 		    || (on->ospf6_if->state == OSPF6_INTERFACE_DR)
 		    || (on->ospf6_if->state == OSPF6_INTERFACE_BDR))
@@ -2595,6 +2593,8 @@ static void ospf6_send_lsupdate(struct ospf6_neighbor *on,
 			op->dst = alldrouters6;
 	}
 	if (oi) {
+		struct ospf6 *ospf6;
+
 		ospf6_fill_hdr_checksum(oi, op);
 		ospf6_packet_add(oi, op);
 		/* If ospf instance is being deleted, send the packet
@@ -2602,12 +2602,27 @@ static void ospf6_send_lsupdate(struct ospf6_neighbor *on,
 		 */
 		if ((oi->area == NULL) || (oi->area->ospf6 == NULL))
 			return;
-		if (oi->area->ospf6->inst_shutdown) {
+
+		ospf6 = oi->area->ospf6;
+		if (ospf6->inst_shutdown) {
 			if (oi->on_write_q == 0) {
-				listnode_add(oi->area->ospf6->oi_write_q, oi);
+				listnode_add(ospf6->oi_write_q, oi);
 				oi->on_write_q = 1;
 			}
-			thread_execute(master, ospf6_write, oi->area->ospf6, 0);
+			/*
+			 * When ospf6d immediately calls event_execute
+			 * for items in the oi_write_q.  The event_execute
+			 * will call ospf6_write and cause the oi_write_q
+			 * to be emptied.  *IF* there is already an event
+			 * scheduled for the oi_write_q by something else
+			 * then when it wakes up in the future and attempts
+			 * to cycle through items in the queue it will
+			 * assert.  Let's stop the t_write event and
+			 * if ospf6_write doesn't finish up the work
+			 * it will schedule itself again.
+			 */
+			thread_cancel(&ospf6->t_write);
+			thread_execute(master, ospf6_write, ospf6, 0);
 		} else
 			OSPF6_MESSAGE_WRITE_ON(oi);
 	}
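
The sequence a) to d) from the commit message can be replayed in a small standalone model. This is only a sketch and not FRR code: struct toy_ospf6 and every toy_* function are hypothetical stand-ins, the write queue is reduced to a counter, and the scheduled t_write event is reduced to a boolean. It assumes, as the commit message suspects, that the write routine drains the whole queue in one pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for ospf6->oi_write_q and ospf6->t_write. */
struct toy_ospf6 {
	size_t queued;      /* number of interfaces on the write queue */
	bool write_pending; /* true while a deferred write event is scheduled */
};

/* Models the deferred path: queue one interface and schedule a write. */
static void toy_write_on(struct toy_ospf6 *o)
{
	o->queued++;
	o->write_pending = true;
}

/*
 * Models the write routine. Returns false in the situation where the
 * real function would hit assert(node): it runs against an empty queue.
 */
static bool toy_write(struct toy_ospf6 *o)
{
	if (o->queued == 0)
		return false;
	o->queued = 0; /* assumption: one pass drains the whole queue */
	return true;
}

/* Models the immediate send; cancel_first selects the patched behaviour. */
static void toy_send_now(struct toy_ospf6 *o, bool cancel_first)
{
	bool ok;

	o->queued++;
	if (cancel_first)
		o->write_pending = false; /* analogue of cancelling t_write */
	ok = toy_write(o);
	assert(ok); /* the queue was non-empty, so the direct call succeeds */
	(void)ok;
}

/* Models the event loop firing the deferred event if it is still scheduled. */
static bool toy_run_pending(struct toy_ospf6 *o)
{
	if (!o->write_pending)
		return true; /* nothing scheduled, so nothing runs */
	o->write_pending = false;
	return toy_write(o);
}

/* Replays steps a) to d); returns true if no write runs on an empty queue. */
static bool toy_replay(bool cancel_first)
{
	struct toy_ospf6 o = {0, false};

	toy_write_on(&o);               /* a) deferred write is scheduled */
	toy_send_now(&o, cancel_first); /* b) and c) immediate write drains the queue */
	return toy_run_pending(&o);     /* d) stale event fires, or was cancelled */
}
```

In this model toy_replay(false) returns false, because the stale event runs against an empty queue, while toy_replay(true) returns true, because the pending event was cancelled before the immediate write.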