ospfd: Do not allow thread drop
When the ospf->oi_write_q is not empty that means that ospf could
already have a thread scheduled for running. Just dropping
the pointer before resheduling does not stop the one currently
scheduled for running from running. The calling of thread_add_write
checks to see if we are already running and does the right thing here
so it is sufficient to just call thread_add_write.
This issue was tracked down from this stack trace:
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [EC
134217739] interface eth2.1032:172.16.4.110: ospf_check_md5 bad sequence
5333618 (expect
5333649)
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: message repeated 3 times: [ [EC
134217739] interface eth2.1032:172.16.4.110: ospf_check_md5 bad sequence
5333618 (expect
5333649)]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: Assertion `node’ failed in file ospfd/ospf_packet.c, line 666, function ospf_write
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: Backtrace for 8 stack frames:
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [bt 0] /usr/lib/libfrr.so.0(zlog_backtrace+0x3a) [0x7fef3efe9f8a]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [bt 1] /usr/lib/libfrr.so.0(_zlog_assert_failed+0x61) [0x7fef3efea501]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [bt 2] /usr/lib/frr/ospfd(+0x2f15e) [0x562e0c91815e]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [bt 3] /usr/lib/libfrr.so.0(thread_call+0x60) [0x7fef3f00d430]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [bt 4] /usr/lib/libfrr.so.0(frr_run+0xd8) [0x7fef3efe7938]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [bt 5] /usr/lib/frr/ospfd(main+0x153) [0x562e0c901753]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [bt 6] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fef3d83db45]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: [bt 7] /usr/lib/frr/ospfd(+0x190be) [0x562e0c9020be]
Oct 19 18:04:00 VYOS-R1 ospfd[1811]: Current thread function ospf_write, scheduled from file ospfd/ospf_packet.c, line 881
Oct 19 18:04:00 VYOS-R1 zebra[1771]: [EC
4043309116] Client ‘ospf’ encountered an error and is shutting down.
Oct 19 18:04:00 VYOS-R1 zebra[1771]: client 41 disconnected. 0 ospf routes removed from the rib
We had an assert(node) in ospf_write, which means that the list was empty. So I just
searched until I saw a code path that allowed multiple writes to the ospf_write function.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>