diff options
| author | Donald Sharp <sharpd@cumulusnetworks.com> | 2019-08-28 12:09:41 -0400 |
|---|---|---|
| committer | Donald Sharp <sharpd@cumulusnetworks.com> | 2019-08-28 12:09:41 -0400 |
| commit | 11375c52740089b6b49ca7d56b2cea0c7208338c (patch) | |
| tree | 4ce8b0c75c2c57bdc77a18c17059a457eca857b7 | |
| parent | 8f910d6c3f9e92422764a625ec3a9c23a87df6bf (diff) | |
lib: Stop arm crash on shutdown
Arm platforms are crashing in our topotests with this callstack;
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0xffffabb591d0 (LWP 18947))]
(gdb) bt
file=file@entry=0xaaaadfed1e48 "lib/memory.c", line=line@entry=80,
function=function@entry=0xaaaadfed1db8 <__func__.10514> "mt_count_free") at lib/log.c:837
(gdb)
So we are crashing because we are attempting to free a mtype that has no allocations
associated with it.
I added this debug code:
@@ -227,7 +230,9 @@ static void rcu_bump(void)
struct rcu_next *rn;
rn = XMALLOC(MTYPE_RCU_NEXT, sizeof(*rn));
-
+ zlog_debug("RCU_BUMP");
+ mtype_dump(MTYPE_RCU_THREAD);
+ mtype_dump(MTYPE_RCU_NEXT);
/* note: each RCUA_NEXT item corresponds to exactly one seqno bump.
* This means we don't need to communicate which seqno is which
* RCUA_NEXT, since we really don't care.
and added a mtype_dump function:
+void mtype_dump(struct memtype *mt)
+{
+ zlog_debug("%s: %d", mt->name, (int)mt->n_alloc);
+}
Which resulted in this output:
2019/08/28 15:41:11 BGP: RCU_BUMP
2019/08/28 15:41:11 BGP: RCU thread: 3
2019/08/28 15:41:11 BGP: RCU thread: 3
If we look at the defintion of the two static memory types:
DEFINE_MTYPE_STATIC(LIB, RCU_THREAD, "RCU thread")
DEFINE_MTYPE_STATIC(LIB, RCU_NEXT, "RCU sequence barrier")
I would have expected the output to be:
RCU_BUMP
RCU thread: 3
RCU sequence barrier: X
instead.
As a thought experiment I reduced the number of static memory types
to 1 in the file and the crash stopped happening.
I suspect we have a systematic error on arm in lib/memory.h
due to the asm code. I am going to leave that alone for the
moment ( and leave the crash issue open ), but see if we
can get this code change into the system so that our CI
system becomes happy again.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
| -rw-r--r-- | lib/frrcu.c | 5 |
1 files changed, 2 insertions, 3 deletions
diff --git a/lib/frrcu.c b/lib/frrcu.c index 7e6475b648..54626f909d 100644 --- a/lib/frrcu.c +++ b/lib/frrcu.c @@ -55,7 +55,6 @@ #include "atomlist.h" DEFINE_MTYPE_STATIC(LIB, RCU_THREAD, "RCU thread") -DEFINE_MTYPE_STATIC(LIB, RCU_NEXT, "RCU sequence barrier") DECLARE_ATOMLIST(rcu_heads, struct rcu_head, head) @@ -226,7 +225,7 @@ static void rcu_bump(void) { struct rcu_next *rn; - rn = XMALLOC(MTYPE_RCU_NEXT, sizeof(*rn)); + rn = XMALLOC(MTYPE_RCU_THREAD, sizeof(*rn)); /* note: each RCUA_NEXT item corresponds to exactly one seqno bump. * This means we don't need to communicate which seqno is which @@ -269,7 +268,7 @@ static void rcu_bump(void) * "last item is being deleted - start over" case, and then we may end * up accessing old RCU queue items that are already free'd. */ - rcu_free_internal(MTYPE_RCU_NEXT, rn, head_free); + rcu_free_internal(MTYPE_RCU_THREAD, rn, head_free); /* Only allow the RCU sweeper to run after these 2 items are queued. * |
