From: Donald Sharp
Date: Tue, 1 Mar 2022 14:02:33 +0000 (-0500)
Subject: lib: Fix FreeBSD clock_gettime(CLOCK_THREAD_CPUTIME_ID,..) going backwards
X-Git-Tag: pim6-testing-20220430~269^2
X-Git-Url: https://git.puffer.fish/?a=commitdiff_plain;h=refs%2Fpull%2F10697%2Fhead;p=mirror%2Ffrr.git

lib: Fix FreeBSD clock_gettime(CLOCK_THREAD_CPUTIME_ID,..) going backwards

On FreeBSD I have noticed that subsequent calls to clock_gettime(..)
can return an after time that is earlier than the value returned by
the first call. This in turn generates CPU HOGs because the
subtraction wraps into very large numbers:

2022/02/28 20:12:58 SHARP: [PTDQA-70FG5] start: 35.741981000 now: 35.740581000
2022/02/28 20:12:58 SHARP: [XK9YH-ZD8FA][EC 100663313] CPU HOG: task zclient_read (800744240) ran for 0ms (cpu time 18446744073709550ms)

(Please note I added the first line of debug to figure this issue out.)

I have been asked to open a FreeBSD bug report and have done so.
In the meantime I think it is important that FRR does not generate
bogus CPU HOGs on FreeBSD, especially since this may or may not be
easily fixed and FRR has no control over which version of the
operating system operators are going to be running with FRR.

So, add a bit of specialized code that checks whether the now time
on FreeBSD is before the start time in thread_consumed_time and
does some quick manipulation to avoid this issue.

Signed-off-by: Donald Sharp
---

diff --git a/lib/thread.c b/lib/thread.c
index 6d91ca497b..90074b3d89 100644
--- a/lib/thread.c
+++ b/lib/thread.c
@@ -1884,6 +1884,27 @@ unsigned long thread_consumed_time(RUSAGE_T *now, RUSAGE_T *start,
 				   unsigned long *cputime)
 {
 #ifdef HAVE_CLOCK_THREAD_CPUTIME_ID
+
+#ifdef __FreeBSD__
+	/*
+	 * FreeBSD appears to have an issue when calling clock_gettime
+	 * with CLOCK_THREAD_CPUTIME_ID really close to each other
+	 * occasionally the now time will be before the start time.
+	 * This is not good and FRR is ending up with CPU HOG's
+	 * when the subtraction wraps to very large numbers
+	 *
+	 * What we are going to do here is cheat a little bit
+	 * and notice that this is a problem and just correct
+	 * it so that it is impossible to happen
+	 */
+	if (start->cpu.tv_sec == now->cpu.tv_sec &&
+	    start->cpu.tv_nsec > now->cpu.tv_nsec)
+		now->cpu.tv_nsec = start->cpu.tv_nsec + 1;
+	else if (start->cpu.tv_sec > now->cpu.tv_sec) {
+		now->cpu.tv_sec = start->cpu.tv_sec;
+		now->cpu.tv_nsec = start->cpu.tv_nsec + 1;
+	}
+#endif
 	*cputime = (now->cpu.tv_sec - start->cpu.tv_sec) * TIMER_SECOND_MICRO
 		   + (now->cpu.tv_nsec - start->cpu.tv_nsec) / 1000;
 #else
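
For illustration only (not part of the patch): a minimal standalone sketch
of why the subtraction wraps and how the clamp avoids it. It assumes plain
struct timespec values instead of FRR's RUSAGE_T, and the names
USEC_PER_SEC, elapsed_usec_naive, clamp_now_to_start and
elapsed_usec_clamped are hypothetical stand-ins, not FRR functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for FRR's TIMER_SECOND_MICRO. */
#define USEC_PER_SEC 1000000

/*
 * Elapsed CPU time in microseconds, computed the same way as the
 * unpatched code: a signed difference stored in an unsigned result.
 * If now is earlier than start, the negative value wraps modulo 2^64.
 */
uint64_t elapsed_usec_naive(const struct timespec *start,
			    const struct timespec *now)
{
	assert(start != NULL && now != NULL);

	int64_t usec =
		((int64_t)now->tv_sec - (int64_t)start->tv_sec) * USEC_PER_SEC
		+ ((int64_t)now->tv_nsec - (int64_t)start->tv_nsec) / 1000;

	return (uint64_t)usec;
}

/*
 * Same two branches as the patch: if now is before start, move now to
 * one nanosecond after start so the difference cannot be negative.
 */
void clamp_now_to_start(const struct timespec *start, struct timespec *now)
{
	assert(start != NULL && now != NULL);

	if (start->tv_sec == now->tv_sec && start->tv_nsec > now->tv_nsec)
		now->tv_nsec = start->tv_nsec + 1;
	else if (start->tv_sec > now->tv_sec) {
		now->tv_sec = start->tv_sec;
		now->tv_nsec = start->tv_nsec + 1;
	}
}

/* Clamp a local copy of now, then compute the elapsed time. */
uint64_t elapsed_usec_clamped(const struct timespec *start,
			      struct timespec now)
{
	clamp_now_to_start(start, &now);
	return elapsed_usec_naive(start, &now);
}
```

With the values from the debug line above (start 35.741981000, now
35.740581000), the signed difference is -1400 usec. Converted to an
unsigned 64-bit value and divided by 1000, that gives the
18446744073709550ms reported in the CPU HOG message. After the clamp,
the same inputs differ by 1 nsec, which truncates to 0 usec.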