summaryrefslogtreecommitdiff
path: root/bgpd/bgp_keepalives.c
AgeCommit message (Collapse)Author
2025-03-06Revert "bgpd: Make keepalive pthread be connection based."Donald Sharp
This reverts commit 23bdaba147517abb4a70d03166ad74f759888d46.
2025-02-28bgpd: Make keepalive pthread be connection based.Donald Sharp
Again instead of making the keepalives be peer based use the connection to make it happen. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-02-28bgpd: Convert bgp_keepalive_send to use a connectionDonald Sharp
The peer is going to eventually have a incoming and outgoing connection. Let's send the data based upon the connection not the peer. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-09-10bgpd: make bgp_keepalives_on|off connection orientedDonald Sharp
The bgp_keepalives_on|off functions should use a peer_connection as a basis for it's operation. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24*: Convert THREAD_XXX macros to EVENT_XXX macrosDonald Sharp
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24*: Convert thread_fetch and thread_call to event_fetch and event_callDonald Sharp
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-21*: Add a hash_clean_and_free() functionDonald Sharp
Add a hash_clean_and_free() function as well as convert the code to use it. This function also takes a double pointer to the hash to set it NULL. Also it cleanly does nothing if the pointer is NULL( as a bunch of code tested for ). Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-02-09*: auto-convert to SPDX License IDsDavid Lamparter
Done with a combination of regex'ing and banging my head against a wall. Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2023-01-17Merge pull request #12641 from samanvithab/bgpd_crashRuss White
bgpd: Fix crash during shutdown due to race condition
2023-01-16bgpd: Fix crash during shutdown due to race conditionSamanvitha B Bhargav
[New LWP 2524] [New LWP 2539] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". Core was generated by `/opt/avi/bin/bgpd -f /run/frr/avi_ns3_bgpd.config -i /opt/avi/etc/avi_ns3_bgpd.'. Program terminated with signal SIGABRT, Aborted. [Current thread is 1 (Thread 0x7f92ac8f1740 (LWP 2524))] 0 0x00007f92acb3800b in raise () from /lib/x86_64-linux-gnu/libc.so.6 [Current thread is 1 (Thread 0x7f92ac8f1740 (LWP 2524))] 0 0x00007f92acb3800b in raise () from /lib/x86_64-linux-gnu/libc.so.6 1 0x00007f92acb17859 in abort () from /lib/x86_64-linux-gnu/libc.so.6 2 0x00007f92acb17729 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 3 0x00007f92acb28fd6 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6 4 0x00007f92accf2164 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0 5 0x000055b46be1ef63 in bgp_keepalives_wake () at bgpd/bgp_keepalives.c:311 6 0x000055b46be1f111 in bgp_keepalives_stop (fpt=0x55b46cfacf20, result=<optimized out>) at bgpd/bgp_keepalives.c:323 7 0x00007f92acea9521 in frr_pthread_stop (fpt=0x55b46cfacf20, result=result@entry=0x0) at lib/frr_pthread.c:176 8 0x00007f92acea9586 in frr_pthread_stop_all () at lib/frr_pthread.c:188 9 0x000055b46bdde54a in bgp_pthreads_finish () at bgpd/bgpd.c:8150 10 0x000055b46bd696ca in bgp_exit (status=0) at bgpd/bgp_main.c:210 11 sigint () at bgpd/bgp_main.c:154 12 0x00007f92acecc1e9 in quagga_sigevent_process () at lib/sigevent.c:105 13 0x00007f92aced689a in thread_fetch (m=m@entry=0x55b46cf23540, fetch=fetch@entry=0x7fff95379238) at lib/thread.c:1487 14 0x00007f92aceb2681 in frr_run (master=0x55b46cf23540) at lib/libfrr.c:1010 15 0x000055b46bd676f4 in main (argc=11, argv=0x7fff953795a8) at bgpd/bgp_main.c:482 Root cause: This is due to race condition between main thread & keepalive thread during clean-up. This happens when the keepalive thread is processing a wake signal owning the mutex, when meanwhile the main thread tries to stop the keepalives thread. In main thread, the keepalive thread’s running bit (fpt->running) is set to false, without taking the mutex & then it blocks on mutex. Meanwhile, keepalive thread which owns the mutex sees that the running bit is false & executes bgp_keepalives_finish() which also frees up mutex. Main thread that is waiting on mutex with pthread_mutex_lock() will cause core while trying to access mutex. Fix: Take the lock in main thread while setting the fpt->running to false. Signed-off-by: Samanvitha B Bhargav <bsamanvitha@vmware.com>
2022-12-05bgpd: Make bgp_keepalives.c not use MTYPE_TMPDonald Sharp
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-20bgpd: Modify keepalive debug categoryvivek
Log keepalive timer expiry against 'debug bgp keepalive' instead of 'debug bgp neighbor-events'. Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>
2022-09-06bgpd: release rcu lock in bgp keepalive pthreadMark Stapp
Don't hold the rcu lock in the bgp keepalive pthread: it blocks the rcu pthread and prevents log-file deletion. Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-07-20*: frr_with_mutex change to follow our standardDonald Sharp
convert: frr_with_mutex(..) to: frr_with_mutex (..) To make all our code agree with what clang-format is going to produce Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-05-11*: Properly use memset() when zeroingDonatas Abraitis
Wrong: memset(&a, 0, sizeof(struct ...)); Good: memset(&a, 0, sizeof(a)); Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-05-03*: remove the checking returned value for hash_get()anlan_cs
Firstly, *keep no change* for `hash_get()` with NULL `alloc_func`. Only focus on cases with non-NULL `alloc_func` of `hash_get()`. Since `hash_get()` with non-NULL `alloc_func` parameter shall not fail, just ignore the returned value of it. The returned value must not be NULL. So in this case, remove the unnecessary checking NULL or not for the returned value and add `void` in front of it. Importantly, also *keep no change* for the two cases with non-NULL `alloc_func` - 1) Use `assert(<returned_data> == <searching_data>)` to ensure it is a created node, not a found node. Refer to `isis_vertex_queue_insert()` of isisd, there are many examples of this case in isid. 2) Use `<returned_data> != <searching_data>` to judge it is a found node, then free <searching_data>. Refer to `aspath_intern()` of bgpd, there are many examples of this case in bgpd. Here, <returned_data> is the returned value from `hash_get()`, and <searching_data> is the data, which is to be put into hash table. Signed-off-by: anlan_cs <vic.lan@pica8.com>
2019-12-02*: generously apply constDavid Lamparter
const const const your boat, merrily down the stream... Signed-off-by: David Lamparter <equinox@diac24.net>
2019-09-16bgpd: do not send keepalives when KA timer is 0Quentin Young
RFC4271 specifies behavior when the hold timer is sent to zero - we should not send keepalives or run a hold timer. But FRR, and other vendors, allow the keepalive timer to be set to zero with a nonzero hold timer. In this case we were sending keepalives constantly and maxing out a pthread to do so. Instead behave similarly to other vendors and do not send keepalives. Unsure what the utility of this is, but blasting keepalives is definitely the wrong thing to do. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2019-09-03lib: add frr_with_mutex() block-wrapperDavid Lamparter
frr_with_mutex(...) { ... } locks and automatically unlocks the listed mutex(es) when the block is exited. This adds a bit of safety against forgetting the unlock in error paths & co. and makes the code a slight bit more readable. Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2019-05-14lib: hashing functions should take const argumentsQuentin Young
It doesn't make much sense for a hash function to modify its argument, so const the hash input. BGP does it in a couple places, those cast away the const. Not great but not any worse than it was. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2019-02-25*: Rename backet to bucketTim Bray
Presume typo from original author Signed-off-by: Tim Bray <tim@kooky.org>
2019-01-24Treewide: use ANSI function definitionsRuben Kerkhof
Signed-off-by: Ruben Kerkhof <ruben@rubenkerkhof.com>
2019-01-09lib, bgpd: Convert frr_pthread_set_name to only cause it to set os name of ↵Donald Sharp
the thread The current invocation of frr_pthread_set_name was causing it reset the os_name. There is no need for this, we now always create the pthread appropriately to have both name and os_name. So convert this function to a simple call through of the pthread call now. Before(any of these changes): sharpd@robot ~/frr1> ps -L -p 16895 PID LWP TTY TIME CMD 16895 16895 ? 00:01:39 bgpd 16895 16896 ? 00:00:54 16895 16897 ? 00:00:07 bgpd_ka After: sharpd@donna ~/frr1> ps -L -p 1752 PID LWP TTY TIME CMD 1752 1752 ? 00:00:00 bgpd 1752 1753 ? 00:00:00 bgpd_io 1752 1754 ? 00:00:00 bgpd_ka Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-10-19*: Replace hash_cmp function return value to a boolDonald Sharp
The ->hash_cmp and linked list ->cmp functions were sometimes being used interchangeably and this really is not a good thing. So let's modify the hash_cmp function pointer to return a boolean and convert everything to use the new syntax. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-09-19lib: remove frr_pthread->idDavid Lamparter
All I can see is an unneccessary complication. If there's some purpose here it needs to be documented... Signed-off-by: David Lamparter <equinox@diac24.net>
2018-08-29*: pthread set name abstractionChirag Shah
Testing Done: TOR#cat /proc/2670/task/2672/comm bgpd_ka TOR# ps H -C bgpd -o 'pid tid cmd comm' PID TID CMD COMMAND 2670 2670 /usr/lib/frr/bgpd -M snmp - bgpd 2670 2671 /usr/lib/frr/bgpd -M snmp - bgpd 2670 2672 /usr/lib/frr/bgpd -M snmp - bgpd_ka Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
2018-08-16bgpd: zlog_warn to assert for code that must be executed firstDonald Sharp
In bgp_keepalives.c, it was noticed that we were ensuring that we called an intialization function first, but this is a development escape in that once this was fixed we never see it. So if a developer moves this assumption around, let's crash the program and lead them to this spot instead of silently ignoring the problem. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-07-27bgpd: add keepalive thread nameChirag Shah
Testing Done: Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
2018-02-21bgpd: fix incorrect keepalive timer evaluationQuentin Young
Incorrect check for sentinel value effectively caused peers to sometimes use the keepalive timer value of other peers, which sometimes led to hold timer expiry. Signed-off-by: Quentin Young <qlyoung@qlyoung.net>
2018-01-24bgpd: check flags before attempting keepalive opsQuentin Young
If a peer already has keepalives turned on when asking to turn them on, return immediately. Same thing for turning them off. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2018-01-24bgpd: update pthreads to use lib changesQuentin Young
Use the new threading facilities provided in lib/ to streamline the threads used in bgpd. In particular, all of the lifecycle code has been removed from the I/O thread and replaced with the default loop. Did not do the same to the keepalives thread as it is much smaller (doesn't need the event system). Also cleaned up some comments to match the style guide. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30bgpd: turn off keepalives when sending NOTIFYQuentin Young
This is necessary because otherwise between the time we wipe the output buffer and the time we push the NOTIFY onto it, the KA generation thread could have pushed a KEEPALIVE in the middle. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30bgpd: rebase onto masterQuentin Young
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30bgpd: fix includes for bgp_keeaplives.cQuentin Young
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30bgpd: restyle bgp_keepalives.[ch]Quentin Young
And update copyright header. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30bgpd: rename peer_keepalives* --> bgp_keepalives*Quentin Young
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30bgpd: implement buffered readsQuentin Young
* Move and modify all network input related code to bgp_io.c * Add a real input buffer to `struct peer` * Move connection initialization to its own thread.c task instead of piggybacking off of bgp_read() * Tons of little fixups Primary changes are in bgp_packet.[ch], bgp_io.[ch], bgp_fsm.[ch]. Changes made elsewhere are almost exclusively refactoring peer->ibuf to peer->curr since peer->ibuf is now the true FIFO packet input buffer while peer->curr represents the packet currently being processed by the main pthread. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30bgpd: use new threading infraQuentin Young
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30bgpd: use hash table for bgp_keepalives.cQuentin Young
Large numbers of peers makes insertion and removal time for a linked list non-negligible. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30bgpd: dynamically allocate synchronization primitivesQuentin Young
Changes all synchronization primitives to be dynamically allocated. This should help catch any subtle errors in pthread lifecycles. This change also pre-initializes synchronization primitives before threads begin to run, eliminating a potential race condition that probably would have caused a segfault on startup on a very fast box. Also changes mutex and condition variable allocations to use MTYPE_PTHREAD and updates tests to do the proper initializations. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30bgpd: remove unused `struct thread` from peerQuentin Young
* Remove t_write * Remove t_keepalive These have been replaced by pthreads and are no longer needed. Since some code looks at these values to determine if the threads are scheduled, also add a new bitfield to store the same information. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30bgpd: put BGP keepalives in a pthreadQuentin Young
This patch, in tandem with moving packet writes into a dedicated kernel thread, fixes session flaps caused by long-running internal operations starving the (old) userspace write thread. BGP keepalives are now produced by a kernel thread and placed onto the peer's output queue. These are then consumed by the write thread. Both of these tasks are concurrent with the rest of bgpd, obviating the session flaps described above. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>