Rafael Zalamena [Thu, 19 Jan 2023 15:16:11 +0000 (12:16 -0300)]
*: introduce function for sequence numbers
Don't directly use `time()` for generating sequence numbers for two
reasons:
1. `time()` can go backwards (due to NTP or time adjustments)
2. Coverity Scan warns every time we truncate a `time_t` variable for
good reason (verify that we are Y2K38 ready).
Rafael Zalamena [Thu, 19 Jan 2023 15:09:29 +0000 (12:09 -0300)]
pimd: fix mtracebis tool warning
Use `getpid()` to initialize the sequence number. This change silences
Coverity Scan warning about truncated use of `time()` which in this case
is not a problem.
bgpd: imported vpn entries get appropriate distance
MPLS VPN networks can either peer with iBGP or eBGP. When
calculating the distance to send to zebra, the imported prefix
is never sent with distance information, even if the vty
command is used under the ipv4 unicast address family:
The expectation is that the incoming prefix has to follow the
distance that is configured, or the distance derived from the peer
relationship established by the parent prefix.
In the case, an iBGP relationship is done, and no distance
configuration is done, the below show is expected:
bgpd: Fix crash during shutdown due to race condition
[New LWP 2524]
[New LWP 2539]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/opt/avi/bin/bgpd -f /run/frr/avi_ns3_bgpd.config -i /opt/avi/etc/avi_ns3_bgpd.'.
Program terminated with signal SIGABRT, Aborted.
[Current thread is 1 (Thread 0x7f92ac8f1740 (LWP 2524))]
0 0x00007f92acb3800b in raise () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7f92ac8f1740 (LWP 2524))]
0 0x00007f92acb3800b in raise () from /lib/x86_64-linux-gnu/libc.so.6
1 0x00007f92acb17859 in abort () from /lib/x86_64-linux-gnu/libc.so.6
2 0x00007f92acb17729 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
3 0x00007f92acb28fd6 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
4 0x00007f92accf2164 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
5 0x000055b46be1ef63 in bgp_keepalives_wake () at bgpd/bgp_keepalives.c:311
6 0x000055b46be1f111 in bgp_keepalives_stop (fpt=0x55b46cfacf20, result=<optimized out>) at bgpd/bgp_keepalives.c:323
7 0x00007f92acea9521 in frr_pthread_stop (fpt=0x55b46cfacf20, result=result@entry=0x0) at lib/frr_pthread.c:176
8 0x00007f92acea9586 in frr_pthread_stop_all () at lib/frr_pthread.c:188
9 0x000055b46bdde54a in bgp_pthreads_finish () at bgpd/bgpd.c:8150
10 0x000055b46bd696ca in bgp_exit (status=0) at bgpd/bgp_main.c:210
11 sigint () at bgpd/bgp_main.c:154
12 0x00007f92acecc1e9 in quagga_sigevent_process () at lib/sigevent.c:105
13 0x00007f92aced689a in thread_fetch (m=m@entry=0x55b46cf23540, fetch=fetch@entry=0x7fff95379238) at lib/thread.c:1487
14 0x00007f92aceb2681 in frr_run (master=0x55b46cf23540) at lib/libfrr.c:1010
15 0x000055b46bd676f4 in main (argc=11, argv=0x7fff953795a8) at bgpd/bgp_main.c:482
Root cause:
This is due to race condition between main thread & keepalive thread during clean-up.
This happens when the keepalive thread is processing a wake signal owning the mutex, when meanwhile the main thread tries to stop the keepalives thread.
In main thread, the keepalive thread’s running bit (fpt->running) is set to false, without taking the mutex & then it blocks on mutex.
Meanwhile, keepalive thread which owns the mutex sees that the running bit is false & executes bgp_keepalives_finish() which also frees up mutex.
Main thread that is waiting on mutex with pthread_mutex_lock() will cause core while trying to access mutex.
Fix:
Take the lock in main thread while setting the fpt->running to false.
Signed-off-by: Samanvitha B Bhargav <bsamanvitha@vmware.com>
Donald Sharp [Tue, 10 Jan 2023 15:29:33 +0000 (10:29 -0500)]
tests: route_scale tests are failing on micronet
I'm seeing test failures after in micronet runs in CI
after 7 seconds * 30 attempts at seeing if it succeeds.
Let's see if another 60 seconds of attempts allows
this to work properly.
bgpd: Do not send routes back received from a peer
Before this patch, we needed to explicitly define a neighbor to be SOLO
(= separate update-group). Let's ease this functionality for an operator to
avoid confusions.
tor-1# show vrf default vni
VRF VNI VxLAN IF
L3-SVI State Rmac
default 100 None
None Down None
tor-1#
tor-1# show vrf all vni
VRF VNI VxLAN IF
L3-SVI State Rmac
default 100 None
None Down None
sym_1 8888 vxlan99
vlan490_l3 Up 44:38:39:ff:ff:25
sym_2 8889 vxlan99
vlan491_l3 Up 44:38:39:ff:ff:25
sym_3 8890 vxlan99
vlan492_l3 Up 44:38:39:ff:ff:25
sym_4 8891 vxlan99
vlan493_l3 Up 44:38:39:ff:ff:25
sym_5 8892 vxlan99
vlan494_l3 Up 44:38:39:ff:ff:25
tor-1#
tor-1#
This change alters the behavior of existing test code. The
default mode (before any call to luSetWaitType()) is now
"strict".
The historical behavior of luCommand(op="wait) is to ignore
failures to match the specified regexp in the specified time.
In those cases, no result was logged and no error was signaled.
This change introduces a new "strict" mode for luCommand(op="wait):
in "strict" wait mode, each invocation of luCommand(op="wait)
generates an explicit, logged failure result when it fails to match
the specified regexp in the specified time. These failures signal
an error for the test.
Calling luSetWaitType("nostrict") restores the historical behavior.
Calling luSetWaitType("strict") (re)enables the new strict behavior.
Individual calls to luCommand() may also specify op="wait-nostrict"
to override any default and use the historical behavior.
Individual calls to luCommand() may also specify op="wait-strict"
to override any default and use the new behavior.
Rafael Zalamena [Wed, 24 Mar 2021 12:41:04 +0000 (09:41 -0300)]
topotests: test BFD static route integration
Test that BFD static monitoring works:
When BFD session is up the routes are installed in the RIB and
distributed with routing protocol (in this case BGP). When the session
is down it is removed from RIB and propagated.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com> Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Rafael Zalamena [Mon, 28 Nov 2022 19:44:37 +0000 (16:44 -0300)]
staticd: fix bug on route reinstallation
When configuring a route with multiple next hops on boot and the
interface is missing the route never shows up (even after interface is
properly learned).
Fix: force route clean up by uninstalling it and putting it back
entirely instead of just updating the missing next hop.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Rafael Zalamena [Tue, 15 Nov 2022 14:22:09 +0000 (11:22 -0300)]
lib: BFD automatic source selection
Implement new BFD library issue to allow protocols to configure BFD
sessions with automatic source selection.
The source selection will be based on the Next Hop Tracking feature:
`zebra` will do RIB lookups to determine the output interface and the
primary source address of that interface will be used as source.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
json support extended for show [ip] bgp vrfs <vrf-name> json
Before:
```
tor-2# show ip bgp vrfs default json
% JSON option not yet supported for specific VRF
tor-2#
tor-2# show bgp vrfs sym_1 json
% JSON option not yet supported for specific VRF
tor-2#
```
Notice the missing pimreg11 device needed in vrf blue:
2023-01-11 17:15:06,731.731 DEBUG: topolog.r1: LinuxNamespace(r1): cmd_status("['/bin/bash', '-c', 'vtysh -c "show int brief" 2>/dev/null']", kwargs: {'encoding': 'utf-8', 'stdout': -1, 'stderr': -2, 'shell': False, 'stdin': None})
2023-01-11 17:15:06,781.781 INFO: topolog.r1: vtysh result:
Interface Status VRF Addresses
--------- ------ --- ---------
blue up blue 192.168.0.1/32
r1-eth0 up blue 192.168.100.1/24
r1-eth1 up blue 192.168.101.1/24
Interface Status VRF Addresses
--------- ------ --- ---------
erspan0 down default
gre0 down default
gretap0 down default
lo up default
pimreg up default
Interface Status VRF Addresses
--------- ------ --- ---------
r1-eth2 up red 192.168.100.1/24
r1-eth3 up red 192.168.101.1/24
red up red 192.168.0.1/32
While on a 5.4 machine we have this:
mininet310# show int brief
Interface Status VRF Addresses
--------- ------ --- ---------
blue up blue
dummy1 up blue
dummy2 up blue
pimreg11 up blue
As such let's limit the test to a 4.19 kernel or above that our
documentations states we need for proper pim operation.
Donald Sharp [Tue, 10 Jan 2023 13:36:50 +0000 (08:36 -0500)]
zebra: Continue fpm_read when we decide a netlink message is not needed
When FRR receives a netlink message that it decides to stop parsing
it returns a 0 ( instead of a -1 ). Just make the dplane continue
reading other data instead of aborting the read.