Donald Sharp [Tue, 19 Jul 2022 17:57:56 +0000 (13:57 -0400)]
zebra: Add some more data to rtadv socket failures
The creation of the rtadv socket can fail but there
is very very little data associated with this event
to let the operator know something has gone terribly
wrong.
Please note if this socket fails to create or fails
the setsockopt's rtadv is basically just really really
messed up. I am not sure what can be done here.
Donald Sharp [Fri, 8 Jul 2022 20:35:51 +0000 (16:35 -0400)]
tests: Make bgp_snmp_mplsl3vpn be more forgiving
I rarely get this failure:
@classname: bgp_snmp_mplsl3vpn.test_bgp_snmp_mplsvpn
@name: test_pe1_converge_evpn
@time: 44.875
@message: AssertionError: BGP SNMP does not seem to be running
assert False
+ where False = <bound method SnmpTester.test_oid of <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>>('bgpVersion', '10')
+ where <bound method SnmpTester.test_oid of <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>> = <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>.test_oid
"Wait for protocol convergence"
tgen = get_topogen()
assertmsg = "BGP SNMP does not seem to be running"
> assert r1_snmp.test_oid("bgpVersion", "10"), assertmsg
E AssertionError: BGP SNMP does not seem to be running
E assert False
E + where False = <bound method SnmpTester.test_oid of <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>>('bgpVersion', '10')
E + where <bound method SnmpTester.test_oid of <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>> = <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>.test_oid
Under heavy system load a quick test before BGP can fully come up can result in a failed
test. Add some extra time for snmp to come up properly.
currently "show bgp summary" and "sho bgp summary wide" commands
provide a description string until a whitespace is occuring this
respectively with size limits of 20 and 60 chars
now theses two commands are providing strings with all
characters until the last witespace before size limit
zebra: Cleanup the memory from the hash for MPLS stuff
==1595641== 280 (80 direct, 200 indirect) bytes in 1 blocks are definitely lost in loss record 30 of 38
==1595641== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==1595641== by 0x493C89C: qcalloc (memory.c:116)
==1595641== by 0x1E8426: lsp_alloc (zebra_mpls.c:1116)
==1595641== by 0x49147F1: hash_get (hash.c:162)
==1595641== by 0x1EC880: mpls_lsp_install (zebra_mpls.c:3192)
==1595641== by 0x1C51BB: zread_vrf_label (zapi_msg.c:3197)
==1595641== by 0x1C6F11: zserv_handle_commands (zapi_msg.c:3863)
==1595641== by 0x24D0F4: zserv_process_messages (zserv.c:523)
==1595641== by 0x498F4CC: thread_call (thread.c:2002)
==1595641== by 0x49253A2: frr_run (libfrr.c:1198)
==1595641== by 0x1A28BA: main (main.c:475)
==1595641==
==1595641== 1,400 (400 direct, 1,000 indirect) bytes in 5 blocks are definitely lost in loss record 35 of 38
==1595641== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==1595641== by 0x493C89C: qcalloc (memory.c:116)
==1595641== by 0x1E8426: lsp_alloc (zebra_mpls.c:1116)
==1595641== by 0x49147F1: hash_get (hash.c:162)
==1595641== by 0x1EBD7C: mpls_zapi_labels_process (zebra_mpls.c:2915)
==1595641== by 0x1C35D9: zread_mpls_labels_add (zapi_msg.c:2513)
==1595641== by 0x1C6F11: zserv_handle_commands (zapi_msg.c:3863)
==1595641== by 0x24D0F4: zserv_process_messages (zserv.c:523)
==1595641== by 0x498F4CC: thread_call (thread.c:2002)
==1595641== by 0x49253A2: frr_run (libfrr.c:1198)
==1595641== by 0x1A28BA: main (main.c:475)
bgpd: Add constants for some repetitive CLI strings
"Address Family\n"
"Address Family modifier\n"
Before:
```
donatas-laptop(config-router)# address-family ipv4
<cr>
flowspec Address Family Modifier
labeled-unicast Address Family modifier
multicast Address Family modifier
unicast Address Family Modifier
vpn Address Family modifier
```
After:
```
donatas-laptop(config-router)# address-family
ipv4 Address Family
ipv6 Address Family
l2vpn Address Family
donatas-laptop(config-router)# address-family ipv4
<cr>
flowspec Address Family modifier
labeled-unicast Address Family modifier
multicast Address Family modifier
unicast Address Family modifier
vpn Address Family modifier
```
ldpd: Check if the thread is scheduled before calling for remained time
LDPD crashes when hold time is configured to 65535:
(gdb) bt
0 0x00007f8c3fc224bb in raise () from /lib64/libpthread.so.0
1 0x00007f8c4138a3dd in core_handler () from /lib64/libfrr.so.0
2 <signal handler called>
3 0x00007f8c3fc1ccc0 in pthread_mutex_lock () from /lib64/libpthread.so.0
4 0x00007f8c4139914b in thread_timer_remain_msec () from /lib64/libfrr.so.0
5 0x00007f8c41399209 in thread_timer_remain_second () from /lib64/libfrr.so.0
6 0x000000000040eb19 in adj_to_ctl ()
7 0x0000000000427b38 in ldpe_nbr_ctl ()
8 0x000000000042fd68 in control_dispatch_imsg ()
9 0x00007f8c4139a628 in thread_call () from /lib64/libfrr.so.0
10 0x00000000004265fc in ldpe ()
11 0x000000000040a68f in main ()
==395247== 8,268 (32 direct, 8,236 indirect) bytes in 1 blocks are definitely lost in loss record 199 of 205
==395247== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==395247== by 0x492EB8E: qcalloc (in /usr/local/lib/libfrr.so.0.0.0)
==395247== by 0x490BB12: hash_get (in /usr/local/lib/libfrr.so.0.0.0)
==395247== by 0x1FBF63: community_intern (in /usr/lib/frr/bgpd)
==395247== by 0x1FC0C5: community_parse (in /usr/lib/frr/bgpd)
==395247== by 0x1F0B66: bgp_attr_community (in /usr/lib/frr/bgpd)
==395247== by 0x1F4185: bgp_attr_parse (in /usr/lib/frr/bgpd)
==395247== by 0x26BC29: bgp_update_receive (in /usr/lib/frr/bgpd)
==395247== by 0x26E887: bgp_process_packet (in /usr/lib/frr/bgpd)
==395247== by 0x4985380: thread_call (in /usr/local/lib/libfrr.so.0.0.0)
==395247== by 0x491D521: frr_run (in /usr/local/lib/libfrr.so.0.0.0)
==395247== by 0x1EBEE8: main (in /usr/lib/frr/bgpd)
==361630== 24,780 (96 direct, 24,684 indirect) bytes in 3 blocks are definitely lost in loss record 94 of 97
==361630== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==361630== by 0x492EB8E: qcalloc (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x490BB12: hash_get (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x1FD3CC: bgp_ca_alias_insert (in /usr/lib/frr/bgpd)
==361630== by 0x2CF8E5: bgp_community_alias_magic (in /usr/lib/frr/bgpd)
==361630== by 0x2C980B: bgp_community_alias (in /usr/lib/frr/bgpd)
==361630== by 0x48E3556: cmd_execute_command_real (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x48E384B: cmd_execute_command_strict (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x48E3D41: command_config_read_one_line (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x48E3EBA: config_from_file (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x499065C: vty_read_file (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x4990FF4: vty_read_config (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x491CB95: frr_config_read_in (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x4985380: thread_call (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x491D521: frr_run (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x1EBEE8: main (in /usr/lib/frr/bgpd)
==361630==
==361630== 24,780 (96 direct, 24,684 indirect) bytes in 3 blocks are definitely lost in loss record 95 of 97
==361630== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==361630== by 0x492EB8E: qcalloc (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x490BB12: hash_get (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x1FD39C: bgp_ca_community_insert (in /usr/lib/frr/bgpd)
==361630== by 0x2CF8F4: bgp_community_alias_magic (in /usr/lib/frr/bgpd)
==361630== by 0x2C980B: bgp_community_alias (in /usr/lib/frr/bgpd)
==361630== by 0x48E3556: cmd_execute_command_real (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x48E384B: cmd_execute_command_strict (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x48E3D41: command_config_read_one_line (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x48E3EBA: config_from_file (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x499065C: vty_read_file (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x4990FF4: vty_read_config (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x491CB95: frr_config_read_in (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x4985380: thread_call (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x491D521: frr_run (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x1EBEE8: main (in /usr/lib/frr/bgpd)
Christian Hopps [Thu, 14 Jul 2022 15:33:39 +0000 (11:33 -0400)]
tests: check memleaks end of module and ignore daemonizing parent
- ignore parent from daemonize valgrind files these allocations will be
checked in the child.
- check for memleaks at end of module/file not just after tests.
pimd,pim6d: Set RP to true if the address matches, ignore prefix-length
The API pim_rp_check_interface_addrs checks if the RP address matches
with the primary address then it returns true.
In case of PIMv4 this condition is true, therefore the router becomes RP.
But in case of PIMv6, this condition does not pass because primary address
for PIMv6 is link-local address.
Also PIMv4 allows secondary addresses to be used as RP
if it is a host route in case primary does not match.
Fixing it by only checking the configured
RP address with the interface address and ignoring the prefix
length since it does not matter.
In several places, we are getting the vrf structure using
vrf_lookup_by_name(). Again we are passing vrf->vrf_id to
pim_get_pim_instance() to get the pim_instance.
The API pim_get_pim_instance() again get the VRF structure using
vrf_lookup_by_id(). This is avoided in this PR.
Louis Scalbert [Fri, 8 Jul 2022 10:37:57 +0000 (12:37 +0200)]
bgpd: rename "struct bgp" variables in mplsvpn
The "struct bgp" variable names in the mplsvpn bgp code do not
explicitly say whether they refer to a source or destination BGP
instance. Some variable declarations are commented out with "from" and
"to" but this does not avoid confusion within the functions. The names
of "struct bgp" variables are reused in different functions but their
names sometimes refer to a source instance and sometimes to a
destination instance.
Rename the "struct bgp" variable names to from_bgp and to_bgp.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
tools: Do not wrap the pidfile into double-quotes for frrcommon.sh
The problem is that when we run watchfrr.sh/frrinit.sh, we get something like:
```
cat: '"/var/run/frr/staticd.pid"': No such file or directory
cat: '"/var/run/frr/babeld.pid"': No such file or directory
cat: '"/var/run/frr/zebra.pid"': No such file or directory
```
bgpd: Free ->raw_data from Hard Notification message after we use it
==175785== 0 bytes in 1 blocks are definitely lost in loss record 1 of 88
==175785== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==175785== by 0x492EB8E: qcalloc (in /usr/local/lib/libfrr.so.0.0.0)
==175785== by 0x269823: bgp_notify_decapsulate_hard_reset (in /usr/lib/frr/bgpd)
==175785== by 0x26C85D: bgp_notify_receive (in /usr/lib/frr/bgpd)
==175785== by 0x26E94E: bgp_process_packet (in /usr/lib/frr/bgpd)
==175785== by 0x4985349: thread_call (in /usr/local/lib/libfrr.so.0.0.0)
==175785== by 0x491D521: frr_run (in /usr/local/lib/libfrr.so.0.0.0)
==175785== by 0x1EBEE8: main (in /usr/lib/frr/bgpd)
==175785==
pimd: Correct the order of show json for interface traffic
"show ip pim interface traffic json" shows pruneTx first and then
pruneRx stats
where as
"show ip pim interface <ifname> json" shows pruneRx first and then
pruneTx stats.
Although the values are right but the display looks odd.
Making it same as other stats, first display Rx and then Tx.
API to verify static route was checking whether
router is installed with expected nexthop. In
this particular scenario we has same route as
Connected and Static both. In heavy loaded
system static routes was taking time to get
installed and API was doing the verification
on Connected route instead Static route.
Enhanced scripts to check only static routes.