Brian Rak [Mon, 26 Sep 2022 15:18:26 +0000 (11:18 -0400)]
tools: Configure systemd to always restart FRR, regardless of exit code
The current service file configures restarts on-abnormal, which translates to "unclean signal", "timeout", or "watchdog". This patch updates it to always restart, as there's never really a time watchfrr should exit by itself at all.
When removing VRF ( all routes of this VRF), zebra mistakenly forgot to check
whether its routes are in update queue of FPM. So FPM module will crash during
its dealing with these routes, which are already freed.
Add a new HOOK `rib_shutdown()`, `zebra_rtable_node_cleanup()` will use it
to remove these routes from update queue of FPM module before freeing them.
Donald Sharp [Fri, 23 Sep 2022 19:16:40 +0000 (15:16 -0400)]
doc: Align docs to recommend integrated config
Docs were recommending both integrated and non-integrated
config in different sections. Remove the recommendation
for non-integrated config from vtysh.rst.
Donald Sharp [Mon, 19 Sep 2022 16:34:18 +0000 (12:34 -0400)]
isisd: Fix memory leak on shutdown with prefix lists
==2623619==
==2623619== 6 bytes in 1 blocks are definitely lost in loss record 3 of 75
==2623619== at 0x483877F: malloc (vg_replace_malloc.c:307)
==2623619== by 0x4B55E4A: strdup (strdup.c:42)
==2623619== by 0x493C992: qstrdup (memory.c:128)
==2623619== by 0x1A9212: isis_instance_fast_reroute_level_1_remote_lfa_prefix_list_modify (isis_nb_config.c:1599)
==2623619== by 0x494837C: nb_callback_modify (northbound.c:1083)
==2623619== by 0x4948C6E: nb_callback_configuration (northbound.c:1352)
==2623619== by 0x494919D: nb_transaction_process (northbound.c:1473)
==2623619== by 0x4947DA9: nb_candidate_commit_apply (northbound.c:906)
==2623619== by 0x4947EBA: nb_candidate_commit (northbound.c:938)
==2623619== by 0x494EB9E: nb_cli_classic_commit (northbound_cli.c:64)
==2623619== by 0x494F3DC: nb_cli_apply_changes_internal (northbound_cli.c:250)
==2623619== by 0x494F4E2: nb_cli_apply_changes (northbound_cli.c:268)
==2623619== by 0x1BEF0F: isis_frr_remote_lfa_plist_magic (isis_cli.c:1899)
==2623619== by 0x1B7636: isis_frr_remote_lfa_plist (isis_cli_clippy.c:3406)
==2623619== by 0x48EBA75: cmd_execute_command_real (command.c:997)
==2623619== by 0x48EBD4E: cmd_execute_command_strict (command.c:1108)
==2623619== by 0x48EC1E6: command_config_read_one_line (command.c:1268)
==2623619== by 0x48EC35B: config_from_file (command.c:1313)
==2623619== by 0x4999CC1: vty_read_file (vty.c:2347)
==2623619== by 0x499A4AF: vty_read_config (vty.c:2567)
==2623619== by 0x4924B12: frr_config_read_in (libfrr.c:984)
==2623619== by 0x498F5E3: thread_call (thread.c:2008)
==2623619== by 0x49253DA: frr_run (libfrr.c:1198)
==2623619== by 0x14FC53: main (isis_main.c:273)
Abhishek N R [Mon, 29 Aug 2022 12:17:08 +0000 (05:17 -0700)]
pimd, pim6d: Changing IGMP to GM in few macro's.
Changing
IGMP_DEFAULT_ROBUSTNESS_VARIABLE to GM_DEFAULT_ROBUSTNESS_VARIABLE,
IGMP_GENERAL_QUERY_INTERVAL to GM_GENERAL_QUERY_INTERVAL,
IGMP_QUERY_MAX_RESPONSE_TIME_DSEC to GM_QUERY_MAX_RESPONSE_TIME_DSEC and
IGMP_SPECIFIC_QUERY_MAX_RESPONSE_TIME_DSEC to GM_SPECIFIC_QUERY_MAX_RESPONSE_TIME_DSEC
to accomodate both igmp and mld. And moved it to common file.
When R1 and R2 establish BGP session, R1 begins to send initial updates.
If R2 sends a route-refresh request before EoR, it's silently ignored
by R1, and routes received earlier have no chance to be processed again.
RFC7313 says, "for a BGP speaker that supports the BGP Graceful Restart,
it MUST NOT send a BoRR for an <AFI, SAFI> to a neighbor before it sends
the EoR for the <AFI, SAFI> to the neighbor." But it doesn't forbid
route-refresh request to be sent before receiving EoR.
To handle this scenario, postpone response to refresh request until EoR
is sent.
CID 1519843 (#2 of 2): Uninitialized scalar variable (UNINIT)
43. uninit_use_in_call: Using uninitialized value pkt_src->sin6_addr when calling gm_rx_process
Donald Sharp [Wed, 14 Sep 2022 17:48:31 +0000 (13:48 -0400)]
lib: Fix skip of every other plist deletion
When bulk deleting prefix lists on shutdown the code
was calling plist_delete, which removed the item
from the master->str list, and then popping the next
item on the list and just dropping it on the floor.
The pop is not needed.
Abhishek N R [Wed, 14 Sep 2022 06:29:33 +0000 (23:29 -0700)]
pimd, pim6d: Changing IGMP to GM in debug macros.
Changed PIM_DEBUG_IGMP_TRACE to PIM_DEBUG_GM_TRACE and
PIM_DEBUG_IGMP_TRACE_DETAIL to PIM_DEBUG_GM_TRACE_DETAIL.
Hence, these macros can be used for both v6 and v4.
Issue: #11895
Co-authored-by: Sai Gomathi N <nsaigomathi@vmware.com> Signed-off-by: Abhishek N R <abnr@vmware.com>
bgpd: Fix memory leak for `conf_copy()` - SoO ecommunity
==1179738== 48 (40 direct, 8 indirect) bytes in 1 blocks are definitely lost in loss record 13 of 29
==1179738== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==1179738== by 0x493C8D5: qcalloc (memory.c:116)
==1179738== by 0x208F0C: ecommunity_dup (bgp_ecommunity.c:267)
==1179738== by 0x2B300C: conf_copy (bgp_updgrp.c:170)
==1179738== by 0x2B35BF: peer2_updgrp_copy (bgp_updgrp.c:277)
==1179738== by 0x2B5189: update_group_find (bgp_updgrp.c:826)
==1179738== by 0x2B70D0: update_group_adjust_peer (bgp_updgrp.c:1769)
==1179738== by 0x23DB7D: update_group_adjust_peer_afs (bgp_updgrp.h:519)
==1179738== by 0x243B21: bgp_establish (bgp_fsm.c:2129)
==1179738== by 0x244B94: bgp_event_update (bgp_fsm.c:2597)
==1179738== by 0x26B0E6: bgp_process_packet (bgp_packet.c:2895)
==1179738== by 0x498F5FD: thread_call (thread.c:2008)
==1179738== by 0x49253DA: frr_run (libfrr.c:1198)
==1179738== by 0x1EEC38: main (bgp_main.c:520)
==536197== 400 (160 direct, 240 indirect) bytes in 4 blocks are definitely lost in loss record 19 of 21
==536197== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==536197== by 0x491C753: qcalloc (memory.c:116)
==536197== by 0x303FA9: aspath_dup (bgp_aspath.c:698)
==536197== by 0x304B2A: aspath_replace_specific_asn (bgp_aspath.c:1219)
==536197== by 0x256840: bgp_peer_as_override (bgp_route.c:1781)
==536197== by 0x256840: subgroup_announce_check (bgp_route.c:2216)
==536197== by 0x258345: subgroup_process_announce_selected (bgp_route.c:2804)
==536197== by 0x27F2CA: group_announce_route_walkcb (bgp_updgrp_adv.c:199)
==536197== by 0x4905A51: hash_walk (hash.c:285)
==536197== by 0x27E8D1: update_group_af_walk (bgp_updgrp.c:1866)
==536197== by 0x2809D3: group_announce_route (bgp_updgrp_adv.c:1022)
==536197== by 0x257DC4: bgp_process_main_one (bgp_route.c:3189)
==536197== by 0x257DC4: bgp_process_main_one (bgp_route.c:2975)
==536197== by 0x2581F7: bgp_process_wq (bgp_route.c:3330)
==536197== by 0x4961787: work_queue_run (workqueue.c:285)
==536197== by 0x4957745: thread_call (thread.c:2008)
==536197== by 0x4910B77: frr_run (libfrr.c:1198)
==536197== by 0x1ED6AC: main (bgp_main.c:520)
bgpd: Fix memory leak for `set as-path replace` route-map command
==1174993== 252 (120 direct, 132 indirect) bytes in 3 blocks are definitely lost in loss record 77 of 100
==1174993== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==1174993== by 0x493C8D5: qcalloc (memory.c:116)
==1174993== by 0x378E38: aspath_dup (bgp_aspath.c:698)
==1174993== by 0x2A39E2: route_set_aspath_replace (bgp_routemap.c:2259)
==1174993== by 0x4965C71: route_map_apply_ext (routemap.c:2664)
==1174993== by 0x27BCC8: bgp_input_modifier (bgp_route.c:1657)
==1174993== by 0x281AB9: bgp_update (bgp_route.c:3992)
==1174993== by 0x286368: bgp_nlri_parse_ip (bgp_route.c:5890)
==1174993== by 0x264D20: bgp_nlri_parse (bgp_packet.c:347)
==1174993== by 0x2682FE: bgp_update_receive (bgp_packet.c:1921)
==1174993== by 0x26AA67: bgp_process_packet (bgp_packet.c:2822)
==1174993== by 0x498F5FD: thread_call (thread.c:2008)
==1174993== by 0x49253DA: frr_run (libfrr.c:1198)
==1174993== by 0x1EEC38: main (bgp_main.c:520)
Mark Stapp [Thu, 8 Sep 2022 20:14:36 +0000 (16:14 -0400)]
bgpd: avoid notify race between io and main pthreads
The "bgp_notify_" apis in bgp_packet.c generate a notification
to a peer, usually during error handling. The io pthread wants
to send notifications in a couple of cases during early
received-packet validation - but the existing api interacts
with the peer struct itself, and that's not safe.
Add a new api for use by the io pthread, and adjust the main
notify api so that it can avoid touching the peer struct.
To resolve link dependencies of unordered interfaces, the commit
`520ebf72b27c2462ce8b0dc5a1d4cb83956df69c` has separated assignment of
`zif->link_ifindex` and `zif->link` from `netlink_interface()` during startup.
The fixup stage of `zebra_if_update_all_links()` goes into the last of
`interface_lookup_netlink()`, it can't be executed in the case of error in
above `netlink_parse_info()`s.
`RTM_GETTUNNEL` is not supported in linux kernel until 5.18, so
`netlink_parse_info()` will throw error with the previous versions.
If two conditions are met, (it is a common case)
1. Interfaces are created before frr restart/start
2. Linux kernel version < 5.18
the link dependencies will not be done, then evpn feature will be broken.
IMO we should just ignore this error.
Before it worked only when configured initially via CLI. Later, when we
receive a new route, that should match a decent MED, we just skip it, because
MED mismatch is not recalculated.