Donald Sharp [Wed, 12 Oct 2022 18:53:21 +0000 (14:53 -0400)]
bgpd: Allow `network XXX` to work with bgp suppress-fib-pending
When bgp is using `bgp suppress-fib-pending` and the end
operator is using network statements, bgp was not sending
the networked prefixes to its peers. Fix this.
Also update the test cases for bgp_suppress_fib to test
this new corner case (I am sure that there are going to
be others that will need to be added).
Donald Sharp [Tue, 11 Oct 2022 17:21:03 +0000 (13:21 -0400)]
lib: Free some memory in scripting subsystem at shutdown
Pre:
staticd: showing active allocations in memory group libfrr
staticd: memstats: Scripting : 16 * (variably sized)
staticd: memstats: Hash : 2 * (variably sized)
staticd: memstats: Hash Bucket : 8 * 32
staticd: memstats: Hash Index : 1 * (variably sized)
staticd: memstats: Link List : 1 * 40
staticd: memstats: Link Node : 1 * 24
staticd: showing active allocations in memory group logging subsystem
staticd: memstats: log file target : 1 * 88
staticd: showing active allocations in memory group staticd
Post:
staticd: showing active allocations in memory group libfrr
staticd: showing active allocations in memory group logging subsystem
staticd: memstats: log file target : 1 * 88
staticd: showing active allocations in memory group staticd
Donatas Abraitis [Tue, 11 Oct 2022 19:36:26 +0000 (22:36 +0300)]
tools: Handle sequence numbers for BGP community (large/ext) in frr-reload.py
If we add or modify community, large-community, or extcommunity lists without
sequence numbers and then run frr-reload.py, the rules with sequence numbers
(show running-config always adds sequence numbers) are deleted and new ones are re-added.
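A minimal sketch of the comparison problem, assuming the usual `bgp community-list standard NAME [seq N] permit ...` line shape. The helpers `strip_seq` and `same_rule` are hypothetical and are not taken from frr-reload.py; they only illustrate why a line typed without a sequence number and the same line echoed back with one should be treated as the same rule:

```python
import re

# Hypothetical helper, NOT the actual frr-reload.py implementation.
# Matches the leading part of a community-list line up to and including
# an optional "seq N" token.
_SEQ_RE = re.compile(
    r"^(bgp (?:community-list|large-community-list|extcommunity-list)"
    r" (?:standard|expanded) \S+) seq \d+ "
)


def strip_seq(line: str) -> str:
    """Return the line with any 'seq N' token removed."""
    return _SEQ_RE.sub(r"\1 ", line.strip())


def same_rule(a: str, b: str) -> bool:
    """Two lines describe the same rule if they match once 'seq N' is ignored."""
    return strip_seq(a) == strip_seq(b)


configured = "bgp community-list standard FOO permit 65001:100"
running = "bgp community-list standard FOO seq 5 permit 65001:100"
print(same_rule(configured, running))
```

With a comparison like this, a reload tool would not need to delete and re-add a rule merely because one side carries a sequence number.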
Donald Sharp [Wed, 20 Jul 2022 20:43:17 +0000 (16:43 -0400)]
ospfclient: Ensure ospf_apiclient_lsa_originate cannot accidentally write into stack
Even though OSPF_MAX_LSA_SIZE is quite large and holds the upper bound
on what can be written into an LSA, let's add a small check to ensure
it is not possible to write past the end of the buffer.
This wins one of the long standing bug awards. 2003!
Donald Sharp [Fri, 30 Sep 2022 12:57:43 +0000 (08:57 -0400)]
bgpd: Ensure FRR has enough data to read 2 bytes in bgp_open_option_parse
In bgp_open_option_parse the code is checking that the
stream has at least 2 bytes to read (the opt_type and
the opt_length). However, if BGP_OPEN_EXT_OPT_PARAMS_CAPABLE(peer)
is configured, then FRR is reading 3 bytes, which is not safe
since the packet could be badly formatted. Ensure that
FRR has the appropriate data length to read the data.
Donald Sharp [Fri, 30 Sep 2022 12:51:45 +0000 (08:51 -0400)]
bgpd: Ensure FRR has enough data to read 2 bytes in peek_for_as4_capability
In peek_for_as4_capability the code is checking that the
stream has at least 2 bytes to read (the opt_type and the
opt_length). However, if BGP_OPEN_EXT_OPT_PARAMS_CAPABLE(peer)
is configured, then FRR is reading 3 bytes, which is not safe
since the packet could be badly formatted. Ensure that
FRR has the appropriate data length to read the data.
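The length check described in the two commits above can be sketched as follows. This is an illustration in Python rather than the bgpd C code; `read_opt_header` and its parameters are hypothetical names, and the layout assumed is a 1-byte type followed by a 1-byte length normally, or a 2-byte length when extended optional parameters are in use:

```python
import struct
from typing import Optional, Tuple


def read_opt_header(
    buf: bytes, offset: int, ext_len: bool
) -> Optional[Tuple[int, int, int]]:
    """Read (opt_type, opt_len, next_offset), or None if the buffer is too short.

    Assumption: the header is 2 bytes normally and 3 bytes when the
    extended length format is in use.
    """
    need = 3 if ext_len else 2
    # Check the remaining length against what will actually be read,
    # not against a fixed 2 bytes.
    if len(buf) - offset < need:
        return None
    opt_type = buf[offset]
    if ext_len:
        (opt_len,) = struct.unpack_from("!H", buf, offset + 1)
    else:
        opt_len = buf[offset + 1]
    return opt_type, opt_len, offset + need


# A 2-byte buffer is enough for the normal format but not the extended one.
print(read_opt_header(b"\x02\x06", 0, ext_len=False))
print(read_opt_header(b"\x02\x06", 0, ext_len=True))
```

The point of the sketch is that the required byte count depends on the format being parsed, so the guard has to be computed from the same condition that selects the read.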
Quentin Young [Fri, 11 Jun 2021 23:01:42 +0000 (19:01 -0400)]
bgpd: don't adv conditionally withdrawn routes
If we have conditional advertisement enabled, and conditionally withdrew
some prefixes, and then we do a 'clear bgp', those routes were getting
advertised again, and then withdrawn the next time the conditional
advertisement scanner executed.
When we go to advertise, check the prefix against the conditional
advertisement status so we don't do that.
Quentin Young [Tue, 15 Jun 2021 23:49:19 +0000 (19:49 -0400)]
bgpd: apply cond-adv policy to update group
The new outbound filter to apply conditional advertisement policy was
not working properly due to complications with update groups. The two
routemaps were properly copied into the update group peer filter but not
the conditional advertisement state.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
pimd: fix static mroute to also take into account the input interface
Allow the same group/source route to be configured on more than one interface.
Currently FRR doesn't allow adding the same mroute on different input interfaces.
Current behavior, if we have the following config:
```
interface eth1
ip mroute eth0 239.0.0.1
interface eth2
ip mroute eth0 239.0.0.1
```
Only one multicast route will be installed, with its input interface set to the
last interface configured.
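A minimal sketch of the idea, assuming the fix amounts to including the input interface in the lookup key. The `StaticMroutes` class and its method names are hypothetical and do not mirror pimd's data structures:

```python
from typing import Dict, Set, Tuple

# Key includes the input interface, so the same (source, group) pair can
# exist once per input interface. "0.0.0.0" stands in for "any source".
MrouteKey = Tuple[str, str, str]  # (input interface, source, group)


class StaticMroutes:
    """Hypothetical static mroute table keyed by (iif, source, group)."""

    def __init__(self) -> None:
        self._routes: Dict[MrouteKey, Set[str]] = {}

    def add(self, iif: str, oif: str, group: str, source: str = "0.0.0.0") -> None:
        # Each key maps to the set of output interfaces for that route.
        self._routes.setdefault((iif, source, group), set()).add(oif)

    def lookup(self, iif: str, group: str, source: str = "0.0.0.0") -> Set[str]:
        return set(self._routes.get((iif, source, group), set()))

    def __len__(self) -> int:
        return len(self._routes)


table = StaticMroutes()
table.add(iif="eth1", oif="eth0", group="239.0.0.1")
table.add(iif="eth2", oif="eth0", group="239.0.0.1")
print(len(table))
```

If the key were only (source, group), the second `add` would land on the same entry as the first, which matches the behavior described above.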
Trey Aspelund [Thu, 4 Aug 2022 01:43:31 +0000 (01:43 +0000)]
bgpd: fix show bgp l2vpn evpn route rd crashes
bgpd was crashing every time `show bgp l2vpn evpn route rd` was issued
with an RD that didn't match "all". This was introduced by 9b01d289883
which changed how argv_find() is handled in various vtysh commands, but
the new changes forgot a "!". So let's re-add the "!".
Before:
```
ub20# show bgp l2vpn evpn route rd 399672:100
vtysh: error reading from bgpd: Resource temporarily unavailable (11)Warning: closing connection to bgpd because of an I/O error!
ub20#
ub20# show bgp l2vpn evpn route rd 399672:100 mac 11:11:11:11:11:11
vtysh: error reading from bgpd: Resource temporarily unavailable (11)Warning: closing connection to bgpd because of an I/O error!
ub20#
```
anlan_cs [Mon, 1 Aug 2022 07:30:07 +0000 (03:30 -0400)]
zebra: fix bond down for evpn-mh
The test case uses `redirect-off` in an EVPN multi-homing environment:
```
evpn mh redirect-off
```
After the environment is set up, do the following steps:
1) Let one member of the ES learn one MAC:
```
2e:52:bb:bb:2f:46 dev ae1 vlan 100 master bridge0 static
```
Now everything is ok and the mac can be synced to other ES peers.
2) Shut down bond1. At this point, zebra will get three netlink messages,
not one as the current code expects:
```
e4:f0:04:89:b6:46 dev vxlan10030 vlan 30 master bridge0 static <-A
e4:f0:04:89:b6:46 dev vxlan10030 nhid 536870913 self extern_learn <-B
e4:f0:04:89:b6:46 dev vxlan10030 vlan 30 self <-C
```
With A), zebra will wrongly remove this mac again:
```
ZEBRA: dpAdd remote MAC e4:f0:04:89:b6:46 VID 30
ZEBRA: Add/update remote MAC e4:f0:04:89:b6:46 intf vxlan10030(26) VNI 10030 flags 0xa01 - del local
ZEBRA: Send MACIP Del f None MAC e4:f0:04:89:b6:46 IP (null) seq 0 L2-VNI 10030 ESI - to bgp
```
With C), zebra will wrongly add this mac again:
```
ZEBRA: Rx RTM_NEWNEIGH AF_BRIDGE IF 26 VLAN 30 st 0x2 fl 0x2 MAC e4:f0:04:89:b6:46 nhg 0
ZEBRA: dpAdd remote MAC e4:f0:04:89:b6:46 VID 30
```
zebra should skip the two messages that carry a `vid`. Otherwise, it will send many
*wrong* messages to bgpd.
The 2nd message, which has no `vid`, carries `nhg/dst` and is the one that should reach
`zebra_evpn_add_update_local_mac()`. But because it has no `vid`, the lookup fails with a
"could not find EVPN" warning and `zebra_evpn_add_update_local_mac()` is never called:
With B):
```
ZEBRA: Rx RTM_NEWNEIGH AF_BRIDGE IF 26 st 0x2 fl 0x12 MAC e4:f0:04:89:b6:46 nhg 536870913
ZEBRA: dpAdd local-nw-MAC e4:f0:04:89:b6:46 VID 0
ZEBRA: Add/Update MAC e4:f0:04:89:b6:46 intf ae1(18) VID 0, could not find EVPN
```
Here, we can get the `vid` from the vxlan interface instead of from the netlink message.
In summary, `zebra_vxlan_dp_network_mac_add()` processes the three messages
wrongly because it expects only one message. Just skip the two
unneeded messages that carry a `vid`.
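The filtering described above can be sketched like this. It is an illustration only: the dictionaries stand in for the three netlink messages A, B and C, and `should_process`, `resolve_vid` and the `ACCESS_VLAN` table are hypothetical names, not zebra code:

```python
from typing import Dict, List, Optional

# Hypothetical mapping from vxlan device to its access VLAN, standing in
# for "get the vid from the vxlan interface".
ACCESS_VLAN: Dict[str, int] = {"vxlan10030": 30}


def should_process(msg: dict) -> bool:
    """Keep only the message that carries no VLAN id (message B)."""
    return msg.get("vlan") is None


def resolve_vid(msg: dict) -> Optional[int]:
    """Take the vid from the device, since the kept message has none."""
    return ACCESS_VLAN.get(msg["dev"])


messages: List[dict] = [
    {"tag": "A", "mac": "e4:f0:04:89:b6:46", "dev": "vxlan10030", "vlan": 30, "nhid": None},
    {"tag": "B", "mac": "e4:f0:04:89:b6:46", "dev": "vxlan10030", "vlan": None, "nhid": 536870913},
    {"tag": "C", "mac": "e4:f0:04:89:b6:46", "dev": "vxlan10030", "vlan": 30, "nhid": None},
]

for msg in messages:
    if not should_process(msg):
        continue  # skip A and C, which carry a vid
    print(msg["tag"], resolve_vid(msg), msg["nhid"])
```

Only message B survives the filter, and its missing `vid` is filled in from the device rather than from the message itself.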
The ZAPI ZEBRA_RULE_ADD message was modified, but
the bgp version was not updated appropriately, and
when zebra received the message it did not properly
read it.
Donald Sharp [Thu, 21 Jul 2022 19:42:51 +0000 (15:42 -0400)]
bgpd: LL peers need bnc's per peer
FRR should create a bnc per peer, not have
ones that write over others. Currently when
FRR has multiple interface-based peerings, BGP was
creating a single BNC. This is insufficient in that
we were accidentally overwriting the one LL with other
data. This causes issues when there are multiple such peers and
there are odd startup issues with the interfaces
that you are peering over.
zebra: Cleanup the memory from the hash for MPLS stuff
==1595641== 280 (80 direct, 200 indirect) bytes in 1 blocks are definitely lost in loss record 30 of 38
==1595641== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==1595641== by 0x493C89C: qcalloc (memory.c:116)
==1595641== by 0x1E8426: lsp_alloc (zebra_mpls.c:1116)
==1595641== by 0x49147F1: hash_get (hash.c:162)
==1595641== by 0x1EC880: mpls_lsp_install (zebra_mpls.c:3192)
==1595641== by 0x1C51BB: zread_vrf_label (zapi_msg.c:3197)
==1595641== by 0x1C6F11: zserv_handle_commands (zapi_msg.c:3863)
==1595641== by 0x24D0F4: zserv_process_messages (zserv.c:523)
==1595641== by 0x498F4CC: thread_call (thread.c:2002)
==1595641== by 0x49253A2: frr_run (libfrr.c:1198)
==1595641== by 0x1A28BA: main (main.c:475)
==1595641==
==1595641== 1,400 (400 direct, 1,000 indirect) bytes in 5 blocks are definitely lost in loss record 35 of 38
==1595641== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==1595641== by 0x493C89C: qcalloc (memory.c:116)
==1595641== by 0x1E8426: lsp_alloc (zebra_mpls.c:1116)
==1595641== by 0x49147F1: hash_get (hash.c:162)
==1595641== by 0x1EBD7C: mpls_zapi_labels_process (zebra_mpls.c:2915)
==1595641== by 0x1C35D9: zread_mpls_labels_add (zapi_msg.c:2513)
==1595641== by 0x1C6F11: zserv_handle_commands (zapi_msg.c:3863)
==1595641== by 0x24D0F4: zserv_process_messages (zserv.c:523)
==1595641== by 0x498F4CC: thread_call (thread.c:2002)
==1595641== by 0x49253A2: frr_run (libfrr.c:1198)
==1595641== by 0x1A28BA: main (main.c:475)
ldpd: Check if the thread is scheduled before calling for remaining time
LDPD crashes when hold time is configured to 65535:
(gdb) bt
0 0x00007f8c3fc224bb in raise () from /lib64/libpthread.so.0
1 0x00007f8c4138a3dd in core_handler () from /lib64/libfrr.so.0
2 <signal handler called>
3 0x00007f8c3fc1ccc0 in pthread_mutex_lock () from /lib64/libpthread.so.0
4 0x00007f8c4139914b in thread_timer_remain_msec () from /lib64/libfrr.so.0
5 0x00007f8c41399209 in thread_timer_remain_second () from /lib64/libfrr.so.0
6 0x000000000040eb19 in adj_to_ctl ()
7 0x0000000000427b38 in ldpe_nbr_ctl ()
8 0x000000000042fd68 in control_dispatch_imsg ()
9 0x00007f8c4139a628 in thread_call () from /lib64/libfrr.so.0
10 0x00000000004265fc in ldpe ()
11 0x000000000040a68f in main ()
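The guard named in the ldpd commit title can be sketched as below, assuming the crash comes from asking an unscheduled timer for its remaining time. `timer_remaining_seconds` is a hypothetical function, and reporting 0 for an unscheduled timer is an assumption of this sketch, not necessarily what ldpd reports:

```python
import time
from typing import Optional


def timer_remaining_seconds(
    deadline: Optional[float], now: Optional[float] = None
) -> int:
    """Seconds left until `deadline`, or 0 if no timer is scheduled.

    `deadline` is None when the timer was never armed; the caller must
    not dereference it in that case.
    """
    if deadline is None:
        # Assumption: report 0 when there is nothing scheduled.
        return 0
    if now is None:
        now = time.monotonic()
    return max(0, int(deadline - now))


print(timer_remaining_seconds(None))
print(timer_remaining_seconds(110.0, now=100.0))
```

The check sits in front of the time computation, so the unscheduled case never reaches code that expects a live timer.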
**General**
- Add camelCase JSON keys in addition to PascalCase (wrong JSON keys will be deprecated)
- Fix corruption when route-map delete/add sequence happens (fast re-add)
- Reworked gRPC
- RFC5424 & journald extended syslog target
**bfdd**
- Fix broken FSM in active/passive modes
**bgpd**
- Notification Message Support for BGP Graceful Restart (rfc8538)
- BGP Cease Notification Subcode For BFD
- Send Hold Timer for BGP (own implementation without an additional knob)
- New `set as-path replace` command for BGP route-map
- New `match peer` command for BGP route-map
- New `ead-es-frag evi-limit` command for EVPN
- New `match evpn route-type` command for EVPN route-map to match Type-1/Type-4
- JSON outputs for all RPKI show commands
- Set attributes via route-map for BGP conditional advertisements
- Pass non-transitive extended communities between RS and RS-clients
- Send MED attribute when aggregate prefix is created
- Require librtr >= 0.8.0 for RPKI to fix connection handling (failover)
- Fix aspath memory leak in aggr_suppress_map_test
- Fix crash for `show ip bgp vrf all all neighbors 192.168.0.1 ...`
- Fix crash for `show ip bgp vrf all all`
- Fix memory leak for BGP Community Alias in CLI
- Fix memory leak when setting BGP community at egress
- Fix memory leak when setting BGP large-community at egress
- Fix SR color nexthop processing in BGP
- Fix setting local-preference in route-map using +/-
- Fix crash using Lua and route-map to set attributes via scripts
- Fix crash when issuing various forms of `bgp no-rib`
**isisd**
- JSON output for show summary command
- Fix crash when MTU mismatch occurs
- Fix crash with xfrm interface type
- Fix infinite loop when parsing LSPs
- Fix router capability TLV parsing issues
**vtysh**
- New `show thread timers` command
**ospf6d**
- Add LSA statistics to LSA database
- Add LSA stats to `show area json` output
- Show time left in hello timer for `show ipv6 ospf6 int`
- Permit route deletion without nexthops
- Restart SPF when distance is updated
- Stop refreshing Type-5 from NSSA
- Support keychain for ospf6 authentication
**ospfd**
- New `show ip ospf reachable-routers` command
- Restart SPF when distance is updated
- Use consistent JSON keys for `show ip ospf neighbor` and detail version
**pimd**
- Add additional IGMP stats
- Add IGMP join sent/failed statistics
- Add IGMP total groups and total source groups to statistics
- New `debug igmp trace detail` command
- New `ip pim passive` command
- JSON support added for command `show ip igmp sources`
- Allow the LPM match to work properly with prefix lists and normal RPs
- Do not allow 224.0.0.0/24 range in IGMP join
- Fix IGMP packet/query check
- Handle PIM join/prune receive flow for IPv6
- Handle receive of (*,G) register stop with source address as 0
- Handle exclude mode IGMPv3 report messages for SSM-aware group
- Handle IGMPv2 report messages for SSM-aware group range
- Send immediate join with possible sg rpt prune bit set
- Show group-type under `show ip pim rp-info`
- Show total received messages IGMP stats
**staticd**
- Capture zebra's advertised ECMP limit
- Don't register existing nexthop to Zebra
- Reject route config with too many nexthops
- Track nexthops per-safi
**watchfrr**
- Add some more information to `show watchfrr`
- Send operational state to systemd
**zebra**
- Add ability to know when FRR is not ASIC offloaded
- Add command for setting protodown bit
- Add dplane type for netconf data
- Add ECMP supported to `show zebra`
- Add EVPN status to `show zebra`
- Add if v4/v6 forwarding is turned on/off to `show zebra`
- Add initial zebra tracepoint support
- Add kernel nexthop group support to `show zebra`
- Add knowledge about ra and rfc 5549 to `show zebra`
- Add mpls status to `show zebra`
- Add netlink debug dump for netconf messages
- Add netlink debugs for ip rules
- Add OS and version to `show zebra`
- Add support for end.dt4
- Add to `show zebra` the type of vrf devices being used
- Allow *BSD to specify a receive buffer size
- Allow multiple connected routes to be chosen for kernel routes
- Allow system routes to recurse through themselves
- Don't send RAs w/o link-local v6 or on bridge-ports
- Evpn disable remove l2vni from l3vni list
- Evpn-mh bonds protodown check for set
- Evpn-mh use protodown update reason api
- Fix cleanup of meta queues on vrf disable
- Fix crash in evpn neigh cleanup all
- Fix missing delete vtep during vni transition
- Fix missing vrf change of l2vni on vxlan interface
- Fix rtadv startup when config read in is before interface up
- Fix use after deletion event in FreeBSD
- Fix v6 route replace failure turned into success
- Get zebra graceful restart working when restarting on *BSD
- Handle FreeBSD routing socket enobufs
- Handle protodown netlink for vxlan device
- Include mpls enabled status in interface output
- Include old reason in evpn-mh bond update
- Keep the interface flags safe on multiple ioctl calls
- Let /32 host route with same ip cross vrf
- Make router advertisement warnings show up once every 6 hours
- Prevent crash if zebra_route_all is used for a route type
- Prevent installation of connected multiple times
- Protodown-up event trigger interface up
- Register nht nexthops with proper safi
- Update advertise-svi-ip macips w/ new mac
- When handling unprocessed messages from kernel print usable string
- New `show ip nht mrib` command
- Handle ENOBUFS errors for FreeBSD
==395247== 8,268 (32 direct, 8,236 indirect) bytes in 1 blocks are definitely lost in loss record 199 of 205
==395247== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==395247== by 0x492EB8E: qcalloc (in /usr/local/lib/libfrr.so.0.0.0)
==395247== by 0x490BB12: hash_get (in /usr/local/lib/libfrr.so.0.0.0)
==395247== by 0x1FBF63: community_intern (in /usr/lib/frr/bgpd)
==395247== by 0x1FC0C5: community_parse (in /usr/lib/frr/bgpd)
==395247== by 0x1F0B66: bgp_attr_community (in /usr/lib/frr/bgpd)
==395247== by 0x1F4185: bgp_attr_parse (in /usr/lib/frr/bgpd)
==395247== by 0x26BC29: bgp_update_receive (in /usr/lib/frr/bgpd)
==395247== by 0x26E887: bgp_process_packet (in /usr/lib/frr/bgpd)
==395247== by 0x4985380: thread_call (in /usr/local/lib/libfrr.so.0.0.0)
==395247== by 0x491D521: frr_run (in /usr/local/lib/libfrr.so.0.0.0)
==395247== by 0x1EBEE8: main (in /usr/lib/frr/bgpd)
==361630== 24,780 (96 direct, 24,684 indirect) bytes in 3 blocks are definitely lost in loss record 94 of 97
==361630== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==361630== by 0x492EB8E: qcalloc (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x490BB12: hash_get (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x1FD3CC: bgp_ca_alias_insert (in /usr/lib/frr/bgpd)
==361630== by 0x2CF8E5: bgp_community_alias_magic (in /usr/lib/frr/bgpd)
==361630== by 0x2C980B: bgp_community_alias (in /usr/lib/frr/bgpd)
==361630== by 0x48E3556: cmd_execute_command_real (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x48E384B: cmd_execute_command_strict (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x48E3D41: command_config_read_one_line (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x48E3EBA: config_from_file (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x499065C: vty_read_file (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x4990FF4: vty_read_config (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x491CB95: frr_config_read_in (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x4985380: thread_call (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x491D521: frr_run (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x1EBEE8: main (in /usr/lib/frr/bgpd)
==361630==
==361630== 24,780 (96 direct, 24,684 indirect) bytes in 3 blocks are definitely lost in loss record 95 of 97
==361630== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==361630== by 0x492EB8E: qcalloc (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x490BB12: hash_get (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x1FD39C: bgp_ca_community_insert (in /usr/lib/frr/bgpd)
==361630== by 0x2CF8F4: bgp_community_alias_magic (in /usr/lib/frr/bgpd)
==361630== by 0x2C980B: bgp_community_alias (in /usr/lib/frr/bgpd)
==361630== by 0x48E3556: cmd_execute_command_real (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x48E384B: cmd_execute_command_strict (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x48E3D41: command_config_read_one_line (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x48E3EBA: config_from_file (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x499065C: vty_read_file (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x4990FF4: vty_read_config (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x491CB95: frr_config_read_in (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x4985380: thread_call (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x491D521: frr_run (in /usr/local/lib/libfrr.so.0.0.0)
==361630== by 0x1EBEE8: main (in /usr/lib/frr/bgpd)
pimd: During prune pending, behave as NOINFO state
Fixed ANVL Conformance PIM-SM 16.3 test case.
When (S,G,rpt) prune is received, we were installing
the mroute immediately with none as OIF.
This leads to dropping the (S,G) traffic during prune
pending time as well.
Also we should not install the mroute if there is no
change in the rpf update.
These 2 things lead to the failure of the test case.
Fixed it by blocking the installation in this scenario.
When prune pending timer pops, it will take care of
installing the mroute with none as OIF.
bgpd: Free ->raw_data from Hard Notification message after we use it
==175785== 0 bytes in 1 blocks are definitely lost in loss record 1 of 88
==175785== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==175785== by 0x492EB8E: qcalloc (in /usr/local/lib/libfrr.so.0.0.0)
==175785== by 0x269823: bgp_notify_decapsulate_hard_reset (in /usr/lib/frr/bgpd)
==175785== by 0x26C85D: bgp_notify_receive (in /usr/lib/frr/bgpd)
==175785== by 0x26E94E: bgp_process_packet (in /usr/lib/frr/bgpd)
==175785== by 0x4985349: thread_call (in /usr/local/lib/libfrr.so.0.0.0)
==175785== by 0x491D521: frr_run (in /usr/local/lib/libfrr.so.0.0.0)
==175785== by 0x1EBEE8: main (in /usr/lib/frr/bgpd)
==175785==