Don Slice [Fri, 19 Mar 2021 19:10:14 +0000 (15:10 -0400)]
tools: frr-reload fixes for deleting vrf static routes
Problems were reported that, in certain cases, frr-reload.py would
inadvertently delete vrf static routes, for two different
reasons. First, vrf statics with null0 or Null0 nexthops would
fail the match because they are rendered as blackholes. This was
already fixed for non-vrf statics, so the same fix is added for
vrf-based statics. Second, frr-reload would fail to match because
the command can be entered in different formats. If entered in the
old way "ip route x.x.x.x/x y.y.y.y vrf NAME" and rendered
in the new way "vrf NAME\nip route x.x.x.x/x y.y.y.y", it would
fail to match and do an inadvertent delete.
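A minimal sketch of the matching idea, not the actual frr-reload.py code: both failure cases go away if each static route is reduced to one canonical form before comparison. The function name `normalize_static_routes` and the sample routes are hypothetical.

```python
import re

def normalize_static_routes(lines):
    """Return a set of (vrf, route) pairs in one canonical form."""
    canonical = set()
    current_vrf = None  # set while inside a "vrf NAME" block
    for raw in lines:
        line = raw.strip()
        block = re.fullmatch(r"vrf (\S+)", line)
        if block:
            current_vrf = block.group(1)
            continue
        if line == "exit-vrf":
            current_vrf = None
            continue
        if not line.startswith(("ip route ", "ipv6 route ")):
            continue
        # Null0/null0 nexthops are rendered as "blackhole".
        line = re.sub(r"\b[Nn]ull0\b", "blackhole", line)
        # Old style: the VRF name trails the route on the same line.
        vrf = current_vrf
        trailing = re.search(r" vrf (\S+)$", line)
        if trailing:
            vrf = trailing.group(1)
            line = line[: trailing.start()]
        canonical.add((vrf, line))
    return canonical

# Hypothetical inputs: the same route in the old and the new format.
old_style = ["ip route 10.0.0.0/24 Null0 vrf RED"]
new_style = ["vrf RED", " ip route 10.0.0.0/24 blackhole", "exit-vrf"]
```

With both inputs normalized this way, the two sets compare equal, so no delete would be generated for the route.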
Igor Ryzhov [Thu, 1 Apr 2021 12:42:53 +0000 (15:42 +0300)]
bfdd: clear nb config entries when removing bfd node
When bfd node is removed, we must clear all NB entries set by its
children - sessions and profiles. Let's store some fake data as an entry
for the bfd node to be able to unset it later.
Igor Ryzhov [Mon, 29 Mar 2021 11:47:43 +0000 (14:47 +0300)]
ospfd: fix counting of "ip ospf area" commands
Instead of trying to maintain if_ospf_cli_count, let's directly count
the number of configured interfaces when it is needed. Current approach
sometimes leads to an incorrect counter.
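As an illustration only (this is not the ospfd code), counting on demand can be sketched as below. The `Interface` class and its `ospf_area` field are hypothetical stand-ins for the per-interface configuration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Interface:
    name: str
    ospf_area: Optional[str] = None  # set by "ip ospf area X"; None if unset

def count_ospf_area_interfaces(interfaces):
    """Count interfaces that currently carry an "ip ospf area" setting."""
    return sum(1 for ifp in interfaces if ifp.ospf_area is not None)

interfaces = [
    Interface("eth0", "0.0.0.0"),
    Interface("eth1"),
    Interface("eth2", "0.0.0.1"),
]
```

Because the value is derived from the current state each time, there is no separate counter that can drift out of sync with the configuration.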
Igor Ryzhov [Sun, 14 Feb 2021 02:39:00 +0000 (05:39 +0300)]
zebra: fix vni configuration in default vrf
VNI configuration is done without NB layer in default VRF. It leads to
the following problems:
```
vtysh -c "conf" -c "vni 1"
vtysh -c "conf" -c "vrf default" -c "no vni"
```
Second command does nothing, because the NB node is not created by the
first command.
```
vtysh -c "conf" -c "vrf default" -c "vni 1"
vtysh -c "conf" -c "no vni 1"
```
Second command doesn't delete the NB node created by the first command.
ckishimo [Tue, 16 Mar 2021 22:47:18 +0000 (23:47 +0100)]
ospf6d: fix iface commands lost when removing from area
In OSPFv3 when removing the interface from an area, all ospf6
interface commands are lost, so when changing the area you need
to reconfigure all ospf6 interface commands again
r1# conf t
r1(config)# router ospf6
r1(config-ospf6)# no interface r1-r2-eth0 area 0.0.0.0
r1(config-ospf6)# exit
r1# sh run
interface r1-r2-eth0
ipv6 address 2013:12::1/64
! <----- missing all ipv6 ospf6 commands
router ospf6
ospf6 router-id 1.1.1.1
!
This is because the interface is being deleted instead of disabled
(see PR#7717). I believe the interface should be left disabled
(not deleted) when removing the interface from the area.
Chirag Shah [Mon, 25 Jan 2021 19:44:56 +0000 (11:44 -0800)]
lib: fix a crash in plist update
Problem:
For a prefix-list with multiple rules, an update to
a rule/sequence with a different prefix/prefixlen
must reset the prefix-list next-base pointer to avoid
keeping a stale value.
In some cases the old next-base reference leads
to an assert in the trie add (trie_install_fn).
bt:
(object=0x55576a4c8a00, updptr=0x55576a4b97e0) at lib/plist.c:560
(plist=0x55576a4a1770, pentry=0x55576a4c8a00) at lib/plist.c:585
(ple=0x55576a4c8a00) at lib/plist.c:745
(args=0x7fffe04beb50) at lib/filter_nb.c:1181
Solution:
Reset prefix-list next-base pointer whenever a
sequence/rule is updated.
Ticket:CM-33109
Testing Done:
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Igor Ryzhov [Tue, 9 Mar 2021 22:17:47 +0000 (01:17 +0300)]
bfdd: fix starting echo receive timer
Currently this timer is only started when we receive the first echo
packet. If we never receive the packet, the timer is never started and
the user falsely assumes that echo function is working.
Mark Stapp [Tue, 9 Mar 2021 16:13:41 +0000 (11:13 -0500)]
bgpd: handle socket read errors in the main pthread
Add a handler for socket errors that runs in the main pthread,
rather than the io pthread. When the io pthread encounters a
read error, capture the error and schedule a task for the main
pthread.
Igor Ryzhov [Wed, 10 Mar 2021 19:11:19 +0000 (22:11 +0300)]
bfdd: make sessions administratively up by default
Current behavior is inconsistent. When the session is created by another
daemon, it is up by default. When we later configure peer in bfdd, the
session is still up, but the NB layer thinks that it is down.
More than that, even when the session is created in bfdd using peer
command, it is created in DOWN state, not ADM_DOWN. And it actually
starts sending and receiving packets. The session is marked with
SHUTDOWN flag only when we try to reconfigure some parameter. This
behavior is also very unexpected.
Igor Ryzhov [Fri, 26 Feb 2021 16:17:28 +0000 (19:17 +0300)]
lib: fix crash when iterating over nb operational data
Example:
```
show yang operational-data /frr-routing:routing/control-plane-protocols/control-plane-protocol[type='frr-staticd:staticd'][name='staticd'][vrf='default'] staticd
```
Igor Ryzhov [Tue, 9 Mar 2021 20:08:41 +0000 (23:08 +0300)]
bfdd: fix detect timeout
RFC 5880 Section 6.8.4:
In Asynchronous mode, the Detection Time calculated in the local
system is equal to the value of Detect Mult received from the remote
system, multiplied by the agreed transmit interval of the remote
system (the greater of bfd.RequiredMinRxInterval and the last
received Desired Min TX Interval).
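The rule quoted above amounts to one multiplication. A small sketch, with hypothetical parameter names and all intervals in microseconds:

```python
def detection_time_us(remote_detect_mult,
                      local_required_min_rx_us,
                      remote_desired_min_tx_us):
    """Asynchronous-mode Detection Time as described in RFC 5880, 6.8.4.

    The agreed transmit interval of the remote system is the greater of
    the local bfd.RequiredMinRxInterval and the last received
    Desired Min TX Interval.
    """
    agreed_tx_us = max(local_required_min_rx_us, remote_desired_min_tx_us)
    return remote_detect_mult * agreed_tx_us
```

For example, with a remote Detect Mult of 3, a local required minimum RX interval of 300 ms and a remote desired minimum TX interval of 50 ms, the detection time is 3 * 300 ms = 900 ms.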
ospf6 keeps a flag to remember whether the cost for an interface
was manually added via config or computed automatically, but if
the configured value matches the auto-computed one we were not
setting this flag, meaning that the config would not show up in
the config.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Igor Ryzhov [Thu, 4 Mar 2021 18:17:20 +0000 (21:17 +0300)]
bfdd: fix echo configuration in profile
It's not currently possible to configure echo mode in profile node:
```
(config)# bfd
(config-bfd)# profile test
(config-bfd-profile)# echo-mode
% Echo mode is only available for single hop sessions.
(config-bfd-profile)# echo-interval 20
% Echo mode is only available for single hop sessions.
```
Rafael Zalamena [Tue, 1 Dec 2020 11:01:37 +0000 (08:01 -0300)]
bfdd: session specific command type checks
Replace the unclear error message:
```
% Failed to edit configuration.
YANG error(s):
Schema node not found.
YANG path: /frr-bfdd:bfdd/bfd/sessions/single-hop[dest-addr='192.168.253.6'][interface=''][vrf='default']/minimum-ttl
```
With:
```
frr(config-bfd-peer)# minimum-ttl 250
% Minimum TTL is only available for multi hop sessions.
! or
frr(config-bfd-peer)# echo
% Echo mode is only available for single hop sessions.
frr(config-bfd-peer)# echo-interval 300
% Echo mode is only available for single hop sessions.
```
Reported-by: Trae Santiago
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Trey Aspelund [Thu, 4 Mar 2021 02:05:56 +0000 (02:05 +0000)]
bgpd: fix bgp statistics for l2vpn evpn
'show bgp l2vpn evpn statistics' was returning 0 for all stats
because bgp_table_stats_walker bailed out if afi != AFI_IP or AFI_IP6.
Add case condition to catch AFI_L2VPN.
Igor Ryzhov [Wed, 3 Mar 2021 21:13:44 +0000 (00:13 +0300)]
doc: fix link for python2 get-pip.py
Script by the current link doesn't work with Python 2 anymore:
```
ERROR: This script does not work on Python 2.7 The minimum supported Python version is 3.6.
Please use https://bootstrap.pypa.io/2.7/get-pip.py instead.
```
Donald Sharp [Wed, 17 Mar 2021 02:28:29 +0000 (22:28 -0400)]
bgpd: If we have a SAFI conflict do not allow labeled unicast to reset
If we have a SAFI conflict, i.e. we are trying to activate safi's
UNICAST and LABELED_UNICAST at the same time, we should not
cause bestpath to be rerun and we should not try to put
labels on everything.
Martin Winter [Thu, 4 Mar 2021 02:14:50 +0000 (03:14 +0100)]
FRRouting Release 7.5.1
This is a maintenance release with the following fixes:
BABEL
Fix connected route leak on change
BFD
Session lookup was sometimes wrong
Memory leak and handling cleanups
In some situations handle vrf appropriately when receiving packets
BGP
Peer Group Inheritance Fixes
Disallow attempts to peer with peers reachable via blackholes
Send BMP down message when reachability fails
Cleanup handling of aggregator data when the AGG AS is 0
Handle `neighbor <peer-group> allowas-in` config changes properly
Properly parse community and lcommunity values in some circumstances
Allow peer-groups to configure `ttl-security hops`
Prevent v6 routes with v4 nexthops from being installed
Allow `default-originate` to be cleared from a peer group
Fix evpn route-map vni filter at origin
local routes were using non-default distance
Properly track if the nexthop was updated in some circumstances
Cleanup `show running` when running bgp with `-e X` values
Various Memory leaks in show commands
Properly withdraw exported routes when deleting a VRF
Avoid resetting ebgp-multihop if peer setting is the same as peer-group
Properly encode flowspec rules to zebra in some rare circumstances
Generate statistics for routes in bgp when we have exactly 1 route
Properly apply route-map for the default-originate command
EIGRP
Properly set MTU for eigrp packets sent
Various memory leaks and using uninited data fixes
ISIS
When last area address is removed, resign if we were the DR
Various memory leaks and using uninited data fixes
LDP
Various memory leaks and using uninited data fixes
NHRP
Use onlink routes when prefix == nh
Shortcut routes are installed with proper nexthop
OSPF
Prevent duplicate packet read in multiple vrf situation
Fix area removal at interface level
Restore Point to MultiPoint interface types
Correctly handle MTU change on startup
Multi Instance initialization sometimes was not successful
NSSA translate-always was not working properly
OSPFv3
Don't send hellos on loopback interfaces
Handle ECMP better when a sub-path is removed
Memory leak and handling fixes
Fix Link LSA not updating when router priority is modified
Some output from show commands was wrong
Intra area remote connected prefixes sometimes not installed
PBR
Various memory leaks and using uninited data fixes
PIM
SGRpt prune received during prune didn't override holdtime
Various memory leaks and using uninited data fixes
STATIC
Fix VRF and usage on startup in some instances
Tableid was being mishandled in some cases
VTYSH
Disable bracketed paste in readline.
WATCHFRR
Various memory leaks and using uninited data fixes
ZEBRA
Always install blackhole routes using kernel routes instead of nexthops
Various memory leaks and using uninited data fixes
Disallow resolution to duplicate nexthops that created infinite nexthops
Apply the route-map delay-timer globally
Some routes were stuck in Queued state when using the FPM
Better handle vrf creation when using namespaces
Set NUD_NOARP on sticky mac entries in addition to NTF_STICKY
Allow `set src X` to work on startup
FRR Library
Fix a variety of memory leaks
Fix VRF Creation in some instances
RPKI context editing was not properly handled in reload situations
routemap code was not properly handling modification of CLI in some instances
SNAPCRAFT
Update to using rtrlib 0.7.0
Fix passthrough path for Libyang 1.x
ALPINE
Remove old docker deps
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
Mark Stapp [Mon, 21 Sep 2020 19:57:59 +0000 (15:57 -0400)]
lib: avoid signal-handling race with event loop poll call
Manage the main pthread's signal mask to avoid a signal-handling
race. Before entering poll, check for pending signals that the
application needs to handle. Use ppoll() to re-enable those
signals during the poll call.
Mark Stapp [Wed, 2 Sep 2020 20:25:00 +0000 (16:25 -0400)]
lib: add sigevent_check api
Add an api that blocks application-handled signals (SIGINT,
SIGTERM, e.g.) then tests whether any signals have been received.
This helps to manage a race between signal reception and the poll
call in the main event loop.
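The idea can be sketched with Python's signal module on a POSIX system. This is an illustration of "block, then test for pending signals", not the libfrr implementation; `check_pending_signals` and the choice of SIGUSR1 as the handled signal are assumptions made for the demo.

```python
import signal
import threading

# Stand-in for the set of signals the application handles.
HANDLED = frozenset({signal.SIGUSR1})

def check_pending_signals(handled=HANDLED):
    """Block `handled`, then report which of them are already pending.

    Returns (pending, previous_mask). The caller restores previous_mask
    once it is safe to take signals again.
    """
    previous_mask = signal.pthread_sigmask(signal.SIG_BLOCK, handled)
    pending = signal.sigpending() & handled
    return pending, previous_mask

# Demonstration: block first, then send this thread a signal.
_, old_mask = check_pending_signals()
signal.pthread_kill(threading.get_ident(), signal.SIGUSR1)
pending, _ = check_pending_signals()

# Consume the pending signal before restoring the mask, so the default
# action (process termination) is never taken in this demo.
consumed = signal.sigwait(HANDLED)
signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

A signal that arrives after the block is not lost: it stays pending and is visible to the check, which is what closes the window between "no signal seen yet" and "now sleeping in poll".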
Igor Ryzhov [Tue, 16 Feb 2021 09:57:01 +0000 (12:57 +0300)]
lib: register dependency between control plane protocol and vrf nb nodes
When the control plane protocol is created, the vrf structure is
allocated, and its address is stored in the northbound node.
The vrf structure may later be deleted by the user, which will lead to
a stale pointer stored in this node.
Instead of this, allow daemons that use the vrf pointer to register the
dependency between the control plane protocol and vrf nodes. This will
guarantee that the nodes will always be created and deleted together, and
there won't be any stale pointers.
sudhanshukumar22 [Wed, 27 Jan 2021 04:08:40 +0000 (20:08 -0800)]
bgpd: Bgp peer group issue
Description:
Holdtime and keepalive parameters weren't copied from
peer-group to peer-group members. Fixed the issue by copying holdtime
and keepalive parameters from peer-group to its members.
Signed-off-by: sudhanshukumar22 <sudhanshu.kumar@broadcom.com>
bgpd: upon bgp deletion, do not systematically ask to remove main bgp
Dependencies between bgp instances are necessary only when
configuring specific services like ipv4-vpn, ipv6-vpn or l2vpn-evpn.
The possible configurations are checked, and an error is returned if
one of the above services is configured on the bgp vrf instance.
There may be some services that are not covered. For clarification, here
are the services on bgp vrf instances that are considered when trying to
delete the main bgp instance:
- if the evpn main instance is the main bgp instance, and if the evpn rt5
service is configured (with the advertise command)
- if a vni is configured in the vrf instance
- if l3vpn import/export commands are used for
importing/exporting entries from a vpnv4/6 network located on the main bgp
instance (in l3vpn, the main bgp instance is the location where vpnv4/6
sits).
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Donald Sharp [Thu, 18 Feb 2021 11:55:29 +0000 (06:55 -0500)]
bgpd: Fix crash when we don't have a nexthop
Recent changes to allow bgpd to handle v6 LL slightly
differently in the nexthop tracking code have not
interacted well with the blackhole nexthop change
for peers. Modify the code to do the right thing.
Runar Borge [Fri, 22 Jan 2021 23:15:41 +0000 (00:15 +0100)]
frr-reload: rpki context exiting uses exit and not end
Issue:
The rpki subcontext uses exit instead of end to exit.
This causes problems with frr-reload, because frr-reload never exits
the rpki context until it reaches the next end statement. This also happens when
parsing the configuration from vtysh.
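A simplified sketch of the parsing problem, not the frr-reload.py parser: contexts are closed by "end", except that the rpki context is also closed by "exit". The function `split_contexts` and the sample configuration lines are illustrative assumptions.

```python
def split_contexts(lines):
    """Group config lines into (context_header, [sub-commands]) tuples."""
    contexts = []
    header, body = None, []
    for raw in lines:
        line = raw.strip()
        if header is None:
            # Only two kinds of context are recognised in this sketch.
            if line == "rpki" or line.startswith("router "):
                header, body = line, []
            continue
        # "exit" closes the rpki context; "end" closes any context.
        closes = line == "end" or (header == "rpki" and line == "exit")
        if closes:
            contexts.append((header, body))
            header = None
        else:
            body.append(line)
    return contexts

config = [
    "rpki",
    " rpki polling_period 1000",
    " exit",
    "router bgp 65000",
    " neighbor 192.0.2.1 remote-as 65001",
    "end",
]
```

Without the special case for "exit", the `router bgp` lines in this sample would be swallowed into the rpki context until the next "end".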
Donald Sharp [Thu, 11 Feb 2021 14:54:34 +0000 (09:54 -0500)]
bgpd: Blackhole nexthops are not reachable
When bgp registers for a nexthop that is not reachable due
to the nexthop pointing to a blackhole, bgp is never going
to be able to reach it when attempting to open a connection.
Broken behavior:
<show bgp nexthop>
192.168.161.204 valid [IGP metric 0], #paths 0, peer 192.168.161.204
blackhole
Last update: Thu Feb 11 09:46:10 2021
eva# show bgp ipv4 uni summ fail
BGP router identifier 10.10.3.11, local AS number 3235 vrf-id 0
BGP table version 40
RIB entries 78, using 14 KiB of memory
Peers 2, using 54 KiB of memory
Neighbor EstdCnt DropCnt ResetTime Reason
192.168.161.204 0 0 never Waiting for peer OPEN
The log file fills up with this type of message:
2021-02-09T18:53:11.653433+00:00 nq-sjc6c-cor-01 bgpd[6548]: can't connect to 24.51.27.241 fd 26 : Invalid argument
2021-02-09T18:53:21.654005+00:00 nq-sjc6c-cor-01 bgpd[6548]: can't connect to 24.51.27.241 fd 26 : Invalid argument
2021-02-09T18:53:31.654381+00:00 nq-sjc6c-cor-01 bgpd[6548]: can't connect to 24.51.27.241 fd 26 : Invalid argument
2021-02-09T18:53:41.654729+00:00 nq-sjc6c-cor-01 bgpd[6548]: can't connect to 24.51.27.241 fd 26 : Invalid argument
2021-02-09T18:53:51.655147+00:00 nq-sjc6c-cor-01 bgpd[6548]: can't connect to 24.51.27.241 fd 26 : Invalid argument
This happens because the connect to a blackhole is correctly rejected by the kernel.
Fixed behavior:
eva# show bgp ipv4 uni summ
BGP router identifier 10.10.3.11, local AS number 3235 vrf-id 0
BGP table version 40
RIB entries 78, using 14 KiB of memory
Peers 2, using 54 KiB of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
annie(192.168.161.2) 4 64539 126264 39 0 0 0 00:01:36 38 40 N/A
192.168.161.178 4 0 0 0 0 0 0 never Active 0 N/A
Total number of neighbors 2
eva# show bgp ipv4 uni summ fail
BGP router identifier 10.10.3.11, local AS number 3235 vrf-id 0
BGP table version 40
RIB entries 78, using 14 KiB of memory
Peers 2, using 54 KiB of memory
Neighbor EstdCnt DropCnt ResetTime Reason
192.168.161.178 0 0 never Waiting for NHT
Total number of neighbors 2
eva# show bgp nexthop
Current BGP nexthop cache:
192.168.161.2 valid [IGP metric 0], #paths 38, peer 192.168.161.2
if enp39s0
Last update: Thu Feb 11 09:52:05 2021
192.168.161.131 valid [IGP metric 0], #paths 0, peer 192.168.161.131
if enp39s0
Last update: Thu Feb 11 09:52:05 2021
192.168.161.178 invalid, #paths 0, peer 192.168.161.178
Must be Connected
Last update: Thu Feb 11 09:53:37 2021
eva#
Igor Ryzhov [Wed, 17 Feb 2021 12:06:20 +0000 (15:06 +0300)]
staticd: fix vrf enabling
When enabling the VRF, we should not install the nexthops that rely on
non-existent VRF.
For example, if we have route "1.1.1.0/24 2.2.2.2 vrf red nexthop-vrf blue",
and VRF red is enabled, we should not install it if VRF blue doesn't exist.
Igor Ryzhov [Wed, 17 Feb 2021 11:19:40 +0000 (14:19 +0300)]
staticd: fix nexthop creation and installation
Currently, staticd creates a VRF for the nexthop it is trying to install.
Later, when this nexthop is deleted, the VRF stays in the system and can
not be deleted by the user because "no vrf" command doesn't work for this
VRF because it was not created through northbound code.
There is no need to create the VRF. Just set nh_vrf_id to VRF_UNKNOWN
when the VRF doesn't exist.
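A rough sketch of the approach, assuming a plain name-to-id table in place of the real VRF lookup. The numeric value used for `VRF_UNKNOWN` here is a placeholder, and both helper functions are hypothetical.

```python
VRF_UNKNOWN = -1  # placeholder sentinel; the real value is defined by FRR

def resolve_nexthop_vrf(nh_vrf_name, known_vrfs):
    """Look up the nexthop VRF id without creating the VRF as a side effect."""
    return known_vrfs.get(nh_vrf_name, VRF_UNKNOWN)

def should_install(nh_vrf_id):
    """A nexthop that relies on a non-existent VRF is not installed."""
    return nh_vrf_id != VRF_UNKNOWN

known_vrfs = {"default": 0, "red": 10}
```

Looking up "blue" yields `VRF_UNKNOWN` and leaves `known_vrfs` unchanged, so no VRF is left behind for the user to clean up.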
Donald Sharp [Tue, 16 Feb 2021 20:54:08 +0000 (15:54 -0500)]
zebra: use AF_INET for protocol family
When looking up the conversion from kernel protocol to
internal protocol family, make sure we use the correct
AF_INET (what the kernel uses) instead of AFI_IP (which
is what FRR uses).
Otherwise, routes from OSPF will show up from the kernel as OSPF6 instead of
OSPF, which will cause mayhem.
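To illustrate why mixing the two constant families goes wrong: the numeric values do not line up. The AFI values below are given as I recall them from FRR's afi_t enum and should be treated as an assumption; the two helper functions are hypothetical and only model the comparison.

```python
import socket

# Assumed values of FRR's internal identifiers (afi_t).
AFI_IP, AFI_IP6 = 1, 2

def ospf_proto_buggy(kernel_family):
    # Compares a kernel AF_* value against an FRR AFI_* constant.
    return "OSPF" if kernel_family == AFI_IP else "OSPF6"

def ospf_proto_fixed(kernel_family):
    # Compares a kernel AF_* value against the kernel's own constant.
    return "OSPF" if kernel_family == socket.AF_INET else "OSPF6"
```

Since `socket.AF_INET` is 2, it never equals `AFI_IP` (1), so the buggy comparison classifies an IPv4 OSPF route as OSPF6.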
Ticket: CM-33306
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Quentin Young [Thu, 11 Feb 2021 23:54:27 +0000 (18:54 -0500)]
bgpd: send correct BMP down message when nht fails
When sending BMP messages for a status change event for a peer whose NHT
has failed, we were sending a Peer Down Reason Code of 1 (Local system
closed, NOTIFICATION follows) with no NOTIFICATION PDU (because there was
none). This is wrong. Also, the reason code of 1 is semantically off, it
should be 2 (Local system closed, FSM event follows).
This patch:
- adds definitions of all BGP FSM event codes per RFC4271
- changes the BMP reason code emitted when a peer changes state due to
NHT failure to 2 and encodes FSM event 18 (TcpConnectionFails)
- changes the catch-all case where we have not yet
implemented the appropriate BMP response to indicate reason code 2
with FSM event 0 (no relevant Event code is defined).
These changes ought to prevent the BMP session from being torn down due
to an improperly formatted message.
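For illustration, the reason-specific part of a BMP Peer Down message with reason code 2 is a one-byte reason followed by a two-byte FSM event code. The sketch below encodes only that part; the BMP common header and per-peer header are omitted, and the constant and function names are hypothetical.

```python
import struct

PEER_DOWN_LOCAL_FSM_EVENT = 2          # Local system closed, FSM event follows
FSM_EVENT_NONE = 0                     # no relevant Event code is defined
FSM_EVENT_TCP_CONNECTION_FAILS = 18    # TcpConnectionFails (RFC 4271)

def encode_peer_down_reason(fsm_event):
    """Encode the reason byte and the 2-byte FSM event in network order."""
    return struct.pack("!BH", PEER_DOWN_LOCAL_FSM_EVENT, fsm_event)
```

An NHT failure would then be encoded as `encode_peer_down_reason(FSM_EVENT_TCP_CONNECTION_FAILS)`, and the catch-all case as `encode_peer_down_reason(FSM_EVENT_NONE)`.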
Signed-off-by: Quentin Young <qlyoung@qlyoung.net>
Igor Ryzhov [Tue, 2 Feb 2021 22:02:15 +0000 (01:02 +0300)]
bfdd: fix session lookup
BFD key has optional fields "local" and "ifname" which can be empty when
the BFD session is created. In this case, the hash key will be calculated
with these fields filled with zeroes.
Later, when we're looking for the BFD session using the key with fields
"local" and "ifname" populated with actual values, the hash key will be
different. To work around this issue, we're doing multiple hash lookups,
first with full key, then with fields "local" and "ifname" filled with
zeroes.
But there may be another case when the initial key has the actual values
for "local" and "ifname", but the key we're using for lookup has empty
values. This case is covered for IPv4 by using additional hash walk with
bfd_key_lookup_ignore_partial_walker function but is not covered for IPv6.
Instead of introducing more hacks and workarounds, the following solution
is proposed:
- the hash key is always calculated in bfd_key_hash_do using only
required fields
- the hash data is compared in bfd_key_hash_cmp, taking into account the
fact that fields "local" and "ifname" may be empty
Using this solution, it's enough to make only one hash lookup.
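The proposed scheme can be sketched with a Python class whose hash uses only the required fields and whose comparison treats empty optional fields as wildcards. This is an analogy to bfd_key_hash_do and bfd_key_hash_cmp, not a translation of them; which fields count as required is an assumption here.

```python
from dataclasses import dataclass

@dataclass(frozen=True, eq=False)
class BfdKey:
    peer: str
    vrfname: str
    mhop: bool
    local: str = ""    # optional; empty means "not specified"
    ifname: str = ""   # optional; empty means "not specified"

    def _required(self):
        return (self.peer, self.vrfname, self.mhop)

    def __hash__(self):
        # Hash only the required fields, so partial and full keys collide.
        return hash(self._required())

    def __eq__(self, other):
        if not isinstance(other, BfdKey):
            return NotImplemented
        if self._required() != other._required():
            return False
        # An empty optional field on either side matches anything.
        for mine, theirs in ((self.local, other.local),
                             (self.ifname, other.ifname)):
            if mine and theirs and mine != theirs:
                return False
        return True

partial = BfdKey("192.0.2.1", "default", False)
full = BfdKey("192.0.2.1", "default", False, local="192.0.2.2", ifname="eth0")
other_if = BfdKey("192.0.2.1", "default", False, local="192.0.2.2", ifname="eth1")
```

A session stored under `partial` is found by a lookup with `full`, and the reverse also holds, with a single hash lookup in each direction. Note that this kind of wildcard equality is not transitive, so it suits lookups rather than general-purpose set semantics.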
Soman K S [Wed, 10 Feb 2021 11:15:22 +0000 (16:45 +0530)]
ospf6d : fix issue in ecmp inter area route
Issue: When a path in the inter area ecmp route is deleted, the route is removed
Fix: The fix is to remove the specific path from the inter area route using
ospf6_abr_old_route_remove() when abr route entry is not found.
In the function ospf6_abr_old_route_remove() the path to be removed needs
to match adv router and link state ID
Fixed memory leak in ospf6_intra_prefix_update_route_origin() caused by
route node lock not getting released.
Donald Sharp [Thu, 11 Feb 2021 12:31:05 +0000 (07:31 -0500)]
ospfd: Prevent duplicate packet read in certain vrf situations
Currently if the sysctl net.ipv4.raw_l3mdev_accept is 1, packets
destined to a specific vrf also end up being delivered to the default
vrf. We will see logs like this in ospf:
2021/02/10 21:17:05.245727 OSPF: ospf_recv_packet: fd 20(default) on interface 1265(swp1s1.26)
2021/02/10 21:17:05.245740 OSPF: Hello received from [9.9.36.12] via [swp1s1.26:200.254.26.13]
2021/02/10 21:17:05.245741 OSPF: src [200.254.26.14],
2021/02/10 21:17:05.245743 OSPF: dst [224.0.0.5]
2021/02/10 21:17:05.245769 OSPF: ospf_recv_packet: fd 45(vrf1036) on interface 1265(swp1s1.26)
2021/02/10 21:17:05.245774 OSPF: Hello received from [9.9.36.12] via [swp1s1.26:200.254.26.13]
2021/02/10 21:17:05.245775 OSPF: src [200.254.26.14],
2021/02/10 21:17:05.245777 OSPF: dst [224.0.0.5]
This really really makes ospf unhappy in the vrf we are running in.
I am approaching the problem by just dropping the packet if read in the
default vrf because of:
lib: Allow bgp to always create a listen socket for the vrf
Effectively if we have `router ospf vrf BLUE` but no ospf running
in the default vrf, we will not have a listener and that would
require a fundamental change in our approach to handle the ospf->fd
at a global level. I think this is less than ideal at the moment
but it will get us moving again and allow FRR to work with
a bunch of vrf's and ospf neighbors.
Igor Ryzhov [Tue, 9 Feb 2021 18:38:45 +0000 (21:38 +0300)]
vrf: mark vrf as configured when entering vrf node
The VRF must be marked as configured when user enters "vrf NAME" command.
Otherwise, the following problem occurs:
`ip link add red type vrf table 1`
VRF structure is allocated.
`vtysh -c "conf t" -c "vrf red"`
`lib_vrf_create` is called, and pointer to the VRF structure is stored
to the nb_config_entry.
`ip link del red`
VRF structure is freed (because it is not marked as configured), but
the pointer is still stored in the nb_config_entry.
`vtysh -c "conf t" -c "no vrf red"`
Nothing happens, because VRF structure doesn't exist. It means that
`lib_vrf_destroy` is not called, and nb_config_entry still exists in
the running config with incorrect pointer.
`ip link add red type vrf table 1`
New VRF structure is allocated.
`vtysh -c "conf t" -c "vrf red"`
`lib_vrf_create` is NOT called, because the nb_config_entry for that
VRF name still exists in the running config.
After that all NB commands for this VRF will use incorrect pointer to
the freed VRF structure.
Martin Buck [Fri, 29 Jan 2021 15:40:04 +0000 (16:40 +0100)]
ospf6d: Fix LSA formatting out-of-bounds access
Check whether full struct ospf6_router_lsdesc/ospf6_prefix is accessible
before accessing its contents. Previously, we only checked for the first
byte in ospf6_router_lsa_get_nbr_id() or not even that (due to an additional
off-by-one error) in ospf6_link_lsa_get_prefix_str() and
ospf6_intra_prefix_lsa_get_prefix_str().
Also check *before* accessing the first prefix instead of starting the
checks only at the 2nd prefix.
The previous code could cause out-of-bounds accesses with valid LSAs in case
of ospf6_link_lsa_get_prefix_str() and
ospf6_intra_prefix_lsa_get_prefix_str() and with specially crafted LSAs
(bad length field) in case of ospf6_router_lsa_get_nbr_id().
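The checks described above follow a common pattern: verify that the whole structure fits in the buffer before reading any of it, starting with the first element. A simplified sketch over a byte buffer, assuming a 4-byte prefix header (length, options, metric) followed by the prefix bytes padded to 32 bits; the names are hypothetical and this is not the ospf6d code.

```python
import struct

PREFIX_HDR = struct.Struct("!BBH")  # prefix length (bits), options, metric

def prefix_body_len(prefix_len_bits):
    """Prefix bytes are padded to a multiple of 32 bits."""
    return ((prefix_len_bits + 31) // 32) * 4

def iter_prefixes(buf):
    """Yield (prefix_len, prefix_bytes), stopping at the first truncated entry."""
    offset = 0
    while offset < len(buf):
        # Check the full header before reading it, including the first one.
        if offset + PREFIX_HDR.size > len(buf):
            return
        plen, _options, _metric = PREFIX_HDR.unpack_from(buf, offset)
        end = offset + PREFIX_HDR.size + prefix_body_len(plen)
        # Check the full body before slicing it.
        if end > len(buf):
            return
        yield plen, buf[offset + PREFIX_HDR.size:end]
        offset = end

complete = bytes([64, 0, 0, 0]) + bytes(8)   # /64 prefix, 8 prefix bytes
truncated = bytes([64, 0, 0, 0]) + bytes(4)  # body cut short
```

Iterating over `complete` yields one prefix; iterating over `truncated` yields nothing instead of reading past the end of the buffer.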
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
Donald Sharp [Sun, 7 Feb 2021 20:03:51 +0000 (15:03 -0500)]
bfdd: Prevent use after free ( again )
Valgrind is still reporting:
466020-==466020== by 0x11B9F4: main (bfdd.c:403)
466020-==466020== Address 0x5a7d544 is 84 bytes inside a block of size 272 free'd
466020:==466020== at 0x48399AB: free (vg_replace_malloc.c:538)
466020-==466020== by 0x490A947: qfree (memory.c:140)
466020-==466020== by 0x48F2AE8: if_delete (if.c:322)
466020-==466020== by 0x48F250D: if_destroy_via_zapi (if.c:195)
466020-==466020== by 0x497071E: zclient_interface_delete (zclient.c:2040)
466020-==466020== by 0x49745F6: zclient_read (zclient.c:3687)
466020-==466020== by 0x4955AEC: thread_call (thread.c:1684)
466020-==466020== by 0x48FF64E: frr_run (libfrr.c:1126)
466020-==466020== by 0x11B9F4: main (bfdd.c:403)
466020-==466020== Block was alloc'd at
466020:==466020== at 0x483AB65: calloc (vg_replace_malloc.c:760)
466020-==466020== by 0x490A805: qcalloc (memory.c:115)
466020-==466020== by 0x48F23D6: if_new (if.c:160)
466020-==466020== by 0x48F257F: if_create_name (if.c:214)
466020-==466020== by 0x48F3493: if_get_by_name (if.c:558)
466020-==466020== by 0x49705F2: zclient_interface_add (zclient.c:1989)
466020-==466020== by 0x49745E0: zclient_read (zclient.c:3684)
466020-==466020== by 0x4955AEC: thread_call (thread.c:1684)
466020-==466020== by 0x48FF64E: frr_run (libfrr.c:1126)
466020-==466020== by 0x11B9F4: main (bfdd.c:403)
Apparently the bs->ifp pointer is being set even in cases when
the bs->key.ifname is not being set. So go through and just
match the interface pointer and cut-to-the-chase.
Donald Sharp [Sun, 7 Feb 2021 19:59:53 +0000 (14:59 -0500)]
*: Fix usage of bfd_adj_event
Valgrind reports:
469901-==469901==
469901-==469901== Conditional jump or move depends on uninitialised value(s)
469901:==469901== at 0x3A090D: bgp_bfd_dest_update (bgp_bfd.c:416)
469901-==469901== by 0x497469E: zclient_read (zclient.c:3701)
469901-==469901== by 0x4955AEC: thread_call (thread.c:1684)
469901-==469901== by 0x48FF64E: frr_run (libfrr.c:1126)
469901-==469901== by 0x213AB3: main (bgp_main.c:540)
469901-==469901== Uninitialised value was created by a stack allocation
469901:==469901== at 0x3A0725: bgp_bfd_dest_update (bgp_bfd.c:376)
469901-==469901==
469901-==469901== Conditional jump or move depends on uninitialised value(s)
469901:==469901== at 0x3A093C: bgp_bfd_dest_update (bgp_bfd.c:421)
469901-==469901== by 0x497469E: zclient_read (zclient.c:3701)
469901-==469901== by 0x4955AEC: thread_call (thread.c:1684)
469901-==469901== by 0x48FF64E: frr_run (libfrr.c:1126)
469901-==469901== by 0x213AB3: main (bgp_main.c:540)
469901-==469901== Uninitialised value was created by a stack allocation
469901:==469901== at 0x3A0725: bgp_bfd_dest_update (bgp_bfd.c:376)
Looking at bgp_bfd_dest_update: when the call into bfd_get_peer_info
fails to look up the ifp pointer for the ifindex, it just returns, leaving
the dest and src prefix pointers pointing to whatever was passed in.
Let's do two things:
a) The src pointer was sometimes assumed to be passed in and sometimes not.
Forget that. Make it always be passed in
b) memset the src and dst pointers to be all zeros. Then when we look
at either of the pointers we are not making decisions based upon random
data in the pointers.
Martin Buck [Fri, 29 Jan 2021 18:26:49 +0000 (19:26 +0100)]
ospf6d: Fix LSA formatting inconsistent retvals
Make return values for lh_get_prefix_str LSA handlers consistent, i.e.
return NULL in case of error without having written to the passed buffer
and non-NULL (address of buffer) if a string was written to the buffer.
Previously, it was possible in certain cases (bogus LSAs) to not initialize
(and 0-terminate) the buffer but still return non-NULL, causing the caller
to print random junk.
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
saravanank [Thu, 19 Mar 2020 10:33:41 +0000 (03:33 -0700)]
pimd: SGRpt prune received during prune didn't override holdtime
RCA: There were 2 problems.
1. SGRpt prune expiry didn't create S,G entry with none oil when no other
interfaces were part of the oil.
2. When restarting the timer with new hold value, comparison was missing and
old timer was not stopping.
Fix:
SGRpt Prune pending expiry will put SG entry with none oil if no other
interfaces present. If present we will be deleting the inherited oif from oil.
Deleting the oif in that scenario will take care of changing mroute.
When a lone interface expires in SGRpt prune pending state, we shall detect by
checking installed flag. If not installed, install mroute.
Signed-off-by: Saravanan K <saravanank@vmware.com>
Donald Sharp [Sun, 31 Jan 2021 13:32:15 +0000 (08:32 -0500)]
eigrpd: Correctly set the mtu for eigrp packets sent
This version of eigrp pre-calculated the eigrp metric
to be a default of 1500 bytes, but unfortunately it
had entered the byte order wrong.
Modify the code to properly set the byte order
according to the eigrp rfc as well as actually
read in and transmit the mtu of the interface
instead of hard coding it to 1500 bytes.
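A small sketch of the byte-order point, assuming the MTU is carried as a 24-bit field in network (big-endian) byte order, as I understand the classic metric encoding in RFC 7868. The helper names are hypothetical.

```python
def encode_mtu(mtu):
    """Encode the interface MTU as 3 bytes in network byte order."""
    return mtu.to_bytes(3, byteorder="big")

def decode_mtu(field):
    """Decode a 3-byte network-order MTU field."""
    return int.from_bytes(field, byteorder="big")
```

An MTU of 1500 (0x0005DC) is encoded as the bytes 00 05 DC. Reading those bytes in the opposite order gives a very different number, which is the kind of mismatch a byte-order mistake produces.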
Fixes: #7986
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Donald Sharp [Sun, 31 Jan 2021 13:56:00 +0000 (08:56 -0500)]
zebra: Prevent sending of uninitialized data
valgrind is reporting:
2448137-==2448137== Thread 5 zebra_apic:
2448137-==2448137== Syscall param writev(vector[...]) points to uninitialised byte(s)
2448137:==2448137== at 0x4D6FDDD: __writev (writev.c:26)
2448137-==2448137== by 0x4D6FDDD: writev (writev.c:24)
2448137-==2448137== by 0x48A35F5: buffer_flush_available (buffer.c:431)
2448137-==2448137== by 0x48A3504: buffer_flush_all (buffer.c:237)
2448137-==2448137== by 0x495948: zserv_write (zserv.c:263)
2448137-==2448137== by 0x4904B7E: thread_call (thread.c:1681)
2448137-==2448137== by 0x48BD3E5: fpt_run (frr_pthread.c:308)
2448137-==2448137== by 0x4C61EA6: start_thread (pthread_create.c:477)
2448137-==2448137== by 0x4D78DEE: clone (clone.S:95)
2448137-==2448137== Address 0x720c3ce is 62 bytes inside a block of size 4,120 alloc'd
2448137:==2448137== at 0x483877F: malloc (vg_replace_malloc.c:307)
2448137-==2448137== by 0x48D2977: qmalloc (memory.c:110)
2448137-==2448137== by 0x48A30E3: buffer_add (buffer.c:135)
2448137-==2448137== by 0x48A30E3: buffer_put (buffer.c:161)
2448137-==2448137== by 0x49591B: zserv_write (zserv.c:256)
2448137-==2448137== by 0x4904B7E: thread_call (thread.c:1681)
2448137-==2448137== by 0x48BD3E5: fpt_run (frr_pthread.c:308)
2448137-==2448137== by 0x4C61EA6: start_thread (pthread_create.c:477)
2448137-==2448137== by 0x4D78DEE: clone (clone.S:95)
2448137-==2448137== Uninitialised value was created by a stack allocation
2448137:==2448137== at 0x43E490: zserv_encode_vrf (zapi_msg.c:103)
Effectively we are sending `struct vrf_data` without ensuring
data has been properly initialized.
Donald Sharp [Sun, 31 Jan 2021 13:52:44 +0000 (08:52 -0500)]
ospf6d: prevent use after free
Valgrind reports:
2437395-==2437395== Invalid read of size 8
2437395:==2437395== at 0x40B610: ospf6_asbr_update_route_ecmp_path (ospf6_asbr.c:327)
2437395-==2437395== by 0x40BC7C: ospf6_asbr_lsa_add (ospf6_asbr.c:544)
2437395-==2437395== by 0x40C5DF: ospf6_asbr_lsentry_add (ospf6_asbr.c:829)
2437395-==2437395== by 0x42D88D: ospf6_top_brouter_hook_add (ospf6_top.c:185)
2437395-==2437395== by 0x4188E3: ospf6_intra_brouter_calculation (ospf6_intra.c:2320)
2437395-==2437395== by 0x42C624: ospf6_spf_calculation_thread (ospf6_spf.c:638)
2437395-==2437395== by 0x4904B7E: thread_call (thread.c:1681)
2437395-==2437395== by 0x48CAA27: frr_run (libfrr.c:1126)
2437395-==2437395== by 0x40AF43: main (ospf6_main.c:232)
2437395-==2437395== Address 0x5c668a8 is 24 bytes inside a block of size 256 free'd
2437395:==2437395== at 0x48399AB: free (vg_replace_malloc.c:538)
2437395-==2437395== by 0x429027: ospf6_route_delete (ospf6_route.c:419)
2437395-==2437395== by 0x429027: ospf6_route_unlock (ospf6_route.c:460)
2437395-==2437395== by 0x429027: ospf6_route_remove (ospf6_route.c:887)
2437395-==2437395== by 0x40B343: ospf6_asbr_update_route_ecmp_path (ospf6_asbr.c:318)
2437395-==2437395== by 0x40BC7C: ospf6_asbr_lsa_add (ospf6_asbr.c:544)
2437395-==2437395== by 0x40C5DF: ospf6_asbr_lsentry_add (ospf6_asbr.c:829)
2437395-==2437395== by 0x42D88D: ospf6_top_brouter_hook_add (ospf6_top.c:185)
2437395-==2437395== by 0x4188E3: ospf6_intra_brouter_calculation (ospf6_intra.c:2320)
2437395-==2437395== by 0x42C624: ospf6_spf_calculation_thread (ospf6_spf.c:638)
2437395-==2437395== by 0x4904B7E: thread_call (thread.c:1681)
2437395-==2437395== by 0x48CAA27: frr_run (libfrr.c:1126)
2437395-==2437395== by 0x40AF43: main (ospf6_main.c:232)
2437395-==2437395== Block was alloc'd at
2437395:==2437395== at 0x483AB65: calloc (vg_replace_malloc.c:760)
2437395-==2437395== by 0x48D2A32: qcalloc (memory.c:115)
2437395-==2437395== by 0x427CE4: ospf6_route_create (ospf6_route.c:402)
2437395-==2437395== by 0x40BA8A: ospf6_asbr_lsa_add (ospf6_asbr.c:490)
2437395-==2437395== by 0x40C5DF: ospf6_asbr_lsentry_add (ospf6_asbr.c:829)
2437395-==2437395== by 0x42D88D: ospf6_top_brouter_hook_add (ospf6_top.c:185)
2437395-==2437395== by 0x4188E3: ospf6_intra_brouter_calculation (ospf6_intra.c:2320)
2437395-==2437395== by 0x42C624: ospf6_spf_calculation_thread (ospf6_spf.c:638)
2437395-==2437395== by 0x4904B7E: thread_call (thread.c:1681)
2437395-==2437395== by 0x48CAA27: frr_run (libfrr.c:1126)
2437395-==2437395== by 0x40AF43: main (ospf6_main.c:232)
ospfv3 loops through the ecmp routes to decide what to clean up. In some
situations the code frees an existing route at the head of the list,
cleaning up the pointers in the list but never touching the original pointer.
In that case, notice and update the old pointer.