Donald Sharp [Thu, 2 Sep 2021 00:50:31 +0000 (20:50 -0400)]
bgpd: Do not randomly generate a vrf id for -Z
When FRR added the -Z parameter the bgp daemon was setting
a vrf identifier based upon a number starting at 1. This
caused issues when we upgraded the code to the outgoing
sockets to use vrf_bind always.
FRR should never just randomly select a vrf identifier.
Let's just use VRF_DEFAULT when we are in a -Z environment.
It's a safe bet.
Fixes: #9519 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Igor Ryzhov [Thu, 12 Aug 2021 16:07:53 +0000 (19:07 +0300)]
bgpd: fix segfault when re-adding "match evpn default-route" rule
When using "match evpn default-route" rule, match_arg is NULL and strcmp
is not happy with that. There's already a special function named rulecmp
that handles such situations.
batmancn [Mon, 30 Nov 2020 12:04:44 +0000 (20:04 +0800)]
zebra: bugfix of error quit of zebra, due to no nexthop ACTIVE
There exists some rare situations where fpm will attempt
to send a route update with no valid nexthops. In that
case an assert would be hit. This is not good for
trying to keep your routing daemons up and running
when we can safely just recover the situation.
Fixes #7588 Signed-off-by: batmancn <batmanustc@gmail.com>
<fixed commit message, and used zlog_err> Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 5306e6cf00c58a4c4558609d623ecbbd79faabf1)
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io> Signed-off-by: Christian Poessinger <christian@poessinger.com>
(cherry picked from commit ee96c52a2828335c698c27ee7f7374204a97d4cb)
Igor Ryzhov [Thu, 17 Jun 2021 16:31:03 +0000 (19:31 +0300)]
ospfd: fix routemap update
Currently, if the routemap already exists, we delete the pointer to it
when it is updated. We should delete the pointer only if the route-map
is actually deleted.
Igor Ryzhov [Fri, 18 Jun 2021 10:06:13 +0000 (13:06 +0300)]
lib: remove pure attribute from functions that modify memory
Almost all functions currently marked with pure attribute acquire a
route_node lock. By marking them pure we allow compiler to optimize the
code and not call them when it already knows the return value. This is
completely incorrect.
Only two of eleven functions can be marked as pure. And they still won't
be optimized because they are never called from the same function twice.
Let's remove the ext_pure macro completely to reduce the chance of
repeating this mistake in the future.
Trey Aspelund [Fri, 21 May 2021 22:04:15 +0000 (22:04 +0000)]
lib: fix handling of rmap prefix-tree default node
Prior to this commit, updating a prefix-list that is referenced by a
route-map clause will unconditionally delete the root node of that
route-map's prefix-tree (used with route-map optimization).
This is problematic because routes not matching a more specific node
in the tree (i.e. other prefix-list sequences) will not fall-back to
the default node, thus they will not hit any route-map sequences.
This commit ensures that an update to a prefix-list will only delete
the default node while adding the first/only seq to the list.
Example config:
========
ip prefix-list peer475-out-pfxlist seq 45 permit 2.138.0.0/16
ip prefix-list peer475-out-pfxlist seq 50 permit 0.0.0.0/0
!
route-map peer475-out permit 5
match ip address prefix-list peer475-out-pfxlist
Before:
========
ub20# do show route-map peer475-out prefix-table
ZEBRA:
IPv4 Prefix Route-map Index List
_______________ ____________________
0.0.0.0/0 (2)
(P)
peer475-out seq 5
2.138.0.0/16 (2)
(P) 0.0.0.0/0
peer475-out seq 5
IPv6 Prefix Route-map Index List
_______________ ____________________
BGP:
IPv4 Prefix Route-map Index List
_______________ ____________________
0.0.0.0/0 (2)
(P)
peer475-out seq 5
2.138.0.0/16 (2)
(P) 0.0.0.0/0
peer475-out seq 5
IPv6 Prefix Route-map Index List
_______________ ____________________
ub20# conf t
ub20(config)# ip prefix-list peer475-out-pfxlist seq 45 permit 2.138.0.0/16 le 32
ub20(config)# do show route-map peer475-out prefix-table
ZEBRA:
IPv4 Prefix Route-map Index List
_______________ ____________________
2.138.0.0/16 (2)
(P)
peer475-out seq 5
IPv6 Prefix Route-map Index List
_______________ ____________________
BGP:
IPv4 Prefix Route-map Index List
_______________ ____________________
2.138.0.0/16 (2)
(P)
peer475-out seq 5
IPv6 Prefix Route-map Index List
_______________ ____________________
ub20(config)#
After:
========
ub20(config)# do show route-map peer475-out prefix-table
ZEBRA:
IPv4 Prefix Route-map Index List
_______________ ____________________
0.0.0.0/0 (2)
(P)
peer475-out seq 5
2.138.0.0/16 (2)
(P) 0.0.0.0/0
peer475-out seq 5
IPv6 Prefix Route-map Index List
_______________ ____________________
BGP:
IPv4 Prefix Route-map Index List
_______________ ____________________
0.0.0.0/0 (2)
(P)
peer475-out seq 5
2.138.0.0/16 (2)
(P) 0.0.0.0/0
peer475-out seq 5
IPv6 Prefix Route-map Index List
_______________ ____________________
ub20(config)# ip prefix-list peer475-out-pfxlist seq 45 permit 2.138.0.0/16 le 32
ub20(config)# do show route-map peer475-out prefix-table
ZEBRA:
IPv4 Prefix Route-map Index List
_______________ ____________________
0.0.0.0/0 (2)
(P)
peer475-out seq 5
2.138.0.0/16 (2)
(P) 0.0.0.0/0
peer475-out seq 5
IPv6 Prefix Route-map Index List
_______________ ____________________
BGP:
IPv4 Prefix Route-map Index List
_______________ ____________________
0.0.0.0/0 (2)
(P)
peer475-out seq 5
2.138.0.0/16 (2)
(P) 0.0.0.0/0
peer475-out seq 5
IPv6 Prefix Route-map Index List
_______________ ____________________
bgpd: changing graceful-restart parameters should not be considered as error
vtysh will return an informational message to the user that changing any
graceful-shutdown related parameter will require a peer reset. This is should
not be treated as an error message (resulting in a return code of 1) but
rather as a simple information to the user.
This fixes GitHub issue https://github.com/FRRouting/frr/issues/8403
$ vtysh -c configure -c 'router bgp 100' -c 'bgp graceful-restart'
Graceful restart configuration changed, reset all peers to take effect
$ echo $?
0
David Lamparter [Thu, 8 Apr 2021 11:35:09 +0000 (13:35 +0200)]
lib: correctly exit CLI nodes on file config load
The (legacy) code for reading split configs tries to execute config
commands in parent nodes, but doesn't call the node_exit function when
it goes up to a parent node. This breaks BGP RPKI setup (and extended
syslog, which is in the next commit.)
Doing this correctly is a slight bit involved since the node_exit
callbacks should only be called if the command is actually executed on a
parent node.
David Lamparter [Sat, 10 Apr 2021 19:02:06 +0000 (21:02 +0200)]
lib: fix possible assert() fail in zlog_fd()
If the last message in a batched logging operation isn't printed due to
priority, this skips the code that flushes prepared messages through
writev() and can trigger the assert() at the end of zlog_fd().
Since any logmsg above info priority triggers a buffer flush, running
into this situation requires a log file target configured for info
priority, at least 1 message of info priority buffered, a debug message
buffered after that, and then a buffer flush (explicit or due to buffer
full).
I haven't seen this chain of events happen in the wild, but it needs
fixing anyway.
David Lamparter [Mon, 22 Mar 2021 19:21:19 +0000 (20:21 +0100)]
tools: run `vtysh -b` once for all-startup
As noted by Donald:
When FRR is starting all daemons (or restarting them all) FRR is reading
in the configuration 1 time for each daemon specified to run. This is
not a big deal if you have a very small configuration. But with large
configurations FRR is taking long enough that watchfrr is not
establishing connection to all the daemons and starting some over.
Modify the code so that vtysh is only read in at the end of a all
sequence. If we are restarting an individual daemon allow the read in of
the whole config.
Reported-by: Donald Sharp <sharpd@nvidia.com> Signed-off-by: David Lamparter <equinox@diac24.net>
Igor Ryzhov [Wed, 14 Apr 2021 10:08:18 +0000 (13:08 +0300)]
lib: fix access-list deletion
Problems with the current implementation:
* Delete hook is called before the deletion of the access-list from the
master list, which means that daemons processing this hook successfully
find this access-list, store a pointer to it in their structures, and
right after that the access-list is freed. Daemons end up having stale
pointer to the freed structure.
* Route-maps are not notified of the deletion.
Igor Ryzhov [Thu, 8 Apr 2021 12:43:07 +0000 (15:43 +0300)]
lib: fix interface nb stale pointers
The first change in this commit is the processing of the VRF termination.
When we terminate the VRF, we should not delete the underlying interfaces,
because there may be pointers to them in the northbound configuration. We
should move them to the default VRF instead.
Because of the first change, the VRF interface itself is also not deleted
when deleting the VRF. It should be handled in netlink_link_change. This
is done by the second change.
Don Slice [Fri, 19 Mar 2021 19:10:14 +0000 (15:10 -0400)]
tools: frr-reload fixes for deleting vrf static routes
Problems reported that in certain cases, frr-reload.py would
delete vrf static routes inadvertantly due to two different
reasons. First, vrf statics with null0 or Null0 nexthops would
fail the match since rendered as blackholes. This was already
fixed for non-vrf statics so added for vrf-based. Second,
frr-reload would fail to match due to different formats for
adding the command. If entered in the old way
"ip route x.x.x.x/x y.y.y.y vrf NAME" and rendered
in the new sway "vrf NAME\nip route x.x.x.x/x y.y.y.y" it would
fail to match do an inadvertant delete.
Igor Ryzhov [Thu, 1 Apr 2021 12:42:53 +0000 (15:42 +0300)]
bbfd: clear nb config entries when removing bfd node
When bfd node is removed, we must clear all NB entries set by its
children - sessions and profiles. Let's store some fake data as an entry
for the bfd node to be able to unset it later.
Igor Ryzhov [Mon, 29 Mar 2021 11:47:43 +0000 (14:47 +0300)]
ospfd: fix counting of "ip ospf area" commands
Instead of trying to maintain if_ospf_cli_count, let's directly count
the number of configured interfaces when it is needed. Current approach
sometimes leads to an incorrect counter.
Igor Ryzhov [Sun, 14 Feb 2021 02:39:00 +0000 (05:39 +0300)]
zebra: fix vni configuration in default vrf
VNI configuration is done without NB layer in default VRF. It leads to
the following problems:
```
vtysh -c "conf" -c "vni 1"
vtysh -c "conf" -c "vrf default" -c "no vni"
```
Second command does nothing, because the NB node is not created by the
first command.
```
vtysh -c "conf" -c "vrf default" -c "vni 1"
vtysh -c "conf" -c "no vni 1"
```
Second command doesn't delete the NB node created by the first command.
ckishimo [Tue, 16 Mar 2021 22:47:18 +0000 (23:47 +0100)]
ospf6d: fix iface commands lost when removing from area
In OSPFv3 when removing the interface from an area, all ospf6
interface commands are lost, so when changing the area you need
to reconfigure all ospf6 interface commands again
r1# conf t
r1(config)# router ospf6
r1(config-ospf6)# no interface r1-r2-eth0 area 0.0.0.0
r1(config-ospf6)# exit
r1# sh run
interface r1-r2-eth0
ipv6 address 2013:12::1/64
! <----- missing all ipv6 ospf6 commands
router ospf6
ospf6 router-id 1.1.1.1
!
This is because the interface is being deleted instead of disabled
(see PR#7717) I believe the interface should be left as disabled
(not deleted) when removing the interface from the area
Chirag Shah [Mon, 25 Jan 2021 19:44:56 +0000 (11:44 -0800)]
lib: fix a crash in plist update
Problem:
Prefix-list with mulitiple rules, an update to
a rule/sequence with different prefix/prefixlen
reset prefix-list next-base pointer to avoid
having stale value.
In some case the old next-bast's reference leads
to an assert in tri (trie_install_fn ) add.
bt:
(object=0x55576a4c8a00, updptr=0x55576a4b97e0) at lib/plist.c:560
(plist=0x55576a4a1770, pentry=0x55576a4c8a00) at lib/plist.c:585
(ple=0x55576a4c8a00) at lib/plist.c:745
(args=0x7fffe04beb50) at lib/filter_nb.c:1181
Solution:
Reset prefix-list next-base pointer whenver a
sequence/rule is updated.
Ticket:CM-33109
Testing Done:
Signed-off-by: Chirag Shah <chirag@nvidia.com> Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Igor Ryzhov [Tue, 9 Mar 2021 22:17:47 +0000 (01:17 +0300)]
bfdd: fix starting echo receive timer
Currently this timer is only started when we receive the first echo
packet. If we never receive the packet, the timer is never started and
the user falsely assumes that echo function is working.
Mark Stapp [Tue, 9 Mar 2021 16:13:41 +0000 (11:13 -0500)]
bgpd: handle socket read errors in the main pthread
Add a handler for socket errors that runs in the main pthread,
rather than the io pthread. When the io pthread encounters a
read error, capture the error and schedule a task for the main
pthread.
Igor Ryzhov [Wed, 10 Mar 2021 19:11:19 +0000 (22:11 +0300)]
bfdd: make sessions administratively up by default
Current behavior is inconsistent. When the session is created by another
daemon, it is up by default. When we later configure peer in bfdd, the
session is still up, but the NB layer thinks that it is down.
More than that, even when the session is created in bfdd using peer
command, it is created in DOWN state, not ADM_DOWN. And it actually
starts sending and receiving packets. The sessions is marked with
SHUTDOWN flag only when we try to reconfigure some parameter. This
behavior is also very unexpected.
Igor Ryzhov [Fri, 26 Feb 2021 16:17:28 +0000 (19:17 +0300)]
lib: fix crash when iterating over nb operational data
Example:
```
show yang operational-data /frr-routing:routing/control-plane-protocols/control-plane-protocol[type='frr-staticd:staticd'][name='staticd'][vrf='default'] staticd
```
Igor Ryzhov [Tue, 9 Mar 2021 20:08:41 +0000 (23:08 +0300)]
bfdd: fix detect timeout
RFC 5880 Section 6.8.4:
In Asynchronous mode, the Detection Time calculated in the local
system is equal to the value of Detect Mult received from the remote
system, multiplied by the agreed transmit interval of the remote
system (the greater of bfd.RequiredMinRxInterval and the last
received Desired Min TX Interval).
ospf6 keeps a flag to remember whether the cost for an interface
was manually added via config or computed automatically, but if
the configured value matches the auto-computed one we were not
setting this flag, meaning that the config would not show up in
the config.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Igor Ryzhov [Thu, 4 Mar 2021 18:17:20 +0000 (21:17 +0300)]
bfdd: fix echo configuration in profile
It's not currently possible to configure echo mode in profile node:
```
(config)# bfd
(config-bfd)# profile test
(config-bfd-profile)# echo-mode
% Echo mode is only available for single hop sessions.
(config-bfd-profile)# echo-interval 20
% Echo mode is only available for single hop sessions.
```
Rafael Zalamena [Tue, 1 Dec 2020 11:01:37 +0000 (08:01 -0300)]
bfdd: session specific command type checks
Replace the unclear error message:
```
% Failed to edit configuration.
YANG error(s):
Schema node not found.
YANG path: /frr-bfdd:bfdd/bfd/sessions/single-hop[dest-addr='192.168.253.6'][interface=''][vrf='default']/minimum-ttl
```
With:
```
frr(config-bfd-peer)# minimum-ttl 250
% Minimum TTL is only available for multi hop sessions.
! or
frr(config-bfd-peer)# echo
% Echo mode is only available for single hop sessions.
frr(config-bfd-peer)# echo-interval 300
% Echo mode is only available for single hop sessions.
```
Reported-by: Trae Santiago Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Trey Aspelund [Thu, 4 Mar 2021 02:05:56 +0000 (02:05 +0000)]
bgpd: fix bgp statistics for l2vpn evpn
'show bgp l2vpn evpn statistics' was returning 0 for all stats
because bgp_table_stats_walker bailed out if afi != AFI_IP or AFI_IP6.
Add case condition to catch AFI_L2VPN.
Igor Ryzhov [Wed, 3 Mar 2021 21:13:44 +0000 (00:13 +0300)]
doc: fix link for python2 get-pip.py
Script by the current link doesn't work with Python 2 anymore:
```
ERROR: This script does not work on Python 2.7 The minimum supported Python version is 3.6.
Please use https://bootstrap.pypa.io/2.7/get-pip.py instead.
```
Donald Sharp [Wed, 17 Mar 2021 02:28:29 +0000 (22:28 -0400)]
bgpd: If we have a SAFI conflict do not allow labeled unicast to reset
If we have a SAFI conflict, ie we are trying to activate safi's
UNICAST and LABELED_UNICAST at the same time, we should not
cause bestpath to be rerun and we should not try to put
labels on everything.
Martin Winter [Thu, 4 Mar 2021 02:14:50 +0000 (03:14 +0100)]
FRRouting Release 7.5.1
This is a maintenance release with the following fixes:
BABEL
Fix connected route leak on change
BFD
Session lookup was sometimes wrong
Memory leak and handling cleanups
In some situations handle vrf appropriately when receiving packets
BGP
Peer Group Inheritance Fixes
Dissallow attempt to peer peers reachable via blackholes
Send BMP down message when reachability fails
Cleanup handling of aggregator data when the AGG AS is 0
Handle `neighbor <peer-group allowas-in` config changes properly
Properly parse community and lcommunity values in some circumstances
Allow peer-groups to configure `ttl-security hops`
Prevent v6 routes with v4 nexthops from being installed
Allow `default-originate` to be cleared from a peer group
Fix evpn route-map vni filter at origin
local routes were using non-default distance
Properly track if the nexthop was updated in some circumstances
Cleanup `show running` when running bgp with `-e X` values
Various Memory leaks in show commands
Properly withdraw exported routes when deleting a VRF
Avoid resetting ebgp-multihop if peer setting is the same as peer-group
Properly encode flowspec rules to zebra in some rare circumstances
Generate statistics for routes in bgp when we have exactly 1 route
Properly apply route-map for the default-originate command
EIGRP
Properly set MTU for eigrp packets sent
Various memory leaks and using uninited data fixes
ISIS
When last area address is removed, resign if we were the DR
Various memory leaks and using uninited data fixes
LDP
Various memory leaks and using uninited data fixes
NHRP
Use onlink routes when prefix == nh
Shortcut routes are installed with proper nexthop
OSPF
Prevent duplicate packet read in multiple vrf situation
Fix area removal at interface level
Restore Point to MultiPoint interface types
Correctly handle MTU change on startup
Multi Instance initialization sometimes was not successful
NSSA translate-always was not working properly
OSPFv3
Don't send hellos on loopback interfaces
Handle ECMP better when a sub-path is removed
Memory leak and handling fixes
Fix Link LSA not updating when router priority is modified
Some output from show commands was wrong
Intra area remote connected prefixes sometimes not installed
PBR
Various memory leaks and using uninited data fixes
PIM
SGRpt prune received during prune didn't override holdtime
Various memory leaks and using uninited data fixes
STATIC
Fix VRF and usage on startup in some instances
Tableid was being mishandled in some cases
VTYSH
Disable bracketed paste in readline.
WATCHFRR
Various memory leaks and using uninited data fixes
ZEBRA
Always install blackhole routes using kernel routes instead of nexthops
Various memory leaks and using uninited data fixes
Dissallow resolution to duplicate nexthops that created infinite nexthops
Apply the route-map delay-timer globally
Some routes were stuck in Queued state when using the FPM
Better handle vrf creation when using namespaces
Set NUD_NOARP on sticky mac entries in addtion to NTF_STICKY
Allow `set src X` to work on startup
FRR Library
Fix a variety of memory leaks
Fix VRF Creation in some instances
RPKI context editing was not properly handled in reload situations
routemap code was not properly handling modification of CLI in some instances
SNAPCRAFT
Update to using rtrlib 0.7.0
Fix passthrough path for Libyang 1.x
ALPINE
Remove old docker deps
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
Mark Stapp [Mon, 21 Sep 2020 19:57:59 +0000 (15:57 -0400)]
lib: avoid signal-handling race with event loop poll call
Manage the main pthread's signal mask to avoid a signal-handling
race. Before entering poll, check for pending signals that the
application needs to handle. Use ppoll() to re-enable those
signals during the poll call.
Mark Stapp [Wed, 2 Sep 2020 20:25:00 +0000 (16:25 -0400)]
lib: add sigevent_check api
Add an api that blocks application-handled signals (SIGINT,
SIGTERM, e.g.) then tests whether any signals have been received.
This helps to manage a race between signal reception and the poll
call in the main event loop.