Quentin Young [Thu, 28 Sep 2023 14:49:37 +0000 (10:49 -0400)]
doc: unpin sphinx from 4.0.2
requirements.txt was pinning sphinx at a very old version. This version
doesn't work in recent versions of Python; the new RTD configuration
made RTD respect our requirements file, breaking the build.
Signed-off-by: Quentin Young <qlyoung@qlyoung.net>
Igor Ryzhov [Wed, 27 Sep 2023 23:45:05 +0000 (02:45 +0300)]
vtysh: fix entering configuration node in file-lock mode
When the config node is entered in file-lock mode, we should actually
remember it to correctly apply the workaround in `vtysh_exit`.
Otherwise, the file-lock mode is dropped once we exit any node one level
below the config node.
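As a hedged sketch of the idea (illustrative names only, not vtysh's actual code): entering the config node records that file-lock mode was active, so the exit path only drops it when the config node itself is left.

```
#include <stdbool.h>

/* Illustrative flag, not the real vtysh state: set when the config node
 * is entered while file-lock mode is active. */
static bool config_entered_with_file_lock;

static void enter_config_node(bool file_lock_mode)
{
	if (file_lock_mode)
		config_entered_with_file_lock = true;
}

static void exit_node(bool leaving_config_node)
{
	/* Only drop file-lock mode when the config node itself is left,
	 * not when exiting a node one level below it. */
	if (leaving_config_node)
		config_entered_with_file_lock = false;
}
```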
Igor Ryzhov [Wed, 27 Sep 2023 23:41:16 +0000 (02:41 +0300)]
vty: fix working in file-lock mode
When the configuration node is entered in file-lock mode, the candidate
and running datastores are locked. Any configuration change is followed
by an implicit commit, which leads to a crash of mgmtd, because taking
the lock twice is prohibited by an assert. When working in file-lock
mode, we shouldn't do implicit commits; disable them by allowing pending
configuration changes.
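A minimal sketch of the intended control flow, with stand-in names only (`struct session`, `implicit_commit()` and friends are not FRR's actual mgmtd API):

```
#include <stdbool.h>

/* Illustrative stand-ins, not FRR's real types or functions. */
struct session {
	bool file_locked; /* datastores already locked (file-lock mode) */
	bool pending;     /* edits queued for a later explicit commit */
};

static void implicit_commit(struct session *s)
{
	/* Would lock candidate+running, commit, unlock; taking the locks
	 * a second time is what trips the assert described above. */
	(void)s;
}

static void config_edit(struct session *s)
{
	if (s->file_locked)
		s->pending = true; /* allow pending changes, no implicit commit */
	else
		implicit_commit(s);
}
```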
David Lamparter [Sun, 24 Sep 2023 18:12:42 +0000 (20:12 +0200)]
lib: assert for VTY_PASSFD expectations
Coverity is complaining that vty->state could be VTY_PASSFD here. It
can't, it really shouldn't, and if it actually is then something went
seriously wrong somewhere earlier so assert()ing out is the best thing
to do.
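A sketch of the defensive check (`vty->state` and `VTY_PASSFD` come from the commit; the surrounding types and function are illustrative):

```
#include <assert.h>

enum vty_state { VTY_NORMAL, VTY_PASSFD /* , ... */ };

struct vty {
	enum vty_state state;
};

static void vty_handle(struct vty *vty)
{
	/* Coverity thinks state could be VTY_PASSFD here; if it ever is,
	 * something went badly wrong earlier, so fail hard. */
	assert(vty->state != VTY_PASSFD);
}
```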
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
In test_bgp_srv6l3vpn_sid.py we have a comment containing some '\'
characters. Python mistakenly tries to interpret such "\" characters
as escape sequences, which leads to the above warning.
Let's tell Python to treat the comment as a raw string,
so that it simply treats backslashes as literal characters rather than
escape sequences.
bgpd: Initialise timebuf arrays to zeros for dampening reuse timer
Avoid having something like this in outputs:
Before:
```
munet> r1 shi vtysh -c 'show bgp dampening damp'
BGP table version is 10, local router ID is 10.10.10.1, vrf id 0
Default local pref 100, local AS 65001
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
munet> r1 shi vtysh -c 'show bgp dampening flap'
BGP table version is 10, local router ID is 10.10.10.1, vrf id 0
Default local pref 100, local AS 65001
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
```
After:
```
munet> r1 shi vtysh -c 'show bgp dampening damp '
BGP table version is 10, local router ID is 10.10.10.1, vrf id 0
Default local pref 100, local AS 65001
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
munet> r1 shi vtysh -c 'show bgp dampening flap'
BGP table version is 10, local router ID is 10.10.10.1, vrf id 0
Default local pref 100, local AS 65001
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
```
Donald Sharp [Thu, 21 Sep 2023 19:30:08 +0000 (15:30 -0400)]
bgpd: Ensure send order is 100% consistent
When BGP is sending updates to peers on a neighbor-up event, it was
noticed that the updates were being sent to the first peer in reverse
order.
Imagine r1 -- r2 -- r3. r1 and r2 are ebgp peers and
r2 and r3 are ebgp peers. r1's interface to r2 is currently
shut down. Prior to this fix, the send order would look like this:
r1 -> r2 send of routes to r2 and then they would be installed in order
received:
10.0.0.12 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.11 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.10 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.9 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.8 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.7 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.6 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.5 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.4 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.3 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.2 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.1 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
r2 would then send these routes to r3 and then they would be installed
in order received:
10.0.0.1 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.2 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.3 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.4 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.5 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.6 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.7 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.8 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.9 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.10 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.11 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.12 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
Not that big of a deal, right? Well, imagine a situation where r1 is
originating several tens of thousands of routes. It sends routes to r2;
r2 is processing those routes, but in reverse order, and at the same
time it is sending routes to r3 in the correct order of the bgp table.
r3 will have the early 10.0.0.1/32 routes installed and start forwarding
while r2 will not have those routes installed yet (since they were at
the end, and zebra is slightly slower at processing routes than bgp is).
Ensure that the order sent is a true FIFO. What is happening is that
there is an update FIFO which stores all routes, and off that FIFO hangs
a bgp advertise attribute list which stores the prefixes that share the
same attribute, allowing for more efficient packing. This list was being
stored in reverse order, causing the problem for the initial send. When
adding items to this list, put them at the end so that we keep the FIFO
order that is traversed when we walk through the bgp table; see the
sketch below.
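A minimal illustration of the list-insertion change (illustrative structures, not FRR's actual bgp_advertise code): head insertion reverses the order of the table walk, while tail insertion preserves it.

```
#include <stddef.h>

struct adv {
	struct adv *next;
};

struct adv_list {
	struct adv *head;
	struct adv **tail; /* init: list->tail = &list->head */
};

/* Old (buggy) pattern: head insertion, so the prefixes come out in the
 * reverse of the order they were queued (tail upkeep omitted; shown only
 * to illustrate the reversed order). */
static void adv_add_head(struct adv_list *list, struct adv *a)
{
	a->next = list->head;
	list->head = a;
}

/* Fixed pattern: tail insertion keeps the FIFO order of the table walk. */
static void adv_add_tail(struct adv_list *list, struct adv *a)
{
	a->next = NULL;
	*list->tail = a;
	list->tail = &a->next;
}
```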
After the fix:
r2 installation order:
10.0.0.0 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.1 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.2 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.3 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.4 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.5 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.6 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.7 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.8 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.9 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.10 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.11 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.12 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
r3 installation order:
10.0.0.0 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.1 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.2 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.3 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.4 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.5 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.6 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.7 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.8 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.9 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.10 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.11 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.12 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
When isis_zebra_process_srv6_locator_chunk() returns prematurely
due to an error, do not forget to free memory allocated by
srv6_locator_chunk_alloc().
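The pattern, as a hedged standalone sketch (stand-in names; `chunk_alloc()`/`chunk_free()` model srv6_locator_chunk_alloc() and its free counterpart, and the error condition is illustrative):

```
#include <stdlib.h>

struct chunk { int data; };

static struct chunk *chunk_alloc(void)
{
	return calloc(1, sizeof(struct chunk));
}

static void chunk_free(struct chunk **c)
{
	free(*c);
	*c = NULL;
}

static int process_chunk(int decode_failed)
{
	struct chunk *c = chunk_alloc();

	if (decode_failed) {
		chunk_free(&c); /* the fix: release the chunk on early return */
		return -1;
	}

	/* ... normal processing ... */
	chunk_free(&c);
	return 0;
}
```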
tests: Adding BGP convergence verification before starting PIM tests
Issue: Sometimes BGP neighbors are not up before doing any PIM
operation, which causes some test failures.
https://github.com/FRRouting/frr/issues/14441
Fix: Added BGP convergence verification to all tests where BGP is used,
to make sure all BGP neighbors are up before the PIM tests start.
Philippe Guibert [Wed, 20 Sep 2023 11:58:29 +0000 (13:58 +0200)]
isisd: fix crash when configuring srv6 locator without isis instance
After the ISIS daemon is launched, the configuration of an srv6
locator in zebra triggers a crash:
> #4 0x00007f1f0ea980f3 in core_handler (signo=11, siginfo=0x7ffdb750de70, context=0x7ffdb750dd40)
> at /build/make-pkg/output/_packages/cp-routing/src/lib/sigevent.c:262
> #5 <signal handler called>
> #6 0x00005651a05783ef in isis_zebra_process_srv6_locator_add (cmd=117, zclient=0x5651a21d9bd0, length=25, vrf_id=0)
> at /build/make-pkg/output/_packages/cp-routing/src/isisd/isis_zebra.c:1258
> #7 0x00007f1f0ead5ac9 in zclient_read (thread=0x7ffdb750e750) at /build/make-pkg/output/_packages/cp-routing/src/lib/zclient.c:4246
> #8 0x00007f1f0eab19d4 in thread_call (thread=0x7ffdb750e750) at /build/make-pkg/output/_packages/cp-routing/src/lib/thread.c:1825
> #9 0x00007f1f0ea4862e in frr_run (master=0x5651a1f65a40) at /build/make-pkg/output/_packages/cp-routing/src/lib/libfrr.c:1155
> #10 0x00005651a051131a in main (argc=5, argv=0x7ffdb750e998, envp=0x7ffdb750e9c8)
> at /build/make-pkg/output/_packages/cp-routing/src/isisd/isis_main.c:282
> (gdb) f 6
> #6 0x00005651a05783ef in isis_zebra_process_srv6_locator_add (cmd=117, zclient=0x5651a21d9bd0, length=25, vrf_id=0)
> at /build/make-pkg/output/_packages/cp-routing/src/isisd/isis_zebra.c:1258
> (gdb) print isis
> $1 = (struct isis *) 0x0
> (gdb) print isis->area_list
> Cannot access memory at address 0x28
The isis pointer is NULL, because no ISIS instance has been configured
yet.
Fix this by checking that an isis instance is available when the zebra
hooks related to srv6 are received.
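A minimal sketch of the guard (FRR-specific types and the real callback signature are elided; `isis` is the daemon's global instance pointer seen in the backtrace above):

```
struct isis;              /* opaque here; real definition lives in isisd */
extern struct isis *isis; /* NULL until an ISIS instance is configured */

static int isis_zebra_process_srv6_locator_add(void)
{
	if (!isis)
		return 0; /* no instance yet: ignore the zebra SRv6 hook */

	/* ... original processing that dereferenced `isis` ... */
	return 0;
}
```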
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
bgpd,lib,sharpd,zebra: srv6 introduce multiple segs/SIDs in nexthop
Adapt zebra and lib to use multiple SRv6 seg SIDs, and keep one
seg SID for bgpd and sharpd.
Note: bgpd and sharpd compilation relies on the lib and zebra files,
i.e. if we split lib, zebra, bgpd, and sharpd into separate
commits, the result will not compile.
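A rough sketch of the data-model change (purely illustrative; the field names, the stub address type, and the segment limit are assumptions, not FRR's actual definitions):

```
#define ILLUSTRATIVE_MAX_SEGS 8

struct in6_addr_stub { unsigned char s6_addr[16]; }; /* stand-in */

/* before: a nexthop carried a single SRv6 seg SID */
struct nh_srv6_single {
	struct in6_addr_stub seg;
};

/* after: a nexthop can carry a stack of seg SIDs */
struct nh_srv6_multi {
	unsigned char num_segs;
	struct in6_addr_stub segs[ILLUSTRATIVE_MAX_SEGS];
};
```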
David Lamparter [Wed, 20 Sep 2023 12:46:10 +0000 (14:46 +0200)]
lib: straight return on error on log open fail
I think I originally had some other code at the tail end of that
function, but that's not the case anymore, and dropping out of the
function with a straight "return -1" is more useful than trucking on
with an invalid fd.
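As a hedged sketch (illustrative wrapper; only the straight "return -1" on open failure reflects the commit):

```
#include <fcntl.h>

static int log_file_open(const char *path)
{
	int fd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0644);

	if (fd < 0)
		return -1; /* bail out instead of trucking on with a bad fd */

	return fd;
}
```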
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>