Mark Stapp [Fri, 29 Sep 2023 16:08:17 +0000 (12:08 -0400)]
zebra: be more careful removing 'installed' flag from nhgs
When interface addresses change, we examine nhgs associated
with the interface in case they need to be reinstalled. As
part of that, we may need to reinstall ecmp nhgs that use the
interface being examined - but not always.
In test_bgp_srv6l3vpn_sid.py we have a comment containing some '\'
characters. Python mistakenly tries to interpret such "\" characters
as escape sequences, which leads to the above warning.
Let's tell Python to treat the comment as a raw string,
so that it simply treats backslashes as literal characters rather than
escape sequences.
bgpd: Initialise timebuf arrays to zeros for dampening reuse timer
Avoid having something like this in outputs:
Before:
```
munet> r1 shi vtysh -c 'show bgp dampening damp'
BGP table version is 10, local router ID is 10.10.10.1, vrf id 0
Default local pref 100, local AS 65001
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
munet> r1 shi vtysh -c 'show bgp dampening flap'
BGP table version is 10, local router ID is 10.10.10.1, vrf id 0
Default local pref 100, local AS 65001
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
```
munet> r1 shi vtysh -c 'show bgp dampening damp '
BGP table version is 10, local router ID is 10.10.10.1, vrf id 0
Default local pref 100, local AS 65001
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
munet> r1 shi vtysh -c 'show bgp dampening flap'
BGP table version is 10, local router ID is 10.10.10.1, vrf id 0
Default local pref 100, local AS 65001
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Donald Sharp [Thu, 21 Sep 2023 19:30:08 +0000 (15:30 -0400)]
bgpd: Ensure send order is 100% consistent
When BGP is sending updates to peers on a neighbor up event
it was noticed that the bgp updates being sent were in reverse
order being sent to the first peer.
Imagine r1 -- r2 -- r3. r1 and r2 are ebgp peers and
r2 and r3 are ebgp peers. r1's interface to r2 is currently
shutdown. Prior to this fix the send order would look like this:
r1 -> r2 send of routes to r2 and then they would be installed in order
received:
10.0.0.12 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.11 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.10 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.9 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.8 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.7 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.6 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.5 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.4 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.3 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.2 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.1 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
r2 would then send these routes to r3 and then they would be installed
in order received:
10.0.0.1 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.2 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.3 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.4 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.5 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.6 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.7 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.8 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.9 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.10 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.11 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.12 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
Not that big of a deal right? Well imagine a situation where r1 is
originating several ten's of thousands of routes. It sends routes to r2
r2 is processing routes but in reverse order and at the same time it
is sending routes to r3, in the correct order of the bgp table.
r3 will have the early 10.0.0.1/32 routes installed and start forwarding
while r2 will not have those routes installed yet( since they were at the
end and zebra is slightly slower for processing routes than bgp is ).
Ensure that the order sent is a true FIFO. What is happening is that
there is an update fifo which stores all routes. And off that FIFO
is a bgp advertise attribute list which stores the list of prefixes
which share the same attribute that allow for more efficient packing
this list was being stored in reverse order causing the problem for
the initial send. When adding items to this list put them at the
end so we keep the fifo order that is traversed when we walk through
the bgp table.
After the fix:
r2 installation order:
10.0.0.0 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.1 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.2 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.3 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.4 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.5 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.6 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.7 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.8 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.9 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.10 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.11 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.12 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
r3 installation order:
10.0.0.0 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.1 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.2 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.3 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.4 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.5 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.6 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.7 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.8 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.9 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.10 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.11 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.12 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
When isis_zebra_process_srv6_locator_chunk() returns prematurely
due to an error, do not forget to free memory allocated by
srv6_locator_chunk_alloc().
tests: Adding BGP convergence verification before starting PIM tests
Issue: Sometimes BGP neighbors are not up before doing any PIM
operation, that is causing some tests failures.
https://github.com/FRRouting/frr/issues/14441
Fix: Added BGP convergence for all tests where BGP is used to make
sure all BGP neigbhors
Philippe Guibert [Wed, 20 Sep 2023 11:58:29 +0000 (13:58 +0200)]
isisd: fix crash when configuring srv6 locator without isis instance
After the ISIS daemon is launched, the configuration of an srv6
locator in zebra triggers a crash:
> #4 0x00007f1f0ea980f3 in core_handler (signo=11, siginfo=0x7ffdb750de70, context=0x7ffdb750dd40)
> at /build/make-pkg/output/_packages/cp-routing/src/lib/sigevent.c:262
> #5 <signal handler called>
> #6 0x00005651a05783ef in isis_zebra_process_srv6_locator_add (cmd=117, zclient=0x5651a21d9bd0, length=25, vrf_id=0)
> at /build/make-pkg/output/_packages/cp-routing/src/isisd/isis_zebra.c:1258
> #7 0x00007f1f0ead5ac9 in zclient_read (thread=0x7ffdb750e750) at /build/make-pkg/output/_packages/cp-routing/src/lib/zclient.c:4246
> #8 0x00007f1f0eab19d4 in thread_call (thread=0x7ffdb750e750) at /build/make-pkg/output/_packages/cp-routing/src/lib/thread.c:1825
> #9 0x00007f1f0ea4862e in frr_run (master=0x5651a1f65a40) at /build/make-pkg/output/_packages/cp-routing/src/lib/libfrr.c:1155
> #10 0x00005651a051131a in main (argc=5, argv=0x7ffdb750e998, envp=0x7ffdb750e9c8)
> at /build/make-pkg/output/_packages/cp-routing/src/isisd/isis_main.c:282
> (gdb) f 6
> #6 0x00005651a05783ef in isis_zebra_process_srv6_locator_add (cmd=117, zclient=0x5651a21d9bd0, length=25, vrf_id=0)
> at /build/make-pkg/output/_packages/cp-routing/src/isisd/isis_zebra.c:1258
> (gdb) print isis
> $1 = (struct isis *) 0x0
> (gdb) print isis->area_list
> Cannot access memory at address 0x28
The isis pointer is NULL, because no instances have already been
configured on the ISIS instance.
Fix this by checking that there is any isis instance available when
zebra hooks related to srv6 are received.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
bgpd,lib,sharpd,zebra: srv6 introduce multiple segs/SIDs in nexthop
Append zebra and lib to use muliple SRv6 segs SIDs, and keep one
seg SID for bgpd and sharpd.
Note: bgpd and sharpd compilation relies on the lib and zebra files,
i.e if we separate this: lib or zebra or bgpd or sharpd in different
commits - this will not compile.
David Lamparter [Wed, 20 Sep 2023 12:46:10 +0000 (14:46 +0200)]
lib: straight return on error on log open fail
I think I originally had some other code at the tail end of that
function, but that's not the case anymore, and dropping out of the
function with a straight "return -1" is more useful than trucking on
with an invalid fd.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
David Lamparter [Tue, 19 Sep 2023 19:03:24 +0000 (21:03 +0200)]
lib: constrain hash table "tabshift" both ways
The previous change to assume() did address the coverity warning about
one direction of the shift in HASH_KEY, let's constrain the other in
HASH_SIZE as well.
To be fair, the hash table *will* break at 1G entries, but at that point
we have other problems RAM-wise. (Could bump the thing to 64-bit, but
then we need better item hash functions too on every single user.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Donald Sharp [Thu, 14 Sep 2023 11:18:37 +0000 (07:18 -0400)]
pimd: Use a better name for oil_parent
Use oil_incoming_vif instead of oil_parent. I had
to go look this up as that I failed to remember that
the linux kernel calls this parent for some bizarre
reason.
David Lamparter [Sat, 16 Sep 2023 12:17:24 +0000 (14:17 +0200)]
ospf6d: fix uninitialized warnings
GCC 13.2.0 complains:
```
ospf6d/ospf6_intra.c:139:25: error: ‘json_arr’ may be used uninitialized [-Werror=maybe-uninitialized]
ospf6d/ospf6_intra.c:485:20: error: ‘json_arr’ may be used uninitialized [-Werror=maybe-uninitialized]
```
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This patch includes:
* Implementation of RFC 5709 support in OSPF. Using
openssl library and FRR key-chain,
one can use SHA1, SHA256, SHA384, SHA512 and
keyed-MD5( backward compatibility with RFC 2328) HMAC algs.
* Updating documentation of OSPF
* add topotests for new HMAC algorithms
*** CID 1568129: Null pointer dereferences (REVERSE_INULL)
/isisd/isis_tlvs.c: 2813 in unpack_item_srv6_end_sid()
2807 sid->subsubtlvs = NULL;
2808 }
2809
2810 append_item(&subtlvs->srv6_end_sids, (struct isis_item *)sid);
2811 return 0;
2812 out:
>>> CID 1568129: Null pointer dereferences (REVERSE_INULL)
>>> Null-checking "sid" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
2813 if (sid)
2814 free_item_srv6_end_sid((struct isis_item *)sid);
2815 return 1;
2816 }
2817
2818 /* Functions related to TLVs 1 Area Addresses */