Donald Sharp [Thu, 13 Oct 2022 19:54:43 +0000 (15:54 -0400)]
pimd: Remove pim_br vestiges
If PIM had received a register packet with the Border Router
bit set, pimd would have crashed. Since I wrote this code
in 2015 and really have pretty much no memory of this and
no-one has ever reported this crash, let's just remove this
code.
Louis Scalbert [Wed, 19 Oct 2022 12:34:43 +0000 (14:34 +0200)]
lib: fix the default TE bandwidth
When enabling the interface link-params, a default bandwidth is assigned
to the Max, Reservable and Unreserved Bandwidth variables. If the
bandwidth is set at in the interface context, this value is used.
Otherwise, a default bandwidth value of 10 Gbps is set.
Revert the default value to 10 Mbps as it was intended in the initial
commit. 10 Mbps is a low value so that the link will not be prioritized
when computing the paths.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Donald Sharp [Thu, 27 Oct 2022 13:21:41 +0000 (09:21 -0400)]
bgpd: Clarify what NHT error message means
When waiting on a path to reach the peer, modify the debug/show
output to give a better understanding to the operator about what
they should be looking for.
Sai Gomathi N [Thu, 27 Oct 2022 09:36:00 +0000 (02:36 -0700)]
pimd: Dereference before null check
In pim_ecmp_nexthop_search: All paths that lead to this null pointer comparison already dereference the pointer earlier
There may be a null pointer dereference, or else the comparison against null is unnecessary.
Sai Gomathi N [Thu, 27 Oct 2022 08:52:31 +0000 (01:52 -0700)]
pimd: Unchecked return value
In tib_sg_oil_setup: Value returned from a function is not checked for errors before being used.
If the function returns an error value, the error value may be mistaken for a normal value.
Here, only the nexthop value is being used. So casted the return type to void.
Ryoga Saito [Thu, 27 Oct 2022 01:17:50 +0000 (10:17 +0900)]
bgpd: Fix the condition whether nexthop is changed
Given that the following topology, route server MUST not modify NEXT_HOP
attribute because route server isn't in the actual routing path. This
behavior is required to comply RFC7947
(Router A) <-(eBGP peer)-> (Route Server) <-(eBGP peer)-> (Router B)
RFC7947 says as follows:
> As the route server does not participate in the actual routing of
> traffic, the NEXT_HOP attribute MUST be passed unmodified to the route
> server clients, similar to the "third-party" next-hop
> feature described in Section 5.1.3. of [RFC4271].
However, current FRR is violating RFC7947 in some cases. If routers and
route server established BGP peer over IPv6 connection and routers
advertise ipv4-vpn routes through route server, route server will modify
NEXT_HOP attribute in these advertisements.
This is because the condition to check whether NEXT_HOP attribute should
be changed or not is wrong. We should use (afi, safi) as the key to
check, but (nhafi, safi) is actually used. This causes the RFC7947
violation.
Trey Aspelund [Wed, 26 Oct 2022 20:53:09 +0000 (20:53 +0000)]
bgpd: Check for IP-format Site-of-Origin
When deciding whether to apply "neighbor soo" filtering towards a peer,
we were only looking for SoO ecoms that use either AS or AS4 encoding.
This makes sure we also check for IPv4 encoding, since we allow a user
to configure that encoding style against the peer.
Config:
```
router bgp 1
address-family ipv4 unicast
network 100.64.0.2/32 route-map soo-foo
neighbor 192.168.122.12 soo 3.3.3.3:20
exit-address-family
!
route-map soo-foo permit 10
set extcommunity soo 3.3.3.3:20
exit
```
Before:
```
ub20# show ip bgp neighbors 192.168.122.12 advertised-routes
BGP table version is 5, local router ID is 100.64.0.222, vrf id 0
Default local pref 100, local AS 1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 2.2.2.2/32 0.0.0.0 0 100 32768 i
*> 100.64.0.2/32 0.0.0.0 0 100 32768 i
Total number of prefixes 2
```
After:
```
ub20# show ip bgp neighbors 192.168.122.12 advertised-routes
BGP table version is 5, local router ID is 100.64.0.222, vrf id 0
Default local pref 100, local AS 1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 2.2.2.2/32 0.0.0.0 0 100 32768 i
Lou Berger [Fri, 21 Oct 2022 20:35:13 +0000 (20:35 +0000)]
ospf: optimization for FRR's P2MP mode
FRR implements a non-standard, but compatible approach for
sending update LSAs (it always send to 224.0.0.5) on P2MP
interfaces. This change makes it so acks are also sent to
224.0.0.5.
Since the acks are multicast, this allows an optimization
where we don't send back out the incoming P2MP interface
immediately allow time to rx multicast ack from neighbors
on the same net that rx'ed the original (multicast) update.
Wayne Morrison [Tue, 25 Oct 2022 14:45:35 +0000 (10:45 -0400)]
bgpd: fixed misaligned columns in BGP routes table
Column headers in BGP routes table are not aligned with data when
RPKI status is available. This was fixed to insert a space at the
beginning of the header and at the beginning of lines that do not
have RPKI status.
This fix requires that several testing templates be adjusted to
match the new output.
Signed-off-by: Wayne Morrison <wmorrison@netgate.com>
Manoj Naragund [Tue, 25 Oct 2022 07:43:10 +0000 (00:43 -0700)]
ospf6d: Fix for memory leak issues in ospf6.
Problem:
Multiple memory leaks in ospf6.
260 ==6637== 32 bytes in 1 blocks are definitely lost in loss record 5 of 24
261 ==6637== at 0x4C31FAC: calloc (vg_replace_malloc.c:762)
262 ==6637== by 0x4E8A1BF: qcalloc (memory.c:111)
263 ==6637== by 0x11EE27: ospf6_summary_add_aggr_route_and_blackhole (ospf6_asbr.c:2779)
264 ==6637== by 0x11EEBA: ospf6_originate_new_aggr_lsa (ospf6_asbr.c:2811)
265 ==6637== by 0x4E7C6A7: hash_clean (hash.c:325)
266 ==6637== by 0x11FA93: ospf6_handle_external_aggr_update (ospf6_asbr.c:3164)
267 ==6637== by 0x11FA93: ospf6_asbr_summary_process (ospf6_asbr.c:3386)
268 ==6637== by 0x4EB739B: thread_call (thread.c:1692)
269 ==6637== by 0x4E85B17: frr_run (libfrr.c:1068)
270 ==6637== by 0x119535: main (ospf6_main.c:228)
356 ==6637== 240 bytes in 12 blocks are indirectly lost in loss record 13 of 24
357 ==6637== at 0x4C2FE96: malloc (vg_replace_malloc.c:309)
358 ==6637== by 0x4E8A0DA: qmalloc (memory.c:106)
359 ==6637== by 0x13545C: ospf6_lsa_alloc (ospf6_lsa.c:724)
360 ==6637== by 0x1354E3: ospf6_lsa_create_headeronly (ospf6_lsa.c:756)
361 ==6637== by 0x1355F2: ospf6_lsa_copy (ospf6_lsa.c:790)
362 ==6637== by 0x13B58B: ospf6_dbdesc_recv_slave (ospf6_message.c:976)
363 ==6637== by 0x13B58B: ospf6_dbdesc_recv (ospf6_message.c:1038)
364 ==6637== by 0x13B58B: ospf6_read_helper (ospf6_message.c:1838)
365 ==6637== by 0x13B58B: ospf6_receive (ospf6_message.c:1875)
366 ==6637== by 0x4EB739B: thread_call (thread.c:1692)
367 ==6637== by 0x4E85B17: frr_run (libfrr.c:1068)
368 ==6637== by 0x119535: main (ospf6_main.c:228)
RCA:
1. when the ospf6 area is being deleted, the neighbor related information
was not being cleaned up.
2. when aggr route gets deleted from rt_aggr_tbl the corrsponding summary
route attched to the aggr route was not being deleted.
Fix:
Added the ospf6_neighbor_delete in ospf6_area_delete to free the
neighbor related information and added ospf6_route_delete while
freeing external aggr route to free the summary route.
Nico Berlee [Sun, 23 Oct 2022 14:42:51 +0000 (16:42 +0200)]
vtysh: Ensure an empty string does not get printed for host/domain
vtysh show running-config is showing:
frr version 8.3.1_git
frr defaults traditional
hostname test
log file /etc/frr/frr.log informational
log timestamp precision 3
domainname
service integrated-vtysh-config
domainname should not be printed in this case at all. If the
host has no search/domainname configured, frr_reload.py
crashes on invalid config from `vtysh show running-config`
Stephen Worley [Fri, 21 Oct 2022 16:45:50 +0000 (12:45 -0400)]
bgpd,doc: limit InQ buf to allow for back pressure
Add a default limit to the InQ for messages off the bgp peer
socket. Make the limit configurable via cli.
Adding in this limit causes the messages to be retained in the tcp
socket and allow for tcp back pressure and congestion control to kick
in.
Before this change, we allow the InQ to grow indefinitely just taking
messages off the socket and adding them to the fifo queue, never letting
the kernel know we need to slow down. We were seeing under high loads of
messages and large perf-heavy routemaps (regex matching) this queue
would cause a memory spike and BGP would get OOM killed. Modifying this
leaves the messages in the socket and distributes that load where it
should be in the socket buffers on both send/recv while we handle the
mesages.
Also, changes were made to allow the ringbuffer to hold messages and
continue to be filled by the IO pthread while we wait for the Main
pthread to handle the work on the InQ.
Memory spike seen with large numbers of routes flapping and route-maps
with dozens of regex matching:
```
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: > 2GB
Holding block headers: 516 KiB
Used small blocks: 0 bytes
Used ordinary blocks: 160 MiB
Free small blocks: 3680 bytes
Free ordinary blocks: > 2GB
Ordinary blocks: 121244
Small blocks: 83
Holding blocks: 1
```
With most of it being held by the inQ (seen from the stream datastructure info here):
```
Type : Current# Size Total Max# MaxBytes
...
...
Stream : 115543 variable 26963208159707403571708768
```
With this change that memory is capped and load is left in the sockets:
RECV Side:
```
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
ESTAB 265350 0 [fe80::4080:30ff:feb0:cee3]%veth1:36950 [fe80::4c14:9cff:fe1d:5bfd]:179 users:(("bgpd",pid=1393334,fd=26))
skmem:(r403688,rb425984,t0,tb425984,f1816,w0,o0,bl0,d61)
```
SEND Side:
```
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
ESTAB 0 1275012 [fe80::4c14:9cff:fe1d:5bfd]%veth1:179 [fe80::4080:30ff:feb0:cee3]:36950 users:(("bgpd",pid=1393443,fd=27))
skmem:(r0,rb131072,t0,tb1453568,f1916,w1300612,o0,bl0,d0)
```
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Mark Stapp [Thu, 20 Oct 2022 20:47:12 +0000 (16:47 -0400)]
bgpd: fix config of allowas_in; add to show output
Ensure that un-configuring allowas-in for a peer or group
clears the related flags and integer value. Tighten the use
of the integer counter so that it's only used when the config
flag is set. Add show output if allowas-in is enabled.
anlan_cs [Fri, 21 Oct 2022 05:17:29 +0000 (01:17 -0400)]
bgpd: return failure for wildcard ERT
The "RTLIST..." list should be maintained integrity. If wildcard check
failed, it should immediately return failure. Otherwise user configuration
will be partial.
```
anlan(config-router-af)# route-target export *:55 33:33
% Wildcard '*' only applicable for import
anlan(config-router-af)# route-target both *:55 33:33
% Wildcard '*' only applicable for import
```
With this commit, the RTs without wildcard will not be executed as before. And
the same for `no` form.
Louis Scalbert [Mon, 17 Oct 2022 15:35:12 +0000 (17:35 +0200)]
isisd: fix sending remote interface ip address after enabling MPLS TE
If MPLS TE is enabled, the router encodes the local and remote interface
IP address in the "Extended Reachability" TLV.
> east-vm(config)# do show isis database detail east-vm.00-00
> Extended Reachability: 0007.e901.3333.00 (Metric: 10)
> Local Interface IP Address(es): 10.126.0.2
> Remote Interface IP Address(es): 10.126.0.3
> Maximum Bandwidth: 1.76258e+08 (Bytes/sec)
The remote interface is added when the circuit adjacency comes up after
setting MPLS TE. However, if MPLS TE is enabled after, the remote
address is not added. It happens after disabling and re-enabling the
MPLS TE.
> east-vm(config)# router isis 1
> east-vm(config-router)# no mpls on
> east-vm(config-router)# mpls on
> east-vm(config)# do show isis database detail east-vm.00-00
> Extended Reachability: 0007.e901.3333.00 (Metric: 10)
> Local Interface IP Address(es): 10.126.0.2
> Maximum Bandwidth: 1.76258e+08 (Bytes/sec)
Update the remote IPv4 and IPv6 of all adjacencies after enabling MPLS
TE.
Fixes: 1b3f47d04c ("isisd: Update TLVs processing for TE, RI & SR") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Stephen Worley [Fri, 21 Oct 2022 15:18:12 +0000 (11:18 -0400)]
bgpd: fix vni_str NULL check in evpn rt show run
Fix the vni_str NULL check for wildcard route-targets
in evpn show run. This will never be NULL if we add 1
here. Though it should also never be NULL since ":" should
always exist. Better to be safe than sorry.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Donald Sharp [Wed, 19 Oct 2022 16:57:28 +0000 (12:57 -0400)]
lib: Remove unnecessary comparison, for linked list
In the comparison function for a linked list code was
always checking against passed in NULL's. The comparison
function will never receive a NULL value for data from
the linklist.c code.
Donald Sharp [Wed, 19 Oct 2022 16:44:55 +0000 (12:44 -0400)]
zebra: Fix debug of filtering out prefix due to routemap
The debug for notification about a filtered prefix was
just printing the nexthop ifindex and vrf id. Not all
nexthops have this data. Just print out the actual nexthop
Sarita Patra [Tue, 11 Oct 2022 01:38:14 +0000 (18:38 -0700)]
pimd, pim6d: Don't configure link-local, Multicast, Unspecified address as RP
Problem:
=======
frr(config)# do show ipv6 pim interface
Interface State Address PIM Nbrs PIM DR FHR IfChannels
ens192 up fe80::250:56ff:feb7:3619 0 local 0 1
Configure ens192 interface link-local address as RP.
frr(config)# ipv6 pim rp fe80::250:56ff:feb7:3619
No Path to RP address specified: fe80::250:56ff:feb7:3619
frr(config)# do show ipv6 pim rp-info
RP address group/prefix-list OIF I am RP Source Group-Type
fe80::250:56ff:feb7:3619 ff00::/8 Unknown yes Static ASM
Fix:
===
RP should not be link-local, multicast and unspecified address.
Introduced common api pim_process_unicast_bsm_cmd,
pim_process_no_unicast_bsm_cmd which will process
both "[no] ip pim unicast-bsm" command and "[no] ipv6 pim
unicast-bsm" command.
Currently, the SID transposition algorithm implemented in bgpd handles
incorrectly the SRv6 locators with function length greater than 20 bits.
To prevent issues, we currently limit the function length to 20 bits.
This limit will be removed when the bgpd SID transposition is fixed.
According to RFC 8986, the SRv6 SID length cannot exceed 128 bits. This
commit ensures that the condition
`block_len + node_len + function_len + arg_len <= 128` is satisfied when
a new SRv6 locator is created.
This commit extends the `bgp_srv6l3vpn_to_bgp_vrf3` topotest by adding
two tests:
* prevent bgpd from exporting routes from a VRF to the VPN RIB
(`no sid vpn per-vrf export`);
* enable bgpd to export routes from a VRF to the VPN RIB
(`sid vpn per-vrf export auto`).
The command `sid vpn per-vrf export (1-255)|auto` can be used to export
IPv4 and IPv6 routes from a VRF to the VPN RIB using a single SRv6 SID
(End.DT46 behavior).
This commit implements the no form of the above command, which can be
used to disable the export of the IPv4/IPv6 routes:
`no sid vpn per-vrf export`.
This commit adds a new test case to the
test_zebra_seg6local_route topotest. The new test case performs two
operations:
* try to install a seg6local route with an End.DT46 action
* verify that the route is created correctly