Donatas Abraitis [Fri, 18 Nov 2022 13:47:50 +0000 (15:47 +0200)]
bgpd: Allow overriding MPLS VPN next-hops via route-maps
Just do not reset next-hop for MPLS VPN routes.
Example of 172.16.255.1/32 (using extended next-hop capability):
```
pe2# sh bgp ipv4 vpn
BGP table version is 4, local router ID is 10.10.10.20, vrf id 0
Default local pref 100, local AS 65001
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Ryoga Saito [Sat, 12 Nov 2022 08:45:19 +0000 (17:45 +0900)]
bgpd: fix invalid ipv4-vpn nexthop for IPv6 peer
Given that two routers are connected each other and they have IPv6
addresses and they establish BGP peer with extended-nexthop capability
and one router tries to advertise locally-generated IPv4-VPN routes to
other router.
In this situation, bgpd on the router that tries to advertise IPv4-VPN
routes will be crashed with "invalid MP nexthop length (AFI IP6)".
This issue is happened because MP_REACH_NLRI path attribute is not
generated correctly when ipv4-vpn routes are advertised to IPv6 peer.
When IPv4 routes are leaked from VRF RIB, the nexthop of these routes
are also IPv4 address (0.0.0.0/0 or specific addresses). However,
bgp_packet_mpattr_start only covers the case of IPv6 nexthop (for IPv6
peer).
ipv4-unicast routes were not affected by this issue because the case of
IPv4 nexthop is covered in `else` block.
bgpd: authorise to select bgp self peer prefix on rr case
This commit addresses an issue that happens when using bgp
peering with a rr client, with a received prefix which is the
local ip address of the bgp session.
When using bgp ipv4 unicast session, the local prefix is
received by a peer, and finds out that the proposed prefix
and its next-hop are the same. To avoid a route loop locally,
no nexthop entry is referenced for that prefix, and the route
will not be selected.
When the received peer is a route reflector, the prefix has
to be selected, even if the route can not be installed locally.
Fixes: ("fb8ae704615c") bgpd: prevent routes loop through itself Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Donald Sharp [Mon, 14 Nov 2022 13:28:45 +0000 (08:28 -0500)]
zebra: Fix dplane_fpm_nl to allow for fast configuration
If you have this order in your configuration file:
no fpm use-next-hop-groups
fpm address 127.0.0.1
the dplane code was using the same event thread t_event and the second
add event in the code was going, you already have an event scheduled
and as such the second event does not overwrite it. Leaving
no code to actually start the whole processing. There are probably
other cli iterations that will cause this fun as well, but I'm
not going to spend the time sussing them out at the moment.
Fixes: #12314 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Sarita Patra [Fri, 11 Nov 2022 06:59:58 +0000 (22:59 -0800)]
pimd, pim6d: Update upstream rpf disable/enable pim on interface
Problem:
When "no ip pim" is executed on source connected interface, its
ifp->info is set to NULL. But KAT on this interface is still
running, it wrongly dereferences NULL. This leads to crash.
Root Cause:
pim upstream IIF is still pointing towards the source connected
interface which is not pim enabled and Mroute is still present in
the kernel.
Fix:
When “no ip pim” command gets executed on source connected interface,
then loop through all the pnc->nexthop, if any new nexthop found,
then update the upstream IIF accordindly, if not found then update
the upstream IIF as Unknown and uninstall the mroute from kernel.
When “ip pim” command gets executed on source connected interface,
then also loop through all the pnc->nexthop and update the upstream IIF,
install the mroute in kernel.
https://github.com/FRRouting/frr/pull/11465 enabled account verification,
but the pam config declares rootok as sufficient in authentication only
and not in account verification, what causes warning in the log:
Add the documentation for the `behavior usid` command to zebra.
When the `behavior usid` command is set, a flag is added to the locator
to indicate that the locator is a uSID locator. When a locator is
specified as a uSID locator, the bgpd will install SRv6 behaviors with
the uSID in the dataplane and use the SRv6 uSID codepoints in the BGP
update message.
This test ensures that the command `behavior usid` works properly.
When the `behavior usid` command is set, a flag is added to the locator
to indicate that the locator is a uSID locator. This test verifies that
the locator works correctly when you set / unset the `behavior usid`
command.
Install a new command `behavior usid` into the `SRV6_LOC_NODE` CLI node.
This command allows the user to set/unset the `SRV6_LOCATOR_USID` flag
for an SRv6 locator. The `SRV6_LOCATOR_USID` flag indicates whether a
locator is a uSID locator or not. When the flag is set, the routing
daemons (e.g., bgpd) will install SRv6 behaviors with the uSID in the
dataplane.
In this commit, we add two helper functions
`zebra_notify_srv6_locator_add` and `zebra_notify_srv6_locator_delete`.
These functions are used to notify locator additions/deletions to
zclients.
bgpd: Use SRv6 codepoints in the BGP Advertisement
Currently bgpd uses the opaque codepoint (0xFFFF) in the BGP
advertisement. In this commit, we update bgpd to use the SRv6 codepoints
defined in the IANA SRv6 Endpoint Behaviors Registry
(https://www.iana.org/assignments/segment-routing/segment-routing.xhtml)
In this commit, we introduce a new enumeration to encode the SRv6
Endpoint Behaviors codepoints defined in the IANA SRv6 Endpoint
Behaviors Registry
(https://www.iana.org/assignments/segment-routing/segment-routing.xhtml).
Donald Sharp [Tue, 8 Nov 2022 19:38:02 +0000 (14:38 -0500)]
bgpd: rpki was decrementing the node lock one time too many
The code was this:
1) match = bgp_table_subtree_lookup(rrp->bgp->rib[rrp->afi][rrp->safi],
&rrp->prefix);
2) node = match;
while (node) {
if (bgp_dest_has_bgp_path_info_data(node)) {
revalidate_bgp_node(node, rrp->afi, rrp->safi);
}
3) node = bgp_route_next_until(node, match);
}
if (match)
4) bgp_dest_unlock_node(match);
At 1) match was locked and became +1
At 2) match and node are now equal
At 3) On first iteration, match is decremented( as that node points
at it ) and the next item is locked, if it is found, and returned which becomes node
If 3 is run again because node is non-null then, current node is decremented
and the next node found is incremented and returned which becomes node again.
So if we get to 4) match is unlocked again which is now a double unlock
which, frankly, is not good. In all code paths that I can see the
test for `if (match) ...` is not needed so let's just remove it.
zebra/netconf_netlink.c: In function 'netlink_netconf_change':
zebra/netconf_netlink.c:109:32: error: 'AF_MPLS' undeclared (first use in this function)
109 | if (ncm->ncm_family == AF_MPLS)
| ^~~~~~~
Donald Sharp [Tue, 8 Nov 2022 12:36:56 +0000 (07:36 -0500)]
bgpd: Make rpki soft_reconfig calling events
An end operator is showing cases with multiple bgp feeds
and a rpki table that calling the revalidation functions
is extremely expensive and they are seeing lots of thread
WARNS about timers being late and eventually the whole
thing gets unresponsive. Let's break up soft reconfiguration
in to a series of events per peer so that all the work
for this is not done at the same exact time.
ospf6d: Show if the interface is passive for `show ipv6 ospf6 interface`
donatas-pc# sh ipv6 ospf6 interface enp3s0
enp3s0 is up, type BROADCAST
Interface ID: 2
Internet Address:
inet : 192.168.10.17/24
inet6: fe80::ca5d:fd0d:cd8:1bb7/64
Instance ID 0, Interface MTU 1500 (autodetect: 1500)
MTU mismatch detection: enabled
Area ID 0.0.0.0, Cost 1000
State Waiting, Transmit Delay 1 sec, Priority 1
Timer intervals configured:
Hello 10(8.149), Dead 40, Retransmit 5
DR: 0.0.0.0 BDR: 0.0.0.0
Number of I/F scoped LSAs is 1
0 Pending LSAs for LSUpdate in Time 00:00:00 [thread off]
0 Pending LSAs for LSAck in Time 00:00:00 [thread off]
Authentication Trailer is disabled
donatas-pc# con
donatas-pc(config)# int enp3s0
donatas-pc(config-if)# ipv6 ospf6 passive
donatas-pc(config-if)# do sh ipv6 ospf6 interface enp3s0
enp3s0 is up, type BROADCAST
Interface ID: 2
Internet Address:
inet : 192.168.10.17/24
inet6: fe80::ca5d:fd0d:cd8:1bb7/64
Instance ID 0, Interface MTU 1500 (autodetect: 1500)
MTU mismatch detection: enabled
Area ID 0.0.0.0, Cost 1000
State Waiting, Transmit Delay 1 sec, Priority 1
Timer intervals configured:
No Hellos (Passive interface)
DR: 0.0.0.0 BDR: 0.0.0.0
Number of I/F scoped LSAs is 1
0 Pending LSAs for LSUpdate in Time 00:00:00 [thread off]
0 Pending LSAs for LSAck in Time 00:00:00 [thread off]
Authentication Trailer is disabled
donatas-pc(config-if)#
Seems that if using PCRE2, we need to escape outer `()` chars and `|`. Sounds
like a bug.
But this is only with some older PCRE2 versions. With >= 10.36, I wasn't able
to reproduce this, everything is fine and working as expected.
Adding _FRR_PCRE2_POSIX definition because pcre2posix.h does not have
include's guard.
zebra: Reuse netinet/if_ether.h to avoid redefinition of struct ethhdr
In file included from /usr/include/net/ethernet.h:10,
from ./lib/prefix.h:26,
from zebra/tc_netlink.c:32:
/usr/include/netinet/if_ether.h:115:8: error: redefinition of 'struct ethhdr'
115 | struct ethhdr {
| ^~~~~~
In file included from zebra/tc_netlink.c:28:
/usr/include/linux/if_ether.h:169:8: note: originally defined here
169 | struct ethhdr {
| ^~~~~~
Donald Sharp [Mon, 24 Oct 2022 13:25:54 +0000 (09:25 -0400)]
*: Add ability for daemons to notice resilience changes
This patch just introduces the callback mechanism for the
resilient nexthop changes so that upper level daemons
can take advantage of the change. This does nothing
at this point but just call some code.
Donald Sharp [Mon, 24 Oct 2022 18:08:35 +0000 (14:08 -0400)]
lib: When adding to front of list ensure we handle tail to
When inserting to the front of a list with listnode_add_head
if the list is empty, the tail will not be properly set and
subsuquent calls to insert/remove will cause the function
to crash.
Donald Sharp [Tue, 1 Nov 2022 12:00:14 +0000 (08:00 -0400)]
lib, zebra: Allow for zebra to recognize that a route has gotten desynced
FRR does not use the NLM_F_APPEND semantics ( in fact I would argue that
the NLM_F_APPEND semantics just introduce pain for all parties involved )
I would also argue that most people who use the kernel netlink api
have recognized that NLM_F_APPEND for a route is a recipe for disaster
that is well documented and as such it is not used as anything other
than a curiousity by operators.
Are 2 great examples of how confusing it is for anyone in user
space to know what the correct thing to do is. Given that
new fields can be added with no semantics to allow us to know
what has resulted in a change or not.
In an attempt to recognize this, let's note that FRR
believes it has gotten out of sync with the kernel.
Future commits will react to the desynchronized route
and request from the kernel a reload of that specific
route if possible.
IPv4 Unicast Summary (VRF default):
BGP router identifier 192.0.2.252, local AS number 65000 vrf-id 0
BGP table version 28
RIB entries 0, using 0 bytes of memory
Peers 1, using 724 KiB of memory
Displayed neighbors 1
Total number of neighbors 1
donatas-laptop#
```
Another end received:
```
%NOTIFICATION: received from neighbor 192.168.10.17 6/2 (Cease/Administrative Shutdown) "shutdown due to high round-trip-time (104ms > 5ms, hit 21 times)"
```