topotests: label per nexthop test adds add a while loop for mpls table
The bgp_vpnv4_per_nexthop_label tests only check to see if the mpls labels
are installed one time. Test runs show that all but one label is installed.
More than likely the test has asked for data while zebra is still installing
it. the mpls_label_check functions must check this result multiple times as
that system may be under heavy load.
A loop is introduced in order to let zebra check the mpls table.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
topotests: structural issues in bgp_local_as_dotplus_private_remove
This test has several issues:
A) The convergence function is spamming the show neighbor command until success,
if the neighbor never comes up the test will never finish. This adds unnecessary
load to an already loaded test system. Use run_and_expect to properly wait for
the neighbor relationship to come up.
B) The convergence function should not sleep for 1 second *After* the neighbor
is established
C) The _bgp_as_path() function fails if the prefix has not been received yet.
This looking for the prefix data should be within a run_and_expect() functionality.
Else a loaded test system will fail in this function because while we may be in
an established state, prefixes might not yet have been exchanged and there is no
point in failing the test without giving the system some time to actually converge.
Fix those points, similarly to what has been fixed in
bgp_local_as_private_remove test.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
bgpd: fix static analysis issue in subgroup_announce_check()
Remove the check about pi->peer value different from null.
Introducing this check introduces a SA warning on the value
of the from value (derived from pi->peer).
Actually, peer is set when bgp_path_info_make() call is
performed; peer is never null.
Fixes: 23bb4a9b5c64 ("bgpd: advertise mpls vpn routes with appropriate label") Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
bgpd: fix accept-own routes received by a route reflector
When using the bgp-accept-own community, with the
'attribute-unchanged next-hop' command, the advertised
mpls vpn updates that are reflected by a route reflector
are received, but are not selected.
Once the accept-own community is detected, a new bgp_path
is created, in addition of the original one; then the
next-hop of the NLRI is checked, but fails for two reasons:
- the next-hop tracking returns the real IP reachability
status for prefixes that have the BGP_ROUTE_IMPORTED subtype.
This is what happens with bgp updates with the accept-own
community.
- as the next-hop was unchanged and was the peer IP in the VRF.
Consequently, the new bgp_path is considered inactive in the
default VRF, and is not selected.
The incoming bgp updates with the accept-own community should
not be checked against the next-hop tracking. As the bgp_path
subtype has been changed to BGP_ROUTE_IMPORTED, let us check
the bgp subtype before calling the 'bgp_find_or_add_nexthop()'
function in the 'bgp_update()' call.
Fixes: 46dbf9d0c0b9 bgpd: ("Implement ACCEPT_OWN extended community") Fixes: 376797711f4d - bgpd: track mpls vpn nexthops Fixes: e6110f755718 bgpd: ("fix use nexthop tracking for exported vpn paths") Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
bgpd: fix use nexthop tracking for exported vpn paths
When exporting redistributed prefixes from a given VRF
to an MPLS VPN network, the paths are always considered
as valid whereas it should not always be the case.
At exportation, a new MPLS VPN path is built in. Then
nexthop tracking is applied to the new path, and the
SAFI_MPLS_VPN parameter is used to tell the NHT code
to just check for the next-hop reachability. The previous
commit was wrongly considering that nexthop tracking was
never applied to mpls vpn networks. Ensure that nexthop
tracking for exported paths behaves as usual.
Fix this by not returning always 1 in the 'bgp_find_or_add_nexthop()'
function if the passed 'pi' parameter is a 'BGP_IMPORTED_ROUTE'
sub-type entry.
Philippe Guibert [Fri, 21 Apr 2023 10:25:30 +0000 (12:25 +0200)]
topotests: mpls vpn routes redistribution, add asbr test
This setup demonstrates the redistribution and the proper
switching operations in an asbr device.
The setup interconnects an internal AS with an external
connected AS.
- the iBGP AS uses BGP-LU as MPLS transport
- the eBGP peering is directly connected and does use the
'mpls bgp forwarding' configuration to accept exterior
updates.
The setup performs the following tests:
- it checks for end to end connectivity from one interior
host h1 to two external hosts h2, and h3.
- it checks that the proper label values are advertised
by the ASBR to the iBGP peer, and the eBGP peer.
- it checks that the 'show mpls table' has additional
MPLS entries that permit transit mpls traffic to transit
across the ASBR. That behaviour is possible with the
'mpls bgp allocate-label-on-nexthop-change' command.
- it checks that withdraw of routes will remve the MPLS
entries.
- it checks that by unconfiguring the 'next-hop-self' option,
the external routes advertised to the internal maintain the
next-hop.
- it checks that a second prefix advertised by r3 with the
same RD, but different label value is using a new label on r2,
and that this new label value is used.
- it checks that when filtering out prefixes from r1, on r2,
then the MPLS label is deallocated, and the MPLS entry is not
present.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
There is no 'show command' to use for troubleshooting
purposes.
Add a new show command to dump the cache entry of the
MPLS VPN nexthop label bind cache table.
> show bgp [vrf NAME] mplsvpn-nh-label-bind [detail]
The below command illustrates its output:
> dut# show bgp mplsvpn-nh-label-bind detail
> Current BGP mpls-vpn nexthop label bind cache, VRF default
> 192.168.1.3, label 102, local label 18 #paths 3
> interface r2-eth1
> Last update: Mon May 22 14:39:42 2023
> Paths:
> 1/3 172.31.3.0/24 VRF default flags 0x418
> 1/3 172.31.2.0/24 VRF default flags 0x418
> 1/3 172.31.1.0/24 VRF default flags 0x418
> 192.0.2.1, label 101, local label 19 #paths 1
> interface r2-eth0
> Last update: Mon May 22 14:39:43 2023
> Paths:
> 1/3 172.31.0.0/24 VRF default flags 0x418
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com> Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Thu, 11 May 2023 13:42:08 +0000 (15:42 +0200)]
bgpd: update the mpls entry to handle return traffic
When advertising an mpls vpn entry with a new label,
the return traffic is redirected to the local machine,
but the MPLS traffic is dropped.
Add an MPLS entry to handle MPLS packets which have
the new label value. Traffic is swapped to the original
label value from the mpls vpn next-hop entry; then it is
sent to the resolved next-hop of the original next-hop
from the mpls vpn next-hop entry.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Thu, 11 May 2023 13:26:50 +0000 (15:26 +0200)]
bgpd: advertise mpls vpn routes with appropriate label
The advertised label value from mpls vpn routes is not modified
when the advertised next-hop is modified to next-hop-self.
Actually, the original label value received is redistributed as
is, whereas the new_label value bound in the nexthop label
bind entry should be used.
Only the VPN entries that contain MPLS information, and that
are redistributed between distinct peers, will have a label
value to advertise.
- no SRv6 attribute
- no local prefix
- no exported VPN prefixes from a VRF
If the advertisement to a given peer has the next-hop modified,
then the new label value will be picked up. The considered cases
are peers configured with 'next-hop-self' option, or ebgp peerings
without the 'next-hop-unchanged' option.
Note that the the NLRI format will follow the rfc3107 format, as
multiple label values for MPLS VPN NLRIs are not supported (the
rfc8277 is not supported).
Note also that the case where an outgoing route-map is applied to
the outgoing neighbor is not considered in this commit.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
An mplsvpn next-hop label binding entry is created in an mpls
vpn nexthop label bind hash table of the current BGP instance.
That mpls vpn next-hop label entry is indexed by the (next-hop,
orig_label) values provided by the incoming updates, and shared
with other updates having the same (next-hop, orig_label) values.
A new 'LP_TYPE_BGP_L3VPN_BIND' label value is picked up from the
zebra mpls label pool, and assigned to the new_label attribute.
The 'bgp_path_info' appends a 'bgp_mplsvpn_nh_label_bind' structure
to the 'mplsvpn' union structure. Both structures in the union are not
used at the same, as the paths are either VRF updates to export, or MPLS
VPN updates. Using an union gives a 24 bytes memory gain compared to if
the structures had not been in an union (24 bytes compared to 48 bytes).
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Thu, 25 May 2023 10:40:45 +0000 (12:40 +0200)]
bgpd: move label allocation code to a specific function
The label allocation mechanism is called implicitly for
labeled unicast paths. The check should be explicit, because
the current patch set will extend the mechanism for mpls vpn
paths, and the code should explicitly tell which safi calls
which code.
Fix this implicit call by checking the safi value. Move the
code to a specific function.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Wed, 24 May 2023 15:26:13 +0000 (17:26 +0200)]
bgpd: move the label per nexthop structs of bgp_path to an union
The label per nexthop attributes take 24 bytes per bgp path
entry on AMD64 platform, and are only used for unicast paths.
The current patch-set introduces a similar attributes, but that
will be used only for l3vpn paths. To gain some memory on the
bgp_path_info structure in the next commit, do some changes.
Create an 'mplsvpn' union structure that will either include the
label per nexthop structs for ipv4 paths, or the l3vpn paths
structures. The 'label_nexthop_cache' and the 'label_nh_thread'
attributes of the 'bgp_path_info' structure are moved into an
union under a new structure called 'bgp_mplsvpn_label_nh_blnc'.
The flags attribute of 'bgp_path_info' is increased from 16 bits
to 32 bits, and the BGP_PATH_MPLSVPN_LABEL_NH flag is added to
know the 'mplsvpn' usage.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In the context of the ASBR facing an EBGP neighbor, or
facing an IBGP neighbor where the BGP updates received
are re-advertised with a modified next-hop, a new local
label will be re-advertised too, to replace the original
one.
Create a binding table, in the form of a hash list, from the
original labels to the new labels. Since labels can be the
same on several routers, set the next-hop and the label as
the keys. Add the needed API functions to manage the hash
list.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com> Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Louis Scalbert [Tue, 2 May 2023 14:30:20 +0000 (16:30 +0200)]
bgpd: add LP_TYPE_BGP_L3VPN_BIND label type
Redistributing mpls vpn prefixes with a new label
requires picking up unused MPLS local labels.
Today, there is an MPLS label pool which is owned
by the zebra daemon. BGP gets a chunk of labels
that are shared with the multiple usage of the BGP
daemon. A label type attribute is used whenever
BGP needs a new label value.
The 'LP_TYPE_BGP_L3VPN_BIND' label type will be used
to allocate the MPLS labels that will be bound to
the original next-hop, and label value of an L3VPN
update.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com> Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Fri, 21 Apr 2023 12:28:28 +0000 (14:28 +0200)]
bgpd: track mpls vpn nexthops
There is no nexthop reachability information for
received MPLS VPN prefixes.
This information is necessary when BGP also acts
as LSR device, and is needed to create an MPLS entry
between two BGP speakers: the next-hop to pick-up
in the MPLS entry has to be connected.
The nexthop reachability information is available
for other non MPLS VPN prefixes, and is handled
by the bgp nexthop cache (bnc) contexts.
Extend the usage of the BNC contexts for L3VPN
prefixes.
Note that the MPLS VPN routes had to be redistributed
as before, to avoid breaking existing deployments
that use FRR as route reflectors. Because of this, the
nexthop reachability status has been maintained to OK
for MPLS VPN prefixes.
Note also that the label allocation per nexthop tracking
was wrongly using the MPLS VPN safi to get a valid BNC
context, when choosing which label to return in the
'vpn_leak_from_vrf_get_per_nexthop_label()' function.
Fix this by using SAFI_UNICAST instead.
Fixes: 577be36a41be ("bgpd: add support for l3vpn per-nexthop label") Link: https://github.com/FRRouting/frr/pull/13380 Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Wed, 24 May 2023 11:50:37 +0000 (13:50 +0200)]
bgpd: fix label allocation per next-hop applied to unicast
The label allocation per next-hop functionality is calling
the 'bgp_find_or_add_nexthop()' method using the SAFI_MPLS_VPN
safi parameter, whereas the call is supposed to apply to
unicast paths.
Fix this by using the SAFI_UNICAST safi parameter in the call.
Simplify the vpn_leak_from_vrf_get_per_nexthop_label() API by
removing the safi parameter from the function.
Fixes: 577be36a41be ("bgpd: add support for l3vpn per-nexthop label") Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When acting as intermediate device for BGP signaling, and
as transit device for data traffic, the device is not able
to modify the label value from incoming MPLS VPN updates:
- as BGP device, modifying the label value is necessary
when redistributing VPN prefixes with its own next-hop.
- as transit device that connects two ethernet segments
on separate interfaces, the return MPLS traffic must be
handled: the modified label value must be swapped with
the original label value and sent back to the original
next-hop.
The border router use case can be taken as example, when
it acts both as transit and as BGP device:
- When receiving updates from a border router peer, and where
interior traffic is expected to transit through the local
border router.
- When receiving updates from interior devices, and where
exterior traffic will transit through the local border router.
In those two situations, a new label is bound to the received
entry, and the entry is advertised to a new peer with the new
label. In the same time, an MPLS entry is created to handle
return traffic with the new mpls label: the traffic would be
swapped to the original MPLS label and the original next-hop.
This is the first commit of a series of patches, that address
the above mentioned issue.
The first commit introduces a new per-interface command:
This command will authorise mpls vpn updates to have a new
label value bound to the mpls vpn routes received over that
interface.
Link: https://www.rfc-editor.org/rfc/rfc3107.html#section-3
> When a BGP speaker redistributes a route, the label(s) assigned to
> that route must not be changed (except by omission), unless the
> speaker changes the value of the Next Hop attribute of the route.
Donald Sharp [Wed, 14 Jun 2023 16:25:18 +0000 (12:25 -0400)]
bgpd: some safi's do not mix with bgp suppress-fib
BGP cannot decide to disseminate the safi based upon the
bgp suppress-fib command. Modify the code to look at
the safi for the decision to communicate to a peer the
particular node.
Ticket: #3402926 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Donatas Abraitis [Tue, 13 Jun 2023 13:08:24 +0000 (16:08 +0300)]
bgpd: Use enum bgp_create_error_code as argument in header
```
bgpd/bgp_vty.c:865:5: warning: conflicting types for ‘bgp_vty_return’ due to enum/integer mismatch; have ‘int(struct vty *, enum bgp_create_error_code)’ [-Wenum-int-mismatch]
865 | int bgp_vty_return(struct vty *vty, enum bgp_create_error_code ret)
| ^~~~~~~~~~~~~~
In file included from ./bgpd/bgp_mplsvpn.h:15,
from bgpd/bgp_vty.c:48:
./bgpd/bgp_vty.h:148:12: note: previous declaration of ‘bgp_vty_return’ with type ‘int(struct vty *, int)’
148 | extern int bgp_vty_return(struct vty *vty, int ret);
| ^~~~~~~~~~~~~~
```
Donatas Abraitis [Tue, 13 Jun 2023 13:01:40 +0000 (16:01 +0300)]
bgpd: Use enum bgp_fsm_state_progress for bgp_stop()
```
bgpd/bgp_fsm.c:1360:29: warning: conflicting types for ‘bgp_stop’ due to enum/integer mismatch; have ‘enum bgp_fsm_state_progress(struct peer *)’ [-Wenum-int-mismatch]
1360 | enum bgp_fsm_state_progress bgp_stop(struct peer *peer)
| ^~~~~~~~
In file included from bgpd/bgp_fsm.c:29:
./bgpd/bgp_fsm.h:111:12: note: previous declaration of ‘bgp_stop’ with type ‘int(struct peer *)’
111 | extern int bgp_stop(struct peer *peer);
| ^~~~~~~~
```
Christian Hopps [Fri, 9 Jun 2023 20:54:54 +0000 (16:54 -0400)]
lib: mgmtd: improvements in logging and commentary
- log names of datastores not numbers
- improve logging for mgmt_msg_read
- Rather than use a bool, instead store the pending const string name of
the command being run that has postponed the CLI. This adds some nice
information to the logging when enabled.
pimd,pim6d: Correct the socket to send reg-stop msg
We were using the pim interface socket to send the register
stop msg, it works fine in cases where the interface on which
register msg is received and the interface on which the register-stop
msg is supposed to be sent is the same.
But when the interfaces are different, msg send fails because
the outgoing interface is not right.
Mark Stapp [Tue, 23 May 2023 19:31:31 +0000 (15:31 -0400)]
pbrd, zebra: fix zapi and netlink rule encoding
In pbrd, don't encode a rule without a table. There are cases
where the zapi encoding was incorrect because the 4-octet
table id was missing. In zebra, mask off the ECN bits in the
TOS byte when encoding an iprule to match netlink's
expectation.
Donald Sharp [Wed, 7 Jun 2023 14:09:27 +0000 (10:09 -0400)]
bgpd: Add some color to why nexthop_set failed
We are seeing some frequent test failures with
setting the nexthop correctly. At this point
in time, I have no idea what is going wrong,
but I don't have a bunch of information either,
so let's add the local and remote values.
Christian Hopps [Mon, 12 Jun 2023 02:13:48 +0000 (22:13 -0400)]
lib: mgmtd: session create and destroy both short-circuit
For creation this is the first thing done so short-circuit just means inline
sync response. However, for destroy there could be commands in-flight, these
will be discarded when they match no session, and the state cleaned up
immediately when the message short-circuits.
Christian Hopps [Sun, 11 Jun 2023 21:53:10 +0000 (17:53 -0400)]
vtysh: stop reading config file if user `exit`s from root level.
This is required to make sure that we properly send the
XFRR_end_configuration tag to the daemons. Previously if the user had an
`exit` at the root level the parser would just drop out of the config
node and so XFRR_end_configuration, even if sent, would be ignored
Abhishek N R [Mon, 12 Jun 2023 04:54:24 +0000 (21:54 -0700)]
pim6d: Correcting the help string
Max response time in the code is being used as decisecond but the user input is taken in millisecond.
Also yang expects the field to be in decisecond.
The below condition in yang is failing due to the mismatch in units.
```
units deciseconds;
must ". <= ../query-interval * 10";
```
Donald Sharp [Thu, 8 Jun 2023 16:03:49 +0000 (12:03 -0400)]
zebra: Prevent crash because nl is NULL on shutdown
When shutting down the main pthread was first closing
the sockets associated with the dplane pthread and
then telling it to shutdown the pthread at a later point
in time. This caused the dplane to crash because the nl
data has been freed already. Change the shutdown order
to stop the dplane pthread *and* then close the sockets.
Christian Hopps [Thu, 8 Jun 2023 08:12:26 +0000 (04:12 -0400)]
tests: convert old pim test to more cleanly use pytest fixture
This is a good way to run a per-test background helper process. Here the
helper object is created before the test function requesting it (through param
name match), and then cleaned up after the test function exits (pass or failed).
A context manager is used to further guarantee the cleanup is done.
Christian Hopps [Thu, 8 Jun 2023 06:42:32 +0000 (02:42 -0400)]
tests: fixing pim6 topotest bugs
- Remove use of bespoke socat
- Use ipv6 support in mcast-tester.py
- do not run processes in the background behind munet/micronet's
back with `&` (ever) -- use popen or the helper class