Philippe Guibert [Thu, 25 May 2023 10:40:45 +0000 (12:40 +0200)]
bgpd: move label allocation code to a specific function
The label allocation mechanism is called implicitly for
labeled unicast paths. The check should be explicit, because
the current patch set will extend the mechanism for mpls vpn
paths, and the code should explicitly tell which safi calls
which code.
Fix this implicit call by checking the safi value. Move the
code to a specific function.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Wed, 24 May 2023 15:26:13 +0000 (17:26 +0200)]
bgpd: move the label per nexthop structs of bgp_path to an union
The label per nexthop attributes take 24 bytes per bgp path
entry on AMD64 platform, and are only used for unicast paths.
The current patch-set introduces a similar attributes, but that
will be used only for l3vpn paths. To gain some memory on the
bgp_path_info structure in the next commit, do some changes.
Create an 'mplsvpn' union structure that will either include the
label per nexthop structs for ipv4 paths, or the l3vpn paths
structures. The 'label_nexthop_cache' and the 'label_nh_thread'
attributes of the 'bgp_path_info' structure are moved into an
union under a new structure called 'bgp_mplsvpn_label_nh_blnc'.
The flags attribute of 'bgp_path_info' is increased from 16 bits
to 32 bits, and the BGP_PATH_MPLSVPN_LABEL_NH flag is added to
know the 'mplsvpn' usage.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In the context of the ASBR facing an EBGP neighbor, or
facing an IBGP neighbor where the BGP updates received
are re-advertised with a modified next-hop, a new local
label will be re-advertised too, to replace the original
one.
Create a binding table, in the form of a hash list, from the
original labels to the new labels. Since labels can be the
same on several routers, set the next-hop and the label as
the keys. Add the needed API functions to manage the hash
list.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com> Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Louis Scalbert [Tue, 2 May 2023 14:30:20 +0000 (16:30 +0200)]
bgpd: add LP_TYPE_BGP_L3VPN_BIND label type
Redistributing mpls vpn prefixes with a new label
requires picking up unused MPLS local labels.
Today, there is an MPLS label pool which is owned
by the zebra daemon. BGP gets a chunk of labels
that are shared with the multiple usage of the BGP
daemon. A label type attribute is used whenever
BGP needs a new label value.
The 'LP_TYPE_BGP_L3VPN_BIND' label type will be used
to allocate the MPLS labels that will be bound to
the original next-hop, and label value of an L3VPN
update.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com> Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Fri, 21 Apr 2023 12:28:28 +0000 (14:28 +0200)]
bgpd: track mpls vpn nexthops
There is no nexthop reachability information for
received MPLS VPN prefixes.
This information is necessary when BGP also acts
as LSR device, and is needed to create an MPLS entry
between two BGP speakers: the next-hop to pick-up
in the MPLS entry has to be connected.
The nexthop reachability information is available
for other non MPLS VPN prefixes, and is handled
by the bgp nexthop cache (bnc) contexts.
Extend the usage of the BNC contexts for L3VPN
prefixes.
Note that the MPLS VPN routes had to be redistributed
as before, to avoid breaking existing deployments
that use FRR as route reflectors. Because of this, the
nexthop reachability status has been maintained to OK
for MPLS VPN prefixes.
Note also that the label allocation per nexthop tracking
was wrongly using the MPLS VPN safi to get a valid BNC
context, when choosing which label to return in the
'vpn_leak_from_vrf_get_per_nexthop_label()' function.
Fix this by using SAFI_UNICAST instead.
Fixes: 577be36a41be ("bgpd: add support for l3vpn per-nexthop label") Link: https://github.com/FRRouting/frr/pull/13380 Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Philippe Guibert [Wed, 24 May 2023 11:50:37 +0000 (13:50 +0200)]
bgpd: fix label allocation per next-hop applied to unicast
The label allocation per next-hop functionality is calling
the 'bgp_find_or_add_nexthop()' method using the SAFI_MPLS_VPN
safi parameter, whereas the call is supposed to apply to
unicast paths.
Fix this by using the SAFI_UNICAST safi parameter in the call.
Simplify the vpn_leak_from_vrf_get_per_nexthop_label() API by
removing the safi parameter from the function.
Fixes: 577be36a41be ("bgpd: add support for l3vpn per-nexthop label") Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When acting as intermediate device for BGP signaling, and
as transit device for data traffic, the device is not able
to modify the label value from incoming MPLS VPN updates:
- as BGP device, modifying the label value is necessary
when redistributing VPN prefixes with its own next-hop.
- as transit device that connects two ethernet segments
on separate interfaces, the return MPLS traffic must be
handled: the modified label value must be swapped with
the original label value and sent back to the original
next-hop.
The border router use case can be taken as example, when
it acts both as transit and as BGP device:
- When receiving updates from a border router peer, and where
interior traffic is expected to transit through the local
border router.
- When receiving updates from interior devices, and where
exterior traffic will transit through the local border router.
In those two situations, a new label is bound to the received
entry, and the entry is advertised to a new peer with the new
label. In the same time, an MPLS entry is created to handle
return traffic with the new mpls label: the traffic would be
swapped to the original MPLS label and the original next-hop.
This is the first commit of a series of patches, that address
the above mentioned issue.
The first commit introduces a new per-interface command:
This command will authorise mpls vpn updates to have a new
label value bound to the mpls vpn routes received over that
interface.
Link: https://www.rfc-editor.org/rfc/rfc3107.html#section-3
> When a BGP speaker redistributes a route, the label(s) assigned to
> that route must not be changed (except by omission), unless the
> speaker changes the value of the Next Hop attribute of the route.
Donatas Abraitis [Tue, 13 Jun 2023 13:08:24 +0000 (16:08 +0300)]
bgpd: Use enum bgp_create_error_code as argument in header
```
bgpd/bgp_vty.c:865:5: warning: conflicting types for ‘bgp_vty_return’ due to enum/integer mismatch; have ‘int(struct vty *, enum bgp_create_error_code)’ [-Wenum-int-mismatch]
865 | int bgp_vty_return(struct vty *vty, enum bgp_create_error_code ret)
| ^~~~~~~~~~~~~~
In file included from ./bgpd/bgp_mplsvpn.h:15,
from bgpd/bgp_vty.c:48:
./bgpd/bgp_vty.h:148:12: note: previous declaration of ‘bgp_vty_return’ with type ‘int(struct vty *, int)’
148 | extern int bgp_vty_return(struct vty *vty, int ret);
| ^~~~~~~~~~~~~~
```
Donatas Abraitis [Tue, 13 Jun 2023 13:01:40 +0000 (16:01 +0300)]
bgpd: Use enum bgp_fsm_state_progress for bgp_stop()
```
bgpd/bgp_fsm.c:1360:29: warning: conflicting types for ‘bgp_stop’ due to enum/integer mismatch; have ‘enum bgp_fsm_state_progress(struct peer *)’ [-Wenum-int-mismatch]
1360 | enum bgp_fsm_state_progress bgp_stop(struct peer *peer)
| ^~~~~~~~
In file included from bgpd/bgp_fsm.c:29:
./bgpd/bgp_fsm.h:111:12: note: previous declaration of ‘bgp_stop’ with type ‘int(struct peer *)’
111 | extern int bgp_stop(struct peer *peer);
| ^~~~~~~~
```
Christian Hopps [Fri, 9 Jun 2023 20:54:54 +0000 (16:54 -0400)]
lib: mgmtd: improvements in logging and commentary
- log names of datastores not numbers
- improve logging for mgmt_msg_read
- Rather than use a bool, instead store the pending const string name of
the command being run that has postponed the CLI. This adds some nice
information to the logging when enabled.
pimd,pim6d: Correct the socket to send reg-stop msg
We were using the pim interface socket to send the register
stop msg, it works fine in cases where the interface on which
register msg is received and the interface on which the register-stop
msg is supposed to be sent is the same.
But when the interfaces are different, msg send fails because
the outgoing interface is not right.
Mark Stapp [Tue, 23 May 2023 19:31:31 +0000 (15:31 -0400)]
pbrd, zebra: fix zapi and netlink rule encoding
In pbrd, don't encode a rule without a table. There are cases
where the zapi encoding was incorrect because the 4-octet
table id was missing. In zebra, mask off the ECN bits in the
TOS byte when encoding an iprule to match netlink's
expectation.
Donald Sharp [Wed, 7 Jun 2023 14:09:27 +0000 (10:09 -0400)]
bgpd: Add some color to why nexthop_set failed
We are seeing some frequent test failures with
setting the nexthop correctly. At this point
in time, I have no idea what is going wrong,
but I don't have a bunch of information either,
so let's add the local and remote values.
Christian Hopps [Mon, 12 Jun 2023 02:13:48 +0000 (22:13 -0400)]
lib: mgmtd: session create and destroy both short-circuit
For creation this is the first thing done so short-circuit just means inline
sync response. However, for destroy there could be commands in-flight, these
will be discarded when they match no session, and the state cleaned up
immediately when the message short-circuits.
Christian Hopps [Sun, 11 Jun 2023 21:53:10 +0000 (17:53 -0400)]
vtysh: stop reading config file if user `exit`s from root level.
This is required to make sure that we properly send the
XFRR_end_configuration tag to the daemons. Previously if the user had an
`exit` at the root level the parser would just drop out of the config
node and so XFRR_end_configuration, even if sent, would be ignored
Donald Sharp [Thu, 8 Jun 2023 16:03:49 +0000 (12:03 -0400)]
zebra: Prevent crash because nl is NULL on shutdown
When shutting down the main pthread was first closing
the sockets associated with the dplane pthread and
then telling it to shutdown the pthread at a later point
in time. This caused the dplane to crash because the nl
data has been freed already. Change the shutdown order
to stop the dplane pthread *and* then close the sockets.
Christian Hopps [Thu, 8 Jun 2023 08:12:26 +0000 (04:12 -0400)]
tests: convert old pim test to more cleanly use pytest fixture
This is a good way to run a per-test background helper process. Here the
helper object is created before the test function requesting it (through param
name match), and then cleaned up after the test function exits (pass or failed).
A context manager is used to further guarantee the cleanup is done.
Christian Hopps [Thu, 8 Jun 2023 06:42:32 +0000 (02:42 -0400)]
tests: fixing pim6 topotest bugs
- Remove use of bespoke socat
- Use ipv6 support in mcast-tester.py
- do not run processes in the background behind munet/micronet's
back with `&` (ever) -- use popen or the helper class
pimd, pim6d: Move mld/igmp deletion code to a common api
Move the mld/igmp deletion common code to api pim_gm_interface_delete
code for IPv6 deletion(gm_ifp_teardown) for MLD was missing in this flow
Making the code common fixes this too.
Move pim_if_membership_clear api from pimd/pim_nb_config.c
to pimd/pim_iface.c
Also fixed curly braces warning
WARNING: braces {} are not necessary for single statement blocks
1773: FILE: /tmp/f1-127504/pim_iface.c:1773:
Christian Hopps [Tue, 6 Jun 2023 19:12:58 +0000 (15:12 -0400)]
mgmtd: assert an assertion for coverity
I believe coverity can't tell the length of the return value from strftime based
on the format string (like we can), so it allows `n` to be larger than it could
be which then allows `sz - n` to be negative which is size_t positive and very
large so it thinks an overrun is possible.
Chirag Shah [Tue, 6 Jun 2023 04:48:12 +0000 (21:48 -0700)]
tools: fix list value remove in frr-reload
There might be a time element(s) from
temporary list are removed more than once
which leads to valueError in certain python3
version.
commit-id 1543f58b5 did not handle valueError
properly. This caused regression where
prefix-list config leads to delete followed
by add.
The new fix should just pass the exception as
value removal from list_to_add or list_to_del
is best effort.
This allows prefix-list config has no change
then removes the lines from lines_to_del and
lines_to_add properly.
Configure prefix-list in frr.conf and perform
multiple frr-reload. After first reload operatoin
subsequent ones should not result in delete followed
by add of the prefix-list but rather no-op operation.