Igor Ryzhov [Wed, 9 Feb 2022 23:51:49 +0000 (02:51 +0300)]
tools: fix frr-reload context keywords
There are single-line commands inside `router bgp` that start with
`vnc ` or `bmp `. Those commands are currently treated as node-entering
commands. We need to specify such commands more precisely.
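The difference between a loose and a precise keyword match can be sketched as follows. This is a minimal illustration, not the actual table in frr-reload: the prefix lists, the helper name `enters_subcontext`, and the sample commands are assumptions chosen to show the idea.

```python
# Illustrative prefixes only; the real keyword table in frr-reload may differ.
BGP_SUBCONTEXT_PREFIXES = (
    "address-family ",
    "vnc defaults",
    "vnc nve-group ",
    "vnc l2-group ",
    "bmp targets ",
)

# The over-broad form: any line starting with `vnc ` or `bmp ` is
# treated as entering a node.
LOOSE_PREFIXES = ("address-family ", "vnc ", "bmp ")


def enters_subcontext(line, prefixes=BGP_SUBCONTEXT_PREFIXES):
    """Return True if a line inside `router bgp` opens a nested node."""
    return line.strip().startswith(prefixes)
```

With the loose list, a single-line command such as `bmp mirror buffer-limit 1024` is misclassified as node-entering; with the precise list only `bmp targets NAME` and similar lines match.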
Igor Ryzhov [Wed, 9 Feb 2022 22:43:37 +0000 (01:43 +0300)]
bgpd: remove bgp_attr_undup
bgp_attr_undup does the same thing as bgp_attr_flush – frees the
temporary data that might be allocated when applying a route-map. There
is no need to have two separate functions for that.
Igor Ryzhov [Wed, 9 Feb 2022 22:23:41 +0000 (01:23 +0300)]
bgpd: fix aspath memleak on error in vnc_direct_bgp_add_nve
bgp_attr_default_set creates a new empty aspath. If family error happens,
this aspath is not freed. Move attr initialization after we checked the
family.
Added a topotest to verify the combinations below.
Auth support for md5
Auth support for hmac-sha-256
Auth support with keychain for md5
Auth support with keychain for hmac-sha-256
Successfully ran all 4 test cases in my local setup.
Abhinay Ramesh [Tue, 8 Jun 2021 07:54:18 +0000 (07:54 +0000)]
ospf6d: Documentation for authentication trailer support.
Problem Statement:
=================
This commit adds documentation for the OSPF6 authentication
trailer feature, which implements RFC 7166.
RCA:
====
NA
Fix:
====
Add a detailed description of the supported feature.
This document captures the
Configuration CLI
Show commands
Debug commands
Clear command
that are added as part of the feature, with examples.
It supports the show commands below:
------------------------------------
frr# show ipv6 ospf6 interface ens192
ens192 is up, type BROADCAST
Interface ID: 5
Number of I/F scoped LSAs is 2
0 Pending LSAs for LSUpdate in Time 00:00:00 [thread off]
0 Pending LSAs for LSAck in Time 00:00:00 [thread off]
Authentication trailer is enabled with manual key ==> new info added
Packet drop Tx 0, Packet drop Rx 0 ==> drop counters
frr# show ipv6 ospf6 neighbor 2.2.2.2 detail
Neighbor 2.2.2.2%ens192
Area 1 via interface ens192 (ifindex 3)
0 Pending LSAs for LSUpdate in Time 00:00:00 [thread off]
0 Pending LSAs for LSAck in Time 00:00:00 [thread off]
Authentication header present ==> new info added
hello DBDesc LSReq LSUpd LSAck
Higher sequence no 0x0 0x0 0x0 0x0 0x0
Lower sequence no 0x242E 0x1DC4 0x1DC3 0x23CC 0x1DDA
frr# show ipv6 ospf6
OSPFv3 Routing Process (0) with Router-ID 2.2.2.2
Number of areas in this router is 1
Authentication Sequence number info ==> new info added
Higher sequence no 3, Lower sequence no 1656
Risk:
=====
Low risk
Tests Executed:
===============
Have executed the combination of commands.
Abhinay Ramesh [Sun, 30 May 2021 16:22:41 +0000 (16:22 +0000)]
ospf6d: Auth trailer CLI implementation.
Problem Statement:
==================
RFC 7166 support for OSPF6 in FRR code.
RCA:
====
This feature is newly supported in FRR
Fix:
====
Add two new CLIs to configure the ospf6 authentication trailer feature.
One CLI supports manual key configuration.
The other CLI configures the key using a keychain.
The CLIs below are implemented as part of this commit. This configuration
is applied at the interface level.
Without openssl:
ipv6 ospf6 authentication key-id (1-65535) hash-algo <md5|hmac-sha-256> key WORD
With openssl:
ipv6 ospf6 authentication key-id (1-65535) hash-algo <md5|hmac-sha-256|hmac-sha-1|hmac-sha-384|hmac-sha-512> key WORD
With keychain support:
ipv6 ospf6 authentication keychain KEYCHAIN_NAME
Running config for these commands:
frr# show running-config
Building configuration...
Abhinay Ramesh [Tue, 11 May 2021 12:50:05 +0000 (12:50 +0000)]
ospf6d: support keychain for ospf6 authentication
Problem Statement:
==================
As of now there is no support for ospf6 authentication.
To support ospf6 authentication, keychain support is needed for
managing the auth key.
RCA:
====
New support
Fix:
====
Enabling keychain for ospf6 authentication feature.
Risk:
=====
Low risk
Tests Executed:
===============
Have verified the support for ospf6 auth trailer feature.
Donald Sharp [Wed, 2 Feb 2022 18:28:42 +0000 (13:28 -0500)]
zebra: Make netlink buffer reads resizeable when needed
Currently when the kernel sends netlink messages to FRR
the buffers to receive this data are of fixed length.
The kernel, with certain configurations, will send
netlink messages that are larger than this fixed length.
This leads to situations where, on startup, zebra gets
really confused about the state of the kernel. Effectively
the current algorithm is this:
    read up to buffer in size
    while (data to parse)
        get netlink message header, look at size
        parse if you can
The problem is that we read into a 32k buffer.
We get the first message that is, say, 1k in size, and
subtract that 1k, leaving 31k to parse. We then
get the next header and notice that the length
of the message is 33k, which is obviously larger
than what we read in. FRR has no recovery mechanism,
nor is there a way to know, a priori, the maximum
size the kernel will send us.
Modify FRR to look at the kernel message and see if the
buffer is large enough, if not, make it large enough to
read in the message.
This code has to be per netlink socket because of the usage
of pthreads. So add the buffer and current buffer length to
`struct nlsock`, growing the buffer as necessary.
Fixes: #10404
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
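The grow-on-demand idea can be sketched in a few lines. This is only an illustration under stated assumptions: `NlSock`, `ensure_capacity` and `next_message_len` are hypothetical names, not zebra's code, and the sketch only assumes the standard 16-byte `struct nlmsghdr` layout whose first field is the 32-bit message length.

```python
import struct

# struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid
NLMSG_HDRLEN = 16


class NlSock:
    """Hypothetical stand-in for `struct nlsock`: one buffer per socket."""

    def __init__(self, buflen=32 * 1024):
        self.buf = bytearray(buflen)
        self.buflen = buflen

    def ensure_capacity(self, needed):
        """Grow the buffer, keeping its contents, if `needed` does not fit."""
        if needed > self.buflen:
            self.buf.extend(bytes(needed - self.buflen))
            self.buflen = needed


def next_message_len(data, offset=0):
    """Read nlmsg_len from the header at `offset` (native byte order)."""
    (length,) = struct.unpack_from("=I", data, offset)
    return length
```

Because the buffer lives on the per-socket object, two threads reading from different netlink sockets never share or resize the same buffer.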
Donald Sharp [Wed, 2 Feb 2022 18:21:52 +0000 (13:21 -0500)]
zebra: Remove `struct nlsock` from dataplane information and use `int fd`
Store the fd that corresponds to the appropriate `struct nlsock` and pass
that around in the dplane context instead of the pointer to the nlsock.
Modify the kernel_netlink.c code to store in a hash the `struct nlsock`
with the socket fd as the key.
Why do this? The dataplane context is used to pass around the `struct nlsock`
but the zebra code has a bug where the receive buffer for netlink
messages from the kernel is not big enough. So we need to dynamically
grow the receive buffer per socket, instead of having a non-dynamic buffer
that we read into. By passing around the fd we can look up the `struct nlsock`
that will soon have the associated buffer and not have to worry about `const`
issues that will arise.
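A rough sketch of the fd-keyed lookup described above, with a plain dict standing in for the hash. The names `NlSockEntry`, `nlsock_register` and `nlsock_lookup` are hypothetical and are not the functions in kernel_netlink.c.

```python
class NlSockEntry:
    """Hypothetical stand-in for `struct nlsock`."""

    def __init__(self, fd, name):
        self.fd = fd
        self.name = name
        self.buf = bytearray(32 * 1024)  # per-socket receive buffer


# Hash of sockets, keyed by the socket fd.
_nlsock_by_fd = {}


def nlsock_register(nl):
    """Remember a socket so it can be found again from its fd alone."""
    _nlsock_by_fd[nl.fd] = nl


def nlsock_lookup(fd):
    """Return the socket registered for `fd`, or None if unknown."""
    return _nlsock_by_fd.get(fd)
```

The dataplane context then only needs to carry the integer fd; whoever needs the mutable buffer looks the socket up at the point of use.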
Donald Sharp [Tue, 8 Feb 2022 14:47:24 +0000 (09:47 -0500)]
zebra: Store the sequence number to use as part of the dp_info
Store and use the sequence number instead of using what is in
the `struct nlsock`. Future commits are moving away from storing
the `struct nlsock`, and the copy of the nlsock was what guaranteed
unique sequence numbers per message. So let's store the
sequence number to use instead.
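As a sketch of the idea, the per-message information can carry its own sequence number, taken once when the context is created. `DpInfo` and `SeqSource` below are hypothetical names for illustration and do not mirror zebra's actual structures.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class DpInfo:
    """Hypothetical per-context info: which socket, and which seq to use."""
    fd: int
    seq: int


class SeqSource:
    """Hands out one fresh sequence number per created context."""

    def __init__(self, fd, start=1):
        self.fd = fd
        self._seq = count(start)

    def new_dp_info(self):
        # The number is fixed at creation time, so later messages on the
        # same socket cannot change what this context will send.
        return DpInfo(fd=self.fd, seq=next(self._seq))
```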
Juraj Vijtiuk [Wed, 13 Oct 2021 16:32:53 +0000 (18:32 +0200)]
isisd: fix router capability TLV parsing issues
isis_tlvs.c would fail at multiple places if incorrect TLVs were
received, causing stream assertion violations.
This patch fixes the issues by adding missing length checks, missing
consumed length updates and handling malformed Segment Routing subTLVs.
Signed-off-by: Juraj Vijtiuk <juraj.vijtiuk@sartura.hr>
Small adjustments by Igor Ryzhov:
- fix incorrect replacement of srgb by srlb on lines 3052 and 3054
- add length check for ISIS_SUBTLV_ALGORITHM
- fix conflict in fuzzing data during rebase
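The two kinds of check mentioned above (length checks and consumed-length updates) can be illustrated with a generic type/length/value walker. This is a sketch assuming one-byte type and length fields; `parse_subtlvs` is a hypothetical helper, not code from isis_tlvs.c.

```python
def parse_subtlvs(data):
    """Split `data` into (type, value) pairs; raise ValueError if malformed."""
    out = []
    pos = 0
    while pos < len(data):
        # Length check 1: a full 2-byte header must be present.
        if len(data) - pos < 2:
            raise ValueError("truncated sub-TLV header")
        stype, slen = data[pos], data[pos + 1]
        pos += 2
        # Length check 2: the advertised length must fit in what is left.
        if slen > len(data) - pos:
            raise ValueError("sub-TLV length exceeds remaining bytes")
        out.append((stype, bytes(data[pos:pos + slen])))
        # Consumed-length update: skip the value before the next header.
        pos += slen
    return out
```

Rejecting the input with an error, instead of reading past the end, is what keeps a malformed packet from tripping an assertion deeper in the parser.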
Donald Sharp [Thu, 24 Jun 2021 16:23:33 +0000 (12:23 -0400)]
zebra: Fix ships in the night issue
When using wait for install there exist situations where
zebra will issue several route change operations to the kernel
but end up in a state we shouldn't be in
due to extra data being received. Example:
a) zebra receives a route change from bgp, installs it and sends the
route to the kernel.
b) zebra receives a route deletion from bgp, removes the
struct route entry and then sends to the kernel a deletion.
c) zebra receives an asynchronous notification that (a) succeeded
but we treat this as a new route.
This is the ships in the night problem. In this case if we receive
notification from the kernel about a route that we know nothing
about and we are not in startup and we are doing asic offload
then we can ignore this update.
Ticket: #2563300
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
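The condition stated above reduces to a three-way check. The function and parameter names below are hypothetical; this sketches the decision only, not zebra's code.

```python
def should_ignore_kernel_route_update(route_known, in_startup, asic_offload):
    """Ignore a kernel notification only for an unknown route, outside of
    startup, while asic offload is in use."""
    return (not route_known) and (not in_startup) and asic_offload
```

In the example, step (c) arrives after the route entry was removed in step (b), so `route_known` is false and the late notification is dropped rather than treated as a new route.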
pimd: Querier to non-querier transition to be ignored in a case
As per RFC 2236 section 3, when the leave message is received at a querier,
it starts sending Query messages for "last Member Query Interval*query count".
During this time there should not be any querier to non-querier
transition and the same router needs to send the remaining queries.
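A small sketch of the timing rule, assuming times in seconds. The function names and the way the leave timestamp is tracked are assumptions for illustration and do not reflect pimd's implementation.

```python
def last_member_query_window(interval_s, query_count):
    """Duration during which the querier still owes group-specific queries."""
    return interval_s * query_count


def accept_querier_transition(now, leave_received_at, interval_s, query_count):
    """Return False while queries triggered by a leave are still pending."""
    if leave_received_at is None:
        return True  # no leave in progress, nothing to protect
    elapsed = now - leave_received_at
    return elapsed >= last_member_query_window(interval_s, query_count)
```

With a 1 second interval and a query count of 2, a transition attempted 0.5 seconds after the leave is ignored, while one attempted 2 seconds after it is accepted.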
qingkaishi [Fri, 4 Feb 2022 21:41:11 +0000 (16:41 -0500)]
babeld: fix #10502 #10503 by repairing the checks on length
This patch repairs the checking conditions on length in four functions:
babel_packet_examin, parse_hello_subtlv, parse_ihu_subtlv, and parse_update_subtlv
Donald Sharp [Wed, 10 Nov 2021 21:58:58 +0000 (16:58 -0500)]
zebra: Fix v6 route replace failure turned into success
Currently when we have a route replace operation for v6 routes
with a new nexthop group the order of kernel installation is this:
a) New nexthop group insertion seq 1
b) Route delete operation seq 3
c) Route insertion operation seq 2
Currently the code in nl_batch_read_resp is attempting
to handle this situation by skipping the delete operation.
*BUT* it is enqueuing the context into the zebra dplane
queue before we read the response. Since we create the ctx
with an implied success, success is being reported to the
upper level dplane and the zebra rib thinks the route has
been properly handled.
This is showing up in the zebra_seg6_route test code because
the test code is installing a seg6 route w/ sharpd and it
is failing to install because the route's nexthop is rejected:
a) nexthop installation seq 11
b) route delete seq 13
c) route add seq 12
Note the last line, we report the install as a success but it clearly failed from the seq=12 decode.
When we look at the v6 rib it thinks it is installed:
unet> r1 show ipv6 route
Codes: K - kernel route, C - connected, S - static, R - RIPng,
O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
v - VNC, V - VNC-Direct, A - Babel, D - SHARP, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
So let's modify nl_batch_read_resp to not dequeue/enqueue the context until we are sure we have
the right one. This fixes the test code to do the right thing on the second installation.
Donald Sharp [Wed, 10 Nov 2021 20:09:37 +0000 (15:09 -0500)]
zebra: set zd_is_update in 1 spot
The ctx->zd_is_update is being set in various
spots based upon the same value that we are
passing into dplane_ctx_ns_init. Let's just
consolidate all this into the dplane_ctx_ns_init
so that the zd_is_update value is set at the
same time that we increment the sequence numbers
to use.
As a note for future me's reading this: the sequence
number passed to the kernel is unique because each
context gets a copy of the appropriate nlsock to use.
Since it's a copy at a point in time, we know we have
a unique sequence number value.