Igor Ryzhov [Wed, 9 Feb 2022 23:51:49 +0000 (02:51 +0300)]
tools: fix frr-reload context keywords
There are single-line commands inside `router bgp` that start with
`vnc ` or `bmp `. Those commands are currently treated as node-entering
commands. We need to specify such commands more precisely.
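A minimal sketch of the distinction, assuming the node-entering forms are `vnc defaults`, `vnc nve-group`, `vnc l2-group` and `bmp targets`. That list is an assumption for illustration, not copied from frr-reload, and the helper name is hypothetical. frr-reload itself is a Python tool; C is used here only to match the rest of the project.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed list of node-entering prefixes inside `router bgp`. */
static const char *const demo_ctx_prefixes[] = {
	"vnc defaults",
	"vnc nve-group ",
	"vnc l2-group ",
	"bmp targets ",
};

/* Hypothetical helper: true only if the line starts with one of the
 * precise prefixes above, not merely with "vnc " or "bmp ". */
static bool demo_enters_sub_context(const char *line)
{
	size_t count = sizeof(demo_ctx_prefixes) / sizeof(demo_ctx_prefixes[0]);

	for (size_t i = 0; i < count; i++) {
		const char *prefix = demo_ctx_prefixes[i];

		if (strncmp(line, prefix, strlen(prefix)) == 0)
			return true;
	}
	return false;
}
```

With this kind of matching, a single-line command that merely begins with `vnc ` or `bmp ` is no longer mistaken for the start of a sub-context.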
Igor Ryzhov [Wed, 9 Feb 2022 22:23:41 +0000 (01:23 +0300)]
bgpd: fix aspath memleak on error in vnc_direct_bgp_add_nve
bgp_attr_default_set creates a new empty aspath. If the family check fails,
this aspath is never freed. Move the attr initialization after the family
check.
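The general pattern, validate first and allocate afterwards, can be sketched as follows. All names here (`demo_attr`, `demo_add_nve`, the allocation counter) are hypothetical stand-ins, not the bgpd code.

```c
#include <stdlib.h>
#include <sys/socket.h>

/* Hypothetical stand-in for an attribute that owns heap memory. */
struct demo_attr {
	int *aspath;
};

static int demo_live_allocs; /* outstanding allocations, for checking leaks */

static void demo_attr_init(struct demo_attr *attr)
{
	attr->aspath = calloc(1, sizeof(*attr->aspath));
	if (attr->aspath)
		demo_live_allocs++;
}

static void demo_attr_free(struct demo_attr *attr)
{
	if (attr->aspath) {
		free(attr->aspath);
		demo_live_allocs--;
	}
	attr->aspath = NULL;
}

/* Returns 0 on success, -1 on an unsupported address family. */
static int demo_add_nve(int family)
{
	struct demo_attr attr;

	/* Check the family first: nothing has been allocated yet,
	 * so the early return cannot leak. */
	if (family != AF_INET && family != AF_INET6)
		return -1;

	demo_attr_init(&attr);
	/* ... the attribute would be used here ... */
	demo_attr_free(&attr);
	return 0;
}
```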
Juraj Vijtiuk [Wed, 13 Oct 2021 16:32:53 +0000 (18:32 +0200)]
isisd: fix router capability TLV parsing issues
isis_tlvs.c would fail at multiple places if incorrect TLVs were
received causing stream assertion violations.
This patch fixes the issues by adding missing length checks, missing
consumed length updates and handling malformed Segment Routing subTLVs.
Signed-off-by: Juraj Vijtiuk <juraj.vijtiuk@sartura.hr>
Small adjustments by Igor Ryzhov:
- fix incorrect replacement of srgb by srlb on lines 3052 and 3054
- add length check for ISIS_SUBTLV_ALGORITHM
- fix conflict in fuzzing data during rebase
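As an illustration of the two kinds of fix mentioned above (length checks and consumed-length updates), here is a small sketch of a bounds-checked sub-TLV reader. It uses a hypothetical cursor type rather than FRR's stream API and is not the isis_tlvs.c code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over a received buffer. */
struct demo_cursor {
	const uint8_t *data;
	size_t len; /* total bytes available */
	size_t pos; /* bytes consumed so far, always <= len */
};

/* Read one type/length/value sub-TLV header and skip its value.
 * Returns 0 on success, -1 if the buffer is too short. */
static int demo_read_subtlv(struct demo_cursor *c, uint8_t *type,
			    uint8_t *vlen)
{
	/* Two bytes are needed for the type and length fields. */
	if (c->len - c->pos < 2)
		return -1;
	*type = c->data[c->pos];
	*vlen = c->data[c->pos + 1];

	/* The declared value length must fit in what remains. */
	if ((size_t)*vlen > c->len - c->pos - 2)
		return -1;

	/* Account for everything consumed: header plus value. */
	c->pos += 2 + (size_t)*vlen;
	return 0;
}
```

On a malformed sub-TLV the cursor is left untouched, so the caller can reject the whole TLV instead of reading past the end of the buffer.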
Donald Sharp [Wed, 1 Dec 2021 22:03:38 +0000 (17:03 -0500)]
lib: Update hash.h documentation to warn of a possible crash
Multiple deletions from the hash_walk or hash_iterate calls
during a single invocation of the passed in function can and
will cause the program to crash. Warn against doing such a
thing.
Donald Sharp [Wed, 1 Dec 2021 21:28:42 +0000 (16:28 -0500)]
zebra: Ensure zebra_nhg_sweep_table accounts for double deletes
I'm seeing this crash in various forms:
Program terminated with signal SIGSEGV, Segmentation fault.
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f418efbc7c0 (LWP 3580253))]
(gdb) bt
(gdb) f 4
267 (*func)(hb, arg);
(gdb) p hb
$1 = (struct hash_bucket *) 0x558cdaafb250
(gdb) p *hb
$2 = {len = 0, next = 0x0, key = 0, data = 0x0}
(gdb)
I've also seen a crash where data is 0x03.
My suspicion is that hash_iterate is calling zebra_nhg_sweep_entry which
does delete the particular entry we are looking at as well as possibly other
entries when the ref count for those entries gets set to 0 as well.
Then we have this loop in hash_iterate.c:
for (i = 0; i < hash->size; i++)
    for (hb = hash->index[i]; hb; hb = hbnext) {
        /* get pointer to next hash bucket here, in case (*func)
         * decides to delete hb by calling hash_release
         */
        hbnext = hb->next;
        (*func)(hb, arg);
    }
Suppose in the previous loop hbnext is set to hb->next and we call
zebra_nhg_sweep_entry. This deletes the previous entry and also
happens to cause the hbnext entry to be deleted as well, because of nhg
refcounts. At this point in time the memory pointed to by hbnext is
not owned by the pthread anymore and we can end up in a state where
it's overwritten by another pthread in zebra with data for other incoming events.
What to do? Let's change the sweep function to a hash_walk and have
it stop iterating and start over whenever a double delete may have
happened.
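The restart idea can be sketched roughly as below. This is a toy model under stated assumptions: entries live in a plain array (so nothing is ever freed), the `demo_*` names and return codes are hypothetical, and only the control flow of "abort the walk and start over after a multi-entry release" is shown, not zebra's nexthop group code.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_WALK_CONTINUE 0
#define DEMO_WALK_ABORT 1

/* Hypothetical entry: releasing it may also release `depends`. */
struct demo_entry {
	bool alive;
	int refcnt;
	struct demo_entry *depends;
};

/* Drop one reference; returns how many entries were released. */
static int demo_unref(struct demo_entry *e)
{
	if (!e || !e->alive || --e->refcnt > 0)
		return 0;
	e->alive = false;
	return 1 + demo_unref(e->depends);
}

/* Callback: ask the walker to stop if more than one entry went away. */
static int demo_sweep_entry(struct demo_entry *e)
{
	return demo_unref(e) > 1 ? DEMO_WALK_ABORT : DEMO_WALK_CONTINUE;
}

/* One pass over the table; stops early if the callback asks for it. */
static int demo_walk(struct demo_entry *tbl, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].alive
		    && demo_sweep_entry(&tbl[i]) == DEMO_WALK_ABORT)
			return DEMO_WALK_ABORT;
	}
	return DEMO_WALK_CONTINUE;
}

/* Restart from the top until a full pass completes; returns the
 * number of passes that were needed. */
static int demo_sweep_table(struct demo_entry *tbl, size_t n)
{
	int passes = 1;

	while (demo_walk(tbl, n) == DEMO_WALK_ABORT)
		passes++;
	return passes;
}
```

Every aborted pass releases at least two entries, so the outer loop terminates. In a real hash table the point of aborting is that the saved next pointer is never dereferenced after it may have been freed.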
qingkaishi [Fri, 4 Feb 2022 21:41:11 +0000 (16:41 -0500)]
babeld: fix #10502 #10503 by repairing the checks on length
This patch repairs the checking conditions on length in four functions:
babel_packet_examin, parse_hello_subtlv, parse_ihu_subtlv, and parse_update_subtlv
When I run pimd. Looking at the code there are 3 places where pim_bsm.c removes the
NHT BSR tracking. In 2 of them the code ensures that the address is already setup
in 1 place it is not. Fix.
Donald Sharp [Wed, 10 Nov 2021 21:58:58 +0000 (16:58 -0500)]
zebra: Fix v6 route replace failure turned into success
Currently when we have a route replace operation for v6 routes
with a new nexthop group the order of kernel installation is this:
a) New nexthop group insertion seq 1
b) Route delete operation seq 3
c) Route insertion operation seq 2
Currently the code in nl_batch_read_resp is attempting
to handle this situation by skipping the delete operation.
*BUT* it is enqueuing the context into the zebra dplane
queue before we read the response. Since we create the ctx
with an implied success, success is being reported to the
upper level dplane and the zebra rib thinks the route has
been properly handled.
This is showing up in the zebra_seg6_route test code because
the test code is installing a seg6 route w/ sharpd and it
is failing to install because the route's nexthop is rejected:
a) nexthop installation seq 11
b) route delete seq 13
c) route add seq 12
Note the last line, we report the install as a success but it clearly failed from the seq=12 decode.
When we look at the v6 rib it thinks it is installed:
unet> r1 show ipv6 route
Codes: K - kernel route, C - connected, S - static, R - RIPng,
O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
v - VNC, V - VNC-Direct, A - Babel, D - SHARP, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
So let's modify nl_batch_read_resp to not dequeue/enqueue the context until we are sure we have
the right one. This fixes the test code to do the right thing on the second installation.
Donald Sharp [Wed, 10 Nov 2021 20:09:37 +0000 (15:09 -0500)]
zebra: set zd_is_update in 1 spot
The ctx->zd_is_update is being set in various
spots based upon the same value that we are
passing into dplane_ctx_ns_init. Let's just
consolidate all this into the dplane_ctx_ns_init
so that the zd_is_update value is set at the
same time that we increment the sequence numbers
to use.
As a note for future me's reading this. The sequence
number chosen for the seq number passed to the
kernel is that each context gets a copy of the
appropriate nlsock to use. Since it's a copy
at a point in time, we know we have a unique sequence
number value.
Donald Sharp [Wed, 10 Nov 2021 19:12:39 +0000 (14:12 -0500)]
zebra: When we get an implicit or ack or full failure mark status
When nl_batch_read_resp gets a full-on failure (-1) or an implicit
ack (0) from the kernel for a batch, immediately mark every context
in the batch as passed or failed as needed, instead of having them
marked elsewhere.
Stephen Worley [Thu, 27 Jan 2022 17:41:49 +0000 (12:41 -0500)]
pbrd: pbr route maps get addr family of nhgs
When adding a nhg to a route map, make sure to specify the `family`
of the rm by looking at the contents of the nhg. Installation in the
kernel (for DSCP rules in particular) relies on this being specified in
the netlink message.
Signed-off-by: Wesley Coakley <wcoakley@nvidia.com>
Signed-off-by: Stephen Worley <sworley@nvidia.com>
(cherry picked from commit 9a7ea213c072a24aa2059e04cb51502a3e956705)
Tomi Salminen [Wed, 2 Feb 2022 09:19:09 +0000 (11:19 +0200)]
ospfd: Core in ospf_if_down during shutdown.
Skip marking routes as changed in ospf_if_down if there's no
new_table present, which might be the case when the instance is
being finished.
The backtrace for the core was:
raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:50
core_handler (signo=11, siginfo=0x7fffffffe170, context=<optimized out>) at lib/sigevent.c:262
<signal handler called>
route_top (table=0x0) at lib/table.c:401
ospf_if_down (oi=oi@entry=0x555555999090) at ospfd/ospf_interface.c:849
ospf_if_free (oi=0x555555999090) at ospfd/ospf_interface.c:339
ospf_finish_final (ospf=0x55555599c830) at ospfd/ospfd.c:749
ospf_deferred_shutdown_finish (ospf=0x55555599c830) at ospfd/ospfd.c:578
ospf_deferred_shutdown_check (ospf=<optimized out>) at ospfd/ospfd.c:627
ospf_finish (ospf=<optimized out>) at ospfd/ospfd.c:683
ospf_terminate () at ospfd/ospfd.c:653
sigint () at ospfd/ospf_main.c:109
quagga_sigevent_process () at lib/sigevent.c:130
thread_fetch (m=m@entry=0x5555556e45e0, fetch=fetch@entry=0x7fffffffe9b0) at lib/thread.c:1709
frr_run (master=0x5555556e45e0) at lib/libfrr.c:1174
main (argc=9, argv=0x7fffffffecb8) at ospfd/ospf_main.c:254
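The guard described in this commit amounts to a NULL check before the table is walked. A minimal sketch with hypothetical types (not the ospfd structures):

```c
#include <stddef.h>

/* Hypothetical stand-ins for a routing table and its owning instance. */
struct demo_table {
	int nroutes;
	int nchanged;
};

struct demo_instance {
	struct demo_table *new_table; /* may be NULL during shutdown */
};

/* Mark every route as changed; returns how many were marked. */
static int demo_mark_routes_changed(struct demo_instance *inst)
{
	/* During shutdown the table may already be gone: nothing to do. */
	if (!inst->new_table)
		return 0;

	inst->new_table->nchanged = inst->new_table->nroutes;
	return inst->new_table->nchanged;
}
```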
Igor Ryzhov [Sun, 23 Jan 2022 17:22:42 +0000 (20:22 +0300)]
zebra: fix cleanup of meta queues on vrf disable
Current code treats all metaqueues as lists of route_node structures.
However, some queues contain other structures that need to be cleaned up
differently. Casting the elements of those queues to struct route_node
and dereferencing them leads to a crash. The crash may be seen when
executing bgp_multi_vrf_topo2.
Fix the code by using the proper list element types.
Donald Sharp [Mon, 31 Jan 2022 17:49:55 +0000 (12:49 -0500)]
ospfd: Convert output to host order from network order for route_tag
FRR stores the route_tag in network byte order. Bug filed indicates
that the `show ip ospf route` command shows the correct value.
Every place route_tag is dumped in ospf_vty.c the ntohl function
is used first.
Fixes: #10450
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
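A small sketch of the conversion, assuming a hypothetical record that keeps its tag in network byte order. Only `htonl`, `ntohl` and `snprintf` are real library calls here.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical route record; the tag is kept in network byte order. */
struct demo_route {
	uint32_t route_tag;
};

/* Format the tag for display, converting to host order first.
 * Returns what snprintf returns. */
static int demo_format_tag(const struct demo_route *r, char *buf, size_t len)
{
	return snprintf(buf, len, "%u", (unsigned int)ntohl(r->route_tag));
}
```

Printing `r->route_tag` directly would show a byte-swapped number on little-endian hosts, which is the kind of wrong output this commit addresses.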
Igor Ryzhov [Thu, 27 Jan 2022 18:05:40 +0000 (21:05 +0300)]
vrrpd: use ipaddr_is_zero when needed
Replace custom implementation or call to ipaddr_isset with a call to
ipaddr_is_zero.
ipaddr_isset is not fully correct, because it's fine to have some
non-zero bytes at the end of the struct in case of IPv4 and the function
doesn't allow that.
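To illustrate why trailing bytes matter, here is a sketch with a hypothetical address type whose 16-byte storage is shared by IPv4 and IPv6. Neither function is FRR's implementation. They only contrast a whole-storage comparison with a family-aware one.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum demo_ipa_type { DEMO_IPADDR_NONE, DEMO_IPADDR_V4, DEMO_IPADDR_V6 };

/* Hypothetical address: the 16-byte storage is shared by v4 and v6. */
struct demo_ipaddr {
	enum demo_ipa_type type;
	uint8_t bytes[16];
};

/* Naive check: compares the whole storage, so stale bytes after an
 * IPv4 address make a zero address look non-zero. */
static bool demo_is_zero_naive(const struct demo_ipaddr *ip)
{
	static const uint8_t zero[16];

	return memcmp(ip->bytes, zero, sizeof(zero)) == 0;
}

/* Family-aware check: only the bytes that belong to the family count. */
static bool demo_is_zero(const struct demo_ipaddr *ip)
{
	static const uint8_t zero[16];
	size_t n = 0;

	if (ip->type == DEMO_IPADDR_V4)
		n = 4;
	else if (ip->type == DEMO_IPADDR_V6)
		n = 16;
	return memcmp(ip->bytes, zero, n) == 0;
}
```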
Donald Sharp [Mon, 24 Jan 2022 20:37:54 +0000 (15:37 -0500)]
zebra: Don't double delete the table we are cleaning up
vrf_disable is always called first before
vrf_delete. The rnh_table and rnh_table_multicast tables
are already deleted as part of vrf_disable. No need
to do it again.
ckishimo [Tue, 25 Jan 2022 17:49:27 +0000 (18:49 +0100)]
ospf6d: show if area is NSSA
This PR marks NSSA areas as such in the output of "show ipv6 ospf":
r2# show ipv6 ospf
...
Area 0.0.0.0
Number of Area scoped LSAs is 8
Interface attached to this area: r2-eth1
SPF last executed 20.46717s ago
Area 0.0.0.1[Stub]
Number of Area scoped LSAs is 9
Interface attached to this area: r2-eth0
SPF last executed 20.46911s ago
Area 0.0.0.2[NSSA]
Number of Area scoped LSAs is 14
Interface attached to this area: r2-eth2
SPF last executed 20.46801s ago
Trey Aspelund [Tue, 16 Nov 2021 21:11:26 +0000 (21:11 +0000)]
bgpd: retain peer asn even with remove-private-AS
In situations where remove-private-AS is configured for eBGP peers
residing in a private ASN, the peer's ASN was not being retained
in the AS-Path which can allow loops to occur. This was addressed
in a prior commit but it only addressed cases where the "replace-AS"
keyword was configured.
This commit ensures we retain the peer's ASN when using
"remove-private-AS" for eBGP peers in a private ASN regardless of other
keywords.
Before ("remove-private-AS" only):
=========
ub18# show ip bgp neighbors enp6s0 advertised-routes | include 100.64.0.2
*> 100.64.0.2/32 :: 0 i <<<<< empty as-path, no way to prevent loop
After ("remove-private-AS" only):
=========
ub18# show ip bgp neighbors enp6s0 advertised-routes | include 100.64.0.2
*> 100.64.0.2/32 :: 0 42000000014200000001 i <<<< retain peer's asn, breaks loop
Igor Ryzhov [Sun, 23 Jan 2022 13:08:46 +0000 (16:08 +0300)]
*: do not send opaque data to zebra by default
Opaque data takes up a lot of memory when there are a lot of routes on
the box. Given that this is only cosmetic information, I propose disabling
it by default so as not to shock people who start using FRR for the first
time or upgrade from an old version.
Quentin Young [Thu, 20 Jan 2022 22:24:30 +0000 (17:24 -0500)]
pimd: fix misuse of xpath buf size constants
XPATH_MAXLEN denotes the maximum length of an XPATH. It does not make
sense to allocate a buffer intended to contain an XPATH with a size
larger than the maximum allowable size of an XPATH. Consequently this PR
removes buffers that do this. Prints into these buffers are now checked
for overflow.
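A sketch of an overflow-checked print into a buffer sized exactly to the maximum. `DEMO_XPATH_MAXLEN` is a deliberately small stand-in chosen for the demo, not FRR's constant, and the helper name is hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

#define DEMO_XPATH_MAXLEN 32 /* assumed small value for the demo */

/* Build "<base>/<leaf>" into a buffer of exactly DEMO_XPATH_MAXLEN
 * bytes. Returns false if the result would not fit. */
static bool demo_build_xpath(char out[DEMO_XPATH_MAXLEN], const char *base,
			     const char *leaf)
{
	int n = snprintf(out, DEMO_XPATH_MAXLEN, "%s/%s", base, leaf);

	/* snprintf returns the length it wanted to write; a value at or
	 * above the buffer size means the output was truncated. */
	return n >= 0 && n < DEMO_XPATH_MAXLEN;
}
```

The caller treats a `false` return as an error rather than continuing with a silently truncated path.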
Donatas Abraitis [Fri, 21 Jan 2022 21:31:58 +0000 (23:31 +0200)]
bgpd: Show negative form of capability extended-nexthop for interface peers
```
exit1-debian-11(config-router)# neighbor 192.168.100.3 remote-as external
exit1-debian-11(config-router)# do sh run | include extended
exit1-debian-11(config-router)# neighbor 192.168.100.3 capability extended-nexthop
exit1-debian-11(config-router)# do sh run | include extended
neighbor 192.168.100.3 capability extended-nexthop
exit1-debian-11(config-router)# no neighbor 192.168.100.3 capability extended-nexthop
exit1-debian-11(config-router)# do sh run | include extended
exit1-debian-11(config-router)# neighbor eth0 interface remote-as external
exit1-debian-11(config-router)# do sh run | include extended
exit1-debian-11(config-router)# neighbor eth0 capability extended-nexthop
exit1-debian-11(config-router)# do sh run | include extended
exit1-debian-11(config-router)# no neighbor eth0 capability extended-nexthop
exit1-debian-11(config-router)# do sh run | include extended
no neighbor eth0 capability extended-nexthop
exit1-debian-11(config-router)#
```