Has broken `make check` with recently new compilers:
/usr/bin/ld: staticd/libstatic.a(static_nb_config.o): warning: relocation against `zebra_ecmp_count' in read-only section `.text'
CCLD tests/bgpd/test_peer_attr
CCLD tests/bgpd/test_packet
/usr/bin/ld: staticd/libstatic.a(static_zebra.o): in function `static_zebra_capabilities':
/home/sharpd/frr5/staticd/static_zebra.c:208: undefined reference to `zebra_ecmp_count'
/usr/bin/ld: staticd/libstatic.a(static_zebra.o): in function `static_zebra_route_add':
/home/sharpd/frr5/staticd/static_zebra.c:418: undefined reference to `zebra_ecmp_count'
/usr/bin/ld: staticd/libstatic.a(static_nb_config.o): in function `static_nexthop_create':
/home/sharpd/frr5/staticd/static_nb_config.c:174: undefined reference to `zebra_ecmp_count'
/usr/bin/ld: /home/sharpd/frr5/staticd/static_nb_config.c:175: undefined reference to `zebra_ecmp_count'
/usr/bin/ld: warning: creating DT_TEXTREL in a PIE
collect2: error: ld returned 1 exit status
make: *** [Makefile:8679: tests/lib/test_grpc] Error 1
make: *** Waiting for unfinished jobs....
Essentially the newly introduced variable zebra_ecmp_count is not available in the
libstatic.a compiled and make check has code that compiles against it.
The fix is to just move the variable to the library.
Donald Sharp [Sat, 26 Feb 2022 20:40:15 +0000 (15:40 -0500)]
zebra: Allow *BSD to specify a receive buffer size
End operator is reporting that they are receiving buffer overruns
when attempting to read from the kernel receive socket. It is
possible to adjust this size to more modern levels especially
for when the system is under load. Modify the code base
so that *BSD operators can use the zebra `-s XXX` option
to specify a read buffer.
Additionally setup the default receive buffer size on *BSD
to be 128k instead of the 8k so that FRR does not run into
this issue again.
rgirada [Thu, 24 Feb 2022 17:33:08 +0000 (09:33 -0800)]
ospfd: NULL passed instead of ei pointer in external lsa origination
Description:
NULL pointer wrongly passed instead of 'ei' pointer to
ospf_external_lsa_originate() API in opaque capability enable/disable
which always make it to fail in origination.
Corrected it by passing actual ei pointer.
Donald Sharp [Fri, 18 Feb 2022 15:45:46 +0000 (10:45 -0500)]
bfdd: Fix overflow possibility with time statements
If time ( a uint64_t ) is large enough doing division
and subtraction can still lead to situations where
the resulting number is greater than a uint32_t.
Just use uint32_t as an intermediate storage spot.
This is unlikely to every occur in a time frame
I could possibly care about but makes Coverity happy.
Donald Sharp [Wed, 16 Feb 2022 00:47:23 +0000 (19:47 -0500)]
ripd: Fix packet send for non primary addresses
When rip is configured to work on secondary addresses
on an interface, rip was not properly sending out
the packets on secondary addresses because the source of the
packet was never properly being setup and rip would
send the packet out multiple times for the primary address
not once for each address on the interface that is setup to work.
Donald Sharp [Tue, 15 Feb 2022 20:53:30 +0000 (15:53 -0500)]
bgpd: Convert bgp error codes for cli input to an enum
Conversion of bgp error codes returned for cli input into
an enum and then properly handling all the error cases
in bgp_vty_return.
Because not all error codes returned were properly handled
in this function there existed configuration examples that
were accepted on the cli without an error message but not
saved.
Donald Sharp [Tue, 15 Feb 2022 21:04:50 +0000 (16:04 -0500)]
bgpd: Move some error codes to bgp_vty_return handling
BGP_ERR_PEER_GROUP_MEMBER and BGP_ERR_PEER_GROUP_PEER_TYPE_DIFFERENT
both are not handled by bgp_vty_return, but both can be handled by
this function as that there is nothing special going on here.
Donald Sharp [Tue, 15 Feb 2022 20:54:53 +0000 (15:54 -0500)]
bgpd: Remove impossible invalid state
confederations are checking to see that the bgp pointer
is non-null. But it's impossible to have a null pointer
in the cli and in all paths we have already deref'ed the bgp
pointer. Let's remove that error code as that it is impossible
to happen.
Donald Sharp [Mon, 14 Feb 2022 12:57:45 +0000 (07:57 -0500)]
bgp: Add a 15 minute warning to missing policy
Add a 15 minute warning to the logging system when
bgp policy is not setup properly. Operators keep asking
about the missing policy( on upgrade typically ). Let's
try to give them a bit more of a hint when something is
going wrong as that they are clearly missing the other
various places FRR tells them about it.
Igor Ryzhov [Wed, 9 Feb 2022 23:51:49 +0000 (02:51 +0300)]
tools: fix frr-reload context keywords
There are singline-line commands inside `router bgp` that start with
`vnc ` or `bmp `. Those commands are currently treated as node-entering
commands. We need to specify such commands more precisely.
Igor Ryzhov [Wed, 9 Feb 2022 22:23:41 +0000 (01:23 +0300)]
bgpd: fix aspath memleak on error in vnc_direct_bgp_add_nve
bgp_attr_default_set creates a new empty aspath. If family error happens,
this aspath is not freed. Move attr initialization after we checked the
family.
Juraj Vijtiuk [Wed, 13 Oct 2021 16:32:53 +0000 (18:32 +0200)]
isisd: fix router capability TLV parsing issues
isis_tlvs.c would fail at multiple places if incorrect TLVs were
received causing stream assertion violations.
This patch fixes the issues by adding missing length checks, missing
consumed length updates and handling malformed Segment Routing subTLVs.
Signed-off-by: Juraj Vijtiuk <juraj.vijtiuk@sartura.hr>
Small adjustments by Igor Ryzhov:
- fix incorrect replacement of srgb by srlb on lines 3052 and 3054
- add length check for ISIS_SUBTLV_ALGORITHM
- fix conflict in fuzzing data during rebase
Donald Sharp [Wed, 1 Dec 2021 22:03:38 +0000 (17:03 -0500)]
lib: Update hash.h documentation to warn of a possible crash
Multiple deletions from the hash_walk or hash_iteration calls
during a single invocation of the passed in function can and
will cause the program to crash. Warn against doing such a
thing.
Donald Sharp [Wed, 1 Dec 2021 21:28:42 +0000 (16:28 -0500)]
zebra: Ensure zebra_nhg_sweep_table accounts for double deletes
I'm seeing this crash in various forms:
Program terminated with signal SIGSEGV, Segmentation fault.
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f418efbc7c0 (LWP 3580253))]
(gdb) bt
(gdb) f 4
267 (*func)(hb, arg);
(gdb) p hb
$1 = (struct hash_bucket *) 0x558cdaafb250
(gdb) p *hb
$2 = {len = 0, next = 0x0, key = 0, data = 0x0}
(gdb)
I've also seen a crash where data is 0x03.
My suspicion is that hash_iterate is calling zebra_nhg_sweep_entry which
does delete the particular entry we are looking at as well as possibly other
entries when the ref count for those entries gets set to 0 as well.
Then we have this loop in hash_iterate.c:
for (i = 0; i < hash->size; i++)
for (hb = hash->index[i]; hb; hb = hbnext) {
/* get pointer to next hash bucket here, in case (*func)
* decides to delete hb by calling hash_release
*/
hbnext = hb->next;
(*func)(hb, arg);
}
Suppose in the previous loop hbnext is set to hb->next and we call
zebra_nhg_sweep_entry. This deletes the previous entry and also
happens to cause the hbnext entry to be deleted as well, because of nhg
refcounts. At this point in time the memory pointed to by hbnext is
not owned by the pthread anymore and we can end up on a state where
it's overwritten by another pthread in zebra with data for other incoming events.
What to do? Let's change the sweep function to a hash_walk and have
it stop iterating and to start over if there is a possible double
delete operation.
qingkaishi [Fri, 4 Feb 2022 21:41:11 +0000 (16:41 -0500)]
babeld: fix #10502 #10503 by repairing the checks on length
This patch repairs the checking conditions on length in four functions:
babel_packet_examin, parse_hello_subtlv, parse_ihu_subtlv, and parse_update_subtlv
When I run pimd. Looking at the code there are 3 places where pim_bsm.c removes the
NHT BSR tracking. In 2 of them the code ensures that the address is already setup
in 1 place it is not. Fix.
Donald Sharp [Wed, 10 Nov 2021 21:58:58 +0000 (16:58 -0500)]
zebra: Fix v6 route replace failure turned into success
Currently when we have a route replace operation for v6 routes
with a new nexthop group the order of kernel installation is this:
a) New nexthop group insertion seq 1
b) Route delete operation seq 3
c) Route insertion operation seq 2
Currently the code in nl_batch_read_resp is attempting
to handle this situation by skipping the delete operation.
*BUT* it is enqueuing the context into the zebra dplane
queue before we read the response. Since we create the ctx
with an implied success, success is being reported to the
upper level dplane and the zebra rib thinks the route has
been properly handled.
This is showing up in the zebra_seg6_route test code because
the test code is installing a seg6 route w/ sharpd and it
is failing to install because the route's nexthop is rejected:
a) nexthop installation seq 11
b) route delete seq 13
c) route add seq 12
Note the last line, we report the install as a success but it clearly failed from the seq=12 decode.
When we look at the v6 rib it thinks it is installed:
unet> r1 show ipv6 route
Codes: K - kernel route, C - connected, S - static, R - RIPng,
O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
v - VNC, V - VNC-Direct, A - Babel, D - SHARP, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
So let's modify nl_batch_read_resp to not dequeue/enqueue the context until we are sure we have
the right one. This fixes the test code to do the right thing on the second installation.
Donald Sharp [Wed, 10 Nov 2021 20:09:37 +0000 (15:09 -0500)]
zebra: set zd_is_update in 1 spot
The ctx->zd_is_update is being set in various
spots based upon the same value that we are
passing into dplane_ctx_ns_init. Let's just
consolidate all this into the dplane_ctx_ns_init
so that the zd_is_udpate value is set at the
same time that we increment the sequence numbers
to use.
As a note for future me's reading this. The sequence
number choosen for the seq number passed to the
kernel is that each context gets a copy of the
appropriate nlsock to use. Since it's a copy
at a point in time, we know we have a unique sequence
number value.
Donald Sharp [Wed, 10 Nov 2021 19:12:39 +0000 (14:12 -0500)]
zebra: When we get an implicit or ack or full failure mark status
When nl_batch_read_resp gets a full on failure -1 or an implicit
ack 0 from the kernel for a batch of code. Let's immediately
mark all of those in the batch pass/fail as needed. Instead
of having them marked else where.
Stephen Worley [Thu, 27 Jan 2022 17:41:49 +0000 (12:41 -0500)]
pbrd: pbr route maps get addr family of nhgs
When adding a nhg to a route map, make sure to specify the `family`
of the rm by looking at the contents of the nhg. Installation in the
kernel (for DSCP rules in particular) relies on this being specified in
the netlink message.
Signed-off-by: Wesley Coakley <wcoakley@nvidia.com> Signed-off-by: Stephen Worley <sworley@nvidia.com>
(cherry picked from commit 9a7ea213c072a24aa2059e04cb51502a3e956705)
Tomi Salminen [Wed, 2 Feb 2022 09:19:09 +0000 (11:19 +0200)]
ospfd: Core in ospf_if_down during shutdown.
Skip marking routes as changed in ospf_if_down if there's now
new_table present, which might be the case when the instance is
being finished
The backtrace for the core was:
raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:50
core_handler (signo=11, siginfo=0x7fffffffe170, context=<optimized out>) at lib/sigevent.c:262
<signal handler called>
route_top (table=0x0) at lib/table.c:401
ospf_if_down (oi=oi@entry=0x555555999090) at ospfd/ospf_interface.c:849
ospf_if_free (oi=0x555555999090) at ospfd/ospf_interface.c:339
ospf_finish_final (ospf=0x55555599c830) at ospfd/ospfd.c:749
ospf_deferred_shutdown_finish (ospf=0x55555599c830) at ospfd/ospfd.c:578
ospf_deferred_shutdown_check (ospf=<optimized out>) at ospfd/ospfd.c:627
ospf_finish (ospf=<optimized out>) at ospfd/ospfd.c:683
ospf_terminate () at ospfd/ospfd.c:653
sigint () at ospfd/ospf_main.c:109
quagga_sigevent_process () at lib/sigevent.c:130
thread_fetch (m=m@entry=0x5555556e45e0, fetch=fetch@entry=0x7fffffffe9b0) at lib/thread.c:1709
frr_run (master=0x5555556e45e0) at lib/libfrr.c:1174
main (argc=9, argv=0x7fffffffecb8) at ospfd/ospf_main.c:254
Igor Ryzhov [Sun, 23 Jan 2022 17:22:42 +0000 (20:22 +0300)]
zebra: fix cleanup of meta queues on vrf disable
Current code treats all metaqueues as lists of route_node structures.
However, some queues contain other structures that need to be cleaned up
differently. Casting the elements of those queues to struct route_node
and dereferencing them leads to a crash. The crash may be seen when
executing bgp_multi_vrf_topo2.
Fix the code by using the proper list element types.
Donald Sharp [Mon, 31 Jan 2022 17:49:55 +0000 (12:49 -0500)]
ospfd: Convert output to host order from network order for route_tag
FRR stores the route_tag in network byte order. Bug filed indicates
that the `show ip ospf route` command shows the correct value.
Every place route_tag is dumped in ospf_vty.c the ntohl function
is used first.
Fixes: #10450 Signed-off-by: Donald Sharp <sharpd@nvidia.com>