Donald Sharp [Thu, 31 Mar 2022 19:56:24 +0000 (15:56 -0400)]
isisd, lib, ospfd, pathd: Null out free'd pointer
The commands:
router isis 1
mpls-te on
no mpls-te on
mpls-te on
no mpls-te on
!
Will crash
Valgrind gives us this:
==652336== Invalid read of size 8
==652336== at 0x49AB25C: typed_rb_min (typerb.c:495)
==652336== by 0x4943B54: vertices_const_first (link_state.h:424)
==652336== by 0x493DCE4: vertices_first (link_state.h:424)
==652336== by 0x493DADC: ls_ted_del_all (link_state.c:1010)
==652336== by 0x47E77B: isis_instance_mpls_te_destroy (isis_nb_config.c:1871)
==652336== by 0x495BE20: nb_callback_destroy (northbound.c:1131)
==652336== by 0x495B5AC: nb_callback_configuration (northbound.c:1356)
==652336== by 0x4958127: nb_transaction_process (northbound.c:1473)
==652336== by 0x4958275: nb_candidate_commit_apply (northbound.c:906)
==652336== by 0x49585B8: nb_candidate_commit (northbound.c:938)
==652336== by 0x495CE4A: nb_cli_classic_commit (northbound_cli.c:64)
==652336== by 0x495D6C5: nb_cli_apply_changes_internal (northbound_cli.c:250)
==652336== Address 0x6f928e0 is 272 bytes inside a block of size 320 free'd
==652336== at 0x48399AB: free (vg_replace_malloc.c:538)
==652336== by 0x494BA30: qfree (memory.c:141)
==652336== by 0x493D99D: ls_ted_del (link_state.c:997)
==652336== by 0x493DC20: ls_ted_del_all (link_state.c:1018)
==652336== by 0x47E77B: isis_instance_mpls_te_destroy (isis_nb_config.c:1871)
==652336== by 0x495BE20: nb_callback_destroy (northbound.c:1131)
==652336== by 0x495B5AC: nb_callback_configuration (northbound.c:1356)
==652336== by 0x4958127: nb_transaction_process (northbound.c:1473)
==652336== by 0x4958275: nb_candidate_commit_apply (northbound.c:906)
==652336== by 0x49585B8: nb_candidate_commit (northbound.c:938)
==652336== by 0x495CE4A: nb_cli_classic_commit (northbound_cli.c:64)
==652336== by 0x495D6C5: nb_cli_apply_changes_internal (northbound_cli.c:250)
==652336== Block was alloc'd at
==652336== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==652336== by 0x494B6F8: qcalloc (memory.c:116)
==652336== by 0x493D7D2: ls_ted_new (link_state.c:967)
==652336== by 0x47E4DD: isis_instance_mpls_te_create (isis_nb_config.c:1832)
==652336== by 0x495BB29: nb_callback_create (northbound.c:1034)
==652336== by 0x495B547: nb_callback_configuration (northbound.c:1348)
==652336== by 0x4958127: nb_transaction_process (northbound.c:1473)
==652336== by 0x4958275: nb_candidate_commit_apply (northbound.c:906)
==652336== by 0x49585B8: nb_candidate_commit (northbound.c:938)
==652336== by 0x495CE4A: nb_cli_classic_commit (northbound_cli.c:64)
==652336== by 0x495D6C5: nb_cli_apply_changes_internal (northbound_cli.c:250)
==652336== by 0x495D23E: nb_cli_apply_changes (northbound_cli.c:268)
Let's null out the pointer. After this change. Valgrind no longer reports issues
and isisd no longer crashes.
Fixes: #10939 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
David Lamparter [Thu, 31 Mar 2022 15:35:32 +0000 (17:35 +0200)]
build: stick `-g` into LDFLAGS
Without `-g` in LDFLAGS we won't get debug info even if it's enabled in
CFLAGS. Since we're controlling debug info through CFLAGS, there's no
harm in always having `-g` in LDFLAGS.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Javier Garcia [Thu, 10 Mar 2022 15:32:29 +0000 (16:32 +0100)]
docker: Adding support for ubi-8 images.
- Create frr docker container based in new Red Hat Universal Base
Images.
- This build a docker container based in ubi-8.
- Need to get the devel packages from centos-8 stream repos.
- Centos-8 stream repos added : base, appstream, powertools and epel
Signed-off-by: Javier Garcia <javier.martin.garcia@ibm.com>
Donald Sharp [Tue, 29 Mar 2022 21:47:48 +0000 (17:47 -0400)]
tools: Modify all_stop to actually try to stop all daemons
If a user enters:
/usr/lib/frr/frrinit.sh start
<modifies daemons file to remove a daemon from being used>
then calls:
/usr/lib/frr/frrinit.sh stop
The daemon(s) that are removed from the file are not stopped.
Apparently in shell scripting naming the variables the same
as the callee does not work as one would think. Just renaming
the variable to something else solved the problem.
David Lamparter [Mon, 28 Mar 2022 15:57:50 +0000 (17:57 +0200)]
pim6d: box out IPv4 fragmentation code
... this shouldn't run for IPv6. (We'll switch to not using
IPV6_HDRINCL later, so the kernel will handle it, but for the time being
let's just stop trying to use the IPv4 code for IPv6.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Donatas Abraitis [Mon, 28 Mar 2022 08:41:35 +0000 (11:41 +0300)]
bgpd: Stop LLGR timer when the connection is established
When the connection goes up, the timer is not stopped and if we have a
subsequent GR event we have an old timer which is not as we expect.
Before:
```
spine1-debian-11# sh ip bgp 192.168.100.1/32
BGP routing table entry for 192.168.100.1/32, version 95
Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart)
Not advertised to any peer
65001 47583, (stale)
192.168.0.1 from 192.168.0.1 (100.100.200.100)
Origin incomplete, valid, external, best (First path received)
Community: llgr-stale
Last update: Mon Mar 28 08:27:53 2022
Time until Long-lived stale route deleted: 23 <<<<<<<<<<<<
spine1-debian-11# sh ip bgp 192.168.100.1/32
BGP routing table entry for 192.168.100.1/32, version 103
Paths: (1 available, best #1, table default)
Advertised to non peer-group peers:
192.168.0.1
65001 47583
192.168.0.1 from 192.168.0.1 (100.100.200.100)
Origin incomplete, valid, external, best (First path received)
Last update: Mon Mar 28 08:43:29 2022
spine1-debian-11# sh ip bgp 192.168.100.1/32
BGP routing table entry for 192.168.100.1/32, version 103
Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart)
Not advertised to any peer
65001 47583, (stale)
192.168.0.1 from 192.168.0.1 (100.100.200.100)
Origin incomplete, valid, external, best (First path received)
Community: llgr-stale
Last update: Mon Mar 28 08:43:30 2022
Time until Long-lived stale route deleted: 17 <<<<<<<<<<<<<<<
```
After:
```
spine1-debian-11# sh ip bgp 192.168.100.1/32
BGP routing table entry for 192.168.100.1/32, version 79
Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart)
Not advertised to any peer
65001 47583, (stale)
192.168.0.1 from 192.168.0.1 (0.0.0.0)
Origin incomplete, valid, external, best (First path received)
Community: llgr-stale
Last update: Mon Mar 28 09:05:18 2022
Time until Long-lived stale route deleted: 24 <<<<<<<<<<<<<<<
spine1-debian-11# sh ip bgp 192.168.100.1/32
BGP routing table entry for 192.168.100.1/32, version 87
Paths: (1 available, best #1, table default)
Advertised to non peer-group peers:
192.168.0.1
65001 47583
192.168.0.1 from 192.168.0.1 (100.100.200.100)
Origin incomplete, valid, external, best (First path received)
Last update: Mon Mar 28 09:05:25 2022
spine1-debian-11# sh ip bgp 192.168.100.1/32
BGP routing table entry for 192.168.100.1/32, version 87
Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart)
Not advertised to any peer
65001 47583, (stale)
192.168.0.1 from 192.168.0.1 (100.100.200.100)
Origin incomplete, valid, external, best (First path received)
Community: llgr-stale
Last update: Mon Mar 28 09:05:29 2022
Time until Long-lived stale route deleted: 29 <<<<<<<<<<<<<<
```
Donald Sharp [Sat, 26 Mar 2022 20:20:53 +0000 (16:20 -0400)]
lib: Ensure order of operations is expected with SECONDS
These 3 values:
ONE_DAY_SECOND
ONE_WEEK_SECOND
ONE_YEAR_SECOND
Are defined based upon the number of seconds. Unfortunately doing math
on these values say something like:
days = t->tv_sec / ONE_DAY_SECOND;
Once you go over about a day causes the order of operations to cause the multiplication
to get messed up:
204 if (!t)
(gdb) n
207 w = d = h = m = ms = 0;
(gdb) set t->tv_sec = ONE_DAY_SECOND + 30
(gdb) n
208 memset(buf, 0, size);
(gdb)
210 us = t->tv_usec;
(gdb)
211 if (us >= 1000) {
(gdb)
212 ms = us / 1000;
(gdb)
213 us %= 1000;
(gdb)
217 if (ms >= 1000) {
(gdb)
222 if (t->tv_sec > ONE_WEEK_SECOND) {
(gdb)
227 if (t->tv_sec > ONE_DAY_SECOND) {
(gdb)
228 d = t->tv_sec / ONE_DAY_SECOND;
(gdb) n
229 t->tv_sec -= d * ONE_DAY_SECOND;
(gdb) n
232 if (t->tv_sec >= HOUR_IN_SECONDS) {
(gdb) p d
$6 = 2073600
(gdb) p t->tv_sec
$7 = -179158953570
(gdb)
Converting to adding paranthesis around around the ONE_DAY_SECOND causes
the order of operations to work as expected.
Fixes: #10880 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Eugene Crosser [Wed, 23 Mar 2022 10:17:28 +0000 (11:17 +0100)]
autoconf: do not .gitignore m4/ax_lua.m4
The file m4/ax_lua.m4 needs to be a part of distribution, but it is not
inluded in the git repository by default becuase .gitignore file has a
wildcard for all *.m4 files, while individual files that must _not_ be
ignored are listed one by one as exceptions. ax_lua.m4 needs to be added
to this list of exceptions too.
One failure scenario is when you put a snapshot of the source tree in a
new git repository (e.g. the one used for a local CI/CD), this file is
not included in the repository, and subsequently build fails.
This commit adds the exception into m4/.gitignore file
Donald Sharp [Fri, 25 Mar 2022 23:08:14 +0000 (19:08 -0400)]
zebra: Note when the netlink DUMP command is interrupted
There exists code paths in the linux kernel where a dump command
will be interrupted( I am not sure I understand what this really
means ) and the data sent back from the kernel is wrong or incomplete.
At this point in time I am not 100% certain what should be done, but
let's start noticing that this has happened so we can formulate a plan
or allow the end operator to know bad stuff is a foot at the circle K.
That commit aim is to fix an invalid behavior when
default-information is activated on ospf router without always option.
Consider an ASBR with:
-one default route coming from ospf,
-and another default route coming from another deaemon (such BGP or static).
When the daemon bgp stops advertising its default route,
-ospf continues to advertise its previous default route (with aging 0),
-this may create default routing loops.
Expected behavior: is to update the removed external default route with
MAXAGING value.
Updating with MAXAGING value will notify the fact the route is currently
invalid. A later removal from ospf external LSA database will be made.
Analysis: all default routes have their type overwritten by a
DEFAULT_ROUTE type. Thus all default routes whatever its origin (ospf,
bgp, static...) is treated in a same way. But this is not pertinent for
ospf originated default routes.
Fix: avoid overwiting of route type when default route is ospf type.
Philippe Guibert [Tue, 15 Feb 2022 16:18:30 +0000 (17:18 +0100)]
ospfd: silently remove prefix sid already stored in config
There is no need to have an interface available to configure
SRGB. Conversely, it should be possible to remove the SRGB
when no interfaces are available.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Donald Sharp [Fri, 25 Mar 2022 11:44:55 +0000 (07:44 -0400)]
bgpd: Fix possible insufficient stream data
When reading the BGP_PREFIX_SID_SRV6_L3_SERVICE_SID_STRUCTURE
it is possible that the length read in the packet is insufficiently
large enough to read a BGP_PREFIX_SID_SRV6_L3_SERVICE_SID_STRUCTURE.
Let's ensure that it is.
Fixes: #10860 Signed-off-by: Donald Sharp <sharpd@nvidia.com>