Donald Sharp [Thu, 24 Mar 2022 22:08:29 +0000 (18:08 -0400)]
lib: Fix `terminal monitor` uninited memory usage on freebsd
When `terminal monitor` is issued I am seeing this for valgrind on freebsd:
2022/03/24 18:07:45 ZEBRA: [RHJDG-5FNSK][EC 100663304] can't open configuration file [/usr/local/etc/frr/zebra.conf]
==52993== Syscall param sendmsg(sendmsg.msg_control) points to uninitialised byte(s)
==52993== at 0x4CE268A: _sendmsg (in /lib/libc.so.7)
==52993== by 0x4B96245: ??? (in /lib/libthr.so.3)
==52993== by 0x4CDF329: sendmsg (in /lib/libc.so.7)
==52993== by 0x49A9994: vtysh_do_pass_fd (vty.c:2041)
==52993== by 0x49A9994: vtysh_flush (vty.c:2070)
==52993== by 0x499F4CE: thread_call (thread.c:2002)
==52993== by 0x495D317: frr_run (libfrr.c:1196)
==52993== by 0x2B4068: main (main.c:471)
==52993== Address 0x7fc000864 is on thread 1's stack
==52993== in frame #3, created by vtysh_flush (vty.c:2065)
Donald Sharp [Thu, 24 Mar 2022 16:57:01 +0000 (12:57 -0400)]
zebra: Don't send uninited data to kernel on FreeBSD
When running zebra w/ valgrind, it was noticed that there
was a bunch of passing uninitialized data to the kernel:
==38194== Syscall param ioctl(generic) points to uninitialised byte(s)
==38194== at 0x4CDF88A: ioctl (in /lib/libc.so.7)
==38194== by 0x49A4031: vrf_ioctl (vrf.c:860)
==38194== by 0x2AFE29: vrf_if_ioctl (ioctl.c:91)
==38194== by 0x2AFF39: if_get_mtu (ioctl.c:161)
==38194== by 0x2B12C3: ifm_read (kernel_socket.c:653)
==38194== by 0x2A7F76: interface_list (if_sysctl.c:129)
==38194== by 0x2E9958: zebra_ns_enable (zebra_ns.c:127)
==38194== by 0x2E9958: zebra_ns_init (zebra_ns.c:214)
==38194== by 0x2B3F82: main (main.c:401)
==38194== Address 0x7fc000967 is on thread 1's stack
==38194== in frame #3, created by if_get_mtu (ioctl.c:155)
==38194==
==38194== Syscall param ioctl(generic) points to uninitialised byte(s)
==38194== at 0x4CDF88A: ioctl (in /lib/libc.so.7)
==38194== by 0x49A4031: vrf_ioctl (vrf.c:860)
==38194== by 0x2AFE29: vrf_if_ioctl (ioctl.c:91)
==38194== by 0x2AFED9: if_get_metric (ioctl.c:143)
==38194== by 0x2B12CB: ifm_read (kernel_socket.c:655)
==38194== by 0x2A7F76: interface_list (if_sysctl.c:129)
==38194== by 0x2E9958: zebra_ns_enable (zebra_ns.c:127)
==38194== by 0x2E9958: zebra_ns_init (zebra_ns.c:214)
==38194== by 0x2B3F82: main (main.c:401)
==38194== Address 0x7fc000967 is on thread 1's stack
==38194== in frame #3, created by if_get_metric (ioctl.c:137)
==38194==
==38194== Syscall param ioctl(generic) points to uninitialised byte(s)
==38194== at 0x4CDF88A: ioctl (in /lib/libc.so.7)
==38194== by 0x49A4031: vrf_ioctl (vrf.c:860)
==38194== by 0x2AFE29: vrf_if_ioctl (ioctl.c:91)
==38194== by 0x2B052D: if_get_flags (ioctl.c:419)
==38194== by 0x2B1CF1: ifam_read (kernel_socket.c:930)
==38194== by 0x2A7F57: interface_list (if_sysctl.c:132)
==38194== by 0x2E9958: zebra_ns_enable (zebra_ns.c:127)
==38194== by 0x2E9958: zebra_ns_init (zebra_ns.c:214)
==38194== by 0x2B3F82: main (main.c:401)
==38194== Address 0x7fc000707 is on thread 1's stack
==38194== in frame #3, created by if_get_flags (ioctl.c:411)
Donatas Abraitis [Thu, 24 Mar 2022 10:00:57 +0000 (12:00 +0200)]
bgpd: Turn off thread when running `no bmp targets X`
Avoid use-after-free and prevent from crashing:
```
(gdb) bt
0 raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
1 0x00007f2a15c2c30d in core_handler (signo=11, siginfo=0x7fffb915e630, context=<optimized out>) at lib/sigevent.c:261
2 <signal handler called>
3 0x00007f2a156201e4 in bmp_stats (thread=<optimized out>) at bgpd/bgp_bmp.c:1330
4 0x00007f2a15c3d553 in thread_call (thread=thread@entry=0x7fffb915ebf0) at lib/thread.c:2001
5 0x00007f2a15bfa570 in frr_run (master=0x55c43a393ae0) at lib/libfrr.c:1196
6 0x000055c43930627c in main (argc=<optimized out>, argv=<optimized out>) at bgpd/bgp_main.c:519
(gdb)
```
pimd: correct prefix-list ssm range update for exclude mode
Modifying the code as per RFC 4604 section 2.2.1
EXCLUDE mode does not apply to SSM addresses, and an SSM-aware router
will ignore MODE_IS_EXCLUDE and CHANGE_TO_EXCLUDE_MODE requests in
the SSM range.
Issue is observed when a group in exclude mode was in ASM range
as per the prefix-list and then prefix-list is modified to make
it fall under SSM range. The (*,G) entry remains there.
So when the group moves to ssm range and it is exclude mode,
delete the group from the IGMP table.
anlan_cs [Sat, 19 Mar 2022 05:15:09 +0000 (13:15 +0800)]
bgpd: remove check returning value of RB_INSERT()
Since the `RB_INSERT()` is called after not found in RB tree, it MUST be ok and
and return zero. The check of returning value of `RB_INSERT()` is redundant,
just remove them.
Donald Sharp [Sat, 19 Mar 2022 11:44:54 +0000 (07:44 -0400)]
zebra: Do not complain if deletion fails
When issuing a RTM_DELETE operation and the kernel tells
us that the route is already deleted, let's not complain
about the situation:
2022/03/19 02:40:34 ZEBRA: [EC 100663303] kernel_rtm: 2a10:cc42:1d51::/48: rtm_write() unexpectedly returned -4 for command RTM_DELETE
I can recreate this issue on freebsd by doing this:
a) create a route using sharpd
b) shutdown the nexthop's interface
c) remove the route using sharpd
This would also be true of pretty much any routing protocol's behavior.
Let's just not complain about the situation if a RTM_DELETE
operation is issued and FRR is told that the route does not
exist to delete.
anlan_cs [Sat, 19 Mar 2022 05:04:43 +0000 (13:04 +0800)]
zebra: remove check returning value of RB_INSERT()
Since the `RB_INSERT()` is called after not found in RB tree, it MUST be ok and
and return zero. The check of returning value of `RB_INSERT()` is redundant,
just remove them.
The EAD-per-ES route carries ECs for all the ES-EVI RTs. As the number of VNIs
increase all RTs do not fit into a standard BGP UPDATE (4K) so the route needs
to be fragmented.
Each fragment is associated with a separate RD and frag-id -
1. Local ES-per-EAD -
ES route table - {ES-frag-ID, ESI, ET=0xffffffff, VTEP-IP}
global route table - {RD-=ES-frag-RD, ESI, ET=0xffffffff}
2. Remote ES-per-EAD -
VNI route table - {ESI, ET=0xffffffff, VTEP-IP}
global route table - {RD-=ES-frag-RD, ESI, ET=0xffffffff}
Note: The fragment ID is abandoned in the per-VNI routing table. At this
point that is acceptable as we dont expect more than one-ES-per-EAD fragment
to be imported into the per-VNI routing table. But that may need to be
re-worked at a later point.
CLI changes (sample with 4 VNIs per-fragment for experimental pruposes) -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@torm-11:mgmt:~# vtysh -c "show bgp l2vpn evpn es 03:44:38:39:ff:ff:01:00:00:01"
ESI: 03:44:38:39:ff:ff:01:00:00:01
Type: LR
RD: 27.0.0.21:3
Originator-IP: 27.0.0.21
Local ES DF preference: 50000
VNI Count: 10
Remote VNI Count: 10
VRF Count: 3
MACIP EVI Path Count: 33
MACIP Global Path Count: 198
Inconsistent VNI VTEP Count: 0
Inconsistencies: -
Fragments: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
27.0.0.21:3 EVIs: 4
27.0.0.21:13 EVIs: 4
27.0.0.21:22 EVIs: 2
VTEPs:
27.0.0.22 flags: EA df_alg: preference df_pref: 32767
27.0.0.23 flags: EA df_alg: preference df_pref: 32767
root@torm-11:mgmt:~# vtysh -c "show bgp l2vpn evpn es-evi vni 1002 detail"
VNI: 1002 ESI: 03:44:38:39:ff:ff:01:00:00:01
Type: LR
ES fragment RD: 27.0.0.21:13 >>>>>>>>>>>>>>>>>>>>>>>>>
Inconsistencies: -
VTEPs: 27.0.0.22(EV),27.0.0.23(EV)
anlan_cs [Wed, 16 Mar 2022 13:38:17 +0000 (21:38 +0800)]
bgpd: fix wrong check on local es routes
Importing local es routes should be skipped. But the check of it is a bit wrong.
It is ok that local es routes can't be imported, but importing local es will
wrongly enter `uninstall` procedure.
Just adjust this check to make it clear. Immediately return in the case
of importing local es routes.
Manoj Naragund [Thu, 17 Mar 2022 11:28:02 +0000 (16:58 +0530)]
ospf6d: crash in ospf6_decrement_retrans_count.
Problem:
ospf6d crash is observed when lsack is received from the neighbour for
AS External LSA.
RCA:
The crash is observed in ospf6_decrement_retrans_count while decrementing
retransmit counter for the LSA when lsack is recived. This is because in
ospf6_flood_interace when new LSA is being added to the neighbour's list
the incrementing is happening on the received LSA instead of the already
present LSA in scope DB which is already carrying counters.
when this new LSA replaces the old one, the already present counters are
not copied on the new LSA this creates counter mismatch which results in
a crash when lsack is recevied due to counter going to negative.
Fix:
The fix involves following changes.
1. In ospf6_flood_interace when LSA is being added to retrans list
check if there is alreday lsa in the scoped db and increment
the counter on that if present.
2. In ospf6_lsdb_add copy the retrans counter from old to new lsa
when its being replaced.
Donatas Abraitis [Thu, 17 Mar 2022 07:31:05 +0000 (09:31 +0200)]
pimd: Show all groups matched by an arbitrary prefix for `pim rp-info`
```
r1# show ip pim rp-info
RP address group/prefix-list OIF I am RP Source Group-Type
192.168.10.123 225.0.0.0/24 eth2 yes Static ASM
192.168.10.123 239.0.0.0/8 eth2 yes Static ASM
192.168.10.123 239.4.0.0/24 eth2 yes Static SSM
r1# show ip pim rp-info 239.4.0.0/25
RP address group/prefix-list OIF I am RP Source Group-Type
192.168.10.123 239.0.0.0/8 eth2 yes Static ASM
192.168.10.123 239.4.0.0/24 eth2 yes Static SSM
```
tests: Adding database information in ospfv3 testcase
Improving the test case to show database info as well
to help narrow down whether its a LSA origination problem or
route calculation problem in case of failures.
ron [Thu, 10 Feb 2022 11:26:53 +0000 (19:26 +0800)]
lib: wheel's typo fix
The wheel data structure is a array of list pointers
but the alloc for it is using the sizeof (struct listnode *)
as the amount to allocate. Even though the (struct listnode *)
and (struct list *) sizes are the same, let's list the correct
values.