Steps to reproduce:
1. R1(LHR) sends an IGMP join and R4(FHR) sends multicast traffic.
Verify that traffic is flowing from FHR to LHR.
2. Restart FRR on R1(LHR).
3. The following sequence of events happens after the FRR restart on R1(LHR).
4. R1(LHR) registers the RP address with Zebra.
5. R1(LHR) receives an update from Zebra that R2(RP) is reachable via R3.
6. R1(LHR) receives an IGMP join for group 225.1.1.1 and creates the PIM
upstream and the (*,G) mroute with IIF towards R3.
7. R1(LHR) receives an update from Zebra that the RP is reachable via R2(RP).
8. R1(LHR) updates the PIM upstream IIF but does not update the (*,G) mroute
IIF, even though there is an RPF change.
9. R1(LHR) receives an IGMP join for group 225.1.1.2; both the upstream and
the (*,G) mroute are created with IIF towards R2(RP).
Root Cause:
The mroute IIF is not updated when a better route update is
received. It still points to the older nexthop.
Fix:
Update the mroute IIF when the nexthop changes.
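The root cause and fix above can be modelled with a small sketch. This is not pimd code: the Upstream class, the on_nexthop_update function, and the interface names r1-r3 and r1-r2 are hypothetical and only illustrate how the upstream IIF and the mroute IIF can drift apart when a nexthop update touches only one of them.

```python
from dataclasses import dataclass


@dataclass
class Upstream:
    """Hypothetical stand-in for a PIM (*,G) upstream entry."""
    group: str
    rpf_iif: str     # interface chosen by the RPF lookup towards the RP
    mroute_iif: str  # interface programmed into the (*,G) mroute


def on_nexthop_update(upstreams, new_iif, sync_mroute=True):
    """Apply a new RPF interface to every upstream entry.

    With sync_mroute=False this models the bug: only the upstream
    changes and the mroute keeps the stale interface.
    Returns the groups whose RPF interface changed.
    """
    changed = []
    for up in upstreams:
        if up.rpf_iif == new_iif:
            continue
        up.rpf_iif = new_iif
        if sync_mroute:
            up.mroute_iif = new_iif
        changed.append(up.group)
    return changed


# Interface names are made up for illustration.
buggy = [Upstream("225.1.1.1", "r1-r3", "r1-r3")]
on_nexthop_update(buggy, "r1-r2", sync_mroute=False)

fixed = [Upstream("225.1.1.1", "r1-r3", "r1-r3")]
on_nexthop_update(fixed, "r1-r2")

print(buggy[0].mroute_iif, fixed[0].mroute_iif)
```

In the buggy run the mroute still points at the interface towards R3, which matches step 8 of the reproduction; in the fixed run both fields follow the new nexthop.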
Donald Sharp [Wed, 27 Jul 2022 16:17:50 +0000 (12:17 -0400)]
ospfd: Coverity warns that we could possibly use uninitialized data
In ospf_handle_exnl_lsa_lsId_chg there is a code path
where we may be using uninitialized data for decisions.
Doubtful that this happens, but let's make it even less
likely.
Donald Sharp [Wed, 27 Jul 2022 13:36:17 +0000 (09:36 -0400)]
bgpd: Ensure we are not using AFI_MAX
bgp_vty_afi_from_str can return AFI_MAX (although in
practice it never will with our CLI). In
bgp_default_afi_safi_cmd the code directly references:
bgp->default_afi[afi][safi] = TRUE;
and if afi is AFI_MAX, FRR would be accessing
memory where it should not be.
Let's just provide some assurances for Coverity
that this never happens.
The ZAPI ZEBRA_RULE_ADD message was modified, but
the bgp version was not updated appropriately, and
when zebra received the message it did not read it
properly.
Kuldeep Kashyap [Sun, 8 May 2022 09:31:01 +0000 (02:31 -0700)]
tests: [PIMv6] APIs for multicast PIMv6 config
Enhanced a few existing PIM APIs to support both
IPv4 and IPv6 configuration. Added a few new APIs
for PIMv6. Tested all existing tests with the new
API changes.
Donald Sharp [Wed, 20 Jul 2022 20:43:17 +0000 (16:43 -0400)]
ospfclient: Ensure ospf_apiclient_lsa_originate cannot accidentally write into stack
Even though OSPF_MAX_LSA_SIZE is quite large and holds the upper bound
on what can be written into an LSA, let's add a small check to ensure
it is not possible to do a bad thing.
This wins one of the long standing bug awards. 2003!
Fixes: #11602
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Donald Sharp [Thu, 21 Jul 2022 19:42:51 +0000 (15:42 -0400)]
bgpd: LL peers need bnc's per peer
FRR should create a bnc per peer, not have
ones that write over others. Currently, when
FRR has multiple interface-based peerings, BGP was
creating a single BNC. This is insufficient in that
we were accidentally overwriting the one LL with other
data. This causes issues when there are multiple such
peers and there are weird startup issues with the interfaces
that you are peering over.
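The overwrite described above can be illustrated with a toy cache. This is a sketch only: NexthopCache, its methods, and keying by (nexthop, ifindex) are assumptions made for illustration and are not bgpd's actual BNC structures or lookup keys.

```python
class NexthopCache:
    """Toy nexthop cache; not bgpd's real bnc table."""

    def __init__(self, per_peer):
        # per_peer=False models a single shared entry per nexthop address.
        self.per_peer = per_peer
        self.entries = {}

    def _key(self, nexthop, ifindex):
        # Assumption: the interface index is what distinguishes peers
        # that share the same link-local address.
        return (nexthop, ifindex) if self.per_peer else nexthop

    def track(self, nexthop, ifindex, peer):
        self.entries[self._key(nexthop, ifindex)] = {
            "peer": peer,
            "ifindex": ifindex,
        }

    def lookup(self, nexthop, ifindex):
        return self.entries.get(self._key(nexthop, ifindex))


# Two interface peers whose neighbors use the same link-local address.
shared = NexthopCache(per_peer=False)
shared.track("fe80::1", 3, "swp1")
shared.track("fe80::1", 4, "swp2")  # overwrites the swp1 entry

separate = NexthopCache(per_peer=True)
separate.track("fe80::1", 3, "swp1")
separate.track("fe80::1", 4, "swp2")

print(shared.lookup("fe80::1", 3)["peer"], separate.lookup("fe80::1", 3)["peer"])
```

With the shared cache the lookup for the first peer returns the second peer's data, which is the kind of overwrite the commit describes; with one entry per peer each lookup returns its own data.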
Philippe Guibert [Tue, 12 Jul 2022 10:12:01 +0000 (12:12 +0200)]
topotests: add bfd_vrflite_topo1 test
This test checks that there are no errors when receiving BFD
packets over the various Linux VRF interfaces. For example, if
an incoming packet is received by the wrong socket, a VRF
mismatch error would occur, and BFD flapping would be observed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Donald Sharp [Fri, 3 Jun 2022 14:59:31 +0000 (10:59 -0400)]
bgpd: Convert thread_cancel to THREAD_OFF and use THREAD_ARG
Just convert all uses of thread_cancel to THREAD_OFF. Additionally
use THREAD_ARG instead of t->arg to get the argument. Individual
files should never be accessing thread private data like this.
Donald Sharp [Fri, 3 Jun 2022 14:43:45 +0000 (10:43 -0400)]
bgpd: Remove various macros that overlap THREAD_OFF
Let's just use THREAD_OFF consistently in the code base
instead of each daemon having a special macro that needs to
be looked at and remembered what it does.
Donald Sharp [Fri, 3 Jun 2022 14:37:34 +0000 (10:37 -0400)]
ripngd: Remove various macros that overlap THREAD_OFF
Let's just use THREAD_OFF consistently in the code base
instead of each daemon having a special macro that needs to
be looked at and remembered what it does.
Donald Sharp [Fri, 3 Jun 2022 14:33:12 +0000 (10:33 -0400)]
ripd: Remove various macros that overlap THREAD_OFF
Let's just use THREAD_OFF consistently in the code base
instead of each daemon having a special macro that needs to
be looked at and remembered what it does.
Donald Sharp [Fri, 3 Jun 2022 14:28:11 +0000 (10:28 -0400)]
ospfd: Remove various macros that overlap THREAD_OFF
Let's just use THREAD_OFF consistently in the code base
instead of each daemon having a special macro that needs to
be looked at and remembered what it does.
bfdd: allow l3vrf bfd sessions without udp leaking
Until now, when in vrf-lite mode, the BFD implementation
creates a single UDP socket and relies on the following
sysctl value being set to 1:
echo 1 > /proc/sys/net/ipv4/udp_l3mdev_accept
With this setting, the incoming BFD packets from a given
vrf would leak to the default vrf and would match the
UDP socket.
The drawback of this solution is that UDP packets received
on a given vrf may leak to another vrf. This may be a
security concern.
This commit addresses the issue by avoiding this leak
mechanism. A UDP socket is created for each vrf, and each
socket sets the socket options SO_REUSEADDR and SO_REUSEPORT.
With these options, the incoming UDP packets are distributed across
the available sockets. The impact of those options on l3mdev
devices is unknown. It has been observed that these options are not
needed until the default vrf sockets are created.
To ensure the BFD packets are correctly routed to the appropriate
socket, a BPF filter has been put in place and attached to the
sockets: SO_ATTACH_REUSEPORT_CBPF. This option adds a criterion
to force the packet to choose a given socket. If the initial criteria
from the default distribution algorithm were not good, at least
two sockets would be available, and the CBPF would force the
selection to the same socket. This would lead to the situation
where an incoming packet would be processed on a different vrf.
if (setsockopt(sd, SOL_SOCKET, SO_ATTACH_REUSEPORT_CBPF, &p, sizeof(p))) {
        zlog_warn("unable to set SO_ATTACH_REUSEPORT_CBPF on socket: %s",
                  strerror(errno));
        return -1;
}
Some tests have been done by creating vrf contexts and by using
the below vtysh configuration:
The results showed no issue related to packets received by
the wrong vrf. Even changing the udp_l3mdev_accept flag to
1 did not change the test results.
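The socket options named in this commit can be tried out with a minimal sketch. It is not bfdd code: open_reuseport_socket is a hypothetical helper, the loopback address stands in for a real listener address, and the per-vrf device binding and the SO_ATTACH_REUSEPORT_CBPF filter are deliberately left out because they need privileges and a VRF setup. It assumes a platform where Python exposes socket.SO_REUSEPORT (Linux does).

```python
import socket


def open_reuseport_socket(port, bind_addr="127.0.0.1"):
    """Open a UDP socket that may share its port with other sockets.

    Hypothetical helper: sets only SO_REUSEADDR and SO_REUSEPORT.
    A per-vrf socket would additionally be bound to the vrf device.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind((bind_addr, port))
    return s


# Port 0 lets the kernel pick a free port; the second socket reuses it.
first = open_reuseport_socket(0)
port = first.getsockname()[1]
second = open_reuseport_socket(port)
same_port = second.getsockname()[1] == port
first.close()
second.close()

print(same_port)
```

Without SO_REUSEPORT on both sockets the second bind would normally fail with an "address already in use" error, which is why one socket per vrf on the same port needs these options.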
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Donald Sharp [Tue, 19 Jul 2022 17:57:56 +0000 (13:57 -0400)]
zebra: Add some more data to rtadv socket failures
The creation of the rtadv socket can fail but there
is very very little data associated with this event
to let the operator know something has gone terribly
wrong.
Please note if this socket fails to create or fails
the setsockopt's rtadv is basically just really really
messed up. I am not sure what can be done here.
Donald Sharp [Fri, 8 Jul 2022 20:35:51 +0000 (16:35 -0400)]
tests: Make bgp_snmp_mplsl3vpn be more forgiving
I rarely get this failure:
@classname: bgp_snmp_mplsl3vpn.test_bgp_snmp_mplsvpn
@name: test_pe1_converge_evpn
@time: 44.875
@message: AssertionError: BGP SNMP does not seem to be running
assert False
+ where False = <bound method SnmpTester.test_oid of <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>>('bgpVersion', '10')
+ where <bound method SnmpTester.test_oid of <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>> = <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>.test_oid
"Wait for protocol convergence"
tgen = get_topogen()
assertmsg = "BGP SNMP does not seem to be running"
> assert r1_snmp.test_oid("bgpVersion", "10"), assertmsg
E AssertionError: BGP SNMP does not seem to be running
E assert False
E + where False = <bound method SnmpTester.test_oid of <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>>('bgpVersion', '10')
E + where <bound method SnmpTester.test_oid of <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>> = <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>.test_oid
Under heavy system load a quick test before BGP can fully come up can result in a failed
test. Add some extra time for snmp to come up properly.
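The idea of giving SNMP more time can be sketched as a polling loop. This is an illustration under assumptions, not the change made in the test: wait_until, its count and wait parameters, and the snmp_ready stand-in are hypothetical, and the real test would call r1_snmp.test_oid("bgpVersion", "10") as its check.

```python
import time


def wait_until(check, count=20, wait=0.5, sleep=time.sleep):
    """Call check() up to count times, sleeping wait seconds between tries.

    Returns (True, attempts_used) on the first success, or (False, count)
    if check() never returned a truthy value.
    """
    for attempt in range(1, count + 1):
        if check():
            return True, attempt
        if attempt < count:
            sleep(wait)
    return False, count


# Stand-in for the SNMP query: becomes ready on the third poll.
calls = {"n": 0}


def snmp_ready():
    calls["n"] += 1
    return calls["n"] >= 3


ok, attempts = wait_until(snmp_ready, count=5, wait=0, sleep=lambda s: None)
print(ok, attempts)
```

Polling with a bounded number of attempts keeps the test fast on an idle system while tolerating a slow start under heavy load.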