Donald Sharp [Wed, 13 Oct 2021 13:03:27 +0000 (09:03 -0400)]
tests: Fix `Invalid escape sequence` warnings in test runs
Test runs are creating these warnings:
bgp_l3vpn_to_bgp_vrf/test_bgp_l3vpn_to_bgp_vrf.py::test_check_linux_mpls
<string>:7: DeprecationWarning: invalid escape sequence \d
Igor Ryzhov [Fri, 8 Oct 2021 21:22:31 +0000 (00:22 +0300)]
lib: set type for newly created interfaces
Currently, the ll_type is set only in `netlink_interface` which is
executed only during startup. If the interface is created when the FRR
is already running, the type is not stored.
Igor Ryzhov [Thu, 7 Oct 2021 12:53:10 +0000 (15:53 +0300)]
*: don't pass pointers to a local variables to thread_add_*
We should never pass pointers to local variables to thread_add_* family.
When an event is executed, the library writes into this pointer, which
means it writes into some random memory on a stack.
ospfd: ospf nbr in full although mismatch in hello packet contents
Issue:
===================
OSPF neighbors are not going down even after 10 mins when
having a mismatch in hello and dead interval.
First neighbors are formed and then a mismatch in the interval
is created, it is observed that the neighbor is not going down.
Root Cause Analysis:
====================
The event HelloReceived defined in RFC 2328 was named as PacketReceived
and this event was scheduled whenever LS Update, LS Ack, LS Request,
DD description packet or Hello packet is received.
Although there is a mismatch in the Hello packet contents, the
event PacketReceived gets triggered due to LS Update received and the
dead timer gets reset and hence the neighbor was never going Down and
remains FULL.
Fix:
==================
As per RFC 2328, the HelloReceived needs to be triggered only when
valid OSPF Hello packet is received and not when other OSPF packets
are received. Modified the function name as well.
Donald Sharp [Tue, 5 Oct 2021 00:32:25 +0000 (20:32 -0400)]
watchfrr: Allow an integrated config to work within a namespace
Since watchfrr invokes vtysh to gather the show run output and
write the data, if we are operating inside of a namespace FRR
must also pass this in.
Yes. This seems hacky. I don't fully understand why vtysh
is invoked this way.
New output:
sharpd@eva:~/frr3$ sudo vtysh -N one
Hello, this is FRRouting (version 8.1-dev).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
eva# wr mem
Note: this version of vtysh never writes vtysh.conf
% Can't open configuration file /etc/frr/one/vtysh.conf due to 'No such file or directory'.
Building Configuration...
Integrated configuration saved to /etc/frr/one/frr.conf
[OK]
eva#
Igor Ryzhov [Wed, 6 Oct 2021 14:35:07 +0000 (17:35 +0300)]
lib: fix incorrect thread management
The current code passes an address of a local variable to `thread_add_read`
which stores it into `thread->ref` by the lib. The next time the thread
callback is executed, the lib stores NULL into the `thread->ref` which
means it writes into some random memory on the stack.
To fix this, we should pass a pointer to the vector entry to the lib.
Donald Sharp [Wed, 6 Oct 2021 11:58:35 +0000 (07:58 -0400)]
bgpd: Check return from generic_set_add
Coverity found a couple of spots where FRR was
ignoring the return code of generic_set_add.
Just follow the code pattern for the rest of
the usage in the code.
Rafael Zalamena [Tue, 5 Oct 2021 15:37:34 +0000 (12:37 -0300)]
topotests: justify code sleep
Document the `sleep` statement so people know that we are sleeping
because we are waiting for the BFD down notification. If we don't
sleep here it is possible that we get outdated `show` command results.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Igor Ryzhov [Tue, 5 Oct 2021 14:38:21 +0000 (17:38 +0300)]
isisd: fix redistribute CLI
Currently, it is possible to configure IPv6 protocols for IPv4
redistribution and vice versa in CLI. The YANG model doesn't allow this
so the user receives the following error:
```
nfware(config-router)# redistribute ipv4 ospf6 level-1
% Failed to edit configuration.
YANG error(s):
Invalid enumeration value "ospf6".
Invalid enumeration value "ospf6".
Invalid enumeration value "ospf6".
YANG path: Schema location /frr-isisd:isis/instance/redistribute/ipv4/protocol.
```
Let's make CLI more user-friendly and allow only supported protocols in
redistribution commands.
Rafael Zalamena [Mon, 4 Oct 2021 21:10:58 +0000 (18:10 -0300)]
lib: prevent gRPC assert on missing YANG node
`yang_dnode_get` will `assert` if no YANG node/model exist, so lets test for
its existence first before trying to access it.
This `assert` is only acceptable for internal FRR usage otherwise we
might miss typos or unmatching YANG models nodes/leaves. For gRPC usage
we should let users attempt to use non existing models without
`assert`ing.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
rgirada [Tue, 5 Oct 2021 07:52:36 +0000 (00:52 -0700)]
ospf6d: ospf6d is crashing upon receiving duplicated Grace LSA.
Description:
When grace lsa received, DUT is adding
the copy of the lsas to all nbrs retransmission list as part of
flooding procedure and subsequently incrementing the rmt counter in
the original the LSA. This counter is supposed to be decremented
when ack is received by nbr and the lsa will be removed from retransmission list.
But in our current scenario,
Step-1:
When GR helper is disabled, if DUT receives the grace lsa
it adds the lsa copy to nbrs retransmission list but original
LSA will be discarded since GR helper disabled.
Step-2:
GR helper enabled and DUT receives the grace lsa, as part
of flooding process all nbrs have same copy of lsa in their
corresponding rmt list which was added in step -1 due to this
the corresponding rmt counter in the original lsa is not getting
incremented.
Step-3:
If the same copy of the grace lsa received by DUT, It considers
as implicit ack from nbr if the same copy of the lsa exits in its
rmt list and subsequently decrement the rmt counter.
Since counter is zero (because of step-1 and 2) , it is asserting while decrement.
Donald Sharp [Mon, 4 Oct 2021 12:37:16 +0000 (08:37 -0400)]
ospf6d: Ensure expire thread is properly stopped
The lsa->expire thread is for keeping track of when we
are expecting to expire(remove/delete) a lsa. There
are situations where we just decide to straight up
delete the lsa, but we are not ensuring that the
lsa is not already setup for expiration.
In that case just stop the expiry thread and
do the deletion.
Additionally there was a case where ospf6d was
just dropping the fact that a thread was already
scheduled for expiration. In that case we
should just setup the timer again and it will
reset it appropriately.
Fixes: #9721 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Donald Sharp [Mon, 4 Oct 2021 13:47:29 +0000 (09:47 -0400)]
eigrpd: Ensure better `struct thread *` semantics
1) Do not explicitly set the thread pointer to NULL.
FRR should only ever use the appropriate THREAD_ON/THREAD_OFF
semantics. This is espacially true for the functions we
end up calling the thread for.
2) Fix mixup of `struct eigrp_interface` and `struct eigrp`
usage of the same thread pointer.
Donald Sharp [Mon, 4 Oct 2021 13:36:27 +0000 (09:36 -0400)]
ripd: Ensure better `struct thread *` semantics
Do not explicitly set the thread pointer to NULL.
FRR should only ever use the appropriate THREAD_ON/THREAD_OFF
semantics. This is espacially true for the functions we
end up calling the thread for.
Donald Sharp [Mon, 4 Oct 2021 13:28:36 +0000 (09:28 -0400)]
ripngd: Ensure better `struct thread *` semantics
1) Remove `struct thread *` pointers that are never used
2) Do not explicitly set the thread pointer to NULL.
FRR should only ever use the appropriate THREAD_ON/THREAD_OFF
semantics. This is espacially true for the functions we
end up calling the thread for.
Donald Sharp [Wed, 29 Sep 2021 15:42:19 +0000 (11:42 -0400)]
bgpd: When removing v6 address being used as a nexthop ensure peer is reset
With v6 interface based peering, we send the global as well as the LL address
as nexthops to the peer. When either of these were removed on the interface
we were not necessarily resetting the connection. Leaving bgp in a state
where the peer had reachability for addresses that are no longer in use.
Modify the code that when we receive an interface address deletion
event. Check to see that we are using the v6 address as nexthops
for that peer and if so, tell it to reset.
I initially struggled with a hard reset of the peer or a clear but
choose to follow other places in the code that we noticed address
changes that resulted in hard resets.
Ticket: #2799568 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
rgirada [Fri, 1 Oct 2021 18:59:11 +0000 (11:59 -0700)]
ospfd: GR helper functionality change in helper exit
Description:
As per the RFC 3623 section 3.2,
OSPF nbr shouldn't be deleted even in unsuccessful helper exit.
1. Made the changes to keep neighbour even after exit.
2. Restart the dead timer after expiry in helper. Otherwise, Restarter
will be in FULL state in helper forever until it receives the 'hello'.
Low overhead bgp-evpn TPs have been added which push data out in a binary
format -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@switch:~# lttng list --userspace |grep "frr_bgp:evpn"
frr_bgp:evpn_mh_nh_rmac_zsend (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
frr_bgp:evpn_mh_nh_zsend (loglevel: TRACE_INFO (6)) (type: tracepoint)
frr_bgp:evpn_mh_nhg_zsend (loglevel: TRACE_INFO (6)) (type: tracepoint)
frr_bgp:evpn_mh_vtep_zsend (loglevel: TRACE_INFO (6)) (type: tracepoint)
frr_bgp:evpn_bum_vtep_zsend (loglevel: TRACE_INFO (6)) (type: tracepoint)
frr_bgp:evpn_mac_ip_zsend (loglevel: TRACE_INFO (6)) (type: tracepoint)
root@switch:~#
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
In addition to the tracepoints a babeltrace python plugin for pretty
printing (binary data is converted into grepable strings). Sample usage -
frr_babeltrace.py trace_path