Christian Hopps [Fri, 9 Jul 2021 07:58:02 +0000 (03:58 -0400)]
ospf6d: fix backlink check
This code has been wrong ~ever (according to git history). There are 3
conditional blocks with the added assertion that both the LSA and the
vertex being checked can't both be network LSAs.
The third block is clearly assuming both LSA and vertex are router
LSAs b/c it is accessing the backlink and lsdesc as router lsdesc's also
making sure both are p2p links (which they would have to be to point at
each other).
The programming error here is that (A && B) == False does NOT imply !A,
but the code is written that way.
So we end up in the third block one of LSA or vertex being network LSAs
rather easily (whenever that is the case and the desc isn't the backlink
being sought).
This was caught by ASAN b/c the lsdesc and backlinks are being accessed
(> 4 byte field offsets) as if they were router lsdesc's in the third
block, when in fact one of them is a network lsdesc which is only 4
bytes long -- so ASAN flags the access beyond bounds.
Yash Ranjan [Wed, 2 Jun 2021 12:45:18 +0000 (05:45 -0700)]
ospf6d: Set metric from default values when route-map metric not present
For redistribute commands if route map is set, then route takes it's metric
values from the route map. But in the case that routemap is set but the
metric for the route map is not, then the route metric is set to type-1.
Check if the metric is set for the routemap. If not then set it as per
redistributed default values (type 2)
Yash Ranjan [Mon, 24 May 2021 10:44:12 +0000 (03:44 -0700)]
ospf6d: Change the default metric parameter for redistribute
The default values will be set at the time of ospf6_redist_add
similar to ospfv2. Hence the default values will change as well.
Default metric type = E2
Default metic value = 20
lynne [Fri, 11 Jun 2021 14:38:42 +0000 (10:38 -0400)]
ospf6: Drop LSA with bad seqnumber
In RFC 2328 seqnumber 0x80000000 is reserved and not used. If an LSA
is received with seqnumber the LSA is now dropped and a debug error log
is printed.
Igor Ryzhov [Mon, 21 Jun 2021 15:04:46 +0000 (18:04 +0300)]
lib: remove vrf-interface config when removing the VRF
If we have the following configuration:
```
vrf red
smth
exit-vrf
!
interface red vrf red
smth
```
And we delete the VRF using "no vrf red" command, we end up with:
```
interface red
smth
```
Interface config is preserved but moved to the default VRF.
This is not an expected behavior. We should remove the interface config
when the VRF is deleted.
ospf6d: route-map config changed, not getting applied on all types of routes
Problem Statement:
==================
when route-map config is changed from permit to deny, it is not getting
applied to both connected and static and vice versa
RCA:
==================
When route-map changes from permit to deny or vice versa, a notification is
sent to ospf6 daemon via ospf6_asbr_routemap_update. In this function, a thread
is scheduled after 5 seconds to apply the route-map changes. In this thread
(ospf6_asbr_routemap_update_timer), only the first type is passed as argument
and only the first type i.e "connected" is passed and hence in callback only
on this type of route route-map gets applied.
Fix:
====
Need to loop through all the route-types in the call back and process
the route-map changes. Added a flag to mark which all route-types needs
to be processed.
Test Executed:
===============
1. Change route-map from permit to deny.
2. Change route-map from deny to permit.
3. Add new route and checked.
4. Verified summarised routes.
ospf6d: Max aged LSAs are not getting deleted from DB
Problem Statement:
==================
Max aged LSAs are not getting deleted from DB when there are multiple
neighbors in a LAN.
Root Cause Analysis:
====================
When the LSA is added to the neighbor's retransmit list, the LSA retrans count
is incremented but it is not checked if the LSA is already present in the
retransmit list leading to the count being incremented multiple times
untill the ack is not received and when the ack is received the count is
decremented once and hence the count never becomes 0 and
it remains in the DB forever.
Fix:
====================
Do not increment the retrans count multiple times if the LSA is already
present in the retransmit list of the neighbor. Also do not add the LSA
in the retransmit List if already present.
Martin Winter [Wed, 21 Jul 2021 13:06:19 +0000 (15:06 +0200)]
FRR Release 8.0
Major changes:
* A new daemon, `pathd`, has been added. This daemon implements support
for segment routing.
* EVPN Multihoming is now fully supported
* OSPFv3 now supports VRFs
* TI-LFA has been implemented in IS-IS and OSPF
* Zebra now has the ability to dump netlink messages in a human-friendly format
* LDP gained SNMP support
* Minimum libyang version is now 2.0
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
Problem: Sometimes the configured Local GR state is not reflected in
show command and peer node. This is causing failures in few of the
BGP-GR topotests.
RCA: This problem is seen when the configuration of local GR state
happens when the BGP session is in OpenSent state and moves to
Established after the configuration is complete.
When the session gets established, we move the GR state value from stub peer
to the config peer. This will result in overriding the GR state to
previous value.
Fix: The local GR state is modified only through CLI configuration and
does not change during BGP FSM transition. In this case it is not necessary
to transfer the GR state value from stub peer to config peer. This way we
can ensure that always the most recent config value is present in peer
datastructure.
Igor Ryzhov [Mon, 12 Jul 2021 19:51:49 +0000 (22:51 +0300)]
ospf6d: fix freebsd mcast group issues
There's a delay in FreeBSD between issuing a command to leave a
multicast group and an actual leave. If we execute "no router ospf6" and
"router ospf6" fast enough, we can end up in a situation when OS
performs the leave later than it performs the join and the interface
remains without a multicast group.
Instead of counting on a one second delay, we must wait until the
interface actually leaves the group.
Donald Sharp [Wed, 7 Jul 2021 20:00:12 +0000 (16:00 -0400)]
lib: Allow ZAPI_MESSAGE_OPAQUE_LENGTH length of data
We are sending up to ZAPI_MESSAGE_OPAQUE_LENGTH but checking
for one less. We know the data will fit in it to that size.
Also we have asserts on the write to ensure we don't go over
it
Ondřej Surý [Mon, 15 Feb 2021 07:43:26 +0000 (08:43 +0100)]
doc: Use dpkg-buildpackage to build packages (add note about debuild)
The debuild command fails when we are doing source package only build
because it expects the arch-dependent .changes file to be present. Thus
in the instructions we switch to using dpkg-buildpackage directly and
add a note about using debuild in more complicated scenarios.
Ondřej Surý [Mon, 15 Feb 2021 07:40:08 +0000 (08:40 +0100)]
doc: Add instructions on how to build only source packages
In the CI, it's better to build the source package only once and then
instead of checking out the whole repository, only distribute the source
packages to the individual jobs.
Igor Ryzhov [Fri, 25 Jun 2021 11:59:28 +0000 (14:59 +0300)]
ospf6d: fix duplicated packet read
When OSPFv3 router is configured in both default and non-default VRFs,
every packet destined to a non-default VRF is read twice. This makes it
impossible to establish neighborship because every DbDesc packet is
treated as duplicated and we end up infinitely exchanging DbDescs.
We should drop packets received in the default VRF if an interface we
received it on is bound to another VRF.
Igor Ryzhov [Tue, 22 Jun 2021 21:27:55 +0000 (00:27 +0300)]
isisd: fix interface ldp-sync configuration
There are two checks done when configuring ldp-sync on an interface:
- interface is not a loopback
- interface is in the default VRF
Both checks are incorrectly done using the operational data.
The second check can be done using only config data - do that.
The first check can't be done using only configurational data, but it's
not necessary. LDP sync code doesn't operate on loopback interfaces
already. There's no harm in allowing this to be configured.
Igor Ryzhov [Fri, 18 Jun 2021 10:06:13 +0000 (13:06 +0300)]
lib: remove pure attribute from functions that modify memory
Almost all functions currently marked with pure attribute acquire a
route_node lock. By marking them pure we allow compiler to optimize the
code and not call them when it already knows the return value. This is
completely incorrect.
Only two of eleven functions can be marked as pure. And they still won't
be optimized because they are never called from the same function twice.
Let's remove the ext_pure macro completely to reduce the chance of
repeating this mistake in the future.
Wesley Coakley [Wed, 9 Jun 2021 03:50:43 +0000 (23:50 -0400)]
docker: Use tini unilaterally and stop tailing /dev/null
tini is a hyper-minimal PID 0 which spawns a child process (watchfrr.sh
in our case), reaps zombies and forwards signals to the script. Starting
watchfrr.sh directly instead of through the old `tail /dev/null` or
`sleep 365d` helps keep things clean too :)
While tini was previously only used in the Alpine container it is useful
to apply this PID 0 to all containers except the special CI ones.
Igor Ryzhov [Thu, 10 Jun 2021 17:21:51 +0000 (20:21 +0300)]
bgpd: fix routemap update with disabled delay timer
- vnc_routemap_update is called only for the last bgp instance
- vpn_policy_routemap_event is not called at all
- unguarded debug (there's already a debug inside the called function)
Make the code consistent with the callback code that is used when the
delay timer is enabled.
Igor Ryzhov [Fri, 11 Jun 2021 15:27:46 +0000 (18:27 +0300)]
isisd: per-instance dynamic hostname cache
Currently, the dynamic hostname cache is global. It is incorrect because
neighbors in different VRFs may have the same system ID and different
hostnames.
This also fixes a memory leak - when the instance is deleted, the cache
must be cleaned up and the cleanup thread must be cancelled.
Igor Ryzhov [Thu, 17 Jun 2021 16:31:03 +0000 (19:31 +0300)]
ospfd: fix routemap update
Currently, if the routemap already exists, we delete the pointer to it
when it is updated. We should delete the pointer only if the route-map
is actually deleted.
Igor Ryzhov [Tue, 25 May 2021 22:49:30 +0000 (01:49 +0300)]
ospf6d: fix interface area configuration
Currently the interface area is configured from the router node using
"interface IFNAME area ID" command. There are multiple problems with
this command:
- it is not in line with all other interface-related commands - other
parameters are configured from the interface node using "ipv6 ospf6"
prefix
- it is not in line with OSPFv2 - area is configured from the interface
node using "ip ospf area" command
- most importantly, it doesn't work correctly when the interface is in
a different VRF - instead of configuring the interface, it creates a
new fake interface and configuring it instead
To fix all the problems, this commit adds a new command to the interface
configuration node - "ipv6 ospf6 area ID". The purpose of the command is
completely the same, but it works correctly in a multi-VRF environment.
The old command is preserved for the backward compatibility, but the
warning is added that it is deprecated because it doesn't work correctly
with VRFs.
Igor Ryzhov [Mon, 7 Jun 2021 18:58:26 +0000 (21:58 +0300)]
tests: fix ospf6_topo1_vrf
ospf6d (and all other daemons except zebra) doesn't correctly process
`interface X vrf Y`, because it doesn't know existing VRFs at the time
of configuration file reading. Therefore it doesn't apply configuration
provided in the interface node.
Fix the problem by removing `vrf Y` part, having just an interface name
is enough.
Igor Ryzhov [Tue, 8 Jun 2021 14:01:56 +0000 (17:01 +0300)]
ospfd: fix memory leaks in summarization
To reproduce the issue:
1. Create summary-address: `summary-address 1.1.1.0/24`.
2. Try to delete it with the wrong tag: `no summary-address 1.1.1.0/24 tag 1`.
Each time this command is executed, route_node_lookup is called which
locks route node one more time. As the tag is wrong, the function
return immediately without unlock.
3. Finally delete the summary-address: `no summary-address 1.1.1.0/24`.
Donald Sharp [Tue, 8 Jun 2021 19:38:11 +0000 (15:38 -0400)]
zebra: Give extra space and stop processing if we run out of space
When processing bulk messages we need more space to handle more
mroutes. In this case we are doubling the stream size from
16k -> 32k, which should roughly double the number of mroutes
we can handle in one go.
Additionally. If we cannot parse the passed message into
the stream to pass up to pimd then gracefully stop processing
Renato Westphal [Mon, 31 May 2021 13:27:51 +0000 (10:27 -0300)]
ospfd: fix crash when displaying neighbor data in JSON
Add a null check to protect against the case where the neighbor
inactive timer is disabled. That can happen when the router is
acting as a helper for another router that is attempting to restart
gracefully.
Renato Westphal [Mon, 31 May 2021 13:27:51 +0000 (10:27 -0300)]
lib, ospfd, ospf6d: fix logging of pointer addresses
The %p printf format specifier does already print the pointer address
with a leading "0x" prefix (indicating a hexadecimal number). There's
no need to add that prefix manually.
While here, replace explicit function names in log messages by
__func__.
Renato Westphal [Mon, 31 May 2021 13:27:51 +0000 (10:27 -0300)]
ospfd: fix cleanup of MaxAge LSAs on exit
During shutdown, the ospf->maxage_lsa table is iterated over to
clean up all existing entries. While doing that, route_unlock_node()
should be called only for the nodes that have an associated entry,
otherwise the table will get corrupted and ospfd will crash.
As a side note, using a routing table to store MaxAge LSAs was a
very poor choice of a data structure, considering that a simple
rb-tree or hash table would get the job done with a much simpler
(and less error-prone) API. Something to cleanup in the future...
Renato Westphal [Mon, 31 May 2021 13:27:51 +0000 (10:27 -0300)]
ospfd: fix dangling pointer when exiting from the helper mode
When exiting from the helper mode for a given router after an
unsuccessful graceful restart, removing the neighborship to that
router straight away leads to a dangling pointer in the associated
interface, which inevitably leads to a crash. To solve this
problem, schedule the removal of the neighbor instead of removing
it immediately.
Renato Westphal [Mon, 31 May 2021 13:27:51 +0000 (10:27 -0300)]
ospfd: fix small issue when exiting from the GR helper mode
When exiting from the GR helper mode, recalculate the DR only for
interfaces of the appropriate types (broadcast and NMBA).
This fixes a problem where the state of a neighbor reachable over a
p2p interface was changing from Full/DROther to Full/Backup across
a graceful restart.
Renato Westphal [Mon, 31 May 2021 13:27:51 +0000 (10:27 -0300)]
ospfd: fix GR helper initialization and termination
Since a single ospfd process can have multiple OSPF interfaces
configured, we need to separate the global GR initialization and
termination from per-instance initialization and termination.
Igor Ryzhov [Fri, 4 Jun 2021 14:47:32 +0000 (17:47 +0300)]
ospfd: fix passive interface configuration
Currently, passive interface flag is configured from the router node
using "passive-interface IFNAME". There are multiple problems with this
command:
- it is not in line with all other interface-related commands - other
parameters are configured from the interface node using "ip ospf"
prefix
- it is not in line with OSPFv3 - passive flag is configured from the
interface node using "ipv6 ospf6 passive" command
- most importantly, it doesn't work correctly when the interface is in
a different VRF - when using VRF-lite, it incorrectly changes the
vrf_id of the interface and it becomes desynced with the actual state;
when using netns, it creates a new fake interface and configures it
instead of configuring the necessary interface
To fix all the problems, this commit adds a new command to the interface
configuration node - "ip ospf passive". The purpose of the command is
completely the same, but it works correctly in a multi-VRF environment.
The old command is preserved for the backward compatibility, but the
warning is added that it is deprecated because it doesn't work correctly
with VRFs.
Rafael Zalamena [Mon, 7 Jun 2021 14:02:16 +0000 (11:02 -0300)]
lib: fix address sanitizer crash on `find`
Fix the following address sanitizer crash when running the command `find`:
ERROR: AddressSanitizer: dynamic-stack-buffer-overflow
WRITE of size 1 at 0x7fff4840fc1d thread T0
0 in print_cmd ../lib/command.c:1541
1 in cmd_find_cmds ../lib/command.c:2364
2 in find ../vtysh/vtysh.c:3732
3 in cmd_execute_command_real ../lib/command.c:995
4 in cmd_execute_command ../lib/command.c:1055
5 in cmd_execute ../lib/command.c:1219
6 in vtysh_execute_func ../vtysh/vtysh.c:486
7 in vtysh_execute ../vtysh/vtysh.c:671
8 in main ../vtysh/vtysh_main.c:721
9 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
10 in _start (/usr/bin/vtysh+0x21f64d)
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Igor Ryzhov [Wed, 2 Jun 2021 14:27:02 +0000 (17:27 +0300)]
zebra: fix config after exit from vrf
When the VRF node is exited using "exit" or "quit", there's still a VRF
pointer stored in the vty context. If you try to configure some router
related command, it will be applied to the previous VRF instead of the
default VRF. For example:
```
(config)# vrf test
(config-vrf)# ip router-id 1.1.1.1
(config-vrf)# do show run
...
!
vrf test
ip router-id 1.1.1.1
exit-vrf
!
...
(config-vrf)# exit
(config)# ip router-id 2.2.2.2
(config)# do show run
...
!
vrf test
ip router-id 2.2.2.2
exit-vrf
!
...
```
`vrf-exit` works correctly, because it stores a pointer to the default
VRF into the vty context (but weirdly keeping the VRF_NODE instead of
changing it to CONFIG_NODE).
Instead of relying on the behavior of exit function, always use the
default VRF when in CONFIG_NODE.
Another problem is missing `VTY_CHECK_CONTEXT`. If someone deletes the
VRF in which node the user enters the command, then zebra applies the
command to the default VRF instead of throwing an error.