Tomi Salminen [Wed, 2 Feb 2022 09:19:09 +0000 (11:19 +0200)]
ospfd: Core in ospf_if_down during shutdown.
Skip marking routes as changed in ospf_if_down if there's now
new_table present, which might be the case when the instance is
being finished
The backtrace for the core was:
raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:50
core_handler (signo=11, siginfo=0x7fffffffe170, context=<optimized out>) at lib/sigevent.c:262
<signal handler called>
route_top (table=0x0) at lib/table.c:401
ospf_if_down (oi=oi@entry=0x555555999090) at ospfd/ospf_interface.c:849
ospf_if_free (oi=0x555555999090) at ospfd/ospf_interface.c:339
ospf_finish_final (ospf=0x55555599c830) at ospfd/ospfd.c:749
ospf_deferred_shutdown_finish (ospf=0x55555599c830) at ospfd/ospfd.c:578
ospf_deferred_shutdown_check (ospf=<optimized out>) at ospfd/ospfd.c:627
ospf_finish (ospf=<optimized out>) at ospfd/ospfd.c:683
ospf_terminate () at ospfd/ospfd.c:653
sigint () at ospfd/ospf_main.c:109
quagga_sigevent_process () at lib/sigevent.c:130
thread_fetch (m=m@entry=0x5555556e45e0, fetch=fetch@entry=0x7fffffffe9b0) at lib/thread.c:1709
frr_run (master=0x5555556e45e0) at lib/libfrr.c:1174
main (argc=9, argv=0x7fffffffecb8) at ospfd/ospf_main.c:254
Rafael Zalamena [Mon, 13 Dec 2021 20:21:56 +0000 (17:21 -0300)]
bgpd: fix aggregate route AS Path attribute
Always free the locally allocated attribute not the one we are using for
return. This fixes a memory leak and a crash when AS Path is set with
route-map.
Igor Ryzhov [Tue, 14 Dec 2021 13:28:08 +0000 (16:28 +0300)]
isisd: fix use after free
Pointers to the adjacency must be cleared only when the adjacency is
deleted. Otherwise, when the ISIS router is deleted later, the adjacency
is not deleted and a crash happens because of UAF.
Igor Ryzhov [Wed, 24 Nov 2021 12:01:41 +0000 (15:01 +0300)]
bfdd: fix detection timeout update
Per RFC 5880 section 6.8.12, the use of a Poll Sequence is not necessary
when the Detect Multiplier is changed. Currently, we update the Detection
Timeout only when a Poll Sequence is terminated, therefore we ignore the
Detect Multiplier change if it's not accompanied with RX/TX timer change.
To fix the problem, we should update the Detection Timeout on every
received packet.
Igor Ryzhov [Fri, 12 Nov 2021 16:32:06 +0000 (19:32 +0300)]
bgpd: fix source-address for BFD sessions when using update-source IFNAME
When "update-source IFNAME" is used for the neighbor, p->update_source
is set to NULL, so we can't use it as a source address and should use
the address from p->su_local.
Quentin Young [Fri, 12 Nov 2021 19:45:36 +0000 (14:45 -0500)]
doc: update & clarify language in process arch doc
There was a historical blurb at the top of the process architecture
document that in several instances caused some confusion regarding
whether or not FRR supports multithreading. Remove this paragraph and
replace it with a summary of the page contents.
Donald Sharp [Thu, 11 Nov 2021 18:25:35 +0000 (13:25 -0500)]
ospfd: Prevent use after free on shutdown
Running ospf_topo_vrf1 leads us to this valgrind issue:
==2386518== Invalid read of size 8
==2386518== at 0x4971520: route_top (table.c:401)
==2386518== by 0x181F08: ospf_interface_bfd_apply (ospf_bfd.c:126)
==2386518== by 0x182069: ospf_interface_disable_bfd (ospf_bfd.c:158)
==2386518== by 0x18BF51: ospf_del_if_params (ospf_interface.c:557)
==2386518== by 0x18C584: ospf_if_delete_hook (ospf_interface.c:712)
==2386518== by 0x490CA0B: hook_call_if_del (if.c:61)
==2386518== by 0x490D1F3: if_delete_retain (if.c:286)
==2386518== by 0x490D337: if_delete (if.c:309)
==2386518== by 0x490CDED: if_destroy_via_zapi (if.c:200)
==2386518== by 0x49940A9: zclient_interface_delete (zclient.c:2237)
==2386518== by 0x4998062: zclient_read (zclient.c:3969)
==2386518== by 0x4979529: thread_call (thread.c:1908)
==2386518== by 0x4919918: frr_run (libfrr.c:1164)
==2386518== by 0x181AC7: main (ospf_main.c:235)
==2386518== Address 0x5df39a0 is 0 bytes inside a block of size 56 free'd
==2386518== at 0x48399AB: free (vg_replace_malloc.c:538)
==2386518== by 0x492A03E: qfree (memory.c:141)
==2386518== by 0x4970C6F: route_table_free (table.c:141)
==2386518== by 0x4970A36: route_table_finish (table.c:61)
==2386518== by 0x18C543: ospf_if_delete_hook (ospf_interface.c:708)
==2386518== by 0x490CA0B: hook_call_if_del (if.c:61)
==2386518== by 0x490D1F3: if_delete_retain (if.c:286)
==2386518== by 0x490D337: if_delete (if.c:309)
==2386518== by 0x490CDED: if_destroy_via_zapi (if.c:200)
==2386518== by 0x49940A9: zclient_interface_delete (zclient.c:2237)
==2386518== by 0x4998062: zclient_read (zclient.c:3969)
==2386518== by 0x4979529: thread_call (thread.c:1908)
==2386518== by 0x4919918: frr_run (libfrr.c:1164)
==2386518== by 0x181AC7: main (ospf_main.c:235)
==2386518== Block was alloc'd at
==2386518== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==2386518== by 0x4929EFC: qcalloc (memory.c:116)
==2386518== by 0x49709F8: route_table_init_with_delegate (table.c:53)
==2386518== by 0x49717F4: route_table_init (table.c:528)
==2386518== by 0x18C328: ospf_if_new_hook (ospf_interface.c:659)
==2386518== by 0x490C97D: hook_call_if_add (if.c:60)
==2386518== by 0x490CE85: if_create_name (if.c:223)
==2386518== by 0x490DF32: if_get_by_name (if.c:622)
==2386518== by 0x4993F73: zclient_interface_add (zclient.c:2186)
==2386518== by 0x4998062: zclient_read (zclient.c:3969)
==2386518== by 0x4979529: thread_call (thread.c:1908)
==2386518== by 0x4919918: frr_run (libfrr.c:1164)
==2386518== by 0x181AC7: main (ospf_main.c:235)
==2386518==
Fix the ordering to do the individual node tree cleanup after we delete
the data we care about.
Donald Sharp [Thu, 4 Nov 2021 12:01:14 +0000 (08:01 -0400)]
zebra: Send up ifindex for redistribution when appropriate
Currently the NEXTHOP_TYPE_IPV4 and NEXTHOP_TYPE_IPV6 are
not sending up the resolved ifindex for the route. This
is causing upper level protocols that have something like
this:
route-map FOO permit 10
match interface swp13
!
router ospf
redistribute static
!
ip route 4.5.6.7/32 10.10.10.10
where 10.10.10.10 resolves to interface swp13. The route-map
will never match in this case.
Since FRR has the resolved nexthop interface, FRR might as
well send it up to be selected on by the upper level protocol
as needed.
Rafael Zalamena [Tue, 2 Nov 2021 21:54:23 +0000 (18:54 -0300)]
bgpd: fix BFD configuration update on TTL change
When altering the TTL of a eBGP peer also update the BFD
configuration. This was only working when the configuration happened
after the peer connection had been established.
Abhishek Naik [Tue, 19 Oct 2021 23:45:26 +0000 (23:45 +0000)]
bgpd: Reset dynamic peer counter
Dynamic peer count is inconsistent in
"show bgp summary json" and "show bgp summary failed json" due to
dynamic peer counter 'dn_count' being reused without resetting
Igor Ryzhov [Mon, 18 Oct 2021 14:16:35 +0000 (17:16 +0300)]
ospfd: fix crash when creating vlink in unknown vrf
if_create_name crashes when vrf_id is VRF_UNKNOWN:
```
nfware# conf t
nfware(config)# router ospf vrf doesnt-exist
nfware(config-router)# area 1.1.1.1 virtual-link 2.2.2.2
vtysh: error reading from ospfd: Success (0)Warning: closing connection to ospfd because of an I/O error!
```
In startup, zebra would dump interface information from Kernel in 3
steps w/o lock: step1, get interface information; step2, get interface
ipv4 address; step3, get interface ipv6 address.
If any interface gets added after step1, but before step2/3, zebra
would get extra interface addresses in step2/3 that has not been added
into zebra in step1. Returning error in the referenced interface lookup
would cause the startup interface retrieval to be incomplete.
Igor Ryzhov [Fri, 8 Oct 2021 21:22:31 +0000 (00:22 +0300)]
lib: set type for newly created interfaces
Currently, the ll_type is set only in `netlink_interface` which is
executed only during startup. If the interface is created when the FRR
is already running, the type is not stored.
Igor Ryzhov [Fri, 1 Oct 2021 14:25:57 +0000 (17:25 +0300)]
vtysh: fix node walkup
The current code executes either "exit" or "end" once - so vtysh switches
either to the parent node or straight to the config node. But sometimes,
we need to exit to the grandparent node, which is not the config node.
Another issue is that some nodes are completely missing in this long
checklist, for example, BFD peer and profile nodes.
Instead of doing all this special checking, we should just always exit
the exact number of times we need - it is stored in "tried" variable.
David Lamparter [Mon, 27 Sep 2021 08:33:33 +0000 (10:33 +0200)]
pimd: fix UAF/heap corruption in BSM code
This `XFREE()` call is in plainly in the wrong spot. `rp_all` (the
224.0.0.0/4 entry) isn't supposed to be free'd ever, and the
conditional above makes quite clear that it remains in use.
It may be possible to exploit this as a heap corruption bug, maybe even
as RCE. I haven't tried; I randomly noticed this while working on the
BSM code. Luckily this code is only run by the CLI for the clear
command, so the surface is very small.
Igor Ryzhov [Wed, 15 Sep 2021 19:45:23 +0000 (22:45 +0300)]
bgpd: fix memory leaks when using route-maps
There are places where we use route-maps using duplicated attributes and
neither intern nor flush them after the usage. If a route-map has set
rules for aspath/communities, they will be allocated and never freed.
We should always flush unneeded duplicated attributes.
introduced the idea of v6 LL using interface up/down events
instead of nexthop resolution to know when a peering should
happen or not. This above commit left a hole where if the remote
peer connected to this bgp, the bgp code would still believe
the peering is down. Modify the code to double check and
ensure that we have proper v6 LL resolution flags set.
Igor Ryzhov [Wed, 8 Sep 2021 18:06:44 +0000 (21:06 +0300)]
bgpd: fix default-originate route-map processing
When processing a route-map for default-originate, we actually want to
match by attributes in routes from the RIB, but set attributes in the
newly originated route. Currently, it's not the case. Instead, we
construct a dummy path combining attributes from both routes, and we end
up with multiple problems:
- match by as-path doesn't work
- communities from the matched RIB route are copied to the newly
originated route
- we corrupt the RIB routes
To fix the issue, we should use the new route-map API that allows using
separate match/set objects.
David Lamparter [Mon, 28 Jun 2021 14:29:56 +0000 (16:29 +0200)]
ospf6d: don't create Adv-ID:0.0.0.0 LSAs at start
When ospf6d comes up, it gets interface and address state before it
decides on its router ID. This results in a bunch of LSAs with
advertising router ID 0.0.0.0 in the LSDB. Not quite right.
There's a whole bunch of paths leading to this, so just drop the LSA in
ospf6_lsa_originate. The router-ID change causes everything to be
readvertised anyway (... but the delete doesn't catch the 0.0.0.0 stuff
because the router-ID is now different.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
ospfd: Summary LSA is not originated when process is reset
Problem Statement:
==================
Summary LSA is not originated when router-id is modified or process is reset
Root Cause Analysis:
====================
When router-id is modified or process is cleared, all the external LSAs are
flushed then LSA is re-originated using ospf_external_lsa_rid_change
When the LSAs are flushed, the aggregate flags are not reset.
Fix:
===============
Reset the aggregation flag when the LSAs
are flushed.
ospfd: Memory Leak seen at show_ip_ospf_neighbor_all_common.
Problem Statement:
==================
Memory Leak seen at show_ip_ospf_neighbor_all_common (ospf_vty.c:4635)
RCA:
=================
In function show_ip_ospf_neighbor_all_common, one child json object is not
added to the parent child object when there is no nbma neighbor. Hence
the memory leak.
Fix:
=================
Add the child object to the parent json object.
Igor Ryzhov [Thu, 2 Sep 2021 12:29:18 +0000 (15:29 +0300)]
bgpd: fix bgp_get_bound_name to handle views better
The vrf socket code needs a interface/vrf name to be passed
in, in order for it to properly bind to the correct vrf.
In the case where bgp is using a view based instance
the bgp_get_bound_name should handle views better and
not return anything to be bound to.
Fixes #9519. Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Donald Sharp [Thu, 2 Sep 2021 00:50:31 +0000 (20:50 -0400)]
bgpd: Do not randomly generate a vrf id for -Z
When FRR added the -Z parameter the bgp daemon was setting
a vrf identifier based upon a number starting at 1. This
caused issues when we upgraded the code to the outgoing
sockets to use vrf_bind always.
FRR should never just randomly select a vrf identifier.
Let's just use VRF_DEFAULT when we are in a -Z environment.
It's a safe bet.
Fixes: #9519 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Donald Sharp [Wed, 1 Sep 2021 10:30:33 +0000 (06:30 -0400)]
ospf6d: Prevent crash of show ipv6 ospf data adv-router 0.0.0.0 linkstate-id 0.0.0.0
With this sequence of events:
eva# conf
eva(config)# router ospf6
eva(config-ospf6)# end
eva# show ipv6 ospf data adv-router 0.0.0.0 linkstate-id 0.0.0.0
OSPF6: Received signal 11 at 1630442431 (si_addr 0x0, PC 0x559dcfa3a656); aborting...
OSPF6: zlog_signal+0x18c 7fd2cc8229f77fff606775d0 /lib/libfrr.so.0 (mapped at 0x7fd2cc770000)
OSPF6: core_handler+0xe3 7fd2cc8616ad7fff606776f0 /lib/libfrr.so.0 (mapped at 0x7fd2cc770000)
OSPF6: funlockfile+0x50 7fd2cc74f1407fff60677840 /lib/x86_64-linux-gnu/libpthread.so.0 (mapped at 0x7fd2cc73b000)
OSPF6: ---- signal ----
OSPF6: ospf6_lsdb_type_show_wrapper+0x5d 559dcfa3a6567fff60677dd0 /usr/lib/frr/ospf6d (mapped at 0x559dcf9a5000)
OSPF6: show_ipv6_ospf6_database_adv_router_linkstate_id+0x1f9 559dcfa3c24a7fff60677e50 /usr/lib/frr/ospf6d (mapped at 0x559dcf9a5000)
ospfd: add dead-interval 40 if configured in show running
Problem Statement:
==================
When hello-interval is configured as 5, automatically dead interval becomes
4 times of hello i.e 20 seconds. But user wants the dead interval as
40 seconds and hello as 5 seconds. Therefore user configures it.
Now "ip ospf dead-interval 40" is not shown in "show running-config"
Therefore when user restarts the daemon, the dead interval goes back to
20 seconds and the neighbors are down.
Fix:
==================
If user configures dead-interval as 40, show it in show running config.